CN112506710B - Distributed file system data restoration method, device, equipment and storage medium - Google Patents

Distributed file system data restoration method, device, equipment and storage medium

Info

Publication number
CN112506710B
CN112506710B (application CN202011487179.XA)
Authority
CN
China
Prior art keywords
repair
shared memory
repairing
information
log file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011487179.XA
Other languages
Chinese (zh)
Other versions
CN112506710A (en)
Inventor
刘日新
陈紫卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd
Priority to CN202011487179.XA
Publication of CN112506710A
Application granted
Publication of CN112506710B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The application discloses a distributed file system data repair method, which comprises the following steps: during normal operation of the distributed file system, if a redundant node fails, using a shared memory to record repair information, the repair granularity corresponding to the repair information being smaller than the repair granularity corresponding to a global bitmap; and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory. By applying the technical scheme provided by the application, the data copy of the failed redundant node can be repaired at a smaller repair granularity, excessive repair amplification caused by an overly large repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, resource overhead can be saved, and the performance stability of the whole system is improved. The application also discloses a distributed file system data repair apparatus, device and storage medium, which have corresponding technical effects.

Description

Distributed file system data restoration method, device, equipment and storage medium
Technical Field
The present invention relates to the field of computer application technologies, and in particular, to a method, an apparatus, a device, and a storage medium for repairing distributed file system data.
Background
With the rapid development of computer technology, distributed file systems (Distributed File System, DFS) are becoming more and more widely used. In a distributed file system, the physical storage resources managed by the file system are not necessarily attached directly to a local node, but are connected to the nodes through a computer network, or a plurality of different logical disk partitions or volume labels are combined, so as to form a complete, hierarchically organized file system. The distributed file system presents resources distributed at arbitrary locations on the network in a logical tree-shaped file system structure, so that users can access shared files distributed across the network more conveniently.
A distributed file system may comprise a plurality of redundant nodes, each of which holds a data copy. When redundant nodes fail, the distributed file system can keep working normally without interrupting upper-layer applications as long as at least one redundant node is working normally. After a failed redundant node is recovered, its data copy needs to be repaired so that the data copies on the redundant nodes remain consistent and data safety is guaranteed.
At present, data copies are mostly repaired by means of a global bitmap. When a redundant node fails and data is updated, corresponding marks are made in the global bitmap, and after the failed redundant node is recovered, the data copy is repaired according to the global bitmap. This repair is performed at a granularity of 128M. In actual use, many interruptions are only brief, such as a service restart, system maintenance (upgrade) or network jitter, and only a small part of the data becomes inconsistent. In such scenarios, a repair granularity of 128M causes excessive repair amplification: for example, even if only a few kilobytes of data actually changed during a short interruption, a whole 128M region still has to be verified and resynchronized. The range of data checked for inconsistency is large, even a short interruption triggers a large number of data repair tasks, the data window of the single-point copy is enlarged, and multi-point faults are easily induced. Meanwhile, the large amount of data verification causes excessive resource overhead and affects the performance stability of the whole system.
Disclosure of Invention
The purpose of the application is to provide a method, a device, equipment and a storage medium for repairing distributed file system data, so as to reduce the granularity of data repair, save the resource overhead and improve the performance stability of the system.
In order to solve the technical problems, the application provides the following technical scheme:
a distributed file system data repair method, comprising:
in the normal working process of the distributed file system, if redundant nodes are in failure, the shared memory is used for recording repair information, the number of the failed redundant nodes is smaller than the total number of the redundant nodes, and the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap;
and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory.
In a specific embodiment of the present application, in a case where a redundant node fails, the method further includes:
and moving the repair information in the shared memory into a log file in batches under the condition that the proportion of the repair information in the shared memory is larger than a set first proportion threshold.
In one specific embodiment of the present application, the repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory includes:
And repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file.
In a specific embodiment of the present application, in a case where a redundant node fails, the method further includes:
and in the process of recording the repair information, marking to be repaired in the global bitmap.
In a specific embodiment of the present application, before performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the method further includes:
determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold;
and if the proportion of the repair information in the log file is smaller than the second proportion threshold, executing the step of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file.
In one specific embodiment of the present application, after the determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold, the method further includes:
And if the proportion of the repair information in the log file is not smaller than the second proportion threshold, performing repair processing on the data copy of the failed redundant node based on the to-be-repaired marks of the global bitmap.
In a specific embodiment of the present application, further comprising:
before the atomic updating operation is carried out on the shared memory, corresponding atomic updating operation is recorded in a repair log;
in the process of repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory, if a repairing process is crashed, after restarting the repairing process, determining an atomic updating operation to be performed before the crashing based on the repairing log;
and performing the atomic updating operation on the shared memory.
In a specific embodiment of the present application, the repair log includes a description header and an operation record buffer, where the description header is used to describe a state of a current log record, and the operation record buffer is used to record content of each atomic update operation in sequence.
In a specific embodiment of the present application, before performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the method further includes:
Marking an unrepaired state in the index node context of the shared memory;
recording the initial values of the version numbers of the head and the tail in the index nodes of the shared memory;
and recording a version number as a value of a tail version number in the index node in each interval node of the shared memory and each batch information of the log file.
In a specific embodiment of the present application, in a process of repairing a data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further includes:
marking a repair-in-progress state in the inode context;
adding a set stepping value to the tail version number recorded in the index node;
and recording a version number in the newly inserted interval node in the shared memory and the newly added batch information in the log file as a value of a tail version number in the index node.
In a specific embodiment of the present application, after the repairing process is completed on the data copy of the failed redundant node, the method further includes:
marking unrepaired states in the inode context;
Updating the head version number recorded in the index node to be the same as the tail version number;
and determining and deleting the failure information in the shared memory and the log file according to the head and tail version numbers recorded in the index node.
In a specific embodiment of the present application, in a process of repairing a data copy of the failed redundant node based on the shared memory and the repair information recorded in the log file, the method further includes:
if the repair process crashes, marking an unrepaired state in the index node context under the condition that the repair process is restarted and repair is not triggered;
repeatedly executing the marking of the repairing state in the index node context under the condition that the repairing process is restarted and the repairing is triggered; adding a set stepping value to the tail version number recorded in the index node; and recording a version number as a value of a tail version number in the index node in the newly inserted interval node in the shared memory and the newly added batch information in the log file.
In a specific embodiment of the present application, further comprising:
under the condition that the restoration process is restarted and restoration is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining interval nodes in the shared memory and batch information in the log file as failure information and deleting the failure information;
And repeatedly executing the step of recording the initial values of the head version number and the tail version number in the index nodes of the shared memory.
A distributed file system data repair apparatus comprising:
the shared memory information recording unit is used for recording repair information by using the shared memory if redundant nodes are in failure in the normal working process of the distributed file system, the number of the failed redundant nodes is smaller than the total number of the redundant nodes, and the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap;
and the data copy repairing unit is used for repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory after the failed redundant node is recovered.
A distributed file system data repair device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the distributed file system data repair method of any one of the above when executing the computer program.
A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the distributed file system data restoration method of any of the above.
By applying the technical scheme provided by the embodiment of the application, in the normal working process of the distributed file system, if the redundant nodes are failed and the number of the failed redundant nodes is smaller than the total number of the redundant nodes, the shared memory can be used for recording the repair information, and after the failed redundant nodes are recovered, the data copy of the failed redundant nodes is repaired based on the repair information recorded in the shared memory. The repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap, based on the repair information recorded in the shared memory, the data copy of the redundant node with the fault can be repaired by using the smaller repair granularity, excessive repair amplification caused by the overlarge repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, the resource cost can be saved, and the performance stability of the whole system is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart illustrating an implementation of a method for repairing distributed file system data according to an embodiment of the present application;
fig. 2 is a schematic diagram of a basic structure of a differential repair module in an embodiment of the present application;
FIG. 3 is a schematic diagram of a repair log format according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a version number mechanism implementation under normal conditions in an embodiment of the present application;
FIG. 5 is a schematic diagram of implementation of a version number mechanism under abnormal conditions in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a distributed file system data repairing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a distributed file system data repair device according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the solution of the present application, the present application is described in further detail below with reference to the drawings and specific embodiments. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments in the present application without inventive effort fall within the protection scope of the present application.
Referring to fig. 1, a flowchart of an implementation of a distributed file system data repair method according to an embodiment of the present application may include the following steps:
S110: in the normal working process of the distributed file system, if redundant nodes are in failure, the shared memory is used for recording repair information, the number of the failed redundant nodes is smaller than the total number of the redundant nodes, and the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap.
The distributed file system may include a plurality of redundant nodes that form a redundant relationship with each other, and even if only one redundant node is functioning properly, the distributed file system may function properly. In the normal working process of the distributed file system, redundant nodes may fail due to hardware, network and other reasons, and the failed redundant nodes cannot continue to work normally. If the number of the failed redundant nodes is smaller than the total number of the redundant nodes, the distributed file system can still work normally.
When redundant nodes fail and the number of failed redundant nodes is smaller than the total number of redundant nodes, the non-failed redundant nodes are still in a normal working state and their data copies continue to be updated in real time, while the data copies of the failed redundant nodes are no longer updated, so the data copies of the redundant nodes become inconsistent. After the failed redundant nodes are recovered, their data copies need to be repaired so that they are consistent with the data copies of the normally working redundant nodes, thereby ensuring the safety of the distributed file system.
When a redundant node fails, the repair information can be determined based on the data copies of the redundant nodes that are currently working normally, and the shared memory is used to record the repair information. The repair information may specifically describe the data ranges that need to be synchronized, such as the offset, length and copy number of the data to be repaired. The repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap; for example, the repair granularity of the repair information may be 4 KB.
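For illustration only, a minimal C sketch of what a single repair record of this kind might look like is given below; the field names and the 4 KB constant are assumptions made for the example and do not represent the exact data layout used by the application.

```c
/* Illustrative only: one possible layout of a single repair record.  The field
 * names and the 4 KB constant are assumptions for this sketch, not the exact
 * data layout of the application. */
#include <stdint.h>

#define REPAIR_GRANULARITY (4 * 1024)   /* e.g. 4 KB, versus the 128M global-bitmap granularity */

typedef struct repair_record {
    uint64_t offset;    /* byte offset of the out-of-sync data range */
    uint64_t length;    /* length of the range that must be resynchronized */
    uint32_t copy_id;   /* copy number: which redundant node's copy it belongs to */
} repair_record_t;
```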
S120: and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory.
After the failed redundant node is recovered, the data copy of the failed redundant node is inconsistent with the data copy of the redundant node which works normally before, so that the data copy of the failed redundant node needs to be repaired. The repair process can be performed on the data copy of the failed redundant node based on the repair information recorded in the shared memory. Specifically, the information such as the data range needing to be synchronized can be determined through the repair information recorded in the shared memory, the corresponding data range is searched in the data copy of the redundant node with the fault, the data synchronization is performed, and the repair processing of the data copy of the redundant node with the fault is realized.
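As an illustrative sketch of the repair pass described above: the repair_record_t type is the one sketched earlier, and the two I/O functions are placeholders for the file system's actual read and write paths, not real interfaces of the application.

```c
#include <stdint.h>
#include <stdlib.h>

/* Placeholder I/O hooks; a real implementation would go through the
 * distributed file system's own data path. */
int read_from_healthy_copy(uint32_t copy_id, uint64_t off, void *buf, uint64_t len);
int write_to_recovered_copy(uint32_t copy_id, uint64_t off, const void *buf, uint64_t len);

/* Walk every recorded repair range, read it from a currently healthy copy and
 * write it into the recovered node's copy (repair_record_t as sketched earlier). */
static int repair_failed_copy(const repair_record_t *records, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        const repair_record_t *r = &records[i];
        char *buf = malloc(r->length);
        if (buf == NULL)
            return -1;
        if (read_from_healthy_copy(r->copy_id, r->offset, buf, r->length) != 0 ||
            write_to_recovered_copy(r->copy_id, r->offset, buf, r->length) != 0) {
            free(buf);
            return -1;
        }
        free(buf);
    }
    return 0;
}
```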
By applying the method provided by the embodiment of the application, in the normal working process of the distributed file system, if the redundant nodes are failed and the number of the failed redundant nodes is smaller than the total number of the redundant nodes, the shared memory can be used for recording the repair information, and after the failed redundant nodes are recovered, the data copy of the failed redundant nodes is repaired based on the repair information recorded in the shared memory. The repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap, based on the repair information recorded in the shared memory, the data copy of the redundant node with the fault can be repaired by using the smaller repair granularity, excessive repair amplification caused by the overlarge repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, the resource cost can be saved, and the performance stability of the whole system is improved.
In one embodiment of the present application, in the event that there is a redundant node that fails, the method may further include the steps of:
moving the repair information in the shared memory into the log file in batches under the condition that the proportion of the repair information in the shared memory is larger than a set first proportion threshold;
Step S120 performs repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory, and may include the following steps:
and repairing the data copy of the failed redundant node based on the shared memory and the repairing information recorded in the log file.
In the embodiment of the application, considering that the storage space of the shared memory is limited, the proportion of the repair information in the shared memory can be monitored while the shared memory is used to record repair information in the case that a redundant node fails. When the proportion of the repair information in the shared memory is larger than the set first proportion threshold, the repair information in the shared memory can be moved into the log file in batches. Specifically, the repair information in the shared memory may be serialized and written into the log file. The log file may be located on a cache disk with a larger storage space. The first proportion threshold may be set and adjusted according to the actual situation, for example to 80%.
That is, when the amount of repair information recorded in the shared memory is small, it may be recorded only in the shared memory. When the amount of recorded repair information becomes large, the repair information is serialized and moved into the log file in batches, which frees shared-memory space so that newly generated repair information can continue to be recorded in the shared memory; whenever the proportion of repair information recorded in the shared memory again exceeds the first proportion threshold, the repair information in the shared memory is moved into the log file in batches.
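A minimal sketch of this eviction check is shown below; the function and constant names are assumptions, and 80% is simply the example threshold mentioned above.

```c
#include <stddef.h>

/* Placeholder: serialize a batch of repair records, append them to the log
 * file on the cache disk, and free their shared-memory space. */
void serialize_batch_to_log(void);

#define FIRST_PROPORTION_THRESHOLD 0.80   /* 80% is the example value mentioned above */

/* After inserting a record, check the shared-memory occupancy and evict a
 * batch to the log file once the configured proportion is exceeded. */
static void maybe_evict_to_log(size_t used_bytes, size_t total_bytes)
{
    if (total_bytes == 0)
        return;
    double occupancy = (double)used_bytes / (double)total_bytes;
    if (occupancy > FIRST_PROPORTION_THRESHOLD)
        serialize_batch_to_log();
}
```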
After the failed redundant node is recovered, repair information may be recorded in the shared memory and the log file, and the data copy of the failed redundant node may be repaired based on the repair information recorded in the shared memory and the log file.
The shared memory guarantees the fastest access, and the recorded information is not lost when the process restarts. When the shared memory occupies too much space, the recorded information is automatically evicted to the log file stored on the data disk; since the log file capacity is larger than that of the shared memory, a longer effective time for the repair optimization can be guaranteed.
The repair granularity of the repair information recorded in the shared memory and the log file is small, so that the problems of excessive repair amplification and the like can be avoided, a large number of data repair tasks can not be caused under the scenes of short interruption and the like, the data window of the single-point copy is prevented from being enlarged, and the performance stability of the whole system is improved.
In one embodiment of the present application, in the event that there is a redundant node that fails, the method may further include the steps of:
in the process of recording the repair information, marking to be repaired in the global bitmap;
before repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory and the log file, the method further comprises:
determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold;
if the proportion of the repair information in the log file is smaller than the second proportion threshold, executing the step of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file;
and if the proportion of the repair information in the log file is not smaller than the second proportion threshold, performing repair processing on the data copy of the failed redundant node based on the to-be-repaired marks of the global bitmap.
In the embodiment of the application, when a redundant node fails, the repair information is recorded in the shared memory and the log file, and in the process of recording the repair information, to-be-repaired marks can be made in the global bitmap, for example with "1" and "0".
Before repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, it may be determined whether the proportion of the repair information in the log file is smaller than a set second proportion threshold. The second proportion threshold may be set and adjusted according to practical circumstances, for example to 85%.
If it is smaller than the second proportion threshold, the data copy can be repaired at the smaller repair granularity, that is, the data copy of the failed redundant node can be repaired based on the repair information recorded in the shared memory and the log file.
If it is not smaller than the second proportion threshold, the amount of repair information that currently needs to be synchronized is considered large, and the data copy can be repaired at the larger repair granularity, that is, the data copy of the failed redundant node can be repaired based on the to-be-repaired marks of the global bitmap.
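The selection between the two repair paths can be sketched as follows; the function names are placeholders, and 85% is the example second proportion threshold mentioned above.

```c
/* Placeholders for the two repair paths discussed above. */
void repair_from_shm_and_log(void);      /* fine-grained repair (DR-Cache style) */
void repair_from_global_bitmap(void);    /* coarse repair via the global bitmap (Diff-record style) */

#define SECOND_PROPORTION_THRESHOLD 0.85 /* 85% is the example value mentioned above */

/* Select the repair path from the proportion of repair information in the log file. */
static void choose_repair_path(double log_occupancy)
{
    if (log_occupancy < SECOND_PROPORTION_THRESHOLD)
        repair_from_shm_and_log();
    else
        repair_from_global_bitmap();
}
```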
In the embodiment of the application, the data copy can be repaired based on the global bitmap by the traditional repair module Diff-record, and a difference repair module DR-Cache is added on top of the traditional repair module Diff-record to repair the data copy based on the repair information recorded in the shared memory and the log file. As shown in FIG. 2, the difference repair module DR-Cache can be implemented under the Xlator framework and can include a difference repair core sub-module, a shared memory management sub-module and a log file management sub-module, where the difference repair core sub-module performs operations such as adding, deleting and merging repair information. When the amount of batch repair information recorded in the log file is large, the difference repair module DR-Cache is disabled and the traditional repair module Diff-record is used to repair the data copy. Adding the difference repair module DR-Cache on top of the traditional repair module Diff-record reduces the repair granularity, reduces the amount of data to be repaired and shortens the repair completion time.
Taking an actual fault scene as an example, compared data of the repairing mode and the traditional repairing mode are as follows:
(1) Fault type: service process restart
Common scenarios: thermal upgrade
TABLE 1
(2) Fault type: storage network offline
Common scenarios: network flashing and short-term network failure
TABLE 2
(3) Fault type: offline magnetic disk
Common scenarios: replacement data disk and replacement cache disk
TABLE 3
As can be seen from the data in the tables, in scenarios with frequent short interruptions, compared with the traditional way of repairing the data copy, the repair method of the application repairs a smaller amount of data and completes the repair in a shorter time, which improves repair efficiency and reduces repair overhead, and the amount of repaired data is closer to the true data difference.
In one embodiment of the present application, step S120 performs repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory, and may include the following steps:
the first step: before the atomic updating operation is carried out on the shared memory, corresponding atomic updating operation is recorded in a repair log;
and a second step of: in the process of repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory, if the repairing process is crashed, determining an atomic updating operation to be performed before the crash based on a repairing log after restarting the repairing process;
And a third step of: and performing the atomic updating operation on the shared memory.
For ease of description, the three steps described above are combined.
In the normal working process of the distributed file system, if redundant nodes are failed and the number of the failed redundant nodes is smaller than the total number of the redundant nodes, the shared memory can be used for recording repair information, and the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap.
After the failed redundant node is recovered, the data copy of the failed redundant node can be repaired based on the repair information recorded in the shared memory. In the repair process, for each piece of repair information, after the repair process is completed on the data copy based on the repair information, the repair information can be cleared in the shared memory. The operations of recording and deleting the repair information in the shared memory can be regarded as the atomic update operation of the shared memory.
In order to ensure the consistency of the data updating of the shared memory, the corresponding atomic updating operation can be recorded in the repair log before the atomic updating operation is carried out on the shared memory.
The repair log may include a description header for describing the state of the current log record and an operation record buffer for sequentially recording the contents of each atomic update operation. Specifically, the repair log may be a Journal log. As shown in fig. 3, the description header shmjournal_head may include a recording field marking whether the journal is being updated, a count field indicating the current number of records in the journal, a tail field indicating the tail pointer of the journal operation record buffer, and so on; each shmjournal_record in the operation record buffer describes the content of one atomic update operation and may include an addr field indicating the relative destination address in the shared memory, a len field indicating the length of the shared-memory data to be updated, a value field indicating the data content with which the shared memory is to be updated, and so on.
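A possible C rendering of this journal layout is sketched below; the fixed-size value buffer and the exact field widths are assumptions made for illustration and are not specified by the application.

```c
#include <stdint.h>

#define SHMJOURNAL_MAX_VALUE 64           /* assumed upper bound on one update's payload */

typedef struct shmjournal_head {
    uint32_t recording;   /* non-zero while the journal is being updated */
    uint32_t count;       /* number of records currently in the journal */
    uint32_t tail;        /* tail pointer into the operation record buffer */
} shmjournal_head_t;

typedef struct shmjournal_record {
    uint64_t addr;                          /* relative destination address in shared memory */
    uint32_t len;                           /* number of bytes of shared memory to update */
    uint8_t  value[SHMJOURNAL_MAX_VALUE];   /* data to write at addr (first len bytes used) */
} shmjournal_record_t;
```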
In the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory, the repair process may crash. In that case a data structure may have been only partially updated, and the data in the shared memory easily becomes inconsistent. However, because the corresponding atomic update operation is recorded in the repair log before the atomic update operation is performed on the shared memory, after the repair process crashes and is restarted, the operations can be replayed from the records in the repair log, so that the atomic update operations that were to be performed before the crash are determined.
After the atomic updating operation to be performed before the crash is determined, the atomic updating operation can be performed on the shared memory, so that the consistency of all data structures in the shared memory is ensured.
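A sketch of this replay step, assuming the journal structures sketched above, might look as follows; replay is safe to repeat because each record fully describes the bytes to be written.

```c
#include <stdint.h>
#include <string.h>

/* Re-apply every journaled atomic update to the shared-memory region
 * (shmjournal_head_t / shmjournal_record_t as sketched above). */
static void shmjournal_replay(uint8_t *shm_base,
                              const shmjournal_head_t *head,
                              const shmjournal_record_t *records)
{
    for (uint32_t i = 0; i < head->count; i++) {
        const shmjournal_record_t *rec = &records[i];
        memcpy(shm_base + rec->addr, rec->value, rec->len);  /* redo the update */
    }
}
```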
In one embodiment of the present application, before performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the method may further include the steps of:
marking unrepaired states in the index node context of the shared memory;
recording head and tail version numbers in index nodes of a shared memory as preset initial values;
recording a version number as a value of a tail version number in the index node in each interval node of the shared memory and each batch information of the log file;
accordingly, in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the method may further include the following steps:
marking the state being repaired in the inode context;
adding a set stepping value to the tail version number recorded in the index node;
recording a version number in newly inserted interval nodes in the shared memory and newly added batch information in the log file as a value of a tail version number in the index node;
Accordingly, after the repair process is completed on the data copy of the failed redundant node, the method may further include the steps of:
marking unrepaired states in the inode context;
updating the head version number recorded in the index node to be the same as the tail version number;
and determining and deleting the failure information in the shared memory and the log file according to the head and tail version numbers recorded in the index nodes.
For ease of description, the steps described above are combined.
In the embodiment of the application, during normal operation of the distributed file system, if redundant nodes fail and the number of failed redundant nodes is smaller than the total number of redundant nodes, the shared memory can be used to record repair information, and the repair information in the shared memory is moved into the log file in batches when the proportion of the repair information in the shared memory is greater than the set first proportion threshold. The repair information is thus recorded in the shared memory and the log file.
After the failed redundant node is recovered, the data copy of the failed redundant node can be repaired based on the repair information recorded in the shared memory and the log file. Before repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, an unrepaired state may be marked in the index node context of the shared memory, indicating that no repair operation is currently being performed on the data copy. In addition, the head version number and the tail version number of the index node of the shared memory can both be set to a preset initial value, such as 0, and a version number equal to the tail version number in the index node is recorded in each interval node of the shared memory and each batch of information in the log file.
In the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the repairing state can be marked in the index node context, namely, the unrepaired state marked in the index node context is changed into the repairing state. And the tail version number recorded in the index node is increased by a set stepping value, such as 1. In this process, there may still be repair information to be inserted into the shared memory or batch information written into the log file, etc., and the version number may be recorded as the value of the tail version number in the index node in the newly inserted interval node in the shared memory and the newly added batch information in the log file.
After the repair processing is completed on the data copy of the redundant node with the fault, the unrepaired state can be marked in the context of the index node, i.e. the state marked in the repair process is modified into the unrepaired state. And updating the head version number recorded in the index node to be the same as the tail version number, and determining and deleting the failure information in the shared memory and the log file according to the head version number and the tail version number recorded in the index node. Specifically, interval nodes or batch information outside the head and tail version number intervals in the index nodes in the shared memory and the log file can be determined as invalid information, and the invalid information is deleted.
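For illustration, the version-number lifecycle described above can be sketched as follows. In the application the repairing flag lives in the index node context while the head and tail version numbers live in the shared-memory index node; the sketch collapses them into one struct, and all names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct inode_ctx {
    bool     repairing;   /* FALSE = unrepaired state, TRUE = repairing state */
    uint32_t front;       /* head version number */
    uint32_t rear;        /* tail version number */
} inode_ctx_t;

static void before_repair(inode_ctx_t *ctx)
{
    ctx->repairing = false;
    ctx->front = 0;
    ctx->rear  = 0;       /* new interval nodes / log batches are stamped with rear */
}

static void on_repair_start(inode_ctx_t *ctx)
{
    ctx->repairing = true;
    ctx->rear += 1;       /* records inserted during the repair carry the new generation */
}

static void on_repair_done(inode_ctx_t *ctx)
{
    ctx->repairing = false;
    ctx->front = ctx->rear;   /* anything stamped outside [front, rear] is stale and is deleted */
}
```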
In a specific embodiment of the present application, in a process of repairing a data copy of a failed redundant node based on repair information recorded in a shared memory and a log file, the method may further include the following steps:
if the repair process crashes, marking an unrepaired state in the context of the index node under the condition that the repair process is restarted and repair is not triggered;
repeatedly executing marking the repairing state in the context of the index node under the condition that the repairing process is restarted and the repairing is triggered; adding a set stepping value to the tail version number recorded in the index node; and recording the version number as the value of the tail version number in the index node in the newly inserted interval node in the shared memory and the newly added batch information in the log file.
In the embodiment of the application, in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the repair process may crash and the repair may be aborted. If the repair then simply started over, the data that had already been repaired before the interruption would be repaired again, adding extra repair data volume and repair time. In the present application, during the repair, if the repair process crashes, the unrepaired state can be marked in the index node context when the repair process is restarted and repair has not yet been triggered, i.e., the repairing state in the index node context is modified to the unrepaired state.
When the repair process is restarted and repair is triggered, the steps of marking the repairing state in the index node context, increasing the tail version number recorded in the index node by the set stepping value, and recording, in the interval nodes newly inserted into the shared memory and the batch information newly added to the log file, a version number equal to the tail version number in the index node can be executed repeatedly. In this way it is easy to determine which information in the shared memory and the log file is invalid, the repair of the copy data can be resumed where it left off, and extra repair data volume and repair time are avoided.
The embodiment of the application clears records in the shared memory and the log file through a version number mechanism. As shown in fig. 4 and 5, a version number (generation) field is added to the index node INODE and the interval nodes RANGE of the shared memory and to the header of the log file batch information, and a flag (repair) indicating whether repair is in progress is added to the index node context INODE ctx, where FALSE indicates the unrepaired state and TRUE indicates the repairing state. In the figures, "T" represents the type of shared memory node, "G" represents the value of the version number generation, "R (rear)" represents the version number tail pointer, and "F (front)" represents the version number head pointer.
Fig. 4 is a schematic diagram of a version number mechanism implementation under normal conditions. When the restoration of the copy data is not triggered, the head and tail pointers of the version numbers recorded in the node INODE of the shared memory are all 0, namely the head and tail version numbers are all initial values of 0. At this time, the version number of the newly inserted RANGE node in the shared memory and the version number of the batch information header subsequently written into the log file are set to the value of the tail pointer.
When repair begins for the replica data, the in-repair state is marked in inode ctx and the tail pointer rear value is self-incremented by 1. At this time, the version number of the newly inserted RANGE node in the shared memory and the version number of the batch information header subsequently written into the log file are set to the rear value, i.e. 1.
After the copy data is repaired, the mark of the repairing state in INODE ctx can be modified to be in an unrepaired state, and then the version number of the head pointer recorded by the INODE node in the shared memory is set to be the value of the tail pointer, namely 1. And finally deleting all the RANGE invalidation nodes (version numbers are outside [ front, rear ]) in the shared memory, and deleting invalidation batch information in the log file in the same way.
Fig. 5 is a schematic diagram illustrating the implementation of the version number mechanism in an abnormal situation. In the process of repairing the copy data, the repair process (tree process) crashes and is restarted; when repair is triggered again, the rear value in the INODE node is incremented by 1 again, so as to distinguish the RANGE nodes newly inserted into the shared memory and the batch information newly added to the log file during the subsequent repair.
In one embodiment of the present application, the method may further comprise the steps of:
under the condition that the restoration process is restarted and restoration is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining interval nodes in the shared memory and batch information in the log file as failure information and deleting the failure information;
repeatedly executing the step of recording the initial values of the head version number and the tail version number in the index nodes of the shared memory.
In the embodiment of the application, in order to prevent overflow caused by accumulation of version number values after multiple repair interruptions, the version numbers are recorded using a ring queue. A maximum version number may be set in the inode context. When the repair process is restarted and repair is not triggered, if the ring queue is full, that is, the maximum version number has been reached, the interval nodes currently in the shared memory and the batch information in the log file can be determined to be failure information and deleted, and then the head and tail version numbers in the index node of the shared memory are reset to the preset initial values, i.e., the version numbers are recorded again starting from the initial value. In this way, excessive occupation of shared memory and log file space can be avoided.
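A sketch of this wrap-around handling is given below, reusing the inode_ctx_t struct from the earlier version-number sketch; the maximum version number value and the purge helper are assumptions.

```c
/* Placeholder: purge all interval nodes in the shared memory and all batch
 * information in the log file. */
void drop_all_ranges_and_batches(void);

#define MAX_GENERATION 255u   /* assumed capacity of the version-number ring queue */

/* With no repair in flight, once the tail version number reaches the maximum,
 * discard everything and restart numbering from the initial value. */
static void maybe_reset_generation(inode_ctx_t *ctx)
{
    if (!ctx->repairing && ctx->rear == MAX_GENERATION) {
        drop_all_ranges_and_batches();
        ctx->front = 0;
        ctx->rear  = 0;
    }
}
```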
The troubleshooting difficulty when problems occur in the repairing process can be effectively reduced through a version number mechanism.
Corresponding to the above method embodiments, the embodiments of the present application further provide a distributed file system data repair device, where the distributed file system data repair device described below and the distributed file system data repair method described above may be referred to correspondingly.
Referring to fig. 6, the apparatus may include the following units:
the shared memory information recording unit 610 is configured to record repair information using the shared memory if there are redundant nodes that fail during normal operation of the distributed file system, where the number of failed redundant nodes is smaller than the total number of redundant nodes, and repair granularity corresponding to the repair information is smaller than repair granularity corresponding to the global bitmap;
and the data copy repairing unit 620 is configured to repair the data copy of the failed redundant node based on the repair information recorded in the shared memory after the failed redundant node is recovered.
By applying the device provided by the embodiment of the application, in the normal working process of the distributed file system, if the redundant nodes are failed and the number of the failed redundant nodes is smaller than the total number of the redundant nodes, the shared memory can be used for recording the repair information, and after the failed redundant nodes are recovered, the data copy of the failed redundant nodes is repaired based on the repair information recorded in the shared memory. The repair granularity corresponding to the repair information is smaller than that corresponding to the global bitmap, based on the repair information recorded in the shared memory, the data copy of the redundant node with the fault can be repaired by using the smaller repair granularity, excessive repair amplification caused by the overlarge repair granularity is avoided, the data window of the single-point copy is prevented from being enlarged, the resource cost can be saved, and the performance stability of the whole system is improved.
In a specific embodiment of the present application, the system further includes a log file information recording unit, configured to:
and in the case that a redundant node fails, moving the repair information in the shared memory into the log file in batches when the proportion of the repair information in the shared memory is larger than a set first proportion threshold.
In a specific embodiment of the present application, the data copy repairing unit 620 is configured to:
and repairing the data copy of the failed redundant node based on the shared memory and the repairing information recorded in the log file.
In a specific embodiment of the present application, further comprising:
and the bitmap marking unit is used for marking to be repaired in the global bitmap in the process of recording the repair information under the condition that the redundant node fails.
In a specific embodiment of the present application, the data copy repairing unit 620 is further configured to:
before repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold;
and if the proportion of the repair information in the log file is smaller than the second proportion threshold, executing the step of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file.
In a specific embodiment of the present application, the data copy repairing unit 620 is further configured to:
after determining whether the proportion of the repair information in the log file is smaller than the set second proportion threshold, if the proportion of the repair information in the log file is not smaller than the second proportion threshold, repairing the data copy of the failed redundant node based on the to-be-repaired marks of the global bitmap.
In a specific embodiment of the present application, the method further includes a shared memory updating unit, configured to:
before the atomic updating operation is carried out on the shared memory, corresponding atomic updating operation is recorded in a repair log;
in the process of repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory, if the repairing process is crashed, determining an atomic updating operation to be performed before the crash based on a repairing log after restarting the repairing process;
and performing the atomic updating operation on the shared memory.
In a specific embodiment of the present application, the repair log includes a description header for describing a state of a current log record and an operation record buffer area for sequentially recording contents of each atomic update operation.
In a specific embodiment of the present application, a version number mechanism implementation unit is further included, where the version number mechanism implementation unit is configured to:
marking an unrepaired state in the index node context of the shared memory before repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory and the log file;
recording head and tail version numbers in index nodes of a shared memory as preset initial values;
and recording a version number as a value of a tail version number in the index node in each interval node of the shared memory and each batch information of the log file.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
marking the repairing state in the context of the index node in the process of repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory and the log file;
adding a set stepping value to the tail version number recorded in the index node;
and recording the version number in the newly inserted interval node in the shared memory and the newly added batch information in the log file as the value of the tail version number in the index node.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
Marking an unrepaired state in the context of the index node after the repair processing of the data copy of the failed redundant node is completed;
updating the head version number recorded in the index node to be the same as the tail version number;
and determining and deleting the failure information in the shared memory and the log file according to the head and tail version numbers recorded in the index nodes.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
in the process of repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory and the log file, if the repairing process crashes, marking an unrepaired state in the context of the index node under the condition that the repairing process is restarted and repair is not triggered;
repeatedly executing marking the repairing state in the context of the index node under the condition that the repairing process is restarted and the repairing is triggered; adding a set stepping value to the tail version number recorded in the index node; and recording the version number as the value of the tail version number in the index node in the newly inserted interval node in the shared memory and the newly added batch information in the log file.
In a specific embodiment of the present application, the version number mechanism implementing unit is further configured to:
Under the condition that the restoration process is restarted and restoration is not triggered, if the tail version number recorded in the index node is equal to the set maximum version number, determining interval nodes in the shared memory and batch information in the log file as failure information and deleting the failure information;
repeatedly executing the step of recording the initial values of the head version number and the tail version number in the index nodes of the shared memory.
Corresponding to the above method embodiment, the embodiment of the present application further provides a distributed file system data repair device, including:
a memory for storing a computer program;
and the processor is used for realizing the steps of the distributed file system data repairing method when executing the computer program.
As shown in fig. 7, which is a schematic structural diagram of a distributed file system data repair device, the distributed file system data repair device may include: a processor 10, a memory 11, a communication interface 12 and a communication bus 13. The processor 10, the memory 11 and the communication interface 12 all complete communication with each other through a communication bus 13.
In the present embodiment, the processor 10 may be a central processing unit (Central Processing Unit, CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA), or other programmable logic device, etc.
The processor 10 may call a program stored in the memory 11, and in particular, the processor 10 may perform operations in an embodiment of a distributed file system data repair method.
The memory 11 is used for storing one or more programs, and the programs may include program codes, where the program codes include computer operation instructions, and in this embodiment, at least the programs for implementing the following functions are stored in the memory 11:
in the normal working process of the distributed file system, if redundant nodes are in failure, the shared memory is used for recording repair information, the number of the failed redundant nodes is smaller than the total number of the redundant nodes, and the repair granularity corresponding to the repair information is smaller than the repair granularity corresponding to the global bitmap;
and after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repairing information recorded in the shared memory.
In one possible implementation, the memory 11 may include a storage program area and a storage data area, where the storage program area may store an operating system, and application programs required for at least one function (such as a node monitoring function, an information recording function), and the like; the storage data area may store data created during use, such as repair information data, data copy data, and the like.
In addition, the memory 11 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device or other non-volatile solid-state storage device.
The communication interface 12 may be an interface of a communication module for connection with other devices or systems.
Of course, it should be noted that the structure shown in fig. 7 does not constitute a limitation on the distributed file system data repair device in the embodiment of the present application; in practical applications, the distributed file system data repair device may include more or fewer components than shown in fig. 7, or some components may be combined.
Corresponding to the above method embodiments, the present application further provides a computer readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the distributed file system data restoration method are implemented.
In this specification, each embodiment is described in a progressive manner, and each embodiment focuses on its differences from the other embodiments; for identical or similar parts, the embodiments may be referred to one another.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative elements and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above examples is only for aiding in understanding the technical solution of the present application and its core ideas. It should be noted that it would be obvious to those skilled in the art that various improvements and modifications can be made to the present application without departing from the principles of the present application, and such improvements and modifications fall within the scope of the claims of the present application.

Claims (13)

1. A method for repairing distributed file system data, comprising:
in the normal working process of the distributed file system, if a redundant node fails, recording repair information using a shared memory, wherein the number of failed redundant nodes is smaller than the total number of redundant nodes, and the repair granularity corresponding to the repair information is finer than the repair granularity corresponding to a global bitmap;
after the failed redundant node is recovered, repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory;
wherein, in the case that a redundant node fails, the method further comprises: when the proportion of the repair information in the shared memory is larger than a set first proportion threshold, moving the repair information in the shared memory into a log file in batches;
and the repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory comprises: repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file;
before the repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file, the method further comprises: determining whether the proportion of the repair information in the log file is smaller than a set second proportion threshold; and if the proportion of the repair information in the log file is smaller than the second proportion threshold, executing the step of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file.
2. The method of claim 1, further comprising, in the event of a failure of a redundant node:
in the process of recording the repair information, marking a to-be-repaired state in the global bitmap.
3. The method of claim 2, further comprising, after the determining whether the proportion of the repair information in the log file is smaller than the set second proportion threshold:
if the proportion of the repair information in the log file is not smaller than the second proportion threshold, repairing the data copy of the failed redundant node based on the to-be-repaired mark in the global bitmap.
4. The method as recited in claim 1, further comprising:
before an atomic update operation is performed on the shared memory, recording the corresponding atomic update operation in a repair log;
in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory, if the repair process crashes, determining, after the repair process is restarted, the atomic update operation to be performed before the crash based on the repair log;
and performing the atomic update operation on the shared memory.
5. The method of claim 4, wherein the repair log includes a description header for describing a state of a current log record and an operation record buffer for sequentially recording contents of each atomic update operation.
6. The method of claim 5, further comprising, prior to performing repair processing on the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file:
marking an unrepaired state in the index node context of the shared memory;
recording initial values of a head version number and a tail version number in the index node of the shared memory;
and recording, in each interval node of the shared memory and each piece of batch information of the log file, a version number whose value is the tail version number in the index node.
7. The method of claim 6, further comprising, in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file:
marking a repairing state in the index node context;
adding a set stepping value to the tail version number recorded in the index node;
and recording, in the newly inserted interval node in the shared memory and the newly added batch information in the log file, a version number whose value is the tail version number in the index node.
8. The method of claim 7, further comprising, after the repair process is completed for the data copy of the failed redundant node:
marking an unrepaired state in the index node context;
updating the head version number recorded in the index node to be the same as the tail version number;
and determining, according to the head version number and the tail version number recorded in the index node, the invalid information in the shared memory and the log file, and deleting the invalid information.
9. The method of claim 6, further comprising, in the process of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file:
if the repair process crashes, marking an unrepaired state in the index node context in the case that the repair process is restarted and repair is not triggered;
and repeatedly executing, in the case that the repair process is restarted and repair is triggered: marking the repairing state in the index node context; adding a set stepping value to the tail version number recorded in the index node; and recording, in the newly inserted interval node in the shared memory and the newly added batch information in the log file, a version number whose value is the tail version number in the index node.
10. The method as recited in claim 9, further comprising:
in the case that the repair process is restarted and repair is not triggered, if the tail version number recorded in the index node is equal to a set maximum version number, determining the interval nodes in the shared memory and the batch information in the log file as invalid information and deleting the invalid information;
and repeatedly executing the step of recording the initial values of the head version number and the tail version number in the index node of the shared memory.
11. A distributed file system data repair apparatus, comprising:
a shared memory information recording unit, configured to record repair information using a shared memory if a redundant node fails in the normal working process of the distributed file system, wherein the number of failed redundant nodes is smaller than the total number of redundant nodes, and the repair granularity corresponding to the repair information is finer than the repair granularity corresponding to a global bitmap;
a data copy repairing unit, configured to repair the data copy of the failed redundant node based on the repair information recorded in the shared memory after the failed redundant node is recovered;
wherein the apparatus is further configured to, in the case that a redundant node fails and the proportion of the repair information in the shared memory is larger than a set first proportion threshold, move the repair information in the shared memory into a log file in batches;
the data copy repairing unit is further configured to repair the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file;
and the apparatus is further configured to, before the data copy of the failed redundant node is repaired based on the repair information recorded in the shared memory and the log file, determine whether the proportion of the repair information in the log file is smaller than a set second proportion threshold, and, if the proportion of the repair information in the log file is smaller than the second proportion threshold, execute the step of repairing the data copy of the failed redundant node based on the repair information recorded in the shared memory and the log file.
12. A distributed file system data repair device, comprising:
a memory for storing a computer program;
and a processor for implementing the steps of the distributed file system data repair method according to any one of claims 1 to 10 when executing the computer program.
13. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the distributed file system data restoration method according to any of claims 1 to 10.
CN202011487179.XA 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium Active CN112506710B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011487179.XA CN112506710B (en) 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011487179.XA CN112506710B (en) 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112506710A CN112506710A (en) 2021-03-16
CN112506710B true CN112506710B (en) 2024-02-23

Family

ID=74972711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011487179.XA Active CN112506710B (en) 2020-12-16 2020-12-16 Distributed file system data restoration method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112506710B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112882858B (en) * 2021-03-29 2023-06-27 读书郎教育科技有限公司 Smart classroom resource single-point repairing method
CN113553216B (en) * 2021-06-28 2024-06-28 北京百度网讯科技有限公司 Data recovery method and device, electronic equipment and storage medium
CN113407790A (en) * 2021-08-19 2021-09-17 成都冰鉴信息科技有限公司 Data restoration method and device and data processing equipment
CN118295838A (en) * 2022-07-08 2024-07-05 超聚变数字技术有限公司 Method and device for determining repair resource granularity of memory failure

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005196490A (en) * 2004-01-07 2005-07-21 Hitachi Ltd System and method for data duplication

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5727206A (en) * 1996-07-31 1998-03-10 Ncr Corporation On-line file system correction within a clustered processing system
WO2003009139A1 (en) * 2001-07-16 2003-01-30 Transact In Memory, Inc. Parallelized redo-only logging and recovery for highly available main memory database systems
CN102368222A (en) * 2011-10-25 2012-03-07 曙光信息产业(北京)有限公司 Online repairing method of multiple-copy storage system
CN103716182A (en) * 2013-12-12 2014-04-09 中国科学院信息工程研究所 Failure detection and fault tolerance method and failure detection and fault tolerance system for real-time cloud platform
CN105302768A (en) * 2015-10-16 2016-02-03 浙江宇视科技有限公司 Slave CPU exception processing method and apparatus
CN105302498A (en) * 2015-11-24 2016-02-03 浪潮(北京)电子信息产业有限公司 Storage redundant system and method
CN106789180A (en) * 2016-11-30 2017-05-31 郑州云海信息技术有限公司 The service control method and device of a kind of meta data server
CN108804523A (en) * 2018-04-27 2018-11-13 腾讯科技(深圳)有限公司 Method of data synchronization, system and computer readable storage medium
CN108959390A (en) * 2018-06-01 2018-12-07 新华三云计算技术有限公司 Resource-area synchronous method and device after shared-file system node failure
CN108920637A (en) * 2018-07-02 2018-11-30 北京科东电力控制系统有限责任公司 Method for synchronizing data of database and device applied to synchronization subsystem
GB201811795D0 (en) * 2018-07-19 2018-09-05 Advanced Risc Mach Ltd Memory scanning operation in response to common mode fault signal
CN111488238A (en) * 2020-06-24 2020-08-04 南京鹏云网络科技有限公司 Block storage node data restoration method and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Design and Implementation of Uninterrupted Service in a Cluster System Based on Data Synchronization; Chen Dongjiang; China Excellent Master's Theses (Information Science and Technology); full text *

Also Published As

Publication number Publication date
CN112506710A (en) 2021-03-16

Similar Documents

Publication Publication Date Title
CN112506710B (en) Distributed file system data restoration method, device, equipment and storage medium
US10817386B2 (en) Virtual machine recovery method and virtual machine management device
CN109491609B (en) Cache data processing method, device and equipment and readable storage medium
US8127174B1 (en) Method and apparatus for performing transparent in-memory checkpointing
US10565075B2 (en) Storage device and block storage method based on the storage device
CN111078667B (en) Data migration method and related device
CN107817950B (en) Data processing method and device
RU2653254C1 (en) Method, node and system for managing data for database cluster
CN104735107A (en) Recovery method and device for data copies in distributed storage system
EP3147789B1 (en) Method for re-establishing standby database, and apparatus thereof
WO2022033269A1 (en) Data processing method, device and system
CN114791901A (en) Data processing method, device, equipment and storage medium
CN113268395B (en) Service data processing method, processing device and terminal
CN112559445B (en) Data writing method and device
CN112231150B (en) Method and device for recovering fault database in database cluster
CN115268785A (en) Management method and device applied to distributed storage system and storage medium
CN111190874B (en) High-reliability data log module for distributed storage and construction method thereof
CN108599982B (en) Data recovery method and related equipment
CN113127266A (en) Distributed database-based multi-copy disaster recovery method and device
CN118466862B (en) Data storage method, product, device and medium
CN104239182A (en) Cluster file system split-brain processing method and device
CN113568883B (en) Data writing method and device
CN107707402B (en) Management system and management method for service arbitration in distributed system
CN110658989B (en) System and method for backup storage garbage collection
CN116860521A (en) Data recovery method and device for fault node, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant