CN116594809A

CN116594809A - Distributed coding backup recovery system

Info

Publication number: CN116594809A
Application number: CN202310491812.XA
Authority: CN
Inventors: 刘�东; 赵彦钧; 常清雪
Original assignee: Sichuan Huakun Zhenyu Intelligent Technology Co ltd
Current assignee: Sichuan Huakun Zhenyu Intelligent Technology Co ltd
Priority date: 2023-04-28
Filing date: 2023-04-28
Publication date: 2023-08-15

Abstract

The invention relates to a distributed coding backup recovery system and method, belonging to the technical field of electric digital data processing. It provides real-time abnormality monitoring for the execution process of a storage server comprising a failure detection module, a distributed arbitration module, a repair module and the like. Through the orderly, coordinated operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit, abnormalities can be located accurately, while prolonged idle operation of the multiple detection units is avoided and the occupation of system operating space is reduced.

Description

Distributed coding backup recovery system

Technical Field

The invention belongs to the technical field of electric digital data processing, and particularly relates to a distributed coding backup recovery system and method.

Background

In wide area network data storage systems, a wide variety of backup and archiving systems have been implemented at different levels. Most backup and archiving systems primarily consider disk faults and failures, and do not consider the impact of the data transmission link in a wide area network environment. A storage server designed to overcome this defect generally comprises a failure detection module, a distributed arbitration module, a repair module and the like. When the failure detection module finds that a storage server has failed, the repair module downloads, from other valid storage servers, copies of the image files that were stored on the failed storage server and stores them on the replacement storage server selected by the distributed arbitration module, so that the replacement storage server completely replaces the failed storage server. In other words, the detection result of the failure detection module triggers the distributed arbitration module; once the distributed arbitration module has finished selecting the replacement storage server, it triggers the repair module; and the repair module stores the target image file attachment on the replacement storage server.
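The trigger chain described above (failure detection, then arbitration, then repair) can be illustrated with a minimal Python sketch. Everything named here (`StorageServer`, `detect_failures`, `arbitrate`, `repair`, `run_pipeline`) is a hypothetical stand-in introduced for illustration; the source does not specify how any module is implemented, and the arbitration rule shown (take the first live spare) is only an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class StorageServer:
    name: str
    alive: bool = True
    # image id -> content; for a failed server this stands in for the
    # catalogue of images it used to hold (an assumption of this sketch).
    images: dict = field(default_factory=dict)


def detect_failures(servers):
    """Failure detection module: return the servers that have failed."""
    return [s for s in servers if not s.alive]


def arbitrate(spares):
    """Distributed arbitration module (stand-in): pick the first live spare."""
    for spare in spares:
        if spare.alive:
            return spare
    return None


def repair(failed, servers, replacement):
    """Repair module: copy each lost image from a live server that holds it."""
    for image_id in failed.images:
        for source in servers:
            if source.alive and image_id in source.images:
                replacement.images[image_id] = source.images[image_id]
                break


def run_pipeline(servers, spares):
    """Each stage triggers the next: detection -> arbitration -> repair."""
    replaced = {}
    for failed in detect_failures(servers):
        replacement = arbitrate(spares)
        if replacement is None:
            continue  # no spare available; nothing to repair onto
        repair(failed, servers, replacement)
        spares.remove(replacement)
        replaced[failed.name] = replacement.name
    return replaced
```

In this sketch, `run_pipeline` returns a mapping from each failed server's name to the name of the server that replaced it.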

However, no scheme currently exists for monitoring anomalies in this execution process, nor for locating an anomaly once it occurs.

Therefore, a distributed coding backup recovery system, a distributed coding backup recovery method and a storage medium are needed to solve the above problems.

Disclosure of Invention

The invention aims to provide a distributed coding backup recovery system, a distributed coding backup recovery method and a storage medium that solve the above technical problems in the prior art by monitoring abnormalities in the execution process of a storage server comprising a failure detection module, a distributed arbitration module, a repair module and the like, and by locating those abnormalities.

In order to achieve the above purpose, the technical scheme of the invention is as follows:

the distributed coding backup recovery system comprises a failure detection module, a distributed arbitration module, a repair module, a repair result verification unit, a detection output detection unit, an arbitration input detection unit, an arbitration output detection unit, a repair input detection unit and an operation control unit, wherein the failure detection module, the distributed arbitration module and the repair module sequentially execute related data transmission;

the operation control unit is respectively connected with the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit;

the repair result verification unit is used for verifying whether the repair module successfully stores the target image file attachment in the alternative storage server;

the detection output detection unit is used for detecting whether the output data of the failure detection module is abnormal or not;

the arbitration input detection unit is used for detecting whether the input data of the distributed arbitration module is abnormal;

the arbitration output detection unit is used for detecting whether the output data of the distributed arbitration module is abnormal;

the repair input detection unit is used for detecting whether the input data of the repair module is abnormal or not;

the operation control unit is used for controlling the operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit.
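The relationship between the operation control unit and the detection units listed above can be sketched as follows. This is an illustrative assumption only: the class names, the unit names in `UNIT_NAMES`, and the abnormality test `missing_payload` are hypothetical and are not defined by the source, which does not say how a detection unit decides that data is abnormal.

```python
class DetectionUnit:
    """One detection unit; `check` is a hypothetical abnormality test."""

    def __init__(self, name, check, running=False):
        self.name = name
        self._check = check      # callable: data -> True if abnormal
        self.running = running   # set by the operation control unit

    def is_abnormal(self, data):
        if not self.running:
            raise RuntimeError(f"{self.name} has not been started")
        return self._check(data)


class OperationControlUnit:
    """Starts and stops the units connected to it."""

    def __init__(self, units):
        self._units = {unit.name: unit for unit in units}

    def unit(self, name):
        return self._units[name]

    def start(self, name):
        self._units[name].running = True

    def stop(self, name):
        self._units[name].running = False

    def running_units(self):
        return sorted(n for n, u in self._units.items() if u.running)


def missing_payload(data):
    """Assumed abnormality rule for this sketch: no data arrived at all."""
    return data is None


UNIT_NAMES = [
    "detection_output",
    "arbitration_input",
    "arbitration_output",
    "repair_input",
]
control = OperationControlUnit(
    DetectionUnit(name, missing_payload) for name in UNIT_NAMES
)
```

A unit that has not been started refuses to run its check, which mirrors the idea that the operation control unit alone decides when each unit operates.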

Further, the operation control unit keeps the repair result verification unit normally on (always running), and keeps the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit normally off (idle until started);

when the repair result verification unit verifies that the repair module has not successfully stored the target image file attachment in the alternative storage server, the operation control unit controls the repair input detection unit to be started;

and if the repair input detection unit detects that the input data of the repair module is not abnormal, the operation control unit judges that the repair module fails.

Further, when the repair input detection unit detects that the input data of the repair module is abnormal, the operation control unit controls the arbitration output detection unit to be started;

and if the arbitration output detection unit detects that the output data of the distributed arbitration module is not abnormal, the operation control unit judges that the data transmission between the distributed arbitration module and the repair module is faulty.

Further, when the arbitration output detection unit detects that the output data of the distributed arbitration module is abnormal, the operation control unit controls the arbitration input detection unit to be started;

and if the arbitration input detection unit detects that the input data of the distributed arbitration module is not abnormal, the operation control unit judges that the distributed arbitration module is faulty.

Further, when the arbitration input detection unit detects that the input data of the distributed arbitration module is abnormal, the operation control unit controls the detection output detection unit to be started;

if the detection output detection unit detects that the output data of the failure detection module is not abnormal, the operation control unit judges that the data transmission between the failure detection module and the distributed arbitration module is faulty; and if the detection output detection unit detects that the output data of the failure detection module is abnormal, the operation control unit judges that the failure detection module is faulty.
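The backward walk described in the preceding paragraphs can be summarized in a short Python sketch. The function `locate_fault`, the `Fault` enumeration and the unit labels are hypothetical names chosen for illustration; each argument is assumed to be a zero-argument callable standing in for one unit, so that a unit's check only runs when the operation control unit would have started it.

```python
from enum import Enum


class Fault(Enum):
    NONE = "no fault"
    REPAIR_MODULE = "repair module"
    ARBITRATION_TO_REPAIR_LINK = "transmission: arbitration -> repair"
    ARBITRATION_MODULE = "distributed arbitration module"
    DETECTION_TO_ARBITRATION_LINK = "transmission: detection -> arbitration"
    FAILURE_DETECTION_MODULE = "failure detection module"


def locate_fault(repair_ok, repair_input_abnormal,
                 arbitration_output_abnormal, arbitration_input_abnormal,
                 detection_output_abnormal):
    """Return (fault, started), where `started` lists the units switched on.

    `repair_ok` models the always-running repair result verification unit;
    the other four callables return True when the data they inspect is
    abnormal and are invoked only when their unit is started.
    """
    started = []
    if repair_ok():
        return Fault.NONE, started

    started.append("repair_input")
    if not repair_input_abnormal():
        return Fault.REPAIR_MODULE, started

    started.append("arbitration_output")
    if not arbitration_output_abnormal():
        return Fault.ARBITRATION_TO_REPAIR_LINK, started

    started.append("arbitration_input")
    if not arbitration_input_abnormal():
        return Fault.ARBITRATION_MODULE, started

    started.append("detection_output")
    if not detection_output_abnormal():
        return Fault.DETECTION_TO_ARBITRATION_LINK, started
    return Fault.FAILURE_DETECTION_MODULE, started
```

Because the checks are evaluated lazily, a fault in the repair module starts only one detection unit, while a fault in the failure detection module starts all four. This is one way to read the claim that prolonged idle operation of the detection units is avoided.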

Further, the system also comprises an anomaly feedback unit, wherein the anomaly feedback unit is connected with the operation control unit.

A distributed coding backup recovery method uses the distributed coding backup recovery system described above to perform distributed coding backup recovery.

A storage medium having stored thereon a computer program which, when executed, performs the distributed coding backup recovery method described above.

Compared with the prior art, the invention has the following beneficial effects:

one beneficial effect of this scheme is that real-time abnormality monitoring is performed on the execution process of a storage server comprising a failure detection module, a distributed arbitration module, a repair module and the like. Through the orderly, coordinated operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit, abnormalities can be located accurately, while prolonged idle operation of the multiple detection units is avoided and the occupation of system operating space is reduced.

Drawings

Fig. 1 is a schematic system configuration diagram of the embodiment.

Fig. 2 is a schematic diagram of the system operation principle of the embodiment.

Detailed Description

For the purpose of making the technical solution and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not intended to limit the invention, i.e., the embodiments described are merely some, but not all, of the embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention. It is noted that relational terms such as "first" and "second", and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.

Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article or apparatus that comprises the element.

As shown in Fig. 1, a distributed coding backup recovery system is provided, which includes a failure detection module, a distributed arbitration module, a repair module, a repair result verification unit, a detection output detection unit, an arbitration input detection unit, an arbitration output detection unit, a repair input detection unit, and an operation control unit, wherein the failure detection module, the distributed arbitration module, and the repair module sequentially execute related data transmission;

Further, as shown in Fig. 2, the operation control unit keeps the repair result verification unit normally on (always running), and keeps the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit normally off (idle until started);

In this scheme, real-time abnormality monitoring is performed on the execution process of the storage server comprising the failure detection module, the distributed arbitration module, the repair module and the like. Through the orderly, coordinated operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit, abnormalities can be located accurately, while prolonged idle operation of the multiple detection units is avoided and the occupation of system operating space is reduced.

Further, the system also comprises an anomaly feedback unit, wherein the anomaly feedback unit is connected with the operation control unit and can provide the corresponding anomaly feedback for each fault.
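As a rough illustration of such a feedback unit, the sketch below maps each fault conclusion reached by the operation control unit to a message and hands it to a reporting sink. The class name `AnomalyFeedbackUnit`, the fault codes and the message texts are all assumptions made for this example; the source does not describe the feedback format or channel.

```python
class AnomalyFeedbackUnit:
    """Turns a fault code from the operation control unit into feedback."""

    # Hypothetical fault codes and messages, one per fault conclusion.
    MESSAGES = {
        "repair_module": "Repair module is faulty.",
        "arbitration_to_repair_link":
            "Data transmission between the distributed arbitration module "
            "and the repair module is faulty.",
        "arbitration_module": "Distributed arbitration module is faulty.",
        "detection_to_arbitration_link":
            "Data transmission between the failure detection module and "
            "the distributed arbitration module is faulty.",
        "failure_detection_module": "Failure detection module is faulty.",
    }

    def __init__(self, sink=print):
        # `sink` is any callable that accepts the message string,
        # for example a logger method or a list's append.
        self._sink = sink

    def report(self, fault_code):
        message = self.MESSAGES.get(
            fault_code, f"Unrecognized fault code: {fault_code}"
        )
        self._sink(message)
        return message
```

Passing a list's `append` as the sink, as in `AnomalyFeedbackUnit(sink=log.append)`, collects the feedback in memory instead of printing it.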

The above is a preferred embodiment of the present invention. All changes made according to the technical solution of the present invention fall within the protection scope of the present invention, provided that the resulting functional effects do not go beyond the scope of that technical solution.

Claims

1. The distributed coding backup recovery system comprises a failure detection module, a distributed arbitration module and a repair module, wherein the failure detection module, the distributed arbitration module and the repair module sequentially execute related data transmission, and the distributed coding backup recovery system is characterized by further comprising a repair result verification unit, a detection output detection unit, an arbitration input detection unit, an arbitration output detection unit, a repair input detection unit and an operation control unit;

2. The distributed coding backup recovery system according to claim 1, wherein the operation control unit controls the operation state of the repair result verification unit to be normally open, and controls the operation states of the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit, and the repair input detection unit to be normally closed;

3. The distributed coding backup recovery system according to claim 2, wherein when the repair input detection unit detects that the input data of the repair module is abnormal, the operation control unit controls the arbitration output detection unit to be turned on;

4. The distributed coding backup recovery system according to claim 3, wherein when the arbitration output detection unit detects that the output data of the distributed arbitration module is abnormal, the operation control unit controls the arbitration input detection unit to be turned on;

5. The distributed coding backup recovery system according to claim 4, wherein when the arbitration input detection unit detects that the input data of the distributed arbitration module is abnormal, the operation control unit controls the detection output detection unit to be turned on;

6. The distributed coding backup recovery system according to claim 5, further comprising an anomaly feedback unit, wherein the anomaly feedback unit is connected to the operation control unit.

7. A distributed coding backup recovery method, wherein the distributed coding backup recovery system as claimed in any one of claims 1 to 6 is used to perform distributed coding backup recovery.

8. A storage medium having a computer program stored thereon which, when executed, performs the distributed coding backup recovery method as claimed in claim 7.