CN116594809A - Distributed coding backup recovery system - Google Patents
Distributed coding backup recovery system
- Publication number
- CN116594809A
- Authority
- CN
- China
- Prior art keywords
- detection unit
- module
- arbitration
- repair
- distributed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1458—Management of the backup or restore process
- G06F11/1464—Management of the backup or restore process for networked environments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02W—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT
- Y02W90/00—Enabling technologies or technologies with a potential or indirect contribution to greenhouse gas [GHG] emissions mitigation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Hardware Redundancy (AREA)
Abstract
The invention relates to a distributed coding backup recovery system and method in the technical field of electric digital data processing. It monitors, in real time, anomalies in the execution process of a storage server composed of a failure detection module, a distributed arbitration module, a repair module and the like. Through the ordered, coordinated operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit, an anomaly can be located accurately; at the same time, the detection units are kept from running idle for long periods, which reduces the system operating space they occupy.
Description
Technical Field
The invention belongs to the technical field of electric digital data processing, and particularly relates to a distributed coding backup recovery system and method.
Background
In wide area network data storage systems, a wide variety of backup and archiving systems have been implemented at different levels. Most of them mainly consider failures such as disk failures and do not consider the impact of the data transmission link in a wide area network environment. A storage server designed to overcome this defect generally comprises a failure detection module, a distributed arbitration module, a repair module and the like. When the failure detection module finds that a storage server has failed, the repair module downloads, from other valid storage servers, copies of the same image files that the failed storage server stored, and stores them on the alternative storage server selected by the distributed arbitration module; the alternative storage server then completely replaces the failed storage server. In other words, the detection result of the failure detection module triggers the distributed arbitration module; once the distributed arbitration module has finished selecting the alternative storage server, it triggers the repair module; and the repair module stores the target image file attachment on the alternative storage server.
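The trigger chain described above (detection, then arbitration, then repair) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `StorageServer` class, all function names, and the "fewest stored images" arbitration policy are hypothetical and do not come from the patent, which does not specify how the alternative server is chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class StorageServer:
    # Hypothetical model of a storage server: a name, a liveness flag,
    # and the set of image files it currently holds.
    name: str
    alive: bool = True
    images: Set[str] = field(default_factory=set)


def detect_failed(servers: List[StorageServer]) -> List[StorageServer]:
    """Failure detection module: report every server that is down."""
    return [s for s in servers if not s.alive]


def arbitrate(servers: List[StorageServer]) -> Optional[StorageServer]:
    """Distributed arbitration module: select the alternative server.

    Assumed policy (not from the patent): the live server holding the
    fewest images, ties broken by name. Returns None if none is alive.
    """
    live = [s for s in servers if s.alive]
    return min(live, key=lambda s: (len(s.images), s.name), default=None)


def repair(failed: StorageServer, alternative: StorageServer,
           servers: List[StorageServer]) -> Set[str]:
    """Repair module: copy each image the failed server held to the
    alternative server, provided some valid server still has a copy."""
    restored: Set[str] = set()
    for image in sorted(failed.images):
        if any(s.alive and image in s.images for s in servers):
            alternative.images.add(image)
            restored.add(image)
    return restored


def recover(servers: List[StorageServer]) -> Dict[str, str]:
    """Run the chain and return {failed server name: alternative name}."""
    replacements: Dict[str, str] = {}
    for failed in detect_failed(servers):   # detection triggers arbitration
        alternative = arbitrate(servers)    # arbitration triggers repair
        if alternative is not None:
            repair(failed, alternative, servers)
            replacements[failed.name] = alternative.name
    return replacements
```

A caller would build a list of `StorageServer` objects, mark one as not alive, and call `recover` to see which live server takes over its image files.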
However, no scheme for monitoring execution anomalies has yet been designed for this execution process, and there is also no scheme for locating an anomaly when one occurs during execution.
Therefore, a distributed coding backup recovery system, a distributed coding backup recovery method and a storage medium are needed to solve the above problems.
Disclosure of Invention
The invention aims to provide a distributed coding backup recovery system, a distributed coding backup recovery method and a storage medium that solve the above technical problems in the prior art: they monitor anomalies in the execution process of a storage server composed of a failure detection module, a distributed arbitration module, a repair module and the like, and locate any anomaly that occurs.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
The distributed coding backup recovery system comprises a failure detection module, a distributed arbitration module, a repair module, a repair result verification unit, a detection output detection unit, an arbitration input detection unit, an arbitration output detection unit, a repair input detection unit and an operation control unit, wherein the failure detection module, the distributed arbitration module and the repair module transmit the related data in sequence;
the operation control unit is respectively connected with the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit;
the repair result verification unit is used for verifying whether the repair module successfully stores the target image file attachment in the alternative storage server;
the detection output detection unit is used for detecting whether the output data of the failure detection module is abnormal or not;
the arbitration input detection unit is used for detecting whether the input data of the distributed arbitration module is abnormal;
the arbitration output detection unit is used for detecting whether the output data of the distributed arbitration module is abnormal;
the repair input detection unit is used for detecting whether the input data of the repair module is abnormal or not;
the operation control unit is used for controlling the operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit.
Further, the operation control unit keeps the repair result verification unit always on (normally open), and keeps the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit off by default (normally closed);
when the repair result verification unit verifies that the repair module has not successfully stored the target image file attachment in the alternative storage server, the operation control unit starts the repair input detection unit;
and if the repair input detection unit detects that the input data of the repair module is not abnormal, the operation control unit determines that the repair module is faulty.
Further, when the repair input detection unit detects that the input data of the repair module is abnormal, the operation control unit controls the arbitration output detection unit to be started;
and if the arbitration output detection unit detects that the output data of the distributed arbitration module is not abnormal, the operation control unit judges that the data transmission between the distributed arbitration module and the repair module is faulty.
Further, when the arbitration output detection unit detects that the output data of the distributed arbitration module is abnormal, the operation control unit controls the arbitration input detection unit to be started;
and if the arbitration input detection unit detects that the input data of the distributed arbitration module is not abnormal, the operation control unit judges that the distributed arbitration module is faulty.
Further, when the arbitration input detection unit detects that the input data of the distributed arbitration module is abnormal, the operation control unit controls the detection output detection unit to be started;
if the detection output detection unit detects that the output data of the failure detection module is not abnormal, the operation control unit judges that the data transmission between the failure detection module and the distributed arbitration module is faulty; and if the detection output detection unit detects that the output data of the failure detection module is abnormal, the operation control unit judges that the failure detection module is faulty.
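The backward-walking localization described in the preceding paragraphs can be summarized as a short decision procedure. The sketch below is a minimal illustration, not the patented implementation: the `Fault` enum, the function name `locate_fault`, and the use of zero-argument callables to stand in for the verification and detection units are all assumptions made for the example. Each callable is invoked only when the cascade reaches it, which mirrors the idea of starting a detection unit on demand.

```python
from enum import Enum
from typing import Callable, Optional


class Fault(Enum):
    # Hypothetical labels for the five fault locations named in the text.
    REPAIR_MODULE = "repair module"
    ARBITRATION_TO_REPAIR_LINK = "transmission between arbitration and repair"
    ARBITRATION_MODULE = "distributed arbitration module"
    DETECTION_TO_ARBITRATION_LINK = "transmission between detection and arbitration"
    FAILURE_DETECTION_MODULE = "failure detection module"


Check = Callable[[], bool]


def locate_fault(
    repair_succeeded: Check,             # repair result verification unit
    repair_input_abnormal: Check,        # repair input detection unit
    arbitration_output_abnormal: Check,  # arbitration output detection unit
    arbitration_input_abnormal: Check,   # arbitration input detection unit
    detection_output_abnormal: Check,    # detection output detection unit
) -> Optional[Fault]:
    """Walk backwards from the repair result until a normal reading is found."""
    if repair_succeeded():
        return None                      # nothing to locate
    if not repair_input_abnormal():
        return Fault.REPAIR_MODULE
    if not arbitration_output_abnormal():
        return Fault.ARBITRATION_TO_REPAIR_LINK
    if not arbitration_input_abnormal():
        return Fault.ARBITRATION_MODULE
    if not detection_output_abnormal():
        return Fault.DETECTION_TO_ARBITRATION_LINK
    return Fault.FAILURE_DETECTION_MODULE
```

For instance, if the repair fails, the repair input is abnormal, but the arbitration output is normal, the procedure stops at the third check and reports the transmission between the arbitration and repair modules without ever invoking the remaining two units.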
Further, the system also comprises an anomaly feedback unit, which is connected with the operation control unit.
A distributed coding backup recovery method uses the distributed coding backup recovery system described above to perform distributed coding backup recovery.
A storage medium stores a computer program which, when executed, performs the distributed coding backup recovery method described above.
Compared with the prior art, the invention has the following beneficial effects:
One beneficial effect of the scheme is that anomalies are monitored in real time during the execution process of a storage server composed of a failure detection module, a distributed arbitration module, a repair module and the like; an anomaly can be located accurately through the ordered, coordinated operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit; and, at the same time, the detection units are kept from running idle for long periods, which reduces the system operating space they occupy.
Drawings
Fig. 1 is a schematic system configuration diagram of the embodiment.
Fig. 2 is a schematic diagram of the system operation principle of the embodiment.
Detailed Description
For the purpose of making the technical solution and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and examples. It should be understood that the particular embodiments described herein are illustrative only and are not intended to limit the invention, i.e., the embodiments described are merely some, but not all, of the embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention. It is noted that relational terms such as "first" and "second", and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article or apparatus that comprises the element.
As shown in fig. 1, a distributed coding backup recovery system is provided, which includes a failure detection module, a distributed arbitration module, a repair module, a repair result verification unit, a detection output detection unit, an arbitration input detection unit, an arbitration output detection unit, a repair input detection unit, and an operation control unit, wherein the failure detection module, the distributed arbitration module, and the repair module transmit the related data in sequence;
the operation control unit is respectively connected with the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit;
the repair result verification unit is used for verifying whether the repair module successfully stores the target image file attachment in the alternative storage server;
the detection output detection unit is used for detecting whether the output data of the failure detection module is abnormal or not;
the arbitration input detection unit is used for detecting whether the input data of the distributed arbitration module is abnormal;
the arbitration output detection unit is used for detecting whether the output data of the distributed arbitration module is abnormal;
the repair input detection unit is used for detecting whether the input data of the repair module is abnormal or not;
the operation control unit is used for controlling the operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit.
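One way to picture the relationship between the operation control unit and the units connected to it is as a small registry that switches units on and off. The sketch below is an assumption-laden illustration: the class names `DetectionUnit` and `OperationControlUnit`, the methods `connect`, `start`, `stop` and `running`, and the unit name strings are all hypothetical and are not defined in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DetectionUnit:
    # Hypothetical wrapper: `check` returns True when the monitored
    # data (or repair result) is abnormal.
    name: str
    check: Callable[[], bool]
    enabled: bool = False

    def run(self) -> bool:
        # A unit that has not been started must not be queried.
        if not self.enabled:
            raise RuntimeError(f"{self.name} has not been started")
        return self.check()


class OperationControlUnit:
    """Keeps track of the connected units and controls their operation."""

    def __init__(self) -> None:
        self._units: Dict[str, DetectionUnit] = {}

    def connect(self, unit: DetectionUnit) -> None:
        self._units[unit.name] = unit

    def start(self, name: str) -> DetectionUnit:
        unit = self._units[name]
        unit.enabled = True
        return unit

    def stop(self, name: str) -> None:
        self._units[name].enabled = False

    def running(self) -> List[str]:
        # Names of the units currently switched on, in sorted order.
        return sorted(n for n, u in self._units.items() if u.enabled)
```

In this sketch every unit is connected in the off state, so the controller decides explicitly which units consume resources at any moment.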
Further, as shown in fig. 2, the operation control unit keeps the repair result verification unit always on (normally open), and keeps the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit off by default (normally closed);
when the repair result verification unit verifies that the repair module has not successfully stored the target image file attachment in the alternative storage server, the operation control unit starts the repair input detection unit;
and if the repair input detection unit detects that the input data of the repair module is not abnormal, the operation control unit determines that the repair module is faulty.
Further, when the repair input detection unit detects that the input data of the repair module is abnormal, the operation control unit controls the arbitration output detection unit to be started;
and if the arbitration output detection unit detects that the output data of the distributed arbitration module is not abnormal, the operation control unit judges that the data transmission between the distributed arbitration module and the repair module is faulty.
Further, when the arbitration output detection unit detects that the output data of the distributed arbitration module is abnormal, the operation control unit controls the arbitration input detection unit to be started;
and if the arbitration input detection unit detects that the input data of the distributed arbitration module is not abnormal, the operation control unit judges that the distributed arbitration module is faulty.
Further, when the arbitration input detection unit detects that the input data of the distributed arbitration module is abnormal, the operation control unit controls the detection output detection unit to be started;
if the detection output detection unit detects that the output data of the failure detection module is not abnormal, the operation control unit judges that the data transmission between the failure detection module and the distributed arbitration module is faulty; and if the detection output detection unit detects that the output data of the failure detection module is abnormal, the operation control unit judges that the failure detection module is faulty.
In this scheme, anomalies are monitored in real time during the execution process of a storage server composed of a failure detection module, a distributed arbitration module, a repair module and the like; an anomaly can be located accurately through the ordered, coordinated operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit; and, at the same time, the detection units are kept from running idle for long periods, which reduces the system operating space they occupy.
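To make the resource argument concrete, the following sketch simulates a single injected fault and records which normally closed detection units end up being started. It rests on an assumption the patent does not state explicitly: an anomaly introduced at one stage is visible at every observation point downstream of it. The names `STAGES`, `PROBES`, `observe` and `localize` are hypothetical.

```python
from typing import List, Tuple

# Pipeline stages in data-flow order (modules and the links between them).
STAGES = [
    "failure detection module",
    "detection-to-arbitration transmission",
    "distributed arbitration module",
    "arbitration-to-repair transmission",
    "repair module",
]

# Normally closed detection units, in data-flow order. PROBES[i] observes
# the data right after STAGES[i].
PROBES = [
    "detection output detection unit",
    "arbitration input detection unit",
    "arbitration output detection unit",
    "repair input detection unit",
]


def observe(fault_index: int, probe_index: int) -> bool:
    """Assumed propagation model: a probe sees abnormal data if the
    faulty stage is at or before the point it observes."""
    return fault_index <= probe_index


def localize(fault_index: int) -> Tuple[str, List[str]]:
    """Start probes one at a time, walking upstream from the repair input,
    and stop at the first probe that reports normal data."""
    started: List[str] = []
    for probe_index in range(len(PROBES) - 1, -1, -1):
        started.append(PROBES[probe_index])
        if not observe(fault_index, probe_index):
            # Data was still normal here, so the stage just downstream is at fault.
            return STAGES[probe_index + 1], started
    return STAGES[0], started


if __name__ == "__main__":
    for i, stage in enumerate(STAGES):
        located, started = localize(i)
        print(f"{stage}: located={located!r}, units started={len(started)}")
```

Under this model a faulty repair module is located after starting a single detection unit, while only faults at the very start of the pipeline require all four.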
Further, the system also comprises an anomaly feedback unit, which is connected with the operation control unit and can provide corresponding anomaly feedback for each fault.
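The patent does not describe what form the feedback takes. As one possible reading, the sketch below maps each fault verdict to a message and hands it to a pluggable sink (a logger, an alerting hook, or similar). The fault keys, the message texts, and the `AnomalyFeedbackUnit` class are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical fault keys and messages; the patent only says that each
# fault receives "corresponding" feedback.
FEEDBACK_MESSAGES: Dict[str, str] = {
    "repair_module": "Repair module is faulty",
    "arbitration_to_repair_link": "Data transmission between arbitration and repair is faulty",
    "arbitration_module": "Distributed arbitration module is faulty",
    "detection_to_arbitration_link": "Data transmission between detection and arbitration is faulty",
    "failure_detection_module": "Failure detection module is faulty",
}


class AnomalyFeedbackUnit:
    """Receives a fault verdict from the operation control unit and
    forwards a message to whatever sink was supplied."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self._sink = sink

    def report(self, fault: str) -> str:
        message = FEEDBACK_MESSAGES.get(fault, f"Unclassified fault: {fault}")
        self._sink(message)
        return message
```

Passing `print` as the sink writes the feedback to standard output; passing a list's `append` method collects messages for later inspection.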
A distributed coding backup recovery method uses the distributed coding backup recovery system described above to perform distributed coding backup recovery.
A storage medium stores a computer program which, when executed, performs the distributed coding backup recovery method described above.
The above is a preferred embodiment of the present invention. All changes made according to the technical solution of the present invention fall within the protection scope of the present invention, provided that the resulting functional effects do not go beyond the scope of the technical solution.
Claims (8)
1. The distributed coding backup recovery system comprises a failure detection module, a distributed arbitration module and a repair module, wherein the failure detection module, the distributed arbitration module and the repair module sequentially execute related data transmission, and the distributed coding backup recovery system is characterized by further comprising a repair result verification unit, a detection output detection unit, an arbitration input detection unit, an arbitration output detection unit, a repair input detection unit and an operation control unit;
the operation control unit is respectively connected with the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit;
the repair result verification unit is used for verifying whether the repair module successfully stores the target image file attachment in the alternative storage server;
the detection output detection unit is used for detecting whether the output data of the failure detection module is abnormal or not;
the arbitration input detection unit is used for detecting whether the input data of the distributed arbitration module is abnormal;
the arbitration output detection unit is used for detecting whether the output data of the distributed arbitration module is abnormal;
the repair input detection unit is used for detecting whether the input data of the repair module is abnormal or not;
the operation control unit is used for controlling the operation of the repair result verification unit, the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit and the repair input detection unit.
2. The distributed coding backup recovery system according to claim 1, wherein the operation control unit controls the operation state of the repair result verification unit to be normally open, and controls the operation states of the detection output detection unit, the arbitration input detection unit, the arbitration output detection unit, and the repair input detection unit to be normally closed;
when the repair result verification unit verifies that the repair module does not successfully store the target image file attachment in the alternative storage server, the operation control unit controls the repair input detection unit to be started;
and if the repair input detection unit detects that the input data of the repair module is not abnormal, the operation control unit judges that the repair module is faulty.
3. The distributed coding backup recovery system according to claim 2, wherein when the repair input detection unit detects that the input data of the repair module is abnormal, the operation control unit controls the arbitration output detection unit to be started;
and if the arbitration output detection unit detects that the output data of the distributed arbitration module is not abnormal, the operation control unit judges that the data transmission between the distributed arbitration module and the repair module is faulty.
4. The distributed coding backup recovery system according to claim 3, wherein when the arbitration output detection unit detects that the output data of the distributed arbitration module is abnormal, the operation control unit controls the arbitration input detection unit to be started;
and if the arbitration input detection unit detects that the input data of the distributed arbitration module is not abnormal, the operation control unit judges that the distributed arbitration module is faulty.
5. The distributed coding backup recovery system according to claim 4, wherein when the arbitration input detection unit detects that the input data of the distributed arbitration module is abnormal, the operation control unit controls the detection output detection unit to be started;
if the detection output detection unit detects that the output data of the failure detection module is not abnormal, the operation control unit judges that the data transmission between the failure detection module and the distributed arbitration module is faulty; and if the detection output detection unit detects that the output data of the failure detection module is abnormal, the operation control unit judges that the failure detection module is faulty.
6. The distributed coding backup recovery system according to claim 5, further comprising an anomaly feedback unit, wherein the anomaly feedback unit is connected to the operation control unit.
7. A distributed coding backup recovery method, wherein a distributed coding backup recovery system as claimed in any one of claims 1 to 6 is used for distributed coding backup recovery.
8. A storage medium having a computer program stored thereon, which when executed performs the distributed coding backup recovery method as claimed in claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310491812.XA CN116594809A (en) | 2023-04-28 | 2023-04-28 | Distributed coding backup recovery system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310491812.XA CN116594809A (en) | 2023-04-28 | 2023-04-28 | Distributed coding backup recovery system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116594809A (en) | 2023-08-15 |
Family
ID=87600009
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310491812.XA Pending CN116594809A (en) | 2023-04-28 | 2023-04-28 | Distributed coding backup recovery system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116594809A (en) |
- 2023-04-28: CN application CN202310491812.XA, published as CN116594809A, status Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||