CN110377459A - A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster - Google Patents


Info

Publication number
CN110377459A
Authority
CN
China
Prior art keywords
cluster
abnormal state
backup
business data
disaster tolerance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910579657.0A
Other languages
Chinese (zh)
Inventor
轩艳东
马豹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Wave Intelligent Technology Co Ltd
Original Assignee
Suzhou Wave Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Wave Intelligent Technology Co Ltd
Priority to CN201910579657.0A
Publication of CN110377459A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/2033 Failover techniques switching over of hardware resources

Abstract

The application provides a disaster tolerance system, a disaster tolerance processing method, a monitoring node and a backup cluster. The system comprises a working cluster, a backup cluster and a monitoring node. The monitoring node is used to configure the working cluster and the backup cluster corresponding to the working cluster, and also to monitor the state of the working cluster and of the backup cluster. The working cluster and the backup cluster share a public storage area, and the monitoring node is located outside both the working cluster and the backup cluster. With this technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.

Description

A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster
Technical field
The present invention relates to the field of distributed computing, and in particular to a disaster tolerance system, a disaster tolerance processing method, a monitoring node and a backup cluster.
Background art
Traditional disaster tolerance for distributed cloud computing only covers disaster tolerance and backup of computing resources between the nodes inside a single distributed cluster. If the whole distributed cluster goes down or suffers a link failure, the resources inside that cluster become inaccessible, which causes severe business disruption and may even leave business data lost and unrecoverable. A disaster tolerance solution that works between clusters is therefore needed.
Summary of the invention
The technical problem to be solved by the application is to provide a disaster tolerance system, a disaster tolerance processing method, a monitoring node and a backup cluster that can carry out disaster tolerance processing between distributed clusters.
In order to solve the above technical problem, the application provides a disaster tolerance system, the system comprising: a working cluster, a backup cluster and a monitoring node;
The monitoring node is used to configure the working cluster and the backup cluster corresponding to the working cluster, and also to monitor the state of the working cluster and of the backup cluster;
Wherein the working cluster and the backup cluster share a public storage area;
The monitoring node is located outside the working cluster and the backup cluster.
Optionally,
The monitoring node is also used, when the state of the working cluster is abnormal and the state of the backup cluster corresponding to the working cluster is normal, to mount the business data that the abnormal-state working cluster stores in the public storage area onto the backup cluster corresponding to the abnormal-state cluster, and to synchronize the metadata of the abnormal-state cluster to the backup cluster;
The backup cluster is also used, after receiving the metadata sent by the monitoring node, to start the computing resources of the abnormal-state cluster according to the metadata and the business data of the abnormal-state cluster that it has read.
The application also provides a disaster tolerance processing method, applied to the aforementioned disaster tolerance system, the method comprising:
When the monitoring node detects an abnormal-state cluster among the working clusters and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
Mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster, so that the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
Optionally, mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster comprises:
Obtaining the business data volume identifier of the abnormal-state cluster from the metadata;
Looking up the business data of the abnormal-state cluster in the public storage area according to the business data volume identifier;
Mounting the business data of the abnormal-state cluster that has been found onto the backup cluster.
Optionally, the method further comprises:
Configuring monitoring parameter information.
The application also provides a disaster tolerance processing method, applied to the aforementioned disaster tolerance system, the method comprising:
The backup cluster receiving the metadata of the abnormal-state cluster sent by the monitoring node;
After receiving a data mount success notification, accessing the business data of the abnormal-state cluster in the public storage area;
Starting the computing resources of the abnormal-state cluster according to the metadata and the business data.
The application also provides a monitoring node, comprising: a memory and a processor;
The memory is used to store a program for disaster tolerance processing;
The processor is used to read and execute the program for disaster tolerance processing and to perform the following operations:
When an abnormal-state cluster is detected among the working clusters and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
Mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster, so that the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
Optionally, mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster comprises:
Obtaining the business data volume identifier of the abnormal-state cluster from the metadata;
Looking up the business data of the abnormal-state cluster in the public storage area according to the business data volume identifier;
Mounting the business data of the abnormal-state cluster that has been found onto the backup cluster.
Optionally, the processor, which reads and executes the program for disaster tolerance processing, also performs the following operation:
Configuring monitoring parameter information.
The application also provides a backup cluster, comprising: a memory and a processor;
The memory is used to store a program for disaster tolerance processing;
The processor is used to read and execute the program for disaster tolerance processing and to perform the following operations:
Receiving the metadata of the abnormal-state cluster sent by the monitoring node;
After receiving a data mount success notification, accessing the business data of the abnormal-state cluster in the public storage area;
Starting the computing resources of the abnormal-state cluster according to the metadata and the business data.
The application provides a disaster tolerance system, a disaster tolerance processing method, a monitoring node and a backup cluster. The system comprises a working cluster, a backup cluster and a monitoring node. The monitoring node is used to configure the working cluster and the backup cluster corresponding to the working cluster, and also to monitor the state of the working cluster and of the backup cluster. The working cluster and the backup cluster share a public storage area, and the monitoring node is located outside the working cluster and the backup cluster. With this technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.
Brief description of the drawings
The drawings are provided to aid understanding of the technical solution of the application and constitute part of the specification. Together with the embodiments of the application, they serve to explain the technical solution of the application and do not limit it.
Fig. 1 is a schematic diagram of the disaster tolerance system of Embodiment One of the present invention;
Fig. 2 is a flow diagram of the disaster tolerance processing method of Embodiment One of the present invention;
Fig. 3 is another flow diagram of the disaster tolerance processing method of Embodiment One of the present invention;
Fig. 4 is a schematic diagram of the monitoring node of Embodiment One of the present invention;
Fig. 5 is a schematic diagram of the backup cluster of Embodiment One of the present invention;
Fig. 6 is another flow diagram of the disaster tolerance processing method of Embodiment One of the present invention;
Fig. 7 is another schematic diagram of the disaster tolerance system of Embodiment One of the present invention;
Fig. 8 is a schematic diagram of the system after the disaster tolerance processing of Embodiment One of the present invention.
Detailed description of the embodiments
This application describes multiple embodiments, but the description is exemplary rather than restrictive, and it will be apparent to those of ordinary skill in the art that more embodiments and implementations are possible within the scope of the embodiments described herein. Although many possible combinations of features are shown in the drawings and discussed in the detailed description, many other combinations of the disclosed features are also possible. Unless specifically restricted, any feature or element of any embodiment may be used in combination with, or may substitute for, any other feature or element of any other embodiment.
The application includes and contemplates combinations with features and elements known to those of ordinary skill in the art. The embodiments, features and elements already disclosed in the application may also be combined with any conventional feature or element to form a unique inventive solution defined by the claims. Any feature or element of any embodiment may also be combined with features or elements from other inventive solutions to form another unique inventive solution defined by the claims. It should therefore be understood that any feature shown and/or discussed in this application may be implemented individually or in any suitable combination. Accordingly, the embodiments are not limited except by the appended claims and their equivalents. In addition, various modifications and changes may be made within the protection scope of the appended claims.
In addition, when describing representative embodiments, the specification may present a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not depend on the particular order of steps described herein, the method or process should not be limited to that order. As those of ordinary skill in the art will appreciate, other orders of steps are also possible. The particular order of steps described in the specification should therefore not be construed as a limitation on the claims. Furthermore, claims directed to the method and/or process should not be limited to performing the steps in the order written; those skilled in the art will readily understand that the order may vary while still remaining within the spirit and scope of the embodiments of the application.
Embodiment one
As shown in Fig. 1, this embodiment provides a disaster tolerance system, the system comprising: a working cluster 1, a backup cluster 2 and a monitoring node 3;
The monitoring node 3 is used to configure the working cluster and the backup cluster corresponding to the working cluster, and also to monitor the state of the working cluster and of the backup cluster;
Wherein the working cluster 1 and the backup cluster 2 share a public storage area;
The monitoring node 3 is located outside the working cluster and the backup cluster.
Optionally,
The monitoring node 3 may also be used, when the state of the working cluster is abnormal and the state of the backup cluster corresponding to the working cluster is normal, to mount the business data that the abnormal-state working cluster stores in the public storage area onto the backup cluster corresponding to the abnormal-state cluster, and to synchronize the metadata of the abnormal-state cluster to the backup cluster;
The backup cluster 2 may also be used, after receiving the metadata sent by the monitoring node, to start the computing resources of the abnormal-state cluster according to the metadata and the business data of the abnormal-state cluster that it has read.
With the above technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.
As shown in Fig. 2, this embodiment also provides a disaster tolerance processing method, applied to the aforementioned disaster tolerance system, the method comprising:
Step S101: when the monitoring node detects an abnormal-state cluster among the working clusters and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
Step S103: mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster, so that the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
Optionally, mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster may comprise:
Obtaining the business data volume identifier of the abnormal-state cluster from the metadata;
Looking up the business data of the abnormal-state cluster in the public storage area according to the business data volume identifier;
Mounting the business data of the abnormal-state cluster that has been found onto the backup cluster.
Optionally, the method may further comprise:
Configuring monitoring parameter information.
With the above technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.
As shown in Fig. 3, this embodiment also provides a disaster tolerance processing method, applied to the aforementioned disaster tolerance system, the method comprising:
Step S102: the backup cluster receives the metadata of the abnormal-state cluster sent by the monitoring node;
Step S104: after receiving a data mount success notification, the backup cluster accesses the business data of the abnormal-state cluster in the public storage area;
Step S106: the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
With the above technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.
As shown in Fig. 4, this embodiment also provides a monitoring node, comprising: a memory 10 and a processor 11;
The memory 10 is used to store a program for disaster tolerance processing;
The processor 11 is used to read and execute the program for disaster tolerance processing and to perform the following operations:
When an abnormal-state cluster is detected among the working clusters and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
Mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster, so that the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
Optionally, mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster may comprise:
Obtaining the business data volume identifier of the abnormal-state cluster from the metadata;
Looking up the business data of the abnormal-state cluster in the public storage area according to the business data volume identifier;
Mounting the business data of the abnormal-state cluster that has been found onto the backup cluster.
Optionally, the processor 11, which reads and executes the program for disaster tolerance processing, may also perform the following operation:
Configuring monitoring parameter information.
With the above technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.
As shown in Fig. 5, this embodiment also provides a backup cluster, comprising: a memory 20 and a processor 21;
The memory 20 is used to store a program for disaster tolerance processing;
The processor 21 is used to read and execute the program for disaster tolerance processing and to perform the following operations:
Receiving the metadata of the abnormal-state cluster sent by the monitoring node;
After receiving a data mount success notification, accessing the business data of the abnormal-state cluster in the public storage area;
Starting the computing resources of the abnormal-state cluster according to the metadata and the business data.
With the above technical solution, the computing resources of a cluster in an abnormal state can be transferred to the backup cluster and run there, which realizes disaster tolerance processing between clusters and thereby ensures the continuity and availability of the business.
The disaster tolerance processing method of the application is further illustrated below.
As shown in Fig. 6, the disaster tolerance processing method of this embodiment may comprise:
Step S201: the monitoring node configures the production clusters, the backup cluster corresponding to each production cluster, and the monitoring parameter information;
In this embodiment, the monitoring node can be determined according to the network topology. The monitoring node is located outside the working cluster and the backup cluster; that is, the monitoring node belongs neither to the production cluster nor to the backup cluster.
After the monitoring node has been determined, it can configure the production clusters, that is, which clusters need to be monitored, and then configure the corresponding backup cluster for each production cluster. A given cluster can serve both as a production cluster and as the backup cluster of other clusters. For example, the monitoring node can configure the production clusters to include cluster A and cluster B, with cluster B as the backup cluster of cluster A and cluster A as the backup cluster of cluster B; alternatively, it can configure cluster C as the backup cluster of cluster A and cluster D as the backup cluster of cluster B. The monitoring node can decide this according to the resource distribution and/or running state of each cluster in the distributed system.
The monitoring parameter information may include a monitoring heartbeat, that is, the interval at which the health status of the production clusters is checked. The monitoring parameter information may also include the number of retries when the link used to access a cluster fails.
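The configuration described in step S201 can be pictured as a small data structure. The following Python sketch is illustrative only: the class names, field names and default values (`MonitoringParams`, `heartbeat_interval_s`, `link_retry_count`, `backup_of`) are assumptions made for this example and do not appear in the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MonitoringParams:
    """Monitoring parameter information (names and defaults are hypothetical)."""
    heartbeat_interval_s: float = 5.0  # how often cluster health is checked
    link_retry_count: int = 3          # retries when the link to a cluster fails


@dataclass
class DisasterToleranceConfig:
    """Production clusters, their backup clusters and the monitoring parameters."""
    params: MonitoringParams = field(default_factory=MonitoringParams)
    backup_of: dict = field(default_factory=dict)  # production cluster -> backup cluster

    def add_pair(self, production: str, backup: str) -> None:
        if production == backup:
            raise ValueError("a cluster cannot be its own backup cluster")
        self.backup_of[production] = backup

    def monitored_clusters(self) -> set:
        # Both the production clusters and their backup clusters are monitored.
        return set(self.backup_of) | set(self.backup_of.values())


# Mutual backup, as in the first example above: B backs up A, and A backs up B.
mutual = DisasterToleranceConfig()
mutual.add_pair("A", "B")
mutual.add_pair("B", "A")
```

Keeping the pairing as a plain mapping lets one cluster appear both as a key (production cluster) and as a value (backup cluster of another cluster), which matches the statement that a cluster can play both roles.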
Step S202: monitoring the state of the production cluster and of the backup cluster corresponding to the production cluster;
Two services can be provided on the monitoring node: network monitoring and health status monitoring. Network monitoring monitors the link state, and health status monitoring monitors the running state of the cluster.
In this embodiment, the monitoring node can monitor the health status of the production cluster and of the corresponding backup cluster in real time. Assume that the production cluster is cluster A and that the backup cluster of cluster A is cluster B. The monitoring node then monitors the health status of cluster A and cluster B at the configured monitoring heartbeat. When cluster A is inaccessible or suffers some other failure, the state of cluster A is considered abnormal, and the monitoring node prepares for a disaster tolerance migration. If cluster B is in a good state, the migration can be executed.
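A minimal Python sketch of one monitoring round follows. It assumes that a cluster is probed through a caller-supplied function and that a link failure surfaces as an `OSError`; the function names and the "wait" outcome for the case where the backup cluster is also unhealthy are assumptions of this example, since the text only states when a migration can be executed.

```python
def probe_with_retries(probe, retries):
    """Call probe() up to 1 + retries times; return True on the first success."""
    for _ in range(retries + 1):
        try:
            if probe():
                return True
        except OSError:
            continue  # link failure: use up one attempt and try again
    return False


def decide_action(production_ok, backup_ok):
    """Map the two monitored states to an action for the monitoring node."""
    if production_ok:
        return "none"      # production cluster is healthy, nothing to do
    if backup_ok:
        return "migrate"   # abnormal production cluster, healthy backup cluster
    return "wait"          # assumption: no migration target is available yet
```

In a real deployment `probe` would wrap a network or health check and the round would be repeated at the configured heartbeat interval; both are left out here to keep the sketch self-contained.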
In this embodiment, the monitoring node can judge the health of a cluster according to the running state of all nodes in the cluster, according to the running state of some of the nodes, or according to the running state of the core nodes of the cluster.
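The three ways of judging cluster health can be sketched as a policy switch. The Python below is a sketch under stated assumptions: the enum and function names are hypothetical, and the rule that a cluster is healthy only when every selected node is running is an assumption of this example, because the text does not say how node states are combined.

```python
from enum import Enum


class HealthPolicy(Enum):
    ALL_NODES = "all"        # judge by every node in the cluster
    SAMPLE_NODES = "sample"  # judge by a chosen subset of nodes
    CORE_NODES = "core"      # judge by the cluster's core nodes


def cluster_is_healthy(node_up, policy, core_nodes=frozenset(), sample_nodes=frozenset()):
    """Return True if the cluster counts as healthy under the given policy.

    node_up maps a node name to True (running) or False (down or unreachable).
    """
    if policy is HealthPolicy.ALL_NODES:
        selected = set(node_up)
    elif policy is HealthPolicy.SAMPLE_NODES:
        selected = set(sample_nodes)
    else:
        selected = set(core_nodes)
    if not selected:
        return False  # nothing to judge by: treat the cluster as abnormal
    # A node that is missing from node_up is treated as unreachable.
    return all(node_up.get(name, False) for name in selected)
```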
In addition, if the state of cluster A is abnormal, then when carrying out the disaster tolerance migration the monitoring node can migrate the computing resources of all nodes of cluster A to the backup cluster, or it can migrate only the computing resources on nodes selected for priority protection.
Step S203: when an abnormal-state cluster is detected and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
Step S204: mounting the business data of the abnormal-state cluster in the public storage area onto the backup cluster;
After the monitoring node has obtained the metadata of the abnormal-state cluster, it can obtain the business data volume identifier of the abnormal-state cluster from the metadata, look up the business data of the abnormal-state cluster in the public storage area according to the business data volume identifier, and then mount the business data that has been found onto the backup cluster.
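The three sub-steps of step S204 (read the volume identifiers from the metadata, look the volumes up in the public storage area, mount them on the backup cluster) can be sketched as follows. This Python sketch models the public storage area and the backup cluster's mount table as dictionaries; the `"volume_ids"` metadata key and the function name are hypothetical, and a real system would attach SAN volumes rather than store object references.

```python
def mount_business_data(metadata, shared_storage, backup_mounts):
    """Mount the abnormal-state cluster's data volumes onto the backup cluster.

    metadata:       dict holding a hypothetical "volume_ids" list
    shared_storage: dict mapping volume id -> volume in the public storage area
    backup_mounts:  dict recording what the backup cluster currently has mounted
    Returns the list of volume ids that were mounted.
    """
    volume_ids = metadata["volume_ids"]  # sub-step 1: identifiers from metadata
    missing = [vid for vid in volume_ids if vid not in shared_storage]
    if missing:
        # Fail before mounting anything so the backup cluster is never half set up.
        raise LookupError(f"volumes not found in public storage area: {missing}")
    for vid in volume_ids:
        # Sub-steps 2 and 3: look the volume up and mount it by reference, not by copy.
        backup_mounts[vid] = shared_storage[vid]
    return list(volume_ids)
```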
Step S205: the backup cluster accesses the business data of the abnormal-state cluster in the public storage area;
In this embodiment, after mounting the business data of the abnormal-state cluster in the public storage area onto the backup cluster, the monitoring node can send a data mount success notification to the backup cluster, to inform the backup cluster that the business data of another cluster has been mounted onto it. After receiving the data mount success notification, the backup cluster can access the business data of the abnormal-state cluster in the public storage area.
Step S206: the backup cluster starts the computing resources of the abnormal-state cluster;
The backup cluster can start the computing resources of the abnormal-state cluster according to the metadata and the business data it has accessed, thereby ensuring the continuity of the business of the abnormal-state cluster.
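Steps S203 to S206 can be put together in a short Python sketch. It is only an in-memory model under stated assumptions: the class `BackupCluster`, the function `fail_over`, and the metadata layout (a `"resources"` list whose entries carry a `"name"` and a `"volume_id"`) are hypothetical, and "starting" a computing resource is reduced to recording its name.

```python
class BackupCluster:
    """In-memory stand-in for a backup cluster (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.metadata = None
        self.mounted = {}
        self.running = []

    def receive_metadata(self, metadata):
        # Receiving side of S203: keep the abnormal-state cluster's metadata.
        self.metadata = metadata

    def on_mount_success(self, volumes):
        # S205: access the mounted business data; S206: start the resources.
        if self.metadata is None:
            raise RuntimeError("metadata must arrive before the mount notification")
        self.mounted.update(volumes)
        for resource in self.metadata["resources"]:
            if resource["volume_id"] in self.mounted:
                self.running.append(resource["name"])


def fail_over(production_ok, backup_ok, metadata, shared_storage, backup):
    """Run S203 and S204 from the monitoring node; return True if a migration ran."""
    if production_ok or not backup_ok:
        return False
    backup.receive_metadata(metadata)                        # S203
    volumes = {r["volume_id"]: shared_storage[r["volume_id"]]
               for r in metadata["resources"]}               # S204: mount by reference
    backup.on_mount_success(volumes)                         # mount success notification
    return True
```

The ordering check in `on_mount_success` reflects the sequence in the text: the metadata is synchronized first, and the backup cluster acts only once it has been notified that the mount succeeded.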
In this embodiment, after the disaster tolerance processing is complete, the monitoring node can update the production clusters and their corresponding backup clusters, and can also update the monitoring parameter information.
A specific example is given below for further illustration.
As shown in Fig. 7, in this scenario cluster A serves as the production cluster (i.e. the working cluster) and cluster B serves as the backup cluster of cluster A.
The storage pool (Storage Pool) serves as the public SAN (Storage Area Network) storage of cluster A and cluster B, and cluster A and cluster B both maintain continuous links to the SAN storage. Computing resources (for example, virtual machines) are created on cluster A, and the data corresponding to the computing resources is stored in the Storage Pool.
The monitoring node monitors the health status of cluster A and cluster B. If cluster A fails, the monitoring node initiates the migration of the computing resources: the storage resources corresponding to the computing resources, such as virtual machines, are mounted onto cluster B, and the computing resources are started on cluster B, thereby realizing disaster tolerance and high availability of the computing resources.
For example, virtual machine 1, virtual machine 2 and virtual machine 3 run on cluster A. Assume that the business data of virtual machine 1 is stored on volume 1 of the storage pool, the business data of virtual machine 2 on volume 2, and the business data of virtual machine 3 on volume 3. As shown in Fig. 8, after the monitoring node has mounted the storage resources of cluster A onto cluster B, cluster B can access the business data of virtual machine 1 on volume 1, of virtual machine 2 on volume 2 and of virtual machine 3 on volume 3, and then starts virtual machine 1, virtual machine 2 and virtual machine 3 of cluster A according to the accessed business data and the metadata of cluster A.
With the above technical solution, the workload of an abnormal-state cluster can be migrated to the backup cluster and run there, which ensures the continuity and availability of the business of the abnormal-state cluster. Moreover, during the migration only the metadata of the abnormal-state cluster (such as configuration information) is synchronized to the backup cluster. The business data of the abnormal-state cluster does not need to be synchronized; instead, the business data of the abnormal-state cluster in the public storage area is mounted directly onto the backup cluster, so that the computing resources of the abnormal-state cluster can be started on the backup cluster. The amount of data copied during disaster tolerance processing is therefore greatly reduced, which avoids the long disaster tolerance process that copying all of the data would entail.
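The saving in copied data can be made concrete with a small calculation. The Python sketch below compares the bytes that would have to be copied when only metadata is synchronized with the bytes copied if the business data were replicated as well. The volume sizes (10 GiB each), the metadata layout and all names are hypothetical values chosen for illustration; the application gives no figures.

```python
import json

GIB = 2 ** 30


def bytes_to_copy(metadata, volume_sizes, replicate_data):
    """Bytes transferred to the backup cluster for one disaster tolerance migration."""
    meta_bytes = len(json.dumps(metadata).encode("utf-8"))
    if not replicate_data:
        return meta_bytes  # volumes are mounted from the public storage area, not copied
    data_bytes = sum(volume_sizes[r["volume_id"]] for r in metadata["resources"])
    return meta_bytes + data_bytes


# Three virtual machines with one volume each, as in the example above.
example_metadata = {"resources": [
    {"name": "vm-1", "volume_id": "vol-1"},
    {"name": "vm-2", "volume_id": "vol-2"},
    {"name": "vm-3", "volume_id": "vol-3"},
]}
example_sizes = {"vol-1": 10 * GIB, "vol-2": 10 * GIB, "vol-3": 10 * GIB}

metadata_only = bytes_to_copy(example_metadata, example_sizes, replicate_data=False)
full_copy = bytes_to_copy(example_metadata, example_sizes, replicate_data=True)
```

Under these assumed sizes the metadata-only path moves a few hundred bytes, while full replication would move the same metadata plus 30 GiB of volume data.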
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and devices, may be implemented as software, firmware, hardware, or an appropriate combination thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or a microprocessor, or implemented as hardware, or implemented as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Furthermore, as is known to those of ordinary skill in the art, communication media typically contain computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery medium.

Claims (10)

1. A disaster tolerance system, characterized in that the system comprises: a working cluster, a backup cluster, and a monitoring node;
the monitoring node is configured to configure the working cluster and the backup cluster corresponding to the working cluster, and is further configured to monitor the states of the working cluster and the backup cluster;
wherein the working cluster and the backup cluster share a public storage area;
the monitoring node is located outside the working cluster and the backup cluster.
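The three components recited in claim 1 and their relationship can be modelled roughly as follows. This is a minimal sketch under stated assumptions: the class names, fields, and the two-value state enumeration are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Tuple


class ClusterState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


@dataclass
class PublicStorageArea:
    # Hypothetical layout: volume identifier -> business data.
    volumes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Cluster:
    name: str
    storage: PublicStorageArea            # shared by a working/backup pair
    state: ClusterState = ClusterState.NORMAL


@dataclass
class MonitoringNode:
    # The node lives outside both clusters and only holds references to them.
    pairs: Dict[str, Tuple[Cluster, Cluster]] = field(default_factory=dict)

    def configure(self, working: Cluster, backup: Cluster) -> None:
        """Register a working cluster together with its backup cluster."""
        if working.storage is not backup.storage:
            raise ValueError("working and backup cluster must share one public storage area")
        self.pairs[working.name] = (working, backup)

    def states(self) -> Dict[str, Tuple[ClusterState, ClusterState]]:
        """Report the current (working, backup) state of every configured pair."""
        return {name: (w.state, b.state) for name, (w, b) in self.pairs.items()}
```

The identity check in `configure` is one way to express the shared public storage area; a real system would more likely compare storage endpoints or pool identifiers.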
2. The disaster tolerance system according to claim 1, characterized in that:
the monitoring node is further configured to, when a working cluster is in an abnormal state and the backup cluster corresponding to that working cluster is in a normal state, mount the business data that the abnormal-state working cluster stores in the public storage area onto the backup cluster corresponding to the abnormal-state cluster, and synchronize the metadata of the abnormal-state cluster to the backup cluster;
the backup cluster is further configured to, after receiving the metadata sent by the monitoring node, start the computing resources of the abnormal-state cluster according to the metadata and the business data read for the abnormal-state cluster.
3. A disaster tolerance processing method, applied to the disaster tolerance system according to claim 1, characterized in that the method comprises:
when the monitoring node detects an abnormal-state cluster among the working clusters and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster, so that the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
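The decision logic of claim 3 on the monitoring-node side can be sketched as below. The function name, the string state values, and the two injected callbacks are hypothetical; the order (metadata first, then mount) simply follows the order in which the claim lists the steps and is not stated to be mandatory.

```python
from typing import Callable


def handle_abnormal_cluster(
    working_state: str,
    backup_state: str,
    sync_metadata: Callable[[], None],
    mount_business_data: Callable[[], None],
) -> bool:
    """Run the failover steps only when both preconditions of the claim hold.

    Returns True if failover was triggered, False otherwise.
    """
    if working_state != "abnormal":
        return False          # nothing to do: the working cluster is healthy
    if backup_state != "normal":
        return False          # the backup cluster cannot take over right now
    sync_metadata()           # copy only the metadata to the backup cluster
    mount_business_data()     # attach the shared business data, no copy
    return True
```

Passing the two steps in as callbacks keeps the precondition check separate from how synchronization and mounting are actually performed.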
4. The disaster tolerance processing method according to claim 3, characterized in that mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster comprises:
obtaining the business data volume identifier of the abnormal-state cluster according to the metadata;
searching the public storage area for the business data of the abnormal-state cluster according to the business data volume identifier;
mounting the business data of the abnormal-state cluster that was found onto the backup cluster.
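The three sub-steps of claim 4 can be sketched as follows. Everything here is an assumption made for illustration: the public storage area is modelled as a plain dictionary keyed by volume identifier, the metadata key `business_volume_id` is a hypothetical name, and "mounting" is modelled as storing a reference rather than a copy.

```python
def mount_business_data(metadata: dict, public_storage: dict, backup_mounts: dict) -> str:
    """Mount the abnormal-state cluster's business data onto the backup cluster.

    Returns the identifier of the volume that was mounted.
    """
    # Step 1: obtain the business data volume identifier from the metadata.
    volume_id = metadata["business_volume_id"]

    # Step 2: look the volume up in the public storage area.
    if volume_id not in public_storage:
        raise KeyError(f"volume {volume_id!r} not found in public storage area")
    volume = public_storage[volume_id]

    # Step 3: mount it on the backup cluster. Only a reference is stored,
    # so the business data itself is never copied.
    backup_mounts[volume_id] = volume
    return volume_id
```

For example, `mount_business_data({"business_volume_id": "vol-1"}, {"vol-1": data}, mounts)` leaves `mounts["vol-1"]` pointing at the same object as `data`.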
5. The disaster tolerance processing method according to claim 3 or 4, characterized in that the method further comprises:
configuring monitoring parameter information.
6. A disaster tolerance processing method, applied to the disaster tolerance system according to claim 1, characterized in that the method comprises:
receiving, by the backup cluster, the metadata of the abnormal-state cluster sent by the monitoring node;
after receiving a notification that the data has been mounted successfully, accessing the business data of the abnormal-state cluster in the public storage area;
starting the computing resources of the abnormal-state cluster according to the metadata and the business data.
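The backup-cluster side of claim 6 can be sketched as a small class. The class and method names, the `services` metadata key, and the dictionary model of the public storage area are all hypothetical; "starting computing resources" is reduced here to recording which services were started against which data.

```python
class BackupCluster:
    def __init__(self, public_storage: dict):
        self.public_storage = public_storage  # shared with the working cluster
        self.metadata = None
        self.started = {}                     # service name -> business data in use

    def receive_metadata(self, metadata: dict) -> None:
        """Store the abnormal-state cluster's metadata sent by the monitoring node."""
        self.metadata = dict(metadata)

    def on_mount_success(self, volume_id: str) -> list:
        """Handle the mount-success notification and start computing resources."""
        if self.metadata is None:
            raise RuntimeError("metadata must be received before the mount notification")
        # Access the business data in the public storage area (no copy is made).
        business_data = self.public_storage[volume_id]
        # Start each service named in the metadata against that data.
        for service in self.metadata["services"]:
            self.started[service] = business_data
        return sorted(self.started)
```

Rejecting a mount notification that arrives before the metadata is a design choice of this sketch; the claim only lists the steps and does not say how out-of-order messages are handled.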
7. A monitoring node, comprising: a memory and a processor; characterized in that:
the memory is configured to store a program for disaster tolerance processing;
the processor is configured to read and execute the program for disaster tolerance processing so as to perform the following operations:
when the monitoring node detects an abnormal-state cluster among the working clusters and the state of the backup cluster corresponding to the abnormal-state cluster is normal, synchronizing the acquired metadata of the abnormal-state cluster to the backup cluster corresponding to the abnormal-state cluster;
mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster, so that the backup cluster starts the computing resources of the abnormal-state cluster according to the metadata and the business data.
8. The monitoring node according to claim 7, characterized in that mounting the business data that the abnormal-state cluster stores in the public storage area onto the backup cluster comprises:
obtaining the business data volume identifier of the abnormal-state cluster according to the metadata;
searching the public storage area for the business data of the abnormal-state cluster according to the business data volume identifier;
mounting the business data of the abnormal-state cluster that was found onto the backup cluster.
9. The monitoring node according to claim 7 or 8, characterized in that the processor is configured to read and execute the program for disaster tolerance processing so as to further perform the following operation:
configuring monitoring parameter information.
10. A backup cluster, comprising: a memory and a processor; characterized in that:
the memory is configured to store a program for disaster tolerance processing;
the processor is configured to read and execute the program for disaster tolerance processing so as to perform the following operations:
receiving the metadata of the abnormal-state cluster sent by the monitoring node;
after receiving a notification that the data has been mounted successfully, accessing the business data of the abnormal-state cluster in the public storage area;
starting the computing resources of the abnormal-state cluster according to the metadata and the business data.
CN201910579657.0A 2019-06-28 2019-06-28 A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster Withdrawn CN110377459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910579657.0A CN110377459A (en) 2019-06-28 2019-06-28 A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910579657.0A CN110377459A (en) 2019-06-28 2019-06-28 A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster

Publications (1)

Publication Number Publication Date
CN110377459A true CN110377459A (en) 2019-10-25

Family

ID=68251309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910579657.0A Withdrawn CN110377459A (en) 2019-06-28 2019-06-28 A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster

Country Status (1)

Country Link
CN (1) CN110377459A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111211930A (en) * 2019-12-31 2020-05-29 杭州趣链科技有限公司 Block chain service disaster-tolerant backup containerized deployment method
CN111211930B (en) * 2019-12-31 2022-08-26 杭州趣链科技有限公司 Block chain service disaster-tolerant backup containerized deployment method
CN111966467B (en) * 2020-08-21 2022-07-29 苏州浪潮智能科技有限公司 Method and device for disaster recovery based on kubernetes container platform
CN111966467A (en) * 2020-08-21 2020-11-20 苏州浪潮智能科技有限公司 Method and device for disaster recovery based on kubernetes container platform
CN112181724A (en) * 2020-09-23 2021-01-05 支付宝(杭州)信息技术有限公司 Big data disaster tolerance method and device and electronic equipment
CN112583634A (en) * 2020-11-16 2021-03-30 麒麟软件有限公司 Monitoring system-based highway portal disaster recovery method
CN112583634B (en) * 2020-11-16 2022-03-18 麒麟软件有限公司 Monitoring system-based highway portal disaster recovery method
CN113076212A (en) * 2021-03-29 2021-07-06 青岛特来电新能源科技有限公司 Cluster management method, device and equipment and computer readable storage medium
CN113127310A (en) * 2021-04-30 2021-07-16 北京奇艺世纪科技有限公司 Task processing method and device, electronic equipment and storage medium
CN113127310B (en) * 2021-04-30 2023-09-01 北京奇艺世纪科技有限公司 Task processing method and device, electronic equipment and storage medium
CN113434340A (en) * 2021-06-29 2021-09-24 聚好看科技股份有限公司 Server and cache cluster fault rapid recovery method
CN114428760A (en) * 2021-12-30 2022-05-03 北京云宽志业网络技术有限公司 Cluster storage system and metadata recovery method
CN115022209A (en) * 2022-06-24 2022-09-06 中国电信股份有限公司 Monitoring method, monitoring device and computer-readable storage medium
CN115174364A (en) * 2022-06-30 2022-10-11 济南浪潮数据技术有限公司 Data recovery method, device and medium in disaster tolerance scene
CN115499299A (en) * 2022-09-13 2022-12-20 航天信息股份有限公司 Cluster equipment monitoring method and device

Similar Documents

Publication Publication Date Title
CN110377459A (en) A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster
CN106713487B (en) Data synchronization method and device
EP3518110B1 (en) Designation of a standby node
CN105187464B (en) Method of data synchronization, apparatus and system in a kind of distributed memory system
CN110362381A (en) HDFS cluster High Availabitity dispositions method, system, equipment and storage medium
CN109828868B (en) Data storage method, device, management equipment and double-active data storage system
CN108153622B (en) Fault processing method, device and equipment
CN106341454A (en) Across-room multiple-active distributed database management system and across-room multiple-active distributed database management method
CN110581782B (en) Disaster tolerance data processing method, device and system
CN102088490B (en) Data storage method, device and system
CN102394914A (en) Cluster brain-split processing method and device
CN108319618B (en) Data distribution control method, system and device of distributed storage system
CN111176888B (en) Disaster recovery method, device and system for cloud storage
CN108540315A (en) Distributed memory system, method and apparatus
CN107864055A (en) The management method and platform of virtualization system
CN105357042A (en) High-availability cluster system, master node and slave node
CN111935244B (en) Service request processing system and super-integration all-in-one machine
CN108462756B (en) Data writing method and device
CN111431980B (en) Distributed storage system and path switching method thereof
CN114064217B (en) OpenStack-based node virtual machine migration method and device
CN108170507A (en) Virtual application management method/system, computer readable storage medium and server-side
CN105490847B (en) A kind of private cloud storage system interior joint failure real-time detection and processing method
CN108512753A (en) The method and device that message is transmitted in a kind of cluster file system
CN104811348A (en) Availability device, storage area network system with availability device and methods for operation thereof
CN112698979A (en) Method and device for processing zookeeper double nodes, storage medium and processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20191025