CN111966467A - Method and device for disaster recovery based on kubernetes container platform - Google Patents

Info

Publication number
CN111966467A
Authority
CN
China
Prior art keywords
backup
pod
agent component
operation information
setting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010852391.5A
Other languages
Chinese (zh)
Other versions
CN111966467B (en)
Inventor
刘娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010852391.5A
Publication of CN111966467A
Application granted
Publication of CN111966467B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45575 Starting, stopping, suspending or resuming virtual machine instances
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Abstract

The invention provides a method and a device for disaster recovery based on a Kubernetes container platform. The method comprises the following steps: arranging a first backup agent component on a main Kubernetes container platform and a second backup agent component on a standby Kubernetes container platform; setting the first backup agent component to monitor pod operation information in the main container platform and to synchronize it to the second backup agent component of the standby container platform as backup pod operation information, where both the pod operation information and the backup pod operation information include the correspondence between each pod and its mounted volume in the main storage; and setting the second backup agent component to monitor the health state of the first backup agent component and, when that health state is found to be abnormal, to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery. The standby storage is synchronized with the data of the main storage, and the mounted volume information is kept consistent.

Description

Method and device for disaster recovery based on kubernetes container platform
Technical Field
The invention belongs to the technical field of data center disaster recovery, and particularly relates to a method and a device for disaster recovery based on a Kubernetes container platform.
Background
As a container orchestration and management platform, Kubernetes is recognized by many enterprises for its strong automated container operation and maintenance capabilities and its large-scale cluster resource management and scheduling capabilities. More and more systems use the Kubernetes platform as a development platform on which to extend or run their own services, so the status of Kubernetes keeps rising as container technology continues to develop.
Deploying multiple control nodes within a single Kubernetes data center guarantees high availability of the cluster in a single-data-center scenario. However, when the data center itself fails, for example through a machine-room power failure, an earthquake, or a fire, all of its nodes go down and all services deployed on the Kubernetes cluster are interrupted. Most importantly, service data may be lost because no effective backup mechanism exists. Even if the data backup mechanism of the storage hardware is used to back up data to a remote site, the hardware backup is only a raw copy of file data and does not correspond to the mechanism by which Kubernetes manages storage, so pods cannot read the backed-up data even if they are started in a new Kubernetes system.
This is a deficiency of the prior art; it is therefore necessary to provide a method and a device for disaster recovery based on a Kubernetes container platform.
Disclosure of Invention
To address the deficiency in the prior art that, even when an existing Kubernetes-based data center performs hardware backup, the backup data cannot be read because no mechanism for managing the storage is present, the invention provides a method and a device for disaster recovery based on a Kubernetes container platform, so as to solve the above technical problem.
In a first aspect, the present invention provides a method for disaster recovery based on a Kubernetes container platform, comprising the following steps:
S1. Setting a first backup agent component on a main Kubernetes container platform, and setting a second backup agent component on a standby Kubernetes container platform;
S2. Setting the first backup agent component to monitor pod operation information in the main Kubernetes container platform, and to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform to serve as backup pod operation information; the pod operation information and the backup pod operation information both include the correspondence between each pod and its mounted volume in the main storage;
S3. Setting the second backup agent component to monitor the health state of the first backup agent component, and to start backup recovery when the health state of the first backup agent component is found to be abnormal;
S4. Setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mounted volume information is consistent.
Further, step S2 specifically includes the following steps:
S21. Setting the first backup agent component to monitor the operation information of each pod recorded by the main etcd component in the main Kubernetes container platform;
S22. Setting the first backup agent component to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform in an incremental synchronization mode, to serve as backup pod operation information. This ensures that the pod operation information held by the first backup agent component and the second backup agent component is consistent. The first backup agent component synchronizes the pod operation information to the second backup agent component in the form of JSON messages.
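Steps S21 and S22 can be illustrated with a minimal Python sketch of incremental synchronization over JSON messages. The message fields (`op`, `pod`, `info`), the function names, and the sample pod data are hypothetical choices made for illustration; the patent states only that synchronization is incremental and uses JSON messages.

```python
import json


def diff_pod_info(previous, current):
    """Compute incremental changes between two snapshots keyed by pod name."""
    changes = []
    for name, info in current.items():
        if name not in previous:
            changes.append({"op": "create", "pod": name, "info": info})
        elif previous[name] != info:
            changes.append({"op": "update", "pod": name, "info": info})
    for name in previous:
        if name not in current:
            changes.append({"op": "delete", "pod": name})
    return changes


def encode_changes(changes):
    """Serialize each change as one JSON message (hypothetical wire format)."""
    return [json.dumps(change, sort_keys=True) for change in changes]


def apply_messages(backup, messages):
    """Apply received JSON messages to the backup pod operation information."""
    for raw in messages:
        msg = json.loads(raw)
        if msg["op"] == "delete":
            backup.pop(msg["pod"], None)
        else:
            backup[msg["pod"]] = msg["info"]
    return backup


# Two successive snapshots seen by the first backup agent component.
primary_before = {"web": {"image": "nginx:1.19", "volume_id": "vol-001"}}
primary_after = {
    "web": {"image": "nginx:1.20", "volume_id": "vol-001"},
    "db": {"image": "mysql:8", "volume_id": "vol-002"},
}

# Initial sync, then an incremental sync carrying only what changed.
backup = apply_messages({}, encode_changes(diff_pod_info({}, primary_before)))
messages = encode_changes(diff_pod_info(primary_before, primary_after))
backup = apply_messages(backup, messages)
```

Only the changed and newly created pods are transmitted in the second round, yet the backup copy ends up identical to the main platform's current snapshot.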
Further, the pod operation information also includes resource quota and image information; the resource quota includes CPU and memory information;
the correspondence between the pod and the mounted volume of the main storage includes the ID of the mounted volume corresponding to the pod.
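One possible shape for a pod operation information record is sketched below in Python. The class and field names are assumptions made for illustration; the source lists only the kinds of information carried (resource quota with CPU and memory, image information, and the mounted volume ID).

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ResourceQuota:
    cpu: str      # e.g. "500m" (illustrative value)
    memory: str   # e.g. "256Mi" (illustrative value)


@dataclass(frozen=True)
class PodOperationInfo:
    pod_name: str
    image: str
    quota: ResourceQuota
    mount_volume_id: str  # ties the pod to its mounted volume

    def to_json(self):
        # asdict() converts the nested ResourceQuota into a plain dict.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw):
        data = json.loads(raw)
        return cls(
            pod_name=data["pod_name"],
            image=data["image"],
            quota=ResourceQuota(**data["quota"]),
            mount_volume_id=data["mount_volume_id"],
        )


info = PodOperationInfo("web", "nginx:1.19", ResourceQuota("500m", "256Mi"), "vol-001")
restored = PodOperationInfo.from_json(info.to_json())
```

Because the volume ID travels with the record, the standby side needs nothing beyond the record itself to find the matching volume.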
Further, step S3 specifically includes the following steps:
S31. Setting the second backup agent component to monitor the health state of the first backup agent component in real time through a heartbeat mechanism;
S32. When the second backup agent component detects that it has failed to receive pod operation information from the first backup agent component and the failure has lasted longer than a set threshold, judging that the health state of the first backup agent component is abnormal;
S33. Setting the second backup agent component to start backup recovery.
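The threshold-based judgment in steps S31 to S33 might look like the following Python sketch. The class name, the 30-second threshold, and the use of explicit timestamps (rather than a real clock and network heartbeat) are assumptions chosen to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Tracks when the second agent last heard from the first agent."""

    def __init__(self, threshold_seconds, start_recovery, now):
        self.threshold_seconds = threshold_seconds
        self.start_recovery = start_recovery  # callback corresponding to S33
        self.last_received = now
        self.recovery_started = False

    def on_sync_received(self, now):
        """Record a successful receipt of pod operation information."""
        self.last_received = now

    def check(self, now):
        """Return True once the first agent has been judged abnormal (S32)."""
        if self.recovery_started:
            return True
        if now - self.last_received > self.threshold_seconds:
            self.recovery_started = True
            self.start_recovery()
        return self.recovery_started


events = []
monitor = HeartbeatMonitor(30.0, lambda: events.append("recovery started"), now=0.0)
monitor.on_sync_received(now=10.0)
healthy = monitor.check(now=35.0)    # 25 s of silence: below the threshold
abnormal = monitor.check(now=45.0)   # 35 s of silence: threshold exceeded
monitor.check(now=60.0)              # recovery is not triggered a second time
```

Requiring the silence to outlast a threshold, rather than reacting to a single missed message, avoids starting recovery on a brief network hiccup.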
Further, step S4 specifically includes the following steps:
S41. Setting the second backup agent component to transmit the backup pod operation information to a standby user interface of the standby Kubernetes container platform;
S42. Setting the standby user interface to simulate a user creation process, creating a new pod according to the backup pod operation information, automatically identifying the mounted volume associated with the new pod in the standby storage according to the mounted volume ID, and recovering the data. The new pod on the standby Kubernetes container platform takes over the service of the pod on the main Kubernetes container platform.
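As a rough Python sketch of step S42, the function below turns one backup record into a Kubernetes Pod manifest. Using the mounted volume ID directly as a PersistentVolumeClaim name, the `/data` mount path, and the record field names are all assumptions for illustration; the patent does not specify how the volume is referenced on the standby platform.

```python
def build_pod_manifest(backup_info, standby_volume_ids, mount_path="/data"):
    """Build a Pod manifest whose volume is located by the backed-up volume ID."""
    volume_id = backup_info["mount_volume_id"]
    if volume_id not in standby_volume_ids:
        # Recovery only works if the standby storage kept the same ID.
        raise LookupError(f"volume {volume_id} not found in standby storage")
    quota = backup_info["quota"]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": backup_info["pod_name"]},
        "spec": {
            "containers": [
                {
                    "name": backup_info["pod_name"],
                    "image": backup_info["image"],
                    "resources": {
                        "limits": {"cpu": quota["cpu"], "memory": quota["memory"]}
                    },
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                # Assumption: the claim name equals the mounted volume ID.
                {"name": "data", "persistentVolumeClaim": {"claimName": volume_id}}
            ],
        },
    }


backup_info = {
    "pod_name": "web",
    "image": "nginx:1.19",
    "quota": {"cpu": "500m", "memory": "256Mi"},
    "mount_volume_id": "vol-001",
}
manifest = build_pod_manifest(backup_info, standby_volume_ids={"vol-001", "vol-002"})
```

The explicit lookup failure makes the dependency visible: if the standby storage does not hold a volume with the same ID, the new pod cannot be bound to the replicated data.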
Further, step S21 also includes: the main user interface in the main Kubernetes container platform obtains the user's pod creation, deletion, and change operations, manages the pods through the main lifecycle management component kubelet, and stores the pod operation information in the main etcd component;
step S2 further includes:
S23. Setting the main storage to synchronize the mounted volume data to the standby storage, and keeping the mounted volume IDs of the main storage consistent with those of the standby storage;
in step S42, the standby user interface is set to simulate a user creation process, and a new pod is created through the standby lifecycle management component kubelet according to the backup pod operation information. The main storage may synchronize the mounted volume data to the standby storage in either a real-time copy mode or a timed copy mode.
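The difference between the real-time and timed copy modes of step S23 can be sketched in Python with in-memory dictionaries standing in for the two storage systems. The class name, the mode strings, and the dictionary layout are hypothetical; real storage replication is performed by the storage hardware or software, not by code like this.

```python
import copy


class ReplicatedStorage:
    """Toy model: volumes are dicts keyed by volume ID on both sides."""

    def __init__(self, mode):
        if mode not in ("real-time", "timed"):
            raise ValueError("mode must be 'real-time' or 'timed'")
        self.mode = mode
        self.main = {}
        self.standby = {}

    def write(self, volume_id, filename, data):
        """Write to the main storage; replicate at once in real-time mode."""
        self.main.setdefault(volume_id, {})[filename] = data
        if self.mode == "real-time":
            self.sync()

    def sync(self):
        """Copy all volumes to the standby storage under the same volume IDs."""
        self.standby = copy.deepcopy(self.main)


realtime = ReplicatedStorage("real-time")
realtime.write("vol-001", "orders.log", "order-1")

timed = ReplicatedStorage("timed")
timed.write("vol-001", "orders.log", "order-1")
standby_before_timer = dict(timed.standby)  # nothing replicated yet
timed.sync()                                # what a periodic timer would call
```

In timed mode, anything written after the last copy is missing from the standby side, so the copy interval bounds how much data can be lost in a disaster.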
In a second aspect, the present invention provides a device for disaster recovery based on a Kubernetes container platform, comprising:
a backup agent component setting module, used for setting a first backup agent component on a main Kubernetes container platform and setting a second backup agent component on a standby Kubernetes container platform;
a pod operation information backup module, used for setting the first backup agent component to monitor pod operation information in the main Kubernetes container platform and to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform to serve as backup pod operation information; the pod operation information and the backup pod operation information both include the correspondence between each pod and its mounted volume in the main storage;
a health state monitoring module, used for setting the second backup agent component to monitor the health state of the first backup agent component, and to start backup recovery when the health state of the first backup agent component is found to be abnormal;
a pod and data recovery module, used for setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mounted volume information is consistent.
Further, the pod operation information backup module comprises:
a pod operation information monitoring unit, used for setting the first backup agent component to monitor the operation information of each pod recorded by the main etcd component in the main Kubernetes container platform;
a pod operation information backup unit, used for setting the first backup agent component to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform in an incremental synchronization mode, to serve as the backup pod operation information.
Further, the health state monitoring module comprises:
a backup agent component monitoring unit, used for setting the second backup agent component to monitor the health state of the first backup agent component in real time through a heartbeat mechanism;
a health state abnormality determination unit, used for judging that the health state of the first backup agent component is abnormal when the second backup agent component detects that it has failed to receive pod operation information from the first backup agent component and the failure has lasted longer than a set threshold;
a backup recovery starting unit, used for setting the second backup agent component to start backup recovery.
Further, the pod and data recovery module comprises:
a backup pod operation information transmission unit, used for setting the second backup agent component to transmit the backup pod operation information to a standby user interface of the standby Kubernetes container platform;
a pod creation simulation unit, used for setting the standby user interface to simulate a user creation process, creating a new pod according to the backup pod operation information, automatically identifying the mounted volume associated with the new pod in the standby storage according to the mounted volume ID, and recovering the data.
The beneficial effects of the invention are as follows.
The method and the device for disaster recovery based on a Kubernetes container platform add the association between pods and their mounted volumes in storage on top of hardware data backup, so that the backup data is associated with the pods and the backed-up file data can be used directly when a pod is started on a new Kubernetes container platform. This overcomes the insufficient service management capability of the original Kubernetes container platform.
In addition, the invention is reliable in design principle and simple in structure, and has very broad application prospects.
Therefore, compared with the prior art, the invention has prominent substantive features and represents notable progress, and the beneficial effects of its implementation are evident.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a first schematic flow chart of the method of the present invention;
FIG. 2 is a second schematic flow chart of the method of the present invention;
FIG. 3 is a schematic diagram of the system of the present invention;
In the figures: 1-backup agent component setting module; 2-pod operation information backup module; 2.1-pod operation information monitoring unit; 2.2-pod operation information backup unit; 3-health state monitoring module; 3.1-backup agent component monitoring unit; 3.2-health state abnormality determination unit; 3.3-backup recovery starting unit; 4-pod and data recovery module; 4.1-backup pod operation information transmission unit; 4.2-pod creation simulation unit.
Detailed Description
In order to enable those skilled in the art to better understand the technical solution of the present invention, the technical solution in the embodiments of the present invention is described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Pod: in a Kubernetes cluster, the pod is the basis of all workload types and is a combination of one or more containers. All containers in a pod are co-located and co-scheduled for a particular application; the pod is their logical host, and it contains multiple application containers that are business-related.
kubelet: in a distributed cluster management system, a worker must run on each node to manage the life cycle of containers; in Kubernetes, this worker program is the kubelet.
etcd: etcd is an open-source project from CoreOS whose goal is to build a highly available distributed key-value database. etcd uses the Raft protocol as its consensus algorithm and is implemented in Go. It serves as the database of the Kubernetes system and stores all service data generated while the Kubernetes system manages the cluster.
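To make the key-value role of etcd concrete, the Python sketch below models a tiny in-memory store with prefix watches, which is the kind of change notification a monitoring agent would rely on. It is not the etcd client API; the class, its methods, and the `/registry/pods/...` key layout used in the example are illustrative assumptions.

```python
class KeyValueStore:
    """In-memory stand-in for a key-value database with prefix watches."""

    def __init__(self):
        self._data = {}
        self._watchers = []  # list of (prefix, callback) pairs

    def watch_prefix(self, prefix, callback):
        """Register a callback for every change to a key under the prefix."""
        self._watchers.append((prefix, callback))

    def put(self, key, value):
        self._data[key] = value
        self._notify("PUT", key, value)

    def delete(self, key):
        if key in self._data:
            del self._data[key]
            self._notify("DELETE", key, None)

    def _notify(self, event, key, value):
        for prefix, callback in self._watchers:
            if key.startswith(prefix):
                callback(event, key, value)


observed = []
store = KeyValueStore()
store.watch_prefix("/registry/pods/", lambda e, k, v: observed.append((e, k, v)))

store.put("/registry/pods/default/web", "image=nginx:1.19")
store.put("/registry/services/default/web", "port=80")  # outside the watched prefix
store.delete("/registry/pods/default/web")
```

Watching a prefix lets the observer receive only pod-related changes as they happen, instead of repeatedly re-reading the whole database.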
Example 1:
As shown in FIG. 1, the present invention provides a method for disaster recovery based on a Kubernetes container platform, comprising the following steps:
S1. Setting a first backup agent component on a main Kubernetes container platform, and setting a second backup agent component on a standby Kubernetes container platform;
S2. Setting the first backup agent component to monitor pod operation information in the main Kubernetes container platform, and to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform to serve as backup pod operation information; the pod operation information and the backup pod operation information both include the correspondence between each pod and its mounted volume in the main storage;
S3. Setting the second backup agent component to monitor the health state of the first backup agent component, and to start backup recovery when the health state of the first backup agent component is found to be abnormal;
S4. Setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mounted volume information is consistent.
In some embodiments, the first backup agent component and the second backup agent component are implemented in the Go language, which ensures interoperability between the components and the Kubernetes container platform.
Example 2:
As shown in FIG. 2, the present invention provides a method for disaster recovery based on a Kubernetes container platform, comprising the following steps:
S1. Setting a first backup agent component on a main Kubernetes container platform, and setting a second backup agent component on a standby Kubernetes container platform;
S2. Setting the first backup agent component to monitor pod operation information in the main Kubernetes container platform, and to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform to serve as backup pod operation information; the pod operation information and the backup pod operation information both include the correspondence between each pod and its mounted volume in the main storage; the pod operation information also includes resource quota and image information; the resource quota includes CPU and memory information; the correspondence between the pod and the mounted volume of the main storage includes the ID of the mounted volume corresponding to the pod; the specific steps are as follows:
S21. Setting the first backup agent component to monitor the operation information of each pod recorded by the main etcd component in the main Kubernetes container platform;
S22. Setting the first backup agent component to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform in an incremental synchronization mode, to serve as backup pod operation information;
S3. Setting the second backup agent component to monitor the health state of the first backup agent component, and to start backup recovery when the health state of the first backup agent component is found to be abnormal; the specific steps are as follows:
S31. Setting the second backup agent component to monitor the health state of the first backup agent component in real time through a heartbeat mechanism;
S32. When the second backup agent component detects that it has failed to receive pod operation information from the first backup agent component and the failure has lasted longer than a set threshold, judging that the health state of the first backup agent component is abnormal;
S33. Setting the second backup agent component to start backup recovery;
S4. Setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mounted volume information is consistent; the specific steps are as follows:
S41. Setting the second backup agent component to transmit the backup pod operation information to a standby user interface of the standby Kubernetes container platform;
S42. Setting the standby user interface to simulate a user creation process, creating a new pod according to the backup pod operation information, automatically identifying the mounted volume associated with the new pod in the standby storage according to the mounted volume ID, and recovering the data.
In some embodiments, step S21 further includes: the main user interface in the main Kubernetes container platform obtains the user's pod creation, deletion, and change operations, manages the pods through the main lifecycle management component kubelet, and stores the pod operation information in the main etcd component.
In some embodiments, step S2 further includes:
S23. Setting the main storage to synchronize the mounted volume data to the standby storage, and keeping the mounted volume IDs of the main storage consistent with those of the standby storage.
In some embodiments, in step S42, the standby user interface is set to simulate a user creation process, and a new pod is created through the standby lifecycle management component kubelet according to the backup pod operation information.
In the Kubernetes cluster deployment process, storage uses a shared-storage mode: pods in the cluster store the business data generated during service operation on shared storage devices by means of mounted volumes. The association between pods and mounted volumes is stored as a file under a specified directory on the control nodes. Each mounted volume generates a folder named by a unique ID in the shared storage space, and during pod operation data is stored under the mounted volume folder associated with that pod.
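The storage layout described above can be mimicked on a local file system with the following Python sketch. The directory names, the association file name `pod_volumes.json`, and its JSON format are hypothetical; the source says only that the association is kept as a file in a specified directory and that each volume has a folder named by its unique ID.

```python
import json
import os
import tempfile

ASSOCIATION_FILE = "pod_volumes.json"  # hypothetical file name


def create_volume(shared_root, volume_id):
    """Create the folder named by the volume's unique ID in shared storage."""
    path = os.path.join(shared_root, volume_id)
    os.makedirs(path, exist_ok=True)
    return path


def save_associations(control_dir, associations):
    """Persist the pod-to-volume associations as a file on the control node."""
    path = os.path.join(control_dir, ASSOCIATION_FILE)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(associations, f)
    return path


def volume_path_for_pod(control_dir, shared_root, pod_name):
    """Resolve a pod's data folder through the stored association."""
    with open(os.path.join(control_dir, ASSOCIATION_FILE), encoding="utf-8") as f:
        associations = json.load(f)
    return os.path.join(shared_root, associations[pod_name])


with tempfile.TemporaryDirectory() as root:
    shared_root = os.path.join(root, "shared")
    control_dir = os.path.join(root, "control")
    os.makedirs(shared_root)
    os.makedirs(control_dir)

    create_volume(shared_root, "vol-001")
    save_associations(control_dir, {"web": "vol-001"})

    # The pod writes its business data under the folder it is associated with.
    pod_dir = volume_path_for_pod(control_dir, shared_root, "web")
    with open(os.path.join(pod_dir, "orders.log"), "w", encoding="utf-8") as f:
        f.write("order-1\n")

    folder_name = os.path.basename(pod_dir)
    stored_files = sorted(os.listdir(pod_dir))
```

A plain hardware copy of the shared folder reproduces the data but not the association file, which is why the backup agents carry the pod-to-volume mapping separately.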
Example 3:
As shown in FIG. 3, the present invention provides a device for disaster recovery based on a Kubernetes container platform, comprising:
a backup agent component setting module 1, used for setting a first backup agent component on a main Kubernetes container platform and setting a second backup agent component on a standby Kubernetes container platform;
a pod operation information backup module 2, used for setting the first backup agent component to monitor pod operation information in the main Kubernetes container platform and to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform to serve as backup pod operation information; the pod operation information and the backup pod operation information both include the correspondence between each pod and its mounted volume in the main storage; the pod operation information backup module 2 includes:
a pod operation information monitoring unit 2.1, used for setting the first backup agent component to monitor the operation information of each pod recorded by the main etcd component in the main Kubernetes container platform;
a pod operation information backup unit 2.2, used for setting the first backup agent component to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform in an incremental synchronization mode, to serve as backup pod operation information;
a health state monitoring module 3, used for setting the second backup agent component to monitor the health state of the first backup agent component, and to start backup recovery when the health state of the first backup agent component is found to be abnormal; the health state monitoring module 3 includes:
a backup agent component monitoring unit 3.1, used for setting the second backup agent component to monitor the health state of the first backup agent component in real time through a heartbeat mechanism;
a health state abnormality determination unit 3.2, used for judging that the health state of the first backup agent component is abnormal when the second backup agent component detects that it has failed to receive pod operation information from the first backup agent component and the failure has lasted longer than a set threshold;
a backup recovery starting unit 3.3, used for setting the second backup agent component to start backup recovery;
a pod and data recovery module 4, used for setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mounted volume information is consistent; the pod and data recovery module 4 includes:
a backup pod operation information transmission unit 4.1, used for setting the second backup agent component to transmit the backup pod operation information to a standby user interface of the standby Kubernetes container platform;
a pod creation simulation unit 4.2, used for setting the standby user interface to simulate a user creation process, creating a new pod according to the backup pod operation information, automatically identifying the mounted volume associated with the new pod in the standby storage according to the mounted volume ID, and recovering the data.
Although the present invention has been described in detail with reference to the drawings and in connection with the preferred embodiments, the present invention is not limited thereto. Those skilled in the art can make various equivalent modifications or substitutions to the embodiments of the present invention without departing from its spirit and scope, and these modifications or substitutions fall within the scope of the present invention, as do any changes or substitutions that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A method for disaster recovery based on a Kubernetes container platform, characterized by comprising the following steps:
S1. Setting a first backup agent component on a main Kubernetes container platform, and setting a second backup agent component on a standby Kubernetes container platform;
S2. Setting the first backup agent component to monitor pod operation information in the main Kubernetes container platform, and to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform to serve as backup pod operation information; the pod operation information and the backup pod operation information both include the correspondence between each pod and its mounted volume in the main storage;
S3. Setting the second backup agent component to monitor the health state of the first backup agent component, and to start backup recovery when the health state of the first backup agent component is found to be abnormal;
S4. Setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mounted volume in the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mounted volume information is consistent.
2. The method of claim 1, wherein step S2 specifically comprises the following steps:
S21. Setting the first backup agent component to monitor the operation information of each pod recorded by the main etcd component in the main Kubernetes container platform;
S22. Setting the first backup agent component to synchronize the pod operation information to the second backup agent component of the standby Kubernetes container platform in an incremental synchronization mode, to serve as backup pod operation information.
3. The method of claim 2, wherein the pod operation information further includes resource quota and image information; the resource quota includes CPU and memory information;
the correspondence between the pod and the mounted volume of the main storage includes the ID of the mounted volume corresponding to the pod.
4. The method of claim 1, wherein step S3 is as follows:
S31, setting the second backup agent component to monitor the health state of the first backup agent component in real time through a heartbeat mechanism;
S32, when the second backup agent component fails to receive the pod operation information from the first backup agent component for a duration exceeding a set threshold, determining that the health state of the first backup agent component is abnormal;
S33, setting the second backup agent component to start backup recovery.
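Steps S31 to S33 can be sketched as a timeout check on the last successful reception. This is a minimal illustration under stated assumptions: the class `HeartbeatMonitor`, the function `check_and_recover`, and the injectable `clock` parameter are hypothetical names, and the claims do not prescribe how the threshold is measured.

```python
import time


class HeartbeatMonitor:
    """Tracks when pod operation information was last received (S31)."""

    def __init__(self, threshold_seconds, clock=time.monotonic):
        self.threshold_seconds = threshold_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_received = clock()

    def record_received(self):
        """Call whenever data from the first backup agent arrives."""
        self._last_received = self._clock()

    def is_abnormal(self):
        """S32: abnormal once the silence lasts longer than the threshold."""
        return self._clock() - self._last_received > self.threshold_seconds


def check_and_recover(monitor, start_recovery):
    """S33: start backup recovery if the first agent is judged abnormal.

    Returns True when recovery was started, False otherwise.
    """
    if monitor.is_abnormal():
        start_recovery()
        return True
    return False
```

A monotonic clock is used in the sketch so that wall-clock adjustments on the standby side cannot trigger or mask a failover; that choice is a design assumption rather than something stated in the claims.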
5. The method of claim 3, wherein step S4 is as follows:
S41, setting the second backup agent component to transmit the backup pod operation information to a standby user interface of the standby kubernetes container platform;
S42, setting the standby user interface to simulate a user creation process, create a new pod according to the backup pod operation information, automatically identify the mount volume associated with the new pod from the standby storage according to the mount volume ID, and recover the data.
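As an illustration of step S42, the backup pod operation information can be turned into a standard kubernetes Pod manifest before being submitted through the standby user interface. The sketch below is not the patented implementation: the function `build_pod_manifest`, the dictionary keys of `info`, the default mount path `/data`, and the choice to treat the mount volume ID as the name of a PersistentVolumeClaim on the standby cluster are all assumptions made for the example.

```python
import json


def build_pod_manifest(info, mount_path="/data"):
    """Build a Pod manifest (as a dict) from one backup record.

    `info` is a hypothetical dict with keys: name, image, cpu, memory,
    volume_id. Assumption: the mount volume ID is usable as a
    PersistentVolumeClaim name on the standby cluster, which holds only
    because the standby storage keeps the same volume IDs.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": info["name"]},
        "spec": {
            "containers": [{
                "name": info["name"],
                "image": info["image"],
                "resources": {
                    "limits": {"cpu": info["cpu"], "memory": info["memory"]},
                },
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": info["volume_id"]},
            }],
        },
    }


if __name__ == "__main__":
    record = {"name": "web", "image": "nginx:1.19",
              "cpu": "500m", "memory": "256Mi", "volume_id": "vol-1"}
    print(json.dumps(build_pod_manifest(record), indent=2))
```

Because the volume reference is carried by ID rather than by a path on the main storage, the same record resolves to the synchronized copy once it is applied on the standby platform.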
6. The method according to claim 5, wherein step S21 further includes: a main user interface in the main kubernetes container platform obtaining the user's pod creation, deletion and change operations, managing the pods through the main lifecycle management component kubelet, and storing the pod operation information in the main etcd component;
step S2 further includes:
S23, setting the main storage to synchronize the mount volume data to the standby storage, and keeping the mount volume IDs of the main storage and the standby storage consistent;
in step S42, the standby user interface is set to simulate a user creation process and to create the new pod through the standby lifecycle management component kubelet according to the backup pod operation information.
7. A device for disaster recovery based on a kubernetes container platform, characterized by comprising:
a backup agent component setting module (1), used for setting a first backup agent component on a main kubernetes container platform and setting a second backup agent component on a standby kubernetes container platform;
a pod operation information backup module (2), used for setting the first backup agent component to monitor pod operation information in the main kubernetes container platform and to synchronize the pod operation information to the second backup agent component of the standby kubernetes container platform as backup pod operation information; the pod operation information and the backup pod operation information both comprise the correspondence between each pod and its mount volume on the main storage;
a health state monitoring module (3), used for setting the second backup agent component to monitor the health state of the first backup agent component, and starting backup recovery when the health state of the first backup agent component is found to be abnormal;
a pod and data recovery module (4), used for setting the second backup agent component to create a new pod according to the backup pod operation information, the new pod automatically identifying its mount volume from the standby storage for data recovery; the standby storage is synchronized with the data of the main storage, and the mount volume information of the two is consistent.
8. The device according to claim 7, wherein the pod operation information backup module (2) comprises:
a pod operation information monitoring unit (2.1), used for setting the first backup agent component to monitor the operation information of each pod recorded by a main etcd component in the main kubernetes container platform;
a pod operation information backup unit (2.2), used for setting the first backup agent component to synchronize the pod operation information, in an incremental synchronization mode, to the second backup agent component of the standby kubernetes container platform as backup pod operation information.
9. The device according to claim 7, wherein the health state monitoring module (3) comprises:
a backup agent component monitoring unit (3.1), used for setting the second backup agent component to monitor the health state of the first backup agent component in real time through a heartbeat mechanism;
a health state abnormality determining unit (3.2), used for determining that the health state of the first backup agent component is abnormal when the second backup agent component fails to receive the pod operation information from the first backup agent component for a duration exceeding a set threshold;
a backup recovery starting unit (3.3), used for setting the second backup agent component to start backup recovery.
10. The device according to claim 7, wherein the pod and data recovery module (4) comprises:
a backup pod operation information transmission unit (4.1), used for setting the second backup agent component to transmit the backup pod operation information to a standby user interface of the standby kubernetes container platform;
a pod creation simulation unit (4.2), used for setting the standby user interface to simulate a user creation process, create a new pod according to the backup pod operation information, automatically identify the mount volume associated with the new pod from the standby storage according to the mount volume ID, and recover the data.
CN202010852391.5A 2020-08-21 2020-08-21 Method and device for disaster recovery based on kubernetes container platform Active CN111966467B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010852391.5A CN111966467B (en) 2020-08-21 2020-08-21 Method and device for disaster recovery based on kubernetes container platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010852391.5A CN111966467B (en) 2020-08-21 2020-08-21 Method and device for disaster recovery based on kubernetes container platform

Publications (2)

Publication Number Publication Date
CN111966467A CN111966467A (en) 2020-11-20
CN111966467B CN111966467B (en) 2022-07-29

Family

ID=73391131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010852391.5A Active CN111966467B (en) 2020-08-21 2020-08-21 Method and device for disaster recovery based on kubernetes container platform

Country Status (1)

Country Link
CN (1) CN111966467B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112631727A (en) * 2020-12-26 2021-04-09 中国农业银行股份有限公司 Method and device for monitoring pod
CN115277652A (en) * 2022-06-29 2022-11-01 北京百度网讯科技有限公司 Inference service-based streaming media processing method and device, and electronic equipment
US11734136B1 (en) 2022-02-11 2023-08-22 International Business Machines Corporation Quick disaster recovery in distributed computing environment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106598789A (en) * 2016-11-30 2017-04-26 成都华为技术有限公司 Container service disaster recovery method and device, production site and disaster recovery backup site
CN107203440A (en) * 2017-05-27 2017-09-26 郑州云海信息技术有限公司 A kind of integration is backed up in realtime disaster tolerance system and building method
CN110377459A (en) * 2019-06-28 2019-10-25 苏州浪潮智能科技有限公司 A kind of disaster tolerance system, disaster tolerance processing method, monitoring node and backup cluster

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112631727A (en) * 2020-12-26 2021-04-09 中国农业银行股份有限公司 Method and device for monitoring pod
CN112631727B (en) * 2020-12-26 2024-02-23 中国农业银行股份有限公司 Monitoring method and device for pod group pod
US11734136B1 (en) 2022-02-11 2023-08-22 International Business Machines Corporation Quick disaster recovery in distributed computing environment
CN115277652A (en) * 2022-06-29 2022-11-01 北京百度网讯科技有限公司 Inference service-based streaming media processing method and device, and electronic equipment
CN115277652B (en) * 2022-06-29 2024-03-22 北京百度网讯科技有限公司 Streaming media processing method and device based on reasoning service and electronic equipment

Also Published As

Publication number Publication date
CN111966467B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN111966467B (en) Method and device for disaster recovery based on kubernetes container platform
CN103077242B (en) The method of a kind of fulfillment database server two-node cluster hot backup
WO2019154394A1 (en) Distributed database cluster system, data synchronization method and storage medium
CN106062717B (en) A kind of distributed storage dubbing system and method
CN110581782B (en) Disaster tolerance data processing method, device and system
JP2011530127A (en) Method and system for maintaining data integrity between multiple data servers across a data center
JP2008059583A (en) Cluster system, method for backing up replica in cluster system, and program product
CN108810150B (en) Data replication method of application-level disaster recovery backup system of cooperative office system
CN102214128A (en) Repurposable recovery environment
CN109656753B (en) Redundant hot standby system applied to rail transit comprehensive monitoring system
CN108964986B (en) Application-level double-active disaster recovery system of cooperative office system
CN103902405B (en) Quasi-continuity data replication method and device
CN102394914A (en) Cluster brain-split processing method and device
CN105955836A (en) Cold-hot backup automatic walkthrough multifunction system
CN104486438B (en) The disaster recovery method and device of distributed memory system
CN111478796B (en) Cluster capacity expansion exception handling method for AI platform
CN111949444A (en) Data backup and recovery system and method based on distributed service cluster
CN105389231A (en) Database dual-computer backup method and system
CN110209526A (en) A kind of accumulation layer synchronization system and storage medium
CN105847723B (en) The backup method and device of video information
CN111078352A (en) Dual-computer hot standby deployment method and system based on KVM virtualization system
CN107357800A (en) A kind of database High Availabitity zero loses solution method
US20230004465A1 (en) Distributed database system and data disaster backup drilling method
CN116185697B (en) Container cluster management method, device and system, electronic equipment and storage medium
CN110554933A (en) Cloud management platform, and cross-cloud high-availability method and system for cloud platform service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant