WO2024120227A1 - Container data protection system, method and apparatus, and device and readable storage medium - Google Patents

Container data protection system, method and apparatus, and device and readable storage medium

Info

Publication number
WO2024120227A1
WO2024120227A1 PCT/CN2023/134107 CN2023134107W WO2024120227A1 WO 2024120227 A1 WO2024120227 A1 WO 2024120227A1 CN 2023134107 W CN2023134107 W CN 2023134107W WO 2024120227 A1 WO2024120227 A1 WO 2024120227A1
Authority
WO
WIPO (PCT)
Prior art keywords
container
volume
persistent volume
replication
persistent
Prior art date
Application number
PCT/CN2023/134107
Other languages
French (fr)
Chinese (zh)
Inventor
王一杰
张振广
Original Assignee
浪潮电子信息产业股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浪潮电子信息产业股份有限公司
Publication of WO2024120227A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45579 I/O management, e.g. providing access to device drivers or storage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Definitions

  • the present application relates to the field of computer application technology, and in particular to a container data protection system, method, device, equipment and non-volatile readable storage medium.
  • the data in the container can be stored in a medium similar to a virtual machine disk.
  • the container can use external storage devices through persistent volumes.
  • the persistent volume of the container can be used to store the data of the application in the container, and can also be used to share data between containers.
  • the purpose of this application is to provide a container data protection system, method, apparatus, device and non-volatile readable storage medium, which can effectively protect the data of the container.
  • a container data protection system comprising:
  • a replication manager, a container storage controller, a container orchestration server, and a container storage interface;
  • the container orchestration server and the container storage controller communicate with the container storage interface through the remote procedure call protocol;
  • the replication manager and container storage controller communicate with the container orchestration server via the Hypertext Transfer Protocol;
  • the replication manager includes a replication control management module and a cluster management module;
  • the container storage controller includes a container storage interface interaction module and a container storage management module;
  • the container storage interface is used to perform remote replication management and volume operations on the storage system.
  • the cluster management module and the container orchestration server in the remote cluster use Rest API calls to query, create, and modify resource object information in the remote cluster.
  • the cluster management module is used to obtain cluster access configuration information, communicate between container clusters, and query, create, and modify resource objects in a remote cluster.
  • the replication control management module includes a backup object controller, a recovery object controller, a persistent volume controller, a persistent volume declaration controller and a protection group controller to implement monitoring and operation of resource objects.
  • the backup object controller is used to obtain resource objects related to cluster and application configuration from the container orchestration server;
  • the persistent volume controller is used to monitor the status, labels, and annotations of persistent volumes, and query, create, and modify resource objects in the remote cluster as needed.
  • the container storage interface interaction module is used to call the RPC service in the container storage interface to perform remote replication management on the storage system.
  • the controller management module including a persistent volume controller, a persistent volume declaration controller, and a protection group controller, monitors a persistent volume or a persistent volume declaration creation event, and if it is a created replica volume, calls an RPC service of a container storage interface through a container storage interface interaction module to perform a storage operation;
  • the persistent volume controller creates a protection group resource object based on the volume information and associates the persistent volume with the protection group by adding annotations and tags to the persistent volume and persistent volume declaration.
  • the protection group controller is used to manage protection group instances, process operation requests for protection groups, monitor replication status, and update sub-resource status.
  • the container storage interface also includes an RPC service for remote replication volume management
  • the remote replication volume management service connects to the storage system and uses the storage system's remote replication function to create replication pairs, add protection groups, synchronize data, and monitor the status of protection groups.
  • the remote replication function includes synchronous remote replication and asynchronous remote replication.
  • a container data protection method applied to the above container data protection system, comprises:
  • associating the first persistent volume, the persistent volume claim, and the protection group includes:
  • the first persistent volume, the persistent volume declaration, and the protection group are associated using the volume annotation, the volume tag, the resource annotation, and the resource tag.
  • keeping data of the first persistent volume synchronized with data of the second persistent volume includes:
  • Synchronous remote replication or asynchronous remote replication is used to keep data of the first persistent volume and the second persistent volume synchronized.
  • it also includes:
  • the target persistent volumes corresponding to the two target persistent volume declarations have a high-availability replication relationship.
  • it also includes:
  • the secondary cluster is used to restore the container application of the primary cluster based on the backup data of the container resources.
  • it also includes:
  • restoring a container application of a primary cluster based on backup data of container resources using a secondary cluster includes:
  • it also includes:
  • the backup resource data is obtained from the backup storage location and verified using the recovery object controller
  • a container data protection device comprising:
  • a storage class creation unit used to create a storage class with a replication type in a local container cluster
  • An object association unit configured to create a first persistent volume with a replication function on the storage, and add the first persistent volume to the protection group; create a persistent volume declaration of the storage class, and associate the first persistent volume, the persistent volume declaration, and the protection group;
  • a data synchronization unit is used to send a volume creation command to a remote container cluster; after creating a second persistent volume having a replication relationship with the first persistent volume and a protection group with the same name as the protection group in the remote container cluster, the data of the first persistent volume and the second persistent volume are kept synchronized.
  • An electronic device comprising:
  • a processor is used to implement the steps of the above-mentioned container data protection method when executing a computer program.
  • a non-volatile readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the container data protection method are implemented.
  • the system provided by the embodiment of the present application includes: a replication manager, a container storage controller, a container orchestration server and a container storage interface; wherein the container orchestration server and the container storage controller communicate with the container storage interface through a remote procedure call protocol; the replication manager and the container storage controller communicate with the container orchestration server through a hypertext transfer protocol; the replication manager includes a replication control management module and a cluster management module; the container storage controller includes a container storage interface interaction module and a container storage management module; the container storage interface is used to perform remote replication management and volume operations on the storage system.
  • in this way, the volume data backup and recovery capabilities of the storage system are brought to the container cluster.
  • Persistent volumes with different replication relationship types can be created in the same container cluster or in multiple clusters.
  • the storage's own replication technology (synchronous/asynchronous replication or remote replication) is used to replicate the persistent volume data of the containers in the cluster to maintain data consistency.
  • a storage class with a replication type is created; a first persistent volume with a replication function is created on the storage, and the first persistent volume is added to a protection group; a persistent volume declaration of the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated; a volume creation command is sent to a remote container cluster; after creating a second persistent volume with a replication relationship with the first persistent volume in the remote container cluster, and a protection group with the same name as the protection group, the data of the first persistent volume and the second persistent volume are kept synchronized.
  • a storage class with a replication type is first created, and then a first persistent volume with a replication function is created on the storage, and then the first persistent volume is added to the protection group.
  • a persistent volume declaration of the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated.
  • the remote container cluster can create a second persistent volume with a replication relationship with the first persistent volume, and a protection group with the same name as the protection group.
  • the data in the container can be effectively protected by keeping the data of the first persistent volume and the second persistent volume synchronized. That is to say, in the present application, a persistent volume with a replication relationship is created in the container cluster, and the replication technology of the storage itself is used to replicate the persistent volume data in the container cluster, thereby maintaining data consistency and protecting the container data.
  • the embodiments of the present application also provide a container data protection method, apparatus, device and non-volatile readable storage medium corresponding to the above-mentioned container data protection system, which have the above-mentioned technical effects and are not repeated here.
  • FIG. 1 is a flowchart of a container data protection method in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a container data protection system according to an embodiment of the present application.
  • FIG. 3 is a functional schematic diagram of a container data protection system according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the structure of a container data protection device according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of the structure of an electronic device in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a specific structure of an electronic device in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the structure of a non-volatile readable storage medium in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of the structure of a container data protection system in an embodiment of the present application.
  • the system includes:
  • a replication manager, a container storage controller, a container orchestration server, and a container storage interface;
  • the container orchestration server and the container storage controller communicate with the container storage interface through the remote procedure call protocol.
  • the replication manager and container storage controller communicate with the container orchestration server via the Hypertext Transfer Protocol;
  • the replication manager includes a replication control management module and a cluster management module;
  • the container storage controller includes a container storage interface interaction module and a container storage management module;
  • the container storage interface is used to perform remote replication management and volume operations on the storage system.
  • the replication manager and the container orchestration server communicate through the HTTP protocol, and use the REST API (Representational State Transfer Application Programming Interface) to query, create, and modify resource object information in the local cluster and the remote cluster, including storage classes, persistent volumes, persistent volume declarations, protection groups, backup objects, recovery objects, and other resource objects.
  • the container storage controller and the container orchestration server also communicate through the http protocol, using Rest API calls to query, create, and modify resource object information in this cluster and remote clusters.
  • the container storage controller and the container storage interface communicate through RPC (Remote Procedure Call Protocol), call the functions provided in the container storage interface, operate the storage system, and complete management operations such as creation and deletion of corresponding persistent volumes and remote replication volumes.
  • the container orchestration server and the container storage interface communicate through RPC, calling the functions provided in the container storage interface to operate the storage system.
  • the cluster management module and the container orchestration server in the remote cluster use Rest API calls to query, create, and modify resource object information in the remote cluster.
  • the cluster management module is used to obtain cluster access configuration information, communicate between container clusters, and query, create and modify resource objects in remote clusters.
  • the replication control management module includes a backup object controller, a recovery object controller, a persistent volume controller, a persistent volume declaration controller and a protection group controller to realize the monitoring and operation of resource objects.
  • the backup object controller is used to obtain resource objects related to cluster and application configuration from the container orchestration server;
  • the persistent volume controller is used to monitor the status, labels, and annotations of persistent volumes, and query, create, and modify resource objects in the remote cluster as needed.
  • the container storage interface interaction module is used to call the RPC service in the container storage interface to perform remote replication management on the storage system.
  • the controller management module includes a persistent volume controller, a persistent volume declaration controller and a protection group controller, which monitors the persistent volume or persistent volume declaration creation event. If it is a created replica volume, it calls the RPC service of the container storage interface through the container storage interface interaction module to perform storage operations;
  • the persistent volume controller creates a protection group resource object based on the volume information and associates the persistent volume with the protection group by adding annotations and tags to the persistent volume and persistent volume declaration.
  • the protection group controller is used to manage protection group instances, process operation requests for protection groups, monitor replication status, and update sub-resource status.
  • the container storage interface also includes RPC services for remote replication volume management
  • the remote replication volume management service connects to the storage system and uses the storage system's remote replication function to create replication pairs, add protection groups, synchronize data, and monitor the status of protection groups.
  • the remote replication function includes synchronous remote replication and asynchronous remote replication.
  • the replication manager consists of a cluster management module and a replication control management module.
  • the cluster management module is responsible for obtaining cluster access configuration information, communicating between container clusters, and operating on remote clusters to query, create, and modify resource objects. It also supports single-cluster operation: when the remote cluster is set to the local cluster itself, the operations intended for the remote cluster are carried out in the local cluster.
  • the replication control management module includes a backup object controller, a recovery object controller, a persistent volume controller, a persistent volume declaration controller, and a protection group controller to monitor and operate resource objects.
  • the backup object controller can obtain resource objects related to cluster and application configuration from the container orchestration server.
  • the persistent volume controller can monitor the persistent volume status, labels, and annotations, and operate on resource objects in the remote cluster as needed, including querying, creating, and modifying resource objects.
  • the protection group controller, the remote persistent volume and the protection group are associated through additional metadata, namely annotations or labels.
  • the container storage controller consists of a controller management module and a container storage interface interaction module.
  • the container storage interface interaction module can call the RPC service in the container storage interface to perform remote replication management operations on the storage system.
  • the controller management module includes a persistent volume controller, a persistent volume declaration controller, and a protection group controller to monitor and operate resource objects. By monitoring the creation events of persistent volumes or persistent volume declarations, if it is a created replicated volume, the RPC service of the container storage interface will be called through the container storage interface interaction module to perform related storage operations.
  • the persistent volume controller uses volume information to create protection group resource objects, and establishes an association between persistent volumes and protection groups by adding annotations and tags to persistent volumes and persistent volume declarations.
  • the protection group controller is used to manage protection group instances, process operation requests for protection groups, monitor replication status, update sub-resource status, etc.
  • the container storage interface also adds an RPC service for remote replication volume management.
  • the remote replication volume management service can connect to the storage system and use the storage system's remote replication function, including synchronous remote replication and asynchronous remote replication. It can create replication pairs, add protection groups, synchronize data, perform some operations on protection groups, modify status, and other functions.
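The event-driven path described above (a controller observes a creation event and, for a replicated volume, calls the RPC service of the container storage interface) can be sketched as follows. This is only an illustrative sketch: the event fields, the `replicationEnabled` parameter, the `pg-<namespace>` group naming, and the `create_replicated_volume` method are hypothetical names chosen for the example, and the storage interface is replaced by an in-memory stand-in rather than a real RPC client.

```python
class FakeStorageInterface:
    """In-memory stand-in for the container storage interface RPC service.

    The method name below is hypothetical; the source does not name the RPCs.
    """

    def __init__(self):
        self.calls = []

    def create_replicated_volume(self, volume_name, group_name):
        # Record the call instead of talking to a real storage system.
        self.calls.append(("create_replicated_volume", volume_name, group_name))
        return {"volume": volume_name, "protectionGroup": group_name}


def handle_event(event, storage_classes, storage_interface):
    """React to a creation event for a persistent volume declaration.

    Returns the storage interface response for replicated volumes,
    or None when the event is not relevant.
    """
    # Only creation events for persistent volume declarations are handled.
    if event.get("type") != "ADDED" or event.get("kind") != "PersistentVolumeClaim":
        return None
    # Look up the storage class parameters (hypothetical parameter name).
    params = storage_classes.get(event.get("storageClassName"), {})
    if params.get("replicationEnabled") != "true":
        return None
    # Assumed naming rule for the protection group, for illustration only.
    group_name = "pg-" + event["namespace"]
    return storage_interface.create_replicated_volume(event["name"], group_name)
```

Events for storage classes without replication enabled, and events other than creation, are ignored, which mirrors the "if it is a created replicated volume" condition in the description.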
  • the system provided by the embodiment of the present application includes: a replication manager, a container storage controller, a container orchestration server and a container storage interface; wherein the container orchestration server and the container storage controller communicate with the container storage interface through a remote procedure call protocol; the replication manager and the container storage controller communicate with the container orchestration server through a hypertext transfer protocol; the replication manager includes a replication control management module and a cluster management module; the container storage controller includes a container storage interface interaction module and a container storage management module; the container storage interface is used to perform remote replication management and volume operations on the storage system.
  • Persistent volumes with different types of replication relationships can be created within the same container cluster or in multiple clusters, and the storage's own replication technology (synchronous/asynchronous replication or remote replication) can be used to replicate the persistent volume data of containers in the cluster to maintain data consistency.
  • When creating a persistent volume, a pair of persistent volumes with a replication relationship can be created through parameter configuration. When a failure occurs, the application can switch seamlessly between the replicated volumes, and the faster-responding persistent volume can be selected for reading and writing according to the performance and load of the storage where each persistent volume is located. For applications with high access frequency, high-performance volumes are selected for reading and writing to ensure the high availability of container services and the continuity of applications. For off-site disaster recovery scenarios, a primary volume and a backup volume with a remote replication relationship can be created, and container cluster object resources and application configurations can be backed up at the same time.
  • When the primary site fails, a failover can be performed through the slave volume that has a replication relationship with the primary volume: the container cluster object resources and application configuration are restored in another cluster, so that service is quickly restored at the disaster recovery site. After the primary site is restored, data recovery and master-slave switching of the persistent volume are performed to ensure that the data remains consistent and available.
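The failover and switch-back sequence described above can be illustrated with a small role-swapping model. This is a sketch under stated assumptions, not the patented implementation: the `Site` and `ReplicationPair` classes, their fields, and the rule that switch-back copies data from the currently active site are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A site holding one volume of a replication pair (hypothetical model)."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)


class ReplicationPair:
    def __init__(self, primary, secondary):
        self.primary = primary      # site currently serving reads and writes
        self.secondary = secondary  # site holding the replica

    def failover(self):
        """Promote the secondary when the primary is unhealthy."""
        if self.primary.healthy:
            return False  # nothing to do
        if not self.secondary.healthy:
            raise RuntimeError("no healthy site available for failover")
        self.primary, self.secondary = self.secondary, self.primary
        return True

    def failback(self):
        """After the original primary recovers: copy data back, then swap roles."""
        if not self.secondary.healthy:
            raise RuntimeError("original primary site has not recovered yet")
        # Data recovery: bring the recovered site up to date first.
        self.secondary.data = dict(self.primary.data)
        # Master-slave switching: restore the original roles.
        self.primary, self.secondary = self.secondary, self.primary
```

Copying the data before swapping roles reflects the order given in the text: data recovery first, then master-slave switching.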
  • Figure 1 is a flow chart of a container data protection method in an embodiment of the present application.
  • the method can be applied to the container data protection system shown in Figure 2.
  • two sets of storage are used in Figure 2 to connect to two clusters, forming one master and one backup.
  • the container persistent volume data backup disaster recovery solution includes three components: a replication manager, a container storage controller, and a container storage interface.
  • the backup object and the protection group are custom resources; the protection group resource represents the protection group on the storage.
  • the container data protection method is implemented in the system, which specifically includes the following steps:
  • S101 In a local container cluster, create a storage class with a replication type.
  • cluster deployment configuration can be performed, that is, the replication manager is deployed in the cluster as an application in the container orchestration server.
  • the local cluster information, remote cluster information, and access information are passed to the replication manager through configuration for use.
  • the remote cluster information can be set to the local cluster.
  • the container storage interface is deployed according to the interface specification, and the container storage controller is deployed as a sidecar container of the container storage interface.
  • in the storage class, the replication type is set, the replication function is enabled, and parameters such as the remote cluster identifier and the remote storage class name are set. If replication is performed within a single cluster, the remote cluster is set to the local cluster itself.
  • the replication type may specifically include local replication or remote replication, synchronous replication or asynchronous replication and the like.
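As an illustration of step S101, the sketch below builds a storage class definition that carries replication parameters. It assumes a Kubernetes-style orchestrator, where a `StorageClass` carries string-valued `parameters`; the parameter keys (`replicationEnabled`, `replicationScope`, `replicationMode`, `remoteClusterID`, `remoteStorageClassName`) and the provisioner name used in the usage note are hypothetical and are not taken from the source.

```python
def make_replication_storage_class(name, provisioner, scope, mode,
                                   remote_cluster_id, remote_storage_class):
    """Return a StorageClass-like manifest as a plain dict.

    scope: "local" or "remote"; mode: "sync" or "async".
    All parameter keys are hypothetical examples.
    """
    if scope not in ("local", "remote"):
        raise ValueError("scope must be 'local' or 'remote'")
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        # StorageClass parameters are passed to the driver as strings.
        "parameters": {
            "replicationEnabled": "true",
            "replicationScope": scope,
            "replicationMode": mode,
            "remoteClusterID": remote_cluster_id,
            "remoteStorageClassName": remote_storage_class,
        },
    }
```

For single-cluster replication, the caller would pass the local cluster's own identifier as `remote_cluster_id`, for example `make_replication_storage_class("replicated", "csi.example.com", "local", "sync", "cluster-a", "replicated")` when running in `cluster-a`.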
  • S102 Create a first persistent volume with a replication function on the storage, and add the first persistent volume to a protection group.
  • the persistent volume declaration controller of the container storage controller listens for the creation event, calls the container storage interface to create a volume with the remote replication function (i.e., the first persistent volume) on the storage, and adds it to the protection group.
  • S103 Create a persistent volume declaration for the storage class, and associate the first persistent volume, the persistent volume declaration, and the protection group.
  • Persistent volumes with different replication relationship types can be created in the same container cluster or in multiple clusters, and the storage's own replication technology (synchronous/asynchronous replication or remote replication) can be used to replicate the persistent volume data of the container in the cluster to maintain data consistency.
  • the first and second in the first persistent volume and the second persistent volume are only used to distinguish the existence of two persistent volumes, but do not limit the order, priority, etc. of the persistent volumes.
  • associating the first persistent volume, the persistent volume declaration, and the protection group includes:
  • Step 2 Set the resource object of the protection group, and set the resource annotation and resource tag;
  • Step 3 Use volume annotations, volume tags, resource annotations, and resource tags to associate the first persistent volume, the persistent volume declaration, and the protection group.
  • When the persistent volume controller detects that the volume has been created successfully, it sets the annotations and labels of the persistent volume, creates a protection group resource object and sets its annotations and labels, and then associates the persistent volume, the persistent volume declaration, and the protection group through these annotations and labels.
  • For the volume annotations, volume labels, resource annotations, and resource labels, as well as the information carried in their content, refer to the specific definitions and implementations of the relevant annotations and labels, which are not repeated here.
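The association through annotations and labels can be sketched with plain dictionaries standing in for resource objects. Since the source does not give the actual keys, the label key `replication.example.com/protection-group` and the annotation keys below are hypothetical, as are the helper names.

```python
GROUP_LABEL = "replication.example.com/protection-group"   # hypothetical key
REMOTE_ANNOTATION = "replication.example.com/remote-cluster"  # hypothetical key
VOLUME_ANNOTATION = "replication.example.com/volume"          # hypothetical key


def new_object(kind, name):
    """Create a minimal resource object with empty labels and annotations."""
    return {"kind": kind,
            "metadata": {"name": name, "labels": {}, "annotations": {}}}


def associate(pv, pvc, group, remote_cluster):
    """Link a persistent volume, its declaration, and a protection group."""
    group_name = group["metadata"]["name"]
    # Volume-side metadata: the same label on the volume and its declaration.
    for obj in (pv, pvc):
        obj["metadata"]["labels"][GROUP_LABEL] = group_name
        obj["metadata"]["annotations"][REMOTE_ANNOTATION] = remote_cluster
    # Resource-side metadata on the protection group object.
    group["metadata"]["labels"][GROUP_LABEL] = group_name
    group["metadata"]["annotations"][VOLUME_ANNOTATION] = pv["metadata"]["name"]


def members_of(group_name, objects):
    """Select the names of all objects labelled as belonging to a group."""
    return [o["metadata"]["name"] for o in objects
            if o["metadata"]["labels"].get(GROUP_LABEL) == group_name]
```

A shared label makes it possible to find every object in a protection group with a simple selector, while annotations carry the extra information a controller needs, such as the remote cluster identifier.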
  • the persistent volume controller in the replication manager monitors the status of the persistent volume and its annotations and labels, obtains the remote cluster information, queries the storage class in the remote cluster, sends a command to create a persistent volume to the remote cluster, creates the remote persistent volume, and replicates the data. Either synchronous remote replication or asynchronous remote replication can be configured.
  • the protection group controller monitors the status of the protection group and related annotations and labels, sends a command to create a protection group to the remote cluster, and creates a remote protection group.
  • S105 After creating a second persistent volume in a replication relationship with the first persistent volume and a protection group with the same name as the protection group in the remote container cluster, keep data of the first persistent volume and the second persistent volume synchronized.
  • a persistent volume and a protection group are created in the source cluster.
  • a persistent volume and a protection group with the same name and a replication relationship are also created in the target cluster.
  • the data in the two persistent volumes are kept synchronized.
  • maintaining data synchronization between the first persistent volume and the second persistent volume includes: maintaining data synchronization between the first persistent volume and the second persistent volume by using synchronous remote replication or asynchronous remote replication. That is, synchronous remote replication or asynchronous remote replication can be used to achieve data synchronization between the first persistent volume and the second persistent volume.
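The difference between the two replication modes mentioned above can be shown with a toy in-memory model. In this sketch, which is an illustration only and does not reflect how any particular storage system implements replication, a synchronous write updates both copies before returning, while an asynchronous write updates the first volume immediately and queues the change for later transfer. The class and method names are hypothetical.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a first (primary) and second (secondary) volume."""

    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError("mode must be 'sync' or 'async'")
        self.mode = mode
        self.primary = {}
        self.secondary = {}
        self._pending = deque()  # changes not yet applied to the secondary

    def write(self, block, data):
        self.primary[block] = data
        if self.mode == "sync":
            # Synchronous: the write completes only after both copies match.
            self.secondary[block] = data
        else:
            # Asynchronous: acknowledge now, replicate later.
            self._pending.append((block, data))

    def replicate_pending(self):
        """Apply queued changes to the secondary, in write order."""
        applied = 0
        while self._pending:
            block, data = self._pending.popleft()
            self.secondary[block] = data
            applied += 1
        return applied

    def in_sync(self):
        return self.primary == self.secondary
```

The model makes the trade-off visible: in synchronous mode the two volumes never diverge, whereas in asynchronous mode the second volume can lag until the queued changes are transferred.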
  • the method provided by the embodiment of the present application includes: creating a storage class with a replication type in a local container cluster; creating a first persistent volume with a replication function on the storage, and adding the first persistent volume to a protection group; creating a persistent volume declaration for the storage class, and associating the first persistent volume, the persistent volume declaration, and the protection group; sending a volume creation command to a remote container cluster; creating a second persistent volume with a replication relationship with the first persistent volume in the remote container cluster, and a protection group with the same name as the protection group, and then maintaining data synchronization between the first persistent volume and the second persistent volume.
  • a storage class with a replication type is first created, and then a first persistent volume with a replication function is created on the storage, and then the first persistent volume is added to the protection group.
  • a persistent volume declaration of the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated.
  • the remote container cluster can create a second persistent volume with a replication relationship with the first persistent volume, and a protection group with the same name as the protection group.
  • the data in the container can be effectively protected by keeping the data of the first persistent volume and the second persistent volume synchronized. That is to say, in the present application, a persistent volume with a replication relationship is created in the container cluster, and the replication technology of the storage itself is used to replicate the persistent volume data in the container cluster, thereby maintaining data consistency and protecting the container data.
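  • The five operations listed above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Cluster` class, the `protect_volume` and `synchronize` helpers, and all object names are hypothetical, and the "volume creation command" is modelled as direct calls on the remote object rather than a real network request.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a container cluster plus its backing storage."""
    name: str
    storage_classes: dict = field(default_factory=dict)
    volumes: dict = field(default_factory=dict)            # volume name -> data blocks
    claims: dict = field(default_factory=dict)             # declaration name -> volume name
    protection_groups: dict = field(default_factory=dict)  # group name -> volume names


def protect_volume(local, remote, sc, pv, pvc, group):
    # 1) Storage class with a replication type in the local cluster.
    local.storage_classes[sc] = {"replication": "async"}
    # 2) First persistent volume on the storage, added to a protection group.
    local.volumes[pv] = {}
    local.protection_groups.setdefault(group, set()).add(pv)
    # 3) Persistent volume declaration, associated with the volume and the group.
    local.claims[pvc] = pv
    # 4) "Volume creation command": the remote side creates a volume and a
    #    protection group with the same names.
    remote.volumes[pv] = {}
    remote.protection_groups.setdefault(group, set()).add(pv)


def synchronize(local, remote, pv):
    # 5) Keep the second volume's data identical to the first volume's data.
    remote.volumes[pv] = dict(local.volumes[pv])


local, remote = Cluster("local"), Cluster("remote")
protect_volume(local, remote, "sc-repl", "pv-1", "pvc-1", "pg-1")
local.volumes["pv-1"]["block-0"] = b"hello"
synchronize(local, remote, "pv-1")
```

  • In a real deployment the copy in `synchronize` would be performed by the storage system's own replication feature, not by the cluster software; the sketch only shows which objects exist on each side once the flow completes.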
  • a persistent volume with a highly available replication relationship can also be created to protect the data in the container, thereby maintaining the high availability of the container application.
  • the specific implementation process includes:
  • Step 1 Configure the target storage class with real-time high-availability replication relationship
  • Step 2 Create two target persistent volume declarations for the target storage class.
  • the target persistent volumes corresponding to the two target persistent volume declarations have a high-availability replication relationship.
  • Step 3 Get the storage performance status and select the fastest-response persistent volume from the two target persistent volumes for reading and writing.
  • the container storage controller monitors the storage performance status and automatically switches to different persistent volumes based on the performance and load of the storage where the persistent volume is located. It selects persistent volumes with fast response for reading and writing, and selects high-performance persistent volumes for reading and writing for applications with high access frequency.
  • After a target persistent volume fails, reads and writes switch to the other target persistent volume. That is, when a persistent volume fails, the application can immediately switch to the persistent volume with which it has a replication relationship, ensuring high availability of container services and continuity of applications.
  • During recovery, the replication relationship is re-established and data is synchronized from the normally running target persistent volume; once the two volumes are consistent, the recovered volume provides services again.
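  • The selection, switchover, and recovery behaviour described above can be sketched as follows. The `Volume` and `HaPair` classes and the `latency_ms` metric are assumptions made for illustration; the source does not specify how storage performance is measured.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    latency_ms: float          # hypothetical performance metric
    healthy: bool = True
    data: dict = field(default_factory=dict)


class HaPair:
    """Two volumes in a high-availability replication relationship."""

    def __init__(self, a, b):
        self.volumes = [a, b]

    def active(self):
        # Prefer the healthy volume with the lowest observed latency.
        healthy = [v for v in self.volumes if v.healthy]
        if not healthy:
            raise RuntimeError("both volumes have failed")
        return min(healthy, key=lambda v: v.latency_ms)

    def write(self, key, value):
        # Real-time replication: every healthy copy receives the write.
        for v in self.volumes:
            if v.healthy:
                v.data[key] = value

    def read(self, key):
        return self.active().data[key]

    def recover(self, failed):
        # Re-establish the relationship: resync from the surviving copy,
        # then mark the recovered volume as able to serve again.
        source = next(v for v in self.volumes if v is not failed and v.healthy)
        failed.data = dict(source.data)
        failed.healthy = True


fast, slow = Volume("pv-a", 1.0), Volume("pv-b", 5.0)
pair = HaPair(fast, slow)
pair.write("k", "v1")
first_choice = pair.active().name      # fastest healthy volume
fast.healthy = False                   # simulate a failure
pair.write("k", "v2")
after_failure = pair.active().name     # the surviving volume takes over
pair.recover(fast)
```

  • Resynchronizing before marking the recovered volume healthy matters: otherwise the faster but stale volume could be selected for reads before it has caught up.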
  • remote multi-cluster failover and recovery can also be achieved.
  • the specific implementation steps include:
  • Step 1 Use the protection group resource objects to maintain the master-slave relationship between the resource objects and the persistent volumes on the storage in the master cluster and the slave cluster;
  • Step 2 When the primary cluster fails, use the secondary cluster to restore the container application of the primary cluster based on the backup data of the container resources.
  • When the primary cluster fails, the action attribute of the protection group resource object is set to failover.
  • the protection group controller in the container storage controller detects the protection group change, calls the protection group failover operation in the container storage interface, puts the protection group into a failover state in the storage system, and stops data synchronization.
  • Using the secondary cluster to restore the container application of the primary cluster based on the backup data of the container resources includes: using the replicated data of the persistent volume in the secondary storage of the secondary cluster to bring up the services of the container application.
  • The replicated data of the persistent volume in the secondary storage of the secondary cluster is used to bring up services immediately, ensuring service availability and data security when a disaster occurs.
  • After the failed primary cluster recovers, the action attribute of the protection group resource object is set to re-protect.
  • When the protection group controller in the container storage controller detects the change to the protection group, it calls the protection group re-protect operation in the container storage interface; in the storage system, the protection group enters the re-protect state, so that the persistent volume resumes replication from the new "source" and synchronizes data.
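  • The failover and re-protect transitions can be viewed as a small state machine driven by the action attribute. The sketch below is a simplified assumption: the state names, the `ProtectionGroup` class, and the `reconcile` function are hypothetical and stand in for the protection group controller and the storage-side operations it calls.

```python
class ProtectionGroup:
    """Toy protection group whose `action` attribute drives the storage state."""

    def __init__(self, source, target):
        self.source, self.target = source, target
        self.state = "replicating"
        self.action = None


def reconcile(group):
    """What a protection group controller might do when it sees a change."""
    if group.action == "failover" and group.state == "replicating":
        # Failover: data synchronization stops and the target serves traffic.
        group.state = "failed-over"
    elif group.action == "reprotect" and group.state == "failed-over":
        # Re-protect: the former target becomes the new source and
        # replication resumes in the opposite direction.
        group.source, group.target = group.target, group.source
        group.state = "replicating"
    group.action = None  # the requested action has been consumed
    return group.state


pg = ProtectionGroup(source="primary", target="secondary")
pg.action = "failover"
state_after_failover = reconcile(pg)
pg.action = "reprotect"
state_after_reprotect = reconcile(pg)
```

  • Swapping source and target on re-protect mirrors the description of replication resuming from the new "source".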
  • data backup and recovery can also be achieved through backup objects.
  • the specific implementation process includes:
  • Step 1 Receive a backup request and create a corresponding backup object
  • Step 2 Query the object resources from the container orchestration server and create custom resources for the backup object;
  • Step 3 Call the container storage interface to create a snapshot of the volume to be backed up on the storage system
  • Step 4 Use the replication manager to upload the backed-up resource data to the backup storage location.
  • After the backup object controller detects the custom resource of the backup object, it queries the container orchestration server to collect object resources such as cluster container resource objects and application configurations, calls the container storage interface to create a snapshot of the volume to be backed up, and uploads the backed-up resource data to the backup storage location.
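  • A minimal sketch of the backup steps above, assuming plain dictionaries stand in for the container orchestration server, the storage system, and the backup storage location. The `run_backup` function and all names are hypothetical.

```python
import copy
import json


def run_backup(cluster_objects, volumes, volume_name, backup_store, backup_name):
    """Collect resources, snapshot the volume, upload both to the backup location."""
    # Query the object resources (a dict stands in for the orchestration API).
    resources = copy.deepcopy(cluster_objects)
    # Take a point-in-time snapshot of the volume to be backed up.
    snapshot = copy.deepcopy(volumes[volume_name])
    # Upload the collected data to the backup storage location.
    backup_store[backup_name] = json.dumps(
        {"resources": resources, "volume": volume_name, "snapshot": snapshot},
        sort_keys=True,
    )
    return backup_name


objects = {"deployment/web": {"replicas": 2}}
volumes = {"pv-1": {"block-0": "hello"}}
store = {}
run_backup(objects, volumes, "pv-1", store, "backup-1")
volumes["pv-1"]["block-0"] = "changed"   # later writes do not alter the snapshot
```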
  • the corresponding data recovery process includes:
  • Step 1 Receive a recovery request and create a corresponding recovery object custom resource
  • Step 2 Use the replication manager to verify the custom resource of the recovery object
  • Step 3 After verification, use the recovery object controller to obtain the backup resource data from the backup storage location and verify it;
  • Step 4 After verification, use the backup resource data to create and restore the backup resources.
  • After a recovery request is received, the recovery object custom resource is created. When the recovery object controller detects the recovery object custom resource, it verifies it. The recovery object controller then obtains the backup resource data from the backup storage location, verifies it, and creates the resources to restore the backup.
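  • The two verification points in the recovery flow can be sketched as follows. The source does not say how verification is performed; the SHA-256 checksum, the `backupName` field, and the `store_backup` and `run_restore` helpers are assumptions chosen only to make the sketch concrete.

```python
import hashlib
import json


def store_backup(backup_store, name, payload):
    # Store the serialized backup together with a checksum (assumed scheme).
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256(body.encode()).hexdigest()
    backup_store[name] = {"body": body, "sha256": digest}


def run_restore(restore_request, backup_store, cluster_objects):
    # First check: the recovery object must reference a known backup.
    name = restore_request.get("backupName")
    if not name or name not in backup_store:
        raise ValueError("restore request does not reference a known backup")
    # Second check: the fetched backup data must match its checksum.
    entry = backup_store[name]
    if hashlib.sha256(entry["body"].encode()).hexdigest() != entry["sha256"]:
        raise ValueError("backup data failed the integrity check")
    # Both checks passed: recreate the backed-up resources.
    cluster_objects.update(json.loads(entry["body"])["resources"])
    return sorted(cluster_objects)


store = {}
store_backup(store, "backup-1", {"resources": {"deployment/web": {"replicas": 2}}})
restored = {}
names = run_restore({"backupName": "backup-1"}, store, restored)
```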
  • Remote replication is the core technology of storage system disaster recovery and backup, which can realize remote data backup and disaster recovery.
  • Remote replication can be used to synchronize and back up the data of the primary site to maintain data consistency.
  • When the primary site fails, the secondary site can quickly take over its business data, ensuring business continuity and avoiding losses caused by business interruption.
  • The data of the primary site can be restored from the data of the secondary site, which facilitates business recovery.
  • Remote replication is divided into synchronous replication and asynchronous replication. Synchronous remote replication is to synchronize data in real time after the initial remote synchronization, to maximize data consistency and reduce the amount of data loss when a disaster occurs.
  • Asynchronous remote replication is to synchronize data periodically after the initial synchronization, to minimize the degradation of business performance caused by the delay of remote data transmission.
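  • The trade-off between the two modes can be illustrated with a toy model. The `ReplicatedVolume` class is hypothetical; it only shows that a synchronous write reaches the secondary before it completes, while an asynchronous write is queued and shipped at the next periodic synchronization.

```python
class ReplicatedVolume:
    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError(mode)
        self.mode = mode
        self.primary = {}
        self.secondary = {}
        self.pending = []   # writes not yet shipped (async mode only)

    def write(self, key, value):
        self.primary[key] = value
        if self.mode == "sync":
            # The write completes only after the remote copy is updated.
            self.secondary[key] = value
        else:
            # The write completes immediately; the remote copy is updated later.
            self.pending.append((key, value))

    def periodic_sync(self):
        # Ship all queued writes to the secondary in order.
        for key, value in self.pending:
            self.secondary[key] = value
        shipped = len(self.pending)
        self.pending.clear()
        return shipped

    def lag(self):
        """Number of writes that would be lost if the primary failed now."""
        return len(self.pending)


s = ReplicatedVolume("sync")
s.write("a", 1)
a = ReplicatedVolume("async")
a.write("a", 1)
a.write("b", 2)
lag_before = a.lag()
shipped = a.periodic_sync()
```

  • In the synchronous case the lag is always zero, at the cost of waiting for the remote site on every write; in the asynchronous case the lag grows between synchronization cycles, which bounds the potential data loss by the cycle length.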
  • Container backup and recovery are mainly divided into cluster resource backup and recovery, and persistent volume backup and recovery.
  • the backup and recovery of cluster resources mainly include:
  • Method a Backup and restore of container images.
  • the standard approach is to synchronize images in image repositories between different data centers through the replication function of the image repository.
  • Method b Back up resource objects, including various configurations and resource relationships, to restore the same cluster, application, and configuration to ensure functional consistency.
  • There are three main methods of backing up persistent volumes. The first is to directly use the server that stores the data to take regular snapshot backups.
  • the second method is to deploy a dedicated backup client on each target server and specify a backup data directory, and regularly copy the data remotely to external storage.
  • the third method of backing up persistent volumes is to create a snapshot of the persistent volume based on the snapshot function of the container storage interface, perform the backup, and then restore the volume by creating a volume from the snapshot.
  • Some protection solutions can back up container resource objects and persistent volume data at the same time, but they back up and restore the persistent volume only through the snapshot function of the container storage interface. Backing up and restoring data this way takes a long time and does not suit container scenarios that require high availability and fast recovery.
  • Persistent volumes with different types of replication relationships can be created in the same container cluster or in multiple clusters, and the replication technology of the storage itself (synchronous/asynchronous replication or remote replication) can be used to replicate the persistent volume data of the container in the cluster to maintain data consistency.
  • a pair of persistent volumes with a replication relationship can be created.
  • When a fault occurs, reads and writes can switch seamlessly between the replicated volumes, and a fast-responding persistent volume can be selected for reading and writing according to the performance and load of the storage where the persistent volume is located. For applications with high access frequency, high-performance volumes can be selected for reading and writing, ensuring the high availability of container services and the continuity of applications.
  • a primary and backup volume with a remote replication relationship can be created, and container cluster object resources and application configurations can be backed up at the same time.
  • When the primary site fails, failover can be performed through the slave volume that has a replication relationship with the primary volume, and the container cluster object resources and application configuration can be restored in another cluster, so that services recover quickly at the disaster recovery site. After the primary site is restored, data recovery and master-slave switching of the persistent volumes are performed to ensure consistent data availability.
  • This technical solution is applicable to various scenarios of container data protection, including backup, high availability, application continuity, and disaster recovery.
  • this application can integrate the backup and recovery functions of container resource objects and persistent volume data, bring the backup and recovery capabilities of the storage system to the container cluster, and use the replication and recovery technology of the storage device to achieve the replication of the container persistent volume data, thereby realizing the disaster recovery protection of the container persistent volume.
  • the data of the disaster recovery center can be directly used to establish an operation support environment to provide IT (Information Technology) support for the continued operation of the business.
  • the data of the disaster recovery center can also be used to restore the business system of the main data center, so that the company's business operations can quickly return to the normal operation state before the disaster.
  • This solution adopts a cloud-native development approach and can be well integrated with the container orchestration server.
  • This application can simultaneously back up and restore container resource objects, application configurations, and persistent volume data.
  • the backup and recovery methods of persistent volume data support both local snapshots and remote replication.
  • the persistent volume of the container uses the replication technology of the storage itself (synchronous/asynchronous replication, remote replication) to maintain data replication and consistency.
  • It supports cross-cluster and cross-storage area disaster recovery deployment, and can create master and backup volumes with remote replication relationships, while backing up container cluster object resources and application configurations.
  • When the primary site fails, failover can be performed through the slave volume that has a replication relationship with the primary volume, the container cluster object resources and application configuration can be restored in another cluster, and services can be recovered quickly at the disaster recovery site. After the primary site is restored, data recovery and master-slave switching of the persistent volume are performed to ensure consistent data availability.
  • the embodiment of the present application further provides a container data protection device.
  • the container data protection device described below and the container data protection method described above can refer to each other.
  • the device comprises:
  • the storage class creation unit 101 is used to create a storage class with a replication type in a local container cluster
  • the object association unit 102 is used to create a first persistent volume with a replication function on the storage, and add the first persistent volume to the protection group; create a persistent volume declaration of the storage class, and associate the first persistent volume, the persistent volume declaration, and the protection group;
  • the data synchronization unit 103 is used to send a volume creation command to the remote container cluster; after creating a second persistent volume with a replication relationship with the first persistent volume and a protection group with the same name as the protection group in the remote container cluster, the data of the first persistent volume and the second persistent volume are kept synchronized.
  • a storage class with a replication type is created in a local container cluster; a first persistent volume with a replication function is created on the storage, and the first persistent volume is added to a protection group; a persistent volume declaration of the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated; a volume creation command is sent to a remote container cluster; and after creating a second persistent volume with a replication relationship with the first persistent volume and a protection group with the same name as the protection group in the remote container cluster, the data of the first persistent volume and the second persistent volume are kept synchronized.
  • a storage class with a replication type is first created, and then a first persistent volume with a replication function is created on the storage, and then the first persistent volume is added to the protection group.
  • a persistent volume declaration of the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated.
  • the remote container cluster can create a second persistent volume with a replication relationship with the first persistent volume, and a protection group with the same name as the protection group.
  • the data in the container can be effectively protected by keeping the data of the first persistent volume and the second persistent volume synchronized. That is to say, in the present application, a persistent volume with a replication relationship is created in the container cluster, and the replication technology of the storage itself is used to replicate the persistent volume data in the container cluster, thereby maintaining data consistency and protecting the container data.
  • the object association unit 102 is specifically used to set the volume annotation and volume label of the first persistent volume
  • the first persistent volume, the persistent volume declaration, and the protection group are associated using the volume annotation, the volume label, the resource annotation, and the resource label.
  • the object association unit 102 is specifically configured to maintain data synchronization between the first persistent volume and the second persistent volume by using synchronous remote replication or asynchronous remote replication.
  • the object association unit 102 is further used to configure a target storage class having a real-time high-availability replication relationship
  • the target persistent volumes corresponding to the two target persistent volume declarations have a high-availability replication relationship.
  • the object association unit 102 is further configured to switch to another target persistent volume for reading and writing after a target persistent volume fails.
  • the object association unit 102 is further configured to re-establish the replication relationship during recovery and synchronize data from the normally operating target persistent volume.
  • the object association unit 102 is further used to use the protection group resource object to maintain the master-slave relationship between the resource objects and the persistent volumes on the storage in the master cluster and the slave cluster;
  • the secondary cluster is used to restore the container application of the primary cluster based on the backup data of the container resources.
  • the object association unit 102 is further configured to set the action attribute of the protection group resource object to failover.
  • the object association unit 102 is further configured to set the action attribute of the protection group resource object to re-protection after the main cluster fails and recovers.
  • the data synchronization unit 103 is further configured to bring up the services of the container application by using the replicated data of the persistent volume in the secondary storage of the secondary cluster.
  • the data synchronization unit 103 is further used to receive a backup request and create a corresponding backup object
  • the data synchronization unit 103 is further used to receive a recovery request and create a corresponding recovery object custom resource;
  • the backup resource data is obtained from the backup storage location and verified using the recovery object controller
  • the embodiment of the present application further provides an electronic device.
  • the electronic device described below and the container data protection method described above can refer to each other.
  • the electronic device includes:
  • Memory 332 used for storing computer programs
  • the processor 322 is configured to implement the steps of the container data protection method of the above method embodiment when executing a computer program.
  • FIG. 6 is a schematic diagram of the specific structure of an electronic device provided in this embodiment.
  • the electronic device may have relatively large differences due to different configurations or performances, and may include one or more processors (central processing units, CPU) 322 (for example, one or more processors) and a memory 332, and the memory 332 stores one or more computer programs 342 or data 344.
  • the memory 332 can be a temporary storage or a permanent storage.
  • the program stored in the memory 332 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations in the data processing device.
  • the central processing unit 322 can be configured to communicate with the memory 332 to execute a series of instruction operations in the memory 332 on the electronic device 301.
  • the electronic device 301 may further include one or more power supplies 326 , one or more wired or wireless network interfaces 350 , one or more input and output interfaces 358 , and/or one or more operating systems 341 .
  • the steps in the container data protection method described above can be implemented by the structure of an electronic device.
  • the embodiment of the present application further provides a non-volatile readable storage medium.
  • the non-volatile readable storage medium described below and the container data protection method described above can refer to each other.
  • FIG. 7 shows a non-volatile readable storage medium provided in this embodiment.
  • the non-volatile readable storage medium stores a computer program.
  • when the computer program is executed by a processor, the steps of the container data protection method of the above method embodiment are implemented.
  • the non-volatile readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or another non-volatile readable storage medium that can store program code.
  • each embodiment is described in a progressive manner, and each embodiment focuses on the differences from other embodiments.
  • the same or similar parts between the embodiments can be referred to each other.
  • because the apparatus embodiments correspond to the method embodiments, their description is relatively brief; for the relevant parts, refer to the description of the method.
  • the steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented directly using hardware, a software module executed by a processor, or a combination of the two.
  • the software module may be placed in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed in the present application in the technical field of computer application are a container data protection system, method and apparatus, and a device and a non-volatile readable storage medium. In the system, the method comprises: creating, in a local container cluster, a storage class having a duplication type; creating, on a storage, a first persistent volume having a duplication function, and adding the first persistent volume to a protection group; creating a persistent volume claim of the storage class, and associating the first persistent volume, the persistent volume claim and the protection group; sending a volume creation command to a remote container cluster; and creating, in the remote container cluster, a second persistent volume having a duplication relationship with the first persistent volume, and a homonymous protection group of the protection group, and then maintaining data synchronization between the first persistent volume and the second persistent volume. In the present application, a persistent volume having a duplication relationship is created in a container cluster, and persistent volume data in the container cluster is duplicated by using duplication technology of a storage itself, such that the consistency of data is maintained, and data of a container can be protected.

Description

Container data protection system, method, apparatus, device and readable storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202211575807.9, filed with the China Patent Office on December 9, 2022 and entitled "Container data protection system, method, apparatus, device and readable storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer application technology, and in particular to a container data protection system, method, apparatus, device and non-volatile readable storage medium.
Background
Data in a container can be stored in a medium similar to a virtual machine disk. A container can use external storage devices through persistent volumes. A container's persistent volume can be used to store the data of the applications in the container, and can also be used to share data between containers.
Traditional data protection solutions mainly target virtual machines or physical machines, and often focus on protecting a single server and the applications running on it. In the era of container orchestration, however, containers are a dynamically changing resource, and applications are usually widely distributed, sometimes spanning multiple clouds and multiple data centers. Traditional backup and disaster recovery solutions do not work well in containerized environments.
In summary, how to effectively solve problems such as container data protection is a technical problem that those skilled in the art urgently need to solve.
Summary of the Invention
The purpose of the present application is to provide a container data protection system, method, apparatus, device and non-volatile readable storage medium that can effectively protect the data of a container.
To solve the above technical problem, the present application provides the following technical solutions:
A container data protection system, comprising:
a replication manager, a container storage controller, a container orchestration server, and a container storage interface;
wherein the container orchestration server and the container storage controller communicate with the container storage interface through a remote procedure call protocol;
the replication manager and the container storage controller communicate with the container orchestration server through the Hypertext Transfer Protocol;
the replication manager includes a replication control management module and a cluster management module;
the container storage controller includes a container storage interface interaction module and a container storage management module;
the container storage interface is used to perform remote replication management and volume operations on the storage system.
In some embodiments, the cluster management module uses REST API calls to the container orchestration server in the remote cluster to query, create, and modify resource object information in the remote cluster.
In some embodiments, the cluster management module is used to obtain cluster access configuration information, communicate between container clusters, and query, create, and modify resource objects in the remote cluster.
In some embodiments, the replication control management module includes a backup object controller, a recovery object controller, a persistent volume controller, a persistent volume declaration controller, and a protection group controller, which monitor and operate on resource objects.
In some embodiments, the backup object controller is used to obtain resource objects related to cluster and application configuration from the container orchestration server;
the persistent volume controller is used to monitor the status, labels, and annotations of persistent volumes, and to query, create, and modify resource objects in the remote cluster as needed.
In some embodiments, the container storage interface interaction module is used to call the RPC service in the container storage interface to perform remote replication management on the storage system.
In some embodiments, the controller management module includes a persistent volume controller, a persistent volume declaration controller, and a protection group controller; it watches for persistent volume or persistent volume declaration creation events and, if the created volume is a replicated volume, calls the RPC service of the container storage interface through the container storage interface interaction module to perform storage operations;
the persistent volume controller creates a protection group resource object from the volume information, and associates the persistent volume with the protection group by adding annotations and labels to the persistent volume and the persistent volume declaration;
the protection group controller is used to manage protection group instances, process operation requests for protection groups, monitor replication status, and update sub-resource status.
In some embodiments, the container storage interface further includes an RPC service for remote replication volume management;
the remote replication volume management service connects to the storage system and uses the remote replication function of the storage system to create replication pairs, add protection groups, synchronize data, and perform status operations on protection groups;
the remote replication function includes synchronous remote replication and asynchronous remote replication.
A container data protection method, applied to the container data protection system described above, comprising:
creating, in a local container cluster, a storage class with a replication type;
creating a first persistent volume with a replication function on the storage, and adding the first persistent volume to a protection group;
creating a persistent volume claim for the storage class, and associating the first persistent volume, the persistent volume claim, and the protection group;
sending a volume creation command to a remote container cluster;
after a second persistent volume that has a replication relationship with the first persistent volume, and a protection group with the same name as the protection group, have been created in the remote container cluster, keeping the data of the first persistent volume and the second persistent volume synchronized.
In some embodiments, associating the first persistent volume, the persistent volume claim, and the protection group includes:
setting a volume annotation and a volume label on the first persistent volume;
setting up the resource object of the protection group, and setting a resource annotation and a resource label;
associating the first persistent volume, the persistent volume claim, and the protection group by means of the volume annotation, the volume label, the resource annotation, and the resource label.
In some embodiments, keeping the data of the first persistent volume and the second persistent volume synchronized includes:
using synchronous remote replication or asynchronous remote replication to keep the data of the first persistent volume and the second persistent volume synchronized.
In some embodiments, the method further includes:
configuring a target storage class with a real-time high-availability replication relationship;
creating two target persistent volume claims for the target storage class, where the target persistent volumes corresponding to the two claims have a high-availability replication relationship;
obtaining the storage performance status, and selecting the faster-responding of the two target persistent volumes for reading and writing.
In some embodiments, the method further includes:
after one target persistent volume fails, switching to the other target persistent volume for reading and writing.
In some embodiments, the method further includes:
during recovery, re-establishing the replication relationship and synchronizing data from the target persistent volume that is still operating normally.
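The high-availability behaviour described above (pick the faster volume, switch on failure, resynchronize on recovery) can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the `Volume` and `HAPair` names, the `latency_ms` field standing in for the "storage performance status", and the integer `version` standing in for volume contents. It is not the patent's implementation.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    latency_ms: float   # most recent measured response time (assumed metric)
    healthy: bool = True
    version: int = 0    # stand-in for the data held by the volume


class HAPair:
    """Two volumes in a high-availability replication relationship."""

    def __init__(self, a, b):
        self.volumes = [a, b]
        self.replicating = True

    def active(self):
        # Among healthy volumes, prefer the one that responds fastest.
        healthy = [v for v in self.volumes if v.healthy]
        if not healthy:
            raise RuntimeError("no healthy volume in the pair")
        return min(healthy, key=lambda v: v.latency_ms)

    def write(self):
        target = self.active()
        target.version += 1
        # While replication is up, the write is mirrored to every healthy peer.
        for v in self.volumes:
            if v.healthy and self.replicating:
                v.version = target.version
        return target.name

    def fail(self, name):
        # Mark one side as failed; replication is broken until recovery.
        for v in self.volumes:
            if v.name == name:
                v.healthy = False
        self.replicating = False

    def recover(self, name):
        # Re-establish replication and resync from the surviving volume.
        source = self.active()
        for v in self.volumes:
            if v.name == name:
                v.version = source.version
                v.healthy = True
        self.replicating = True
```

In this sketch a write always goes to the volume returned by `active()`, so a failure of the faster volume transparently redirects traffic to its peer, and `recover` copies the surviving volume's state back before the pair resumes mirroring.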
In some embodiments, the method further includes:
using the protection group resource object to maintain a primary/backup relationship between the resource objects in the primary and secondary clusters and the persistent volumes on the storage;
when the primary cluster fails, using the secondary cluster to restore the primary cluster's container applications from the backup data of the container resources.
In some embodiments, the method further includes:
setting the action attribute of the protection group resource object to failover.
In some embodiments, the method further includes:
after the primary cluster has recovered from the failure, setting the action attribute of the protection group resource object to reprotect.
In some embodiments, using the secondary cluster to restore the primary cluster's container applications from the backup data of the container resources includes:
using the secondary cluster to bring up the container application's services from the replicated data held by the secondary persistent volume on the backup storage.
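One way to picture the action attribute is as a request that a controller reconciles against the protection group. The sketch below is hypothetical: the `ProtectionGroup` fields, the `reconcile` function, and the assumption that failover swaps the primary and secondary roles while reprotect resumes replication toward the recovered cluster are illustrative choices, not details given in the source.

```python
from dataclasses import dataclass


@dataclass
class ProtectionGroup:
    name: str
    primary: str             # cluster currently serving reads and writes
    secondary: str           # cluster holding the replica
    replicating: bool = True
    action: str = ""         # hypothetical action attribute set by an operator


def reconcile(pg):
    """Apply the requested action once, then clear it."""
    if pg.action == "failover":
        # Promote the secondary side; replication pauses while the
        # former primary is unavailable.
        pg.primary, pg.secondary = pg.secondary, pg.primary
        pg.replicating = False
    elif pg.action == "reprotect":
        # The recovered cluster becomes the replication target again.
        pg.replicating = True
    elif pg.action:
        raise ValueError(f"unknown action: {pg.action}")
    pg.action = ""
    return pg
```

Clearing the attribute after it is handled keeps the reconcile step idempotent: running it again with no pending action leaves the group unchanged.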
In some embodiments, the method further includes:
receiving a backup request and creating the corresponding backup object;
querying object resources from the container orchestration server and creating a custom resource for the backup object;
calling the container storage interface to create, on the storage system, snapshots of the volumes to be backed up;
using the replication manager to upload the backed-up resource data to the backup storage location.
In some embodiments, the method further includes:
receiving a restore request and creating the corresponding restore object custom resource;
using the replication manager to validate the restore object custom resource;
after validation succeeds, using the restore object controller to fetch the backup resource data from the backup storage location and verify it;
after verification succeeds, using the backup resource data to create the resources being restored.
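The backup and restore flows above can be sketched with an in-memory stand-in for the backup storage location. The use of a SHA-256 digest as the verification step, the `store` dictionary, and the function names are assumptions for illustration; the source does not specify how backup data is verified or where it is stored.

```python
import hashlib
import json


def backup(resources, volumes, store, name):
    """Record volume snapshots and upload resource data to a backup location."""
    payload = json.dumps(resources, sort_keys=True).encode()
    store[name] = {
        "resources": payload,
        "sha256": hashlib.sha256(payload).hexdigest(),
        # Stand-in for storage-side snapshots created through the storage interface.
        "snapshots": {v: f"{v}-snap" for v in volumes},
    }
    return store[name]["sha256"]


def restore(store, name):
    """Validate the request, verify the backup data, then return the resources."""
    entry = store.get(name)
    if entry is None:
        # Validation of the restore request: the referenced backup must exist.
        raise KeyError(f"backup not found: {name}")
    if hashlib.sha256(entry["resources"]).hexdigest() != entry["sha256"]:
        # Verification of the fetched backup data failed.
        raise ValueError("backup data failed verification")
    return json.loads(entry["resources"])
```

A restore therefore fails fast in two distinct places: once if the request refers to a backup that does not exist, and once if the stored data no longer matches the digest recorded at backup time.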
A container data protection apparatus, comprising:
a storage class creation unit, configured to create, in a local container cluster, a storage class with a replication type;
an object association unit, configured to create a first persistent volume with a replication function on the storage and add the first persistent volume to a protection group, and to create a persistent volume claim for the storage class and associate the first persistent volume, the persistent volume claim, and the protection group;
a data synchronization unit, configured to send a volume creation command to a remote container cluster and, after a second persistent volume that has a replication relationship with the first persistent volume and a protection group with the same name as the protection group have been created in the remote container cluster, keep the data of the first persistent volume and the second persistent volume synchronized.
An electronic device, comprising:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above container data protection method when executing the computer program.
A non-volatile readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above container data protection method.
The system provided by the embodiments of the present application includes a replication manager, a container storage controller, a container orchestration server, and a container storage interface. The container orchestration server and the container storage controller communicate with the container storage interface through a remote procedure call protocol; the replication manager and the container storage controller communicate with the container orchestration server through the Hypertext Transfer Protocol; the replication manager includes a replication control management module and a cluster management module; the container storage controller includes a container storage interface interaction module and a container storage management module; and the container storage interface is used to perform remote replication management and volume operations on the storage system.
Based on the system's internal components and their ability to communicate with one another, the backup and recovery of container resource objects and of persistent volume data can be integrated, and the storage system's backup and recovery capabilities are brought to the container cluster. Persistent volumes with different replication relationship types can be created within a single container cluster or across multiple clusters, and the storage system's own replication technology (synchronous/asynchronous replication or remote replication) is used to replicate the persistent volume data of containers in the cluster, maintaining data consistency.
With the method provided by the embodiments of the present application, a storage class with a replication type is created in a local container cluster; a first persistent volume with a replication function is created on the storage and added to a protection group; a persistent volume claim for the storage class is created, and the first persistent volume, the persistent volume claim, and the protection group are associated; a volume creation command is sent to a remote container cluster; and after a second persistent volume that has a replication relationship with the first persistent volume, and a protection group with the same name as the protection group, have been created in the remote container cluster, the data of the first persistent volume and the second persistent volume are kept synchronized.
In the present application, a storage class with a replication type is created first, a first persistent volume with a replication function is then created on the storage, and the first persistent volume is added to a protection group. A persistent volume claim for the storage class is created, and the first persistent volume, the persistent volume claim, and the protection group are associated. Issuing a volume creation command to the remote container cluster causes the remote cluster to create a second persistent volume that has a replication relationship with the first persistent volume, together with a protection group of the same name. The data in the container can then be protected effectively by keeping the first and second persistent volumes synchronized. In other words, the present application creates persistent volumes with a replication relationship in container clusters and uses the storage system's own replication technology to replicate their data, thereby maintaining data consistency and protecting container data.
Correspondingly, the embodiments of the present application further provide a container data protection method, apparatus, device, and non-volatile readable storage medium corresponding to the above container data protection method, which have the technical effects described above and are not repeated here.
BRIEF DESCRIPTION OF THE DRAWINGS
To explain the technical solutions in the embodiments of the present application or in the related art more clearly, the drawings needed for describing the embodiments or the related art are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an implementation flowchart of a container data protection method in an embodiment of the present application;
FIG. 2 is a schematic diagram of a container data protection system in an embodiment of the present application;
FIG. 3 is a functional schematic diagram of a container data protection system in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a container data protection apparatus in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device in an embodiment of the present application;
FIG. 6 is a schematic diagram of a specific structure of an electronic device in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a non-volatile readable storage medium in an embodiment of the present application.
DETAILED DESCRIPTION
To help those skilled in the art better understand the solution of the present application, the application is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Please refer to FIG. 2, which is a schematic structural diagram of a container data protection system in an embodiment of the present application. The system includes:
a replication manager, a container storage controller, a container orchestration server, and a container storage interface;
wherein the container orchestration server and the container storage controller communicate with the container storage interface through a remote procedure call protocol;
the replication manager and the container storage controller communicate with the container orchestration server through the Hypertext Transfer Protocol;
the replication manager includes a replication control management module and a cluster management module;
the container storage controller includes a container storage interface interaction module and a container storage management module;
the container storage interface is used to perform remote replication management and volume operations on the storage system.
In other words, the replication manager and the container orchestration server communicate over HTTP, using REST API (Representational State Transfer Application Programming Interface) calls to query, create, and modify resource object information in the local cluster and the remote cluster. This covers storage classes, persistent volumes, persistent volume claims, protection groups, backup objects, restore objects, and other resource objects.
The container storage controller and the container orchestration server also communicate over HTTP, using REST API calls to query, create, and modify resource object information in the local cluster and the remote cluster.
The container storage controller and the container storage interface communicate through RPC (remote procedure call). The controller calls the functions provided by the container storage interface to operate the storage system and carry out management operations such as creating and deleting persistent volumes and remote replication volumes.
The container orchestration server and the container storage interface communicate through RPC; the server calls the functions provided by the container storage interface to operate the storage system.
The cluster management module uses REST API calls to the container orchestration server in the remote cluster to query, create, and modify resource object information in the remote cluster.
The cluster management module is used to obtain cluster access configuration information, communicate between container clusters, and query, create, and modify resource objects in the remote cluster.
The replication control management module includes a backup object controller, a restore object controller, a persistent volume controller, a persistent volume claim controller, and a protection group controller, which monitor and operate on resource objects.
The backup object controller is used to obtain resource objects related to cluster and application configuration from the container orchestration server;
the persistent volume controller is used to monitor the status, labels, and annotations of persistent volumes and, as needed, to query, create, and modify resource objects in the remote cluster.
The container storage interface interaction module is used to call the RPC service in the container storage interface to perform remote replication management on the storage system.
The controller management module includes a persistent volume controller, a persistent volume claim controller, and a protection group controller. It watches for persistent volume and persistent volume claim creation events and, if the created volume is a replicated volume, calls the RPC service of the container storage interface through the container storage interface interaction module to perform the storage operation;
the persistent volume controller creates a protection group resource object from the volume information and associates the persistent volume with the protection group by adding annotations and labels to the persistent volume and the persistent volume claim;
the protection group controller is used to manage protection group instances, process operation requests for protection groups, monitor replication status, and update sub-resource status.
The container storage interface further includes an RPC service for remote replication volume management;
the remote replication volume management service connects to the storage system and uses the storage system's remote replication function to create replication pairs, add protection groups, synchronize data, and modify the status of protection groups;
the remote replication function includes synchronous remote replication and asynchronous remote replication.
For example, in practice the replication manager consists of a cluster management module and a replication control management module. The cluster management module is responsible for obtaining cluster access configuration information, communicating between container clusters, and operating on the remote cluster to query, create, and modify resource objects. Single-cluster operation is also possible: the remote cluster is set to the local cluster itself, so that operations on the "remote" cluster are carried out in the local cluster. The replication control management module contains a backup object controller, a restore object controller, a persistent volume controller, a persistent volume claim controller, and a protection group controller, which monitor and operate on resource objects. The backup object controller can obtain resource objects related to cluster and application configuration from the container orchestration server. The persistent volume controller can monitor the status, labels, and annotations of persistent volumes and, as needed, operate on resource objects in the remote cluster, including querying, creating, and modifying them. The protection group controller works in the same way; remote persistent volumes and protection groups are associated through additional metadata annotations or labels.
The container storage controller consists of a controller management module and a container storage interface interaction module. The container storage interface interaction module can call the RPC service in the container storage interface to perform remote replication management operations on the storage system. The controller management module contains a persistent volume controller, a persistent volume claim controller, and a protection group controller, which monitor and operate on resource objects. It watches for persistent volume and persistent volume claim creation events and, if the created volume is a replicated volume, calls the RPC service of the container storage interface through the container storage interface interaction module to perform the related storage operations. The persistent volume controller creates a protection group resource object from the volume information and associates the persistent volume with the protection group by adding annotations and labels to the persistent volume and the persistent volume claim. The protection group controller manages protection group instances, processes operation requests for protection groups, monitors replication status, and updates sub-resource status.
In addition to the general volume operations, the container storage interface adds an RPC service for remote replication volume management. The remote replication volume management service can connect to the storage system and use its remote replication function, including synchronous and asynchronous remote replication, to create replication pairs, add protection groups, synchronize data, perform operations on protection groups, and modify their status.
The system provided by the embodiments of the present application includes a replication manager, a container storage controller, a container orchestration server, and a container storage interface. The container orchestration server and the container storage controller communicate with the container storage interface through a remote procedure call protocol; the replication manager and the container storage controller communicate with the container orchestration server through the Hypertext Transfer Protocol; the replication manager includes a replication control management module and a cluster management module; the container storage controller includes a container storage interface interaction module and a container storage management module; and the container storage interface is used to perform remote replication management and volume operations on the storage system.
Based on the system's internal components and their ability to communicate with one another, the backup and recovery of container resource objects and of persistent volume data can be integrated, and the storage system's backup and recovery capabilities are brought to the container cluster. Persistent volumes with different replication relationship types can be created within a single container cluster or across multiple clusters, and the storage system's own replication technology (synchronous/asynchronous replication or remote replication) is used to replicate the persistent volume data of containers in the cluster, maintaining data consistency.
Specifically, parameters can be configured so that creating a persistent volume produces a pair of persistent volumes with a replication relationship. When one volume fails, the system can switch seamlessly to the other. It can also select the faster-responding persistent volume for reading and writing according to the performance and load of the storage on which each volume resides, and direct frequently accessed applications to high-performance volumes, ensuring high availability of container services and continuity of applications. For remote disaster recovery, primary and backup volumes with a remote replication relationship can be created, and the container cluster's object resources and application configuration are backed up at the same time. When the primary site has a problem, failover is performed through the secondary volume that has a replication relationship with the primary volume, and the container cluster's object resources and application configuration are restored in another cluster, allowing fast recovery at the disaster recovery site. After the primary site recovers, the persistent volume data is restored and the primary and secondary roles are switched back, ensuring that data remains consistent and available.
Please refer to FIG. 1, which is a flowchart of a container data protection method in an embodiment of the present application. The method can be applied to the container data protection system shown in FIG. 2. To provide disaster recovery and high availability for the application system, FIG. 2 uses two storage systems attached to two clusters, forming one primary and one backup. The deployment can also be arranged as two sites with three data centers, or as a single cluster attached to two storage systems or to active-active storage, as needed. The backup and disaster recovery solution for container persistent volume data includes three components: the replication manager, the container storage controller, and the container storage interface. The backup object and the protection group (that is, a pair of volumes with a replication relationship) are custom resources; the protection group resource represents the protection group on the storage.
Implementing the container data protection method in this system includes the following steps:
S101. In the local container cluster, create a storage class with a replication type.
First, the cluster deployment is configured. The replication manager is deployed in the cluster as an application of the container orchestration server, and the local cluster information, the remote cluster information, and the access information are passed to the replication manager through configuration. The remote cluster information may be set to the local cluster. The container storage interface is deployed according to the interface specification, and the container storage controller is deployed as a sidecar container of the container storage interface.
The storage class sets the replication type, enables the replication function, and sets parameters such as the remote cluster identifier and the remote storage class name. For replication within a single cluster, the remote cluster is set to the cluster itself.
In the present application, the replication type may specifically be local or remote replication, and synchronous or asynchronous replication.
Once the storage class has been created, a persistent volume claim for this storage class can be created.
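As an illustration of step S101, the sketch below builds a storage class definition as a plain dictionary, assuming a Kubernetes-style manifest layout. The source names neither the orchestration platform nor the parameter keys, so every key under `parameters` (with the placeholder `replication.example.com` prefix), the provisioner name, and the `"self"` marker for single-cluster replication are hypothetical.

```python
REPLICATION_MODES = {"sync", "async"}
PREFIX = "replication.example.com"   # hypothetical parameter prefix


def make_replication_storage_class(name, provisioner, mode,
                                   remote_cluster, remote_storage_class):
    """Build a storage class definition that enables replication."""
    if mode not in REPLICATION_MODES:
        raise ValueError(f"unsupported replication mode: {mode}")
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,
        "parameters": {
            # All keys below are illustrative, not taken from the source.
            f"{PREFIX}/enabled": "true",
            f"{PREFIX}/mode": mode,
            f"{PREFIX}/remoteClusterID": remote_cluster,
            f"{PREFIX}/remoteStorageClassName": remote_storage_class,
        },
    }


def is_single_cluster(storage_class, local_cluster):
    """Single-cluster replication: the remote cluster is the local one."""
    remote = storage_class["parameters"][f"{PREFIX}/remoteClusterID"]
    return remote in ("self", local_cluster)
```

For example, `make_replication_storage_class("replicated-sc", "csi.example.com", "async", "self", "replicated-sc")` would describe asynchronous replication inside one cluster under these assumptions.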
S102. Create a first persistent volume with a replication function on the storage, and add the first persistent volume to a protection group.
The persistent volume claim controller in the container storage controller detects the creation event, calls the container storage interface to create a volume with the remote replication function on the storage (that is, the first persistent volume), and adds the volume to the protection group.
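Step S102 can be sketched as an event handler that reacts to a claim creation event. The `FakeStorage` class is an in-memory stand-in for the storage-side RPC service, and its method names, the claim dictionary layout, and the `pv-`/`pg-` naming scheme are assumptions made for this example rather than the interface defined by the source.

```python
class FakeStorage:
    """In-memory stand-in for the storage-side RPC service (hypothetical API)."""

    def __init__(self):
        self.volumes = {}
        self.groups = {}

    def create_replicated_volume(self, name, size_gib):
        self.volumes[name] = {"size_gib": size_gib, "replicated": True}
        return name

    def add_to_protection_group(self, group, volume):
        self.groups.setdefault(group, []).append(volume)


def on_claim_created(claim, storage_classes, storage):
    """Handle a claim creation event; return (volume, group) or None."""
    sc = storage_classes[claim["storageClassName"]]
    if sc.get("replication") != "true":
        # Not a replicated volume: leave it to ordinary provisioning.
        return None
    volume = storage.create_replicated_volume(
        f"pv-{claim['name']}", claim["size_gib"])
    group = f"pg-{claim['storageClassName']}"
    storage.add_to_protection_group(group, volume)
    return volume, group
```

The handler only acts when the claim's storage class enables replication, which mirrors the "if the created volume is a replicated volume" condition in the description.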
S103. Create a persistent volume claim for the storage class, and associate the first persistent volume, the persistent volume claim, and the protection group.
In this embodiment, after the persistent volume claim for the storage class has been created, the first persistent volume, the persistent volume claim, and the protection group need to be associated with one another. This integrates the backup and recovery of resource objects and of persistent volume data and brings the storage system's backup and recovery capabilities to the container cluster. Persistent volumes with different replication relationship types can be created within a single container cluster or across multiple clusters, and the storage system's own replication technology (synchronous/asynchronous replication or remote replication) is used to replicate the persistent volume data of containers in the cluster, maintaining data consistency.
Note that a protection group may contain one pair of persistent volumes with a replication relationship, or three or more such volumes. That is, a protection group corresponds to at least two persistent volumes with a replication relationship, and the actual number of persistent volumes in a group can be chosen according to the backup requirements. The replication relationships between the persistent volumes in a protection group can likewise be set or adjusted according to application requirements, and are not enumerated here.
The terms "first" and "second" in "first persistent volume" and "second persistent volume" only distinguish the two persistent volumes; they do not impose any order or priority on them.
In a specific implementation of the present application, associating the first persistent volume, the persistent volume declaration, and the protection group includes:
Step 1: Set the volume annotations and volume labels of the first persistent volume;
Step 2: Set up the resource object of the protection group, and set its resource annotations and resource labels;
Step 3: Use the volume annotations, volume labels, resource annotations, and resource labels to associate the first persistent volume, the persistent volume declaration, and the protection group.
For ease of description, the three steps above are explained together below.
That is, when the persistent volume controller detects that the volume has been created successfully, it can set the annotations and labels of the persistent volume. A protection group resource object is created, and its annotations and labels are set. The persistent volume, the persistent volume declaration, and the protection group are then associated through these annotations and labels. For how the volume annotations, volume labels, resource annotations, and resource labels are set, and what information they contain, refer to the specific definitions and implementations of the relevant annotations and labels; the details are not repeated here.
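The association through annotations and labels can be illustrated with a minimal Python sketch. It is not the implementation of the present application: the `Resource` class, the key names under `example.com/`, and the `associate` and `members_of` helpers are all hypothetical, since the text leaves the concrete annotation and label definitions open.

```python
from dataclasses import dataclass, field

# Hypothetical keys: the source does not define the actual names.
GROUP_LABEL = "example.com/protection-group"
VOLUME_ID_ANNOTATION = "example.com/storage-volume-id"
MEMBERS_ANNOTATION = "example.com/members"


@dataclass
class Resource:
    """Minimal stand-in for a cluster resource object."""
    kind: str
    name: str
    labels: dict = field(default_factory=dict)
    annotations: dict = field(default_factory=dict)


def associate(pv, pvc, group, storage_volume_id):
    """Tie the volume, its declaration, and the protection group together."""
    # A shared label lets any controller select all members of the group.
    for res in (pv, pvc, group):
        res.labels[GROUP_LABEL] = group.name
    # Annotations carry non-selecting metadata, here the backing volume id.
    pv.annotations[VOLUME_ID_ANNOTATION] = storage_volume_id
    current = group.annotations.get(MEMBERS_ANNOTATION, "")
    members = [m for m in current.split(",") if m]
    if pv.name not in members:
        members.append(pv.name)
    group.annotations[MEMBERS_ANNOTATION] = ",".join(members)


def members_of(group_name, resources):
    """Select every resource labelled as belonging to the protection group."""
    return [r for r in resources if r.labels.get(GROUP_LABEL) == group_name]


pv = Resource("PersistentVolume", "pv-1")
pvc = Resource("PersistentVolumeDeclaration", "pvc-1")
group = Resource("ProtectionGroup", "pg-1")
associate(pv, pvc, group, storage_volume_id="vol-001")
print([r.name for r in members_of("pg-1", [pv, pvc, group])])
```

In this sketch the label is used for selection and the annotations for descriptive metadata, which is one plausible division of roles; an actual implementation may split the information differently.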
S104: Send a volume creation command to the remote container cluster.
The persistent volume controller in the replication manager monitors the state of the persistent volume and its annotations and labels, obtains the remote cluster information, queries the storage class in the remote cluster, and sends a command to the remote cluster to create a persistent volume. The remote persistent volume is created and data replication begins; either synchronous or asynchronous remote replication can be configured. The protection group controller monitors the state of the protection group and its annotations and labels, and sends a command to the remote cluster to create a protection group, so that the remote protection group is created.
Note that if replication backup is needed only within the local container cluster, no volume creation command has to be sent to a remote container cluster. Instead, following steps S101 to S103 above, a second persistent volume in a replication relationship with the first persistent volume is created directly and added to the protection group of the first persistent volume, so that the data of the first and second persistent volumes is synchronized locally.
S105: After a second persistent volume in a replication relationship with the first persistent volume, and a protection group with the same name as the protection group, have been created in the remote container cluster, keep the data of the first persistent volume and the second persistent volume synchronized.
A persistent volume and a protection group are created in the source cluster; a persistent volume in a replication relationship with it and a protection group of the same name are created in the target cluster; and the data in the two persistent volumes is kept synchronized.
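Steps S104 and S105 can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the replication manager itself: the `Cluster` class, the `replicate_to_remote` function, and the `-replica` naming are hypothetical, and the actual data copy is left to the storage system.

```python
class Cluster:
    """In-memory stand-in for a container cluster (hypothetical model)."""

    def __init__(self, name, storage_classes):
        self.name = name
        self.storage_classes = set(storage_classes)
        self.volumes = {}            # volume name -> properties
        self.protection_groups = {}  # group name -> list of volume names


def replicate_to_remote(local, remote, volume_name, mode="async"):
    """Create the paired volume and the same-named protection group remotely."""
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    vol = local.volumes[volume_name]
    # The controller first checks that the remote cluster has the storage class.
    if vol["storage_class"] not in remote.storage_classes:
        raise LookupError("remote cluster lacks storage class " + vol["storage_class"])
    group = vol["protection_group"]
    remote_name = volume_name + "-replica"  # naming scheme is an assumption
    remote.volumes[remote_name] = {
        "storage_class": vol["storage_class"],
        "protection_group": group,
        "replica_of": volume_name,
        "mode": mode,
    }
    # The remote protection group carries the same name as the local one.
    remote.protection_groups.setdefault(group, []).append(remote_name)
    return remote_name


source = Cluster("source", ["replicated-sc"])
target = Cluster("target", ["replicated-sc"])
source.volumes["pv-1"] = {"storage_class": "replicated-sc", "protection_group": "pg-1"}
source.protection_groups["pg-1"] = ["pv-1"]
created = replicate_to_remote(source, target, "pv-1", mode="sync")
```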
Keeping the data of the first persistent volume and the second persistent volume synchronized includes using synchronous remote replication or asynchronous remote replication to keep the data of the two volumes synchronized. That is, either synchronous or asynchronous remote replication can be used between the first and second persistent volumes.
In this way, even if one of the clusters fails, the other cluster can quickly take over all services.
The method provided by the embodiment of the present application includes: creating a storage class with a replication type in a local container cluster; creating a first persistent volume with a replication function on the storage, and adding the first persistent volume to a protection group; creating a persistent volume declaration for the storage class, and associating the first persistent volume, the persistent volume declaration, and the protection group; sending a volume creation command to a remote container cluster; and, after a second persistent volume in a replication relationship with the first persistent volume and a protection group with the same name as the protection group have been created in the remote container cluster, keeping the data of the first persistent volume and the second persistent volume synchronized.
In the present application, a storage class with a replication type is created first; a first persistent volume with a replication function is then created on the storage and added to the protection group. A persistent volume declaration for the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated. By issuing a volume creation command to the remote container cluster, the remote container cluster is made to create a second persistent volume in a replication relationship with the first persistent volume, together with a protection group of the same name. The data in the containers can then be protected effectively by keeping the first and second persistent volumes synchronized. In other words, persistent volumes in a replication relationship are created in the container cluster, and the storage system's own replication technology replicates the persistent volume data in the container cluster, which keeps the data consistent and protects the container data.
It should be noted that, based on the above embodiment, the embodiments of the present application also provide corresponding improvements. Steps in the following embodiments that are the same as, or correspond to, steps in the above embodiment may be cross-referenced, as may the corresponding beneficial effects; they are not repeated one by one in the embodiments below.
In a specific implementation of the present application, persistent volumes in a high-availability replication relationship can also be created to protect the data in the containers, thereby keeping the container applications highly available. The specific implementation process includes:
Step 1: Configure a target storage class with a real-time high-availability replication relationship;
Step 2: Create two target persistent volume declarations for the target storage class, where the target persistent volumes corresponding to the two declarations are in a high-availability replication relationship;
Step 3: Obtain the storage performance status, and select the faster-responding of the two target persistent volumes for reading and writing.
For ease of description, the three steps above are explained together below.
First, the cluster information can be configured so that the remote cluster is set to the local cluster itself, and two storage systems are connected to maintain high availability. Then a storage class with a real-time high-availability replication relationship is created. Creating a persistent volume declaration for this storage class produces two persistent volume declarations, whose corresponding persistent volumes are in a high-availability replication relationship.
The container storage controller monitors the storage performance status and, based on the performance and load of the storage on which each persistent volume resides, switches automatically between the persistent volumes: the faster-responding persistent volume is selected for reading and writing, and for frequently accessed applications the high-performance persistent volume is selected.
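The selection rule described above can be sketched in a few lines. The use of observed latency in milliseconds as the performance metric, and the `pick_volume` function itself, are assumptions for illustration; the text only states that the faster-responding volume is chosen based on storage performance and load.

```python
def pick_volume(latencies_ms, healthy):
    """Return the healthy volume with the lowest observed latency.

    latencies_ms maps volume name -> recent response time in milliseconds.
    healthy maps volume name -> True if the volume can currently serve I/O.
    """
    candidates = {
        name: latency
        for name, latency in latencies_ms.items()
        if healthy.get(name, False)
    }
    if not candidates:
        raise RuntimeError("no healthy volume available")
    return min(candidates, key=candidates.get)


latencies = {"vol-a": 4.2, "vol-b": 1.7}
preferred = pick_volume(latencies, {"vol-a": True, "vol-b": True})
fallback = pick_volume(latencies, {"vol-a": True, "vol-b": False})
```

Here `preferred` is the faster volume while both are healthy, and `fallback` shows the same rule degrading to the remaining volume when the faster one is unavailable.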
In a specific implementation of the present application, after one target persistent volume fails, reading and writing switch to the other target persistent volume. That is, when one persistent volume fails, access can switch immediately to the persistent volume in a replication relationship with it, ensuring high availability of the container services and continuity of the applications.
In a specific implementation of the present application, on recovery the replication relationship is re-established and data is synchronized from the normally running target persistent volume. That is, on recovery the replication relationship is re-established, data is synchronized from the other volume, and service resumes once the two volumes are consistent.
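The failover and resynchronization behavior can be modeled with a small sketch. The `HaPair` class below is hypothetical and keeps volume contents in dictionaries; a real system would delegate both the mirroring and the resynchronization to the storage system's replication function.

```python
class HaPair:
    """Toy model of two volumes in a high-availability replication relationship."""

    def __init__(self, first, second):
        self.data = {first: {}, second: {}}
        self.healthy = {first: True, second: True}
        self.active = first

    def write(self, key, value):
        # Real-time replication: every healthy volume receives the write.
        for name, ok in self.healthy.items():
            if ok:
                self.data[name][key] = value

    def read(self, key):
        return self.data[self.active][key]

    def fail(self, name):
        """Mark a volume as failed and switch I/O to the surviving volume."""
        self.healthy[name] = False
        if self.active == name:
            survivors = [n for n, ok in self.healthy.items() if ok]
            if not survivors:
                raise RuntimeError("both volumes have failed")
            self.active = survivors[0]

    def recover(self, name):
        """Re-establish replication by copying from the running volume."""
        self.data[name] = dict(self.data[self.active])
        self.healthy[name] = True


pair = HaPair("vol-a", "vol-b")
pair.write("k1", "v1")
pair.fail("vol-a")        # I/O moves to vol-b
pair.write("k2", "v2")    # only vol-b receives this write
pair.recover("vol-a")     # vol-a catches up from vol-b
```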
In a specific implementation of the present application, failover and recovery across geographically separate clusters can also be achieved. The specific implementation steps include:
Step 1: Use the protection group resource object to keep the resource objects in the primary cluster and the secondary cluster, and the persistent volumes on the storage, in a primary/backup relationship;
Step 2: When the primary cluster fails, use the secondary cluster to restore the container applications of the primary cluster based on the backup data of the container resources.
For ease of description, the two steps above are explained together below.
Assume there are two container application clusters and two storage systems, deployed for remote disaster recovery in a primary/backup relationship.
When the primary cluster fails and cannot provide services, a failover is performed, and the secondary cluster restores the container applications from the backup data of the container resources.
To make the switchover easier to handle, the action attribute of the protection group resource object is set to failover.
The protection group controller in the container storage controller detects the protection group change and calls the protection group failover operation of the container storage interface, which puts the protection group into the failover state in the storage system and stops data synchronization.
Restoring the container applications of the primary cluster with the secondary cluster, based on the backup data of the container resources, includes using the replicated data of the secondary persistent volume in the secondary cluster's backup storage to bring up the services of the container applications. The secondary cluster brings up the services immediately from the replicated data of the secondary persistent volume in its storage, ensuring service availability and data security at the time of a disaster.
Correspondingly, once the primary cluster has recovered from the failure, the action attribute of the protection group resource object is set to re-protect. When the protection group controller in the container storage controller detects the protection group change, it calls the protection group re-protect operation of the container storage interface, which puts the protection group into the re-protect state in the storage system, so that the persistent volume resumes replication from the new "source" and the data is synchronized.
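The failover and re-protect actions can be viewed as transitions of a small state machine. The sketch below is one possible reading under stated assumptions: the state names, the `apply_action` method, and the swap of source and target on re-protect are illustrative choices, not the interface of the container storage controller.

```python
class ProtectionGroup:
    """Toy state machine for the action attribute of a protection group."""

    def __init__(self, name, source, target):
        self.name = name
        self.source = source      # site currently replicating outward
        self.target = target      # site currently receiving replicated data
        self.state = "replicating"
        self.syncing = True

    def apply_action(self, action):
        if action == "failover":
            if self.state != "replicating":
                raise ValueError("failover requires an active replication")
            # The group enters the failover state and synchronization stops.
            self.state = "failed-over"
            self.syncing = False
        elif action == "reprotect":
            if self.state != "failed-over":
                raise ValueError("reprotect requires a prior failover")
            # Replication resumes from the new "source": the former target.
            self.source, self.target = self.target, self.source
            self.state = "reprotected"
            self.syncing = True
        else:
            raise ValueError("unknown action: " + action)


pg = ProtectionGroup("pg-1", source="primary", target="secondary")
pg.apply_action("failover")
state_after_failover = (pg.state, pg.syncing)
pg.apply_action("reprotect")
```

Rejecting a re-protect that was not preceded by a failover is a design choice of this sketch; it keeps the direction of replication unambiguous.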
In a specific implementation of the present application, data backup and recovery can also be achieved through backup objects. The specific implementation process includes:
Step 1: Receive a backup request and create the corresponding backup object;
Step 2: Query the object resources from the container orchestration server, and create the custom resource of the backup object;
Step 3: Call the container storage interface to create, on the storage system, a snapshot of each volume to be backed up;
Step 4: Use the replication manager to upload the backed-up resource data to the backup storage location.
For ease of description, the four steps above are explained together below.
A backup object custom resource is created that specifies the container information and data to be backed up, as well as the cluster and storage resources to be backed up.
The backup storage location and access account are configured. After the backup object controller detects the backup object custom resource, it queries the container orchestration server and collects object resources such as the cluster's container resource objects and application configurations. It then calls the container storage interface to create a snapshot of each volume to be backed up, and uploads the backed-up resource data to the backup storage location.
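The backup flow above can be sketched as a single function. Everything named here is hypothetical: `run_backup`, the shape of `backup_spec`, the `snapshot_fn` callback standing in for the container storage interface, and the dictionary standing in for the backup storage location. The SHA-256 digest stored with the payload is an added assumption so that a later restore has something to verify against.

```python
import hashlib
import json


def run_backup(backup_spec, api_objects, snapshot_fn, store):
    """Collect resources, snapshot volumes, and upload the result."""
    # Step 2: collect the object resources selected by the backup object.
    selected = [
        obj for obj in api_objects
        if obj["namespace"] in backup_spec["namespaces"]
    ]
    # Step 3: ask the storage interface for a snapshot of each volume.
    snapshots = {vol: snapshot_fn(vol) for vol in backup_spec["volumes"]}
    # Step 4: serialize and upload to the backup storage location.
    payload = json.dumps(
        {"resources": selected, "snapshots": snapshots}, sort_keys=True
    )
    entry = {
        "payload": payload,
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }
    store[backup_spec["name"]] = entry
    return entry


objects = [
    {"kind": "Deployment", "name": "web", "namespace": "shop"},
    {"kind": "ConfigMap", "name": "cfg", "namespace": "other"},
]
spec = {"name": "nightly", "namespaces": ["shop"], "volumes": ["pv-1"]}
backup_store = {}
entry = run_backup(spec, objects, lambda vol: "snap-" + vol, backup_store)
```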
Further, the corresponding data recovery process includes:
Step 1: Receive a recovery request and create the corresponding recovery object custom resource;
Step 2: Use the replication manager to validate the recovery object custom resource;
Step 3: After validation passes, use the recovery object controller to obtain the backup resource data from the backup storage location and verify it;
Step 4: After verification passes, use the backup resource data to create the resources being restored.
For ease of description, the four steps above are explained together below.
After the recovery request is received, the recovery object custom resource is created. The recovery object controller detects the recovery object custom resource and validates it. The recovery object controller then obtains the backup resource data from the backup storage location and verifies it. Finally, the recovery object controller creates the resources being restored from the backup.
For how the resources are validated, refer to the relevant validation implementations; the details are not repeated here.
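The recovery flow can be sketched in the same style. The text deliberately leaves the validation scheme open, so the checks below are assumptions: `validate_restore_spec` only confirms that the referenced backup exists, and the data verification is a SHA-256 comparison against a digest assumed to have been stored at backup time. All names are hypothetical.

```python
import hashlib
import json


def validate_restore_spec(spec, store):
    """Return a list of problems with the recovery object (empty if valid)."""
    errors = []
    if not spec.get("backup_name"):
        errors.append("backup_name is required")
    elif spec["backup_name"] not in store:
        errors.append("unknown backup: " + spec["backup_name"])
    return errors


def run_restore(spec, store, cluster):
    """Validate the request, verify the backup data, and recreate resources."""
    errors = validate_restore_spec(spec, store)
    if errors:
        raise ValueError("; ".join(errors))
    entry = store[spec["backup_name"]]
    digest = hashlib.sha256(entry["payload"].encode("utf-8")).hexdigest()
    if digest != entry["sha256"]:
        raise ValueError("backup data failed checksum verification")
    data = json.loads(entry["payload"])
    for obj in data["resources"]:
        cluster[(obj["kind"], obj["name"])] = obj
    return len(data["resources"])


payload = json.dumps({"resources": [{"kind": "Deployment", "name": "web"}]})
store = {
    "nightly": {
        "payload": payload,
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }
}
restored_cluster = {}
restored = run_restore({"backup_name": "nightly"}, store, restored_cluster)
```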
To help those skilled in the art better understand the container data protection method provided by the embodiments of the present application, the method itself and its technical effects are described in detail below in the context of the related art.
Remote replication is the core technology of disaster recovery and backup in storage systems, enabling remote data backup and disaster recovery. It synchronizes and backs up the data of the primary site and keeps the data consistent. When a disaster occurs, the secondary site can quickly take over the primary site's business data, which preserves business continuity and avoids the losses caused by a business interruption. When the primary site's business data becomes invalid, it can be restored from the data at the secondary site, which makes business recovery straightforward. Remote replication is divided into synchronous and asynchronous replication. Synchronous remote replication synchronizes data in real time after the initial synchronization, which maximizes data consistency and reduces the amount of data lost when a disaster occurs. Asynchronous remote replication synchronizes data periodically after the initial synchronization, which minimizes the loss of business performance caused by the latency of remote data transmission.
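The difference between the two modes can be made concrete with a toy model. The `ReplicatedVolume` class is hypothetical and ignores network transport entirely; it only shows where a write lands at the moment it is acknowledged, and how many writes would be lost if the primary site failed at that point.

```python
class ReplicatedVolume:
    """Toy model contrasting synchronous and asynchronous remote replication."""

    def __init__(self, mode):
        if mode not in ("sync", "async"):
            raise ValueError("mode must be 'sync' or 'async'")
        self.mode = mode
        self.primary = {}
        self.secondary = {}
        self.pending = []   # writes not yet shipped to the secondary site

    def write(self, key, value):
        self.primary[key] = value
        if self.mode == "sync":
            # The write completes only after the remote copy is updated,
            # so each write pays the remote round-trip latency.
            self.secondary[key] = value
        else:
            # The write completes immediately; the remote copy lags behind.
            self.pending.append((key, value))

    def replicate_cycle(self):
        """Periodic transfer used by asynchronous replication."""
        shipped = len(self.pending)
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()
        return shipped

    def unreplicated_writes(self):
        """Writes that would be lost if the primary site failed right now."""
        return len(self.pending)


sync_vol = ReplicatedVolume("sync")
sync_vol.write("a", 1)

async_vol = ReplicatedVolume("async")
async_vol.write("a", 1)
async_vol.write("b", 2)
exposed = async_vol.unreplicated_writes()
shipped = async_vol.replicate_cycle()
```

The sketch reflects the trade-off stated above: the synchronous volume never has unreplicated writes, while the asynchronous volume accepts a window of potential loss between cycles in exchange for not waiting on the remote site.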
Related container backup and disaster recovery technologies: container backup and recovery is mainly divided into the backup and recovery of cluster resources and the backup and recovery of persistent volumes.
The backup and recovery of cluster resources mainly includes:
Method a: Backup and recovery of container images. The standard practice is to use the replication function of the image registry to synchronize images between registries in different data centers.
Method b: Backup of resource objects, including the various configurations and resource relationships, so that the same clusters, applications, and configurations can be restored and functionality remains consistent.
For the backup and recovery of persistent volumes, there are mainly the following approaches:
The first uses the server that stores the data directly to take periodic snapshot backups.
The second deploys a dedicated backup client on each target server, specifies the data directories to back up, and periodically copies the data remotely to external storage.
The third backs up persistent volumes using the snapshot function of the container storage interface: a snapshot of the persistent volume is created as the backup, and recovery is performed by creating a volume from the snapshot.
As can be seen, traditional data protection solutions focus mainly on virtual machines or physical machines, and tend to protect a single server and the applications running on it. In the era of container orchestration, however, containers are a dynamically changing resource, and applications are usually widely distributed, sometimes spanning multiple clouds and multiple data centers. Traditional backup and disaster recovery solutions do not work well in containerized environments. Existing container data protection solutions often back up container resource objects and persistent volume data separately, and for the backup and disaster recovery of persistent volumes they either rely on the storage server or deploy a dedicated client program to drive the storage system's backup, which leads to problems such as rigid backup mechanisms and slow data recovery. Some protection solutions can back up container resource objects and persistent volume data together, but they back up and restore persistent volumes only through the snapshot function of the container storage interface; backup and recovery take a long time, which does not suit scenarios that require high availability and fast recovery of containers.
Referring to FIG. 3, it can be seen from the above embodiments that the technical solution provided by the present application integrates the backup and recovery of container resource objects with that of persistent volume data, and brings the backup and recovery capabilities of the storage system to the container cluster. Persistent volumes with different replication relationship types can be created within one container cluster or across multiple clusters, and the storage system's own replication technology (synchronous/asynchronous replication or remote replication) is used to replicate the persistent volume data of the containers in the cluster and keep the data consistent. Through parameter configuration, a pair of persistent volumes in a replication relationship can be created when a persistent volume is created. When one of them fails, access switches seamlessly between the replicated volumes; in addition, based on the performance and load of the storage on which the persistent volumes reside, the faster-responding persistent volume is selected for reading and writing, and the high-performance volume is selected for frequently accessed applications, which ensures high availability of the container services and continuity of the applications. For remote disaster recovery scenarios, primary and backup volumes in a remote replication relationship can be created, and the container cluster object resources and application configurations are backed up at the same time. When the primary site has a problem, a failover is performed through the secondary volume in a replication relationship with the primary volume, and the container cluster object resources and application configurations are restored in another cluster, so that service recovers quickly at the disaster recovery site. Once the primary site has recovered, the persistent volume data is restored and the primary and secondary roles are switched, which keeps the data consistent and available. This technical solution applies to all scenarios of container data protection, including backup, high availability, application continuity, and disaster recovery.
In other words, the present application integrates the backup and recovery of container resource objects with that of persistent volume data, brings the backup and recovery capabilities of the storage system to the container cluster, and uses the replication and recovery technology of the storage device to replicate container persistent volume data, thereby providing disaster recovery protection for container persistent volumes. When a disaster strikes the primary data center, the data at the disaster recovery center can be used directly to set up an operations support environment and provide IT (Information Technology) support for continued business operation. The data at the disaster recovery center can also be used to restore the business systems of the primary data center, so that the enterprise's business operations quickly return to the normal state that existed before the disaster. The solution is developed in a cloud-native manner and integrates well with the container orchestration server.
The present application can back up and restore container resource objects, application configurations, and persistent volume data together; persistent volume data can be backed up and restored both by local snapshots and by remote replication.
The backup and recovery capabilities of the storage system are brought to the container cluster: the persistent volumes of the containers use the storage system's own replication technology (synchronous/asynchronous replication, remote replication) to maintain replicas of the data and keep them consistent.
When a persistent volume is created, a pair of persistent volumes in a synchronous replication relationship can be created. When one of them fails, access switches seamlessly between the replicated volumes; in addition, based on the performance and load of the storage on which the persistent volumes reside, the faster-responding persistent volume is selected for reading and writing, and the high-performance volume is selected for frequently accessed applications, which ensures high availability of the container services and continuity of the applications.
Disaster recovery deployments across clusters and across storage regions are supported. Primary and backup volumes in a remote replication relationship can be created, and the container cluster object resources and application configurations are backed up at the same time. When the primary site fails, a failover is performed through the secondary volume in a replication relationship with the primary volume, and the container cluster object resources and application configurations are restored in another cluster, so that service recovers quickly at the disaster recovery site. Once the primary site has recovered, the persistent volume data is restored and the primary and secondary roles are switched, which keeps the data consistent and available.
Corresponding to the above method embodiment, an embodiment of the present application further provides a container data protection apparatus. The container data protection apparatus described below and the container data protection method described above may be cross-referenced.
As shown in FIG. 4, the apparatus includes:
a storage class creation unit 101, configured to create a storage class with a replication type in a local container cluster;
an object association unit 102, configured to create a first persistent volume with a replication function on the storage, and add the first persistent volume to a protection group; and to create a persistent volume declaration for the storage class, and associate the first persistent volume, the persistent volume declaration, and the protection group;
a data synchronization unit 103, configured to send a volume creation command to a remote container cluster; and, after a second persistent volume in a replication relationship with the first persistent volume and a protection group with the same name as the protection group have been created in the remote container cluster, to keep the data of the first persistent volume and the second persistent volume synchronized.
With the apparatus provided by the embodiment of the present application, a storage class with a replication type is created in a local container cluster; a first persistent volume with a replication function is created on the storage and added to a protection group; a persistent volume declaration for the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated; a volume creation command is sent to a remote container cluster; and, after a second persistent volume in a replication relationship with the first persistent volume and a protection group with the same name as the protection group have been created in the remote container cluster, the data of the first persistent volume and the second persistent volume is kept synchronized.
In the present application, a storage class with a replication type is created first; a first persistent volume with a replication function is then created on the storage and added to the protection group. A persistent volume declaration for the storage class is created, and the first persistent volume, the persistent volume declaration, and the protection group are associated. By issuing a volume creation command to the remote container cluster, the remote container cluster is made to create a second persistent volume in a replication relationship with the first persistent volume, together with a protection group of the same name. The data in the containers can then be protected effectively by keeping the first and second persistent volumes synchronized. In other words, persistent volumes in a replication relationship are created in the container cluster, and the storage system's own replication technology replicates the persistent volume data in the container cluster, which keeps the data consistent and protects the container data.
In a specific implementation of the present application, the object association unit 102 is specifically configured to: set a volume annotation and a volume label on the first persistent volume;
set a resource object for the protection group, and set a resource annotation and a resource label;
associate the first persistent volume, the persistent volume claim, and the protection group by means of the volume annotation, the volume label, the resource annotation, and the resource label.
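The association through labels and annotations can be pictured with the following minimal sketch. It treats each object as a dictionary with a `metadata` section; the keys under `example.com/` are hypothetical, and which information goes into a label and which into an annotation is an assumption made for the example.

```python
GROUP_LABEL = "example.com/protection-group"   # hypothetical label key
CLAIM_ANNOTATION = "example.com/claim"         # hypothetical annotation key
VOLUME_ANNOTATION = "example.com/volume"       # hypothetical annotation key


def associate(volume: dict, claim: dict, group: dict) -> None:
    """Cross-reference volume, claim and protection group in their metadata."""
    group_name = group["metadata"]["name"]
    # Label all three objects with the protection group name
    for obj in (volume, claim, group):
        obj["metadata"].setdefault("labels", {})[GROUP_LABEL] = group_name
    # Record the volume/claim pairing in annotations
    volume["metadata"].setdefault("annotations", {})[CLAIM_ANNOTATION] = (
        claim["metadata"]["name"]
    )
    group["metadata"].setdefault("annotations", {})[VOLUME_ANNOTATION] = (
        volume["metadata"]["name"]
    )


def members(objects: list, group_name: str) -> list:
    """Return the names of the objects labelled as belonging to the group."""
    return [
        o["metadata"]["name"]
        for o in objects
        if o["metadata"].get("labels", {}).get(GROUP_LABEL) == group_name
    ]


pv = {"metadata": {"name": "pv-1"}}
pvc = {"metadata": {"name": "app-data"}}
pg = {"metadata": {"name": "pg-1"}}
associate(pv, pvc, pg)
linked = members([pv, pvc, pg], "pg-1")
```

Labels suit lookups such as `members`, because they can be matched by a selector, while annotations carry the one-to-one references.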
In a specific implementation of the present application, the object association unit 102 is specifically configured to keep the data of the first persistent volume and the second persistent volume synchronized by means of synchronous remote replication or asynchronous remote replication.
In a specific implementation of the present application, the object association unit 102 is further configured to: configure a target storage class having a real-time high-availability replication relationship;
create two target persistent volume claims for the target storage class, where the target persistent volumes corresponding to the two target persistent volume claims have a high-availability replication relationship;
obtain the storage performance status, and select the persistent volume with the faster response from the two target persistent volumes for reading and writing.
In a specific implementation of the present application, the object association unit 102 is further configured to switch to the other target persistent volume for reading and writing after one of the target persistent volumes fails.
In a specific implementation of the present application, the object association unit 102 is further configured to re-establish the replication relationship during recovery and synchronize data from the target persistent volume that is operating normally.
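The behaviour of the two target persistent volumes (select the faster one, switch on failure, resynchronize on recovery) can be modelled in memory as follows. This is a sketch under stated assumptions: the class names, the `latency_ms` metric, and the dictionary used as volume contents are all hypothetical stand-ins, and real replication would be performed by the storage system.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    latency_ms: float          # latest measured response time (assumed metric)
    healthy: bool = True
    blocks: dict = field(default_factory=dict)


class HighAvailabilityPair:
    """Two volumes in a high-availability replication relationship."""

    def __init__(self, a: Volume, b: Volume):
        self.volumes = (a, b)

    def active(self) -> Volume:
        """Pick the healthy volume with the lowest latency."""
        candidates = [v for v in self.volumes if v.healthy]
        if not candidates:
            raise RuntimeError("both volumes have failed")
        return min(candidates, key=lambda v: v.latency_ms)

    def write(self, key: str, value: bytes) -> None:
        """Write to every healthy volume, mimicking real-time replication."""
        self.active()  # raises if no volume can serve the write
        for v in self.volumes:
            if v.healthy:
                v.blocks[key] = value

    def recover(self, failed: Volume) -> None:
        """Re-establish replication by copying from the surviving volume."""
        source = next(v for v in self.volumes if v is not failed and v.healthy)
        failed.blocks = dict(source.blocks)
        failed.healthy = True


a = Volume("vol-a", latency_ms=2.0)
b = Volume("vol-b", latency_ms=5.0)
pair = HighAvailabilityPair(a, b)
before_failure = pair.active().name   # the faster volume serves I/O
a.healthy = False                     # simulate a failure of vol-a
pair.write("k", b"v")                 # only vol-b receives the write
during_failure = pair.active().name
pair.recover(a)                       # vol-a catches up from vol-b
```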
In a specific implementation of the present application, the object association unit 102 is further configured to: use the protection group resource object to maintain a primary-backup relationship between the resource objects in the master cluster and the slave cluster and between the persistent volumes on the storage;
when the master cluster fails, use the slave cluster to restore the container application of the master cluster based on the backup data of the container resources.
In a specific implementation of the present application, the object association unit 102 is further configured to set the action attribute of the protection group resource object to failover.
In a specific implementation of the present application, the object association unit 102 is further configured to set the action attribute of the protection group resource object to reprotect after the master cluster has recovered from the failure.
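A controller reacting to the action attribute could look roughly like the sketch below. The attribute values `failover` and `reprotect` mirror the text; the field names and the exact effect of each action (role swap on failover, replication resumed in the new direction on reprotect) are assumptions made for illustration.

```python
from dataclasses import dataclass

FAILOVER = "failover"
REPROTECT = "reprotect"


@dataclass
class ProtectionGroup:
    name: str
    primary: str             # cluster currently serving the application
    secondary: str           # cluster holding the replicated copy
    replicating: bool = True

    def apply(self, action: str) -> None:
        """React to a change of the group's action attribute."""
        if action == FAILOVER:
            # Assumed semantics: promote the secondary and pause replication
            self.primary, self.secondary = self.secondary, self.primary
            self.replicating = False
        elif action == REPROTECT:
            # Assumed semantics: resume replication from the current primary
            self.replicating = True
        else:
            raise ValueError(f"unknown action: {action}")


pg = ProtectionGroup("pg-1", primary="cluster-a", secondary="cluster-b")
pg.apply(FAILOVER)
after_failover = (pg.primary, pg.secondary, pg.replicating)
pg.apply(REPROTECT)
after_reprotect = (pg.primary, pg.secondary, pg.replicating)
```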
In a specific implementation of the present application, the data synchronization unit 103 is further configured to use the slave cluster to bring up the services of the container application from the replicated data of the secondary persistent volume in the backup storage.
In a specific implementation of the present application, the data synchronization unit 103 is further configured to: receive a backup request and create a corresponding backup object;
query object resources from the container orchestration server, and create a custom resource for the backup object;
call the container storage interface to create, on the storage system, a snapshot of each volume that needs to be backed up;
use the replication manager to upload the backed-up resource data to the backup storage location.
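The four backup steps can be walked through with in-memory stand-ins. In this sketch the list `cluster_objects` plays the role of the container orchestration server, the callable `snapshot_volume` stands in for the container storage interface, and the dictionary `store` stands in for the backup storage location; all of these names and the request fields are hypothetical.

```python
import json


def run_backup(request: dict, cluster_objects: list, snapshot_volume, store: dict) -> dict:
    """Execute the four backup steps against in-memory stand-ins."""
    # 1. Create the backup object for the request
    backup = {"name": request["name"], "namespace": request["namespace"], "phase": "New"}
    # 2. Query the resources to back up (stand-in for the orchestration server)
    resources = [o for o in cluster_objects if o["namespace"] == request["namespace"]]
    # 3. Snapshot every volume referenced by those resources
    backup["snapshots"] = [snapshot_volume(o["volume"]) for o in resources if "volume" in o]
    # 4. Upload the serialized resource data to the backup location
    store[backup["name"]] = json.dumps(resources, sort_keys=True)
    backup["phase"] = "Completed"
    return backup


objects = [
    {"name": "web", "namespace": "shop", "volume": "pv-1"},
    {"name": "cfg", "namespace": "shop"},
    {"name": "other", "namespace": "dev", "volume": "pv-9"},
]
store = {}
result = run_backup({"name": "b1", "namespace": "shop"}, objects, lambda v: f"snap-{v}", store)
```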
In a specific implementation of the present application, the data synchronization unit 103 is further configured to: receive a recovery request and create a corresponding recovery object custom resource;
use the replication manager to validate the recovery object custom resource;
after the validation passes, use the recovery object controller to obtain the backup resource data from the backup storage location and verify it;
after the verification passes, use the backup resource data to create the resources being restored from the backup.
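The recovery sequence (validate the request, fetch and verify the backup data, recreate the resources) can be sketched in the same spirit. The text does not say how the backup data is verified; the SHA-256 checksum used here, the request fields, and the layout of `store` are assumptions chosen only to make the example concrete.

```python
import hashlib
import json


def run_recovery(request: dict, store: dict, cluster: dict) -> list:
    """Validate the request, verify the backup data, then recreate resources."""
    # Validate the recovery object (here: required fields only)
    for key in ("name", "backupName"):
        if not request.get(key):
            raise ValueError(f"recovery request is missing {key!r}")
    # Fetch the backup resource data from the storage location
    entry = store.get(request["backupName"])
    if entry is None:
        raise KeyError("backup not found in the storage location")
    # Verify the payload against the checksum recorded at backup time
    if hashlib.sha256(entry["data"].encode()).hexdigest() != entry["sha256"]:
        raise ValueError("backup data failed the integrity check")
    # Recreate every backed-up resource in the target cluster
    restored = []
    for obj in json.loads(entry["data"]):
        cluster[obj["name"]] = obj
        restored.append(obj["name"])
    return restored


data = json.dumps([{"name": "web"}, {"name": "cfg"}])
store = {"b1": {"data": data, "sha256": hashlib.sha256(data.encode()).hexdigest()}}
cluster = {}
names = run_recovery({"name": "r1", "backupName": "b1"}, store, cluster)
```

Failing fast on validation and verification means nothing is created in the cluster unless both checks pass.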
Corresponding to the above method embodiments, the embodiments of the present application further provide an electronic device. The electronic device described below and the container data protection method described above may be referred to in correspondence with each other.
As shown in FIG. 5, the electronic device includes:
a memory 332, configured to store a computer program;
a processor 322, configured to implement the steps of the container data protection method of the above method embodiments when executing the computer program.
Specifically, refer to FIG. 6, which is a schematic diagram of the specific structure of an electronic device provided in this embodiment. The electronic device may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 322 and a memory 332, the memory 332 storing one or more computer programs 342 or data 344. The memory 332 may be transient storage or persistent storage. The programs stored in the memory 332 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the data processing device. Furthermore, the central processing unit 322 may be configured to communicate with the memory 332 and to execute, on the electronic device 301, the series of instruction operations stored in the memory 332.
The electronic device 301 may further include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341.
The steps of the container data protection method described above can be implemented by the structure of the electronic device.
Corresponding to the above method embodiments, the embodiments of the present application further provide a non-volatile readable storage medium. The non-volatile readable storage medium described below and the container data protection method described above may be referred to in correspondence with each other.
Refer to FIG. 7, which shows a non-volatile readable storage medium provided in this embodiment. A computer program is stored on the non-volatile readable storage medium, and when the computer program is executed by a processor, the steps of the container data protection method of the above method embodiments are implemented.
The non-volatile readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other non-volatile readable storage medium capable of storing program code.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other. Since the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in a random access memory (RAM), an internal memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device.
Specific examples are used herein to explain the principles and implementations of the present application. The description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (23)

  1. A container data protection system, characterized by comprising:
    a replication manager, a container storage controller, a container orchestration server, and a container storage interface;
    wherein the container orchestration server and the container storage controller communicate with the container storage interface via a remote procedure call protocol;
    the replication manager and the container storage controller communicate with the container orchestration server via the Hypertext Transfer Protocol;
    the replication manager comprises a replication control management module and a cluster management module;
    the container storage controller comprises a container storage interface interaction module and a container storage management module;
    the container storage interface is used to perform remote replication management and volume operations on a storage system.
  2. The container data protection system according to claim 1, wherein the cluster management module uses REST API calls to the container orchestration server in a remote cluster to query, create, and modify resource object information in the remote cluster.
  3. The container data protection system according to claim 1, wherein the cluster management module is configured to obtain cluster access configuration information, communicate between container clusters, and query, create, and modify resource objects in the remote cluster.
  4. The container data protection system according to claim 1, wherein the replication control management module comprises a backup object controller, a recovery object controller, a persistent volume controller, a persistent volume claim controller, and a protection group controller, which monitor and operate on resource objects.
  5. The container data protection system according to claim 4, wherein the backup object controller is configured to obtain resource objects related to cluster and application configuration from the container orchestration server;
    the persistent volume controller is configured to monitor the status, labels, and annotations of persistent volumes, and to query, create, and modify resource objects in the remote cluster as needed.
  6. The container data protection system according to claim 1, wherein the container storage interface interaction module is configured to call the RPC service in the container storage interface to perform remote replication management on the storage system.
  7. The container data protection system according to claim 1, wherein the controller management module comprises a persistent volume controller, a persistent volume claim controller, and a protection group controller, monitors creation events of persistent volumes or persistent volume claims, and, if the created volume is a replication volume, calls the RPC service of the container storage interface through the container storage interface interaction module to perform storage operations;
    the persistent volume controller creates a protection group resource object from the volume information, and establishes the association between the persistent volume and the protection group by adding annotations and labels to the persistent volume and the persistent volume claim;
    the protection group controller is configured to manage protection group instances, process operation requests for protection groups, monitor the replication status, and update the status of sub-resources.
  8. The container data protection system according to claim 1, wherein the container storage interface further comprises an RPC service for remote replication volume management;
    the remote replication volume management service connects to the storage system and uses the remote replication function of the storage system to create replication pairs, add protection groups, synchronize data, and perform status operations on protection groups;
    the remote replication function comprises synchronous remote replication and asynchronous remote replication.
  9. A container data protection method, characterized in that it is applied to the container data protection system according to any one of claims 1 to 8 and comprises:
    creating a storage class of a replication type in a local container cluster;
    creating a first persistent volume with a replication function on the storage, and adding the first persistent volume to a protection group;
    creating a persistent volume claim of the storage class, and associating the first persistent volume, the persistent volume claim, and the protection group;
    sending a volume creation command to a remote container cluster;
    after a second persistent volume having a replication relationship with the first persistent volume and a protection group with the same name as the protection group have been created in the remote container cluster, keeping the data of the first persistent volume and the second persistent volume synchronized.
  10. The container data protection method according to claim 9, wherein associating the first persistent volume, the persistent volume claim, and the protection group comprises:
    setting a volume annotation and a volume label on the first persistent volume;
    setting a resource object for the protection group, and setting a resource annotation and a resource label;
    associating the first persistent volume, the persistent volume claim, and the protection group by means of the volume annotation, the volume label, the resource annotation, and the resource label.
  11. The container data protection method according to claim 9, wherein keeping the data of the first persistent volume and the second persistent volume synchronized comprises:
    keeping the data of the first persistent volume and the second persistent volume synchronized by means of synchronous remote replication or asynchronous remote replication.
  12. The container data protection method according to claim 9, further comprising:
    configuring a target storage class having a real-time high-availability replication relationship;
    creating two target persistent volume claims for the target storage class, wherein the target persistent volumes corresponding to the two target persistent volume claims have a high-availability replication relationship;
    obtaining the storage performance status, and selecting the persistent volume with the faster response from the two target persistent volumes for reading and writing.
  13. The container data protection method according to claim 12, further comprising:
    after one of the target persistent volumes fails, switching to the other target persistent volume for reading and writing.
  14. The container data protection method according to claim 13, further comprising:
    during recovery, re-establishing the replication relationship and synchronizing data from the target persistent volume that is operating normally.
  15. The container data protection method according to claim 9, further comprising:
    using a protection group resource object to maintain a primary-backup relationship between the resource objects in a master cluster and a slave cluster and between the persistent volumes on the storage;
    when the master cluster fails, using the slave cluster to restore the container application of the master cluster based on the backup data of the container resources.
  16. The container data protection method according to claim 15, further comprising:
    setting the action attribute of the protection group resource object to failover.
  17. The container data protection method according to claim 16, further comprising:
    after the master cluster has recovered from the failure, setting the action attribute of the protection group resource object to reprotect.
  18. The container data protection method according to claim 15, wherein using the slave cluster to restore the container application of the master cluster based on the backup data of the container resources comprises:
    using the slave cluster to bring up the services of the container application from the replicated data of the secondary persistent volume in the backup storage.
  19. The container data protection method according to claim 9, further comprising:
    receiving a backup request and creating a corresponding backup object;
    querying object resources from the container orchestration server, and creating a custom resource for the backup object;
    calling the container storage interface to create, on the storage system, a snapshot of each volume that needs to be backed up;
    using the replication manager to upload the backed-up resource data to the backup storage location.
  20. The container data protection method according to claim 19, further comprising:
    receiving a recovery request and creating a corresponding recovery object custom resource;
    using the replication manager to validate the recovery object custom resource;
    after the validation passes, using the recovery object controller to obtain the backup resource data from the backup storage location and verify it;
    after the verification passes, using the backup resource data to create the resources being restored from the backup.
  21. A container data protection apparatus, characterized by comprising:
    a storage class creation unit, configured to create a storage class of a replication type in a local container cluster;
    an object association unit, configured to create a first persistent volume with a replication function on the storage and add the first persistent volume to a protection group; and to create a persistent volume claim of the storage class and associate the first persistent volume, the persistent volume claim, and the protection group;
    a data synchronization unit, configured to send a volume creation command to a remote container cluster; and, after a second persistent volume having a replication relationship with the first persistent volume and a protection group with the same name as the protection group have been created in the remote container cluster, to keep the data of the first persistent volume and the second persistent volume synchronized.
  22. An electronic device, characterized by comprising:
    a memory, configured to store a computer program;
    a processor, configured to implement the steps of the container data protection method according to any one of claims 9 to 20 when executing the computer program.
  23. A non-volatile readable storage medium, characterized in that a computer program is stored on the non-volatile readable storage medium, and when the computer program is executed by a processor, the steps of the container data protection method according to any one of claims 9 to 20 are implemented.
PCT/CN2023/134107 2022-12-09 2023-11-24 Container data protection system, method and apparatus, and device and readable storage medium WO2024120227A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211575807.9A CN115576655B (en) 2022-12-09 2022-12-09 Container data protection system, method, device, equipment and readable storage medium
CN202211575807.9 2022-12-09

Publications (1)

Publication Number Publication Date
WO2024120227A1 true WO2024120227A1 (en) 2024-06-13

Family

ID=84589998

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/134107 WO2024120227A1 (en) 2022-12-09 2023-11-24 Container data protection system, method and apparatus, and device and readable storage medium

Country Status (2)

Country Link
CN (1) CN115576655B (en)
WO (1) WO2024120227A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115576655B (en) * 2022-12-09 2023-04-14 浪潮电子信息产业股份有限公司 Container data protection system, method, device, equipment and readable storage medium
CN116088768B (en) * 2023-02-24 2023-07-14 苏州浪潮智能科技有限公司 Dynamic storage allocation method, dynamic storage allocation device, electronic equipment and storage medium
CN116244040B (en) * 2023-03-10 2024-05-03 安超云软件有限公司 Main and standby container cluster system, data synchronization method thereof and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103870570A (en) * 2014-03-14 2014-06-18 广州携智信息科技有限公司 HBase (Hadoop database) data usability and durability method based on remote log backup
US20200034240A1 (en) * 2018-07-30 2020-01-30 EMC IP Holding Company LLC Network block device based continuous replication for kubernetes container management systems
CN111400307A (en) * 2020-02-20 2020-07-10 上海交通大学 Persistent hash table access system supporting remote concurrent access
US20220236879A1 (en) * 2021-01-27 2022-07-28 Hitachi, Ltd. Dynamic volume provisioning for remote replication
CN115576655A (en) * 2022-12-09 2023-01-06 浪潮电子信息产业股份有限公司 Container data protection system, method, device, equipment and readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8788768B2 (en) * 2010-09-29 2014-07-22 International Business Machines Corporation Maintaining mirror and storage system copies of volumes at multiple remote sites
US11256434B2 (en) * 2019-04-17 2022-02-22 Robin Systems, Inc. Data de-duplication
US11467775B2 (en) * 2019-10-15 2022-10-11 Hewlett Packard Enterprise Development Lp Virtual persistent volumes for containerized applications
CN113296871A (en) * 2020-04-10 2021-08-24 阿里巴巴集团控股有限公司 Method, equipment and system for processing container group instance
CN114138408B (en) * 2021-11-12 2024-05-03 苏州浪潮智能科技有限公司 Clone volume creation method, clone volume creation device, computer equipment and storage medium
CN114996053A (en) * 2022-05-31 2022-09-02 济南浪潮数据技术有限公司 Remote volume replication transmission method, system, device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PINGFU ZHU: "The Research of a Development Method and Apply Based on Lightweight J2EE Architecture", CHINESE MASTER'S THESIS, CETRAL SOUTH UNIVERSITY, 30 June 2008 (2008-06-30), Cetral South University, XP093179564 *
WENJING DONG: "Research and Optimization of High Availability Management Technology Based on Alluxio", CHINESE MASTER'S THESES, 1 June 2017 (2017-06-01), XP093179585 *

Also Published As

Publication number Publication date
CN115576655B (en) 2023-04-14
CN115576655A (en) 2023-01-06
