CN115640169A

CN115640169A - Method, system, device and storage medium for ensuring that a master cluster stops providing services

Info

Publication number: CN115640169A
Application number: CN202211654141.6A
Authority: CN
Inventors: 李龙峰
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2022-12-22
Filing date: 2022-12-22
Publication date: 2023-01-24

Abstract

The present invention relates to the field of containers and provides a method, a system, a device and a storage medium for ensuring that a master cluster stops providing services. The method comprises the following steps: in a disaster-recovery master/standby switching scenario, in response to receiving a master/standby switching pause container request, initiating a pause container request for the master cluster; in response to receiving the pause container request for the master cluster, pausing a target container according to that request; placing the pause container resources into a cache and obtaining a pause container resource list; and, in response to the pause container request for the master cluster reaching a completed state, detecting whether the container's service data has been completely flushed to disk. When master/standby switching is performed in a disaster-recovery scenario, the container pause and service data detection operations can be executed conveniently and efficiently, which improves the strong consistency of applications after the switch and solves the problem of data loss caused by user data not being flushed to disk in time.

Description

Method, system, device and storage medium for ensuring that a master cluster stops providing services

Technical Field

The present invention relates to the field of containers, and in particular, to a method, system, device, and storage medium for guaranteeing that a master cluster stops providing a service.

Background

In the era of container technology, a large number of business applications are moved to the cloud using Kubernetes containers. In some disaster-recovery scenarios, a Kubernetes cluster needs to undergo master/standby switching so that operations such as cluster migration or upgrading can be performed on the master Kubernetes cluster. An important metric for master/standby switching is data integrity and consistency: all data generated by user requests processed in the master cluster must be flushed to disk. Therefore, ensuring that all user data is flushed to disk in time when master/standby switching is performed in a disaster-recovery scenario is a problem to be solved.

Disclosure of Invention

In view of this, an object of the embodiments of the present invention is to provide a method, a system, a computer device and a computer-readable storage medium for ensuring that a master cluster stops providing services. The invention pauses the containers of the master cluster and checks their service data, so that each container can complete its drain operation and user data is flushed to the storage device. This guarantees that running containers are paused and their service data is checked during master/standby switching in a disaster-recovery scenario, improves the strong consistency of applications after the switch, and solves the problem of data loss caused by user data not being flushed to disk in time.

Based on the above object, an aspect of the embodiments of the present invention provides a method for ensuring that a master cluster stops providing services, including the following steps: in a disaster-recovery master/standby switching scenario, in response to receiving a master/standby switching pause container request, initiating a pause container request for the master cluster; in response to receiving the pause container request for the master cluster, pausing a target container according to that request; placing the pause container resources into a cache and obtaining a pause container resource list; and, in response to the pause container request for the master cluster reaching a completed state, detecting whether the container's service data has been completely flushed to disk.

In some embodiments, initiating the pause container request for the master cluster comprises: starting a custom resource controller to listen for pause container request resources.

In some embodiments, initiating the pause container request for the master cluster comprises: listening for add, modify and delete events of the pause container request resources, and processing the corresponding business logic according to those events.

In some embodiments, initiating the pause container request for the master cluster comprises: finding the containers running on the master cluster and compiling the list of containers that need to be paused.

In some embodiments, initiating the pause container request for the master cluster comprises: starting a listening service to monitor the execution of the pause container request.

In some embodiments, pausing the target container according to the pause container request for the master cluster comprises: querying and listening for the pause container request for the master cluster through a ListWatch mechanism.

In some embodiments, pausing the target container according to the pause container request for the master cluster comprises: obtaining the pause container request for the master cluster through List, and executing it through Watch.

In some embodiments, pausing the target container according to the pause container request for the master cluster comprises: watching the pause container resources through Watch, and invoking the gRPC interface service to handle the different business logic.

In some embodiments, the method further comprises: defining protobuf transport protocol data to provide a gRPC service for pause container requests.

In some embodiments, detecting whether the container's service data has been completely flushed to disk comprises: iterating over the pause container resource list in a loop and calling the pause container service interface.

In some embodiments, detecting whether the container's service data has been completely flushed to disk comprises: obtaining the drain time and the detection configuration information from the annotations of the pause container request, and detecting, according to the detection configuration information, whether new data is still being added to the corresponding database.

In some embodiments, detecting, according to the detection configuration information, whether new data is still being added to the corresponding database comprises: determining whether the corresponding database data changes within a preset time, and, in response to the data not changing within the preset time, determining that no new data is being added to the corresponding database.

In another aspect of the embodiments of the present invention, a system for ensuring that a master cluster stops providing services is provided, including: a definition module configured to initiate, in a disaster-recovery master/standby switching scenario, a pause container request for the master cluster in response to receiving a master/standby switching pause container request; a pause module configured to pause a target container according to the pause container request for the master cluster in response to receiving that request; a cache module configured to place the pause container resources into a cache and obtain a pause container resource list; and a detection module configured to detect, in response to the pause container request for the master cluster reaching a completed state, whether the container's service data has been completely flushed to disk.

In another aspect of the embodiments of the present invention, there is also provided a computer device, including: at least one processor; and a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of the method as above.

In a further aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, which stores a computer program that implements the above method steps when executed by a processor.

The invention has the following beneficial technical effects: the containers of the master cluster are paused and their service data is checked, so that each container can complete its drain operation and user data is flushed to the storage device. This guarantees that running containers are paused and their service data is checked during master/standby switching in a disaster-recovery scenario, improves the strong consistency of applications after the switch, and solves the problem of data loss caused by user data not being flushed to disk in time.

Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other embodiments from these drawings without creative effort.

FIG. 1 is a schematic diagram of an embodiment of the method for ensuring that a master cluster stops providing services according to the present invention;

FIG. 2 is an architecture diagram of an embodiment of the method for ensuring that a master cluster stops providing services according to the present invention;

FIG. 3 is an operation flowchart of an embodiment of the method for ensuring that a master cluster stops providing services according to the present invention;

FIG. 4 is a schematic diagram of an embodiment of the system for ensuring that a master cluster stops providing services according to the present invention;

FIG. 5 is a schematic diagram of the hardware structure of an embodiment of the computer device for ensuring that a master cluster stops providing services according to the present invention;

FIG. 6 is a schematic diagram of an embodiment of the computer storage medium for ensuring that a master cluster stops providing services according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.

It should be noted that the expressions "first" and "second" in the embodiments of the present invention are used to distinguish two entities that share the same name but are not the same, or two parameters that are not the same. "First" and "second" are used merely for convenience of description, should not be construed as limiting the embodiments of the present invention, and are not explained again in the following embodiments.

In a first aspect of the embodiments of the present invention, an embodiment of a method for ensuring that a master cluster stops providing services is provided. Fig. 1 is a schematic diagram illustrating an embodiment of a method for guaranteeing that a master cluster stops providing a service according to the present invention. As shown in fig. 1, the embodiment of the present invention includes the following steps:

S1, in a disaster-recovery master/standby switching scenario, in response to receiving a master/standby switching pause container request, initiating a pause container request for the master cluster;

S2, in response to receiving the pause container request for the master cluster, pausing a target container according to that request;

S3, placing the pause container resources into a cache, and obtaining a pause container resource list; and

S4, in response to the pause container request for the master cluster reaching a completed state, detecting whether the container's service data has been completely flushed to disk.

Kubernetes is an open-source platform for the automated deployment, scaling, and operation and maintenance of container clusters. The terms used below are defined as follows.

Node: a constituent unit of a Kubernetes cluster; it can be a virtual machine or a physical machine.

Working cluster (master cluster): the Kubernetes cluster on which customer services normally run.

Disaster recovery cluster: under some unpredictable conditions the working cluster cannot operate normally, and another cluster must be started quickly to keep customer services running; this cluster is called the disaster recovery cluster.

Master/standby switching: there are two Kubernetes clusters, of which only one, called the master cluster, is in a working state under normal conditions; starting the standby cluster to run the services is called master/standby switching.

Pausing a container: performing a pause operation on a container so that it no longer executes service data requests; this is used to drain user request data and to facilitate other operations on the service.

Flushing to disk: saving the data generated when a user accesses an application to a storage device.

Draining: data generated by an application may reside in memory for a short time before it is saved to the storage device; waiting so that the in-memory data can be flushed to disk is called a drain operation.

Etcd is an open-source, highly available distributed key-value storage system written in Go that can be used for configuration sharing and for service registration and discovery. Etcd has the following features. Full replication: every node in the cluster has access to the complete data store. High availability: etcd can be used to avoid single points of failure caused by hardware or network problems. Consistency: every read returns the latest write across multiple hosts. Simplicity: it includes a well-defined, user-facing API (gRPC). Security: it implements automatic TLS with optional client certificate authentication. Speed: it is benchmarked at 10,000 writes per second. Reliability: it uses the Raft algorithm to provide a strongly consistent, highly available service storage directory. Etcd stores the data of the cluster, and the apiserver serves as the unified entry point: any operation on the data must go through the apiserver.

Clients (kubelet, scheduler, controller-manager) listen through list-watch for create, update and delete events on resources (pod, rs, rc, etc.) in the apiserver, and call the corresponding event handler according to the event type. List-watch consists of two parts, list and watch: list calls the resource's list API to enumerate resources and is implemented over a short-lived HTTP connection; watch calls the resource's watch API to monitor resource change events and is implemented over a long-lived HTTP connection. The Informer module of K8s encapsulates the list-watch API, so the user only needs to specify the resource and write the event handlers AddFunc, UpdateFunc and DeleteFunc. The Informer lists resources through the list API, calls the watch API to monitor resource change events, and puts the results into a FIFO (first in, first out) queue; a goroutine at the other end of the queue takes events out of the queue and calls the corresponding registered handler to process them. The Informer also maintains a read-only Map Store cache, mainly to improve query efficiency and reduce the load on the apiserver.
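The queue-and-handler flow described above can be illustrated with a minimal Go sketch that uses only the standard library. It is not the client-go Informer: the `Event`, `Handlers`, `Run` and `Replay` names are hypothetical, and resources are reduced to plain strings. It only shows a producer pushing change events into a FIFO channel while a goroutine at the other end updates a local store and dispatches to the registered callbacks.

```go
package main

import "fmt"

// EventType mirrors the three change events described in the text.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
)

// Event is a hypothetical stand-in for one watch notification.
type Event struct {
	Type EventType
	Key  string // resource name
	Obj  string // simplified resource state
}

// Handlers holds the user-supplied callbacks.
type Handlers struct {
	AddFunc    func(obj string)
	UpdateFunc func(oldObj, newObj string)
	DeleteFunc func(obj string)
}

// Run drains the FIFO queue, keeps the local store in sync,
// and dispatches each event to the matching handler.
func Run(queue <-chan Event, store map[string]string, h Handlers) {
	for ev := range queue {
		switch ev.Type {
		case Added:
			store[ev.Key] = ev.Obj
			h.AddFunc(ev.Obj)
		case Modified:
			old := store[ev.Key]
			store[ev.Key] = ev.Obj
			h.UpdateFunc(old, ev.Obj)
		case Deleted:
			delete(store, ev.Key)
			h.DeleteFunc(ev.Obj)
		}
	}
}

// Replay pushes events into the queue while a goroutine consumes them,
// then returns the handler log and the final store contents.
func Replay(events []Event) ([]string, map[string]string) {
	queue := make(chan Event)
	done := make(chan struct{})
	store := map[string]string{}
	var log []string
	go func() {
		defer close(done)
		Run(queue, store, Handlers{
			AddFunc:    func(obj string) { log = append(log, "add:"+obj) },
			UpdateFunc: func(o, n string) { log = append(log, "update:"+o+"->"+n) },
			DeleteFunc: func(obj string) { log = append(log, "delete:"+obj) },
		})
	}()
	for _, ev := range events {
		queue <- ev
	}
	close(queue)
	<-done // wait until the consumer has processed everything
	return log, store
}

func main() {
	log, store := Replay([]Event{
		{Added, "pcr-1", "pausing"},
		{Modified, "pcr-1", "completed"},
		{Deleted, "pcr-1", "completed"},
	})
	fmt.Println(log, len(store))
}
```

In a real controller the producer side would be the list and watch calls against the apiserver; here it is replaced by a fixed slice of events so the dispatch logic can be read in isolation.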

The embodiment of the invention is a method and a system for pausing running containers and checking their service data during master/standby switching in a disaster-recovery scenario. The method and the system guarantee that these operations are carried out during the switch, improve the strong consistency of applications after the switch, and solve the problem of data loss caused by user data not being flushed to disk in time. Strong consistency of an application here mainly means that the data in the master and standby clusters is consistent, so that data loss caused by master/standby switching is avoided.

Fig. 2 is an architecture diagram of an embodiment of the method for ensuring that a master cluster stops providing services according to the present invention. As shown in fig. 2, the embodiment of the present invention mainly comprises a definition monitoring device, a request processing device and a processing implementation device, which together ensure that the master cluster stops providing services during master/standby switching. The basic idea of the invention is as follows. A definition monitoring device is established, based on webhook and on CRD definition and monitoring; it defines and monitors the complete life cycle of a pause container request and, after the request succeeds, checks the service data to ensure that the data has been flushed to disk. A request processing device is established, which queries and listens for pause container requests through the listwatch mechanism and sends them to the processing implementation device. A processing implementation device is established, which encapsulates the runtime's pause container interface and provides a gRPC service for the request processing device to use. gRPC is a communication mechanism based on the HTTP/2 protocol; it supports duplex communication and is more efficient than ordinary HTTP requests. CRD: Kubernetes provides CRDs (CustomResourceDefinition) for extending resource definitions, which makes it convenient for users to extend functionality.

The definition monitoring device is mainly used to define and monitor the pause container request received after a user initiates a master/standby switch. It finds the containers running on the master cluster, compiles the list of containers that need to be paused, initiates the pause container request for the master cluster, starts a corresponding service to monitor the execution of the request, and, once it observes that the pause container service has executed successfully, detects whether the container data has been completely flushed to disk. The request processing device runs on every node and receives the pause container requests sent by the definition monitoring device. The processing implementation device mainly encapsulates the pause container interface of the specific runtime for the request processing device to use, so that the pause container request is actually carried out.

A pause container request is declared and defined, and a pause container request for the master cluster is initiated in response to receiving the master/standby switching pause container request. The declaration and definition of the pause container request are made in the definition monitoring device.

The declaration of the pause container request is as follows:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: pausecontainerrequest.xxx.xxx.xxx.com
  annotations:
    "xxx": "xxxxx"
spec:
  group: xxx.xxx.xxx.com
  scope: Namespaced
  names:
    plural: pausecontainerrequests
    singular: pauseContainerrequest
    kind: PauseContainerRequest
    shortNames:
    - pcr
  version: v1
  versions:
  - name: v1
    served: true
    storage: true
  validation:
    # openAPIV3Schema is the schema for validating custom objects.
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          properties:
            podName:
              type: string
            containers:
              type: array
        status:
          properties:
            phase:
              type: string
            message:
              type: string
            completionTime:
              type: string

The definition file for the pause container request is as follows:

{
  "apiVersion": "xxx.com/v1",
  "kind": "PauseContainerRequest",
  "name": "pausecontainer-example",
  "annotations": {
    "drainTime": "60",
    "detectConfigInfo": "xxxxx"
  },
  "spec": {
    "podName": "dp-pausecontainer-xxx",
    "containers": ["container-0", "container-1"]
  },
  "status": {
    "phase": "pausing",
    "completionTime": "2022-09-26 09:59:00",
    "message": ""
  }
}
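As an illustration of how such a definition file might be consumed, the following Go sketch decodes the request and reads the drain time from its annotations. The `PauseRequest` type and the `ParsePauseRequest` function are hypothetical names, and treating `drainTime` as a number of seconds is an assumption: the source does not state the unit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// PauseRequest mirrors the fields of the example definition file.
type PauseRequest struct {
	APIVersion  string            `json:"apiVersion"`
	Kind        string            `json:"kind"`
	Name        string            `json:"name"`
	Annotations map[string]string `json:"annotations"`
	Spec        struct {
		PodName    string   `json:"podName"`
		Containers []string `json:"containers"`
	} `json:"spec"`
	Status struct {
		Phase          string `json:"phase"`
		CompletionTime string `json:"completionTime"`
		Message        string `json:"message"`
	} `json:"status"`
}

// ParsePauseRequest decodes the request and extracts the drain time.
// Assumption: the drainTime annotation is expressed in seconds.
func ParsePauseRequest(data []byte) (PauseRequest, time.Duration, error) {
	var req PauseRequest
	if err := json.Unmarshal(data, &req); err != nil {
		return req, 0, err
	}
	secs, err := strconv.Atoi(req.Annotations["drainTime"])
	if err != nil {
		return req, 0, fmt.Errorf("invalid drainTime annotation: %w", err)
	}
	return req, time.Duration(secs) * time.Second, nil
}

// sample reproduces the example request with placeholder values.
const sample = `{
  "apiVersion": "xxx.com/v1",
  "kind": "PauseContainerRequest",
  "name": "pausecontainer-example",
  "annotations": {"drainTime": "60", "detectConfigInfo": "xxxxx"},
  "spec": {"podName": "dp-pausecontainer-xxx", "containers": ["container-0", "container-1"]},
  "status": {"phase": "pausing", "completionTime": "", "message": ""}
}`

func main() {
	req, drain, err := ParsePauseRequest([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Spec.PodName, req.Spec.Containers, drain)
}
```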

In response to receiving the pause container request for the master cluster, the target container is paused according to that request.

In some embodiments, initiating the pause container request for the master cluster comprises: starting a custom resource controller to listen for pause container request resources. A crd-controller is started in the definition monitoring device to listen for pause container request resources. The crd-controller is the framework, within the extension mechanism provided by Kubernetes, for listening to custom resources.

In some embodiments, initiating the pause container request for the master cluster comprises: listening for add, modify and delete events of the pause container request resources, and processing the corresponding business logic according to those events. At the same time, a webhook client program is injected into the crd-controller; it is used to detect whether the container's service data has been completely flushed to disk when the pause container request reaches the completed state.

In some embodiments, initiating the pause container request for the master cluster comprises: finding the containers running on the master cluster and compiling the list of containers that need to be paused.
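Compiling the list of containers to pause can be pictured with the short Go sketch below. The `Pod`, `Container` and `PauseSpec` types are simplified, hypothetical stand-ins rather than Kubernetes API types, and selecting containers by a `Running` flag is an assumption about how "running on the master cluster" would be determined.

```go
package main

import (
	"fmt"
	"sort"
)

// Container and Pod are simplified stand-ins for cluster objects.
type Container struct {
	Name    string
	Running bool
}

type Pod struct {
	Name       string
	Containers []Container
}

// PauseSpec matches the spec section of a pause container request.
type PauseSpec struct {
	PodName    string
	Containers []string
}

// BuildPauseList keeps only running containers and groups them per pod.
// Pods without any running container are skipped.
func BuildPauseList(pods []Pod) []PauseSpec {
	var specs []PauseSpec
	for _, p := range pods {
		var names []string
		for _, c := range p.Containers {
			if c.Running {
				names = append(names, c.Name)
			}
		}
		if len(names) == 0 {
			continue
		}
		sort.Strings(names)
		specs = append(specs, PauseSpec{PodName: p.Name, Containers: names})
	}
	// Sort by pod name so the resulting list is deterministic.
	sort.Slice(specs, func(i, j int) bool { return specs[i].PodName < specs[j].PodName })
	return specs
}

func main() {
	pods := []Pod{
		{Name: "dp-b", Containers: []Container{{"container-0", true}, {"init", false}}},
		{Name: "dp-a", Containers: []Container{{"container-1", true}, {"container-0", true}}},
		{Name: "dp-c", Containers: []Container{{"done", false}}},
	}
	for _, s := range BuildPauseList(pods) {
		fmt.Println(s.PodName, s.Containers)
	}
}
```

Each resulting `PauseSpec` would then become the `spec` of one pause container request.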

In some embodiments, pausing the target container according to the pause container request for the master cluster comprises: querying and listening for the pause container request for the master cluster through a ListWatch mechanism. The request processing device runs as a daemon service, and queries and listens for pause container requests through the listWatch mechanism.

In some embodiments, pausing the target container according to the pause container request for the master cluster comprises: obtaining the pause container request for the master cluster through List, and executing it through Watch.

In some embodiments, the method further comprises: defining protobuf transport protocol data to provide a gRPC service for pause container requests. By defining the protobuf transport protocol data, the processing implementation device exposes the pause container request gRPC service for the request processing device to use. Protobuf is the protocol that the gRPC service uses for data transmission; it defines the format of the data to be transmitted.

The protobuf transport protocol data used by the pause container request GRPC service is defined as follows:

type PauseContainerRequest struct {
	// ID of the container to stop.
	ContainerId string `protobuf:"bytes,1,opt,name=container_id,json=containerId,proto3" json:"container_id,omitempty"`
	// Timeout in seconds to wait for the container to stop before forcibly
	// terminating it. Default: 0 (forcibly terminate the container immediately)
	Timeout int64 `protobuf:"varint,2,opt,name=timeout,proto3" json:"timeout,omitempty"`
}
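To show how the processing implementation device might wrap a runtime's pause interface behind a service method, here is a hedged Go sketch without any gRPC dependency. The `Runtime` interface, `PauseService`, `PauseByID` and `fakeRuntime` are all hypothetical names, not part of any real container runtime API. Applying `Timeout` as a deadline on the call, with 0 meaning no deadline, is an assumption made for this sketch and differs from the default described in the struct comments.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// PauseContainerRequest repeats the two message fields without protobuf tags.
type PauseContainerRequest struct {
	ContainerId string
	Timeout     int64 // seconds; assumption: 0 means no deadline in this sketch
}

// Runtime is a hypothetical abstraction over the container runtime.
type Runtime interface {
	Pause(ctx context.Context, containerID string) error
}

// PauseService wraps the runtime so callers only see request messages.
type PauseService struct{ rt Runtime }

// PauseContainer validates the request and forwards it to the runtime,
// bounding the call by the request timeout when one is given.
func (s PauseService) PauseContainer(ctx context.Context, req PauseContainerRequest) error {
	if req.ContainerId == "" {
		return errors.New("container id is required")
	}
	if req.Timeout > 0 {
		var cancel context.CancelFunc
		ctx, cancel = context.WithTimeout(ctx, time.Duration(req.Timeout)*time.Second)
		defer cancel()
	}
	return s.rt.Pause(ctx, req.ContainerId)
}

// PauseByID is a convenience wrapper using a background context.
func PauseByID(rt Runtime, id string, timeoutSeconds int64) error {
	svc := PauseService{rt: rt}
	return svc.PauseContainer(context.Background(),
		PauseContainerRequest{ContainerId: id, Timeout: timeoutSeconds})
}

// fakeRuntime records paused containers in memory for demonstration.
type fakeRuntime struct{ paused map[string]bool }

func (f *fakeRuntime) Pause(ctx context.Context, id string) error {
	if err := ctx.Err(); err != nil {
		return err // deadline already exceeded or call cancelled
	}
	f.paused[id] = true
	return nil
}

func main() {
	rt := &fakeRuntime{paused: map[string]bool{}}
	err := PauseByID(rt, "container-0", 5)
	fmt.Println(err, rt.paused["container-0"])
}
```

In an actual deployment `PauseContainer` would be registered as a gRPC handler and `Runtime` would be backed by the node's container runtime; both parts are omitted here.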

The pause container resources are placed into a cache, and a pause container resource list is obtained.

In response to the pause container request for the master cluster reaching a completed state, it is detected whether the container's service data has been completely flushed to disk. After being notified that the container pause operation has completed, the definition monitoring device obtains the drain time and the detection configuration information from the annotations of the pause container request, and detects, according to the detection configuration information, whether new data is still being added to the corresponding DB (database).

In some embodiments, detecting whether the container's service data has been completely flushed to disk comprises: obtaining the drain time and the detection configuration information from the annotations of the pause container request, and detecting, according to the detection configuration information, whether new data is still being added to the corresponding database.

In some embodiments, detecting, according to the detection configuration information, whether new data is still being added to the corresponding database comprises: determining whether the corresponding database data changes within a preset time, and, in response to the data not changing within the preset time, determining that no new data is being added to the corresponding database.
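The "no change within a preset time" check can be sketched in Go as a polling loop. This is only one possible reading of the detection step: the `WaitUntilQuiet` function, the polling interval, and the use of a row count as the change signal are assumptions, and the `count` callback stands in for whatever query the detection configuration information would describe.

```go
package main

import (
	"fmt"
	"time"
)

// WaitUntilQuiet polls count every interval. It returns true once the
// value has stayed unchanged for at least quiet, and false if deadline
// elapses first.
func WaitUntilQuiet(count func() int64, interval, quiet, deadline time.Duration) bool {
	start := time.Now()
	last := count()
	lastChange := start
	for time.Since(start) < deadline {
		time.Sleep(interval)
		cur := count()
		if cur != last {
			// Data is still arriving: restart the quiet window.
			last, lastChange = cur, time.Now()
			continue
		}
		if time.Since(lastChange) >= quiet {
			return true
		}
	}
	return false
}

func main() {
	// Simulated table that receives three more rows and then stops growing.
	var rows int64 = 100
	calls := 0
	count := func() int64 {
		calls++
		if calls <= 3 {
			rows++
		}
		return rows
	}
	ok := WaitUntilQuiet(count, 10*time.Millisecond, 50*time.Millisecond, time.Second)
	fmt.Println("data settled:", ok)
}
```

Here `deadline` would plausibly be derived from the drain time annotation and `quiet` from the preset time, though the source does not specify how the two values relate.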

Fig. 3 is an operation flowchart of an embodiment of the method for ensuring that a master cluster stops providing services according to the present invention. As shown in fig. 3, the master/standby switching pause container request is sent to the ApiServer, the ApiServer queries the pause container resources from the definition monitoring device, and the definition monitoring device starts a service and performs the corresponding operation according to each pause container resource event. The definition monitoring device returns the added, updated and deleted pause container resources to the ApiServer. The request processing device lists the pause container resources through List, watches them through Watch and processes the corresponding business logic, places the pause container resources into the cache, and obtains the pause container resource list from the cache. The request processing device then iterates over the pause container resource list in a loop, calling the pause container service interface until the pause has completed, and checks whether the drain operation has finished. Cache: to improve data access efficiency, an application temporarily places data in memory; this in-memory data is called the cache.
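The cache-and-traverse part of this flow can be sketched in Go as follows. The `Cache`, `PauseResource` and `ProcessAll` names are hypothetical, the `pause` callback stands in for the pause container service interface, and the phase values "completed" and "failed" are assumptions: the source only shows "pausing".

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// ErrPauseFailed is returned by the demo pause function.
var ErrPauseFailed = errors.New("pause failed")

// PauseResource is a simplified pause container resource.
type PauseResource struct {
	PodName    string
	Containers []string
	Phase      string
}

// Cache is a small thread-safe store keyed by pod name.
type Cache struct {
	mu    sync.RWMutex
	items map[string]PauseResource
}

func NewCache() *Cache { return &Cache{items: map[string]PauseResource{}} }

// Put inserts or replaces a resource.
func (c *Cache) Put(r PauseResource) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[r.PodName] = r
}

// List returns a snapshot of all resources sorted by pod name.
func (c *Cache) List() []PauseResource {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]PauseResource, 0, len(c.items))
	for _, r := range c.items {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].PodName < out[j].PodName })
	return out
}

// ProcessAll walks the cached list, calls pause for every container,
// writes the resulting phase back, and returns how many resources completed.
func ProcessAll(c *Cache, pause func(pod, container string) error) int {
	done := 0
	for _, r := range c.List() {
		r.Phase = "completed"
		for _, name := range r.Containers {
			if err := pause(r.PodName, name); err != nil {
				r.Phase = "failed"
				break
			}
		}
		c.Put(r)
		if r.Phase == "completed" {
			done++
		}
	}
	return done
}

func main() {
	c := NewCache()
	c.Put(PauseResource{PodName: "dp-a", Containers: []string{"container-0", "container-1"}, Phase: "pausing"})
	c.Put(PauseResource{PodName: "dp-b", Containers: []string{"broken"}, Phase: "pausing"})
	n := ProcessAll(c, func(pod, container string) error {
		if container == "broken" {
			return ErrPauseFailed
		}
		return nil
	})
	fmt.Println(n, c.List())
}
```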

By means of Kubernetes's powerful CRD (CustomResourceDefinition) extension mechanism, the webhook mechanism and custom resources, query and monitoring operations can be performed based on the listWatch mechanism. A suitable pause container request custom resource is designed, together with the protobuf transport protocol data used by the pause container request gRPC service. When master/standby switching is performed in a disaster-recovery scenario, the container pause and service data detection operations can therefore be executed conveniently and efficiently, which improves the strong consistency of applications after the switch and solves the problem of data loss caused by user data not being flushed to disk in time.

The embodiment of the invention pauses the containers of the master cluster and checks their service data, so that each container can complete its drain operation and user data is flushed to the storage device. This guarantees that running containers are paused and their service data is checked during master/standby switching in a disaster-recovery scenario, improves the strong consistency of applications after the switch, and solves the problem of data loss caused by user data not being flushed to disk in time.

It should be particularly noted that the steps in the embodiments of the method for ensuring that a master cluster stops providing services may be interleaved, replaced, added or deleted. Such reasonable permutations and combinations also fall within the scope of protection of the present invention, and the scope of protection should not be limited to the embodiments.

In view of the above object, a second aspect of the embodiments of the present invention provides a system for ensuring that a master cluster stops providing services. As shown in fig. 4, the system 200 includes the following modules: a definition module configured to initiate, in a disaster-recovery master/standby switching scenario, a pause container request for the master cluster in response to receiving a master/standby switching pause container request; a pause module configured to pause a target container according to the pause container request for the master cluster in response to receiving that request; a cache module configured to place the pause container resources into a cache and obtain a pause container resource list; and a detection module configured to detect, in response to the pause container request for the master cluster reaching a completed state, whether the container's service data has been completely flushed to disk.

In some embodiments, the definition module is configured to: and starting the custom resource controller to monitor the resource requested by the pause container.

In some embodiments, the definition module is configured to: and responding to the events of newly adding, modifying and deleting the resources when the container suspension request is answered, and processing corresponding service logic according to the events.

In some embodiments, the definition module is configured to: find containers running on the primary cluster and organize the list of containers that need to be paused.

In some embodiments, the definition module is configured to: the start listening service listens for execution of the suspended container.

In some embodiments, the suspension module is configured to: and querying and listening the pause container request of the main cluster through a ListWatch mechanism.

In some embodiments, the suspend module is configured to: and acquiring the pause container request of the main cluster through the List and executing the pause container request of the main cluster through the Watch.

In some embodiments, the suspend module is configured to: the container resource is suspended by Watch and the GRPC interface service is invoked to handle different business logic.

In some embodiments, the system further comprises a second definition module configured to: the probbuffer transport protocol data is defined to provide GRPC service for suspending container requests.

In some embodiments, the detection module is configured to: and circularly traversing the suspended container resource list and calling a suspended container service interface.

In some embodiments, the detection module is configured to: and acquiring the drainage time and the detection configuration information from the pause container request annotation, and detecting whether the corresponding database data is not increased any more according to the detection configuration information.

In some embodiments, the detection module is configured to: determine whether the corresponding database data changes within a preset time, and, in response to the corresponding database data not changing within the preset time, determine that new data is no longer being added to the corresponding database.
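As a non-limiting illustration, the "no change within a preset time" check can be sketched as a polling loop. The function name, the `read_marker` callback (for example a row count or a maximum primary key), and the overall timeout are assumptions of this sketch; the clock and sleep functions are injectable only so the logic can be exercised without waiting.

```python
import time
from typing import Callable


def data_is_quiescent(
    read_marker: Callable[[], int],
    quiet_seconds: float,
    timeout_seconds: float,
    poll_seconds: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once read_marker() has not changed for quiet_seconds.

    Return False if that does not happen before timeout_seconds elapse.
    """
    deadline = clock() + timeout_seconds
    last = read_marker()
    quiet_since = clock()
    while True:
        now = clock()
        if now - quiet_since >= quiet_seconds:
            return True   # unchanged for the whole preset time
        if now >= deadline:
            return False  # data kept changing, give up
        sleep(poll_seconds)
        current = read_marker()
        if current != last:
            # New data arrived: restart the quiet window.
            last = current
            quiet_since = clock()
```

In use, `read_marker` would query the database named in the detection configuration, and `quiet_seconds` would be the preset time.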

In view of the above object, a third aspect of the embodiments of the present invention provides a computer device, including: at least one processor; and a memory storing computer instructions executable on the processor, the instructions, when executed by the processor, performing the following steps: S1, in a disaster-tolerant master-slave switching scenario, in response to receiving a master-slave switching pause container request, initiating a pause container request of the master cluster; S2, in response to receiving the pause container request of the master cluster, pausing a target container according to the pause container request of the master cluster; S3, putting the pause container resources into a cache and acquiring a pause container resource list; and S4, in response to the pause container request of the master cluster reaching a completion state, detecting whether the container service data has been completely persisted to disk.
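As a non-limiting illustration, steps S1 to S4 can be strung together as follows. The function `run_switchover`, the `PauseRequest` fields, and the two callbacks are hypothetical placeholders for the pause and detection services described in the embodiments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PauseRequest:
    """Hypothetical pause container request of the master cluster."""
    name: str
    containers: List[str]
    phase: str = "Pending"


def run_switchover(
    running_containers: List[str],
    pause_container: Callable[[str], None],
    data_persisted: Callable[[str], bool],
) -> Dict[str, bool]:
    # S1: a switchover request arrives, so create the master-cluster request.
    request = PauseRequest("switchover", list(running_containers))

    # S2: pause every target container named in the request.
    for container in request.containers:
        pause_container(container)
    request.phase = "Completed"

    # S3: cache the paused resources and read back the list.
    cache = {container: "Paused" for container in request.containers}
    paused = list(cache)

    # S4: once the request is complete, check each container's data.
    results: Dict[str, bool] = {}
    if request.phase == "Completed":
        results = {container: data_persisted(container) for container in paused}
    return results
```

The returned mapping shows, per container, whether its service data was found to be persisted; the switch would proceed only when every value is true.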

In some embodiments, the initiating a pause container request of the master cluster comprises: responding to add, modify, and delete events of the pause container request resource, and processing the corresponding business logic according to each event.

In some embodiments, the steps further comprise: defining Protobuf transport protocol data to provide a gRPC service for pause container requests.

In some embodiments, the detecting whether the container service data has been completely persisted to disk comprises: looping through the pause container resource list and calling the pause container service interface.

In some embodiments, the detecting whether the container service data has been completely persisted to disk comprises: obtaining the drain time and the detection configuration information from the annotations of the pause container request, and detecting, according to the detection configuration information, whether new data is no longer being added to the corresponding database.

Fig. 5 is a schematic diagram of the hardware structure of an embodiment of the computer device for ensuring that the master cluster stops providing services according to the present invention.

Taking the device shown in Fig. 5 as an example, the device includes a processor 301 and a memory 302.

The processor 301 and the memory 302 may be connected by a bus or by other means; Fig. 5 shows a bus connection as an example.

The memory 302, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for ensuring that a master cluster stops providing services in the embodiments of the present application. The processor 301 executes the various functional applications and data processing of the server by running the non-volatile software programs, instructions, and modules stored in the memory 302, that is, it implements the method for ensuring that the master cluster stops providing services.

The memory 302 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the method for ensuring that the master cluster stops providing services, and the like. Further, the memory 302 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 302 optionally includes memory located remotely from the processor 301, and such remote memory may be connected to the local module via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

One or more computer instructions 303 corresponding to the method for ensuring that the master cluster stops providing services are stored in the memory 302 and, when executed by the processor 301, perform the method for ensuring that the master cluster stops providing services in any of the above-described method embodiments.

Any embodiment of the computer device that executes the method for ensuring that the master cluster stops providing services can achieve the same or similar effects as the corresponding method embodiment described above.

The embodiments of the present invention can pause the containers of the master cluster and detect their service data, so that the corresponding containers have sufficient time to complete their drain operations and user data is persisted to the storage device. This ensures that running containers are paused and their service data is checked during master-slave switching in a disaster-tolerant scenario, improves the strong consistency of applications after the switch, and solves the problem of data loss caused by user data not being persisted to disk in time.

The present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, performs the method for ensuring that a master cluster stops providing services.

Fig. 6 is a schematic diagram of an embodiment of the computer storage medium for ensuring that the master cluster stops providing services according to the present invention. Taking the computer storage medium shown in Fig. 6 as an example, the computer-readable storage medium 401 stores a computer program 402 which, when executed by a processor, performs the method described above.

Finally, it should be noted that those skilled in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The program of the method for ensuring that the master cluster stops providing services may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM). The embodiments of the computer program can achieve the same or similar effects as any corresponding method embodiment described above.

The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.

It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that "and/or" as used herein includes any and all possible combinations of one or more of the associated listed items.

The serial numbers of the embodiments disclosed in the present invention are for description only and do not represent the relative merits of the embodiments.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features in the above embodiments or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements and the like made within the spirit and principles of the embodiments of the present invention are intended to be included within the scope of protection of the embodiments of the present invention.

Claims

1. A method for ensuring that a master cluster stops providing services, comprising the following steps:

in a disaster-tolerant master-slave switching scenario, in response to receiving a master-slave switching pause container request, initiating a pause container request of the master cluster;

in response to receiving the pause container request of the master cluster, pausing a target container according to the pause container request of the master cluster;

putting the pause container resources into a cache, and acquiring a pause container resource list;

in response to the pause container request of the master cluster reaching a completion state, detecting whether the container service data has been completely persisted to disk.

2. The method of claim 1, wherein the initiating a pause container request for the master cluster comprises:

starting a custom resource controller to monitor the pause container request resource.

3. The method of claim 2, wherein the initiating a pause container request for the master cluster comprises:

responding to add, modify, and delete events of the pause container request resource, and processing the corresponding business logic according to each event.

4. The method of claim 1, wherein the initiating a pause container request for the master cluster comprises:

finding the containers running on the master cluster and compiling the list of containers that need to be paused.

5. The method of claim 4, wherein the initiating a pause container request for the master cluster comprises:

starting a listening service to monitor the execution of the container pause.

6. The method of claim 1, wherein the pausing a target container according to the pause container request of the master cluster comprises:

querying and listening for the pause container request of the master cluster through a ListWatch mechanism.

7. The method of claim 6, wherein the pausing a target container according to the pause container request of the master cluster comprises:

obtaining the pause container request of the master cluster through List, and executing the pause container request of the master cluster through Watch.

8. The method of claim 6, wherein the pausing a target container according to the pause container request of the master cluster comprises:

pausing the container resource through Watch and calling the gRPC interface service to handle the different business logic.

9. The method of claim 1, further comprising:

defining Protobuf transport protocol data to provide a gRPC service for pause container requests.

10. The method of claim 1, wherein the detecting whether the container service data has been completely persisted to disk comprises:

looping through the pause container resource list and calling the pause container service interface.

11. The method of claim 10, wherein the detecting whether the container service data has been completely persisted to disk comprises:

obtaining the drain time and the detection configuration information from the annotations of the pause container request, and detecting, according to the detection configuration information, whether new data is no longer being added to the corresponding database.

12. The method of claim 11, wherein the detecting, according to the detection configuration information, whether new data is no longer being added to the corresponding database comprises:

determining whether the corresponding database data changes within a preset time, and, in response to the corresponding database data not changing within the preset time, determining that new data is no longer being added to the corresponding database.

13. A system for ensuring that a master cluster stops providing services, comprising:

a definition module configured to, in a disaster-tolerant master-slave switching scenario, initiate a pause container request of the master cluster in response to receiving a master-slave switching pause container request;

a pause module configured to, in response to receiving the pause container request of the master cluster, pause a target container according to the pause container request of the master cluster;

a cache module configured to put the pause container resources into a cache and acquire a pause container resource list; and

a detection module configured to, in response to the pause container request of the master cluster reaching a completion state, detect whether the container service data has been completely persisted to disk.

14. A computer device, comprising:

at least one processor; and

a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of the method of any one of claims 1 to 12.

15. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 12.