CN114510464A

CN114510464A - Management method and management system of high-availability database

Info

Publication number: CN114510464A
Application number: CN202210146171.XA
Authority: CN
Inventors: 杨世利; 陈雪; 宋阳; 刘娟; 洪晓霞; 熊炜; 宋鹏; 裴劼; 王仁菊; 杨颖�; 李佳; 江欣祝; 吴云松; 何健
Original assignee: Individual
Current assignee: Individual
Priority date: 2022-02-17
Filing date: 2022-02-17
Publication date: 2022-05-17

Abstract

The invention discloses a management method and a management system for a high-availability database, belonging to the technical field of database management. The management method comprises the following steps: creating or modifying description information; creating, modifying or deleting custom resources according to the description information; monitoring the custom resources and generating configuration resources; and managing the database cluster according to the configuration resources. Because the configuration information and the deployment information of the database are described uniformly through the description information, the custom resources can be managed and deployed in a unified way, the database is managed uniformly through the custom resources, and misoperation caused by configuring a large number of internal resources is prevented.

Description

Management method and management system of high-availability database

Technical Field

The invention relates to the technical field of database management, in particular to a management method and a management system of a high-availability database.

Background

Custom resources are an extension of the Kubernetes API. They can be registered dynamically in a running cluster and can appear or disappear at runtime, and a cluster administrator can update custom resources independently of the cluster itself. Once a custom resource is installed, users can create and access its objects with kubectl, just as they do for built-in resources such as Pods.

Kubernetes is currently a mainstream cloud-native container platform and supports running workloads such as general stateless applications and stateful applications. A database such as MySQL can run as a stateful application. An Operator is a software extension to Kubernetes with which repetitive tasks can be automated; the Operator pattern encapsulates the task automation code that has been written. With the Operator pattern, one or more custom resources can be managed through a custom controller without modifying the code of Kubernetes itself, thereby extending the functionality of the cluster.

As service volume grows, changes to numerous internal resources become more and more frequent. By adopting custom resources, the configuration attributes required by the user can be centralized in a single resource for unified management, which avoids database crashes caused by erroneous changes.

Disclosure of Invention

In view of the above technical problems in the prior art, the present invention provides a management method and a management system for a high-availability database, which uniformly describe the deployment or management of the database through description information, manage the database according to the description information, simplify the management of the database, and prevent erroneous operation.

The invention discloses a management method for a high-availability database, which comprises the following steps: creating or modifying description information; creating, modifying or deleting custom resources according to the description information; monitoring the custom resources and generating configuration resources; and managing the database cluster according to the configuration resources.

Preferably, the description information includes any one or a combination of the following information:

database configuration information, an instance creation policy, an upgrade policy, a master-slave policy, and a replication policy.

Preferably, the method for database failure recovery includes:

monitoring the instances of the database through an Operator;

judging whether the master instance has a fault;

if yes, selecting a normal instance as the master instance according to the master selection policy;

and modifying the replication policy to point data replication to the master instance.

Preferably, the method of selecting the master instance comprises:

accessing the surviving instances and obtaining the delay state of each instance;

selecting, from the surviving instances, the instance with the smallest delay as the master instance.

Preferably, the replication policy includes the connection address, user name, password, and replication mode of the master instance;

the read-only policy of the master instance is set to off, and the read-only policy of the slave instances is set to on;

the method for reconnecting a faulty instance comprises the following steps:

detecting a faulty instance;

judging whether the faulty instance has recovered;

if yes, setting the recovered instance as a slave instance and synchronizing the replication policy.

Preferably, the configuration resource includes any one or a combination of the following resources:

configuration files, stateful applications, resource objects of storage volumes, and service discovery objects;

the method for instance creation comprises the following steps:

creating a stateful application according to the creation strategy;

creating container resources according to the creation event of the stateful application;

and configuring the container resources according to the configuration file to obtain a database instance.

Preferably, the upgrading method comprises:

creating an upgrade policy, wherein upgrade description information of the upgrade policy comprises: updating database configuration, updating container cluster deployment configuration and updating database version;

generating or modifying custom resources according to the upgrading description information;

monitoring the custom resource through an Operator, and updating the configuration file;

and upgrading the database, the container or the cluster according to the configuration file.

Preferably, the method for instance uninstallation comprises:

deleting the custom resource corresponding to the instance;

monitoring the deletion event of the custom resource through an Operator, and deleting the corresponding configuration resources;

and deleting the container group of the instance according to the deletion event of the configuration resources.

The invention also provides a management system for implementing the management method, which comprises:

a description information management module, a custom resource management module, a monitoring module and an execution module;

the description information management module is used for creating or modifying description information;

the custom resource management module is used for creating, modifying or deleting custom resources according to the description information;

the monitoring module is used for monitoring the custom resources and generating configuration resources;

the execution module is used for managing the database cluster according to the configuration resource.

Preferably, the monitoring module is further configured to:

monitoring the instances of the database;

judging whether the master instance has a fault;

Compared with the prior art, the invention has the beneficial effects that: the configuration information and the deployment information of the database are uniformly described through the description information, so that the unified management and the deployment of the custom resources are facilitated, the database is uniformly managed through the custom resources, and the misoperation caused by a large number of APIs (application programming interfaces) is prevented.

Drawings

FIG. 1 is a flow chart of a method for managing a highly available database according to the present invention;

FIG. 2 is a logical block diagram of the management system of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.

The invention is described in further detail below with reference to the attached drawing figures:

A management method for a high-availability database, as shown in FIG. 1, comprises:

Step 101: create or modify the description information. The description information can be saved in a file, such as a YAML or JSON file. It describes the attributes of database management, for example configuration information internal to the database (root password, sizes of the various buffers, the mechanism for flushing data to disk, and the like) and configuration information of the database within the cluster (number of instances, hardware resource quotas, network policies, scheduling policies, and the like).

Step 102: create, modify or delete the custom resource according to the description information.

Step 103: monitor the custom resource and generate configuration resources.

The configuration resources include any one or a combination of the following: configuration files (ConfigMap), stateful applications (StatefulSet), resource objects of storage volumes (PVC), and service discovery objects (Service). In one embodiment, an Operator is used to listen for the custom resources.

Step 104: manage the containerized database according to the configuration resources. The database may be a MySQL database, but is not limited thereto. The containerized database may have multiple instances, which make up a database cluster.

The configuration information and the deployment information of the database are uniformly described through the description information, so that uniform management and deployment of custom resources are facilitated, the database is uniformly managed through the custom resources, misoperation caused by a large number of APIs is prevented, and high availability of the database is improved.
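Step 103 can be illustrated with a minimal sketch that derives built-in manifests from one custom resource. This is not the patented implementation: the spec fields (`replicas`, `image`, `storage`, `my_cnf`), the resource names and the default values are all assumptions made for illustration. The storage volume claim is expressed here through the StatefulSet's `volumeClaimTemplates` rather than as a separate manifest.

```python
def generate_configuration_resources(custom_resource: dict) -> list:
    """Derive built-in Kubernetes manifests from one custom resource (illustrative)."""
    name = custom_resource["metadata"]["name"]
    spec = custom_resource.get("spec", {})  # field names below are assumed
    labels = {"app": name}

    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{name}-config"},
        "data": {"my.cnf": spec.get("my_cnf", "[mysqld]\n")},
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # Headless Service giving each Pod a stable DNS name.
        "spec": {"clusterIP": "None", "selector": labels,
                 "ports": [{"name": "mysql", "port": 3306}]},
    }
    stateful_set = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": spec.get("replicas", 1),
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "mysql",
                        "image": spec.get("image", "mysql:8.0"),
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/mysql"},
                            {"name": "config", "mountPath": "/etc/mysql/conf.d"},
                        ],
                    }],
                    "volumes": [{"name": "config",
                                 "configMap": {"name": f"{name}-config"}}],
                },
            },
            # The StatefulSet controller creates one PVC per Pod from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {"accessModes": ["ReadWriteOnce"],
                         "resources": {"requests": {"storage": spec.get("storage", "10Gi")}}},
            }],
        },
    }
    return [config_map, service, stateful_set]


manifests = generate_configuration_resources(
    {"metadata": {"name": "demo-db"}, "spec": {"replicas": 3, "storage": "20Gi"}}
)
print([m["kind"] for m in manifests])
```

Keeping the function pure (custom resource in, manifests out) makes the generation step easy to test separately from whatever client submits the manifests to the cluster.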

In step 101, the description information includes any one or a combination of the following information:

database configuration information, an instance creation policy, an upgrade policy, a master-slave policy, and a replication policy. The custom resource corresponds to the description information.
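As a rough illustration of how the description information might map onto a custom resource, the sketch below models the five kinds of information as a Python dataclass and wraps them in a resource object. The class name, the field names, the API group `example.com/v1` and the kind `MysqlCluster` are hypothetical; the source does not specify a schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DescriptionInfo:
    """Hypothetical container for the five kinds of description information."""
    name: str
    database_config: dict = field(default_factory=dict)      # e.g. buffer sizes
    creation_policy: dict = field(default_factory=dict)      # e.g. instance count
    upgrade_policy: dict = field(default_factory=dict)
    master_slave_policy: dict = field(default_factory=dict)
    replication_policy: dict = field(default_factory=dict)


def to_custom_resource(info: DescriptionInfo) -> dict:
    """Wrap the description information in a custom resource object."""
    spec = asdict(info)
    name = spec.pop("name")
    return {
        "apiVersion": "example.com/v1",  # assumed API group and version
        "kind": "MysqlCluster",          # assumed kind
        "metadata": {"name": name},
        "spec": spec,
    }


info = DescriptionInfo(
    name="demo-db",
    database_config={"innodb_buffer_pool_size": "1G"},
    creation_policy={"replicas": 3},
    replication_policy={"mode": "async"},
)
resource = to_custom_resource(info)
print(json.dumps(resource, indent=2))
```

Because the whole description lives in one object, a change to any policy is a change to a single resource, which is the centralization the method relies on.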

Database management includes database failure recovery, database upgrade, instance creation, and instance uninstallation.

Example 1

The method for database failure recovery comprises the following steps:

Step 201: monitor the instances of the database through the Operator. The Operator can use a timed task to periodically attempt to connect to each database instance and execute a corresponding SQL statement as a liveness check.

Step 202: determine whether the master instance has failed. The Operator program periodically probes each MySQL instance, that is, it connects to the MySQL instances distributed across different nodes using configured information such as the MySQL connection address, user name, password and port. If the connection times out, the instance has failed. After a successful connection, the command "SELECT 1" is executed; if a normal result cannot be returned, the instance has failed, otherwise it is a normal instance. Normal and abnormal instances are marked accordingly.

If yes, go to step 203: select a normal instance as the master instance according to the master selection policy.

The method for selecting the master instance comprises the following steps. Step 211: access the surviving instances and obtain the delay state of each instance. Step 212: select, from the surviving instances, the instance with the smallest delay as the master instance.

If not, go to step 206: continue to monitor the instances, either continuously or periodically.

Step 205: modify the replication policy, the load balancing policy and the like so that data replication on the slave instances points to the master instance; set the read-only policy of the master instance to off and the read-only policy of the slave instances to on.

The replication policy comprises the connection address, user name, password and replication mode of the master instance. Specifically, the MySQL instance with sequence number 0 is taken as the master library by default, the other database instances are taken as slave libraries, and data replication is configured from the master library to the slave libraries. Finally, once replication is started, each slave library pulls the Binlog that the master library retains for the data written to it and replays that Binlog locally, which accomplishes data replication and ensures data consistency.

Step 206: reconnect the faulty instance. The method for reconnecting a faulty instance comprises the following steps: detect the faulty instance; judge whether it has recovered; if yes, set the recovered instance as a slave instance and synchronize the replication policy. The Operator can automatically repair specific fault scenarios, so recovery completes in a short time without the involvement of operation and maintenance personnel.

The Operator program records the metadata of the master-slave relationship and updates that record whenever it performs an operation. At the same time, two service discovery objects with load balancing capability are created as the entry points for writing data and reading data respectively. Clients access the domain names of these two services to achieve read-write separation.
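One way the two entry points might be expressed, shown only as a sketch: two Service manifests whose selectors differ by a role label, plus a helper that recomputes the labels after a switchover. The label keys `app` and `role`, the `-write` and `-read` name suffixes, and the function names are assumptions, not taken from the source.

```python
def build_read_write_services(cluster: str) -> dict:
    """Build one Service for writes (master) and one for reads (slaves)."""
    def service(suffix: str, role: str) -> dict:
        return {
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {"name": f"{cluster}-{suffix}"},
            "spec": {
                # Traffic is routed only to Pods carrying the matching role label.
                "selector": {"app": cluster, "role": role},
                "ports": [{"name": "mysql", "port": 3306}],
            },
        }

    return {"write": service("write", "master"), "read": service("read", "slave")}


def role_labels(instances: list, master: str) -> dict:
    """Compute the role label each instance should carry for a given master."""
    return {name: {"role": "master" if name == master else "slave"}
            for name in instances}


services = build_read_write_services("demo-db")
labels = role_labels(["demo-db-0", "demo-db-1", "demo-db-2"], master="demo-db-1")
print(services["write"]["spec"]["selector"])
print(labels)
```

With this arrangement a switchover only needs to relabel the Pods; the Services and the client-facing domain names stay unchanged.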

It should be noted that instance unavailability falls into three types: the master library fails while slave libraries survive; a slave library fails while the master library survives; and both the master library and the slave libraries fail. When the master library fails and slave libraries survive, a master-slave switch is required to promote one slave library to master. When a slave library fails and the master library survives, no master-slave switch is needed, and the load balancing in Kubernetes automatically removes the slave libraries that cannot be accessed. When both the master library and the slave libraries fail, the whole system has failed and cannot be repaired.
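The three cases above, combined with the smallest-delay selection rule, can be sketched as a small decision function. The action names, the tie-break on instance name, and the treatment of an unknown delay as infinite are assumptions; the delay value could, for instance, come from the `Seconds_Behind_Master` column of `SHOW SLAVE STATUS`, though the source does not say how it is measured.

```python
from typing import Dict, Optional


def select_master(delays: Dict[str, Optional[float]]) -> Optional[str]:
    """Pick the surviving instance with the smallest replication delay."""
    if not delays:
        return None
    # An unknown delay (None) is treated as infinitely large; ties break by name.
    return min(delays, key=lambda n: (float("inf") if delays[n] is None else delays[n], n))


def plan_recovery(master: str, alive: Dict[str, bool],
                  delays: Dict[str, Optional[float]]) -> dict:
    """Decide what to do for each of the three unavailability cases."""
    surviving_slaves = [n for n, ok in alive.items() if ok and n != master]
    if alive.get(master, False):
        # Master survives: failed slaves are simply dropped by load balancing.
        return {"action": "none", "master": master}
    if not surviving_slaves:
        # Master and all slaves failed: nothing to promote.
        return {"action": "unrecoverable", "master": None}
    new_master = select_master({n: delays.get(n) for n in surviving_slaves})
    return {"action": "switch", "master": new_master}


alive = {"mysql-0": False, "mysql-1": True, "mysql-2": True}
delays = {"mysql-1": 5.0, "mysql-2": 0.0}
plan = plan_recovery("mysql-0", alive, delays)
print(plan)  # {'action': 'switch', 'master': 'mysql-2'}
```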

The MySQL cluster can adopt a one-master, multiple-slave architecture in which each instance is independent: one instance serves as the master instance and provides the data writing capability, and the other instances serve as slave instances and provide the data reading capability. In this embodiment, the master-slave architecture and the automatic fault repair of the MySQL cluster are implemented through the Operator, which maintains high availability of the MySQL cluster; for example, a master-slave switch of the database is carried out when a downtime fault occurs, ensuring that the read and write functions of the database remain normal.

After the database is started, the Operator accesses each MySQL database instance in turn and configures the replication policy. Specifically, each slave library is configured with information such as the database connection address, user name, password and replication mode of the master library, and data is replicated incrementally from the master library to the slave libraries, which ensures data redundancy. Meanwhile, the Kubernetes platform provides load balancing entry points with read-write separation: a service entry point for writing data and a service entry point for reading data, whose backends are the master library and the slave libraries respectively. The Operator maintains this mapping.
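Configuring the replication policy on a slave library could look roughly like the sketch below, which only builds the SQL statements and leaves execution to the caller. The policy keys (`host`, `port`, `user`, `password`, `mode`, `log_file`, `log_pos`) are hypothetical. The statements use the classic `CHANGE MASTER TO` syntax; MySQL 8.0.23 and later prefer `CHANGE REPLICATION SOURCE TO`, so treat the exact wording as version dependent.

```python
def quote(value: str) -> str:
    """Quote a string literal for use in a MySQL statement."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def slave_statements(policy: dict) -> list:
    """Statements that point a slave at the master and make it read-only."""
    options = [
        f"MASTER_HOST={quote(policy['host'])}",
        f"MASTER_PORT={int(policy.get('port', 3306))}",
        f"MASTER_USER={quote(policy['user'])}",
        f"MASTER_PASSWORD={quote(policy['password'])}",
    ]
    if policy.get("mode") == "gtid":
        # GTID-based replication locates the start position automatically.
        options.append("MASTER_AUTO_POSITION=1")
    else:
        # File-and-position replication needs explicit Binlog coordinates.
        options.append(f"MASTER_LOG_FILE={quote(policy['log_file'])}")
        options.append(f"MASTER_LOG_POS={int(policy['log_pos'])}")
    return [
        "STOP SLAVE",
        "CHANGE MASTER TO " + ", ".join(options),
        "START SLAVE",
        "SET GLOBAL read_only = ON",
    ]


def master_statements() -> list:
    """Statements that turn a promoted instance into a writable master."""
    return ["STOP SLAVE", "RESET SLAVE ALL", "SET GLOBAL read_only = OFF"]


stmts = slave_statements({
    "host": "demo-db-0.demo-db", "user": "repl", "password": "s3cret", "mode": "gtid",
})
for statement in stmts:
    print(statement + ";")
```

The same two helpers cover both initial setup and a switchover: the promoted instance runs `master_statements()`, and every other surviving instance runs `slave_statements()` with the new master's address.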

Example 2

The method for creating a database instance comprises the following steps:

Step 301: create the stateful application according to the creation policy.

Step 302: create container resources based on the creation event of the stateful application. The container resources may be created through the Kubernetes platform.

Step 303: configure the container resources according to the configuration file to obtain a database instance.
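Step 303 might be realized by rendering a per-instance configuration file from the shared database configuration. In the sketch below, the Pod naming scheme `<name>-<ordinal>` follows the usual StatefulSet convention, while deriving `server-id` as 100 plus the ordinal is an assumption borrowed from common practice rather than from the source.

```python
def ordinal_of(pod_name: str) -> int:
    """Extract the StatefulSet ordinal from a Pod name such as 'demo-db-2'."""
    return int(pod_name.rsplit("-", 1)[1])


def render_my_cnf(database_config: dict, pod_name: str) -> str:
    """Render a my.cnf for one instance from the shared database configuration."""
    settings = dict(database_config)
    # Each replication participant needs a unique server-id (offset is assumed).
    settings["server-id"] = 100 + ordinal_of(pod_name)
    lines = ["[mysqld]"]
    lines += [f"{key}={settings[key]}" for key in sorted(settings)]
    return "\n".join(lines) + "\n"


cnf = render_my_cnf(
    {"innodb_buffer_pool_size": "1G", "log-bin": "mysql-bin"}, "demo-db-2"
)
print(cnf)
```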

Example 3

The method for upgrading the database comprises the following steps:

Step 401: create an upgrade policy, whose upgrade description information comprises: database configuration updates, container cluster deployment configuration updates, and database version updates. Examples include upgrading MySQL to a specified version, and horizontally scaling the database cluster, that is, re-creating instances.

Step 402: generate or modify the custom resource according to the upgrade description information.

Step 403: monitor the custom resource through an Operator, and update the configuration file.

Step 404: upgrade the database, the container or the cluster according to the configuration file.
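A simple way to decide which of the three upgrade kinds applies is to compare the current and desired specifications, as in this sketch. The spec keys (`database_config`, `deployment`, `version`) and the action names are hypothetical.

```python
def plan_upgrade(current: dict, desired: dict) -> list:
    """List the upgrade actions implied by the difference between two specs."""
    actions = []
    if current.get("database_config") != desired.get("database_config"):
        actions.append("update_database_config")
    if current.get("deployment") != desired.get("deployment"):
        # Covers changes such as the number of instances or resource quotas.
        actions.append("update_deployment")
    if current.get("version") != desired.get("version"):
        actions.append("update_database_version")
    return actions


current = {"version": "5.7", "deployment": {"replicas": 3},
           "database_config": {"max_connections": 200}}
desired = {"version": "8.0", "deployment": {"replicas": 5},
           "database_config": {"max_connections": 200}}
actions = plan_upgrade(current, desired)
print(actions)  # ['update_deployment', 'update_database_version']
```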

Example 4

The method for instance uninstallation comprises the following steps:

Step 501: delete the custom resource corresponding to the instance.

Step 502: monitor the deletion event of the custom resource through an Operator, and delete the corresponding configuration resources.

Step 503: upon the deletion event of a configuration resource, delete the container resources of the instance, such as the container group (Pod).

Because the resources are associated with one another and the database custom resource sits at the top layer, the controller of the cloud-native platform automatically deletes all related subordinate resources according to the cascade relationship. Deleting the custom resource automatically deletes built-in resources such as the StatefulSet, Service and ConfigMap; and because the StatefulSet and its Pods are also in a cascade relationship, the Pods are automatically deleted after the StatefulSet is deleted. This process is controlled by the controller of the cloud-native platform. The data volumes used by the database are retained or automatically deleted according to the mode actually defined.
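The cascade relationship can be modelled as an ownership tree. The following sketch simulates the deletion in plain Python; it does not call Kubernetes, and the resource names, the `owner` field and the `retain_kinds` parameter (standing in for the volume retention mode) are assumptions.

```python
from collections import deque


def cascade_delete(resources: dict, root: str, retain_kinds: tuple = ()) -> list:
    """Delete root and everything it owns, except kinds marked as retained."""
    deleted = []
    queue = deque([root])
    while queue:
        name = queue.popleft()
        # Children are the resources whose owner is the one being visited.
        children = sorted(n for n, r in resources.items() if r["owner"] == name)
        queue.extend(children)
        if resources[name]["kind"] in retain_kinds:
            continue  # e.g. keep data volumes when the retention mode says so
        deleted.append(name)
    for name in deleted:
        del resources[name]
    return deleted


resources = {
    "demo-db": {"kind": "MysqlCluster", "owner": None},
    "demo-db-sts": {"kind": "StatefulSet", "owner": "demo-db"},
    "demo-db-svc": {"kind": "Service", "owner": "demo-db"},
    "demo-db-config": {"kind": "ConfigMap", "owner": "demo-db"},
    "demo-db-0": {"kind": "Pod", "owner": "demo-db-sts"},
    "data-demo-db-0": {"kind": "PersistentVolumeClaim", "owner": "demo-db-sts"},
}
deleted = cascade_delete(resources, "demo-db", retain_kinds=("PersistentVolumeClaim",))
print(deleted)
print(sorted(resources))
```

In this run only the volume claim survives, which mirrors the case where the data volume is configured to be retained.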

Example 5

This embodiment provides a management system for implementing the above management method for a high-availability database. As shown in FIG. 2, it includes a description information management module 1, a custom resource management module 2, a monitoring module 3, and an execution module 4;

the description information management module 1 is used for creating or modifying the description information 11;

the custom resource management module 2 is used for creating, modifying or deleting custom resources 12 according to the description information 11;

the monitoring module 3 is used for monitoring the custom resource 12 and generating a configuration resource 13 according to the custom resource;

the execution module 4 is used for managing, according to the configuration resources 13, a containerized database cluster comprising a plurality of instances 15. In particular, the execution module 4 may manage the database cluster through the Kubernetes platform.
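The division of labor among the four modules can be sketched as a single in-memory class with one method per module. This is only a toy model of the data flow: the class name, method names and stored structures are hypothetical, and nothing here talks to a real cluster.

```python
class ManagementSystem:
    """Toy model of the four modules and the data passed between them."""

    def __init__(self):
        self.descriptions = {}             # description information 11
        self.custom_resources = {}         # custom resources 12
        self.configuration_resources = {}  # configuration resources 13
        self.applied = []                  # what the execution module has applied

    def save_description(self, name: str, description: dict) -> None:
        """Module 1: create or modify description information."""
        self.descriptions[name] = dict(description)

    def sync_custom_resource(self, name: str) -> dict:
        """Module 2: create or modify the custom resource from the description."""
        resource = {"metadata": {"name": name}, "spec": self.descriptions[name]}
        self.custom_resources[name] = resource
        return resource

    def on_custom_resource_event(self, name: str) -> list:
        """Module 3: react to a custom resource change and generate resources."""
        spec = self.custom_resources[name]["spec"]
        kinds = ["ConfigMap", "StatefulSet", "PersistentVolumeClaim", "Service"]
        generated = [{"kind": kind, "owner": name, "spec": spec} for kind in kinds]
        self.configuration_resources[name] = generated
        return generated

    def execute(self, name: str) -> list:
        """Module 4: manage the cluster according to the configuration resources."""
        for resource in self.configuration_resources[name]:
            self.applied.append((resource["kind"], name))
        return self.applied


system = ManagementSystem()
system.save_description("demo-db", {"replicas": 3})
system.sync_custom_resource("demo-db")
system.on_custom_resource_event("demo-db")
print(system.execute("demo-db"))
```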

The monitoring module 3 is further configured to execute the database failure recovery method of Example 1:

monitoring the instances of the database;

judging whether the master instance has a fault;

The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A management method for a high-availability database is characterized by comprising the following steps:

creating or modifying description information;

creating, modifying or deleting custom resources according to the description information;

monitoring the custom resource and generating a configuration resource;

and managing the database cluster according to the configuration resources.

2. The management method according to claim 1, wherein the description information includes any one or a combination of the following information:

3. The management method according to claim 2, wherein the method for database failure recovery comprises:

monitoring the instances of the database through an Operator;

judging whether the master instance has a fault;

4. The management method according to claim 3, wherein the method of selecting the master instance comprises:

accessing the surviving instances and obtaining the delay state of each instance;

5. The management method according to claim 4, wherein the replication policy includes the connection address, user name, password, and replication mode of the master instance;

detecting a fault instance;

judging whether the fault instance is recovered;

6. The method according to claim 2, wherein the configuration resource comprises any one or a combination of the following resources:

the method for instance creation comprises the following steps:

creating a stateful application according to the creation strategy;

7. The management method according to claim 6, wherein the upgrading method comprises:

creating an upgrade policy, wherein upgrade description information of the upgrade policy comprises: updating database configuration, updating container cluster deployment configuration or updating database version;

8. The management method according to claim 6, wherein the method for instance uninstallation comprises:

deleting the custom resource corresponding to the instance;

and deleting the container resource of the instance according to the deletion event of the configuration resource.

9. A management system for implementing the management method according to any one of claims 1 to 8, characterized by comprising: a description information management module, a custom resource management module, a monitoring module and an execution module;

10. The management system of claim 9, wherein the monitoring module is further configured to:

monitoring the instances of the database;

judging whether the master instance has a fault;

and modifying the replication policy to direct data replication to the master instance.