CN114510464A - Management method and management system of high-availability database - Google Patents

Info

Publication number
CN114510464A
Authority
CN
China
Prior art keywords
instance
database
resource
configuration
description information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210146171.XA
Other languages
Chinese (zh)
Inventor
杨世利
陈雪
宋阳
刘娟
洪晓霞
熊炜
宋鹏
裴劼
王仁菊
杨颖�
李佳
江欣祝
吴云松
何健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202210146171.XA
Publication of CN114510464A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1479 Generic software techniques for error detection or fault masking

Abstract

The invention discloses a management method and a management system for a high-availability database, belonging to the technical field of database management. The management method comprises the following steps: creating or modifying description information; creating, modifying or deleting a custom resource according to the description information; listening to the custom resource and generating configuration resources; and managing the database cluster according to the configuration resources. The description information uniformly describes the configuration information and the deployment information of the database, which facilitates unified management and deployment of the custom resource; the database is managed uniformly through the custom resource, which prevents misoperation caused by configuring a large number of internal resources.

Description

Management method and management system of high-availability database
Technical Field
The invention relates to the technical field of database management, in particular to a management method and a management system of a high-availability database.
Background
Custom resources are extensions of the Kubernetes API. They can appear and disappear in a running cluster through dynamic registration, and a cluster administrator can update custom resources independently of the cluster itself. Once a custom resource is installed, users can use kubectl to create and access its objects, just as they do for built-in resources such as Pods.
Kubernetes is currently a mainstream cloud-native container platform that supports running workloads such as general stateless applications and stateful applications. A database such as MySQL can run as a stateful application. An Operator is a software extension to Kubernetes that automates repetitive tasks; the Operator pattern encapsulates the task automation code that has been written. With the Operator pattern, one or more custom resources can be managed by a custom controller without modifying the code of Kubernetes itself, thereby extending the functionality of the cluster.
As business volume grows, changes to the many internal resources become more and more frequent. With custom resources, the configuration attributes a user needs can be centralized in a single resource for unified management, which avoids database crashes caused by erroneous changes.
Disclosure of Invention
In view of the above technical problems in the prior art, the present invention provides a management method and a management system for a high-availability database, which uniformly describe the deployment or management of the database through description information, manage the database according to the description information, simplify the management of the database, and prevent erroneous operation.
The invention discloses a management method of a high-availability database, which comprises the following steps: creating or modifying description information; creating, modifying or deleting a custom resource according to the description information; listening to the custom resource and generating configuration resources; and managing the database cluster according to the configuration resources.
Preferably, the description information includes any one or a combination of the following information:
database configuration information, an instance creation policy, an upgrade policy, a master-slave policy, and a replication policy.
Preferably, the database failure recovery method includes:
listening to the instances of the database through an Operator;
judging whether the master instance has failed;
if yes, selecting a healthy instance as the master instance according to the master election policy;
and modifying the replication policy so that data replication points to the master instance.
Preferably, the method of selecting the master instance comprises:
accessing the surviving instances and acquiring the delay state of each instance;
selecting, from the surviving instances, the instance with the smallest delay as the master instance.
Preferably, the replication policy includes a connection address, a user name, a password, and a replication mode of the master instance;
the read-only policy of the master instance is set to off, and the read-only policy of the slave instances is set to on;
the method for reconnecting a failed instance comprises the following steps:
detecting the failed instance;
judging whether the failed instance has recovered;
if yes, setting the recovered instance as a slave instance and synchronizing the replication policy.
Preferably, the configuration resource includes any one or a combination of the following resources:
configuration files, stateful applications, resource objects of storage volumes, and service discovery objects;
the method for instance creation comprises the following steps:
creating a stateful application according to the creation strategy;
creating container resources according to the creation event of the stateful application;
and configuring the container resources according to the configuration file to obtain a database instance.
Preferably, the upgrading method comprises:
creating an upgrade policy, wherein upgrade description information of the upgrade policy comprises: updating database configuration, updating container cluster deployment configuration and updating database version;
generating or modifying custom resources according to the upgrading description information;
listening to the custom resource through an Operator, and updating the configuration file;
and upgrading the database, the container or the cluster according to the configuration file.
Preferably, the instance uninstallation method comprises:
deleting the custom resource corresponding to the instance;
listening to the deletion event of the custom resource through an Operator, and deleting the corresponding configuration resources;
deleting the container group of the instance according to the deletion event of the configuration resources.
The invention also provides a management system for implementing the management method, which comprises:
a description information management module, a custom resource management module, a listening module and an execution module;
the description information management module is used for creating or modifying description information;
the custom resource management module is used for creating, modifying or deleting a custom resource according to the description information;
the listening module is used for listening to the custom resource and generating configuration resources;
the execution module is used for managing the database cluster according to the configuration resources.
Preferably, the listening module is further configured to:
listen to the instances of the database;
judge whether the master instance has failed;
if yes, select a healthy instance as the master instance according to the master election policy;
and modify the replication policy so that data replication points to the master instance.
Compared with the prior art, the invention has the beneficial effects that: the configuration information and the deployment information of the database are uniformly described through the description information, so that the unified management and the deployment of the custom resources are facilitated, the database is uniformly managed through the custom resources, and the misoperation caused by a large number of APIs (application programming interfaces) is prevented.
Drawings
FIG. 1 is a flow chart of a method for managing a highly available database according to the present invention;
FIG. 2 is a logical block diagram of the management system of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The invention is described in further detail below with reference to the attached drawing figures:
A management method of a high-availability database, as shown in FIG. 1, comprises:
Step 101: Create or modify the description information. The description information can be saved in a file, such as a YAML or JSON file. It describes the attributes of database management, for example configuration information inside the database (root password, sizes of the various buffers, the mechanism for flushing data to disk, and the like) and configuration information about the database in the cluster (number of instances, hardware resource quotas, network policies, scheduling policies, and the like).
Step 102: Create, modify or delete the custom resource according to the description information.
Step 103: Listen to the custom resource and generate the configuration resources.
The configuration resources include any one or a combination of the following: configuration files (ConfigMap), stateful applications (StatefulSet), resource objects of storage volumes (PVC), and service discovery objects (Service). In one embodiment, an Operator is used to listen to the custom resource.
Step 104: Manage the containerized database according to the configuration resources. The database may be a MySQL database, but is not limited thereto. The containerized database may have multiple instances, which make up a database cluster.
The configuration information and the deployment information of the database are uniformly described through the description information, so that uniform management and deployment of custom resources are facilitated, the database is uniformly managed through the custom resources, misoperation caused by a large number of APIs is prevented, and high availability of the database is improved.
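As an illustration of step 103, the following minimal sketch derives the four kinds of configuration resources from one custom resource. It works on plain dictionaries and does not talk to a cluster. The custom resource kind `MysqlCluster`, the group `example.com/v1`, the `spec` field names and the naming scheme are assumptions made for the example and are not taken from the patent; the manifests are deliberately incomplete.

```python
from typing import Any, Dict, List


def build_config_resources(cr: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Derive ConfigMap, Service, PVC and StatefulSet manifests from one custom resource."""
    name = cr["metadata"]["name"]
    spec = cr["spec"]
    labels = {"app": name}

    config_map = {
        "apiVersion": "v1", "kind": "ConfigMap",
        "metadata": {"name": f"{name}-config", "labels": labels},
        # ConfigMap values must be strings.
        "data": {key: str(value) for key, value in spec["config"].items()},
    }
    service = {
        "apiVersion": "v1", "kind": "Service",
        "metadata": {"name": name, "labels": labels},
        # Headless service: gives every instance a stable DNS name.
        "spec": {"clusterIP": "None", "selector": labels,
                 "ports": [{"name": "mysql", "port": 3306}]},
    }
    # One storage claim per instance (naming scheme is an assumption).
    pvcs = [
        {
            "apiVersion": "v1", "kind": "PersistentVolumeClaim",
            "metadata": {"name": f"data-{name}-{i}", "labels": labels},
            "spec": {"accessModes": ["ReadWriteOnce"],
                     "resources": {"requests": {"storage": spec["storage"]}}},
        }
        for i in range(spec["replicas"])
    ]
    stateful_set = {
        "apiVersion": "apps/v1", "kind": "StatefulSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "serviceName": name,
            "replicas": spec["replicas"],
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "mysql", "image": spec["image"]}]},
            },
        },
    }
    return [config_map, service, *pvcs, stateful_set]


custom_resource = {
    "apiVersion": "example.com/v1",  # hypothetical group/version
    "kind": "MysqlCluster",          # hypothetical kind
    "metadata": {"name": "orders-db"},
    "spec": {"replicas": 2, "image": "mysql:8.0", "storage": "10Gi",
             "config": {"max_connections": 200}},
}
for resource in build_config_resources(custom_resource):
    print(resource["kind"], resource["metadata"]["name"])
```

Keeping the derivation a pure function of the custom resource means the same input always yields the same built-in resources, which is what lets a single resource act as the only place a user edits.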
In step 101, the description information includes any one or a combination of the following information:
database configuration information, an instance creation policy, an upgrade policy, a master-slave policy, and a replication policy. The custom resource corresponds to the description information.
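To make the shape of the description information concrete, here is a small sketch that models it as a dictionary with the five parts listed above and serializes it to JSON. All section and field names (`databaseConfig`, `rootPasswordSecret`, and so on) are hypothetical; the patent only names the categories of information, not a schema.

```python
import json

# Hypothetical section names, one per category of description information.
ALLOWED_SECTIONS = {
    "databaseConfig",
    "creationPolicy",
    "upgradePolicy",
    "masterSlavePolicy",
    "replicationPolicy",
}


def validate_description(description: dict) -> dict:
    """Accept any non-empty combination of the known sections, reject anything else."""
    if not description:
        raise ValueError("description information must contain at least one section")
    unknown = sorted(set(description) - ALLOWED_SECTIONS)
    if unknown:
        raise ValueError(f"unknown sections: {unknown}")
    return description


description = {
    "databaseConfig": {"rootPasswordSecret": "mysql-root", "innodb_buffer_pool_size": "1G"},
    "creationPolicy": {"replicas": 3, "cpu": "2", "memory": "4Gi"},
    "masterSlavePolicy": {"election": "min-delay"},
    "replicationPolicy": {"user": "repl", "mode": "gtid"},
}
print(json.dumps(validate_description(description), indent=2))
```

Because the description information is "any one or a combination" of the parts, the check allows missing sections and only rejects unknown ones.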
Database management includes database failure recovery, database upgrade, instance creation, and instance uninstallation.
Embodiment 1
The method for database failure recovery comprises the following steps:
Step 201: Listen to the instances of the database through the Operator. The Operator can use a scheduled task to periodically connect to each database instance and execute an SQL statement as a liveness check.
Step 202: Determine whether the master instance has failed. The Operator periodically probes the liveness of each MySQL instance, that is, it connects to the MySQL instances distributed over different nodes using the configured connection address, user name, password and port. If the connection times out, the instance has failed. After a successful connection, the statement "SELECT 1" is executed; if no normal result is returned, the instance has failed, otherwise it is healthy. Healthy and failed instances are marked accordingly.
If yes, go to step 203: Select a healthy instance as the master instance according to the master election policy.
The method for selecting the master instance comprises: Step 211: Access the surviving instances and obtain the delay state of each instance. Step 212: Select, from the surviving instances, the instance with the smallest delay as the master instance.
If not, go to step 204: Continue to listen to the instances, continuously or periodically.
Step 205: Modify the replication policy, the load balancing policy and the like so that the slave instances replicate data from the master instance; set the read-only policy of the master instance to off and the read-only policy of the slave instances to on.
The replication policy comprises the connection address, user name, password and replication mode of the master instance. Specifically, the MySQL instance with sequence number 0 is the master by default and the other database instances are slaves, and the slaves are configured to replicate data from the master. Finally, once replication is started, each slave pulls the binlog that the master retains for the data written to it and replays that binlog locally, which achieves data replication and keeps the data consistent.
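The replication policy of step 205 could be turned into SQL roughly as follows. This is a sketch under assumptions: it emits the classic MySQL statements (`CHANGE MASTER TO`, `START SLAVE`, `SET GLOBAL read_only`), whereas newer MySQL releases prefer `CHANGE REPLICATION SOURCE TO`; it treats the "replication mode" as GTID auto-positioning; and the `ReplicationPolicy` class and function names are hypothetical. It only builds statement strings and leaves execution to whatever client the Operator uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicationPolicy:
    master_host: str
    master_port: int
    user: str
    password: str
    use_gtid: bool = True  # assumption: "replication mode" means GTID auto-positioning


def _quote(value: str) -> str:
    """Quote a string literal for SQL, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def slave_statements(policy: ReplicationPolicy) -> List[str]:
    """Statements that point a slave at the master and make it read-only."""
    change = (
        f"CHANGE MASTER TO MASTER_HOST={_quote(policy.master_host)}, "
        f"MASTER_PORT={int(policy.master_port)}, "
        f"MASTER_USER={_quote(policy.user)}, "
        f"MASTER_PASSWORD={_quote(policy.password)}"
    )
    if policy.use_gtid:
        change += ", MASTER_AUTO_POSITION=1"
    return ["STOP SLAVE", change, "START SLAVE", "SET GLOBAL read_only = ON"]


def master_statements() -> List[str]:
    """Statements that turn an instance into a writable master."""
    return ["STOP SLAVE", "RESET SLAVE ALL", "SET GLOBAL read_only = OFF"]


policy = ReplicationPolicy("orders-db-0.orders-db", 3306, "repl", "s3cret")
for statement in slave_statements(policy):
    print(statement + ";")
```

Position-based replication (binlog file and offset) would need extra parameters and is not covered by this sketch.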
Step 206: Reconnect the failed instance. The method comprises: detecting the failed instance; judging whether it has recovered; if yes, setting the recovered instance as a slave instance and synchronizing the replication policy. The Operator can thus automatically repair specific failure scenarios, recovering within a short time without the involvement of operation and maintenance personnel.
The Operator records the metadata of the master-slave relationship at run time. At the same time, two service discovery objects with load balancing capability are created as the entry points for writing data and reading data. Clients access the domain names of these two service discovery objects to achieve read-write separation.
It should be noted that instance unavailability falls into three cases: the master has failed while slaves survive; a slave has failed while the master survives; and both the master and the slaves have failed. When the master has failed and slaves survive, a master-slave switch is required to turn one slave into the master. When a slave has failed and the master survives, no master-slave switch is needed, and the load balancer in Kubernetes automatically removes the slaves that cannot be accessed. When both the master and the slaves have failed, the whole system has failed and cannot be repaired.
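The three cases above, combined with the minimum-delay election of steps 211 and 212, can be written as one small decision function. This is a sketch: the `Instance` fields and the action names are hypothetical, and how the delay is measured is left to the caller because the text does not specify it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instance:
    name: str
    role: str                               # "master" or "slave"
    alive: bool
    delay_seconds: Optional[float] = None   # replication delay, None if unknown


def plan_failover(instances: List[Instance]) -> Tuple[str, Optional[str]]:
    """Return (action, target): 'none', 'promote' with a slave name, or 'unrecoverable'."""
    if any(i.alive for i in instances if i.role == "master"):
        # Master survives: failed slaves are simply dropped by load balancing.
        return ("none", None)
    candidates = [
        i for i in instances
        if i.role == "slave" and i.alive and i.delay_seconds is not None
    ]
    if not candidates:
        # Master and slaves have all failed: nothing can be promoted.
        return ("unrecoverable", None)
    # Master failed, slaves survive: promote the slave with the smallest delay.
    best = min(candidates, key=lambda i: i.delay_seconds)
    return ("promote", best.name)


cluster = [
    Instance("mysql-0", "master", alive=False),
    Instance("mysql-1", "slave", alive=True, delay_seconds=5.0),
    Instance("mysql-2", "slave", alive=True, delay_seconds=0.5),
]
print(plan_failover(cluster))  # ('promote', 'mysql-2')
```

Choosing the slave with the smallest delay minimizes the amount of data that the new master has not yet replayed.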
The MySQL cluster can adopt a one-master, multiple-slave architecture in which each instance is independent: one instance serves as the master and provides the data writing capability, and the other instances serve as slaves and provide the data reading capability. In this embodiment, the master-slave architecture and the automatic failure repair of the MySQL cluster are implemented through the Operator, which maintains the high availability of the MySQL cluster; for example, a master-slave switch is performed when a downtime failure occurs, so that the read and write functions of the database remain normal.
After the database is started, the Operator accesses each MySQL database instance in turn and configures the replication policy. Specifically, each slave is configured with the connection address, user name, password and replication mode of the master, and data is replicated incrementally from the master to the slaves, which ensures data redundancy. At the same time, the Kubernetes platform provides read-write-separated load balancing entry points: one service entry point for writing data, backed by the master, and one for reading data, backed by the slaves. The Operator maintains this mapping.
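One way the Operator might maintain the mapping between the two entry points and the instances is to label each instance with its role and let each Service select on that label. The sketch below shows this idea with plain dictionaries; the `role` label, the `-write` and `-read` name suffixes and the function names are assumptions, not something the patent prescribes.

```python
from typing import Dict, List


def role_labels(instances: List[str], master: str) -> Dict[str, Dict[str, str]]:
    """Compute the role label each instance should carry for the given master."""
    if master not in instances:
        raise ValueError(f"{master!r} is not a known instance")
    return {
        name: {"role": "master" if name == master else "slave"}
        for name in instances
    }


def read_write_services(cluster: str) -> List[dict]:
    """Two Service manifests: writes go to the master, reads go to the slaves."""
    def service(suffix: str, role: str) -> dict:
        return {
            "apiVersion": "v1", "kind": "Service",
            "metadata": {"name": f"{cluster}-{suffix}"},
            "spec": {"selector": {"app": cluster, "role": role},
                     "ports": [{"port": 3306}]},
        }
    return [service("write", "master"), service("read", "slave")]


pods = ["orders-db-0", "orders-db-1", "orders-db-2"]
print(role_labels(pods, master="orders-db-0"))
# After a master-slave switch only the labels change; the Services stay as they are.
print(role_labels(pods, master="orders-db-2"))
```

With this arrangement a failover does not require clients to change the address they connect to.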
Embodiment 2
The method for creating a database instance comprises the following steps:
Step 301: Create the stateful application according to the creation policy.
Step 302: Create the container resources based on the creation event of the stateful application. The container resources may be created through the Kubernetes platform.
Step 303: Configure the container resources according to the configuration file to obtain a database instance.
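Step 303 can be illustrated by rendering a per-instance `my.cnf` from the shared configuration. Pods of a StatefulSet are named `<name>-<ordinal>`, so the sketch derives a unique `server-id` from the ordinal; the base value of 100 and the function names are assumptions made for the example.

```python
import re
from typing import Dict


def pod_ordinal(pod_name: str) -> int:
    """Extract the trailing ordinal from a StatefulSet pod name such as 'mysql-2'."""
    match = re.fullmatch(r"(.+)-(\d+)", pod_name)
    if match is None:
        raise ValueError(f"not a StatefulSet pod name: {pod_name!r}")
    return int(match.group(2))


def render_my_cnf(pod_name: str, base_config: Dict[str, object],
                  server_id_base: int = 100) -> str:
    """Render the [mysqld] section for one instance, adding a unique server-id."""
    settings = dict(base_config)
    # Every server in a replication topology needs a distinct server-id.
    settings["server-id"] = server_id_base + pod_ordinal(pod_name)
    lines = ["[mysqld]"] + [f"{key}={value}" for key, value in sorted(settings.items())]
    return "\n".join(lines) + "\n"


print(render_my_cnf("orders-db-2", {"max_connections": 200, "log-bin": "mysql-bin"}))
```

The shared settings come from the configuration file generated for the cluster, while the ordinal-dependent part is computed when each container is configured.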
Embodiment 3
The method for upgrading the database comprises the following steps:
Step 401: Create an upgrade policy, wherein the upgrade description information of the upgrade policy comprises: database configuration updates, container cluster deployment configuration updates, and database version updates. Examples are upgrading MySQL to a specified version, or horizontally scaling the database cluster, that is, re-creating instances.
Step 402: Generate or modify the custom resource according to the upgrade description information.
Step 403: Listen to the custom resource through the Operator, and update the configuration file.
Step 404: Upgrade the database, the container or the cluster according to the configuration file.
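When the Operator sees a modified custom resource, it has to work out which of the three kinds of upgrade is being requested. A minimal sketch of that comparison is shown below; the `spec` field names (`config`, `replicas`, `resources`, `version`) and the returned labels are hypothetical.

```python
from typing import Any, Dict, List


def classify_upgrade(current: Dict[str, Any], desired: Dict[str, Any]) -> List[str]:
    """Compare two specs and list which kinds of upgrade the change implies."""
    kinds: List[str] = []
    if current.get("config") != desired.get("config"):
        kinds.append("database-config")
    if any(current.get(key) != desired.get(key) for key in ("replicas", "resources")):
        kinds.append("deployment-config")
    if current.get("version") != desired.get("version"):
        kinds.append("database-version")
    return kinds


current_spec = {"version": "5.7", "replicas": 3, "config": {"max_connections": 200}}
desired_spec = {"version": "8.0", "replicas": 5, "config": {"max_connections": 200}}
print(classify_upgrade(current_spec, desired_spec))
# ['deployment-config', 'database-version']
```

An empty result means the modification does not touch anything that requires an upgrade.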
Embodiment 4
The method for uninstalling an instance comprises the following steps:
Step 501: Delete the custom resource corresponding to the instance.
Step 502: Listen to the deletion event of the custom resource through the Operator, and delete the corresponding configuration resources.
Step 503: Upon the deletion event of the configuration resources, delete the container resources of the instance, such as its container group (Pod).
Because the resources are associated with one another and the database custom resource sits at the top level, the controller of the cloud-native platform automatically deletes all related subordinate resources according to the cascade relationship. Deleting the custom resource automatically deletes the StatefulSet, Service, ConfigMap and other built-in resources; and because the StatefulSet and its Pods are also in a cascade relationship, the Pods are automatically deleted after the StatefulSet is deleted. This process is controlled by the controller of the cloud-native platform. The data volumes used by the database are retained or automatically deleted according to the mode actually defined.
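The cascade relationship can be pictured as an ownership tree in which each resource records its owner. The sketch below walks such a tree to list everything that disappears with the custom resource. The resource names are made up, the tree is assumed to be acyclic with at most one owner per resource, and the real deletion is performed by the platform's controller rather than by code like this.

```python
from typing import Dict, List, Optional


def cascade_delete(owners: Dict[str, Optional[str]], root: str) -> List[str]:
    """Return the root and all of its transitive dependents, owners before dependents."""
    children: Dict[str, List[str]] = {}
    for child, owner in owners.items():
        if owner is not None:
            children.setdefault(owner, []).append(child)

    order: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Reverse-sorted push so dependents are visited in alphabetical order.
        stack.extend(sorted(children.get(node, []), reverse=True))
    return order


owners = {
    "MysqlCluster/db": None,                 # custom resource at the top level
    "StatefulSet/db": "MysqlCluster/db",
    "Service/db": "MysqlCluster/db",
    "ConfigMap/db-config": "MysqlCluster/db",
    "Pod/db-0": "StatefulSet/db",
    "Pod/db-1": "StatefulSet/db",
    "PersistentVolumeClaim/data-db-0": None,  # not owned here, so it is retained
}
print(cascade_delete(owners, "MysqlCluster/db"))
```

Leaving the data volume without an owner models the "retained" mode; giving it the custom resource as owner would model automatic deletion.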
Embodiment 5
This embodiment provides a management system for implementing the above management method of a high-availability database. As shown in FIG. 2, it includes a description information management module 1, a custom resource management module 2, a listening module 3, and an execution module 4;
the description information management module 1 is used for creating or modifying the description information 11;
the custom resource management module 2 is used for creating, modifying or deleting the custom resource 12 according to the description information 11;
the listening module 3 is used for listening to the custom resource 12 and generating the configuration resources 13 according to it;
the execution module 4 is used for managing, according to the configuration resources 13, a containerized database cluster comprising a plurality of instances 15. In particular, the execution module 4 may manage the database cluster through the Kubernetes platform.
The listening module 3 is further configured to execute the database failure recovery method of Embodiment 1:
listening to the instances of the database;
judging whether the master instance has failed;
if yes, selecting a healthy instance as the master instance according to the master election policy;
and modifying the replication policy so that data replication points to the master instance.
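How the listening module and the execution module might be wired together can be sketched with a small event handler. The event types `ADDED`, `MODIFIED` and `DELETED` are the ones Kubernetes watch streams use; everything else (the class name, the callback signatures, the in-memory dictionary standing in for the cluster) is a hypothetical simplification.

```python
from typing import Any, Callable, Dict, List

Resource = Dict[str, Any]


class ListeningModule:
    """Turns custom resource events into calls on the execution side."""

    def __init__(self,
                 generate: Callable[[Resource], List[Resource]],
                 apply: Callable[[str, List[Resource]], None],
                 delete: Callable[[str], None]) -> None:
        self._generate = generate  # custom resource -> configuration resources
        self._apply = apply        # execution module: create or update
        self._delete = delete      # execution module: remove

    def handle(self, event_type: str, custom_resource: Resource) -> None:
        name = custom_resource["metadata"]["name"]
        if event_type in ("ADDED", "MODIFIED"):
            self._apply(name, self._generate(custom_resource))
        elif event_type == "DELETED":
            self._delete(name)
        else:
            raise ValueError(f"unsupported event type: {event_type}")


# In-memory stand-in for the database cluster managed by the execution module.
cluster_state: Dict[str, List[Resource]] = {}
module = ListeningModule(
    generate=lambda cr: [{"kind": "StatefulSet", "replicas": cr["spec"]["replicas"]}],
    apply=cluster_state.__setitem__,
    delete=lambda name: cluster_state.pop(name, None),
)
module.handle("ADDED", {"metadata": {"name": "orders-db"}, "spec": {"replicas": 3}})
print(cluster_state)
```

Passing the generation and execution steps in as callbacks keeps the four modules separable, mirroring the block diagram of FIG. 2.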
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A management method for a high-availability database is characterized by comprising the following steps:
creating or modifying description information;
creating, modifying or deleting custom resources according to the description information;
listening to the custom resource and generating configuration resources;
and managing the database cluster according to the configuration resources.
2. The management method according to claim 1, wherein the description information includes any one or a combination of the following information:
database configuration information, an instance creation policy, an upgrade policy, a master-slave policy, and a replication policy.
3. The management method according to claim 2, wherein the method of database failure recovery comprises:
listening to the instances of the database through an Operator;
judging whether the master instance has failed;
if yes, selecting a healthy instance as the master instance according to the master election policy;
and modifying the replication policy so that data replication points to the master instance.
4. The management method according to claim 3, wherein the method of selecting the master instance comprises:
accessing the surviving instances and acquiring the delay state of each instance;
selecting, from the surviving instances, the instance with the smallest delay as the master instance.
5. The management method according to claim 4, wherein the replication policy includes a connection address, a user name, a password, and a replication mode of the master instance;
the read-only policy of the master instance is set to off, and the read-only policy of the slave instances is set to on;
the method for reconnecting a failed instance comprises the following steps:
detecting the failed instance;
judging whether the failed instance has recovered;
if yes, setting the recovered instance as a slave instance and synchronizing the replication policy.
6. The method according to claim 2, wherein the configuration resource comprises any one or a combination of the following resources:
configuration files, stateful applications, resource objects of storage volumes, and service discovery objects;
the method for instance creation comprises the following steps:
creating a stateful application according to the creation strategy;
creating container resources according to the creation event of the stateful application;
and configuring the container resources according to the configuration file to obtain a database instance.
7. The management method according to claim 6, wherein the upgrading method comprises:
creating an upgrade policy, wherein upgrade description information of the upgrade policy comprises: updating database configuration, updating container cluster deployment configuration or updating database version;
generating or modifying custom resources according to the upgrading description information;
listening to the custom resource through an Operator, and updating the configuration file;
and upgrading the database, the container or the cluster according to the configuration file.
8. The management method according to claim 6, wherein the method of instance uninstallation comprises:
deleting the custom resource corresponding to the instance;
listening to the deletion event of the custom resource through an Operator, and deleting the corresponding configuration resources;
and deleting the container resources of the instance according to the deletion event of the configuration resources.
9. A management system for implementing the management method according to any one of claims 1 to 8, characterized by comprising: a description information management module, a custom resource management module, a listening module and an execution module;
the description information management module is used for creating or modifying description information;
the custom resource management module is used for creating, modifying or deleting a custom resource according to the description information;
the listening module is used for listening to the custom resource and generating configuration resources;
the execution module is used for managing the database cluster according to the configuration resources.
10. The management system of claim 9, wherein the listening module is further configured to:
listen to the instances of the database;
judge whether the master instance has failed;
if yes, select a healthy instance as the master instance according to the master election policy;
and modify the replication policy so that data replication points to the master instance.
CN202210146171.XA 2022-02-17 2022-02-17 Management method and management system of high-availability database Pending CN114510464A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210146171.XA CN114510464A (en) 2022-02-17 2022-02-17 Management method and management system of high-availability database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210146171.XA CN114510464A (en) 2022-02-17 2022-02-17 Management method and management system of high-availability database

Publications (1)

Publication Number Publication Date
CN114510464A true CN114510464A (en) 2022-05-17

Family

ID=81552555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210146171.XA Pending CN114510464A (en) 2022-02-17 2022-02-17 Management method and management system of high-availability database

Country Status (1)

Country Link
CN (1) CN114510464A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170822A (en) * 2022-12-22 2023-05-26 博上(山东)网络科技有限公司 5G network resource management method and system
CN116170822B (en) * 2022-12-22 2023-09-08 博上(山东)网络科技有限公司 5G network resource management method and system
CN116974703A (en) * 2023-09-22 2023-10-31 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Kubernetes application resource management method and system
CN116974703B (en) * 2023-09-22 2024-01-02 山东未来网络研究院(紫金山实验室工业互联网创新应用基地) Kubernetes application resource management method and system

Similar Documents

Publication Publication Date Title
US20230081942A1 (en) Automatically Deployed Information Technology (IT) System and Method
CN106716360B (en) System and method for supporting patch patching in a multi-tenant application server environment
JP6514308B2 (en) Failover and Recovery for Replicated Data Instances
US7941510B1 (en) Management of virtual and physical servers using central console
US8195979B2 (en) Method and apparatus for realizing application high availability
US7430616B2 (en) System and method for reducing user-application interactions to archivable form
US7383327B1 (en) Management of virtual and physical servers using graphic control panels
US7587483B1 (en) System and method for managing computer networks
US20060259594A1 (en) Progressive deployment and maintenance of applications on a set of peer nodes
EP3745269B1 (en) Hierarchical fault tolerance in system storage
CN114510464A (en) Management method and management system of high-availability database
CN106657167B (en) Management server, server cluster, and management method
US8688644B1 (en) Systems and methods for performing recovery of directory data
CN113626286A (en) Multi-cluster instance processing method and device, electronic equipment and storage medium
JP2005056392A (en) Method and device for validity inspection of resource regarding geographical mirroring and for ranking
US8499080B2 (en) Cluster control apparatus, control system, control method, and control program
CN116389233A (en) Container cloud management platform active-standby switching system, method and device and computer equipment
CN116069358A (en) Method, device and storage medium for upgrading data in distributed database
US11895102B2 (en) Identity management
CN111371606A (en) Method for specifying monitor ip when using look to deploy ceph cluster
WO2016046951A1 (en) Computer system and file management method therefor
US11853177B2 (en) Global entity distribution
CN116010111B (en) Cross-cluster resource scheduling method, system and terminal equipment
CN116185708A (en) MySQL cluster high availability system and equipment
CN114385592A (en) Fault transfer method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination