CN117520053A

CN117520053A - Database system management method and device based on master-slave replication protocol

Info

Publication number: CN117520053A
Application number: CN202311550393.9A
Authority: CN
Inventors: 黄永; 吴学强; 刘东明
Original assignee: Hangzhou Yunyingsheng Data Co ltd
Current assignee: Hangzhou Yunyingsheng Data Co ltd
Priority date: 2023-11-20
Filing date: 2023-11-20
Publication date: 2024-02-06

Abstract

The application discloses a database system management method and device based on a master-slave replication protocol, relating to the field of computer technology. The method comprises the following steps: determining an initial master node and initial standby nodes in a declarative manner, and adding role information of the initial master and standby nodes to the labels and annotations of the corresponding objects; when an abnormality occurs in the initial master node, or the sequence number of the initial master node in the cluster object is modified, selecting a new master node according to a preset master selection policy; and after the new master node is determined, performing role switching on the initial master and standby nodes according to a preset switching policy, and updating the role information in the labels and annotations. The method and device compensate for the shortcomings of the StatefulSet controller, can provide differentiated read-write services externally, and support differentiated configuration management for master and standby nodes. Through a high-availability (HA) scheme applicable to all master-standby databases, they also support integrating mainstream external HA schemes and customizing built-in HA schemes.

Description

Database system management method and device based on master-slave replication protocol

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for managing a database system based on a master-slave replication protocol.

Background

A database is a stateful system. A single-node database suffers from low data security, a low ceiling on read-write capacity, and poor disaster tolerance, so modern databases address these problems with data replication: data is replicated to multiple nodes that serve reads and writes externally at the same time, which improves data security, read-write capacity, and disaster tolerance. Master-slave replication is one of the common data replication modes in database systems. In K8s, the native StatefulSet component provides a way to manage stateful systems; it can manage multiple nodes, each consisting of a computing unit and a data unit. However, when StatefulSet is used in K8s to manage a database system based on a master-slave replication protocol, StatefulSet cannot perceive the relationships among the different nodes, so the following problems tend to occur:

First, nodes differ in the service capability they offer externally: some can serve reads and writes, some can serve only reads, and some cannot serve external traffic at all. StatefulSet cannot recognize these differences and therefore cannot provide differentiated services externally.

Second, during operations such as version upgrade and downgrade, specification scale-up and scale-down, restart, and capacity expansion and reduction, the order of operations across nodes must follow a specific sequence or rule in order to improve system availability, but StatefulSet does not support performing these operations according to such a sequence or rule.

Third, different roles often need different configuration file parameters, but StatefulSet does not support mounting different configuration files into Pods according to rules such as roles.

Fourth, the database kernel provides only the master-slave replication capability and has no capability for failover (Failover) or active switchover (Switchover). External components must intervene, and existing database systems based on master-slave replication protocols each implement this differently.

Disclosure of Invention

The database system management method based on a master-slave replication protocol provided by the present application aims to solve the problems in the prior art that a database system based on such a protocol cannot provide differentiated services and lacks failover and active switchover capabilities.

In order to achieve the above purpose, the present application adopts the following technical scheme:

a database system management method based on a master-slave replication protocol comprises the following steps:

determining an initial master node and initial standby nodes in a declarative manner, and adding role information of the initial master and standby nodes to the labels and annotations of the corresponding objects;

when an abnormality occurs in the initial master node or the initial master node sequence number in the cluster object is modified, selecting a new master node according to a preset master selection strategy;

and after the new master node is determined, performing role switching on the initial master and standby nodes according to a preset switching strategy, and updating the role information in the labels and annotations.

Preferably, determining the initial master and standby nodes in a declarative manner includes: specifying the sequence number of the initial master node through the primary index field in the declarative API.

Preferably, the method further comprises:

the initial master node and each initial standby node are respectively placed in different controllers.

Preferably, the master selection policy includes a weight priority policy, a same-region or same-machine-room priority policy, and a minimum-node-data-delay priority policy.

Preferably, the switching policy is defined by a SwitchPolicy field in the declarative API, and when SwitchPolicy=noop, the high-availability switching action is implemented by an external component.

Preferably, performing role switching on the initial master and standby nodes according to a preset switching policy includes:

judging whether the role information of the initial master node and that of the new master node are consistent; if so, acquiring the data synchronization delay between the initial master node and the new master node and generating the environment variables on which the switch depends, wherein the role information comprises the role in the kernel and the role on the corresponding label.

A database system management device based on a master-slave replication protocol, comprising:

an initialization module, configured to determine the initial master and standby nodes in a declarative manner and add information of the initial master and standby nodes to the labels and annotations of the corresponding objects;

the reselection module is used for selecting a new master node according to a preset master selection strategy when an abnormality occurs in the initial master node or the serial number of the initial master node in the cluster object is modified;

and a switching module, configured to perform role switching on the initial master and standby nodes according to a preset switching strategy after the new master node is determined, and to update the information in the labels and annotations.

Preferably, the switching module includes:

a judging unit, configured to judge whether the role information of the initial master node and that of the new master node are consistent and, if so, to acquire the data synchronization delay between the initial master node and the new master node and generate the environment variables on which the switch depends, wherein the role information comprises the role in the kernel and the role on the corresponding label.

An electronic device comprising a memory and a processor, the memory to store one or more computer instructions, wherein the one or more computer instructions are executable by the processor to implement a master-slave replication protocol based database system management method as claimed in any one of the preceding claims.

A computer-readable storage medium storing a computer program which, when executed by a computer, implements a database system management method based on a master-slave replication protocol as claimed in any one of the above.

The invention has the following beneficial effects:

The method and device compensate for the shortcomings of the StatefulSet controller, can provide differentiated read-write services externally, support operation and maintenance operations based on a specific sequence or rule, and also support differentiated configuration management for master and standby nodes.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.

FIG. 1 is a flowchart of a method for managing a database system based on a master-slave replication protocol provided in the present application;

FIG. 2 is a schematic diagram of a database system management device based on a master-slave replication protocol provided in the present application;

FIG. 3 is a schematic diagram of an electronic device implementing a database system management method based on a master-slave replication protocol.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.

The terms "first," "second," and the like in the claims and the description of the present application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that terms so used are interchangeable where appropriate; they merely describe how objects of the same nature are distinguished in the embodiments of the present application. Furthermore, the terms "comprise" and "have" and any variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

As shown in fig. 1, the present application provides a database system management method based on a master-slave replication protocol, which includes the following steps:

S110, determining an initial master node and initial standby nodes in a declarative manner, and adding role information of the initial master and standby nodes to the labels and annotations of the corresponding objects;

S120, when an abnormality occurs in the initial master node or the sequence number of the initial master node in the cluster object is modified, selecting a new master node according to a preset master selection strategy;

and S130, after the new master node is determined, performing role switching on the initial master and standby nodes according to a preset switching strategy, and updating the role information in the labels and annotations.

In a K8s cluster, the nodes fall into three main types: the master node primary, which supports reads and writes; the standby node secondary, which keeps its data synchronized with the master node, can be actively or passively switched to primary, and does not provide external service; and the read-only node readonly, which keeps its data synchronized with the master node, can provide external read-only service, and can be actively or passively switched to primary. In this embodiment, secondary and readonly are collectively called standby nodes.

Illustratively, the sequence number of the initial master node is specified by a primary index field in the declarative API.

In this embodiment, the user may specify the sequence number (index) of the initial master node through the primary index field in the declarative API to determine which node becomes the initial master node. The primary index field may equally be defined as a primary name field with the same meaning, and is not limited to these forms. The remaining nodes automatically become initial standby nodes. If nothing is specified, the node with index=0 is the initial master node by default. To support mounting different configuration files into Pods according to rules such as roles, this embodiment uses a separate StatefulSet in the topology to describe each initial master or standby node: a deployment with one master and N standbys has 1+N StatefulSets, each StatefulSet contains one Pod, each StatefulSet may carry different configuration files, and the Pod specifications of the StatefulSets need not be identical.
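The role assignment and the one-StatefulSet-per-node topology described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: `ClusterSpec`, `primary_index`, `initial_roles`, and `plan_statefulsets` are hypothetical names, the plan is a plain dictionary rather than a real K8s object, and every standby is assumed to start as `secondary`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterSpec:
    """Hypothetical declarative spec; all field names are illustrative."""
    name: str
    replicas: int            # 1 master + N standbys
    primary_index: int = 0   # index 0 is the default initial master


def initial_roles(spec: ClusterSpec) -> Dict[int, str]:
    """Map each node index to its initial role."""
    if not 0 <= spec.primary_index < spec.replicas:
        raise ValueError("primary_index must refer to an existing node")
    return {
        i: "primary" if i == spec.primary_index else "secondary"
        for i in range(spec.replicas)
    }


def plan_statefulsets(spec: ClusterSpec, config_by_role: Dict[str, str]) -> List[dict]:
    """Plan one single-Pod StatefulSet per node so each can mount its own config."""
    return [
        {
            "name": f"{spec.name}-{i}",
            "replicas": 1,
            "labels": {"role": role},
            "config": config_by_role[role],
        }
        for i, role in initial_roles(spec).items()
    ]
```

Because every planned StatefulSet holds exactly one Pod, a per-role configuration file can be attached without the other nodes inheriting it.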

After the initial master-standby relationship in the database system is determined, the system adds the role information of the master and standby nodes to the labels and annotations of the corresponding K8s objects. The information in the labels and annotations is mounted into the Pod as a volume through the DownwardAPI, and the Pod then executes different startup commands according to the role of its node, which completes the initialization of the master-standby database system. At this point the system contains one master node and multiple standby nodes.
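A Pod-side startup step of this kind might look like the sketch below. It assumes the labels are projected as a Downward API volume file with one `key="value"` pair per line; the label key `role` and the commands in `START_COMMANDS` are hypothetical placeholders, not commands of any real database.

```python
from typing import Dict, List

# Hypothetical start commands per role; real commands depend on the database.
START_COMMANDS: Dict[str, List[str]] = {
    "primary": ["start-db", "--mode=read-write"],
    "secondary": ["start-db", "--mode=replica", "--no-serve"],
    "readonly": ["start-db", "--mode=replica", "--read-only"],
}


def parse_downward_labels(text: str) -> Dict[str, str]:
    """Parse a Downward API labels file: one key="value" pair per line."""
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        labels[key] = value.strip('"')
    return labels


def start_command(labels_text: str, role_key: str = "role") -> List[str]:
    """Pick the start command that matches the role found in the labels file."""
    role = parse_downward_labels(labels_text).get(role_key)
    if role not in START_COMMANDS:
        raise ValueError(f"unknown or missing role: {role!r}")
    return START_COMMANDS[role]
```

In a real Pod the text would be read from the mounted file before calling `start_command`; the mount path is deployment-specific and is left out here.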

Normally, the roles of the nodes in the system are stable, but they change when a master-standby switching event is triggered. Such an event is triggered in two cases. First, the initial master node fails. The initial master node may fail for various reasons, such as downtime, and a new master node must then be selected from the initial standby nodes to continue serving externally; this triggers a Failover event. Preferably, a Node abnormality among the K8s native events, a DB process abnormality among the Probe abnormal events, or a DB node state abnormality (read-write detection failure) may each trigger Failover. Second, the user needs to switch the current initial master node away manually because of service requirements, for example when the current initial master node is in availability zone A and the user wants to move the master node to availability zone B to reduce service latency. An initial standby node that meets the conditions must then be selected and switched to; this triggers a Switchover event. Because this embodiment determines the master node through the primary index field in the declarative API, the user only needs to modify the primary index of the Cluster object to designate the new master node, and the system triggers the Switchover flow without a manual switch.
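The two triggers can be told apart with a small decision function. The sketch below is illustrative only: `detect_switch_event` and its parameters are hypothetical, and giving Failover precedence when both conditions hold is an assumption, not something the text specifies.

```python
from typing import Optional


def detect_switch_event(
    current_primary: int,
    declared_primary: int,
    primary_healthy: bool,
) -> Optional[str]:
    """Classify the trigger for a role change, or return None if roles are stable.

    current_primary:  index of the node currently acting as master
    declared_primary: index currently declared in the cluster object
    primary_healthy:  aggregated result of node, process, and read-write probes
    """
    if not primary_healthy:
        # The master is abnormal: a standby must take over (assumed to take precedence).
        return "Failover"
    if declared_primary != current_primary:
        # The user edited the declared index: planned, active switch.
        return "Switchover"
    return None
```

For example, editing the declared index from 0 to 1 while node 0 is healthy yields `"Switchover"`, whereas a failed read-write probe on node 0 yields `"Failover"` regardless of the declared index.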

Whether a Failover event or a Switchover event is triggered, a new master node must be selected according to a preset master selection policy. Preferably, the master selection policies include a weight priority policy, a same-region or same-machine-room priority policy, and a minimum-node-data-delay priority policy, and custom extensions are also possible. Which policy is used depends on actual needs; for example, a healthy initial standby node with a small data delay may be selected as the new master node. If no initial standby node meets the conditions, the selection exits with a failure and an event alert is raised.
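The three policies can be expressed as interchangeable ranking functions over the healthy standbys. The following is a hedged sketch: the `Standby` fields, the policy names in `POLICIES`, and the tie-breaking rule (the first candidate in list order wins) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Standby:
    """Hypothetical view of a standby node used for selection."""
    name: str
    healthy: bool
    weight: int
    zone: str
    delay_ms: float


# Each policy returns a sort key; the candidate with the smallest key wins.
POLICIES: Dict[str, Callable[[Standby, str], float]] = {
    "weight": lambda n, old_zone: -n.weight,
    "same_zone": lambda n, old_zone: 0 if n.zone == old_zone else 1,
    "min_delay": lambda n, old_zone: n.delay_ms,
}


def select_new_primary(
    standbys: List[Standby], policy: str, old_zone: str
) -> Optional[Standby]:
    """Return the best healthy standby, or None so the caller can raise an alert."""
    candidates = [n for n in standbys if n.healthy]
    if not candidates:
        return None
    key = POLICIES[policy]
    return min(candidates, key=lambda n: key(n, old_zone))
```

A custom policy can be added by registering another key function in `POLICIES`, which mirrors the custom extension mentioned above.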

Illustratively, the switching policy is defined by a SwitchPolicy field in the declarative API, and when switchpolicy=noop, the high availability switching action is implemented by an external component.

After the new master node is determined, the switching flow is triggered automatically. In this embodiment, the switching policy is defined by the switchPolicy field in the declarative API. When switchPolicy=noop, the high-availability (HA) switching action is implemented by an external component, and the system is only responsible for maintaining the role information of the Pods after the external component has switched. In other words, switchPolicy can accommodate mainstream external HA schemes, which makes it convenient for users to integrate an existing HA scheme into the system.
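The split of responsibilities under the noop policy can be sketched as a dispatch. This is an assumption-laden illustration: `reconcile_roles`, the `Roles` mapping, and the idea that the built-in path returns the resulting roles are all hypothetical.

```python
from typing import Callable, Dict

Roles = Dict[str, str]  # Pod name -> role


def reconcile_roles(
    switch_policy: str,
    observed_kernel_roles: Roles,
    builtin_switch: Callable[[], Roles],
) -> Roles:
    """Return the role information to record on the Pods after a switch.

    With the noop policy an external HA component performs the switch, so the
    system only mirrors the roles it observes in the database kernel. With any
    other policy the built-in switch action runs and reports the new roles.
    """
    if switch_policy.lower() == "noop":
        return dict(observed_kernel_roles)
    return dict(builtin_switch())
```

Under noop the built-in action is never invoked, so an existing external HA tool stays the single actor that changes kernel roles.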

Illustratively, it is judged whether the role information of the initial master node and that of the new master node are consistent; if so, the data synchronization delay between the initial master node and the new master node is acquired and the environment variables on which the switch depends are generated, wherein the role information comprises the role in the kernel and the role on the corresponding label.

When switching, it is first judged whether the role information of the initial master node and that of the new master node are consistent, where the role information includes the role in the kernel and the role on the corresponding label. If so, the data synchronization delay between the initial master node and the new master node is acquired, the environment variables on which the switch depends are generated, and the switching action is then carried out according to the switchstate and switchstep defined by the user in YAML; if not, the switch is not performed. After the switch completes, the role information in the corresponding labels and annotations is updated to the post-switch roles.
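The pre-switch check and the label update can be sketched as two small functions. The reading of "consistent" as "the kernel role equals the label role for both nodes" is an interpretation, and the environment variable names (`OLD_PRIMARY`, `NEW_PRIMARY`, `SYNC_DELAY_MS`) are hypothetical.

```python
from typing import Dict, Optional

Roles = Dict[str, str]  # node name -> role


def prepare_switch(
    old_primary: str,
    new_primary: str,
    kernel_roles: Roles,
    label_roles: Roles,
    sync_delay_ms: float,
) -> Optional[Dict[str, str]]:
    """Return the environment for the switch steps, or None to skip the switch."""
    for node in (old_primary, new_primary):
        # Assumed meaning of "consistent": kernel role and label role agree.
        if kernel_roles.get(node) != label_roles.get(node):
            return None
    return {
        "OLD_PRIMARY": old_primary,
        "NEW_PRIMARY": new_primary,
        "SYNC_DELAY_MS": str(sync_delay_ms),
    }


def roles_after_switch(label_roles: Roles, old_primary: str, new_primary: str) -> Roles:
    """Compute the role labels to write once the switch has completed."""
    updated = dict(label_roles)
    updated[old_primary] = "secondary"
    updated[new_primary] = "primary"
    return updated
```

The returned dictionary would be passed to the user-defined switch steps; how those steps consume it is outside this sketch.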

In addition, K8s exposes services externally through the Service component, and a Service uses a Label Selector to choose the Pods that serve external traffic. In this embodiment, the Pod corresponding to each database node carries a role Label, and the Pod Labels are kept correctly updated in high-availability scenarios. A user therefore only needs to map different Services to Pods with different Labels in order to access the nodes differentially through different Services.
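The effect of role labels on Service routing can be modelled with plain equality-based selector matching. The sketch below only imitates the matching rule in memory; `select_pods`, the Pod names, and the `role` label key are illustrative assumptions.

```python
from typing import Dict, List

Labels = Dict[str, str]


def select_pods(pods: Dict[str, Labels], selector: Labels) -> List[str]:
    """Return the Pods whose labels contain every key/value pair in the selector."""
    return [
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


pods = {
    "db-0": {"app": "db", "role": "primary"},
    "db-1": {"app": "db", "role": "readonly"},
    "db-2": {"app": "db", "role": "secondary"},
}

# A read-write Service and a read-only Service differ only in their selectors.
read_write_backends = select_pods(pods, {"app": "db", "role": "primary"})
read_only_backends = select_pods(pods, {"app": "db", "role": "readonly"})
```

After a switch, updating the `role` label on the affected Pods is enough for the same selectors to resolve to the new master, which is why correct label maintenance matters.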

As shown in fig. 2, the present application further provides a database system management device based on a master-slave replication protocol, including:

an initialization module 10, configured to determine the initial master and standby nodes in a declarative manner, and add information of the initial master and standby nodes to the labels and annotations of the corresponding objects;

a reselection module 20, configured to select a new master node according to a preset master selection policy when an abnormality occurs in the initial master node or the sequence number of the initial master node in the cluster object is modified;

and a switching module 30, configured to perform role switching on the initial master and standby nodes according to a preset switching strategy after the new master node is determined, and to update the information in the labels and annotations.

One embodiment of the above device may be as follows: the initialization module 10 determines the initial master and standby nodes in a declarative manner and adds information of the initial master and standby nodes to the labels and annotations of the corresponding objects; the reselection module 20 selects a new master node according to a preset master selection policy when an abnormality occurs in the initial master node or the sequence number of the initial master node in the cluster object is modified; after the reselection module 20 determines the new master node, the switching module 30 performs role switching on the initial master and standby nodes determined by the initialization module 10 according to a preset switching strategy, and updates the information in the labels and annotations.

In a preferred embodiment, a switching module 30 of a database system management device based on a master-slave replication protocol further includes:

a judging unit 31, configured to judge whether the role information of the initial master node and that of the new master node are consistent and, if so, to acquire the data synchronization delay between the initial master node and the new master node and generate the environment variables on which the switch depends, where the role information includes the role in the kernel and the role on the corresponding label.

As shown in fig. 3, the present application further provides an electronic device, including a memory 301 and a processor 302, where the memory 301 is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor 302 to implement a database system management method based on a master-slave replication protocol.

It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.

The present application also provides a computer-readable storage medium storing a computer program that causes a computer to implement a database system management method based on a master-slave replication protocol as described above when executed.

By way of example, a computer program may be divided into one or more modules/units, which are stored in the memory 301 and executed by the processor 302, with the input interface 305 and the output interface 306 handling data transfer, so as to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and these segments describe the execution of the computer program in the computer device.

The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or the like. The computer device may include, but is not limited to, the memory 301 and the processor 302. Those skilled in the art will understand that this embodiment is merely an example of a computer device and does not limit it; the computer device may include more or fewer components, combine certain components, or use different components. For example, the computer device may also include an input 307, a network access device, a bus, and the like.

The processor 302 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor 302 may be any conventional processor.

The memory 301 may be an internal storage unit of the computer device, such as a hard disk or memory of the computer device. The memory 301 may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the computer device. Further, the memory 301 may include both an internal storage unit and an external storage device of the computer device. The memory 301 may be used to store the computer program and other programs and data required by the computer device, and may also be used to temporarily store the programs and data in the output 308. The aforementioned storage media include a USB flash drive, a removable hard disk, a read-only memory ROM 303, a random access memory RAM 304, a magnetic disk, an optical disk, and various other media that can store program code.

The foregoing is merely illustrative of specific embodiments of the present invention, and the scope of the present invention is not limited thereto, but any changes or substitutions within the technical scope of the present invention should be covered by the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A database system management method based on a master-slave replication protocol is characterized by comprising the following steps:

2. The method for managing a database system based on a primary-backup replication protocol according to claim 1, wherein determining an initial primary-backup node based on a declarative manner comprises: the sequence number of the initial master node is specified by the primary index field in the declarative API.

3. The method for managing a database system based on a master-slave replication protocol according to claim 1, further comprising:

4. The method for managing a database system based on a master-slave replication protocol according to claim 1, wherein the master-selection policy includes a weight priority policy, a co-regional or co-machine room priority policy, and a node data delay minimum priority policy.

5. The method according to claim 1, wherein the switching policy is defined by a SwitchPolicy field in a declarative API, and the high availability switching action is implemented by an external component when switchpolicy=noop.

6. The method for managing a database system based on a master-slave replication protocol according to claim 1, wherein performing role switching on the initial master-slave node according to a preset switching policy comprises:

7. A database system management device based on a master-slave replication protocol, comprising:

8. The database system management apparatus according to claim 7, wherein the switching module comprises:

9. An electronic device comprising a memory and a processor, the memory configured to store one or more computer instructions, wherein the one or more computer instructions are executable by the processor to implement a master-slave replication protocol based database system management method as recited in any one of claims 1-6.

10. A computer-readable storage medium storing a computer program, wherein the computer program when executed causes a computer to implement a database system management method based on a master-slave replication protocol according to any one of claims 1 to 6.