CN110069365B - Method for managing database and corresponding device, computer readable storage medium - Google Patents


Info

Publication number
CN110069365B
CN110069365B (application CN201910353959.6A)
Authority
CN
China
Prior art keywords
database
slave
management server
databases
master database
Prior art date
Legal status
Active
Application number
CN201910353959.6A
Other languages
Chinese (zh)
Other versions
CN110069365A (en)
Inventor
姜承尧
赖明星
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910353959.6A
Publication of CN110069365A
Application granted
Publication of CN110069365B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00: Error detection; Error correction; Monitoring
    • G06F 11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/14: Error detection or correction of the data by redundancy in operation
    • G06F 11/1402: Saving, restoring, recovering or retrying
    • G06F 11/1446: Point-in-time backing up or restoration of persistent data
    • G06F 11/1448: Management of the data involved in backup or backup restore
    • G06F 11/1458: Management of the backup or restore process
    • G06F 11/1464: Management of the backup or restore process for networked environments
    • G06F 11/1466: Management of the backup or restore process to make the backup process non-disruptive
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval of structured data, e.g. relational data
    • G06F 16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method of managing a database, together with a corresponding server and computer-readable storage medium. The method comprises the following steps: receiving operation state information of a master database; determining the operation state of the master database according to that information; when the master database is determined to have failed, selecting one slave database from at least two slave databases as a new master database; and sending information about the new master database to the management server agent modules respectively corresponding to the other slave databases among the at least two slave databases, so that those agent modules control the other slave databases to back up data from the new master database.

Description

Method for managing database and corresponding device, computer readable storage medium
Technical Field
The present disclosure relates to the field of databases, and in particular, to a method for managing a database and a corresponding apparatus, computer-readable storage medium.
Background
As an infrastructure for storing data, databases are widely used in products across the Internet field. For products characterized by large data volumes, strict response-latency requirements, and complex services, database clusters have been adopted to support the service load, especially instantaneous ultra-high load. A database cluster may include one master database responsible for writing data and one or more slave databases responsible for reading data; the slave databases back up data (which may also be referred to as synchronizing or replicating data) from the master database. In a database cluster, when the master database fails, a slave database can be promoted to master, reducing business downtime and maintaining high availability of the database service.
To ensure data consistency across the databases in a cluster and to improve the cluster's operability, several general-purpose database management systems have been proposed. Currently widely used systems include the keepalive system, cloud database systems, and the Master High Availability (MHA) system. However, these existing systems have various drawbacks: they may be unable to scale out slave databases to meet the read-write demands of ultra-high loads, lack sufficient self-healing capability for managing large numbers of database clusters, suffer from security problems, or fail to meet the compliance requirements of financial scenarios.
Disclosure of Invention
To this end, the present disclosure provides a method for managing a database and corresponding apparatus, computer-readable storage medium.
According to one aspect of the present disclosure, a method for managing a database is provided. The method comprises the following steps: receiving operation state information of a master database; determining the operation state of the master database according to that information; when the master database is determined to have failed, selecting one slave database from at least two slave databases as a new master database; and sending information about the new master database to the management server agent modules respectively corresponding to the other slave databases among the at least two slave databases, so that those agent modules control the other slave databases to back up data from the new master database.
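The steps above can be sketched as a minimal illustration in Python. All class, method, and database names here are hypothetical, not from the patent; the selection policy is stubbed out (the disclosure details it separately):

```python
# Minimal sketch of the claimed failover flow. Class, method, and
# database names are illustrative assumptions, not from the patent.

class ManagementServer:
    def __init__(self, master, slaves):
        self.master = master   # id of the current master database
        self.slaves = slaves   # ids of the at-least-two slave databases

    def on_state_info(self, state_info):
        """Receive state info, determine the master's state, and fail
        over when the master has failed (the claimed method steps)."""
        if state_info["status"] != "failed":
            return None
        # Select one slave as the new master (the real selection policy
        # is detailed later in the disclosure; first slave used here).
        new_master = self.slaves[0]
        others = [s for s in self.slaves if s != new_master]
        self.master = new_master
        self.slaves = others
        # Information sent to the agent modules of the other slaves so
        # they re-point their databases to replicate from the new master.
        return {s: {"replicate_from": new_master} for s in others}

server = ManagementServer("db420", ["db430", "db440", "db450"])
msgs = server.on_state_info({"status": "failed"})
print(server.master)  # db430
print(sorted(msgs))   # ['db440', 'db450']
```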
According to another aspect of the present disclosure, an apparatus for managing a database is provided. The apparatus comprises: a receiving unit configured to receive operation state information of a master database; a determining unit configured to determine the operation state of the master database according to that information; a selection unit configured to select one slave database from at least two slave databases as a new master database when the determining unit determines that the master database has failed; and a transmitting unit configured to transmit information about the new master database to the management server agent modules respectively corresponding to the other slave databases among the at least two slave databases, those agent modules controlling the other slave databases to back up data from the new master database.
According to one example of the present disclosure, the receiving unit is configured to receive the operation state information of the master database from a management server agent module corresponding to the master database.
According to one example of the present disclosure, the receiving unit is configured to receive the operation state information of the master database from a management server agent module corresponding to at least one of the at least two slave databases.
According to one example of the present disclosure, the selection unit is configured to determine candidate slave databases from the at least two slave databases; and selecting one slave database from the candidate slave databases as a new master database.
According to one example of the present disclosure, the selection unit is configured to determine those of the at least two slave databases whose transaction identifications have been acquired by a management server, where the management server manages the master database and the at least two slave databases; to judge whether the number of determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases; when it is, to take the determined slave databases as candidate slave databases; and to select one slave database from the candidates as the new master database.
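This quorum-style check can be sketched as follows. The 0.5 proportion and the integer transaction identifiers are illustrative assumptions; the patent leaves both unspecified:

```python
# Sketch of the check described above: failover proceeds only when the
# management server has acquired transaction identifications from more
# than a predetermined proportion of the slaves. The 0.5 proportion is
# an illustrative assumption, not from the patent.

def candidate_slaves(acquired_gtids, all_slaves, proportion=0.5):
    """acquired_gtids: {slave_id: transaction_id} fetched successfully.
    Returns the candidate slaves, or [] when too few slaves responded."""
    determined = [s for s in all_slaves if s in acquired_gtids]
    if len(determined) > proportion * len(all_slaves):
        return determined
    return []

slaves = ["db430", "db440", "db450"]
print(candidate_slaves({"db430": 105, "db440": 108}, slaves))  # 2 of 3: quorum
print(candidate_slaves({"db430": 105}, slaves))                # 1 of 3: []
```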
According to one example of the present disclosure, each slave database has its priority to act as a master database; the selection unit is configured to select one of the candidate slave databases as a new master database based at least on the priority of each of the candidate slave databases.
According to one example of the present disclosure, the selection unit is configured to select one slave database from the candidate slave databases as the new master database according to the priority of each of the candidate slave databases and the transaction identification.
According to one example of the present disclosure, the selecting unit is configured to determine whether a slave database having a highest priority among the candidate slave databases has a largest transaction identification; and when the slave database with the highest priority has the largest transaction identification, taking the slave database with the highest priority as a new master database.
According to one example of the present disclosure, when the slave database with the highest priority does not have the largest transaction identification, the selecting unit is configured to control that slave database, through its corresponding management server agent module, to back up data from the slave database with the largest transaction identification until the highest-priority slave database also has the largest transaction identification, and then to take the slave database with the highest priority as the new master database.
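The selection policy of the preceding examples can be sketched like this. The integer "gtid" values are an illustrative simplification of transaction identifications, and the function name is hypothetical:

```python
# Sketch of the selection policy above: take the highest-priority
# candidate; if it lacks the largest transaction identification, have
# it first back up from the most up-to-date slave. Integer "gtid"
# values are an illustrative simplification, not from the patent.

def pick_new_master(candidates):
    """candidates: {slave_id: {"priority": int, "gtid": int}}.
    Returns (new_master, catch_up_source_or_None)."""
    best = max(candidates, key=lambda s: candidates[s]["priority"])
    max_gtid = max(c["gtid"] for c in candidates.values())
    source = None
    if candidates[best]["gtid"] < max_gtid:
        # The highest-priority slave replicates the missing transactions
        # from the slave holding the largest identification.
        source = max(candidates, key=lambda s: candidates[s]["gtid"])
        candidates[best]["gtid"] = max_gtid
    return best, source

cands = {"db430": {"priority": 3, "gtid": 100},
         "db440": {"priority": 1, "gtid": 108}}
print(pick_new_master(cands))  # ('db430', 'db440')
```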
According to one example of the present disclosure, the transmitting unit is configured to transmit configuration information to a management server proxy module corresponding to the new master database, wherein the configuration information indicates a replication mechanism between the new master database and the other slave databases, so that the management server proxy module corresponding to the new master database controls the new master database to employ the replication mechanism.
According to one example of the present disclosure, the transmitting unit is configured to transmit repair information to a management server agent module corresponding to the master database so that the management server agent module corresponding to the master database repairs the master database.
According to another aspect of the present disclosure, an apparatus for managing a database is provided. The apparatus includes a processor and a memory, the memory storing computer-readable code which, when executed by the processor, performs the above method.
According to another aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon instructions which, when executed by a processor, cause the processor to perform the above-described method.
With the above method, corresponding apparatus, and computer-readable storage medium, when the master database fails, one slave database can be selected as the new master database to ensure high availability of the database service. Moreover, because the management server operates the databases through management server agent modules, it avoids relying on SSH to operate the databases and thus meets the compliance requirements of financial scenarios.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments thereof with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of embodiments of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. In the drawings, like reference numerals generally refer to like parts or steps.
Fig. 1 shows a schematic architecture diagram of a keepalive system.
Fig. 2 shows an architectural schematic diagram of a cloud database system.
Fig. 3 shows a schematic architecture of an MHA system.
Fig. 4A is a schematic architecture diagram of a database management system according to an embodiment of the present disclosure.
Fig. 4B is a schematic diagram of a database in a database management system according to an embodiment of the present disclosure.
Fig. 5 is a flowchart of a method performed by a management server according to an embodiment of the present disclosure.
Fig. 6 is an exemplary specific flow of applying the architecture shown in fig. 4A, according to an embodiment of the present disclosure.
Fig. 7 is a schematic diagram of an architecture implementing high availability of management servers and high availability of system libraries according to an embodiment of the present disclosure.
Fig. 8 illustrates a schematic structure of a management server according to an embodiment of the present disclosure.
Fig. 9 illustrates an architecture of a computer device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure more apparent, exemplary embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. In the drawings, like reference numerals refer to like elements throughout. It should be understood that: the embodiments described herein are merely illustrative and should not be construed as limiting the scope of the present disclosure.
First, the existing keepalive system, cloud database system, and MHA system are described with reference to figs. 1 to 3, respectively. Fig. 1 shows a schematic architecture diagram of a keepalive system. As shown in fig. 1, the keepalive system includes a master database 110 and a slave database 120. When the master database 110 fails, the slave database 120 may be promoted to a new master database. Because the keepalive system supports only one master and one slave and cannot scale out slave databases, it is unsuitable for application scenarios with many database instances or heavy database load. Moreover, in the keepalive system, a failed master database has no self-healing capability.
Compared with the keepalive system, the cloud database system has several advantages. Specifically, it adds one or more read-only slave databases on top of the master-slave pair, expanding the slave databases and providing better read scalability. In addition, the cloud database system includes a sophisticated self-healing component that repairs the master database when it fails.
Fig. 2 shows an architectural schematic diagram of a cloud database system. As shown in fig. 2, the cloud database system includes one master database 210, one slave database 220, and three read-only slave databases 230, 240, and 250 (i.e., the databases connected to the master database 210 in fig. 2 by dotted lines). When the master database 210 fails, only the slave database 220 may be promoted to master. In addition, after the failed master database 210 self-heals, it may back up data from the slave database 220 as a slave of the slave database 220.
While the cloud database system overcomes the drawbacks of the keepalive system, it is essentially a one-master-one-slave high availability system and has drawbacks of its own. For example, if the master database and the slave database fail at the same time, high availability of the database service cannot be guaranteed. Because of hard limits imposed by the cloud database system, the number of read-only slave databases is capped, so the slave databases cannot be expanded on demand. In addition, the machine room hosting a read-only slave database cannot be customized, so business-specific requirements on the geographic location of the machine room cannot be met.
The Master High Availability (MHA) system is a true one-master-multi-slave high availability system. Fig. 3 shows a schematic architecture of an existing MHA system. As shown in fig. 3, the MHA system includes a master database 310 and four slave databases 320, 330, 340, and 350. When the master database 310 fails, any one of the slave databases 320, 330, 340, and 350 may be promoted to master.
However, MHA systems are implemented as scripts, which cannot support the complex series of operations needed for database self-healing; databases in MHA systems therefore have no self-healing capability. Further, the MHA system is configured with a management device that operates on the databases via the Secure SHell protocol (SSH). For example, the management device may periodically probe the master database in the cluster through SSH; when a failure is detected, it selects one of the slave databases, promotes it to master, and instructs the remaining slave databases, via SSH, to back up data from the new master. However, SSH has security drawbacks, for example susceptibility to man-in-the-middle attacks, and its use is therefore prohibited in financial settings.
In view of the various problems with existing database management systems, the present disclosure proposes a method for managing databases in a one-master-multi-slave architecture, along with a corresponding server and computer-readable storage medium. The proposed database management system overcomes the drawbacks of the existing systems.
The architecture of the database management system of the present disclosure is described below with reference to fig. 4A-4B. Fig. 4A is a schematic architecture diagram of a database management system according to an embodiment of the present disclosure. As shown in fig. 4A, the database management system includes a management server 410, a master database 420, slave databases 430 to 450, a management server agent module 410-2 corresponding to the master database 420, a management server agent module 410-3 corresponding to the slave database 430, a management server agent module 410-4 corresponding to the slave database 440, and a management server agent module 410-5 corresponding to the slave database 450.
In the architecture shown in fig. 4A, various databases may be deployed at a certain node, e.g., on a certain server, in the manner of an application. For example, the databases may be deployed on a server in the manner of existing MySQL. In addition, the management server agent modules corresponding to the respective databases may also be deployed at a certain node in the manner of an application program. In addition, each database and the management server agent module corresponding to each database may be deployed at the same node, or may be deployed at different nodes. Further, the master database 420 and the slave databases 430 to 450 may be deployed at nodes different from each other.
In addition, in the architecture shown in fig. 4A, the management server 410 operates the master database 420 and the slave databases 430 to 450 through the management server agent modules 410-2 to 410-5, respectively, thereby managing the master database 420 and the slave databases 430 to 450. In particular, the management server 410 controls the operations to be performed on a database: when it is to operate on a database, it sends an instruction to the management server agent module corresponding to that database, and the agent module then operates on the database according to the received instruction, for example in the SQL language. The "operations" here may include one or more of switching, modifying, creating, deleting, installing, restarting, uninstalling, starting replication, setting replication mechanisms, and the like. With the architecture shown in fig. 4A, when the master database fails, one of the slave databases may be selected as the new master database to ensure high availability of the database service. Moreover, under this architecture a failed master database has self-healing capability, which reduces manual operations and improves operation and maintenance efficiency. Finally, this architecture avoids operating databases via SSH and so meets the compliance requirements of financial scenarios.
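The instruction-to-SQL pattern described above can be sketched as follows. The instruction names and the MySQL-flavoured SQL templates are illustrative assumptions; the patent does not specify a wire format:

```python
# Sketch of the agent pattern described above: the management server
# sends a named instruction, and the database's agent module translates
# it into SQL against its local database. The instruction names and the
# MySQL-flavoured SQL templates are illustrative assumptions.

class AgentModule:
    SQL = {  # instruction -> SQL template
        "stop_replication":  "STOP SLAVE",
        "start_replication": "START SLAVE",
        "point_to_master":   "CHANGE MASTER TO MASTER_HOST='{host}'",
    }

    def __init__(self):
        self.executed = []  # stand-in for a real database connection

    def handle(self, instruction, **params):
        sql = self.SQL[instruction].format(**params)
        self.executed.append(sql)  # a real agent would execute this SQL
        return sql

agent = AgentModule()
agent.handle("stop_replication")
agent.handle("point_to_master", host="10.0.0.2")
print(agent.executed)
```

Because the server only ever speaks this narrow instruction protocol to the agents, no SSH channel into the database hosts is needed, which is the compliance point made above.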
In addition, the database management system shown in fig. 4A may include a system library 410-1 in which the management server 410 stores instance metadata, persisting information about the database instances.
In addition, the database management system shown in FIG. 4A may also include one or more read-only slave databases to enable expansion of the slave databases, thereby providing better read expansion capabilities. Fig. 4A shows that the database management system includes a read-only slave database, namely slave database 460. In addition, the architecture may also include a management server agent module 410-6 corresponding to the slave database 460.
Furthermore, in the architecture shown in fig. 4A, the slave databases may be located in different geographic locations. For example, databases 430 through 450 may be located in a first city and database 460 in a second city. The advantage of this arrangement is that a service in the second city reading the cluster's data need only read the slave database in the second city rather than one in the first city, avoiding cross-city access and reducing the cross-city network delay from the second city to the first city.
Fig. 4B is a schematic diagram of a database in a database management system according to an embodiment of the present disclosure. As shown in fig. 4B, the database management system has five databases, namely a master database 420, slave databases 430 through 450, and a read-only slave database 460, wherein the slave databases 430 through 450 are located in a first city (as shown by the white boxes in fig. 4B), and the read-only slave database 460 is located in a second city (as shown by the gray boxes in fig. 4B). The master database 420 is responsible for writing data, the slave databases 430, 440, 450 and the read-only slave database 460 are responsible for reading data, and the slave databases 430, 440, 450 and the read-only slave database 460 may backup data from the master database 420. When the master database 420 fails, any one of the slave databases 430, 440, and 450 may be promoted as the master database, while the read-only slave database 460 cannot be promoted as the master database. For example, when the master database 420 fails, the slave database 440 is promoted to the master database, the slave database 440 takes charge of writing of data, the slave databases 430, 450 and the read-only slave database 460 take charge of reading of data, and the slave databases 430, 450 and the read-only slave database 460 can backup data from the slave database 440.
It should be appreciated that while one management server, one system library, one master database, and four slave databases are shown in fig. 4A and 4B, this is illustrative only. The architecture may include more management servers, and/or more system libraries, and/or more master databases, and/or fewer or more slave databases.
The method performed by the management server shown in fig. 4A will be described below in conjunction with fig. 5. Specifically, a method of master-slave database switching performed by the management server shown in fig. 4A will be described in connection with fig. 5. Fig. 5 is a flowchart of a method 500 performed by a management server according to an embodiment of the present disclosure. In method 500, a management server is used to manage one master database and at least two slave databases. The "at least two slave databases" herein may not include a read-only slave database. For example, in the example of fig. 4A, "at least two slave databases" may include slave databases 430 through 450.
As shown in fig. 5, in step S501, the management server receives operation state information of the master database. This information indicates the operation state of the master database: either the master database has not failed, or the master database has failed.
According to one example of the present disclosure, when a node deploying the master database and the management server agent module corresponding to the master database does not fail, the management server may receive operation state information of the master database from the management server agent module corresponding to the master database. In this example, the management server may periodically receive the operational status information of the primary database from the management server agent module corresponding to the primary database. Alternatively, the management server may receive the operation state information of the master database from the management server agent module corresponding to the master database non-periodically.
When the management server periodically receives the operation state information of the master database from the corresponding management server agent module, it may do so by periodically receiving heartbeat information of the master database from that agent module, the heartbeat information including the operation state information of the master database.
For example, a heartbeat mechanism between the management server and the master database may be preset. Under this mechanism, the master database, as a client of the management server, periodically reports heartbeat information that includes its operation state information. Specifically, the master database periodically generates its operation state information according to its running state and includes it in its heartbeat information; the management server agent module corresponding to the master database then transmits the heartbeat information to the management server, which thereby receives the operation state information of the master database.
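The heartbeat mechanism can be sketched like this. The message fields, the health probe, and the elided reporting period are all hypothetical details:

```python
# Sketch of the heartbeat mechanism above: the master periodically
# packages its operation state into heartbeat messages that its agent
# module forwards to the management server. The message fields and the
# health probe are illustrative assumptions, not from the patent.

def make_heartbeat(db_id, seq, healthy):
    # The heartbeat carries the master's operation state information.
    return {"from": db_id, "seq": seq,
            "state": "ok" if healthy else "failed"}

def heartbeats(db_id, probe, count):
    """Yield `count` periodic heartbeats (the period itself is elided)."""
    for seq in range(1, count + 1):
        yield make_heartbeat(db_id, seq, probe())

beats = list(heartbeats("db420", lambda: True, 3))
print(beats[0]["state"], beats[-1]["seq"])  # ok 3
```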
In the case where the management server receives the operation state information of the master database non-periodically, it may poll the management server agent module corresponding to the master database for that information.
However, when the node hosting the master database and its corresponding management server agent module fails, the management server may receive no operation state information from that agent module. The management server may therefore preset a waiting time (for example, 24 seconds). If, during the waiting time, it still receives no operation state information from the master's agent module, it may determine that the node hosting the master database and its agent module has failed, and instead obtain the operation state information of the master database by means of the slave databases. Conversely, when the operation state information arrives within the waiting time, the management server need not resort to the slave databases.
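The waiting-time fallback can be sketched as follows. The return strings and the unanimity rule for the slave observations are illustrative assumptions; only the 24-second example comes from the text above:

```python
# Sketch of the waiting-time fallback above: if no report from the
# master's agent module arrives within the preset window (24 seconds
# in the example), the server judges the master's state via the slave
# databases instead. Return strings are illustrative assumptions.

WAIT_SECONDS = 24  # preset waiting time from the example above

def determine_master_state(seconds_since_master_report, slave_observations):
    if seconds_since_master_report <= WAIT_SECONDS:
        return "use master agent report"
    # The master node (database plus agent module) is presumed down;
    # fall back on the states observed by the slaves' agent modules.
    if slave_observations and all(o == "failed" for o in slave_observations):
        return "master failed"
    return "undetermined"

print(determine_master_state(5, []))                     # use master agent report
print(determine_master_state(30, ["failed", "failed"]))  # master failed
```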
An example in which the management server receives the operation state information of the master database by means of the slave database will be described below. According to one example of the present disclosure, the management server may receive the operation state information of the master database from a management server agent module corresponding to at least one of the at least two slave databases. In this example, the management server may periodically receive the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases. Alternatively, the management server may receive the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases non-periodically.
In the case where the management server periodically receives the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases, the management server may periodically receive the heartbeat information of that at least one slave database from the corresponding management server agent module, wherein the heartbeat information of the at least one slave database includes the operation state information of the master database.
For example, the heartbeat mechanism between the management server and the slave database may be preset. Under this heartbeat mechanism, the slave database may periodically report heartbeat information of the slave database to the management server as a client of the management server, and the heartbeat information of the slave database includes operation state information of the master database. Specifically, the slave database may periodically detect the operation state of the master database, then generate operation state information of the master database according to the operation state of the master database, and include the operation state information of the master database in the heartbeat information of the slave database. Then, the management server agent module corresponding to the slave database may transmit heartbeat information of the slave database to the management server. Accordingly, the management server may receive heartbeat information of the slave database from the management server agent module corresponding to the slave database to receive operation state information of the master database.
Furthermore, the heartbeat information of the slave database may also include operation state information of the slave database. Specifically, the slave database may periodically generate the operation state information of the slave database according to the operation state thereof, and include the operation state information of the slave database in the heartbeat information of the slave database. Then, the management server agent module corresponding to the slave database may transmit heartbeat information of the slave database to the management server. Accordingly, the management server may receive heartbeat information of the slave database from the management server agent module corresponding to the slave database to receive operation state information of the slave database.
In the case where the management server receives the operation state information of the master database aperiodically from the management server agent module corresponding to at least one of the at least two slave databases, the management server may poll that agent module for the operation state information of the master database. For example, the management server may request the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases. The slave database may then detect the operation state of the master database in response to the request and generate operation state information of the master database based on that operation state. Then, the management server agent module corresponding to the slave database may transmit the operation state information of the master database to the management server.
Returning to fig. 5, in step S502, the management server determines the operation state of the master database according to the operation state information of the master database. When the operation state information of the master database indicates that the master database has not failed, the management server may determine that the operation state of the master database is that the master database has not failed. Conversely, when the operation state information of the master database indicates that the master database has failed, the management server may determine that the operation state of the master database is that the master database has failed.
Then, in step S503, when the management server determines that the operation state of the master database is that the master database has failed, the management server selects one slave database from the at least two slave databases as a new master database. For example, the management server may determine a candidate slave database from at least two slave databases, and then select one slave database from the candidate slave databases as a new master database. The "candidate slave database" herein may be one or more slave databases.
An exemplary flow of the management server determining candidate slave databases from at least two slave databases will be described below.
According to one example of the present disclosure, first, a management server may determine at least two slave databases whose transaction identifications are acquired by the management server. Specifically, the management server may preset a timeout time (e.g., 5 seconds) for acquiring the transaction identification. The management server may determine at least two slave databases whose transaction identities were acquired by the management server during the timeout period.
For example, in the example of fig. 4A, if the management server has acquired the transaction identification of the slave database 430 and that of the slave database 440 within the timeout period but not that of the slave database 450, the management server may determine that the slave databases whose transaction identifications were acquired within the timeout period are the slave database 430 and the slave database 440.
The "transaction identification" described above may include identification information of the transactions performed by the master database and identification information of the transactions performed by the slave database. For example, the transaction identification may be a global transaction identifier (Global Transaction IDentifiers, GTID). The management server may determine the transactions performed by the master database and the transactions performed by the slave database according to the transaction identification of the slave database, thereby determining the consistency (also called the coincidence) of the data of the slave database with the data of the master database. For example, when the transactions performed by the slave database are identical to the transactions performed by the master database, the management server may determine that the data of the slave database is identical to the data of the master database. When the transactions performed by the slave database only partially coincide with the transactions performed by the master database, the management server may determine that the data of the slave database is only partially identical to the data of the master database.
Furthermore, for any of the at least two slave databases whose transaction identification is not acquired by the management server, the management server may delete that slave database from the database cluster, e.g., kick it out of the database cluster. For example, in the example of fig. 4A, if the management server has acquired the transaction identification of the slave database 430 and that of the slave database 440 within the timeout period but not that of the slave database 450, the management server may kick the slave database 450 out of the database cluster.
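The timeout-bounded acquisition of transaction identifications and the removal of unresponsive slaves might be sketched as follows; `fetch_gtid` and the return shape are hypothetical helpers, not part of the disclosure:

```python
def collect_gtids(slaves, fetch_gtid):
    """Acquire each slave's transaction identification (e.g. GTID), and mark
    unresponsive slaves for removal from the database cluster.

    `fetch_gtid(slave)` is a hypothetical helper that returns the slave's
    GTID, or None if it could not be acquired within the preset timeout
    (5 seconds in the example above).
    """
    acquired, kicked = {}, []
    for slave in slaves:
        gtid = fetch_gtid(slave)
        if gtid is None:
            kicked.append(slave)   # kick this slave out of the database cluster
        else:
            acquired[slave] = gtid
    return acquired, kicked
```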
After the management server determines at least two slave databases whose transaction identifications were obtained by the management server, the management server may determine whether the determined number of slave databases is greater than a predetermined proportion (e.g., one half) of the number of the at least two slave databases. The management server may determine the determined slave database as a candidate slave database when the number of the determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases. When the number of the determined slave databases is smaller than a predetermined proportion of the number of the at least two slave databases, the management server does not determine the determined slave databases as candidate slave databases.
For example, in the example of fig. 4A, the slave databases determined by the management server are slave database 430 and slave database 440. Since the number of the slave databases determined by the management server is 2, the number of the at least two slave databases is 3, and the number of the determined slave databases is greater than one half of the number of the at least two slave databases, the management server may determine the slave databases 430 and 440 as candidate slave databases.
Also for example, in the example of fig. 4A, the slave database determined by the management server is slave database 430. Since the number of the slave databases determined by the management server is 1, the number of the at least two slave databases is 3, and the number of the determined slave databases is less than one half of the number of the at least two slave databases, the management server does not determine the slave database 430 as a candidate slave database.
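The quorum test in the two examples above (more than a predetermined proportion, one half by default) reduces to a one-line predicate; the names are illustrative:

```python
def quorum_reached(num_acquired: int, num_slaves: int, proportion: float = 0.5) -> bool:
    """Candidate slave databases exist only if the number of slaves whose
    transaction identifications were acquired exceeds the predetermined
    proportion (one half by default) of all the slave databases."""
    return num_acquired > num_slaves * proportion
```

With the fig. 4A numbers: two of three slaves answering passes the check, one of three does not.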
After the management server determines a candidate slave database from at least two slave databases, the management server may select one slave database from the candidate slave databases as a new master database. According to one example of the present disclosure, each slave database may have its priority to act as a master database. For example, the management server may set in advance a priority for each slave database to serve as a master database. For example, in the example of fig. 4A, the management server may set in advance to the slave database 430 a priority of 1 for functioning as the master database, to the slave database 440 a priority of 2 for functioning as the master database, and to the slave database 450 a priority of 3 for functioning as the master database.
In this example, the management server may select one of the candidate slave databases as the new master database based at least on the priority of each of the candidate slave databases. For example, the management server may select one of the candidate slave databases as the new master database based on the priority of each of the candidate slave databases and the transaction identification.
When selecting a new master database, the management server needs to take into account both the priority of each slave database and its transaction identification. This is because, in the ideal case, the slave database with the highest priority also has the largest transaction identification; that is, the coincidence of its data with the data of the master database is the highest. In that case, when the management server takes the slave database with the highest priority as the new master database, the other slave databases can backup data from it, so that data consistency of the database cluster is ensured. However, in non-ideal cases, the slave database with the highest priority may not have the largest transaction identification; that is, the coincidence of its data with the data of the master database is not the highest. In that case, if the management server simply took the slave database with the highest priority as the new master database, the other slave databases could still backup data from it, but data consistency of the database cluster could not be guaranteed. Therefore, the management server may select the new master database from the candidate slave databases according to both the priority and the transaction identification of each candidate slave database, thereby ensuring data consistency of the database cluster.
An exemplary flow of the management server selecting one of the candidate slave databases as the new master database based on the priority of each of the candidate slave databases and the transaction identification will be described below.
The management server may determine whether the candidate slave database having the highest priority has the largest transaction identification. According to one example of the present disclosure, when the slave database having the highest priority has the largest transaction identification, the management server may regard the slave database having the highest priority as the new master database.
For example, in the example of fig. 4A, the candidate slave databases are slave database 430 and slave database 440, and the slave database 430 serves as the master database with priority 1, the slave database 440 serves as the master database with priority 2, and the slave database 440 has the largest transaction identification, then the management server may treat the slave database 440 as the new master database.
According to another example of the present disclosure, when the slave database having the highest priority does not have the largest transaction identifier, the management server may control the slave database having the highest priority to have the largest transaction identifier, and then take the slave database having the highest priority as a new master database, thereby ensuring data consistency of the database cluster.
Specifically, when the slave database having the highest priority does not have the largest transaction identification, the management server may control, through the management server agent module corresponding to that slave database, the slave database having the highest priority to backup data from the slave database having the largest transaction identification, so that the slave database having the highest priority comes to have the largest transaction identification. For example, when the slave database having the highest priority does not have the largest transaction identification, the management server may transmit information about the slave database having the largest transaction identification to the management server agent module corresponding to the slave database having the highest priority, so that this agent module controls the slave database having the highest priority to backup data from the slave database having the largest transaction identification. When the slave database with the highest priority has backed up the data from the slave database with the largest transaction identification, it then has the largest transaction identification, and the management server takes it as the new master database.
For example, in the example of fig. 4A, where the candidate slave databases are the slave database 430 and the slave database 440, and the slave database 430 acts as the master database with a priority of 1, the slave database 440 acts as the master database with a priority of 2, and the slave database 430 has the largest transaction identification, the management server may send information about the slave database 430 to the management server agent module 410-4 corresponding to the slave database 440 so that the management server agent module 410-4 controls the slave database 440 to backup data from the slave database 430. When the slave database 440 backs up data from the slave database 430, the slave database 440 can have the largest transaction identification, and the management server may regard the slave database 440 after backing up data as a new master database.
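Combining the priority rule with the transaction identification, the selection logic might look like the sketch below. Note the assumption, inferred from the fig. 4A examples (where the candidate 440 with priority 2 outranks 430 with priority 1), that a larger priority value means higher priority; the tuple representation of each candidate is also illustrative.

```python
def select_new_master(candidates):
    """Select the new master from the candidate slave databases.

    `candidates` maps a slave name to a (priority, gtid) pair, where the GTID
    is simplified to a comparable number. A larger priority value is assumed
    to mean higher priority.

    Returns the chosen slave and, if that slave does not yet have the largest
    transaction identification, the slave it must first backup data from.
    """
    chosen = max(candidates, key=lambda s: candidates[s][0])         # highest priority
    most_advanced = max(candidates, key=lambda s: candidates[s][1])  # largest GTID
    if candidates[chosen][1] >= candidates[most_advanced][1]:
        return chosen, None          # already has the largest transaction identification
    return chosen, most_advanced     # must backup from this slave before promotion
```

On the two fig. 4A examples: when 440 already holds the largest GTID it is promoted directly; when 430 holds it, 440 first backs up data from 430 and is then promoted.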
Furthermore, as described above, the master database is responsible for writing of data, the slave databases are responsible for reading of data, and a slave database can copy data from the master database. To prohibit writing data to a slave database, the management server may set a replication attribute (e.g., the super read only option) on the slave database to restrict write operations on it. For example, in the example of fig. 4A, the management server 410 may set the super read only option on each of the slave databases 430 through 450.
In the present disclosure, when the management server wants to use a certain slave database as a new master database, the replication attribute of the slave database may be removed first. For example, in the example of FIG. 4A, when the management server is to take the secondary database 440 as a new master database, the super read only option of the secondary database 440 may be removed first.
Returning to fig. 5, in step S504, the management server sends information about the new master database to the management server agent modules corresponding to the other slave databases of the at least two slave databases, respectively, and controls the other slave databases to backup data from the new master database through the management server agent modules corresponding to the other slave databases.
For example, in the example of fig. 4A, after the management server takes the slave database 440 as the new master database, the management server may send information about the slave database 440 to the management server agent module 410-3 corresponding to the slave database 430 and the management server agent module 410-5 corresponding to the slave database 450. The management server may then control the slave database 430 to backup data from the slave database 440 through the management server agent module 410-3, and control the slave database 450 to backup data from the slave database 440 through the management server agent module 410-5.
The "information about the new master database" described above may be identification information of the new master database, for example, a static IP address (or fixed IP address) of the new master database. The slave databases may backup data from the new master database via the static IP address of the new master database. For example, in the example of fig. 4A, after the management server takes the slave database 440 as the new master database, the slave database 430 and the slave database 450 may backup data from the slave database 440 through the static IP address of the slave database 440.
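For a MySQL-style cluster (the disclosure's mention of GTIDs and the super read only option suggests MySQL), the statement a slave's agent module issues to repoint the slave at the new master's static IP might be built as follows. The replication user name and the use of CHANGE MASTER syntax are assumptions; the disclosure only states that slaves locate the new master via its static (fixed) IP address.

```python
def repoint_slave_sql(new_master_ip: str, repl_user: str = "repl") -> str:
    """Build a MySQL-style statement so that a slave database starts backing
    up data from the new master database located by its static IP address.

    The user name `repl` and the CHANGE MASTER syntax are illustrative
    assumptions.
    """
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{new_master_ip}', "
        f"MASTER_USER='{repl_user}', "
        "MASTER_AUTO_POSITION=1"  # resume replication by GTID position
    )
```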
After step S504, the method 500 may further include step S505: the management server may send configuration information to the management server agent module corresponding to the new master database, where the configuration information indicates a replication mechanism between the new master database and the other slave databases, so that the management server agent module corresponding to the new master database controls the new master database to employ the replication mechanism.
For example, in the example of fig. 4A, after the management server regards the slave database 440 as a new master database, the management server may transmit configuration information to the management server agent module 410-4 corresponding to the slave database 440, wherein the configuration information indicates a replication mechanism between the slave database 440 as the new master database and the slave databases 430, 450, so that the management server agent module 410-4 controls the slave database 440 as the new master database to employ the replication mechanism.
The "replication mechanism" described above may be any one of a fully synchronous replication mechanism, an asynchronous replication mechanism, a semi-synchronous replication mechanism, or the like. In the case where the master database employs the fully synchronous replication mechanism, the master database may perform a write operation of data according to a user instruction and, after the write operation is performed, instead of immediately returning a confirmation message to the user, wait for all the slave databases to backup data from the master database before returning the confirmation message to the user. In the case where the master database employs the asynchronous replication mechanism, the master database may perform a write operation of data according to a user instruction and immediately return a confirmation message to the user after the write operation is performed, without waiting for any slave database to backup data from the master database. In the case where the master database employs the semi-synchronous replication mechanism, the master database may perform a write operation of data according to a user instruction and, instead of immediately returning a confirmation message to the user after the write operation is performed, wait for a part of the slave databases (the number of which may be set by the parameter ack count) to backup data from the master database before returning the confirmation message to the user.
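The three replication mechanisms differ only in how many slave acknowledgements the master waits for before returning the confirmation message; a minimal sketch, where `ack_count` mirrors the parameter mentioned above and the mechanism names are illustrative strings:

```python
def acks_required(mechanism: str, num_slaves: int, ack_count: int = 1) -> int:
    """Number of slave acknowledgements the master waits for before returning
    the confirmation message to the user, per replication mechanism."""
    if mechanism == "asynchronous":
        return 0                            # confirm immediately, wait for no slave
    if mechanism == "semi-synchronous":
        return min(ack_count, num_slaves)   # wait for a configurable subset of slaves
    if mechanism == "fully-synchronous":
        return num_slaves                   # wait for every slave to backup the data
    raise ValueError(f"unknown replication mechanism: {mechanism}")
```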
In the present disclosure, in the case where the replication mechanism between the new master database and the slave databases is the semi-synchronous replication mechanism, the new master database may perform a data writing operation according to a user instruction and, instead of immediately returning a confirmation message to the user after performing the data writing operation, wait for one or more slave databases to backup data from the new master database before returning the confirmation message to the user, thereby ensuring data consistency of the new database cluster. By adopting the semi-synchronous replication mechanism, even if the new database cluster suffers a secondary hit, for example, the master database in the new database cluster goes down, high availability of the database can still be guaranteed because the data of the slave databases in the new database cluster is consistent with the data of the master database. Thus, by employing the semi-synchronous replication mechanism, the database cluster is given resistance to secondary hits.
Further, in the present disclosure, the management server may configure a dynamic IP (e.g., floating IP) for each database cluster so that the dynamic IP is a unified IP that the database cluster serves outside. A user may access the database cluster through the dynamic IP, for example, by writing data to the database cluster through the dynamic IP. In this case, after step S505, the method 500 may further include step S506: the management server may bind the dynamic IP of the database cluster to the new master database. In this way, the user can write data into the database cluster through the dynamic IP without knowing which database in the database cluster is the master database, thereby improving the experience of the user.
For example, in the example of fig. 4A, the management server may bind the floating IP of the database cluster to the master database 420 before the management server takes the slave database 440 as the new master database. After the management server takes the slave database 440 as the new master database, the management server may bind the floating IP of the database cluster to the slave database 440 in step S506.
By the method executed by the management server, after the master database in the database cluster goes down, the management server can promote a certain slave database to be the new master database, so that the database cluster is restored to a service-available state and continues to provide services for users. However, the present disclosure is not limited thereto. The management server can also repair the master database that has gone down. According to one example of the present disclosure, after step S506, the method 500 may further include step S507: the management server may transmit repair information to the management server agent module corresponding to the master database so that the management server agent module corresponding to the master database repairs the master database. The "repair information" herein may include a database restart instruction and information about the new master database.
For example, in the example of fig. 4A, after the primary database 420 is down, the management server may send repair information to the management server proxy module 410-2 corresponding to the primary database 420, such that the management server proxy module 410-2 controls the primary database 420 to shut down and restart and controls the primary database 420 after the restart to backup data from the new primary database in order for the management server proxy module 410-2 to repair the primary database 420.
In this example, the management server may join the repaired master database to the database cluster. For example, the management server may add the repaired master database 420 back into the database cluster as a slave database of the slave database 440 (which has been promoted to be the master database). Alternatively, the management server may delete the repaired master database from the database cluster.
Furthermore, in this example, the management server may perform a check of data consistency, i.e. a secondary check, on the repaired primary database to ensure data consistency of the database cluster. In addition, the management server may set the database instance to an available state in order to provide database services.
Further, in the present disclosure, the management server may display notification information to an operator of the management server so that the operator knows the operation state of the database cluster. For example, in the example of fig. 4A, when the master database 420 goes down, the management server may display alarm information to the operator so that the operator knows that the master database 420 is down. For another example, when the master database in a database cluster is down and the management server cannot promote a slave database to be the new master database, the management server may display alarm information to notify the operator that master-slave switching is impossible, and the database cluster is thereby set to a maintenance state. For another example, in the example of fig. 4A, after the master database 420 goes down and is repaired, the management server may display notification information to the operator so that the operator knows that the master database 420 has been repaired.
By the method of the embodiment of the disclosure, when the master database fails, one slave database can be selected as a new master database from a plurality of slave databases so as to ensure high availability of database services. In addition, through the method of the embodiment of the disclosure, the management server operates the database through the management server proxy module, so that the database is prevented from being operated by relying on SSH, and the compliance requirement of a financial scene is met. In addition, by the method of the embodiment of the disclosure, the extension of the slave database is realized, so that better reading extension capability is provided.
An exemplary specific flow of applying the architecture shown in fig. 4A will be described below in connection with fig. 6. Fig. 6 is an exemplary specific flow of applying the architecture shown in fig. 4A, according to an embodiment of the present disclosure. As shown in fig. 6, in step S1, the management server discovers, through the heartbeats reported by the master database and the slave databases, that the master database is down (in the case of one master and three slaves, the master database must first be discovered to be down). Then, in step S2, the management server attempts to acquire the GTIDs of all the slave databases. Then, in step S3, the management server determines whether the GTIDs of more than half of the slave databases have been correctly acquired, and may kick the slave databases whose GTIDs were not correctly acquired out of the database cluster. Then, in step S4, the management server determines whether the slave database having the highest priority has the largest GTID. When the slave database with the highest priority has the largest GTID, in step S5 the management server promotes that slave database to be the master database and removes its super read only option. When the slave database with the highest priority does not have the largest GTID, in step S12 the management server controls that slave database to connect to the slave database having the largest GTID and copy data from it, and performs step S5 after the copying is completed. Then, in step S6, the management server controls the other slave databases to connect to the new master database for replication. Then, in step S7, the management server may switch the highly available database to synchronous replication and bind the floating IP to the new master database so that the database service becomes available again.
Then, in step S8, the management server may repair the old master database; specifically, it restarts the old master database by a kill instruction and a restart instruction so that the old master database is successfully restarted. Then, in step S9, the management server controls the old master database to connect to the new master database to copy data. Then, in step S10, the management server may modify the ack count parameter if necessary. Then, in step S11, the management server may set the database instance to an available state.
Further, in fig. 6, when the management server fails to perform any of steps S3 to S7, the management server may alarm and set the database to a maintenance state. In addition, in fig. 6, after the management server executes steps S9 to S10, if the database cluster is a one-master-one-slave cluster, the management server may alarm and set the database to a maintenance state; otherwise, the management server may alarm, kick the old master database out of the database cluster, and set the old master database to a maintenance state to prevent a secondary hit.
The embodiments described above are embodiments in which a management server manages master and/or slave databases in a database cluster to achieve high availability of the databases. However, the present disclosure is not limited thereto. According to another embodiment of the present disclosure, high availability of the management server may be achieved. In this embodiment, a plurality of management servers may be deployed. Although a plurality of management servers are deployed, at some point only one management server is used to manage the master and/or slave databases in the database cluster. In this disclosure, a management server that manages a master database and/or a slave database in a database cluster at the current time may be referred to as a work management server.
According to one example of the present disclosure, when a work management server fails, other management servers may act as work management servers. In particular, a management module for managing a plurality of management servers may be deployed. Other management servers may act as work management servers by means of the management module. The "management module" here may be an existing ZooKeeper application.
For example, when the work management server fails, the management module may select one of the other management servers to act as the work management server. Specifically, the management module may select one of the plurality of management servers to serve as the work management server according to an identification (e.g., an ID) of each management server. For example, the management module may select the management server having the smallest ID from among the plurality of management servers to serve as the work management server. In this example, when the work management server fails, the management module may select the management server having the smallest ID from the remaining management servers to serve as the work management server.
For another example, when the work management server fails, the management module may not itself select a management server to act as the work management server; instead, the other management servers autonomously attempt to act as the work management server. Specifically, the management module may notify the other management servers of the failure of the work management server, and the other management servers may then attempt to act as the work management server. For example, the management module may monitor the operation state of the work management server and, when the work management server fails, notify the other management servers of the failure. When the other management servers attempt to become the work management server, each of them can determine, from its own identification information, whether it can act as the work management server. For example, one of the other management servers may determine whether its own ID is smaller than the IDs of the remaining management servers among the other management servers, and if so, that management server may act as the work management server.
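The two smallest-ID rules above can be sketched as follows. This is a hedged illustration: the disclosure leaves the selection mechanism to the management module (e.g., ZooKeeper), and both function names are hypothetical.

```python
def elect_work_server(server_ids, failed_ids):
    """Centralized variant: the management module picks the surviving
    management server with the smallest ID as the work management server."""
    alive = [i for i in server_ids if i not in failed_ids]
    return min(alive)

def should_take_over(my_id, other_alive_ids):
    """Autonomous variant: a server takes over only if its own ID is
    smaller than the ID of every other surviving management server."""
    return all(my_id < other for other in other_alive_ids)
```

Both variants pick the same server; they differ only in whether the decision is made by the management module or by each management server independently.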
By this embodiment, in case of failure of the work management server, another management server may act as a work management server to continue to manage the master database and/or the slave database in the database cluster, thereby achieving a high availability of the management server.
According to yet another embodiment of the present disclosure, high availability of a system library may be achieved. In this embodiment, a plurality of system libraries may be configured. Although a plurality of system libraries are configured, at a certain time, the management server uses only one of the plurality of system libraries.
According to one example of the present disclosure, an IP address may be configured for each of a plurality of system libraries. In this case, the management server may select one system library from the plurality of system libraries to use based on the IP address of each system library. For example, the management server may select a system library having the smallest IP address from among a plurality of system libraries to use. In this example, when the current system library used by the management server fails, the management server may select a system library having the smallest IP address from the remaining system libraries to use.
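The smallest-IP selection above can be sketched as follows. This is a hypothetical illustration; note that the IP addresses should be compared numerically as addresses rather than as strings (lexicographically, "10.0.0.12" would wrongly sort before "10.0.0.2").

```python
import ipaddress

def pick_system_library(library_ips, failed_ips=frozenset()):
    """Select the usable system library with the numerically smallest
    IP address (a sketch; failed libraries are skipped)."""
    usable = [ip for ip in library_ips if ip not in failed_ips]
    return min(usable, key=ipaddress.ip_address)
```

When the current system library fails, calling the function again with that library in `failed_ips` yields the next system library to use.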
With this embodiment, when the current system library fails, the management server can use another system library to store instance metadata, thereby achieving high availability of the system library.
An architecture diagram implementing high availability of management servers and high availability of system libraries will be described below in connection with fig. 7. Fig. 7 is a schematic diagram of an architecture implementing high availability of management servers and high availability of system libraries according to an embodiment of the present disclosure. As shown in fig. 7, five management servers, namely management servers 1 to 5, may be deployed, wherein management servers 1 to 3 may be located in Shenzhen and management servers 4 and 5 may be located in Shanghai. In addition, a ZooKeeper instance may be deployed on each management server. Furthermore, five system libraries, namely meta databases 1 to 5, may be deployed. Furthermore, a database cluster comprising databases MySQL 1 to 6 may be deployed to be managed by the management server. In addition, a management server agent module (Agent) corresponding to each of the databases MySQL 1 to 6 may be deployed.
As shown in fig. 7, the management server 3 uses the system library 1 to manage databases MySQL 1 to 6. When the management server 3 malfunctions, the ZooKeeper may detect that the management server 3 malfunctions and may notify the management servers 1, 2, 4, and 5 of a message that the management server 3 malfunctions. Then, one of the management servers 1, 2, 4 and 5 may take over from the management server 3 to manage the databases MySQL 1 to 6. In addition, when the system library 1 fails, the management server 3 may manage the databases MySQL 1 to 6 using one of the system libraries 2 to 5.
Further, in fig. 7, an operator of the management server can operate the management server through a Web application programming interface (Web Application Programming Interface, web API).
Hereinafter, an apparatus for managing a database according to an embodiment of the present disclosure will be described with reference to fig. 8. The apparatus may be a management server. Fig. 8 illustrates a schematic structure of a management server 800 according to an embodiment of the present disclosure. Since the functions of the management server 800 correspond to the details of the method described above with reference to fig. 5, a detailed description of the same is omitted here for simplicity. As shown in fig. 8, the management server 800 includes: a receiving unit 810 configured to receive operation state information of the main database; a determining unit 820 configured to determine an operation state of the main database according to the operation state information of the main database; a selecting unit 830 configured to select one slave database from at least two slave databases as a new master database when the determining unit determines that the operation state of the master database is that the master database has failed; and a transmitting unit 840 configured to transmit information about the new master database to the management server agent modules respectively corresponding to the other slave databases among the at least two slave databases, the other slave databases being controlled, through their corresponding management server agent modules, to back up data from the new master database. In addition to these four units, the management server 800 may include other components; however, since these components are not related to the contents of the embodiments of the present disclosure, illustration and description thereof are omitted herein. The "at least two slave databases" herein may not include a read-only slave database. For example, in the example of fig. 4A, the "at least two slave databases" may include the slave databases 430 through 450.
In the present disclosure, the operational status information of the primary database may indicate an operational status of the primary database. The operational status of the primary database may be that the primary database has not failed. Alternatively, the operational status of the primary database may also be that the primary database is malfunctioning.
According to one example of the present disclosure, when a node deploying the main database and the management server agent module corresponding to the main database does not fail, the receiving unit 810 may receive the operation state information of the main database from the management server agent module corresponding to the main database. In this example, the receiving unit 810 may periodically receive the operation state information of the master database from the management server agent module corresponding to the master database. Alternatively, the receiving unit 810 may non-periodically receive the operation state information of the master database from the management server agent module corresponding to the master database.
In case that the receiving unit 810 periodically receives the operation state information of the main database from the management server agent module corresponding to the main database, the receiving unit 810 may periodically receive the heartbeat information of the main database from the management server agent module corresponding to the main database, wherein the heartbeat information of the main database includes the operation state information of the main database.
For example, the heartbeat mechanism between the management server and the master database may be preset. Under the heartbeat mechanism, the master database may periodically report heartbeat information of the master database to the management server as a client of the management server, and the heartbeat information of the master database includes operation state information of the master database. Specifically, the master database may periodically generate the operation state information of the master database according to the operation state thereof, and include the operation state information of the master database in the heartbeat information of the master database. Then, the management server agent module corresponding to the master database may transmit heartbeat information of the master database to the management server. Accordingly, the receiving unit 810 may receive heartbeat information of the main database from the management server agent module corresponding to the main database to receive operation state information of the main database.
In case that the receiving unit 810 non-periodically receives the operation state information of the main database, the receiving unit 810 may poll the management server agent module corresponding to the main database for the operation state information of the main database.
However, when a node deploying the main database and the management server agent module corresponding to the main database fails, the receiving unit 810 may not receive the operation state information of the main database from the management server agent module corresponding to the main database. Accordingly, the receiving unit 810 may set a waiting time (for example, 24 seconds) in advance as a time for the receiving unit 810 to wait when the receiving unit 810 cannot receive the operation state information of the master database from the management server agent module corresponding to the master database. When the receiving unit 810 still cannot receive the operation state information of the master database from the management server agent module corresponding to the master database within the waiting time, it may be determined that the node where the master database and the management server agent module corresponding to the master database are disposed has failed, and the receiving unit 810 may receive the operation state information of the master database by means of the slave database. In contrast, when the receiving unit 810 receives the operation state information of the master database from the management server agent module corresponding to the master database within the waiting time, the receiving unit 810 does not have to receive the operation state information of the master database by means of the slave database.
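The waiting-time fallback above can be sketched as follows. This is a hypothetical illustration: the 24-second wait is the example value from the text, and both polling callbacks stand in for the agent communication, whose details the disclosure does not fix.

```python
import time

def get_master_state(poll_master_agent, poll_slave_agents,
                     wait=24.0, interval=1.0):
    """Wait up to `wait` seconds for the master's agent to report the
    master's operation state; on timeout, assume the node hosting the
    master and its agent has failed, and fall back to the state
    observed via the slaves' agents (a sketch)."""
    deadline = time.monotonic() + wait
    while time.monotonic() < deadline:
        state = poll_master_agent()
        if state is not None:
            # Received within the waiting time: no need to ask the slaves.
            return state, "master-agent"
        time.sleep(interval)
    return poll_slave_agents(), "slave-agent"
```

The second return value records which path supplied the state, which the receiving unit could use to decide whether the master's node itself has failed.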
An example in which the receiving unit 810 receives the operation state information of the master database by means of the slave database will be described below. According to one example of the present disclosure, the receiving unit 810 may receive the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases. In this example, the receiving unit 810 may periodically receive the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases. Alternatively, the receiving unit 810 may aperiodically receive the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases.
In case that the receiving unit 810 periodically receives the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases, the receiving unit 810 may periodically receive the heartbeat information of the at least one slave database from the management server agent module corresponding to the at least one of the at least two slave databases, wherein the heartbeat information of the at least one slave database includes the operation state information of the master database.
For example, the heartbeat mechanism between the management server and the slave database may be preset. Under this heartbeat mechanism, the slave database may periodically report heartbeat information of the slave database to the management server as a client of the management server, and the heartbeat information of the slave database includes operation state information of the master database. Specifically, the slave database may periodically detect the operation state of the master database, then generate operation state information of the master database according to the operation state of the master database, and include the operation state information of the master database in the heartbeat information of the slave database. Then, the management server agent module corresponding to the slave database may transmit heartbeat information of the slave database to the management server. Accordingly, the receiving unit 810 may receive heartbeat information of the slave database from the management server agent module corresponding to the slave database to receive operation state information of the master database.
Furthermore, the heartbeat information of the slave database may also include operational status information of the slave database. Specifically, the slave database may periodically generate the operation state information of the slave database according to the operation state thereof, and include the operation state information of the slave database in the heartbeat information of the slave database. Then, the management server agent module corresponding to the slave database may transmit heartbeat information of the slave database to the management server. Accordingly, the receiving unit 810 may receive heartbeat information of the slave database from the management server agent module corresponding to the slave database to receive operation state information of the slave database.
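A slave's heartbeat as described above carries two pieces of state. A hypothetical payload might look like the following; the field names and JSON encoding are assumptions, not the actual wire format.

```python
import json
import time

def slave_heartbeat(slave_name, slave_ok, master_ok):
    """Hypothetical heartbeat payload a slave's agent reports: it carries
    the slave's own operation state and the master state the slave observed."""
    return json.dumps({
        "slave": slave_name,
        "slave_state": "ok" if slave_ok else "failed",
        "master_state": "ok" if master_ok else "failed",
        "timestamp": int(time.time()),
    })
```

On receipt, the receiving unit 810 can read the master's operation state from such a payload even when the master's own agent is unreachable.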
In case that the receiving unit 810 non-periodically receives the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases, the receiving unit 810 may poll the operation state information of the master database from the management server agent module corresponding to the at least one of the at least two slave databases. For example, the receiving unit 810 may request the operation state information of the master database from the management server agent module corresponding to at least one of the at least two slave databases. The slave database may then detect the operational state of the master database in response to the request and then generate operational state information for the master database based on the operational state of the master database. Then, the management server agent module corresponding to the slave database may transmit the operation state information of the master database to the receiving unit 810.
Further, in the present disclosure, when the operation state information of the main database indicates that the main database has not failed, the determination unit 820 may determine that the operation state of the main database is that the main database has not failed. Alternatively, when the operation state information of the main database indicates that the main database is failed, the determining unit 820 may determine that the operation state of the main database is that the main database is failed.
Further, in the present disclosure, when the determining unit 820 determines that the operation state of the master database is that the master database fails, the selecting unit 830 selects one slave database from the at least two slave databases as a new master database. For example, the selection unit 830 may determine a candidate slave database from at least two slave databases and then select one slave database from the candidate slave databases as a new master database. The "candidate slave database" herein may be one or more slave databases.
An exemplary flow of the selection unit 830 determining candidate slave databases from at least two slave databases will be described below.
According to one example of the present disclosure, first, the selection unit 830 may determine, from among the at least two slave databases, those slave databases whose transaction identifications have been acquired by the management server. Specifically, the selection unit 830 may set a timeout period (for example, 5 seconds) for acquiring the transaction identifications in advance, and may determine the slave databases whose transaction identifications were acquired by the management server within the timeout period.
For example, in the example of fig. 4A, where the management server has acquired the transaction identity of the slave database 430 and the transaction identity of the slave database 440 within a timeout period without acquiring the transaction identity of the slave database 450, the selection unit 830 may determine that the slave databases whose transaction identities were acquired by the management server within the timeout period are the slave database 430 and the slave database 440.
The "transaction identification" described above may include identification information of the transactions performed by the master database and identification information of the transactions performed by the slave database. For example, the transaction identifier may be a global transaction identifier (Global Transaction IDentifier, GTID). The management server may determine the transactions performed by the master database and the transactions performed by the slave database according to the transaction identification of the slave database, thereby determining the consistency (or coincidence) of the data of the slave database with the data of the master database. For example, when the transactions performed by the master database and the transactions performed by the slave database are identical, the management server may determine that the data of the slave database is identical to the data of the master database. When the transactions performed by the slave database are only partially identical to the transactions performed by the master database, the management server may determine that the data of the slave database is only partially consistent with the data of the master database.
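The consistency check above can be sketched as follows. This is a simplification: real GTID sets are interval-encoded strings, modeled here as plain Python sets of transaction IDs.

```python
def data_consistency(master_gtids, slave_gtids):
    """Judge the consistency of a slave's data with the master's data
    from the sets of executed transactions (a sketch)."""
    if slave_gtids == master_gtids:
        return "identical"
    if slave_gtids < master_gtids:  # proper subset: the slave lags behind
        return "partially consistent"
    return "diverged"
```

The "diverged" case (the slave executed a transaction the master did not) should not occur under normal replication, but a robust check must still account for it.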
Furthermore, for the slave databases among the at least two slave databases whose transaction identifications were not acquired by the management server, the management server may delete these slave databases from the database cluster, e.g., kick them out of the database cluster. For example, in the example of fig. 4A, where the management server has acquired the transaction identification of the slave database 430 and the transaction identification of the slave database 440 within the timeout period and has not acquired the transaction identification of the slave database 450, the selection unit 830 may kick the slave database 450 out of the database cluster.
After the selection unit 830 determines at least two slave databases whose transaction identifications are acquired by the management server, the selection unit 830 may determine whether the determined number of slave databases is greater than a predetermined proportion (e.g., one half) of the number of the at least two slave databases. The selecting unit 830 may determine the determined slave database as a candidate slave database when the number of the determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases. When the number of the determined slave databases is smaller than a predetermined ratio of the number of the at least two slave databases, the selecting unit 830 does not determine the determined slave databases as candidate slave databases.
For example, in the example of fig. 4A, the slave databases determined by the management server are slave database 430 and slave database 440. Since the number of the slave databases determined by the management server is 2, the number of the at least two slave databases is 3, and the number of the determined slave databases is greater than one half of the number of the at least two slave databases, the selection unit 830 may determine the slave databases 430 and 440 as candidate slave databases.
Also for example, in the example of fig. 4A, the slave database determined by the management server is slave database 430. Since the number of the slave databases determined by the management server is 1, the number of the at least two slave databases is 3, and the number of the determined slave databases is less than one half of the number of the at least two slave databases, the selection unit 830 does not determine the slave database 430 as a candidate slave database.
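The candidate-determination rule above can be sketched as follows; the one-half threshold is the example proportion given in the text.

```python
def candidate_slaves(acquired, total_slaves):
    """Slaves whose transaction identification was acquired become
    candidate slave databases only if they number more than the
    predetermined proportion (one half assumed here) of all slaves;
    otherwise no candidates are produced (a sketch)."""
    if len(acquired) * 2 > total_slaves:
        return list(acquired)
    return []
```

With `acquired = ["430", "440"]` and three slaves, both become candidates; with only `["430"]` acquired, no candidate is produced, matching the two fig. 4A examples above.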
After the selecting unit 830 determines a candidate slave database from at least two slave databases, the selecting unit 830 may select one slave database from the candidate slave databases as a new master database. According to one example of the present disclosure, each slave database may have its priority to act as a master database. For example, the management server may set in advance a priority for each slave database to serve as a master database. For example, in the example of fig. 4A, the management server may set in advance to the slave database 430 a priority of 1 for functioning as the master database, to the slave database 440 a priority of 2 for functioning as the master database, and to the slave database 450 a priority of 3 for functioning as the master database.
In this example, the selection unit 830 may select one slave database from the candidate slave databases as the new master database according to at least the priority of each of the candidate slave databases. For example, the selection unit 830 may select one slave database from the candidate slave databases as the new master database according to the priority of each slave database and the transaction identification.
When selecting a new master database, the management server needs to take into account both the priority of each slave database and its transaction identification. This is because, in the ideal case, the slave database with the highest priority also has the largest transaction identification, i.e., the coincidence of its data with the data of the master database is the highest. In this case, when the management server takes the slave database with the highest priority as the new master database, the other slave databases can back up data from it, and the data consistency of the database cluster is guaranteed. In a non-ideal case, however, the slave database with the highest priority may not have the largest transaction identification, i.e., the coincidence of its data with the data of the master database is not the highest. In that case, if the management server takes the slave database with the highest priority as the new master database, although the other slave databases can still back up data from it, the data consistency of the database cluster cannot be guaranteed. Therefore, the management server can select the new master database from the candidate slave databases according to both the priority and the transaction identification of each candidate slave database, thereby ensuring the data consistency of the database cluster.
An exemplary flow of selecting one slave database from among the candidate slave databases as a new master database by the selecting unit 830 according to the priority of each of the candidate slave databases and the transaction identification will be described below.
The selection unit 830 may determine whether the slave database having the highest priority among the candidate slave databases has the largest transaction identification. According to one example of the present disclosure, when the slave database having the highest priority has the largest transaction identification, the selection unit 830 may regard the slave database having the highest priority as the new master database.
For example, in the example of fig. 4A, the candidate slave databases are the slave database 430 and the slave database 440, and the slave database 430 serves as the master database with a priority of 1, the slave database 440 serves as the master database with a priority of 2, and the slave database 440 has the largest transaction identification, the selecting unit 830 may regard the slave database 440 as the new master database.
According to another example of the present disclosure, when the slave database having the highest priority does not have the largest transaction identifier, the selection unit 830 may control the slave database having the highest priority to have the largest transaction identifier and then take the slave database having the highest priority as a new master database, thereby ensuring data consistency of the database cluster.
Specifically, when the slave database having the highest priority does not have the largest transaction identification, the selection unit 830 may control the slave database having the highest priority to backup data from the slave database having the largest transaction identification through the management server agent module corresponding to the slave database having the highest priority so that the slave database having the highest priority has the largest transaction identification. For example, when the slave database having the highest priority does not have the largest transaction identification, the selection unit 830 may transmit information about the slave database having the largest transaction identification to the management server agent module corresponding to the slave database having the highest priority so that the management server agent module corresponding to the slave database having the highest priority controls the slave database having the highest priority to backup data from the slave database having the largest transaction identification. When the slave database having the highest priority can have the largest transaction identifier after backing up data from the slave database having the largest transaction identifier, the selection unit 830 takes the slave database having the highest priority as a new master database.
For example, in the example of fig. 4A, where the candidate slave databases are the slave database 430 and the slave database 440, and the slave database 430 acts as the master database with a priority of 1, the slave database 440 acts as the master database with a priority of 2, and the slave database 430 has the largest transaction identification, the management server may send information about the slave database 430 to the management server agent module 410-4 corresponding to the slave database 440 so that the management server agent module 410-4 controls the slave database 440 to backup data from the slave database 430. When the slave database 440 backs up data from the slave database 430, the slave database 440 can have the largest transaction identification, and the selection unit 830 may regard the slave database 440 after backing up data as a new master database.
Furthermore, as described above, the master database is responsible for writing data, the slave databases are responsible for reading data, and a slave database can copy data from the master database. To prohibit writing data to a slave database, the management server may set a replication attribute (e.g., the super_read_only option) on the slave database to restrict write operations on it. For example, in the example of fig. 4A, the management server 410 may set the super_read_only option on each of the slave databases 430 through 450.
In the present disclosure, when the selection unit 830 is to take a certain slave database as the new master database, the replication attribute of that slave database may be removed first. For example, in the example of fig. 4A, when the selection unit 830 is to take the slave database 440 as the new master database, the super_read_only option of the slave database 440 may be removed first.
In addition, in the present disclosure, the transmitting unit 840 transmits information about the new master database to the management server agent modules respectively corresponding to other slave databases among the at least two slave databases, and controls the other slave databases to backup data from the new master database through the management server agent modules corresponding to the other slave databases.
For example, in the example of fig. 4A, after the management server regards the slave database 440 as a new master database, the transmitting unit 840 may transmit information about the slave database 440 to the management server agent module 410-3 corresponding to the slave database 430, the management server agent module 410-5 corresponding to the slave database 450. The management server may control the backup of data from the slave database 440 to the slave database 430 through the management server agent module 410-3 and the management server agent module 410-5 may control the backup of data from the slave database 440 to the slave database 450.
The "information about the new main database" described above may be identification information of the new main database, for example, a static IP address (or a fixed IP address) of the new main database. The secondary database may backup data from the new primary database by the static IP address of the new primary database. For example, in the example of fig. 4A, after the management server has the slave database 440 as a new master database, the slave database 430 and the slave database 450 may backup data from the slave database 440 through the static IP address of the slave database 440.
In addition, the transmitting unit 840 may transmit configuration information to the management server agent module corresponding to the new master database, wherein the configuration information indicates a replication mechanism between the new master database and the other slave databases, so that the management server agent module corresponding to the new master database controls the new master database to employ the replication mechanism.
For example, in the example of fig. 4A, after the management server regards the slave database 440 as a new master database, the transmitting unit 840 may transmit configuration information to the management server agent module 410-4 corresponding to the slave database 440, wherein the configuration information indicates a replication mechanism between the slave database 440 as the new master database and the slave databases 430, 450, so that the management server agent module 410-4 controls the slave database 440 as the new master database to employ the replication mechanism.
The "replication mechanism" described above may be any one of a fully synchronous replication mechanism, an asynchronous replication mechanism, a semi-synchronous replication mechanism, or the like. In the case where the master database employs the fully synchronous replication mechanism, the master database may perform a data write operation according to a user instruction, and instead of immediately returning a confirmation message to the user after the write operation is performed, it waits for all the slave databases to back up the data from the master database before returning the confirmation message to the user. In the case where the master database employs the asynchronous replication mechanism, the master database may perform a data write operation according to a user instruction and immediately return a confirmation message to the user after the write operation is performed, without waiting for any slave database to back up the data from the master database. In the case where the master database employs the semi-synchronous replication mechanism, the master database may perform a data write operation according to a user instruction, and instead of immediately returning a confirmation message to the user after the write operation is performed, it waits for a subset of the slave databases (whose number may be set by the parameter ack count) to back up the data from the master database before returning the confirmation message to the user.
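The three mechanisms differ only in when the master may return the confirmation message. A minimal sketch of that decision, with an assumed helper name and the ack count modeled as a plain parameter:

```python
# Illustrative model (names are assumptions, not from the patent) of when the
# master may confirm a write under each replication mechanism described above.

def acknowledge_write(mechanism, total_slaves, slaves_caught_up, ack_count=1):
    """Return True once the master may return a confirmation to the user."""
    if mechanism == "async":
        # Asynchronous: confirm immediately, without waiting for any slave.
        return True
    if mechanism == "full_sync":
        # Fully synchronous: wait until every slave has backed up the data.
        return slaves_caught_up >= total_slaves
    if mechanism == "semi_sync":
        # Semi-synchronous: wait for ack_count slaves (a subset) to catch up.
        return slaves_caught_up >= ack_count
    raise ValueError(f"unknown mechanism: {mechanism}")

# With 2 slaves, of which 1 has backed up the write so far:
print(acknowledge_write("async", 2, 0))      # True  - no waiting at all
print(acknowledge_write("full_sync", 2, 1))  # False - still waiting for slave 2
print(acknowledge_write("semi_sync", 2, 1))  # True  - ack_count=1 satisfied
```

The semi-synchronous case is the middle ground the next paragraph builds on: at least one slave is guaranteed to hold the confirmed data.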
In the present disclosure, in the case where the replication mechanism between the new master database and the slave databases is the semi-synchronous replication mechanism, the new master database may perform a data write operation according to a user instruction, and instead of immediately returning a confirmation message to the user after the write operation is performed, it waits for one or more slave databases to back up the data from the new master database before returning the confirmation message, thereby ensuring data consistency of the new database cluster. With the semi-synchronous replication mechanism, even if the new database cluster suffers a second failure, for example, the master database in the new database cluster goes down, high availability of the database can still be guaranteed, because the slave databases in the new database cluster are consistent with the data of the master database. Thus, by employing the semi-synchronous replication mechanism, the database cluster is given resistance to secondary failures.
Further, in the present disclosure, the management server may configure a dynamic IP (e.g., a floating IP) for each database cluster so that the dynamic IP serves as the unified IP through which the database cluster provides services externally. A user may access the database cluster through the dynamic IP, for example, by writing data to the database cluster through the dynamic IP. In this case, the management server may bind the dynamic IP of the database cluster to the new master database. In this way, the user can write data into the database cluster through the dynamic IP without knowing which database in the cluster is the master database, thereby improving the user experience.
For example, in the example of fig. 4A, before the management server promotes the slave database 440 to be the new master database, the management server may bind the floating IP of the database cluster to the master database 420. After the management server promotes the slave database 440, the management server may bind the floating IP of the database cluster to the slave database 440 in step S506.
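The floating-IP re-binding in step S506 can be sketched as follows. The class name and IP addresses are hypothetical, and a real implementation would move the IP via ARP announcements, VRRP, or a cloud provider API rather than a local variable:

```python
# Hedged sketch: the cluster exposes one floating IP; the management server
# re-binds it to whichever database currently serves as master, so users keep
# writing to the same address across a failover.

class ClusterVIP:
    def __init__(self, floating_ip):
        self.floating_ip = floating_ip  # the unified, externally visible IP
        self.bound_to = None            # static IP of the current master

    def bind(self, master_static_ip):
        # Here we only track the binding; moving a real floating IP involves
        # network-level operations outside the scope of this sketch.
        self.bound_to = master_static_ip

vip = ClusterVIP("10.0.0.100")
vip.bind("10.0.0.42")   # before failover: bound to master database 420
vip.bind("10.0.0.44")   # step S506: re-bound to the promoted slave database 440
print(vip.floating_ip, "->", vip.bound_to)
```

Clients keep connecting to 10.0.0.100 throughout; only the binding behind it changes, which is what hides the master switch from the user.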
By the method executed by the management server, after the master database in a database cluster goes down, the management server can promote one of the slave databases to be the new master database, so that the database cluster is restored to a service-available state and continues to provide services for users. However, the present disclosure is not limited thereto. The management server can also repair the downed master database. According to one example of the present disclosure, the transmitting unit 840 may transmit repair information to the management server agent module corresponding to the master database, so that this agent module repairs the master database. The "repair information" herein may include a database restart instruction and information about the new master database.
For example, in the example of fig. 4A, after the master database 420 goes down, the management server may send repair information to the management server agent module 410-2 corresponding to the master database 420, so that the agent module 410-2 controls the master database 420 to shut down and restart, and controls the restarted master database 420 to back up data from the new master database, thereby repairing the master database 420.
In this example, the management server may rejoin the repaired master database to the database cluster. For example, the management server may add the repaired master database 420 to the database cluster as a slave database of the slave database 440 (which has been promoted to be the master database). Alternatively, the management server may delete the repaired master database from the database cluster.
Furthermore, in this example, the management server may perform a data consistency check, i.e., a secondary check, on the repaired master database to ensure data consistency of the database cluster. In addition, the management server may set the database instance to an available state so as to provide database services.
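A hedged sketch of the repair-and-rejoin path described above (restart the downed master, point it at the new master for backup, then rejoin it to the cluster as a slave); all identifiers, state names, and IP addresses are illustrative, not from the patent:

```python
# Illustrative repair flow: the agent module restarts the downed master and
# re-points it at the new master; the management server rejoins it as a slave.

def repair_downed_master(cluster, downed_db, new_master_ip):
    # Step 1 (agent module): shut down and restart the downed instance.
    downed_db["state"] = "restarting"
    downed_db["state"] = "running"                # restart completed
    # Step 2 (agent module): back up data from the new master after restart.
    downed_db["replicates_from"] = new_master_ip
    # Step 3 (management server): rejoin the repaired database as a slave.
    downed_db["role"] = "slave"
    cluster["slaves"].append(downed_db["name"])
    # Step 4 (management server): secondary consistency check, then mark usable.
    downed_db["state"] = "available"

# Mirroring fig. 4A: 440 is the promoted master; 420 is the downed old master.
cluster = {"master": "slave-440", "slaves": ["slave-430", "slave-450"]}
db_420 = {"name": "db-420", "role": "master", "state": "down",
          "replicates_from": None}
repair_downed_master(cluster, db_420, "10.0.0.44")
print(db_420["role"], cluster["slaves"])
```

The repaired database ends up as one more slave of the new master, which is exactly the rejoin option the paragraph above describes (the delete option would simply skip steps 3 and 4).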
Further, in the present disclosure, the management server may further include a notification unit (not shown in the drawings) to display notification information to an operator of the management server so that the operator knows the operation state of the database cluster. For example, in the example of fig. 4A, when the master database 420 goes down, the notification unit may display alarm information so that the operator knows that the master database 420 is down. For another example, when the master database in the database cluster goes down and the management server cannot promote a slave database to be the new master database, the notification unit may display alarm information notifying the operator that the master-slave switch cannot be performed, so that the database cluster is set to a maintenance state. For another example, in the example of fig. 4A, after the master database 420 goes down and is repaired, the notification unit may display notification information so that the operator knows that the master database 420 has been repaired.
With the management server of the embodiments of the present disclosure, when the master database fails, one slave database can be selected from a plurality of slave databases to be the new master database, ensuring high availability of database services. In addition, the management server operates the databases through the management server agent modules, avoiding operating the databases via SSH and thereby meeting the compliance requirements of financial scenarios. Furthermore, the slave databases can be scaled out, providing better read scalability.
Furthermore, a management server according to the embodiments of the present disclosure may also be implemented by means of the architecture of the computing device shown in fig. 9. As shown in fig. 9, computing device 900 may include a bus 910, one or more CPUs 920, a Read Only Memory (ROM) 930, a Random Access Memory (RAM) 940, a communication port 950 connected to a network, an input/output component 960, a hard disk 970, and the like. A storage device in computing device 900, such as the ROM 930 or the hard disk 970, may store various data or files used in computer processing and/or communication, as well as program instructions executed by the CPU. Computing device 900 may also include a user interface 980. Of course, the architecture shown in fig. 9 is merely exemplary, and one or more components of the computing device shown in fig. 9 may be omitted according to actual needs when implementing different devices.
Embodiments of the present disclosure may also be implemented as a computer-readable storage medium. A computer readable storage medium according to the embodiments of the present disclosure has computer readable instructions stored thereon. The computer readable instructions, when executed by a processor, may perform a method according to the embodiments of the present disclosure described with reference to the above figures. The computer-readable storage medium includes, but is not limited to, volatile memory and/or nonvolatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, and the like.
Those skilled in the art will appreciate that various modifications and improvements can be made to the disclosure. For example, the various devices or components described above may be implemented in hardware, or may be implemented in software, firmware, or a combination of some or all of the three.
Furthermore, as used in the present disclosure and claims, unless the context clearly indicates otherwise, the words "a," "an," and/or "the" are not specific to the singular but may include the plural. The terms "first," "second," and the like, as used in this disclosure, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. Likewise, the word "comprising" or "comprises" means that the elements or items preceding the word encompass the elements or items listed after the word and equivalents thereof, without excluding other elements or items. The terms "connected" or "coupled" and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Further, flowcharts are used in this disclosure to describe the operations performed by the system according to the embodiments of the present disclosure. It should be understood that the operations are not necessarily performed precisely in the order described. Rather, the various steps may be processed in reverse order or simultaneously. Also, other operations may be added to, or removed from, these processes.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
While the present disclosure has been described in detail above, it will be apparent to those skilled in the art that the present disclosure is not limited to the embodiments described in this specification. The present disclosure may be embodied with modifications and variations without departing from its spirit and scope, which are defined by the appended claims. Accordingly, the description herein is for the purpose of illustration and is not in any way limiting to the present disclosure.

Claims (13)

1. A method for managing a database, the method comprising:
receiving operation state information of a main database;
determining the operation state of the main database according to the operation state information of the main database;
when the running state of the master database is determined to be that the master database fails, selecting one slave database from at least two slave databases as a new master database; and
sending information about the new master database to the management server agent modules respectively corresponding to other slave databases of the at least two slave databases, controlling the other slave databases to backup data from the new master database through the management server agent modules corresponding to the other slave databases,
wherein selecting one slave database from the at least two slave databases as a new master database comprises:
determining a slave database of which the transaction identification is acquired by a management server in the at least two slave databases, wherein the management server is used for managing the master database and the at least two slave databases;
judging whether the number of the determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases;
determining the determined slave database as a candidate slave database when the number of the determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases; and
and selecting one slave database from the candidate slave databases as a new master database.
2. The method of claim 1, wherein said receiving operational status information of a master database comprises:
and receiving the running state information of the main database from a management server agent module corresponding to the main database.
3. The method of claim 1, wherein said receiving operational status information of a master database comprises:
and receiving the running state information of the master database from a management server agent module corresponding to at least one slave database in the at least two slave databases.
4. The method of claim 1, wherein
Each slave database has its priority as master database;
selecting one of the candidate slave databases as a new master database comprises:
and selecting one slave database from the candidate slave databases as a new master database according to at least the priority of each slave database in the candidate slave databases.
5. The method of claim 4, wherein said selecting one of said candidate slave databases as a new master database based at least on the priority of each of said candidate slave databases comprises:
and selecting one slave database from the candidate slave databases as a new master database according to the priority and the transaction identification of each slave database in the candidate slave databases.
6. The method of claim 5, wherein selecting one of the candidate slave databases as the new master database based on the priority of each of the candidate slave databases and the transaction identification comprises:
judging whether the slave database with the highest priority in the candidate slave databases has the largest transaction identifier or not; and
when the slave database with the highest priority has the largest transaction identification, the slave database with the highest priority is taken as a new master database.
7. The method of claim 6, wherein selecting one of the candidate slave databases as a new master database based on the priority of each of the candidate slave databases and the transaction identification comprises:
when the slave database with the highest priority does not have the largest transaction identifier, controlling the slave database with the highest priority to backup data from the slave database with the largest transaction identifier through a management server agent module corresponding to the slave database with the highest priority so that the slave database with the highest priority has the largest transaction identifier; and
the slave database with the highest priority is taken as the new master database.
8. A method as claimed in any one of claims 1 to 3, further comprising:
and sending configuration information to a management server agent module corresponding to the new master database, wherein the configuration information indicates a replication mechanism between the new master database and the other slave databases, so that the management server agent module corresponding to the new master database controls the new master database to adopt the replication mechanism.
9. A method as claimed in any one of claims 1 to 3, further comprising:
and sending the repair information to the management server proxy module corresponding to the main database so as to repair the main database by the management server proxy module corresponding to the main database.
10. An apparatus for managing a database, comprising:
A receiving unit configured to receive operation state information of the main database;
a determining unit configured to determine an operation state of the main database according to the operation state information of the main database;
a selection unit configured to select one slave database from at least two slave databases as a new master database when the determination unit determines that the operation state of the master database is that the master database has failed; and
a transmission unit configured to transmit information about a new master database to management server agent modules respectively corresponding to other slave databases among the at least two slave databases, control the other slave databases to backup data from the new master database through the management server agent modules corresponding to the other slave databases,
wherein the selection unit is configured to determine a slave database of the at least two slave databases, the transaction identity of which is acquired by a management server, wherein the management server is used for managing the master database and the at least two slave databases; judging whether the number of the determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases; determining the determined slave database as a candidate slave database when the number of the determined slave databases is greater than a predetermined proportion of the number of the at least two slave databases; and selecting one slave database from the candidate slave databases as a new master database.
11. The apparatus of claim 10, wherein the receiving unit is configured to receive the operational status information of the primary database from a management server agent module corresponding to the primary database.
12. An apparatus for managing a database, comprising:
a processor; and
a memory, wherein the memory has stored therein computer readable code which, when executed by the processor, performs the method of any of claims 1-9.
13. A computer readable storage medium having stored thereon instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-9.
CN201910353959.6A 2019-04-26 2019-04-26 Method for managing database and corresponding device, computer readable storage medium Active CN110069365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910353959.6A CN110069365B (en) 2019-04-26 2019-04-26 Method for managing database and corresponding device, computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910353959.6A CN110069365B (en) 2019-04-26 2019-04-26 Method for managing database and corresponding device, computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110069365A CN110069365A (en) 2019-07-30
CN110069365B true CN110069365B (en) 2023-07-04

Family

ID=67369589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910353959.6A Active CN110069365B (en) 2019-04-26 2019-04-26 Method for managing database and corresponding device, computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110069365B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460039A (en) * 2020-04-07 2020-07-28 中国建设银行股份有限公司 Relational database processing system, client, server and method
CN112199356B (en) * 2020-12-09 2021-07-30 北京顺达同行科技有限公司 Fault processing method, device, server and storage medium
CN112800028A (en) * 2021-01-28 2021-05-14 中国工商银行股份有限公司 Fault self-recovery method and device for MySQL group replication
CN114676118B (en) * 2022-05-30 2022-08-12 深圳市科力锐科技有限公司 Database switching method, device, equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2171619A1 (en) * 2007-07-23 2010-04-07 Telefonaktiebolaget LM Ericsson (PUBL) Database system with multiple layer distribution

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020194015A1 (en) * 2001-05-29 2002-12-19 Incepto Ltd. Distributed database clustering using asynchronous transactional replication
EP1708094A1 (en) * 2005-03-31 2006-10-04 Ubs Ag Computer network system for constructing, synchronizing and/or managing a second database from/with a first database, and methods therefore
JP5715099B2 (en) * 2012-07-30 2015-05-07 日本電信電話株式会社 Distributed database system, first database device of distributed database system, second database device of distributed database system, and computer program
CN103902617B (en) * 2012-12-28 2017-06-09 华为技术有限公司 Distributed data base synchronous method and system
CN103914354A (en) * 2012-12-31 2014-07-09 北京新媒传信科技有限公司 Method and system for database fault recovery
US10929246B2 (en) * 2015-10-07 2021-02-23 International Business Machines Corporation Backup capability for object store used as primary storage
CN108984569A (en) * 2017-06-05 2018-12-11 中兴通讯股份有限公司 Database switching method, system and computer readable storage medium
CN108932295B (en) * 2018-05-31 2023-04-18 康键信息技术(深圳)有限公司 Main database switching control method and device, computer equipment and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2171619A1 (en) * 2007-07-23 2010-04-07 Telefonaktiebolaget LM Ericsson (PUBL) Database system with multiple layer distribution

Also Published As

Publication number Publication date
CN110069365A (en) 2019-07-30

Similar Documents

Publication Publication Date Title
CN110069365B (en) Method for managing database and corresponding device, computer readable storage medium
US11360854B2 (en) Storage cluster configuration change method, storage cluster, and computer system
WO2017162173A1 (en) Method and device for establishing connection of cloud server cluster
CN109151045B (en) Distributed cloud system and monitoring method
US8856592B2 (en) Mechanism to provide assured recovery for distributed application
WO2016070375A1 (en) Distributed storage replication system and method
US8533525B2 (en) Data management apparatus, monitoring apparatus, replica apparatus, cluster system, control method and computer-readable medium
TWI344090B (en) Management of a scalable computer system
WO2016177130A1 (en) Method and device for selecting communication node
US20170168756A1 (en) Storage transactions
CN106452836B (en) main node setting method and device
CN106960060B (en) Database cluster management method and device
CN110601903A (en) Data processing method and device based on message queue middleware
CN113010496B (en) Data migration method, device, equipment and storage medium
CN112202853B (en) Data synchronization method, system, computer device and storage medium
EP3648405B1 (en) System and method to create a highly available quorum for clustered solutions
US20120324436A1 (en) Method of updating versioned software using a shared cache
CN115658390A (en) Container disaster tolerance method, system, device, equipment and computer readable storage medium
CN109002263B (en) Method and device for adjusting storage capacity
CN107943615B (en) Data processing method and system based on distributed cluster
CN116389233B (en) Container cloud management platform active-standby switching system, method and device and computer equipment
WO2015196692A1 (en) Cloud computing system and processing method and apparatus for cloud computing system
CN109947599B (en) Multi-cluster management method and device and intra-cluster management method and device
KR102033489B1 (en) Method and server for managing server cluster
US10909002B2 (en) Fault tolerance method and system for virtual machine group

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant