CN113220784A

CN113220784A - Method, device, equipment and storage medium for realizing high-availability database system

Info

Publication number: CN113220784A
Application number: CN202110518991.2A
Authority: CN
Inventors: 刘坚君; 丁鹏; 朱德润; 罗唐; 宋志强; 丁顺; 彭晶鑫; 吴斌炜
Original assignee: Ucloud Technology Co ltd
Current assignee: Ucloud Technology Co ltd
Priority date: 2021-05-12
Filing date: 2021-05-12
Publication date: 2021-08-06

Abstract

The invention relates to the technical field of databases and discloses a method, a device, equipment and a storage medium for realizing a high-availability database system. The method comprises the following steps: detecting in real time whether the main computing node fails during operation; when the main computing node fails, hot-upgrading the physical configuration specification of the standby computing node to the same physical configuration specification as the main computing node; performing a fast crash recovery operation on the database instance run by the standby computing node so as to upgrade the standby computing node to a new main computing node; and, while the fast crash recovery operation is being executed, using the new main computing node to provide database service externally. The database system of the invention adopts an asymmetric main and standby node architecture, which further reduces the operating cost of the database while ensuring reliable operation of the database system.

Description

Method, device, equipment and storage medium for realizing high-availability database system

Technical Field

The invention relates to the technical field of databases, in particular to a method, a device, equipment and a storage medium for realizing a high-availability database system.

Background

Faults can occur in both the hardware and the software of the stack on which a database system runs. If the database system is a stand-alone database system, a system fault takes a long time to handle (the database system restarts slowly or needs manual intervention), so the database service is unavailable for a long time. A stand-alone database system therefore clearly does not meet the application requirements of an enterprise-level database.

The high-availability database system is one of the mainstream solutions of enterprise-level databases in the market at present, and the working principle of the high-availability database system is that automatic processing and quick recovery of database faults are realized through redundant database nodes and an automatic disaster recovery mechanism, and even if a certain node in the database system is unavailable due to a problem, the database system can still normally provide database services to the outside as a whole.

The traditional high-availability database system requires that the hardware configurations (CPU, memory, disk, etc.) of the main and standby nodes are the same, that is, the main and standby nodes are configured and deployed on hardware in a peer-to-peer manner (symmetric main and standby nodes). Typical high-availability database systems in the market, such as MySQL cluster, Oracle DataGuard and the like, are all one-main-one-standby dual-node architectures. While achieving shorter downtime and higher availability, such high-availability database systems require several times the hardware cost of a stand-alone database system.

Disclosure of Invention

The invention mainly aims to provide a method, a device, equipment and a storage medium for realizing a high-availability database system, and aims to solve the technical problem that the traditional high-availability database has overhigh operation cost.

A first aspect of the invention provides a method for realizing a high-availability database system, wherein the high-availability database system comprises at least a main computing node, a distributed storage and a standby computing node, and the physical configuration specification of the standby computing node is lower than that of the main computing node. The method for realizing the high-availability database system comprises the following steps:

detecting in real time whether the main computing node fails during operation;

when the main computing node fails during operation, hot-upgrading the physical configuration specification of the standby computing node to the same physical configuration specification as the main computing node;

performing a fast crash recovery operation on the database instance run by the standby computing node so as to upgrade the standby computing node to a new main computing node;

and, while the fast crash recovery operation is being executed, using the new main computing node to provide database service externally.
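The four steps above can be read as a single failover routine. The following Python sketch is only an illustration under stated assumptions: the names `Spec`, `ComputeNode` and `fail_over` are hypothetical, fault detection is reduced to a `healthy` flag, and the hot upgrade and crash recovery are represented by recorded events rather than real operations.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Spec:
    cpu_cores: int
    memory_gb: int


@dataclass
class ComputeNode:
    name: str
    spec: Spec
    role: str                      # "main" or "standby"
    healthy: bool = True
    serving: bool = False
    events: List[str] = field(default_factory=list)


def fail_over(main: ComputeNode, standby: ComputeNode) -> Optional[ComputeNode]:
    """Run one pass of the four steps; return the new main node, or None if no fault."""
    # Step 1: fault detection (reduced here to reading a flag; an assumption).
    if main.healthy:
        return None
    # Step 2: hot-upgrade the standby to the main node's specification.
    standby.spec = Spec(main.spec.cpu_cores, main.spec.memory_gb)
    standby.events.append("hot_upgrade")
    # Step 3: start fast crash recovery and promote the standby.
    standby.events.append("fast_crash_recovery")
    standby.role = "main"
    # Step 4: serve external requests while recovery is still in progress.
    standby.serving = True
    standby.events.append("serving")
    return standby
```

For example, calling `fail_over` with a failed 16-core/32 GB main node and a 1-core/2 GB standby returns the standby with its specification raised to 16 cores and 32 GB and its role set to "main".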

Optionally, in a first implementation manner of the first aspect of the present invention, after upgrading the standby computing node to a new master computing node, the method further includes:

and starting a new standby computing node, and establishing a main-standby relation between the new main computing node and the new standby computing node.

Optionally, in a second implementation manner of the first aspect of the present invention, the performing a fast crash recovery operation on the database instance run by the standby computing node to upgrade the standby computing node to a new primary computing node includes:

loading unsynchronized redo logs from the redo log file in the distributed storage, analyzing the unsynchronized redo logs, and storing analysis results into a preset hash table;

reading a physical page corresponding to preset necessary data from the distributed storage to a memory of the standby computing node for version updating, and writing the physical page after version updating into the distributed storage, wherein the necessary data comprises: data dictionary metadata, maximum transaction ID, global unique ID;

according to the updated physical page and the updated rollback log in the physical page, performing version recovery on the necessary data by taking DDL operation as a unit so as to enable the version of the necessary data to be consistent with that of the main computing node when the main computing node fails, wherein after the version recovery of the necessary data, the high-availability database system provides services to the outside;

and according to the updated physical page and the updated rollback log in the physical page, performing version recovery on the table record of the high-availability database system by taking a transaction as a unit so as to keep the version of the table record consistent with that of the main computing node when the main computing node fails.

Optionally, in a third implementation manner of the first aspect of the present invention, when the crash recovery operation is executed, the providing, by the new master computing node, a database service to the outside includes:

when the crash recovery operation is executed, receiving a database service request sent by an external client through an SQL engine where the new main computing node is located, and determining a target table record which needs to be processed by the database service request and a target physical page containing the target table record;

initiating a first acquisition request of the target table record to a table record engine through the SQL engine, and initiating a second acquisition request of a target physical page containing the target table record to a physical page engine through the table record engine;

if the target physical page does not exist in a physical page cache pool of the physical page engine, initiating a third acquisition request of the target physical page to the distributed storage through the physical page engine;

when the distributed storage responds to the third acquisition request and returns the target physical page, intercepting the target physical page for version updating, and storing the target physical page after version updating into the physical page cache pool;

reading the target physical page from the physical page cache pool through the physical page engine and returning the target physical page to the table record engine so as to respond to the second acquisition request;

judging whether unprocessed suspended transactions exist in target table records needing to be read in the target physical page or not by the table record engine;

if no unprocessed suspended transaction exists, returning the target table record to the SQL engine through the table record engine so as to respond to the first acquisition request; if an unprocessed suspended transaction exists, loading the corresponding rollback log, performing rollback or commit processing on the unprocessed suspended transaction by using the loaded rollback log, and returning the target table record to the SQL engine through the table record engine to respond to the first acquisition request;

and processing the database service request through the SQL engine based on the target table record, and returning a processing result to the external client.
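The last decision in the request path above, namely whether a target table record still carries an unprocessed suspended transaction, can be sketched as follows. This is a minimal illustration, not the patented implementation: `Record`, `resolve_record`, the `(trx_id, key)` layout of the rollback log, and the `committed_trx` set are all hypothetical. The text does not say how the engine chooses between rollback and commit, so the sketch assumes the set of committed transaction IDs is already known.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Record:
    key: str
    value: Optional[str]
    pending_trx: Optional[int] = None   # ID of an unresolved transaction, if any


def resolve_record(
    record: Record,
    undo_log: Dict[Tuple[int, str], Optional[str]],
    committed_trx: Set[int],
) -> Record:
    """Return the record in a state that is safe to hand to the SQL engine."""
    if record.pending_trx is None:
        return record                    # no suspended transaction: return as-is
    trx = record.pending_trx
    if trx not in committed_trx:
        # Roll back: restore the version saved in the rollback log.
        record.value = undo_log[(trx, record.key)]
    # In both cases the transaction is now resolved for this record.
    record.pending_trx = None
    return record
```

A record modified by a transaction that never committed is restored to its previous version, while a record modified by a committed transaction keeps its new value; in both cases the marker is cleared so later reads skip this step.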

Optionally, in a fourth implementation manner of the first aspect of the present invention, the analyzing the unsynchronized redo log, and storing an analysis result in a preset hash table includes:

analyzing the unsynchronized redo log to obtain the physical page number of the unsynchronized redo log;

and constructing a key value pair consisting of the physical page number and the redo log by taking the physical page number as a key word and taking the redo log of the physical page corresponding to the physical page number as a value, and inserting the key value pair into a preset hash table.
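The key-value construction described above can be illustrated with a short Python sketch. The `RedoRecord` fields and the function name `build_redo_hash_table` are hypothetical, and the real on-disk redo format is not modeled. The sketch only shows physical page numbers used as keys, with the redo logs of each page kept in log order as the value.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class RedoRecord(NamedTuple):
    position: int      # position of the record in the redo log (assumed increasing)
    page_no: int       # physical page number the record modifies
    payload: bytes     # opaque description of the modification


def build_redo_hash_table(records: Iterable[RedoRecord]) -> Dict[int, List[RedoRecord]]:
    """Group analyzed redo records by physical page number, preserving log order."""
    table: Dict[int, List[RedoRecord]] = defaultdict(list)
    for record in records:
        table[record.page_no].append(record)
    return dict(table)
```

Keeping the per-page lists in log order matters because the redo logs of a page must later be applied in the order in which they were generated.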

Optionally, in a fifth implementation manner of the first aspect of the present invention, when the distributed storage responds to the third acquisition request and returns the target physical page, intercepting the target physical page for version updating, and storing the target physical page after version updating into the physical page cache pool includes:

intercepting the target physical page when the distributed storage responds to the third acquisition request and returns the target physical page;

retrieving the hash table according to the physical page number of the target physical page to obtain a redo log which is not applied in the target physical page;

and updating the version of the target physical page according to the redo log which is not applied in the target physical page, and storing the target physical page with the updated version into the physical page cache pool.
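The intercept-and-update behaviour described above can be sketched in Python as follows. All names (`Page`, `Redo`, `PageEngine`) are hypothetical, the distributed storage is replaced by an in-memory dictionary, and the sketch assumes that each page carries the log position of the last redo log applied to it, which is how it decides which redo logs are "not applied".

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Page:
    page_no: int
    position: int = 0                       # last redo position applied (assumption)
    rows: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Redo:
    position: int
    page_no: int
    key: str
    value: str


class PageEngine:
    def __init__(self, storage: Dict[int, Page], redo_table: Dict[int, List[Redo]]):
        self.storage = storage              # stands in for the distributed storage
        self.redo_table = redo_table        # page number -> redo logs in log order
        self.buffer_pool: Dict[int, Page] = {}

    def get_page(self, page_no: int) -> Page:
        if page_no in self.buffer_pool:     # cache hit: nothing to intercept
            return self.buffer_pool[page_no]
        stored = self.storage[page_no]      # the "third acquisition request"
        page = Page(stored.page_no, stored.position, dict(stored.rows))
        # Intercept the returned page and apply the redo logs not yet applied to it.
        for redo in self.redo_table.pop(page_no, []):
            if redo.position > page.position:
                page.rows[redo.key] = redo.value
                page.position = redo.position
        self.buffer_pool[page_no] = page    # store the updated version in the cache pool
        return page
```

Because the page is brought up to date only when it is first requested, requests for pages that have no pending redo logs are not delayed by recovery work on unrelated pages.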

Optionally, in a sixth implementation manner of the first aspect of the present invention, the implementation method of the high-availability database system further includes:

and after the high-availability database system provides external services, according to the redo log of the physical page corresponding to the hash table, performing version updating on the physical pages of other data except the necessary data so as to keep the versions of the physical pages of the other data consistent with those of the main computing node when the main computing node fails.

A second aspect of the present invention provides an apparatus for implementing a high-availability database system, where the high-availability database system includes at least a main computing node, a distributed storage, and a standby computing node, and a physical configuration specification of the standby computing node is lower than a physical configuration specification of the main computing node, and the apparatus includes:

the detection module is used for detecting whether a fault occurs in the operation process of the main computing node in real time;

the upgrading module is used for thermally upgrading the physical configuration specification of the standby computing node to the same physical configuration specification as the main computing node when the main computing node fails in the operation process;

the crash recovery module is used for executing quick crash recovery operation on the database instance operated by the standby computing node so as to upgrade the standby computing node into a new main computing node;

and the service module is used for adopting the new main computing node to provide database service for the outside when the rapid crash recovery operation is executed.

Optionally, in a first implementation manner of the second aspect of the present invention, the apparatus for implementing a high availability database system further includes:

the establishing module is used for starting a new standby computing node and establishing a main-standby relationship between the new main computing node and the new standby computing node.

Optionally, in a second implementation manner of the second aspect of the present invention, the crash recovery module is specifically configured to:

reading a physical page corresponding to preset necessary data from the distributed storage to a memory of the standby computing node for version updating, and writing the physical page after the version updating into the distributed storage, wherein the necessary data comprises: data dictionary metadata, maximum transaction ID, global unique ID;

Optionally, in a third implementation manner of the second aspect of the present invention, the service module is specifically configured to:

Optionally, in a fourth implementation manner of the second aspect of the present invention, the crash recovery module is further configured to:

analyzing the unsynchronized redo log to obtain the physical page number of the unsynchronized redo log; and constructing a key value pair consisting of the physical page number and the redo log by taking the physical page number as a key word and taking the redo log of the physical page corresponding to the physical page number as a value, and inserting the key value pair into a preset hash table.

Optionally, in a fifth implementation manner of the second aspect of the present invention, the service module is further configured to:

intercepting the target physical page when the distributed storage responds to the third acquisition request and returns the target physical page; retrieving the hash table according to the physical page number of the target physical page to obtain a redo log which is not applied in the target physical page; and updating the version of the target physical page according to the redo log which is not applied in the target physical page, and storing the target physical page with the updated version into the physical page cache pool.

Optionally, in a sixth implementation manner of the second aspect of the present invention, the crash recovery module is further configured to:

A third aspect of the present invention provides a computer device comprising: a memory and at least one processor, the memory having instructions stored therein;

the at least one processor invokes the instructions in the memory to cause the computer device to perform the implementation method of the high availability database system described above.

A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon instructions, which when executed by a processor, implement the method of implementing the high-availability database system described above.

The invention provides a method, a device, equipment and a storage medium for realizing a high-availability database system that comprises a main computing node and a standby computing node with asymmetric hardware configuration specifications, together with distributed storage. The invention realizes a low-cost high-availability database system with asymmetric main and standby nodes: when the main computing node fails, the physical configuration specification of the standby computing node is hot-upgraded to the same physical configuration specification as the main computing node so that the standby computing node is upgraded to a new main computing node, which ensures that the standby computing node can take over quickly after the main computing node fails and continue to provide database service externally. Because the standby computing node adopts a lower physical configuration specification, the operating cost of the database is reduced at the hardware level; at the same time, external service is restored quickly, which greatly reduces the impact of internal faults on external access.

Drawings

FIG. 1 is a flow chart illustrating a method for implementing a highly available database system according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a technical architecture of a high availability database according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an implementation process of an embodiment of the high availability database system of the present invention;

FIG. 4 is a flowchart illustrating an embodiment of performing a fast crash recovery operation on a database instance in the method for implementing a highly available database system according to the present invention;

FIG. 5 is a flowchart illustrating an embodiment of providing database services to the outside when performing a fast crash recovery operation in an implementation method of a highly available database system according to the present invention;

FIG. 6 is a functional block diagram of an implementation apparatus of a high availability database system according to an embodiment of the present invention;

FIG. 7 is a functional block diagram of another embodiment of an apparatus for implementing a high availability database system according to the present invention;

FIG. 8 is a diagram of a hardware configuration of an embodiment of a computer device according to the present invention.

Detailed Description

The embodiment of the invention provides a method, a device, equipment and a storage medium for realizing a high-availability database system. The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be noted that the implementation method of the high-availability database in the present invention is in principle applicable to all types of database systems, for example MySQL, PGSQL and MongoDB. The basic concepts of databases may differ somewhat between types of database systems, and these differences are not elaborated here.

Before introducing the embodiment of the invention, taking a MySQL database system as an example, some basic concepts in the database technology are introduced:

Table: the basic structure in which a database system stores data of the same type. A table stores a plurality of table records (Record), and each table record has a plurality of fields.

Physical Page (Page): the data object in which table records are stored. A physical page has a fixed size (e.g., 16KB), and one physical page can store multiple table records of one table.

Physical Page Buffer Pool (Page Buffer Pool): maintained by the database system in memory and used for caching physical pages.

Transaction: a database transaction contains a plurality of SQL statements, and the transaction mechanism of the database ensures that these SQL statements are either all executed correctly (if every SQL statement in the transaction executes successfully, the transaction is committed) or none of them takes effect (if some SQL statement in the transaction fails, the transaction is not committed, or is rolled back).

Redo log (Redolog): the database system generates a corresponding redo log for each modification of a physical page, and the redo log faithfully records the modification made to the physical page by the transaction.

Rollback log (undo): whenever the database system inserts, modifies or deletes a table record, it generates a corresponding rollback log. The rollback log records the version of the table record before it was operated on by the transaction. After the transaction commits, these rollback logs are deleted; when a transaction rolls back, the database system uses the rollback logs to roll back the modified table records, cancelling the transaction's modifications.

It should be noted that the rollback logs are also stored in the physical pages, and a plurality of rollback logs are stored in one physical page; that is, the physical Page (Page) is not only a data object storing the table Record (Record) but also a data object storing the rollback log (undo).

SQL engine: the data of the SQL engine is cache data that does not need to be persisted, so it does not need to be restored during crash recovery.

Table record engine: the data managed and maintained by the table record engine is table records. During crash recovery, the table records need to be recovered so that they end up consistent, at the transaction level, with their state at the time of the system failure.

Physical page engine: the data managed and maintained by the physical page engine is physical pages. During crash recovery, the physical page data needs to be recovered so that it ends up consistent with its state at the time of the system failure.

Data dictionary: the data managed and maintained by the data dictionary is the table index metadata. The metadata itself exists in the form of table records, stored in physical pages.
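The all-or-nothing behaviour given in the Transaction definition above can be observed with any transactional database. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL. The `account` table, the `CHECK` constraint and the `transfer` helper are illustrative assumptions and are not part of the invention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst as one transaction; return True if committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))
```

If the second `UPDATE` violates the constraint, the first `UPDATE` is undone as well, so the balances are left exactly as they were before the transaction started.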

For ease of understanding, the following detailed description is given in conjunction with the embodiments of the present invention.

The high-availability database system in the embodiment of the invention comprises at least one main computing node, a distributed storage and a standby computing node, wherein the physical configuration specification of the standby computing node is lower than that of the main computing node.

Referring to fig. 1 and fig. 2, fig. 1 is a flowchart illustrating an implementation method of a high-availability database system according to an embodiment of the present invention, and fig. 2 is a technical architecture diagram illustrating a high-availability database system according to an embodiment of the present invention. In this embodiment, the implementation method of the high-availability database system includes the following steps:

S10: detecting in real time whether the main computing node fails during operation;

When the high-availability database system operates normally, the main computing node, which has the high physical configuration specification, receives and processes the read-write requests of clients and writes physical pages (Page) and redo logs (Redolog) into the distributed storage. Meanwhile, whenever the main computing node commits a transaction, it notifies the standby computing node, through a TCP channel and protocol, of the redo log position (Redolog Position) advanced by that transaction. After receiving the latest redo log position, the standby computing node, which has the low physical configuration specification, reads the redo logs (Redolog) up to that position from the distributed storage, then analyzes and caches them; at the same time, it discards the expired redo logs (Redolog) in its own cache according to the redo log position information.

When the main computing node of the high-availability database system fails, it is in an unavailable state and therefore stops reading from and writing to the distributed storage (Page and Redolog are no longer written); at the same time, the redo log position (Redolog Position) notification between the main computing node and the standby computing node also stops.
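The standby-side handling of redo log position notifications during normal operation can be sketched as below. The class and method names are hypothetical, and the redo log file on distributed storage is replaced by an in-memory list. The text does not define when a cached redo log counts as "expired", so the sketch assumes a hypothetical `checkpoint_position` at or below which cached redo logs are no longer needed.

```python
from typing import Dict, List, Tuple


class StandbyRedoCache:
    """Standby-side cache of analyzed redo logs, keyed by log position."""

    def __init__(self, shared_log: List[Tuple[int, str]]):
        # shared_log stands in for the redo log file on distributed storage:
        # a list of (position, record) pairs sorted by position.
        self.shared_log = shared_log
        self.synced_position = 0
        self.cache: Dict[int, str] = {}

    def on_position_notified(self, position: int) -> None:
        """Read the records in (synced_position, position] from shared storage."""
        for pos, record in self.shared_log:
            if self.synced_position < pos <= position:
                self.cache[pos] = record
        self.synced_position = max(self.synced_position, position)

    def discard_expired(self, checkpoint_position: int) -> None:
        """Drop cached records at or below the assumed checkpoint position."""
        for pos in [p for p in self.cache if p <= checkpoint_position]:
            del self.cache[pos]
```

Bounding the cache in this way is one reason the standby can run with far less memory than the main computing node: it holds only recent redo logs, not a full physical page cache pool.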

S20: when the main computing node fails during operation, hot-upgrading the physical configuration specification of the standby computing node to the same physical configuration specification as the main computing node;

When the main computing node fails, the management and control system of the high-availability database first hot-upgrades the physical configuration specification (CPU and memory) of the standby computing node, for example upgrading a standby computing node with a 1-core CPU and 2 GB of memory to the same 16-core CPU and 32 GB of memory as the main computing node; it then sends a standby-to-main promotion instruction to the standby computing node. On receiving the instruction, the standby computing node adjusts the related configuration of the database (such as the size of the physical page cache pool).
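As an illustration of the configuration adjustment that follows the hot upgrade, the sketch below derives a new physical page cache pool size from the upgraded memory and renders it as a MySQL statement. It is only an assumption-laden example: the text does not say how the new size is chosen, so the 50% `fraction` is a hypothetical policy, and `innodb_buffer_pool_size` is used because MySQL is the example system of this description (the variable can be changed at runtime in MySQL 5.7.5 and later). The helper names are hypothetical.

```python
GIB = 1024 ** 3


def buffer_pool_bytes(memory_gb: int, fraction: float = 0.5) -> int:
    """Size of the page cache pool for a node with `memory_gb` of memory.

    The fraction is an assumed policy, not a value taken from the source text.
    """
    return int(memory_gb * GIB * fraction)


def resize_statement(memory_gb: int, fraction: float = 0.5) -> str:
    """Render the statement the promoted standby could issue after the upgrade."""
    size = buffer_pool_bytes(memory_gb, fraction)
    return f"SET GLOBAL innodb_buffer_pool_size = {size}"
```

With the example specifications above, a 2 GB standby would run with a 1 GiB pool under this policy, and the promoted 32 GB node would be resized to 16 GiB.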

S30: performing a fast crash recovery operation on a database instance operated by the standby computing node to upgrade the standby computing node to a new main computing node;

The physical configuration specification of the standby computing node is hot-upgraded while the standby database instance keeps running normally; at the same time, the access permission of the distributed storage is changed from read-only to read-write. The fast crash recovery operation is then performed, and once it is complete the standby database instance becomes the main database instance, so that the standby computing node is upgraded to a new main computing node.

Optionally, in an embodiment, after upgrading the standby computing node to a new master computing node, the method further includes:

After the new main computing node provides service externally, a new standby computing node needs to be started in order to provide disaster recovery for the new main computing node, and a main-standby relationship (a main-standby coordination relationship) is established between the new standby computing node and the new main computing node, wherein the new standby computing node adopts a physical configuration specification lower than that of the new main computing node. Once the main-standby relationship between the new main computing node and the new standby computing node is established, a new low-cost high-availability database system with asymmetric main and standby nodes is formed again.

Referring to fig. 3, fig. 3 is a schematic diagram illustrating an implementation process of the high-availability database system according to the present invention. The implementation process is divided into four stages: (1) the high-availability database system operates normally; (2) the main computing node of the high-availability database system fails; (3) the standby computing node is upgraded to the main computing node; (4) a main-standby relationship is established between a new standby computing node and the new main computing node.

Stage (1) represents the state in which the high-availability database system with asymmetric main and standby nodes operates normally. At this time, the main computing node receives and processes client read-write requests and writes physical pages (Page) and redo logs (Redolog) into the distributed storage; whenever it commits a transaction, it also notifies the standby computing node, through a TCP channel and protocol, of the redo log position (Redolog Position) advanced by that transaction. After receiving the latest redo log position, the standby computing node reads the redo logs (Redolog) up to that position from the distributed storage, then analyzes and caches them; at the same time, it discards the expired redo logs (Redolog) in its own cache according to the redo log position information.

Stage (2) represents the state of the high-availability database system when the main computing node fails. At this time, the main computing node is in an unavailable state because of the failure and stops reading from and writing to the distributed storage (Page and Redolog are no longer written); at the same time, the redo log position (Redolog Position) notification between the main computing node and the standby computing node also stops.

Stage (3) represents the process of upgrading the standby computing node to a new main computing node. When the main computing node fails, the management and control system of the high-availability database first hot-upgrades the configuration (CPU and memory) of the standby computing node and then sends a standby-to-main promotion instruction to the standby computing node. After receiving the instruction, the standby computing node first adjusts the related configuration of the database (such as the size of the physical page cache pool) and then executes the fast crash recovery operation; after the operation has been executed, the standby computing node is upgraded to a new main computing node and can provide external service. The whole process, from detecting the failure of the main computing node to the new main computing node providing external service, is kept within 30 s.

Stage (4) represents a new standby computing node joining the high-availability database system. After the new main computing node provides service externally, a new standby computing node needs to be started in order to provide disaster recovery for the main computing node, and the same main-standby relationship (main-standby coordination relationship) as in stage (1) is established with the new main computing node. Once the new standby computing node is started and the main-standby relationship between the new main computing node and the new standby computing node is established, a new low-cost high-availability database system with asymmetric main and standby nodes is formed again.

S40: and when the rapid crash recovery operation is executed, the new main computing node is adopted to provide database service for the outside.

The invention provides a method for realizing a high-availability database system with asymmetric main and standby nodes. When the main computing node fails, the physical configuration specification of the standby computing node is hot-upgraded to the same physical configuration specification as the main computing node, the distributed storage is mounted to the standby computing node, and a new database instance is started, so that the standby computing node is upgraded to a new main computing node and provides service externally. The method ensures that the standby computing node can take over quickly after the main computing node fails and continue to provide database service normally, and it reduces the operating cost of the database because the standby computing node adopts a lower physical configuration specification.

Referring to fig. 4, fig. 4 is a flowchart illustrating an embodiment of performing a fast crash recovery operation on a database instance in the method for implementing a highly available database system according to the present invention, which includes the following steps:

S401: loading the unsynchronized redo logs from the redo log file in the distributed storage, parsing the unsynchronized redo logs, and storing the parsing results in a preset hash table;

While the high-availability database system is running, the main computing node runs one database instance (such as a mysql server process, hereinafter referred to as the main database instance), and the standby computing node also runs one database instance (such as a mysql server process, hereinafter referred to as the standby database instance). The two database instances share the underlying distributed storage under a sharing constraint: only the main database instance may write to the distributed storage, while the standby database instance can only read from it. In addition, the main database instance and the standby database instance communicate in real time, and the main database instance notifies the standby database instance of the redo log position to which it has advanced. After receiving the position, the standby database instance reads the corresponding redo logs from the underlying distributed storage.

Because the main computing node has failed, the main database instance no longer pushes redo log positions. Therefore, when the main computing node fails, the unsynchronized redo logs must be loaded from the redo log file in the distributed storage, that is, the redo logs from the position last synchronized by the main database instance to the tail of the redo log file in the distributed storage. The loaded unsynchronized redo logs are then parsed, and the parsing results are stored in a preset hash table.
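The sharing constraint and the position notification described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: `SharedStorage` and `StandbyInstance` are hypothetical names, the distributed storage is an in-memory list, and a redo log position is modeled as an index into that list.

```python
class SharedStorage:
    """In-memory stand-in for the shared distributed storage (hypothetical)."""

    def __init__(self):
        self.redo_log = []

    def append(self, role, record):
        # Sharing constraint: only the main database instance may write.
        if role != "main":
            raise PermissionError("standby database instance is read-only")
        self.redo_log.append(record)
        return len(self.redo_log)          # new redo log position

    def read(self, start, end):
        return self.redo_log[start:end]


class StandbyInstance:
    def __init__(self, storage):
        self.storage = storage
        self.position = 0                  # last position received from the main
        self.received = []

    def on_position(self, position):
        # Read only the records between the previous and the new position.
        self.received.extend(self.storage.read(self.position, position))
        self.position = position


storage = SharedStorage()
standby = StandbyInstance(storage)
storage.append("main", "insert r1")
latest = storage.append("main", "update r1")
standby.on_position(latest)
```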
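As a hedged illustration of which redo logs count as unsynchronized, the following Python sketch selects every record after the last synchronized position up to the tail of the log. The `RedoRecord` fields and the function `load_unsynced` are assumptions made for this example; the actual redo log format is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    position: int     # redo log position of this record
    page_no: int      # physical page the record modifies
    change: str       # opaque description of the change


def load_unsynced(redo_file, last_synced):
    """Return the records after the last synchronized position, up to the tail."""
    return [rec for rec in redo_file if rec.position > last_synced]


redo_file = [
    RedoRecord(1, 7, "a"),
    RedoRecord(2, 9, "b"),
    RedoRecord(3, 7, "c"),
    RedoRecord(4, 3, "d"),
]
unsynced = load_unsynced(redo_file, last_synced=2)
```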

Optionally, in an embodiment, the parsing the unsynchronized redo logs and storing the parsing results in a preset hash table includes:

Specifically, in the preset hash table into which the key-value pairs are inserted, the redo logs with the same physical page number are stored in one singly linked list. Given the physical page number of a physical page, all the redo logs that have not yet been applied to that page can be found, so that the standby computing node quickly becomes consistent with the main computing node at the failure time, and crash recovery can continue while the database provides services externally.
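A minimal Python sketch of such a hash table is given below. It assumes each parsed redo log is a `(position, page_no, change)` tuple, and it uses a Python list per key in place of the singly linked list described above; `build_redo_index` is a hypothetical name.

```python
from collections import defaultdict


def build_redo_index(records):
    """Group (position, page_no, change) tuples by physical page number."""
    index = defaultdict(list)
    for position, page_no, change in records:
        # Records for the same page stay in log order within one list.
        index[page_no].append((position, change))
    return index


records = [(3, 7, "set a=1"), (4, 3, "set b=2"), (5, 7, "set a=5")]
index = build_redo_index(records)
pending_for_page_7 = index[7]
```

Keying by physical page number means that a single lookup returns every unapplied change for a page, which is what later allows one page to be brought up to date without scanning the whole log.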

S402: reading the physical pages corresponding to preset necessary data from the distributed storage into the memory of the standby computing node for version updating, and writing the updated physical pages back to the distributed storage, wherein the necessary data comprises: data dictionary metadata, the maximum transaction ID, and the globally unique ID;

Specifically, for the high-availability database system to provide services externally in the normal way, that is, to correctly process the SQL read-write requests initiated by clients, two conditions must be met. The first is the necessary data, such as the data dictionary metadata, the maximum transaction ID, and the globally unique ID. The second is physical pages consistent with the failure time, including Record physical pages and undo physical pages. As long as these two conditions are met, the high-availability database system can provide services externally.

Therefore, the physical pages corresponding to the preset necessary data must be read from the distributed storage into the memory of the standby computing node for version updating, and the updated physical pages must be written back to the distributed storage.

At this point, these physical pages in the distributed storage are consistent with the main computing node at the failure time. The rollback log is implicitly consistent with the failure time as well (because the rollback log is stored in physical pages, it becomes consistent once those pages are consistent).
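The idea of S402, updating only the pages that hold necessary data before anything else, can be sketched as follows. This is an illustrative model, not the actual page format: a page is assumed to be a dictionary with a `version` (the position of the last applied redo log) and a `rows` mapping, and `apply_redo` and `recover_necessary_pages` are hypothetical names.

```python
def apply_redo(page, redo_records):
    """Return a new page version with every newer redo record applied."""
    rows = dict(page["rows"])
    version = page["version"]
    for position, (key, value) in sorted(redo_records):
        if position > version:             # skip changes the page already has
            rows[key] = value
            version = position
    return {"version": version, "rows": rows}


def recover_necessary_pages(storage, redo_index, necessary_pages):
    for page_no in necessary_pages:
        # Read the page into memory, update it, and write it back.
        in_memory = apply_redo(storage[page_no], redo_index.pop(page_no, []))
        storage[page_no] = in_memory


storage = {
    1: {"version": 10, "rows": {"max_trx_id": 100}},   # necessary data
    2: {"version": 10, "rows": {"r1": "old"}},         # ordinary table data
}
redo_index = {
    1: [(11, ("max_trx_id", 101)), (12, ("max_trx_id", 102))],
    2: [(13, ("r1", "new"))],
}
recover_necessary_pages(storage, redo_index, necessary_pages=[1])
```

In this sketch page 2 is deliberately left stale; its redo logs remain in the index for later application.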

S403: according to the updated physical pages and the updated rollback log in the physical pages, performing version recovery on the necessary data in units of DDL operations, so that the version of the necessary data is consistent with that of the main computing node at the failure time, wherein after the version recovery of the necessary data, the high-availability database system provides services externally;

Specifically, according to the physical pages and the rollback log updated in S402, the data dictionary metadata is recovered in units of DDL (Data Definition Language) operations, so that the metadata is consistent at the transaction level with the metadata at the failure time. Once the recovery of the data dictionary metadata is complete, the database system can use correct metadata to execute read-write operations. After the necessary data has been recovered to the state it had when the main computing node failed, the high-availability database system can provide services externally.

Optionally, in an embodiment, the implementation method of the high-availability database system further includes:

S404: after the high-availability database system begins providing services externally, performing version updating on the physical pages of the data other than the necessary data according to the redo logs of the corresponding physical pages in the hash table, so that the versions of the physical pages of the other data are consistent with those of the main computing node at the failure time.

In this embodiment, after the high-availability database system resumes providing services externally, the physical pages of the other data that have not yet been recovered are updated according to the redo logs stored under the corresponding physical page numbers in the hash table. Specifically, a group of threads that apply redo logs to physical pages is started; these threads use the redo logs in the hash table to update the physical pages that have not yet been updated and write the updated physical pages to the distributed storage, so that the physical pages in the distributed storage gradually become consistent with the physical pages at the time the main database instance failed.
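A hedged sketch of such a thread group is shown below, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The page model, the helper `apply_redo`, and the function `background_redo` are assumptions made for illustration; a real system would also coordinate with foreground reads, which this sketch omits.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_redo(page, redo_records):
    """Return a new page version with every newer redo record applied."""
    rows = dict(page["rows"])
    version = page["version"]
    for position, (key, value) in sorted(redo_records):
        if position > version:
            rows[key] = value
            version = position
    return {"version": version, "rows": rows}


def background_redo(storage, redo_index, workers=4):
    """Apply all remaining redo logs, one page per task, and write pages back."""

    def refresh(page_no):
        storage[page_no] = apply_redo(storage[page_no], redo_index[page_no])
        return page_no

    with ThreadPoolExecutor(max_workers=workers) as pool:
        refreshed = sorted(pool.map(refresh, list(redo_index)))
    redo_index.clear()                     # nothing is left to apply
    return refreshed


storage = {
    2: {"version": 10, "rows": {"r1": "old"}},
    3: {"version": 10, "rows": {"r2": "old"}},
}
redo_index = {2: [(13, ("r1", "new"))], 3: [(14, ("r2", "new"))]}
refreshed = background_redo(storage, redo_index)
```

Each task touches exactly one page number, so the tasks in this sketch do not contend for the same page.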

S405: according to the updated physical pages and the updated rollback log in the physical pages, performing version recovery on the table records of the high-availability database system in units of transactions, so that the versions of the table records are consistent with those of the main computing node at the failure time;

The version recovery of the table records of the high-availability database system can be carried out asynchronously, that is, the table records are recovered while services are being provided. The high-availability database can therefore provide services externally in the normal way while the crash recovery operation is in progress, without waiting for a lengthy crash recovery process to complete.

Referring to fig. 5, fig. 5 is a flowchart illustrating an embodiment of providing a database service to the outside when performing a fast crash recovery operation in an implementation method of a highly available database system according to the present invention, which includes the following steps:

S411: while the crash recovery operation is being executed, receiving, through the SQL engine of the new main computing node, a database service request sent by an external client, and determining the target table record to be processed by the database service request and the target physical page containing the target table record;

Specifically, when an external client sends an SQL statement (a database service request) to the SQL engine, the SQL engine generates an execution plan for the statement, executes it, and determines the target table record to be processed and the target physical page containing that record.

S412: initiating, through the SQL engine, a first acquisition request for the target table record to a table record engine, and initiating, through the table record engine, a second acquisition request for the target physical page containing the target table record to a physical page engine;

Specifically, during execution, the SQL engine calls the table record engine to obtain the target table record, and the table record engine calls the physical page engine to obtain the target physical page.

S413: if the target physical page does not exist in the physical page cache pool of the physical page engine, initiating, through the physical page engine, a third acquisition request for the target physical page to the distributed storage;

Specifically, while the crash recovery operation is being executed, the entire database system is in an initialized state and the target physical page is not yet in the physical page cache pool of the physical page engine. The physical page engine therefore issues an acquisition request to the distributed storage to read the physical page.

S414: when the distributed storage responds to the third acquisition request and returns the target physical page, intercepting the target physical page for version updating, and storing the updated target physical page in the physical page cache pool;

Specifically, when the distributed storage responds to the acquisition request of the physical page engine and returns the target physical page (at this point a version from before the failure time), the target physical page is intercepted and updated (to the version held by the main computing node at the failure time), and the updated target physical page is stored in the physical page cache pool.

S415: reading the target physical page from the physical page cache pool through the physical page engine and returning it to the table record engine in response to the second acquisition request;

Specifically, after the updated target physical page has been stored in the physical page cache pool, the physical page engine reads the target physical page from the physical page cache pool and returns it to the table record engine, in response to the second acquisition request of step S412.

S416: determining, through the table record engine, whether the target table record to be read in the target physical page has an unprocessed pending transaction;

Specifically, the table record engine examines the target table record to be read in the target physical page.

If the target table record has no unprocessed pending transaction (that is, it is not identified as a record left in its failure-time state), it is returned directly to the SQL engine for reading and writing.

If the target table record has an unprocessed pending transaction (that is, the record is in its failure-time state), then either the target table record is not returned to the SQL engine (if the SQL engine needs to write), or the table record version from before the failure time is found in the rollback log and returned to the SQL engine (if the SQL engine needs to read).

S417: if no unprocessed pending transaction exists, returning the target table record to the SQL engine through the table record engine in response to the first acquisition request; if an unprocessed pending transaction exists, loading the corresponding rollback log, rolling back or committing the unprocessed pending transaction using the loaded rollback log, and then returning the target table record to the SQL engine through the table record engine in response to the first acquisition request;

S418: processing the database service request through the SQL engine based on the target table record, and returning the processing result to the external client.
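The handling of pending transactions in S416 and S417 can be illustrated with the following Python sketch. It rests on simplifying assumptions that are not taken from the description: each pending transaction touches one record, the rollback log maps a transaction ID to the record value from before the failure time, and a hypothetical set `committed` says which pending transactions should be committed rather than rolled back.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableRecord:
    value: str
    pending_trx: Optional[int] = None      # ID of an unprocessed transaction


def read_version(record, rollback_log):
    """Value a reader sees: the earlier version if a transaction is pending."""
    if record.pending_trx is None:
        return record.value
    return rollback_log[record.pending_trx]


def resolve_pending(record, rollback_log, committed):
    """Commit or roll back the pending transaction, then return the record."""
    if record.pending_trx is None:
        return record
    if record.pending_trx in committed:
        return TableRecord(record.value)                       # commit
    return TableRecord(rollback_log[record.pending_trx])       # rollback


rollback_log = {41: "v1", 42: "v7"}
clean = TableRecord("v3")
dirty = TableRecord("v2", pending_trx=41)
also_dirty = TableRecord("v8", pending_trx=42)

seen_by_reader = read_version(dirty, rollback_log)
rolled_back = resolve_pending(dirty, rollback_log, committed={42})
committed_record = resolve_pending(also_dirty, rollback_log, committed={42})
```

Resolving a record only when a request actually touches it is what lets transaction-level recovery proceed alongside normal service in this sketch.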

Optionally, in an embodiment, step S414 specifically includes:

Specifically, the target physical page is intercepted and the hash table is searched by the physical page number of the target physical page, so that the redo logs not yet applied to the target physical page can be obtained quickly and accurately and then applied to it (bringing the page to the version held by the main computing node at the failure time).

The implementation method of the high-availability database system in this embodiment realizes a high-availability database system that performs crash recovery quickly while providing database services externally, unlike a conventional high-availability database system, which provides services externally only after crash recovery has completed.
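The interception described in S413 to S415 amounts to applying redo logs lazily on a cache miss. The Python sketch below models this under stated assumptions: `PageEngine` is a hypothetical class, the distributed storage and the hash table are plain dictionaries, and a page is a dictionary with a `version` and a `rows` mapping.

```python
class PageEngine:
    """Physical page engine that updates pages as they are first read."""

    def __init__(self, storage, redo_index):
        self.storage = storage             # stand-in for distributed storage
        self.redo_index = redo_index       # page number -> unapplied redo logs
        self.cache_pool = {}               # physical page cache pool
        self.storage_reads = 0

    def get_page(self, page_no):
        if page_no not in self.cache_pool:
            self.storage_reads += 1
            stale = self.storage[page_no]                  # pre-failure version
            pending = self.redo_index.pop(page_no, [])     # hash table lookup
            rows = dict(stale["rows"])
            version = stale["version"]
            for position, (key, value) in sorted(pending):
                if position > version:
                    rows[key] = value
                    version = position
            # Only the updated version ever enters the cache pool.
            self.cache_pool[page_no] = {"version": version, "rows": rows}
        return self.cache_pool[page_no]


storage = {5: {"version": 10, "rows": {"r1": "old"}}}
redo_index = {5: [(11, ("r1", "mid")), (12, ("r1", "new"))]}
engine = PageEngine(storage, redo_index)
first = engine.get_page(5)
second = engine.get_page(5)
```

A second request for the same page is served from the cache pool and never sees the stale version.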

The implementation method of the high-availability database system in the embodiment of the present invention has been described above; an implementation apparatus of the high-availability database system in the embodiment of the present invention is described below. Referring to fig. 6, fig. 6 is a schematic diagram of the functional modules of an implementation apparatus of the high-availability database system in an embodiment of the present invention. In this embodiment, the implementation apparatus of the high-availability database system includes:

a detection module 401, configured to detect in real time whether a failure occurs while the main computing node is running;

an upgrade module 402, configured to, when a failure occurs while the main computing node is running, hot-upgrade the physical configuration specification of the standby computing node to the same physical configuration specification as the main computing node;

a crash recovery module 403, configured to perform a fast crash recovery operation on the database instance run by the standby computing node, so as to upgrade the standby computing node to a new main computing node;

a service module 404, configured to provide database services externally through the new main computing node while the fast crash recovery operation is being executed.

Optionally, in an embodiment, the crash recovery module 403 is specifically configured to:

Optionally, in an embodiment, the crash recovery module 403 is further configured to:

Optionally, in an embodiment, the service module 404 is specifically configured to:

Optionally, in an embodiment, the service module 404 is further configured to:

The embodiment of the invention realizes a low-cost high-availability database system with asymmetric main and standby nodes. When the main computing node fails, the physical configuration specification of the standby computing node is hot-upgraded to the same physical configuration specification as the main computing node, the distributed storage is mounted to the standby computing node, and a new database instance is started, so that the standby computing node is upgraded to a new main computing node and provides services externally. This ensures that the standby computing node can take over quickly after the main computing node fails and can no longer provide database services normally, and it reduces the operating cost of the database because the standby computing node uses a lower physical configuration specification.

Referring to fig. 7, fig. 7 is a schematic diagram of another functional module of the implementation apparatus of the high-availability database system according to the present invention, in this embodiment, the implementation apparatus of the high-availability database system further includes:

an establishing module 405, configured to start a new standby computing node and establish a main-standby relationship between the new main computing node and the new standby computing node.

The implementation apparatus of the high-availability database system in this embodiment realizes a high-availability database system that performs crash recovery quickly while providing database services externally, unlike a conventional high-availability database system, which provides services externally only after crash recovery has completed.

The implementation apparatus of the high-availability database system in the embodiment of the present invention has been described in detail above from the perspective of modular functional entities; the computer device in the embodiment of the present invention is described in detail below from the perspective of hardware processing.

Referring to fig. 8, fig. 8 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention. The computer device 500 may vary considerably depending on configuration or performance, and may include one or more processors (CPUs) 510, a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. The memory 520 and the storage media 530 may be transient or persistent storage. The program stored on the storage medium 530 may include one or more modules (not shown), each of which may include a sequence of instructions operating on the computer device 500. Further, the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the computer device 500, the series of instruction operations stored in the storage medium 530.

The computer device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 8 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components.

The present invention also provides a computer device, which includes a memory and a processor, wherein the memory stores computer readable instructions, and when the computer readable instructions are executed by the processor, the processor executes the steps of the method for implementing the high availability database system in the above embodiments.

The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and which may also be a volatile computer readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the method of implementing the high availability database system.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A method for implementing a high availability database system, the high availability database system comprising at least a main compute node, a distributed storage, and a standby compute node, the physical configuration specification of the standby compute node being lower than the physical configuration specification of the main compute node, the method comprising:

providing database services externally through the new main computing node while the fast crash recovery operation is being executed.

2. The method for implementing a high availability database system according to claim 1, further comprising, after upgrading the standby compute node to a new main compute node:

3. The method for implementing a high availability database system according to claim 1 or 2, wherein the performing a fast crash recovery operation on the database instance run by the standby compute node to upgrade the standby compute node to a new main compute node comprises:

4. The method for implementing a high availability database system according to claim 3, wherein the providing database services externally through the new main compute node while the fast crash recovery operation is being executed comprises:

5. The method for implementing a highly available database system according to claim 3, wherein the parsing the unsynchronized redo log and storing the parsed result into a preset hash table includes:

6. The method for implementing a high availability database system according to claim 4, wherein the intercepting the target physical page for version updating and storing the updated target physical page in the physical page cache pool when the distributed storage responds to the third acquisition request and returns the target physical page comprises:

7. The method for implementing a high availability database system according to claim 3, wherein the method for implementing a high availability database system further comprises:

8. An apparatus for implementing a high availability database system, the high availability database system comprising at least a main compute node, a distributed storage, and a standby compute node, the physical configuration specification of the standby compute node being lower than the physical configuration specification of the main compute node, the apparatus comprising:

9. A computer device, characterized in that the computer device comprises: a memory and at least one processor, the memory having instructions stored therein;

the at least one processor invokes the instructions in the memory to cause the computer device to perform an implementation method of the high availability database system of any one of claims 1-7.

10. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement a method of implementing a high availability database system as claimed in any one of claims 1 to 7.