CN113239120A - Log synchronization method, device, equipment and storage medium - Google Patents
Log synchronization method, device, equipment and storage medium
- Publication number
- CN113239120A (CN202110631080.0A)
- Authority
- CN
- China
- Prior art keywords
- log
- synchronized
- library
- packet
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the invention disclose a log synchronization method, apparatus, device and storage medium. The method comprises the following steps: the master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and a received data modification operation, and sends the log packet to be synchronized to each standby library; each standby library replays the received log packet to be synchronized, modifies its own log packet parameters according to that packet, and then feeds back to the master library the log packet sequence number of the packet and the maximum log sequence value in the packet; the master library updates the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and maximum log sequence value. The technical solution of the embodiments allows log truncation when a database fails, improves the availability and reliability of the database cluster, and increases the running speed of the master library in the database cluster.
Description
Technical Field
The embodiments of the invention relate to the technical field of databases, and in particular to a log synchronization method, apparatus, device and storage medium.
Background
With the rapid development of information technology, databases play an extremely important role in daily work, business activities and everyday life. To prevent downtime from interrupting service, existing applications often build a database cluster: several databases are divided into a master library and standby libraries, so that a standby library can continue to provide service when the master library crashes.
To keep the databases in the cluster synchronized, the master library generates a redo log after each data modification operation (insert, delete or update) and sends the redo log to the standby libraries, which replay it and thereby stay synchronized with the master library. The core idea of a conventional database is that the redo log generated by modifying a data page must be written to disk before the data page itself; writing the data page to disk is deferred until a checkpoint advances or until the data page cache runs short and pages must be evicted.
However, under this mechanism the persistence of a redo log on the local disk may be equivalent to the persistence of the data page, so the redo log cannot be truncated at will: truncating a redo log requires rolling back, in the corresponding data page, the modification that the log records. In a conventional database, once a redo log has been written to disk, the data page may be written to disk immediately afterwards (for example, while a checkpoint is being advanced). If the transaction that modified the data page has already committed, the modification is permanently stored in the database (the durability property of transactions), and even if the database fails, the data page cannot be rolled back after recovery. A conventional database therefore cannot support log truncation. For this reason, when a conventional database cluster supports automatic failover, the failure must be confirmed by third-party arbitration; otherwise a failover may leave the old master library unable to rejoin the cluster. For example, suppose only the network between the master library and the standby libraries fails: the master library keeps running normally and writing local data, while a standby library, seeing the network failure, concludes that the master library has failed and automatically switches to become the new master library. The data most recently written on the old master library does not exist on the new master library, the old master library cannot truncate its logs and roll back the data, and so the old master library cannot rejoin the cluster under the new master library.
Further, to ensure real-time data synchronization between the master library and the standby libraries in a conventional database cluster, the master library must receive responses from all standby libraries after sending a redo log before it can generate a new one, so any standby library that is slow to process data greatly reduces the running speed of the master library.
Disclosure of Invention
The invention provides a log synchronization method, apparatus, device and storage medium, which increase the running speed of the master library in a database cluster and allow log truncation when a database fails, so that the failure of a minority of databases (a network anomaly or a database failure) does not affect normal operation of the database cluster, no third-party arbitration is needed during automatic failover, and the availability and reliability of the database cluster are improved.
In a first aspect, an embodiment of the present invention provides a log synchronization method applied to a database cluster, where the database cluster includes a master library and at least two standby libraries, and the number of standby libraries is an even number; the method comprises the following steps:
the master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and a received data modification operation, and sends the log packet to be synchronized to each standby library;
each standby library replays the received log packet to be synchronized, modifies its own log packet parameters according to that packet, and then feeds back to the master library the log packet sequence number of the packet and the maximum log sequence value in the packet;
the master library updates the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and maximum log sequence value.
Further, the log packet to be synchronized comprises at least a log packet sequence number, a log packet length, a minimum log sequence value, a maximum log sequence value, a current term number, a master library committed log packet sequence number, a master library committed log sequence value and at least one log to be synchronized;
the master library generating a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation comprises:
determining the master library committed log packet sequence number according to the synchronized log packet sequence number array;
determining the master library committed log sequence value according to the synchronized log sequence value array;
modifying the data page according to the received data modification operation, and generating at least one log to be synchronized corresponding to the data modification operation; the log sequence value of the log to be synchronized is the page log sequence value of the modified data page;
adding one to the log packet sequence number of the previous log packet to be synchronized to obtain the log packet sequence number of the current log packet to be synchronized;
determining the minimum log sequence value, the maximum log sequence value and the log packet length according to the logs to be synchronized;
and generating the log packet to be synchronized according to the log packet sequence number, the log packet length, the minimum log sequence value, the maximum log sequence value, the master library committed log packet sequence number, the master library committed log sequence value, the current term number of the master library and the at least one log to be synchronized.
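The packet assembly steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `LogPacket`, the function `build_packet`, the field names, and the choice to measure the packet length as the total payload size in bytes are all assumptions made for the example. The committed values are passed in as arguments because they are derived separately from the two synchronized arrays.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LogPacket:
    """Fields named in the text; the Python identifiers are illustrative."""
    packet_seq: int            # log packet sequence number
    length: int                # log packet length (assumed: total payload bytes)
    min_lsn: int               # minimum log sequence value in the packet
    max_lsn: int               # maximum log sequence value in the packet
    term: int                  # current term number of the master library
    committed_packet_seq: int  # master library committed log packet sequence number
    committed_lsn: int         # master library committed log sequence value
    logs: Tuple[Tuple[int, bytes], ...]  # (log sequence value, redo payload)


def build_packet(prev_packet_seq: int, term: int, committed_packet_seq: int,
                 committed_lsn: int, logs: List[Tuple[int, bytes]]) -> LogPacket:
    if not logs:
        raise ValueError("a packet carries at least one log to be synchronized")
    lsns = [lsn for lsn, _ in logs]
    return LogPacket(
        packet_seq=prev_packet_seq + 1,  # previous packet's number plus one
        length=sum(len(payload) for _, payload in logs),
        min_lsn=min(lsns),
        max_lsn=max(lsns),
        term=term,
        committed_packet_seq=committed_packet_seq,
        committed_lsn=committed_lsn,
        logs=tuple(logs),
    )


packet = build_packet(prev_packet_seq=41, term=3, committed_packet_seq=40,
                      committed_lsn=1007,
                      logs=[(1010, b"upd p7"), (1011, b"ins p9")])
```

Here the new packet gets sequence number 42 and covers log sequence values 1010 to 1011, while still advertising 40 and 1007 as the committed values known to the master library at the time it was built.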
Further, determining the master library committed log packet sequence number according to the synchronized log packet sequence number array comprises:
determining the number of synchronized log packet sequence numbers in the synchronized log packet sequence number array as a first number;
for each synchronized log packet sequence number in the array, determining, as a second number, the count of synchronized log packet sequence numbers in the array that are greater than or equal to that synchronized log packet sequence number;
if the second number is greater than the product of the first number and a preset proportion threshold, determining that synchronized log packet sequence number as a candidate log packet sequence number;
and determining the largest of the candidate log packet sequence numbers as the master library committed log packet sequence number.
Further, determining the master library committed log sequence value according to the synchronized log sequence value array comprises:
determining the number of synchronized log sequence values in the synchronized log sequence value array as a third number;
for each synchronized log sequence value in the array, determining, as a fourth number, the count of synchronized log sequence values in the array that are greater than or equal to that synchronized log sequence value;
if the fourth number is greater than the product of the third number and the preset proportion threshold, determining that synchronized log sequence value as a candidate log sequence value;
and determining the largest of the candidate log sequence values as the master library committed log sequence value.
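Both determinations above follow the same counting rule, so a single helper can serve for log packet sequence numbers and for log sequence values. The sketch below is illustrative only: the function name `committed_value` is hypothetical, and the default threshold of 0.5 (a simple majority) is an assumption, since the text only speaks of a preset proportion threshold.

```python
from typing import Sequence


def committed_value(synced: Sequence[int], ratio: float = 0.5) -> int:
    """Largest value that strictly more than `ratio` of the databases have reached."""
    first_number = len(synced)  # how many databases report a value
    candidates = []
    for value in synced:
        # how many databases have synchronized at least up to `value`
        second_number = sum(1 for other in synced if other >= value)
        if second_number > first_number * ratio:
            candidates.append(value)
    if not candidates:
        raise ValueError("no value satisfies the proportion threshold")
    return max(candidates)


# Master library plus two standby libraries; one standby lags behind.
synced_packet_seqs = [42, 42, 40]
synced_lsns = [1011, 1011, 1007]
committed_packet_seq = committed_value(synced_packet_seqs)
committed_lsn = committed_value(synced_lsns)
```

With three databases and a threshold of 0.5, a value becomes a candidate once two databases have reached it, so the lagging standby does not hold back the committed values. The pairwise count is quadratic in the cluster size, which is harmless for a handful of databases; sorting the array would give the same result.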
Further, the standby library replaying the received log packet to be synchronized comprises:
the standby library modifies its own data pages according to each log to be synchronized in the received log packet, and generates a standby library synchronization log corresponding to that log to be synchronized;
setting the log sequence value of the log to be synchronized as the page log sequence value of the modified standby library data page;
and writing the standby library synchronization log to the local disk of the standby library.
Further, modifying the log packet parameters of the standby library according to the log packet to be synchronized comprises:
after the standby library synchronization log has been written to the local disk of the standby library, setting the current term number as the replayed term number of the standby library;
setting the master library committed log packet sequence number as the standby library committed log packet sequence number;
setting the master library committed log sequence value as the standby library committed log sequence value.
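The replay and parameter update on a standby library can be sketched as below. Everything here is a simplified stand-in: the `Standby` class, the dictionary-shaped packet, the in-memory `disk_log` list that plays the role of the local disk, and the page representation are assumptions for illustration. The one ordering the sketch tries to respect is that the parameters are updated only after the logs have been persisted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Standby:
    # page id -> (page log sequence value, page content)
    pages: Dict[int, Tuple[int, str]] = field(default_factory=dict)
    # stand-in for the standby library's local disk log
    disk_log: List[Tuple[int, int, str]] = field(default_factory=list)
    replayed_term: int = 0
    committed_packet_seq: int = 0
    committed_lsn: int = 0

    def replay(self, packet: dict) -> Tuple[int, int]:
        for lsn, page_id, content in packet["logs"]:
            # apply the same modification; the page takes the log's sequence value
            self.pages[page_id] = (lsn, content)
            # persist the standby library synchronization log
            self.disk_log.append((lsn, page_id, content))
        # parameters are updated only after the logs are on "disk"
        self.replayed_term = packet["term"]
        self.committed_packet_seq = packet["committed_packet_seq"]
        self.committed_lsn = packet["committed_lsn"]
        # feedback for the master library: packet number and maximum log sequence value
        return packet["packet_seq"], max(lsn for lsn, _, _ in packet["logs"])


packet = {
    "packet_seq": 42, "term": 3,
    "committed_packet_seq": 40, "committed_lsn": 1007,
    "logs": [(1010, 7, "row A v2"), (1011, 9, "row B v1")],
}
standby = Standby()
ack = standby.replay(packet)
```

The returned pair is what the standby library would feed back to the master library; note that the standby's committed values (40 and 1007) trail the values it has just replayed (42 and 1011).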
Further, the master library updating the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and maximum log sequence value comprises:
the master library locates, in the synchronized log packet sequence number array, the synchronized log packet sequence number corresponding to the standby library that sent the feedback;
storing the received log packet sequence number in the synchronized log packet sequence number array as the new synchronized log packet sequence number for that standby library;
the master library locates, in the synchronized log sequence value array, the synchronized log sequence value corresponding to that standby library;
storing the received maximum log sequence value in the synchronized log sequence value array as the new synchronized log sequence value for that standby library.
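The array update above amounts to overwriting one slot per database. In the sketch below the two arrays are modeled as dictionaries keyed by a database identifier; the class name `MasterSyncState`, the method `on_feedback` and the identifiers such as `"standby-1"` are hypothetical, and the text does not say how slots are addressed.

```python
from typing import Dict, Iterable


class MasterSyncState:
    """The two synchronized arrays kept by the master library (one slot per database)."""

    def __init__(self, database_ids: Iterable[str]) -> None:
        ids = list(database_ids)
        self.synced_packet_seq: Dict[str, int] = {db: 0 for db in ids}
        self.synced_lsn: Dict[str, int] = {db: 0 for db in ids}

    def on_feedback(self, database_id: str, packet_seq: int, max_lsn: int) -> None:
        # locate the slot of the database that sent the feedback
        if database_id not in self.synced_packet_seq:
            raise KeyError(f"unknown database: {database_id}")
        # overwrite both slots with the values that were fed back
        self.synced_packet_seq[database_id] = packet_seq
        self.synced_lsn[database_id] = max_lsn


state = MasterSyncState(["master", "standby-1", "standby-2"])
state.on_feedback("standby-1", packet_seq=42, max_lsn=1011)
```

Only the slot of the reporting standby library changes; the master library does not need to wait for the other standby library before building its next packet.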
Further, for each database in the database cluster:
when a data page in the database meets a preset flush condition, determining whether the page log sequence value of the data page is greater than the master library committed log sequence value or greater than the standby library committed log sequence value;
if so, not writing the data page to the local disk of the database;
and if not, writing the data page to the local disk of the database.
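The flush rule above reduces to a single comparison per page. The following sketch assumes each database compares against its own committed log sequence value (the master library's on the master, the standby library's on a standby); the function names and the dictionary of dirty pages are illustrative.

```python
from typing import Dict, List


def may_flush(page_lsn: int, committed_lsn: int) -> bool:
    """A page may reach disk only if its log sequence value is already committed."""
    return page_lsn <= committed_lsn


def pages_to_flush(dirty_pages: Dict[int, int], committed_lsn: int) -> List[int]:
    """Return ids of dirty pages (page id -> page log sequence value) safe to write."""
    return sorted(pid for pid, lsn in dirty_pages.items()
                  if may_flush(lsn, committed_lsn))


dirty = {7: 1005, 9: 1011, 12: 1007}
flushable = pages_to_flush(dirty, committed_lsn=1007)
```

Page 9 stays in memory because its modification (log sequence value 1011) has not yet been committed, which is what keeps uncommitted changes off disk and makes later truncation possible without rolling back data pages.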
In a second aspect, an embodiment of the present invention further provides a log synchronization apparatus applied to a database cluster, where the database cluster includes a master library and at least two standby libraries, and the number of standby libraries is an even number; the log synchronization apparatus includes:
a log packet generation module, configured for the master library to generate a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and a received data modification operation, and to send the log packet to be synchronized to each standby library;
a replay feedback module, configured for the standby library to replay the received log packet to be synchronized, modify the log packet parameters of the standby library according to that packet, and then feed back to the master library the log packet sequence number of the packet and the maximum log sequence value in the packet;
and an array updating module, configured for the master library to update the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and maximum log sequence value.
In a third aspect, an embodiment of the present invention further provides a log synchronization device, including:
a database cluster, a storage device, and one or more processors;
the database cluster comprises a master library and at least two standby libraries, and the number of standby libraries is an even number;
the storage device is configured to store one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the log synchronization method as described above in the first aspect.
In a fourth aspect, embodiments of the present invention also provide a storage medium containing computer-executable instructions for performing the log synchronization method of the first aspect as described above when executed by a computer processor.
The embodiments of the invention provide a log synchronization method, apparatus, device and storage medium. The method is applied to a database cluster that comprises a master library and at least two standby libraries, the number of standby libraries being an even number. The master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and a received data modification operation, and sends the log packet to be synchronized to each standby library; each standby library replays the received log packet to be synchronized, modifies its own log packet parameters according to that packet, and then feeds back to the master library the log packet sequence number of the packet and the maximum log sequence value in the packet; the master library updates the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and maximum log sequence value.
With this technical solution, when generating a log packet to be synchronized the master library considers not only the logs corresponding to the data modification operation: it also writes into the packet information determined from the synchronized log packet sequence number array and the synchronized log sequence value array, which reflect the log packet and log synchronization status of each database in the cluster. The packet is sent to each standby library to synchronize data between the standby libraries and the master library, each standby library modifies its log packet parameters according to the packet, and the master library adjusts the two arrays it maintains according to the feedback from the standby libraries. Because the packet carries information determined from the two arrays, and the arrays are updated as feedback arrives from each standby library, the persistence of a log written to a database's local disk is no longer equivalent to the persistence of the corresponding data page, and the master library can continue to generate new logs to be synchronized without waiting for feedback from all standby libraries. Log truncation is therefore allowed when a database fails, the failure of a minority of databases (a network anomaly or a database failure) does not affect normal operation of the database cluster, no third-party arbitration is needed during automatic failover, the availability and reliability of the database cluster are improved, and the running speed of the master library in the database cluster is increased.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a flowchart of a log synchronization method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a log synchronization method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of determining the master library committed log packet sequence number according to the synchronized log packet sequence number array in the second embodiment of the present invention;
FIG. 4 is a flowchart of determining the master library committed log sequence value according to the synchronized log sequence value array in the second embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a log synchronization apparatus in a third embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a log synchronization apparatus in a fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In the description of the present invention, it is to be understood that the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not necessarily used to describe a particular order or sequence, nor are they to be construed as indicating or implying relative importance. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Example one
Fig. 1 is a flowchart of a log synchronization method according to an embodiment of the present invention, where this embodiment is applicable to a situation where log synchronization is performed between master and backup databases in a database cluster, and the method may be executed by a log synchronization apparatus, where the log synchronization apparatus may be implemented by software and/or hardware, and the log synchronization apparatus may be configured on a computing device, and the computing device may be formed by two or more physical entities or may be formed by one physical entity.
It should be noted that the log synchronization method provided in this embodiment is applied to a database cluster that includes a master library and at least two standby libraries, the number of standby libraries being an even number. A database cluster can be understood as a virtual single logical image of a database formed by several database servers; it provides transparent data service to clients and keeps the data in every database of the cluster synchronized. The master library can be understood as the database in the cluster that accepts data modification operations such as insert, delete and update and generates the corresponding redo logs. A standby library can be understood as a database in the cluster that accepts only read-only operations, receives the redo logs synchronized by the master library, and replays them so that its data stays synchronized with the data in the master library.
As shown in fig. 1, a log synchronization method provided in this embodiment specifically includes the following steps:
S101, the master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation, and sends the log packet to be synchronized to each standby library.
The log packet to be synchronized comprises at least a log packet sequence number, a log packet length, a minimum log sequence value, a maximum log sequence value, a current term number, a master library committed log packet sequence number, a master library committed log sequence value and at least one log to be synchronized.
In this embodiment, the synchronized log packet sequence number array can be understood as an array formed by the largest log packet sequence number of each database in the database cluster, where each synchronized log packet sequence number is the sequence number of the log packet most recently synchronized in one standby library or in the master library. The synchronized log sequence value array can be understood as an array formed by the maximum log sequence value of each database in the database cluster, where each synchronized log sequence value is the maximum log sequence value in the log packet most recently synchronized in one standby library or in the master library. Each data modification is identified by a new log sequence number (LSN); LSN values start at 0 and increase without bound, and one LSN value represents one database modification operation.
In this embodiment, the log packet to be synchronized can be understood as a log packet that contains one or more logs generated by data modifications on the master library and that must be sent to the standby libraries, so that each standby library can replay the logs in it and thereby synchronize its data with the master library. The log packet sequence number can be understood as a number assigned to a log packet in order of generation. It is unique and increasing: the sequence number of every log packet is uniquely determined, and a log packet generated later always has a larger sequence number than any log packet generated before it. The minimum log sequence value is the smallest of the log sequence values of the logs in the log packet, and the maximum log sequence value is the largest of them. The current term number can be understood as the term number of the master library currently elected under the RAFT protocol. The term number increases continuously: each time a new master library is elected in the database cluster, its term number is unique and larger than that of the previous master library. The master library committed log packet sequence number can be understood as the largest log packet sequence number that has been written to a majority of the databases in the cluster (that is, the proportion of databases that have written it, counting the master library, exceeds a preset proportion threshold). The master library committed log sequence value can be understood as the largest log sequence value that has been written to a majority of the databases in the cluster (again, the proportion of databases, counting the master library, exceeds the preset proportion threshold). A log to be synchronized can be understood as a redo log that the master library generates when it modifies data and that the standby libraries must replay in order to synchronize their data.
Specifically, on receiving a data modification operation, the master library modifies data in one or more data pages and generates one or more logs to be synchronized. Each modification generates one log to be synchronized, and different logs to be synchronized may modify the same data page or different data pages. From the synchronized log packet sequence number array it maintains, the master library determines the sequence number of the log packet most recently synchronized in each database of the cluster, and from these it determines the largest log packet sequence number that a majority of the databases have synchronized. Likewise, from the synchronized log sequence value array it maintains, the master library determines the maximum log sequence value in the log packet most recently synchronized in each database, and from these it determines the largest log sequence value that a majority of the databases have synchronized. The master library then generates the log packet to be synchronized from the information determined from the two arrays together with the packet's own characteristics, and sends it to every database in the cluster other than itself; that is, the master library generates the log packet to be synchronized and sends it to all standby libraries.
In the embodiments of the invention, the log packet to be synchronized generated by the master library carries the information determined from the synchronized log packet sequence number array and the synchronized log sequence value array, so each standby library learns the log packet sequence number and log sequence value that the master library and a majority of the databases in the cluster have synchronized. When a database fails, the log packets or logs that have not been synchronized by a majority of the databases can be truncated, which ensures that the data in the standby libraries and the master library are fully consistent.
S102, the standby library replays according to the received log packets to be synchronized, and after the log packet parameters of the standby library are modified according to the log packets to be synchronized, the log packet sequence numbers of the log packets to be synchronized and the maximum log sequence values in the log packets to be synchronized are fed back to the main library.
In this embodiment, replay may be understood as an operation in which a database modifies its own data according to an acquired log, the modification being the same as the one recorded in the log. The log packet parameters of the standby library can be understood as the master library parameters recorded in the standby library, together with the parameters describing the logs and log packets whose synchronization the master library has completed.
Specifically, each standby library in the database cluster, after receiving a log packet to be synchronized from the master library, replays the one or more logs to be synchronized contained in the packet against its own data. That is, it modifies its data in the same way as the modification recorded in each log to be synchronized, generates a log corresponding to the modification, and stores the generated log on the local disk of the standby library. After the log has been stored, the standby library updates the recorded master library parameters, and the parameters describing the logs and log packets whose synchronization the master library has completed, according to the parameter information carried in the log packet to be synchronized. It then feeds back to the master library the log packet sequence number of the packet and the maximum of the log sequence values of the logs in the packet, so that the master library can determine the log packet sequence number and the log sequence value up to which the standby library has finished writing logs.
S103, the main library updates the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and the maximum log sequence value.
Specifically, the log packet sequence number and the maximum log sequence value represent the log packet that the standby library has successfully synchronized and the maximum log sequence value among the logs it has successfully synchronized. When the master library receives them, it can therefore update the entry for that standby library in the synchronized log packet sequence number array, replacing the original synchronized log packet sequence number with the received log packet sequence number, and replace the entry for that standby library in the synchronized log sequence value array with the received maximum log sequence value, so that the master library carries the latest parameter information the next time it generates a log packet to be synchronized.
According to the embodiment of the invention, the master library generates the log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation, and sends the log packet to be synchronized to each standby library; the standby library replays the received log packet to be synchronized and, after modifying its log packet parameters according to the packet, feeds back to the master library the log packet sequence number of the packet and the maximum log sequence value in the packet; and the master library updates the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and maximum log sequence value. With this technical scheme, when generating the log packet to be synchronized the master library considers not only the logs corresponding to the data modification operation, but also writes into the packet the information determined from the synchronized log packet sequence number array and the synchronized log sequence value array, which reflect the log packet and log synchronization status of each standby library in the database cluster. The log packet to be synchronized is sent to each standby library to synchronize data between the standby libraries and the master library, the log packet parameters of the standby libraries are modified through the packet, and the two arrays maintained by the master library are adjusted according to the feedback from the standby libraries.
Because the log packet to be synchronized contains the information determined from the synchronized log packet sequence number array and the synchronized log sequence value array, and both arrays are updated as each standby library feeds information back to the master library, each database only needs to persist the log to its local disk rather than the corresponding data pages, and the master library can continue to generate new logs to be synchronized without waiting for feedback from all standby libraries. Log truncation is thereby permitted when a database fails, so that the failure of a minority of databases (a network anomaly or a database fault) does not affect the normal operation of the database cluster, and no third-party arbitration is needed during automatic failover. The availability and reliability of the database cluster are improved, and the running speed of the master library in the cluster is increased.
Example two
Fig. 2 is a flowchart of a log synchronization method provided in the second embodiment of the present invention, which further refines the foregoing optional technical solutions. This embodiment provides a method for determining the submitted log packet sequence number of the master library and the submitted log sequence value of the master library from the synchronized log packet sequence number array and the synchronized log sequence value array, and for transmitting both values to each standby library. After replaying the logs to be synchronized, a standby library can modify its log packet parameters according to these two values, which in turn give the master library and the standby library a basis for judgment when data pages later need to be written to the local disk: only data pages whose logs have been stored by a majority of the libraries are written to the local disk, which guarantees that logs whose data pages have not been written to disk can be truncated safely. Meanwhile, a standby library only needs to feed information back to the master library after storing the logs, and the master library only needs to receive feedback from a majority of the standby libraries in order to advance its submitted log packet sequence number and submitted log sequence value. The running speed of the master library is therefore not affected by faults or network delay in a minority of the standby libraries, and the availability, reliability and running efficiency of the database cluster are improved.
As shown in fig. 2, a log synchronization method provided in the second embodiment of the present invention specifically includes the following steps:
S201, determining the submitted log packet sequence number of the master library according to the synchronized log packet sequence number array.
Further, fig. 3 is a schematic flowchart of a process for determining a sequence number of a committed log packet of a master repository according to a sequence number array of a synchronized log packet according to a second embodiment of the present invention, and as shown in fig. 3, the process specifically includes the following steps:
S2011, determining the number of the synchronized log packet sequence numbers in the synchronized log packet sequence number array as a first number.
S2012, aiming at each synchronized log packet sequence number in the synchronized log packet sequence number array, determining the number of the synchronized log packet sequence numbers which are greater than or equal to the synchronized log packet sequence number in the synchronized log packet sequence number array as a second number.
Specifically, each synchronized log packet sequence number in the synchronized log packet sequence number array is used as a standard sequence number when being processed, the size relationship between each synchronized log packet sequence number in the synchronized log packet sequence number array and the standard sequence number is determined, if the synchronized log packet sequence number is greater than or equal to the standard sequence number, the backup library corresponding to the synchronized log packet sequence number can be considered to complete log replay of the log packet corresponding to the standard sequence number, and the corresponding log is written into a local disk corresponding to the backup library; otherwise, the backup library corresponding to the synchronized log packet serial number is considered to have not completed the log replay of the log packet corresponding to the standard serial number. And determining the number of the synchronized log packet sequence numbers which are greater than or equal to the standard sequence number in the synchronized log packet sequence number array as a second number.
S2013, determining whether the second quantity is greater than the product of the first quantity and a preset ratio threshold, if so, executing step S2014, and if not, returning to execute step S2012.
In this embodiment, the preset proportion threshold may be understood as the ratio, determined by a person skilled in the art according to actual needs, of the number of databases that have replayed the log to the total number of databases in the database cluster. Optionally, the preset proportion threshold may be 50%; that is, when the second number is greater than the product of the first number and the preset proportion threshold, the log packet corresponding to the synchronized log packet sequence number is considered to have been written to the local disk by a majority of the databases in the cluster. The specific proportion threshold may be set according to the actual situation, which is not limited in this embodiment of the present invention.
Specifically, by determining whether the second number is greater than the product of the first number and the preset proportion threshold, it can be determined whether the log packet corresponding to the synchronized log packet sequence number has been written to the local disk by a majority of the databases in the database cluster. If the second number is greater than the product, the log packet is considered to have been written by a majority of the databases, and step S2014 is performed; otherwise, the log packet is considered not yet to have been written by a majority of the databases, and the process returns to step S2012 to determine the second number for another synchronized log packet sequence number in the array, until every synchronized log packet sequence number in the array has been examined.
S2014, determining the synchronized log packet sequence number as a candidate log packet sequence number.
Specifically, when the second number is greater than the product of the first number and the preset proportion threshold, the log packet corresponding to the synchronized log packet sequence number is considered to have been written by a majority of the databases in the database cluster, and the synchronized log packet sequence number can be determined as a candidate log packet sequence number, that is, a candidate for the submitted log packet sequence number of the master library.
S2015, determining the maximum candidate log packet sequence number in the candidate log packet sequence numbers, and determining the maximum candidate log packet sequence number as the submitted log packet sequence number of the master library.
Specifically, the submitted log packet sequence number of the master library is the largest log packet sequence number among those that have been written by a majority of the databases in the database cluster (a set of databases that includes the master library and whose proportion of the cluster is greater than the preset proportion threshold). The candidate log packet sequence numbers can therefore be compared with one another, and the largest of them is determined as the submitted log packet sequence number of the master library.
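Steps S2011 to S2015 can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function name `committed_value`, the parameter names and the use of a plain list for the synchronized array are assumptions made for the example. Because step S202 applies the same counting to the synchronized log sequence value array, the same function can be reused there.

```python
from typing import Optional, Sequence


def committed_value(synced: Sequence[int],
                    ratio_threshold: float = 0.5) -> Optional[int]:
    """Return the largest entry reached by more than ratio_threshold of all entries.

    `synced` stands in for the synchronized log packet sequence number array
    (or the synchronized log sequence value array); the name is hypothetical.
    """
    first_number = len(synced)                      # S2011: size of the array
    candidates = []
    for standard in synced:                         # S2012: take each entry as the standard
        second_number = sum(1 for v in synced if v >= standard)
        if second_number > first_number * ratio_threshold:   # S2013: majority test
            candidates.append(standard)             # S2014: keep as a candidate
    # S2015: the largest candidate is the submitted (committed) value
    return max(candidates) if candidates else None
```

With three databases whose synchronized sequence numbers are 7, 5 and 9, two of the three entries are at least 7, so 7 is the largest value held by a majority, while 9 is held by only one database.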
S202, determining submitted log sequence values of the main library according to the synchronized log sequence value array.
Further, fig. 4 is a schematic flowchart of a process for determining a submitted log sequence value of the master library according to the synchronized log sequence value array according to the second embodiment of the present invention, as shown in fig. 4, specifically including the following steps:
S2021, determining the number of the synchronized log sequence values in the synchronized log sequence value array as a third number.
S2022, for each synchronized log sequence value in the array of synchronized log sequence values, determining a fourth number of synchronized log sequence values in the array of synchronized log sequence values that is greater than or equal to the synchronized log sequence value.
Specifically, each synchronized log sequence value in the synchronized log sequence value array is used as a standard sequence value when being processed, the magnitude relation between each synchronized log sequence value in the synchronized log sequence value array and the standard sequence value is determined, if the synchronized log sequence value is greater than or equal to the standard sequence value, the backup library corresponding to the synchronized log sequence value can be considered to finish replay of all logs smaller than the standard sequence value, and the corresponding log is written into a local disk corresponding to the backup library; otherwise, the backup library corresponding to the synchronized log sequence value is considered to have not completed replaying all logs smaller than the standard sequence value. Determining a fourth number of synchronized log sequence values in the array of synchronized log sequence values that are greater than or equal to the standard sequence value.
S2023, determining whether the fourth quantity is greater than the product of the third quantity and the preset ratio threshold, if yes, executing step S2024, otherwise, returning to execute step S2022.
In this embodiment, the preset proportion threshold is the same as the preset proportion threshold in step S2013, which ensures that the same ratio of databases that have replayed the log to the total number of databases in the cluster is used in both determinations, so that the submitted log sequence value of the master library corresponds to the submitted log packet sequence number of the master library.
Specifically, by determining whether the fourth number is greater than the product of the third number and the preset proportion threshold, it can be determined whether the log corresponding to the synchronized log sequence value, and every log with a smaller log sequence value, has been written to the local disk by a majority of the databases in the database cluster. If the fourth number is greater than the product, the log corresponding to the synchronized log sequence value and all logs with smaller log sequence values are considered to have been written by a majority of the databases, and step S2024 is executed; otherwise, the log corresponding to the synchronized log sequence value is considered not yet to have been written by a majority of the databases, and the process returns to step S2022 to determine the fourth number for another synchronized log sequence value in the array, until every synchronized log sequence value in the array has been examined.
S2024, determining the synchronized log sequence value as a candidate log sequence value.
Specifically, when the fourth number is greater than the product of the third number and the preset proportion threshold, it may be considered that all logs corresponding to the synchronized log sequence value have been written into the local disk by the majority of databases in the database cluster, and at this time, the synchronized log sequence value may be determined as a candidate log sequence value, and is used as a candidate for the submitted log sequence value of the master library.
S2025, determining the maximum candidate log sequence value in the candidate log sequence values, and determining the maximum candidate log sequence value as the submitted log sequence value of the main library.
Specifically, the submitted log sequence value of the master library is the largest log sequence value among those that have been written by a majority of the databases in the database cluster (a set of databases that includes the master library and whose proportion of the cluster is greater than the preset proportion threshold). The candidate log sequence values can therefore be compared with one another, and the largest of them is determined as the submitted log sequence value of the master library.
S203, modifying the data page according to the received data modification operation, and generating at least one log to be synchronized corresponding to the data modification operation.
And the log sequence value of the log to be synchronized is the page log sequence value of the modified data page.
It should be clear that a data modification operation may modify the same data page several times, that each modification of the same data page has a different log sequence value, and that the log sequence values are unique and increasing: each time a modification is made, the log sequence value assigned to the log of that modification is one greater than the previous one.
Specifically, the master library determines, according to the received data modification operation, the data page to be modified and the data to be modified within it, and modifies the same data page or different data pages one or more times accordingly. Each modification generates a redo log containing the content of that modification, and the redo log is determined as a log to be synchronized; when several modifications occur, a corresponding number of logs to be synchronized are generated. Further, the log sequence value is incremented by one for each log generated, and the generated log sequence value is used as the page log sequence value of the modified data page. That is, when the same data page is modified several times, its page log sequence value changes with each log sequence value, and the page log sequence value finally saved is the log sequence value of the log generated by the last modification of that data page.
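The relationship between log sequence values and page log sequence values in step S203 can be illustrated with a small sketch. The class names `RedoLog` and `MasterPages`, the string used to describe a change and the starting value of 1 are assumptions made for the example and do not come from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RedoLog:
    lsn: int        # log sequence value assigned to this modification
    page_id: int    # data page that was modified
    change: str     # placeholder for the recorded modification


@dataclass
class MasterPages:
    next_lsn: int = 1                                     # assumed starting value
    page_lsn: Dict[int, int] = field(default_factory=dict)

    def modify(self, page_id: int, change: str) -> RedoLog:
        """Apply one modification and return the log to be synchronized."""
        lsn = self.next_lsn
        self.next_lsn += 1                # each modification increments the value by one
        self.page_lsn[page_id] = lsn      # page log sequence value follows the latest log
        return RedoLog(lsn, page_id, change)
```

Modifying page 1 twice and page 2 once yields logs with values 1, 2 and 3, and page 1 ends with page log sequence value 2, the value of its last modification.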
S204, adding one to the log packet sequence number of the previous log packet to be synchronized to determine the log packet sequence number of the log packet to be synchronized.
Specifically, the log packet needs to include the log packet sequence number corresponding to the log packet when being generated, and the log packet sequence number has a unique increasing characteristic, that is, the log packet sequence number is increased one by one along with the generation sequence. Therefore, when the log packet to be synchronized is generated, the log packet sequence number of the previous log packet to be synchronized can be added by one, and the log packet sequence number obtained after adding by one is determined as the log packet sequence number of the log packet to be synchronized.
S205, determining a minimum log sequence value, a maximum log sequence value and a log packet length according to each log to be synchronized.
Specifically, according to one or more determined logs to be synchronized, a log sequence value corresponding to each log to be synchronized is determined, a minimum value in the log sequence values is determined as a minimum log sequence value, a maximum value in the log sequence values is determined as a maximum log sequence value, and a log packet length is determined according to the size of a magnetic disk space occupied by each log to be synchronized.
S206, generating a log packet to be synchronized according to the log packet sequence number, the log packet length, the minimum log sequence value, the maximum log sequence value, the submitted log packet sequence number of the master library, the submitted log sequence value of the master library, the current term number of the master library and at least one log to be synchronized.
Specifically, the log packet sequence number, the log packet length, the minimum log sequence value, the maximum log sequence value, the submitted log packet sequence number of the master library, the submitted log sequence value of the master library and the current term number of the master library are combined to form the packet header of the log packet to be synchronized; all logs to be synchronized are arranged in order to form the packet body; and the packet header and the packet body are combined to generate the log packet to be synchronized.
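Steps S204 to S206 can be sketched as follows. The names `PacketHeader` and `build_packet`, the representation of a log as a `(log sequence value, payload bytes)` pair and the choice to compute the packet length as the total payload size are assumptions made for illustration; the embodiment only states that the length is determined from the space occupied by the logs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PacketHeader:
    packet_seq: int            # log packet sequence number (S204)
    packet_len: int            # log packet length (S205)
    min_lsn: int               # minimum log sequence value (S205)
    max_lsn: int               # maximum log sequence value (S205)
    committed_packet_seq: int  # submitted log packet sequence number of the master
    committed_lsn: int         # submitted log sequence value of the master
    term: int                  # current term number of the master


def build_packet(prev_packet_seq: int,
                 logs: List[Tuple[int, bytes]],
                 committed_packet_seq: int,
                 committed_lsn: int,
                 term: int) -> Tuple[PacketHeader, List[Tuple[int, bytes]]]:
    """Assemble header and body of one log packet to be synchronized (S206)."""
    if not logs:
        raise ValueError("a log packet needs at least one log to be synchronized")
    body = sorted(logs, key=lambda log: log[0])      # logs arranged in sequence
    header = PacketHeader(
        packet_seq=prev_packet_seq + 1,              # previous sequence number plus one
        packet_len=sum(len(payload) for _, payload in body),
        min_lsn=body[0][0],
        max_lsn=body[-1][0],
        committed_packet_seq=committed_packet_seq,
        committed_lsn=committed_lsn,
        term=term,
    )
    return header, body
```

The submitted values placed in the header would come from the majority determination of steps S201 and S202.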
S207, sending the log packet to be synchronized to each standby library.
S208, the standby library correspondingly modifies the standby library data page according to each log to be synchronized in the received log packet to be synchronized, and generates a standby library synchronization log corresponding to the log to be synchronized.
Specifically, for each log to be synchronized in the log packet to be synchronized, the backup library determines a backup library data page which needs to be subjected to data modification in the backup library according to the log to be synchronized, and modifies data in the backup library data page according to data modification operations recorded in the log to be synchronized, and the backup library can generate a backup library synchronization log with the same content as the log to be synchronized after performing data modification according to the log to be synchronized.
S209, determining the log sequence value of the log to be synchronized as the page log sequence value of the standby library data page.
Specifically, the log sequence value of the log to be synchronized is the same as the page log sequence value of the data page modified in the master library. Determining the log sequence value of the log to be synchronized as the page log sequence value of the standby library data page therefore ensures that the same data page has the same page log sequence value after modification in the master library and in the standby library, which in turn facilitates the judgment made when a data page is to be written to the local disk.
S210, writing the standby library synchronization log to the local disk of the standby library.
Specifically, after the standby library generates the standby library synchronization log, the standby library synchronization log may be written into the standby library local disk corresponding to the standby library, or after each to-be-synchronized log in the same to-be-synchronized log packet is replayed, each correspondingly generated standby library synchronization log may be written into the corresponding standby library local disk.
S211, setting the current term number as the replayed term number of the standby library.
Specifically, after all standby library synchronization logs have been written to the local disk of the standby library, the standby library is considered to have completed the replay of the currently received log packet to be synchronized; that is, the term up to which the standby library has replayed is the term of the master library that generated the packet. The current term number contained in the log packet to be synchronized can therefore be set as the replayed term number of the standby library.
Further, the master library in the database cluster must not commit logs of a previous term, because doing so may cause data inconsistency. Therefore, when the standby library replays a log packet, the term number contained in the replayed log packet is compared with the term number announced for the elected master library. If the term number contained in the replayed log packet is smaller than the term number of the elected master library, the standby library is considered to be replaying a historical log packet, and it does not need to send information about the replayed log packet to the master library; this prevents the data confusion that would result if the master library mistakenly committed logs not generated in its own term. It should be clear that the current term number used to modify the replayed term number of the standby library may differ from the term number of the elected master library: a new master election may already have taken place while the standby library was replaying the received log packet, in which case the term number of the master library is greater than the current term number contained in the log packet, and the standby library replaying that packet should not send information about it to the master library of the current term.
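The term comparison described above reduces to a single check before the standby library reports back. The function and parameter names below are hypothetical and serve only to illustrate the rule.

```python
def should_feed_back(packet_term: int, elected_master_term: int) -> bool:
    """Decide whether a standby library reports a replayed packet to the master.

    A packet whose term number is smaller than the term number of the elected
    master is a historical packet, so no feedback is sent for it.
    """
    return packet_term >= elected_master_term
```

For example, a standby library that replays a packet of term 2 after a master of term 3 has been elected stays silent, so the master of term 3 cannot count that packet toward its submitted values.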
S212, the serial number of the submitted log packet of the master library is set as the serial number of the submitted log packet of the standby library.
Specifically, in order to keep the data information of the master library and the standby library synchronized, the submitted log packet sequence number determined by the master library is set as the submitted log packet sequence number of the standby library, so that the standby library knows which log packets have been persisted by a majority of the databases in the database cluster. The standby library must not truncate log packets whose sequence numbers are less than or equal to its submitted log packet sequence number, whereas log packets in the standby library whose sequence numbers are greater than its submitted log packet sequence number can be truncated safely.
S213, setting the submitted log sequence value of the master library as the submitted log sequence value of the backup library.
Specifically, in order to keep the data information of the master library and the standby library synchronized, the submitted log sequence value determined by the master library is set as the submitted log sequence value of the standby library, so that the standby library knows the log sequence values of the logs that have been persisted by a majority of the databases in the database cluster and thus has a definite criterion for deciding whether a data page in the standby library may be written to the local disk. This avoids writing to the local disk any data page whose logs have not been written by a majority of the databases, and thereby ensures that logs not written by a majority of the databases can be truncated safely (the corresponding data pages have been modified but not persisted to disk).
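The two criteria that steps S212 and S213 give the standby library can be written as simple predicates. This is a sketch under the assumption that a data page may be written to disk once its page log sequence value does not exceed the submitted log sequence value; the function names are hypothetical.

```python
def can_truncate_packet(packet_seq: int, committed_packet_seq: int) -> bool:
    """A log packet may be truncated only if a majority has not yet persisted it."""
    return packet_seq > committed_packet_seq


def can_flush_page(page_lsn: int, committed_lsn: int) -> bool:
    """A data page may be written to local disk only if the log of its latest
    modification has been persisted by a majority (assumed criterion)."""
    return page_lsn <= committed_lsn
```

Keeping unsubmitted modifications out of the on-disk data pages is what makes truncating the corresponding logs safe: nothing on disk depends on them.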
S214, feeding back the log packet sequence number of the log packet to be synchronized and the maximum log sequence value in the log packet to be synchronized to the master library.
Specifically, after the standby library has finished replaying the log packet to be synchronized and writing the standby library synchronization logs, it is considered to have finished processing the whole log packet to be synchronized. The standby library then needs to report its processing status to the master library, so it feeds back the log packet sequence number of the log packet to be synchronized and the maximum log sequence value in that packet, from which the master library determines the largest log packet sequence number and the largest log sequence value that the standby library has written to disk.
S215, the main library determines the synchronized log packet sequence number corresponding to the standby library in the synchronized log packet sequence number array.
Specifically, since the synchronized log packet sequence number array maintained by the master library includes the synchronized log packet sequence number corresponding to each backup library at the current time, it can also be understood that different backup libraries have different subscripts in the synchronized log packet sequence number array, and the synchronized log packet sequence number corresponding to the backup library can be determined in the synchronized log packet sequence number array according to the subscripts.
S216, storing the received log packet sequence number as a new synchronized log packet sequence number into the synchronized log packet sequence number array.
Specifically, replaying a log packet on the standby library advances that standby library's synchronized log packet sequence number, so the log packet sequence number received by the master library is larger than the one previously stored for that standby library in the synchronized log packet sequence number array. The master library stores the received log packet sequence number into the array as the new synchronized log packet sequence number, which records the standby library's latest log synchronization progress. This lets the master library advance, so that the master library committed log packet sequence number carried in the next log packet to be synchronized is updated in time.
S217, the main library determines the synchronized log sequence value corresponding to the standby library in the synchronized log sequence value array.
Specifically, the synchronized log sequence value array maintained by the master library holds the maximum synchronized log sequence value of each standby library at the current time. In other words, each standby library has its own subscript in the array, and the synchronized log sequence value corresponding to a given standby library can be looked up by that subscript.
S218, storing the received maximum log sequence value as a new synchronized log sequence value into the synchronized log sequence value array.
Specifically, replaying a log packet on the standby library advances that standby library's synchronized log sequence value, so the maximum log sequence value in the log packet most recently replayed by the standby library is necessarily greater than the synchronized log sequence value previously stored for that standby library in the synchronized log sequence value array. The master library therefore stores the received maximum log sequence value into the array as the new synchronized log sequence value of the standby library, which records the standby library's latest log synchronization progress. This lets the master library advance: as it progresses, it can write more data pages to its local disk, and the master library committed log sequence value carried in the next log packet to be synchronized is updated in time.
Furthermore, the log packet sequence number and the maximum log sequence value that the standby library feeds back after writing a log packet to its local disk belong to the same written log packet. Therefore, at any given subscript of the synchronized log packet sequence number array and the synchronized log sequence value array of the master library, the synchronized log packet sequence number and the synchronized log sequence value correspond to the same log packet and the same point in time. If a log packet sequence number in the synchronized log packet sequence number array is determined to be the committed log packet sequence number, the synchronized log sequence value at the same subscript is necessarily determined to be the committed log sequence value (the two pieces of committed information advance together), so the committed log packet sequence number and the committed log sequence value correspond to the same log packet each time the master library advances them.
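Steps S215 to S218 can be illustrated with a minimal Python sketch. The class name `MasterSyncState`, its field names, and the guard against stale feedback are assumptions made for illustration; the description only states that the received values replace the values stored at the standby library's subscript.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MasterSyncState:
    """Hypothetical holder for the two arrays maintained by the master library."""
    synced_packet_seq: List[int]  # synchronized log packet sequence number per standby
    synced_lsn: List[int]         # synchronized log sequence value per standby

    def on_feedback(self, standby_index: int, packet_seq: int, max_lsn: int) -> None:
        # S215/S217: locate the standby library's slot by its subscript.
        # Assumption: feedback that does not advance the slot is ignored.
        if packet_seq <= self.synced_packet_seq[standby_index]:
            return
        # S216/S218: both slots are updated from the same feedback message,
        # so the same subscript always describes the same log packet.
        self.synced_packet_seq[standby_index] = packet_seq
        self.synced_lsn[standby_index] = max_lsn


state = MasterSyncState(synced_packet_seq=[3, 3], synced_lsn=[30, 30])
state.on_feedback(standby_index=1, packet_seq=4, max_lsn=45)
```

Updating both arrays in one call mirrors the statement above that the two values at a given subscript refer to the same log packet.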
Further, while the primary and secondary libraries perform data synchronization, the primary and secondary libraries also perform data page disk refreshing, that is, the modified data page is written into the local disk corresponding to the database. Specifically, the method comprises the following steps for each database in the database cluster:
A. When a data page in the database meets the preset disk-flushing condition, judge whether the page log sequence value corresponding to the data page is greater than the master library committed log sequence value (for the master library) or greater than the standby library committed log sequence value (for a standby library); if so, execute step B, otherwise execute step C.
In this embodiment, the preset disk-flushing condition is a predetermined condition for determining whether a data page in the database should be written to the local disk. Optionally, the preset disk-flushing condition may be that the data page buffer of the database is insufficient, or that the checkpoint of the database advances; it may also be another preset condition, which is not limited in the embodiment of the present invention. A checkpoint is a specific position in the online log file of the database, and the log sequence value of the log stored at this position is called the checkpoint sequence value. The data page modifications corresponding to logs before this position (logs whose log sequence value is less than or equal to the checkpoint sequence value) must already have been persisted to disk, whereas those corresponding to logs after this position (logs whose log sequence value is greater than the checkpoint sequence value) are not guaranteed to have been persisted. Each time the checkpoint advances, it tries to move to the position of the most recently written log in the online log file; whether it can do so depends on whether the data page modifications corresponding to those logs can be persisted to disk. If the disk-flushing rule allows them to be written, the checkpoint advances to the log write position of those modifications; otherwise the checkpoint does not advance.
Specifically, when a data page in the database meets the preset disk-flushing condition, the database is considered to need to write the data page to its local disk. Before doing so, it must determine whether the log corresponding to the data page has been written to a majority of the databases in the database cluster. If the data page belongs to the master library, it is determined whether the page log sequence value of the data page is greater than the master library committed log sequence value. If the page log sequence value is greater than the master library committed log sequence value, the modification log corresponding to the data page is considered not yet written to a majority of the databases, and step B is executed; if the page log sequence value is less than or equal to the master library committed log sequence value, the modification log corresponding to the data page is considered to have been written to a majority of the databases, and step C is executed. Similarly, if the data page belongs to a standby library, the same judgment is made against the standby library committed log sequence value.
B. The data page is not written to the local disk of the database.
C. The data page is written to the local disk of the database.
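The decision in steps A to C reduces to a single comparison, sketched below in Python. The function names and the dictionary of dirty pages are hypothetical; `committed_lsn` stands for the master library committed log sequence value on the master library or the standby library committed log sequence value on a standby library.

```python
from typing import Dict, List, Tuple


def may_flush_page(page_lsn: int, committed_lsn: int) -> bool:
    """Step A: a page may be flushed only if its log reached a majority."""
    return page_lsn <= committed_lsn


def flush_pass(dirty_pages: Dict[int, int], committed_lsn: int) -> Tuple[List[int], List[int]]:
    """Split dirty pages (page id -> page log sequence value) into written and held back."""
    written: List[int] = []
    held: List[int] = []
    for page_id, page_lsn in sorted(dirty_pages.items()):
        if may_flush_page(page_lsn, committed_lsn):
            written.append(page_id)   # step C: write to the local disk
        else:
            held.append(page_id)      # step B: keep in memory only
    return written, held


written, held = flush_pass({1: 10, 2: 25, 3: 20}, committed_lsn=20)
```

In this example pages 1 and 3 are written, while page 2 (page log sequence value 25) stays in memory until the committed log sequence value reaches 25.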
Under the data page flushing rule above, data page modifications corresponding to logs whose log sequence values are greater than the committed log sequence value are never persisted to disk, so those logs can be safely truncated (because the committed log packet sequence number and the committed log sequence value advance together, logs in packets beyond the committed log packet sequence number can likewise be safely truncated), and the data page modifications corresponding to the truncated logs can be safely rolled back.
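As a rough illustration of the truncation rule, the following Python sketch splits a local log at the committed log sequence value. The representation of a log entry as a `(log sequence value, description)` tuple and the function name are assumptions; how the corresponding in-memory page modifications are rolled back is not shown.

```python
from typing import List, Tuple

LogEntry = Tuple[int, str]  # hypothetical: (log sequence value, description)


def truncate_uncommitted(log: List[LogEntry], committed_lsn: int) -> Tuple[List[LogEntry], List[LogEntry]]:
    """Return (kept, truncated): entries above committed_lsn are cut off."""
    kept = [entry for entry in log if entry[0] <= committed_lsn]
    truncated = [entry for entry in log if entry[0] > committed_lsn]
    return kept, truncated


local_log = [(38, "update A"), (40, "update B"), (41, "update C"), (43, "update D")]
kept, truncated = truncate_uncommitted(local_log, committed_lsn=40)
```

Because the flushing rule never persists pages whose page log sequence value exceeds the committed value, the modifications behind the truncated entries exist only in memory and can be undone.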
Because logs can be safely truncated, automatic failover of the database cluster can be supported without third-party arbitration. Even if a failover occurs when only a network anomaly has arisen between the master library and the standby libraries, the logs newly written on the old master library during the anomaly have not been synchronized to a majority of the databases (the old master library cannot synchronize logs to the standby libraries while the network is abnormal), so those logs can be truncated. After the network recovers, the old master library truncates the newly written logs and rolls back the corresponding data page modifications, which makes it consistent with the data of the new master library; the old master library can then automatically switch to a standby library and rejoin the database cluster.
In the technical scheme of this embodiment, the master library determines, from the synchronized log packet sequence number array and the synchronized log sequence value array that it maintains, the master library committed log packet sequence number and the master library committed log sequence value, which identify the logs already written to a majority of the databases in the database cluster, and sends them to each standby library inside the log packet to be synchronized. After replaying the log packet, the standby library modifies its log packet parameters according to these two values. This defines which logs in the standby library may be truncated and provides the criterion for deciding, whenever a data page later needs to be written to the local disk, whether it may be written. As a result, the master library and the standby libraries persist to local disk only the data pages whose logs have been stored in a majority of the databases, which guarantees that logs not written to a majority of the databases can be safely truncated. Meanwhile, a standby library only needs to send feedback to the master library after it has stored the logs, and the master library only needs feedback from a majority of the standby libraries to advance the master library committed log packet sequence number and the master library committed log sequence value, so a failure or network delay in a minority of standby libraries does not affect the running speed of the master library.
In this technical scheme, logs on the master library or a standby library whose log sequence values are greater than that library's committed log sequence value are allowed to be truncated. At the same time, a data page whose page log sequence value is greater than the library's committed log sequence value is not allowed to be written to disk. The data page modifications associated with truncated logs are therefore never persisted to disk and can be rolled back, so the logs can be safely truncated. Furthermore, because logs can be truncated, automatic failover can be completed without third-party arbitration, and after the network recovers the old master library can automatically switch to a standby library and rejoin the database cluster, which improves the availability, reliability and operating efficiency of the database cluster.
EXAMPLE III
Fig. 5 is a schematic structural diagram of a log synchronization apparatus according to a third embodiment of the present invention, where the log synchronization apparatus is applied to a database cluster, the database cluster includes a master library and at least two backup libraries, and the number of the backup libraries is an even number. The log synchronization apparatus includes: a log packet generation module 31, a replay feedback module 32 and an array update module 33.
The log packet generating module 31 is configured to generate a log packet to be synchronized by the master library according to the synchronized log packet sequence number array, the synchronized log sequence value array, and the received data modification operation, and send the log packet to be synchronized to each backup library; the replay feedback module 32 is configured to replay the received log packets to be synchronized, modify log packet parameters of the standby library according to the log packets to be synchronized, and feed back the log packet sequence numbers of the log packets to be synchronized and the maximum log sequence values in the log packets to be synchronized to the main library; and the array updating module 33 is configured to update the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and the maximum log sequence value.
In the technical scheme of this embodiment, when the master library generates a log packet to be synchronized, it considers not only the logs corresponding to the data modification operation but also writes into the packet the information determined from the synchronized log packet sequence number array and the synchronized log sequence value array, which record the log packet and log synchronization progress of each standby library in the database cluster. The log packet to be synchronized is sent to each standby library to synchronize its data with the master library, the log packet parameters of the standby library are modified according to the packet, and the two arrays maintained by the master library are adjusted according to the feedback from the standby libraries. Log truncation is thus permitted when a database fails, so a failure of a minority of databases (a network anomaly or a database fault) does not affect the normal operation of the database cluster, and no third-party arbitration is needed for automatic failover. This improves the availability and reliability of the database cluster and the running speed of the master library in the database cluster.
Further, the log packet to be synchronized includes at least a log packet sequence number, a log packet length, a minimum log sequence value, a maximum log sequence value, a current tenure number, a master library committed log packet sequence number, a master library committed log sequence value, and at least one log to be synchronized.
Optionally, the log packet generating module 31 includes:
and the master library submitted sequence number determining unit is used for determining the master library submitted log packet sequence number according to the synchronized log packet sequence number array.
And the master library submitted sequence value determining unit is used for determining the submitted log sequence value of the master library according to the synchronized log sequence value array.
The log generation unit is used for modifying the data page according to the received data modification operation and generating at least one log to be synchronized corresponding to the data modification operation; the log sequence value of the log to be synchronized is the page log sequence value of the modified data page.
And the log packet sequence number determining unit is used for adding one to the log packet sequence number of the previous log packet to be synchronized to determine the log packet sequence number of the log packet to be synchronized.
And the log packet parameter determining unit is used for determining a minimum log sequence value, a maximum log sequence value and a log packet length according to each log to be synchronized.
And the log packet generating unit is used for generating the log packets to be synchronized according to the log packet sequence number, the log packet length, the minimum log sequence value, the maximum log sequence value, the master library submitted log packet sequence number, the master library submitted log sequence value, the current tenure number of the master library and at least one log to be synchronized.
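The fields assembled by the units above can be pictured with the following Python sketch. The class and field names are hypothetical, and computing the packet length as the total payload size in bytes is an assumption, since this section does not define how the length is measured.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LogRecord:
    lsn: int        # log sequence value (equals the page log sequence value of the modified page)
    payload: bytes  # hypothetical encoded page modification


@dataclass(frozen=True)
class LogPacket:
    packet_seq: int
    length: int
    min_lsn: int
    max_lsn: int
    tenure: int
    committed_packet_seq: int
    committed_lsn: int
    logs: Tuple[LogRecord, ...]


def build_packet(prev_packet_seq: int, logs: Sequence[LogRecord], tenure: int,
                 committed_packet_seq: int, committed_lsn: int) -> LogPacket:
    if not logs:
        raise ValueError("a log packet carries at least one log to be synchronized")
    lsns = [record.lsn for record in logs]
    return LogPacket(
        packet_seq=prev_packet_seq + 1,                     # previous sequence number plus one
        length=sum(len(record.payload) for record in logs),  # assumption: total payload bytes
        min_lsn=min(lsns),
        max_lsn=max(lsns),
        tenure=tenure,
        committed_packet_seq=committed_packet_seq,
        committed_lsn=committed_lsn,
        logs=tuple(logs),
    )


packet = build_packet(7, [LogRecord(41, b"ab"), LogRecord(43, b"cde")],
                      tenure=2, committed_packet_seq=6, committed_lsn=40)
```

Here the new packet receives sequence number 8 and spans log sequence values 41 to 43, while carrying the committed values (6 and 40) that the standby libraries adopt after replay.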
Optionally, the submitted sequence number determining unit of the master library is specifically configured to:
determining the number of the synchronized log packet sequence numbers in the synchronized log packet sequence number array as a first number;
determining the number of the synchronized log packet sequence numbers which are greater than or equal to the synchronized log packet sequence numbers in the synchronized log packet sequence number array as a second number aiming at each synchronized log packet sequence number in the synchronized log packet sequence number array;
if the second number is larger than the product of the first number and a preset proportion threshold value, determining the synchronized log packet sequence number as a candidate log packet sequence number;
and determining the maximum candidate log packet sequence number in the candidate log packet sequence numbers, and determining the maximum candidate log packet sequence number as the submitted log packet sequence number of the master library.
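The counting rule above can be sketched in Python as follows; the same rule is described for the synchronized log sequence value array. The function name is hypothetical, and the default proportion threshold of 0.5 is an assumption, as this section only calls it a preset proportion threshold. Which libraries have slots in the array is taken as given.

```python
from typing import Optional, Sequence


def committed_value(synced: Sequence[int], ratio_threshold: float = 0.5) -> Optional[int]:
    """Largest array entry that more than ratio_threshold of the entries have reached."""
    first_number = len(synced)
    candidates = []
    for value in synced:
        # Second number: how many entries are greater than or equal to this one.
        second_number = sum(1 for other in synced if other >= value)
        if second_number > first_number * ratio_threshold:
            candidates.append(value)
    # The maximum candidate becomes the committed value; None if the array is empty.
    return max(candidates) if candidates else None


result = committed_value([5, 3, 4])
```

For the array `[5, 3, 4]`, only one entry has reached 5, two have reached 4, and three have reached 3, so with a threshold of 0.5 the candidates are 3 and 4 and the result is 4.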
Optionally, the submitted sequence value determining unit of the master library is specifically configured to:
determining the number of the synchronized log sequence values in the synchronized log sequence value array as a third number;
determining, for each synchronized log sequence value in the array of synchronized log sequence values, a fourth number of synchronized log sequence values in the array of synchronized log sequence values that is greater than or equal to the synchronized log sequence value;
if the fourth number is larger than the product of the third number and a preset proportion threshold value, determining the synchronized log sequence value as a candidate log sequence value;
determining a maximum candidate log sequence value of the candidate log sequence values, and determining the maximum candidate log sequence value as a submitted log sequence value of the master library.
Optionally, the replay feedback module 32 includes:
The replay unit is used for the standby library to correspondingly modify the standby library data page according to each log to be synchronized in the received log packet to be synchronized, and to generate the standby library synchronization log corresponding to the log to be synchronized; to determine the log sequence value of the log to be synchronized as the page log sequence value of the standby library data page; and to write the standby library synchronization log to the local disk of the standby library.
The parameter modification unit is used for setting the current tenure number as the replayed tenure number of the standby library after the standby library synchronization log is written to the local disk of the standby library; setting the master library committed log packet sequence number as the standby library committed log packet sequence number; and setting the master library committed log sequence value as the standby library committed log sequence value.
And the feedback unit is used for feeding back the log packet sequence number of the log packet to be synchronized and the maximum log sequence value in the log packet to be synchronized to the master library.
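The cooperation of the replay unit, the parameter modification unit and the feedback unit can be sketched in Python as below. The `Standby` class, the dictionary layout of the packet, and the `(log sequence value, page id, content)` form of each log are assumptions chosen for illustration; appending to `disk_log` stands in for writing the standby library synchronization log to the local disk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Standby:
    """Hypothetical in-memory model of a standby library."""
    pages: Dict[int, Tuple[bytes, int]] = field(default_factory=dict)  # page id -> (content, page LSN)
    disk_log: List[Tuple[int, int, bytes]] = field(default_factory=list)
    replayed_tenure: int = 0
    committed_packet_seq: int = 0
    committed_lsn: int = 0

    def handle_packet(self, packet: dict) -> Tuple[int, int]:
        # Replay: apply each log to its data page and record the page log sequence value.
        for lsn, page_id, content in packet["logs"]:
            self.pages[page_id] = (content, lsn)
            self.disk_log.append((lsn, page_id, content))  # stands in for the disk write
        # Parameter modification: adopt the values carried in the packet.
        self.replayed_tenure = packet["tenure"]
        self.committed_packet_seq = packet["committed_packet_seq"]
        self.committed_lsn = packet["committed_lsn"]
        # Feedback: packet sequence number and maximum log sequence value.
        return packet["packet_seq"], packet["max_lsn"]


standby = Standby()
ack = standby.handle_packet({
    "packet_seq": 8, "max_lsn": 43, "tenure": 2,
    "committed_packet_seq": 6, "committed_lsn": 40,
    "logs": [(41, 1, b"a"), (43, 2, b"b")],
})
```

After this call the standby library holds pages with page log sequence values 41 and 43 in memory, but its committed log sequence value is 40, so under the flushing rule neither page may yet be written to disk.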
Optionally, the array updating module 33 includes:
and the packet sequence number determining unit is used for determining the synchronized log packet sequence number corresponding to the standby library in the synchronized log packet sequence number array by the main library.
And the packet sequence number updating unit is used for storing the received log packet sequence number as a new synchronized log packet sequence number into the synchronized log packet sequence number array.
And the sequence value determining unit is used for determining the synchronized log sequence value corresponding to the standby library in the synchronized log sequence value array by the main library.
And the sequence value updating unit is used for storing the received maximum log sequence value as a new synchronized log sequence value into the synchronized log sequence value array.
Optionally, the log synchronizing apparatus further includes:
the data page writing module is used for determining whether a page log sequence value corresponding to a data page is greater than a main library submitted log sequence value of a main library or a standby library submitted log sequence value of a standby library when the data page in the database meets a preset disk refreshing condition aiming at each database in the database cluster; if so, not writing the data page into a local disk of the database; and if not, writing the data page into a local disk of the database.
The log synchronization device provided by the embodiment of the invention can execute the log synchronization method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
EXAMPLE IV
Fig. 6 is a schematic structural diagram of a log synchronization device according to a fourth embodiment of the present invention. The log synchronization device includes: a database cluster 40, a processor 41, a storage device 42, an input device 43, and an output device 44. The database cluster 40 in the log synchronization device includes a main library 401 and at least two backup libraries 402, the number of the backup libraries 402 is an even number, and fig. 6 takes two backup libraries 402 as an example. The number of the processors 41 in the log synchronization device may be one or more, and one processor 41 is taken as an example in fig. 6. The number of the storage devices 42 in the log synchronization device may be one or more, and one storage device 42 is taken as an example in fig. 6. The database cluster 40, the processor 41, the storage device 42, the input device 43, and the output device 44 of the log synchronization device may be connected by a bus or other means, and fig. 6 illustrates an example of the connection by the bus. In an embodiment, the log synchronization device may be a computer, a notebook, or a smart tablet.
The storage device 42 is a computer-readable storage medium, and can be used to store software programs, computer-executable programs, and modules, such as program instructions/modules (e.g., the log packet generation module 31, the replay feedback module 32, and the array update module 33) corresponding to the log synchronization apparatus according to any embodiment of the present application. The storage device 42 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created according to use of the device, and the like. Further, the storage device 42 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the storage device 42 may further include memory located remotely from the processor 41, which may be connected to the device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 43 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device; it may also be a camera for acquiring images or a sound pickup device for acquiring audio data. The output device 44 may include an audio device such as a speaker. It should be noted that the specific composition of the input device 43 and the output device 44 can be set according to actual conditions.
The processor 41 executes various functional applications of the device and data processing by executing software programs, instructions, and modules stored in the storage device 42, that is, implements the log synchronization method described above.
The log synchronization device provided by the foregoing can be used to execute the log synchronization method provided by any of the foregoing embodiments, and has corresponding functions and advantageous effects.
EXAMPLE V
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a log synchronization method, including:
the master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation, and sends the log packet to be synchronized to each backup library;
the standby database replays according to the received log packets to be synchronized, and feeds back the log packet sequence numbers of the log packets to be synchronized and the maximum log sequence values in the log packets to be synchronized to the main database after modifying the log packet parameters of the standby database according to the log packets to be synchronized;
and updating the synchronized log packet sequence number array and the synchronized log sequence value array by the main library according to the received log packet sequence number and the maximum log sequence value.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also perform related operations in the log synchronization method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the above log synchronization apparatus, the included units and modules are merely divided according to functional logic, and the division is not limited to the above as long as the corresponding functions can be implemented; in addition, the specific names of the functional units are only for convenience of distinguishing them from each other, and are not used to limit the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (11)
1. A log synchronization method is characterized in that the log synchronization method is applied to a database cluster, the database cluster comprises a main library and at least two standby libraries, and the number of the standby libraries is an even number; the method comprises the following steps:
the master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation, and sends the log packet to be synchronized to each backup library;
the standby library is replayed according to the received log packets to be synchronized, and after the log packet parameters of the standby library are modified according to the log packets to be synchronized, the log packet sequence numbers of the log packets to be synchronized and the maximum log sequence values in the log packets to be synchronized are fed back to the main library;
and the master library updates the synchronized log packet sequence number array and the synchronized log sequence value array according to the received log packet sequence number and the maximum log sequence value.
2. The method of claim 1, wherein the log packets to be synchronized comprise at least a log packet sequence number, a log packet length, a minimum log sequence value, a maximum log sequence value, a current tenure number, a master library committed log packet sequence number, a master library committed log sequence value, and at least one log to be synchronized;
the master library generates a log packet to be synchronized according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation, and the method comprises the following steps:
determining the submitted log packet sequence number of the master library according to the synchronized log packet sequence number array;
determining submitted log sequence values of a main library according to the synchronized log sequence value array;
modifying a data page according to the received data modification operation, and generating at least one log to be synchronized corresponding to the data modification operation; the log sequence value of the log to be synchronized is a page log sequence value of the modified data page;
adding one to the log packet sequence number of the last log packet to be synchronized to determine the log packet sequence number of the log packet to be synchronized;
determining a minimum log sequence value, a maximum log sequence value and a log packet length according to each log to be synchronized;
and generating a log packet to be synchronized according to the log packet sequence number, the log packet length, the minimum log sequence value, the maximum log sequence value, the master library submitted log packet sequence number, the master library submitted log sequence value, the current tenure number of the master library and at least one log to be synchronized.
3. The method of claim 2, wherein determining master library committed log packet sequence numbers from the synchronized array of log packet sequence numbers comprises:
determining the number of the synchronized log packet sequence numbers in the synchronized log packet sequence number array as a first number;
for each synchronized log packet sequence number in the synchronized log packet sequence number array, determining, as a second number, the number of synchronized log packet sequence numbers in the array that are greater than or equal to that synchronized log packet sequence number;
if the second number is larger than the product of the first number and a preset proportion threshold, determining the synchronized log packet sequence number as a candidate log packet sequence number;
and determining the maximum candidate log packet sequence number among the candidate log packet sequence numbers as the master library committed log packet sequence number.
4. The method of claim 2, wherein determining a master library committed log sequence value from the array of synchronized log sequence values comprises:
determining the number of synchronized log sequence values in the synchronized log sequence value array as a third number;
for each synchronized log sequence value in the synchronized log sequence value array, determining, as a fourth number, the number of synchronized log sequence values in the array that are greater than or equal to that synchronized log sequence value;
if the fourth number is greater than the product of the third number and a preset proportion threshold, determining the synchronized log sequence value as a candidate log sequence value;
and determining the maximum candidate log sequence value among the candidate log sequence values as the master library committed log sequence value.
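Claims 3 and 4 apply the same counting rule to two different arrays, so one helper can illustrate both. The sketch below is an assumption-laden illustration: the function name `committed_value` is hypothetical, and the default proportion threshold of 0.5 (a simple majority) is an assumed value, since the claims only call it a preset proportion threshold.

```python
from typing import List, Optional


def committed_value(synced: List[int], ratio: float = 0.5) -> Optional[int]:
    """Return the largest value that more than `ratio` of the entries have reached.

    `synced` is either the synchronized log packet sequence number array
    (claim 3) or the synchronized log sequence value array (claim 4).
    """
    total = len(synced)  # the "first number" / "third number"
    candidates = []
    for value in synced:
        # the "second number" / "fourth number": entries at or beyond this value
        reached = sum(1 for other in synced if other >= value)
        if reached > total * ratio:
            candidates.append(value)
    # An empty array yields no candidate; this sketch reports that as None.
    return max(candidates) if candidates else None
```

For example, with the array `[10, 8, 7]` and a threshold of 0.5, the value 10 has been reached by only one entry, while 8 has been reached by two of three, so 8 is the committed value.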
5. The method of claim 2, wherein the standby library replaying the received log packet to be synchronized comprises:
the standby library modifying a standby library data page according to each log to be synchronized in the received log packet to be synchronized, and generating a standby library synchronization log corresponding to the log to be synchronized;
determining the log sequence value of the log to be synchronized as the page log sequence value of the standby library data page;
and writing the standby library synchronization log into a local disk of the standby library.
6. The method according to claim 5, wherein the modifying the log packet parameters of the standby library according to the log packet to be synchronized comprises:
after the standby library synchronization log is written into the local disk of the standby library, setting the current term number as the replayed term number of the standby library;
setting the master library committed log packet sequence number as the standby library committed log packet sequence number;
and setting the master library committed log sequence value as the standby library committed log sequence value.
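The standby-side steps of claims 5 and 6 can be sketched together. Everything named here is hypothetical: the `Standby` class, the dictionary keys (`page_id`, `data`, `lsn`, and so on), and the in-memory list `disk_log` that stands in for the standby library's local disk. The point of the sketch is only the ordering: pages are modified and logs persisted first, and the parameters are updated afterwards.

```python
class Standby:
    """In-memory stand-in for a standby library (illustrative only)."""

    def __init__(self):
        self.pages = {}         # page_id -> {"data": ..., "page_lsn": ...}
        self.disk_log = []      # assumed stand-in for the local disk log file
        self.replayed_term = 0  # replayed term number
        self.committed_seq = 0  # standby library committed log packet sequence number
        self.committed_lsn = 0  # standby library committed log sequence value

    def replay(self, packet):
        # Claim 5: apply each log, stamp the page with the log's LSN,
        # then persist the corresponding standby synchronization log.
        for log in packet["logs"]:
            self.pages[log["page_id"]] = {"data": log["data"], "page_lsn": log["lsn"]}
            self.disk_log.append(dict(log))
        # Claim 6: only after the logs are written, adopt the packet's parameters.
        self.replayed_term = packet["term"]
        self.committed_seq = packet["committed_seq"]
        self.committed_lsn = packet["committed_lsn"]
        # Values fed back to the master library: packet sequence number and maximum LSN.
        return packet["seq"], packet["max_lsn"]
```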
7. The method of claim 1, wherein updating the synchronized array of log packet sequence numbers and the synchronized array of log sequence values by the master library based on the received log packet sequence numbers and the maximum log sequence value comprises:
the master library determines the synchronized log packet sequence number corresponding to the standby library in the synchronized log packet sequence number array;
taking the received log packet sequence number as a new synchronized log packet sequence number and storing the new synchronized log packet sequence number into the synchronized log packet sequence number array;
the master library determines the synchronized log sequence value corresponding to the standby library in the synchronized log sequence value array;
storing the received maximum log sequence value as the new synchronized log sequence value in the array of synchronized log sequence values.
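The array update of claim 7 amounts to overwriting the slot that belongs to the reporting standby library. In the short sketch below, the arrays are modelled as dictionaries keyed by a hypothetical standby identifier; the claim does not specify how slots are mapped to standby libraries, so that keying is an assumption.

```python
def apply_feedback(synced_seqs, synced_lsns, standby_id, seq, max_lsn):
    """Record one standby library's feedback in the master library's arrays."""
    # Replace the standby's previous synchronized log packet sequence number.
    synced_seqs[standby_id] = seq
    # Replace the standby's previous synchronized log sequence value.
    synced_lsns[standby_id] = max_lsn
```

After each update, the master library can recompute its committed values from the two arrays before building the next packet.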
8. The method of any one of claims 1-7, further comprising:
for each database in the database cluster,
when a data page in the database meets a preset disk flushing condition, determining whether the page log sequence value corresponding to the data page is greater than the master library committed log sequence value, if the database is the master library, or greater than the standby library committed log sequence value, if the database is a standby library;
if so, not writing the data page into a local disk of the database;
and if not, writing the data page into a local disk of the database.
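The flush guard of claim 8 reduces to one comparison per page. The sketch below assumes that each database compares against its own committed log sequence value (the master library's on the master, the standby library's on a standby); the function names and the `dirty_pages` mapping of page identifier to page LSN are hypothetical.

```python
def may_flush(page_lsn: int, committed_lsn: int) -> bool:
    """A page may reach disk only if its LSN does not exceed the committed LSN."""
    return page_lsn <= committed_lsn


def flush_eligible(dirty_pages: dict, committed_lsn: int) -> list:
    """Return the ids of dirty pages that may be written to the local disk."""
    return sorted(
        page_id
        for page_id, page_lsn in dirty_pages.items()
        if may_flush(page_lsn, committed_lsn)
    )
```

Pages held back by this check carry changes whose logs are not yet committed, so they stay in memory until the committed value advances past their LSN.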
9. A log synchronization device, applied to a database cluster, wherein the database cluster comprises a master library and at least two standby libraries, and the number of standby libraries is an even number; the log synchronization device comprises:
the log packet generation module is used for generating log packets to be synchronized by the master library according to the synchronized log packet sequence number array, the synchronized log sequence value array and the received data modification operation, and sending the log packets to be synchronized to the standby libraries;
the replay feedback module is used for replaying the received log packet to be synchronized by the standby library, modifying the log packet parameters of the standby library according to the log packet to be synchronized, and feeding back the log packet sequence number of the log packet to be synchronized and the maximum log sequence value in the log packet to be synchronized to the master library;
and the array updating module is used for updating the synchronized log packet sequence number array and the synchronized log sequence value array by the master library according to the received log packet sequence number and the maximum log sequence value.
10. A log synchronization apparatus, comprising a database cluster, a storage device, and one or more processors;
the database cluster comprises a master library and at least two standby libraries, and the number of standby libraries is an even number;
the storage device is configured to store one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the log synchronization method of any of claims 1-8.
11. A storage medium containing computer-executable instructions, which when executed by a computer processor, operate to perform the log synchronization method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110631080.0A CN113239120B (en) | 2021-06-07 | 2021-06-07 | Log synchronization method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110631080.0A CN113239120B (en) | 2021-06-07 | 2021-06-07 | Log synchronization method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113239120A (en) | 2021-08-10 |
CN113239120B (en) | 2023-08-18 |
Family
ID=77137002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110631080.0A Active CN113239120B (en) | 2021-06-07 | 2021-06-07 | Log synchronization method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113239120B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0336549A2 (en) * | 1988-04-08 | 1989-10-11 | International Business Machines Corporation | Database recovery in a computer system after a system crash |
EP0625752A2 (en) * | 1993-05-21 | 1994-11-23 | International Business Machines Corporation | Method and means for archiving in a transaction management system |
US9223843B1 (en) * | 2013-12-02 | 2015-12-29 | Amazon Technologies, Inc. | Optimized log storage for asynchronous log updates |
US9552242B1 (en) * | 2013-09-25 | 2017-01-24 | Amazon Technologies, Inc. | Log-structured distributed storage using a single log sequence number space |
CN110442560A (en) * | 2019-08-14 | 2019-11-12 | 上海达梦数据库有限公司 | Method, apparatus, server and storage medium are recurred in a kind of log |
CN111858501A (en) * | 2020-06-02 | 2020-10-30 | 武汉达梦数据库有限公司 | Log reading method and data synchronization system based on log analysis synchronization |
CN112416654A (en) * | 2020-11-26 | 2021-02-26 | 上海达梦数据库有限公司 | Database log replay method, device, equipment and storage medium |
CN112612647A (en) * | 2020-12-29 | 2021-04-06 | 上海达梦数据库有限公司 | Log parallel replay method, device, equipment and storage medium |
CN112637284A (en) * | 2020-12-09 | 2021-04-09 | 北京金山云网络技术有限公司 | Redo log storage method and device, electronic device and storage medium |
Non-Patent Citations (2)
Title |
---|
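Xu Zhen; Zhang Min: "Research on log write-out dependency of pending-commit transactions in kernelized multilevel secure database systems" (核心化多级安全数据库系统未决提交事务日志写出依赖研究), Chinese Journal of Computers (计算机学报), no. 08 * |
Hu Jun, Xu Qunlan: "Analysis of the Oracle redo log mechanism" (Oracle重做日志机制分析), Computer and Information Technology (电脑与信息技术), no. 05 * |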
Also Published As
Publication number | Publication date |
---|---|
CN113239120B (en) | 2023-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12007846B2 (en) | Manifest-based snapshots in distributed computing environments | |
US8127174B1 (en) | Method and apparatus for performing transparent in-memory checkpointing | |
CN109542682B (en) | Data backup method, device, equipment and storage medium | |
US7987158B2 (en) | Method, system and article of manufacture for metadata replication and restoration | |
US6397351B1 (en) | Method and apparatus for rapid data restoration including on-demand output of sorted logged changes | |
CN102253869B (en) | Scalable fault-tolerant Metadata Service | |
EP3206128B1 (en) | Data storage method, data storage apparatus, and storage device | |
US20150378830A1 (en) | Use of replicated copies to improve database backup performance | |
CN110543386B (en) | Data storage method, device, equipment and storage medium | |
JP2006277208A (en) | Backup system, program and backup method | |
CN113268472B (en) | Distributed data storage system and method | |
US10983709B2 (en) | Methods for improving journal performance in storage networks and devices thereof | |
CN116680256B (en) | Database node upgrading method and device and computer equipment | |
CN111046024A (en) | Data processing method, device, equipment and medium for sharing storage database | |
CN104750755A (en) | Method and system for recovering data after switching between main database and standby database | |
CN111382011B (en) | File data access method and device and computer readable storage medium | |
US8015375B1 (en) | Methods, systems, and computer program products for parallel processing and saving tracking information for multiple write requests in a data replication environment including multiple storage devices | |
CN113609090B (en) | Data storage method and device, computer readable storage medium and electronic equipment | |
CN102025758A (en) | Method, device and system fore recovering data copy in distributed system | |
CN115955488B (en) | Distributed storage copy cross-machine room placement method and device based on copy redundancy | |
CN111858173A (en) | Data recovery method, device, equipment and medium | |
CN113239120A (en) | Log synchronization method, device, equipment and storage medium | |
AU2016218367A1 (en) | Externalized execution of input method editor | |
CN110569231B (en) | Data migration method, device, equipment and medium | |
CN113986878A (en) | Data writing method, data migration device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||