CN111858502A - Log reading method and log reading synchronization system based on log analysis synchronization - Google Patents

Publication number
CN111858502A
Authority
CN
China
Prior art keywords
log
reading
file
read
serial number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010491618.8A
Other languages
Chinese (zh)
Inventor
孙峰
付铨
彭青松
刘启春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dameng Database Co Ltd
Original Assignee
Wuhan Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dameng Database Co Ltd
Priority to CN202010491618.8A
Publication of CN111858502A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses a log reading method and a log reading synchronization system based on log analysis synchronization. The log reading method comprises the following steps: acquiring the log sequence number LSN1 of the log record to be read, and obtaining the read position in the log file from LSN1; acquiring the current maximum log sequence number LSN2 of the source database, and obtaining the write position in the log file from LSN2; obtaining, from the read position and the write position, the difference in log pages between LSN1 and LSN2 in the log file; and reading and synchronizing the log according to a policy determined by that page difference. By calculating the distance between the log sequence number of the log record to be read and the current maximum log sequence number in the database system, the invention dynamically adjusts the number of log pages read each time the log file is read, so that log reading and synchronization follow this policy and read-write conflicts on the log are prevented.

Description

Log reading method and log reading synchronization system based on log analysis synchronization
Technical Field
The invention belongs to the technical field of data synchronization, and particularly relates to a log reading method and a log reading synchronization system based on log analysis synchronization.
Background
In a scheme based on log analysis synchronization, a database log capture process is deployed on the source database. It continuously scans and reads the online logs of the database so that newly written logs are captured as soon as possible for data synchronization. With this architecture, a database log file is opened by two or more processes at the same time: the database opens the online log file to write logs, and the data synchronization process opens the same file to read logs. If a database write and a data synchronization read hit the same file offset of the same online log file concurrently, a read-write conflict can occur, in which the new data the database is about to write is overwritten by old data that the data synchronization process has read from the log file. The main cause is that, in some environments, the operating system or hardware device uses the same cache for read and write access to the same offset of the same file. The log written by the database goes into the cache first; before it has been flushed to disk, the read operation of the data synchronization thread loads the old data at the same position in the log file into the cache and overwrites the new data, so the data finally flushed to the log file is the old data. Testing shows that this phenomenon occurs with high probability in virtual machine environments. Once it occurs, it causes data synchronization errors; more seriously, if the database restarts after a failure, it cannot recover from the damaged REDO log. How to resolve the read-write conflict that arises when the database process and the data synchronization process access the online log file at the same time is therefore a technical problem that urgently needs to be solved.
In view of this, overcoming the above deficiencies of the prior art is a problem that needs to be solved in this technical field.
Disclosure of Invention
The invention provides a log reading method and a log reading synchronization system based on log analysis synchronization. The aim is to calculate the distance between the log sequence number of the log record to be read and the current maximum log sequence number in the database system, and thereby to dynamically adjust the number of log pages read each time the log file is read, so that log reading and synchronization follow a policy that prevents read-write conflicts on the log.
To achieve the above object, according to an aspect of the present invention, there is provided a log reading method based on log analysis synchronization, the log reading method including:
acquiring the log sequence number LSN1 of the log record to be read, and obtaining the read position in the log file from LSN1;
acquiring the current maximum log sequence number LSN2 of the source database, and obtaining the write position in the log file from LSN2;
obtaining the difference in log pages between LSN1 and LSN2 in the log file based on the read position and the write position in the log file;
and reading and synchronizing the log according to a policy determined by the difference in log pages.
Preferably, acquiring the log sequence number LSN1 of the log record to be read and obtaining the read position in the log file from LSN1 includes:
acquiring the log sequence number LSN1 of the log record to be read;
parsing LSN1 to obtain the corresponding log file number ID1 and log page number P1, which together give the read position in the log file.
Preferably, acquiring the current maximum log sequence number LSN2 of the source database and obtaining the write position in the log file from LSN2 includes:
acquiring the current maximum log sequence number LSN2 of the source database;
parsing LSN2 to obtain the corresponding log file number ID2 and log page number P2, which together give the write position in the log file.
Preferably, the source database side comprises a plurality of log files, and each log file contains the same total number of log pages P0;
obtaining the difference in log pages between LSN1 and LSN2 in the log file based on the read position and the write position in the log file includes:
for the write position in the log file, multiplying the log file number ID2 by the total number of log pages P0, and adding the log page number P2 to obtain a first log page count;
for the read position in the log file, multiplying the log file number ID1 by the total number of log pages P0, and adding the log page number P1 to obtain a second log page count;
and subtracting the second log page count from the first log page count to obtain the difference in log pages between LSN1 and LSN2 in the log file.
Preferably, reading and synchronizing the log according to a policy determined by the difference in log pages includes:
when the difference in log pages is equal to 0, keeping the log sequence number of the log record to be read unchanged, so that reading later resumes from the same read position in the log file;
when the difference in log pages is greater than 0 and smaller than the maximum number of log pages allowed to be read, setting the number of pages N to read this time equal to the difference in log pages, and reading the next N log pages starting from the read position in the log file, so as to read and synchronize the log records;
and when the difference in log pages is not less than the maximum number of log pages allowed to be read, setting the number of pages N to read this time equal to the maximum number of log pages allowed to be read, and reading the next N log pages starting from the read position in the log file, so as to read and synchronize the log records.
Preferably, reading and synchronizing the log according to a policy determined by the difference in log pages further comprises:
after the log records of the N log pages have been read, advancing the log sequence number of the log record to be read according to LSN1 and the number of pages N read this time.
Preferably, advancing the log sequence number of the log record to be read according to LSN1 and the number of pages N read this time includes:
parsing LSN1 to obtain the corresponding log file number ID1 and log page number P1;
multiplying the log file number ID1 by the total number of log pages P0, and adding the log page number P1 and the number of pages N read this time to obtain a total page number P4;
dividing the total page number P4 by the total number of log pages P0 and taking the integer quotient to obtain a log file number ID3;
taking the remainder of the total page number P4 divided by the total number of log pages P0 to obtain a log page number P3;
and obtaining the new log sequence number LSN1 of the log record to be read from the log file number ID3 and the log page number P3.
Preferably, the log reading method further includes:
creating an auxiliary table, and inserting a row of auxiliary data into the auxiliary table;
and after the current maximum log sequence number LSN2 is obtained, updating the auxiliary data, so as to ensure that all log records in the source database with a log sequence number smaller than LSN2 have been flushed to disk.
Preferably, before obtaining the difference in log pages between LSN1 and LSN2 in the log file based on the read position and the write position in the log file, the method further includes:
judging whether the log file number ID1 corresponding to LSN1 is equal to the log file number ID2 corresponding to LSN2;
if they are equal, executing the step of obtaining the difference in log pages between LSN1 and LSN2 in the log file based on the read position and the write position in the log file;
and if they are not equal, completing the reading and synchronization of the current log record to be read, reading the next log record to be read from the log file, and executing again the step of acquiring the log sequence number LSN1 of the log record to be read and obtaining the read position in the log file from LSN1.
To achieve the above object, according to another aspect of the present invention, there is provided a synchronization system including at least one processor, and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, the instructions being programmed to perform the log reading method of the present invention.
Generally, compared with the prior art, the technical scheme of the invention has the following beneficial effects. The invention provides a log reading method and a log reading synchronization system based on log analysis synchronization, wherein the log reading method comprises the following steps: acquiring the log sequence number LSN1 of the log record to be read, and obtaining the read position in the log file from LSN1; acquiring the current maximum log sequence number LSN2 of the source database, and obtaining the write position in the log file from LSN2; obtaining the difference in log pages between LSN1 and LSN2 in the log file based on the read position and the write position in the log file; and reading and synchronizing the log according to a policy determined by the difference in log pages.
In the invention, in order to avoid read-write conflicts between the data synchronization process and the database process, the two processes must be prevented from accessing data at the same offset of the same file at the same time. To achieve this, when reading the log file the data synchronization process must determine whether the log to be read and the log being written by the database are located at the same position of the same log file, and on that basis keep its reads mutually exclusive with the database's writes. By calculating the distance between the log sequence number of the log record to be read and the current maximum log sequence number in the database system, the number of log pages read can be dynamically adjusted each time the log file is read, so that log reading and synchronization follow this policy and read-write conflicts on the log are prevented.
Drawings
Fig. 1 is a schematic flowchart of a log reading method based on log analysis synchronization according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another log reading method based on log analysis synchronization according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of yet another log reading method based on log analysis synchronization according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a synchronization system according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another synchronization system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
At present, a data lock can restrict a file to being accessed by only one process at a time, but such a lock only works within the same database or between databases from the same vendor. If the writing database and the reading database come from different vendors, that is, the write operation on the source database side and the read operation on the destination database side originate from two vendors' databases, the data lock cannot be used. The method of this embodiment is applicable to databases of different origins or different types.
In order to avoid read-write conflicts between the data synchronization process and the database process, the two processes must be prevented from accessing data at the same offset of the same file at the same time. To achieve this, when reading the log file the data synchronization process must determine whether the log to be read and the log being written by the database are located at the same position of the same log file, and on that basis keep its reads mutually exclusive with the database's writes.
In a practical application scenario, each log record generated by the database has an LSN (log sequence number) value, which represents the order in which the log was generated. The representation of the LSN differs between databases. One form is the physical LSN, composed of something like a log file number, a log page number and an offset within the log page, as used by, for example, PostgreSQL, SQL Server and DM6. This embodiment provides a log file reading scheme specifically for databases that use the physical LSN mechanism: by calculating the distance between the log sequence number of the log record to be read and the current maximum log sequence number in the database system, the number of log pages read is dynamically adjusted each time the log file is read, so that log reading and synchronization follow a policy that prevents read-write conflicts on the log. Moreover, log reading performance is kept as high as possible on the premise that read-write conflicts are prevented.
With reference to fig. 4, in this embodiment, a synchronization system is deployed at both the source database and the destination database. The source-side synchronization system reads logs from the source database, and the destination-side synchronization system is responsible for delivering the synchronization operations sent by the source side to the destination database.
When the source-side data synchronization system starts, a database interaction thread and a log file reading thread need to be initialized. The database interaction thread periodically acquires the maximum log sequence number from the source database; the log file reading thread reads and parses the log file of the database. In this embodiment, the database interaction thread and the log file reading thread work together to avoid log read-write conflicts between the data synchronization process and the database's log writing process.
A specific implementation process of the log reading method based on log parsing synchronization according to the embodiment is specifically described below with reference to fig. 1, where the log reading method includes the following steps:
Step 101: Acquire the log sequence number LSN1 of the log record to be read, and obtain the read position in the log file from LSN1.
In a practical scenario, each log record generated by the database has an LSN value, which represents the order in which the log was generated. The representation of the LSN differs between databases. A physical LSN is composed of something like a log file number, a log page number and an offset within the log page, as in PostgreSQL, SQL Server, DM6 and the like; a logical LSN is a sequentially increasing integer, as in ORACLE, DM7 and the like. In either form, the same principle holds: the log sequence number increases strictly as the database runs. The log reading method provided by this embodiment is suitable for the physical LSN.
When the log reading thread starts working, the log sequence number LSN1 of the log record to be read is initialized, and LSN1 keeps advancing as log reading progresses.
In this embodiment, LSN1 is parsed to obtain the corresponding log file number ID1 and log page number P1, which together give the read position in the log file.
The database comprises a plurality of log files, each with the same total number of pages. Log file numbers are positive integers that increase by 1 from one file to the next. A log sequence number also includes an offset within the page. However, the read-write conflict described above occurs when the log reading process and the log writing process operate on the same log page of the same file at the same time. The log page is therefore the smallest unit at which a read-write conflict can occur, and the in-page offset does not need to be considered.
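The decomposition of a physical LSN into a read or write position can be sketched as follows. This is only an illustration: the source does not give a concrete binary layout, so the `PhysicalLSN` type and its field names are assumptions, and the LSN is taken as already decoded into its three parts.

```python
from typing import NamedTuple, Tuple


class PhysicalLSN(NamedTuple):
    """Hypothetical decoded form of a physical LSN (the layout is an assumption)."""
    file_id: int  # log file number (ID1 or ID2 in the text)
    page_no: int  # log page number within that file (P1 or P2)
    offset: int   # offset inside the log page


def position(lsn: PhysicalLSN) -> Tuple[int, int]:
    """Return the (file number, page number) pair used as a read or write position.

    The in-page offset is dropped because conflicts are detected per log page.
    """
    return (lsn.file_id, lsn.page_no)


lsn1 = PhysicalLSN(file_id=3, page_no=17, offset=240)
read_position = position(lsn1)  # (3, 17)
```

Two LSNs that differ only in their in-page offset map to the same position, which matches the statement that the log page is the smallest unit of conflict.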
Step 102: Acquire the current maximum log sequence number LSN2 of the source database, and obtain the write position in the log file from LSN2.
In this embodiment, after the source database starts working, the database interaction thread polls the source database at a preset time interval to obtain its current maximum log sequence number. Because the log sequence number increases strictly as the database runs, obtaining the current maximum log sequence number by polling bounds all log records the source database has generated so far. When the polling interval is small enough, the current maximum log sequence number LSN2 can also be understood as the log sequence number of the log record most recently written on the source database side, and read-write conflicts on the log can be prevented by controlling the distance between LSN1 and LSN2.
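A minimal sketch of such a polling thread is shown below, assuming the maximum LSN can be fetched through some database-specific query. The `fetch_max_lsn` callable, the class name and the tuple layout of the LSN are all hypothetical; the source does not specify how the value is retrieved.

```python
import threading
from typing import Callable, Tuple

Lsn = Tuple[int, int, int]  # (file number, page number, in-page offset); assumed layout


class MaxLsnPoller:
    """Caches the most recent maximum LSN reported by the source database."""

    def __init__(self, fetch_max_lsn: Callable[[], Lsn], interval_s: float) -> None:
        self._fetch = fetch_max_lsn    # hypothetical query against the source database
        self._interval_s = interval_s  # preset polling interval
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._latest: Lsn = (0, 0, 0)

    def poll_once(self) -> None:
        value = self._fetch()
        with self._lock:
            # LSNs only grow, so the cached value is never moved backwards.
            if value > self._latest:
                self._latest = value

    def latest(self) -> Lsn:
        with self._lock:
            return self._latest

    def run(self) -> None:
        # Poll until stop() is called; wait() doubles as an interruptible sleep.
        while not self._stop.is_set():
            self.poll_once()
            self._stop.wait(self._interval_s)

    def stop(self) -> None:
        self._stop.set()
```

In use, `run` would be started on its own thread, for example with `threading.Thread(target=poller.run, daemon=True).start()`, while the log file reading thread calls `latest()` whenever it needs LSN2.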
Specifically, the current maximum log sequence number LSN2 of the source database is acquired; LSN2 is then parsed to obtain the corresponding log file number ID2 and log page number P2, which together give the write position in the log file.
Step 103: Obtain the difference in log pages between LSN1 and LSN2 in the log file based on the read position and the write position in the log file.
In this embodiment, LSN1 and LSN2 are each parsed to obtain the corresponding log file number and log page number. The difference in log pages between LSN1 and LSN2 in the log file is then calculated, and read-write conflicts are avoided by dynamically controlling the number of log pages read each time according to that difference.
Specifically, the source database side includes a plurality of log files, and each log file contains the same total number of log pages P0. For the write position in the log file, the log file number ID2 is multiplied by the total number of log pages P0, and the log page number P2 is added to obtain a first log page count. For the read position in the log file, the log file number ID1 is multiplied by the total number of log pages P0, and the log page number P1 is added to obtain a second log page count. The second log page count is subtracted from the first log page count to obtain the difference in log pages between LSN1 and LSN2 in the log file.
That is, difference in log pages = (log file number ID2 × total number of log pages P0 + log page number P2) - (log file number ID1 × total number of log pages P0 + log page number P1).
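The page-difference formula can be written as a short function. This is a sketch under the assumption that read and write positions are given as (file number, page number) pairs; the function and parameter names are illustrative and not taken from the source.

```python
from typing import Tuple

Position = Tuple[int, int]  # (log file number, log page number)


def absolute_page(pos: Position, pages_per_file: int) -> int:
    """Flatten a position into a page count: file number * P0 + page number."""
    file_id, page_no = pos
    return file_id * pages_per_file + page_no


def page_difference(read_pos: Position, write_pos: Position, pages_per_file: int) -> int:
    """(ID2 * P0 + P2) - (ID1 * P0 + P1), as in the formula above."""
    return absolute_page(write_pos, pages_per_file) - absolute_page(read_pos, pages_per_file)


# Worked example with P0 = 1000 pages per file:
# write position (4, 20) -> 4020, read position (3, 990) -> 3990, difference 30.
diff = page_difference(read_pos=(3, 990), write_pos=(4, 20), pages_per_file=1000)
```

Flattening both positions first means the same subtraction works whether or not the reader and writer are in the same log file.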
Step 104: Read and synchronize the log according to a policy determined by the difference in log pages.
Referring to fig. 2, in this embodiment, step 104 specifically includes the following steps:
Step 1041: Judge whether the difference in log pages is greater than 0.
In this embodiment, it is first determined whether the difference in log pages is greater than 0. If the difference is equal to 0, step 1042 is executed; if the difference is greater than 0, step 1043 is executed.
Step 1042: When the difference in log pages is equal to 0, keep the log sequence number of the log record to be read unchanged.
In this embodiment, when the difference in log pages is equal to 0, the log sequence number of the log record to be read is kept unchanged, so that reading later resumes from the same read position in the log file.
Meanwhile, the maximum log sequence number of the source database keeps increasing. After a period of time, the process returns to step 102: the current maximum log sequence number LSN2 of the source database is obtained again, the difference in log pages between LSN2 and LSN1 is recalculated, and it is again judged whether the difference is greater than 0.
Step 1043: Judge whether the difference in log pages is smaller than the maximum number of log pages allowed to be read.
In this embodiment, if the difference in log pages is greater than 0, it is determined whether the difference is smaller than the maximum number of log pages allowed to be read. When the difference is smaller than that maximum, step 1044 is executed; when the difference is not smaller than that maximum, step 1045 is executed.
Step 1044: When the difference in log pages is smaller than the maximum number of log pages allowed to be read, set the number of pages N to read this time equal to the difference in log pages, and read the next N log pages starting from the read position in the log file, so as to read and synchronize the log records.
Step 1045: When the difference in log pages is not less than the maximum number of log pages allowed to be read, set the number of pages N to read this time equal to the maximum number of log pages allowed to be read, and read the next N log pages starting from the read position in the log file, so as to read and synchronize the log records.
In this embodiment, introducing the maximum number of log pages allowed to be read not only prevents read-write conflicts, but also keeps log reading performance as high as possible on the premise that read-write conflicts are prevented.
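Steps 1041 to 1045 reduce to a small decision function. The sketch below is illustrative only: the function and parameter names are assumptions, and treating a negative difference like 0 is a defensive choice of this example, since the source only discusses differences of 0 or more.

```python
def pages_to_read(page_diff: int, max_pages: int) -> int:
    """Return N, the number of log pages to read in this round.

    0          -> the reader has caught up with the writer; wait and re-poll (step 1042)
    page_diff  -> fewer pages than the cap are available (step 1044)
    max_pages  -> the cap applies (step 1045)
    """
    if page_diff <= 0:
        # Assumption: a negative difference is handled like 0 (do not read).
        return 0
    return min(page_diff, max_pages)
```

With a cap of 64 pages, a difference of 10 yields N = 10 and a difference of 500 yields N = 64, so the reader never reaches the page the database is currently writing.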
In a specific application scenario, after the log records of the N log pages have been read, the log sequence number of the log record to be read is advanced according to LSN1 and the number of pages N read this time. The number of pages to read next is then dynamically adjusted according to the distance between the new log sequence number of the log record to be read and the current maximum log sequence number of the source database, so as to prevent read-write conflicts.
Specifically, LSN1 is parsed to obtain the corresponding log file number ID1 and log page number P1; the log file number ID1 is multiplied by the total number of log pages P0, and the log page number P1 and the number of pages N read this time are added to obtain a total page number P4; the total page number P4 is divided by the total number of log pages P0, and the integer quotient is the log file number ID3; the remainder of the total page number P4 divided by the total number of log pages P0 is the log page number P3; and the new log sequence number LSN1 of the log record to be read is obtained from the log file number ID3 and the log page number P3.
In this embodiment, the total page number P4 = (log file number ID1 × total number of log pages P0 + log page number P1 + N), and the new log sequence number LSN1 of the log record to be read = (total page number P4 / total number of log pages P0, total page number P4 % total number of log pages P0, 0), where / denotes integer division and % denotes the remainder. Since the in-page offset does not affect the determination of a read-write conflict, the in-page offset of the new LSN1 is initialized to 0.
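The advance of LSN1 after a read of N pages can be sketched as below, assuming the LSN is held as a (file number, page number, in-page offset) tuple. The function name and tuple representation are assumptions made for illustration.

```python
from typing import Tuple

Lsn = Tuple[int, int, int]  # (file number, page number, in-page offset); assumed layout


def advance_lsn(lsn: Lsn, pages_read: int, pages_per_file: int) -> Lsn:
    """Move the read LSN forward by pages_read pages.

    P4  = ID1 * P0 + P1 + N
    ID3 = P4 // P0   (integer quotient)
    P3  = P4 %  P0   (remainder)
    The in-page offset of the result is reset to 0.
    """
    file_id, page_no, _offset = lsn
    total_pages = file_id * pages_per_file + page_no + pages_read
    return (total_pages // pages_per_file, total_pages % pages_per_file, 0)


# With P0 = 1000: (3, 990, 240) advanced by 30 pages gives P4 = 4020, so (4, 20, 0).
new_lsn1 = advance_lsn((3, 990, 240), pages_read=30, pages_per_file=1000)
```

The quotient and remainder handle the roll-over into the next log file without a special case.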
In an actual application scenario, operations may have been performed on the database but not yet committed when the current maximum log sequence number LSN2 is obtained. Although the LSN of the database keeps advancing, the log records up to LSN2 may not yet have been flushed to disk, because those operations are still in the database's log buffer; in that case the read-write conflict still cannot be avoided. In order to ensure that all log records in the source database with a log sequence number smaller than LSN2 have been flushed to disk, in a preferred embodiment an auxiliary table is initialized, and an update operation on the auxiliary table is used to ensure that the operations in the database log buffer are flushed.
Specifically, an auxiliary table is created, and a row of auxiliary data is inserted into it; after the current maximum log sequence number LSN2 is obtained, the auxiliary data is updated, so as to ensure that all log records in the source database with a log sequence number smaller than LSN2 have been flushed to disk. In this way, the operations in the log buffer are guaranteed to be flushed.
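The sequence of statements involved can be illustrated with Python's built-in `sqlite3` module as a stand-in for the source database connection. This only shows the create, insert and update-then-commit pattern; an in-memory SQLite database does not reproduce the REDO log flushing behaviour the text relies on, and the table name `sync_aux` and its columns are hypothetical.

```python
import sqlite3

# Stand-in for a connection to the source database (assumption for illustration).
conn = sqlite3.connect(":memory:")

# One-time setup: create the auxiliary table and insert a single row.
conn.execute("CREATE TABLE sync_aux (id INTEGER PRIMARY KEY, tick INTEGER NOT NULL)")
conn.execute("INSERT INTO sync_aux (id, tick) VALUES (1, 0)")
conn.commit()


def touch_auxiliary_row(connection: sqlite3.Connection) -> int:
    """Update and commit the auxiliary row; return its new counter value.

    In the scheme described above, this committed update is issued right
    after LSN2 has been obtained.
    """
    connection.execute("UPDATE sync_aux SET tick = tick + 1 WHERE id = 1")
    connection.commit()
    (tick,) = connection.execute("SELECT tick FROM sync_aux WHERE id = 1").fetchone()
    return tick
```

A counter column is used here simply so that every call produces a real change to the row; any update that the database actually logs would serve the same purpose.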
Example 2:
With reference to embodiment 1, this embodiment provides another log reading method. With reference to fig. 2, the log reading method includes the following steps:
Step 201: acquire the log serial number LSN1 of a log record to be read, and obtain the read position of the log file from the log serial number LSN1.
Step 202: acquire the current maximum log serial number LSN2 of the source database, and obtain the write position of the log file from the log serial number LSN2.
The specific implementation of step 201 and step 202 is the same as in embodiment 1, where it is described in detail, and is not repeated here.
Step 203: determine whether the log file number ID1 corresponding to the log serial number LSN1 is equal to the log file number ID2 corresponding to the log serial number LSN2.
In practical application scenarios, in order to reduce synchronization delay, the distance between the current maximum log serial number LSN2 of the source database and the log serial number LSN1 of the log record to be read is generally small. However, when synchronization has just started, the log file number ID1 corresponding to the log serial number LSN1 and the log file number ID2 corresponding to the log serial number LSN2 may not be equal. In this case, the read position and the write position are in different log files, no read-write conflict exists, and step 204 is executed directly.
Step 204: if they are not equal, complete the reading and synchronization of the current log record to be read, and read the next log record to be read from the log file.
In this embodiment, if the log file number ID1 corresponding to the log serial number LSN1 is not equal to the log file number ID2 corresponding to the log serial number LSN2, the reading and synchronization of the current log record to be read are completed, and the next log record to be read is read from the log file. The flow then returns to step 201, in which the log serial number LSN1 of the log record to be read is acquired and the read position of the log file is obtained from the log serial number LSN1.
Step 205: if they are equal, obtain the difference in the number of log pages between the log serial number LSN1 and the log serial number LSN2 in the log file, based on the read position of the log file and the write position of the log file.
Step 206: perform policy-based log reading and synchronization according to the difference in the number of log pages.
In this embodiment, if the log file number ID1 corresponding to the log serial number LSN1 is equal to the log file number ID2 corresponding to the log serial number LSN2, the read-write conflict is avoided in the manner of the foregoing embodiment 1, which is described in detail in the foregoing embodiment 1 and will not be described again here.
In this embodiment, it is first determined whether the read position and the write position are for the same log file, and if the read position and the write position are for different log files, there is no read-write conflict, and step 205 and step 206 do not need to be executed, so that the efficiency of reading the log can be improved.
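The branch in steps 203 to 206 can be sketched as follows. This is an illustrative sketch only, assuming the read and write positions are already available as (log file number, log page number) pairs; `plan_read`, `Position` and `max_pages` are hypothetical names. When both positions are in the same file, the term ID × P0 cancels out, so the page difference reduces to P2 - P1.

```python
from typing import Optional, Tuple

Position = Tuple[int, int]  # hypothetical: (log file number, log page number)


def plan_read(read_pos: Position, write_pos: Position, max_pages: int) -> Optional[int]:
    """Return None if no page limit applies, otherwise the number of pages N to read."""
    read_file, read_page = read_pos
    write_file, write_page = write_pos
    if read_file != write_file:
        # Step 204: reader and writer are in different log files, so there is
        # no read-write conflict and steps 205 and 206 are skipped.
        return None
    # Step 205: same file, so the page difference is simply P2 - P1.
    # A negative value is not expected; it is clamped to 0 defensively.
    diff = max(0, write_page - read_page)
    # Step 206: read at most max_pages pages; 0 means stay at the current position.
    return min(diff, max_pages)
```

Returning `None` for the different-file case keeps the cheap check separate from the page arithmetic, which is the efficiency gain this embodiment describes.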
Example 3:
Based on the foregoing embodiment 1, the implementation process of embodiment 1 is illustrated here by taking DM6 as an example:
A data synchronization system is built between database A and database B, and the maximum number of pages allowed to be read during data synchronization is set to Z pages. The time interval for polling the database log LSN is set to 1 second, and the source-side data synchronization system creates an auxiliary table t (c int) on database A to push the log forward.
The database interaction process of the source-side log synchronization system is as follows:
(1) Every 1 second, the database interaction thread acquires the current maximum log LSN from the source database by executing the SQL statement: select get_last_lsn() from dual; this yields the log serial number LSN2.
(2) The database interaction thread then executes the statement: BEGIN UPDATE T SET C1; COMMIT; END; to ensure that all log records whose log serial number is smaller than the log serial number LSN2 are flushed, and then jumps back to step (1) for the next poll.
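The polling thread above can be sketched in Python as follows. This is a rough sketch under several assumptions: `conn` is any connection object following the Python DB-API (PEP 249), and whether such a driver exists for DM6 is not claimed here. The exact spelling of the LSN query is taken from step (1) as best it can be read, and the update statement is an assumed form (`UPDATE T SET C = C + 1`) because the statement as printed in step (2) appears incomplete. The sketch also issues the update and the commit as separate calls rather than as one BEGIN ... END block. `poll_once`, `run_poller` and `publish` are hypothetical names.

```python
import threading
from typing import Any, Callable

# Assumption: spelling of the function as read from step (1).
LSN_QUERY = "select get_last_lsn() from dual"
# Assumption: illustrative update on the auxiliary table, not the patent's exact statement.
FLUSH_UPDATE = "UPDATE T SET C = C + 1"


def poll_once(conn: Any, publish: Callable[[Any], None]) -> Any:
    """Fetch LSN2, force a log flush via the auxiliary table, then publish LSN2."""
    cur = conn.cursor()
    cur.execute(LSN_QUERY)
    lsn2 = cur.fetchone()[0]
    # Committing an update is what pushes buffered log records to the log file.
    cur.execute(FLUSH_UPDATE)
    conn.commit()
    # Only after the commit is LSN2 handed to the log file reading thread.
    publish(lsn2)
    return lsn2


def run_poller(conn: Any, publish: Callable[[Any], None],
               stop: threading.Event, interval_s: float = 1.0) -> None:
    """Repeat poll_once every interval_s seconds until stop is set."""
    while not stop.is_set():
        poll_once(conn, publish)
        stop.wait(interval_s)
```

Publishing LSN2 only after the commit reflects the ordering in the text: the reader should not treat LSN2 as a safe write position until the records below it have been flushed.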
The log file reading thread process is as follows:
(1) The log file reading thread locates the position in the log file to be read based on the starting log serial number LSN1.
(2) The distance between the log serial number LSN1 and the latest log serial number LSN2 of the current database is calculated according to the formula, and the number N of log pages that actually need to be read is then obtained in combination with the maximum number Z of pages allowed to be read.
(3) The log file reading thread reads N pages of logs from the current position for parsing and synchronization, then advances the read log serial number LSN1, and jumps back to step (1) to continue reading the next batch of logs.
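Step (2) of the reading thread can be illustrated with a short Python sketch. It assumes that every log file holds the same number of pages P0 and that positions are (log file number, log page number) pairs; the function and parameter names are hypothetical.

```python
from typing import Tuple

Position = Tuple[int, int]  # hypothetical: (log file number, log page number)


def page_distance(read_pos: Position, write_pos: Position, pages_per_file: int) -> int:
    """Pages between reader and writer: (ID2 * P0 + P2) - (ID1 * P0 + P1)."""
    write_total = write_pos[0] * pages_per_file + write_pos[1]
    read_total = read_pos[0] * pages_per_file + read_pos[1]
    return write_total - read_total


def batch_size(read_pos: Position, write_pos: Position,
               pages_per_file: int, max_pages: int) -> int:
    """Number of pages N to read this round, capped at Z = max_pages."""
    distance = page_distance(read_pos, write_pos, pages_per_file)
    # distance == 0 gives N = 0: nothing new to read, keep the current position.
    # A negative distance is not expected and is treated the same way.
    return max(0, min(distance, max_pages))
```

With 100 pages per file and Z = 16, a reader at file 1, page 90 and a writer at file 2, page 30 are 40 pages apart, so 16 pages are read in this round.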
Example 4:
Referring to fig. 4, fig. 4 is a schematic structural diagram of a synchronization system according to an embodiment of the present invention. The synchronization system of the present embodiment includes one or more processors 41 and a memory 42. In fig. 4, one processor 41 is taken as an example.
The processor 41 and the memory 42 may be connected by a bus or other means, such as the bus connection in fig. 4.
The memory 42, as a non-volatile computer-readable storage medium for the log reading method, may be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the methods of the above embodiments and their corresponding program instructions. The processor 41 executes various functional applications and data processing, that is, implements the methods of the foregoing embodiments, by running the non-volatile software programs, instructions and modules stored in the memory 42.
The memory 42 may include, among other things, high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, memory 42 may optionally include memory located remotely from processor 41, which may be connected to processor 41 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
It should be noted that, because the information interaction and execution processes between the modules and units in the above apparatus and system are based on the same concept as the method embodiments of the present invention, reference may be made to the description of the method embodiments for the specific contents, which are not repeated here.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A log reading method based on log analysis synchronization is characterized by comprising the following steps:
acquiring a log serial number LSN1 of a log record to be read, and acquiring a read position of a log file through the log serial number LSN1;
acquiring the current maximum log serial number LSN2 of a source database, and acquiring the write position of the log file through the log serial number LSN2;
obtaining a difference value of the number of log pages of the log serial number LSN1 and the log serial number LSN2 in the log file based on the read position of the log file and the write position of the log file;
and performing policy-based log reading and synchronization according to the difference value of the number of log pages.
2. The method for reading the log according to claim 1, wherein the obtaining a log serial number LSN1 of the log record to be read, and obtaining the read position of the log file through the log serial number LSN1 comprises:
acquiring a log serial number LSN1 of a log record to be read;
analyzing the log serial number LSN1 to obtain a log file number ID1 and a log page number P1 corresponding to the log serial number LSN1, and obtaining the reading position of the log file.
3. The log reading method of claim 2, wherein the obtaining a current maximum log sequence number LSN2 of the source database, and obtaining a write location of the log file through the log sequence number LSN2 comprises:
acquiring a current maximum log serial number LSN2 of a source database;
analyzing the log serial number LSN2 to obtain the log file number ID2 and the log page number P2 corresponding to the log serial number LSN2, and obtaining the writing position of the log file.
4. The log reading method according to claim 3, wherein the source database side includes a plurality of log files, and the total number of log pages P0 included in each log file is the same;
the obtaining a difference value of the number of log pages of the log serial number LSN1 and the log serial number LSN2 in the log file based on the reading position of the log file and the writing position of the log file comprises:
for the writing position of the log file, multiplying the log file number ID2 by the total number of log pages P0, and adding the log page number P2 to obtain a first log page number;
for the reading position of the log file, multiplying the log file number ID1 by the total number of log pages P0, and adding the log page number P1 to obtain a second log page number;
and subtracting the second log page number from the first log page number to obtain the difference value of the number of log pages of the log serial number LSN1 and the log serial number LSN2 in the log file.
5. The log reading method of claim 1, wherein the performing strategic log reading and synchronization based on the difference in log page count comprises:
when the difference value of the number of the log pages is equal to 0, maintaining the log serial number of the log record to be read unchanged so as to continuously read the log record at the reading position of the log file;
when the difference value of the number of log pages is larger than 0 and smaller than the maximum number of log pages allowed to be read, setting the reading page number N of the log at this time to be equal to the difference value of the number of log pages, and reading N log pages onward with the reading position of the log file as the starting boundary, so as to read and synchronize log records;
and when the difference value of the number of log pages is not less than the maximum number of log pages allowed to be read, setting the reading page number N of the log at this time to be equal to the maximum number of log pages allowed to be read, and reading N log pages onward with the reading position of the log file as the starting boundary, so as to read and synchronize log records.
6. The log reading method of claim 5, wherein the performing strategic log reading and synchronization based on the difference in log page count further comprises:
and after the log records of the N log pages are read, advancing the log serial number of the log record to be read according to the log serial number LSN1 and the reading page number N of the log.
7. The log reading method according to claim 6, wherein advancing the log sequence number of the log record to be read according to the log sequence number LSN1 and the number N of the current log reading comprises:
analyzing the log serial number LSN1 to obtain a log file number ID1 and a log page number P1 corresponding to the log serial number LSN 1;
multiplying the log file number ID1 by the total number of log pages P0, and adding the log page number P1 and the reading page number N of the log to obtain a total page number P4;
dividing the total page number P4 by the total number of log pages P0 to obtain a log file number ID3;
taking the remainder of the total page number P4 divided by the total number of log pages P0 to obtain a log page number P3;
and obtaining a new log serial number LSN1 of the log record to be read based on the log file number ID3 and the log page number P3.
8. The log reading method according to any one of claims 1 to 7, further comprising:
creating an auxiliary table, and inserting a row of auxiliary data into the auxiliary table;
and after the current maximum log serial number LSN2 is obtained, updating the auxiliary data to ensure that log records in the source database, which are smaller than the log serial number LSN2, are all flushed.
9. The log reading method according to any one of claims 1 to 7, wherein before the obtaining a difference value of the number of log pages of the log serial number LSN1 and the log serial number LSN2 in the log file based on the reading position of the log file and the writing position of the log file, the method further comprises:
judging whether the log file number ID1 corresponding to the log serial number LSN1 is equal to the log file number ID2 corresponding to the log serial number LSN2;
if they are equal, executing the step of obtaining the difference value of the number of log pages of the log serial number LSN1 and the log serial number LSN2 in the log file based on the reading position of the log file and the writing position of the log file;
and if they are not equal, completing the reading and synchronization of the log record to be read, reading the next log record to be read from the log file, and executing the step of acquiring the log serial number LSN1 of the log record to be read and acquiring the reading position of the log file through the log serial number LSN1.
10. A synchronization system, characterized in that the synchronization system comprises at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform a log reading method as claimed in any one of claims 1 to 9.
CN202010491618.8A 2020-06-02 2020-06-02 Log reading method and log reading synchronization system based on log analysis synchronization Pending CN111858502A (en)

Publications (1)

Publication Number Publication Date
CN111858502A true CN111858502A (en) 2020-10-30

Family

ID=72984906

Country Status (1)

Country Link
CN (1) CN111858502A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Applicant after: Wuhan dream database Co., Ltd

Address before: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Applicant before: WUHAN DAMENG DATABASE Co.,Ltd.

CB03 Change of inventor or designer information

Inventor after: Sun Feng

Inventor after: Peng Qingsong

Inventor after: Liu Qichun

Inventor before: Sun Feng

Inventor before: Fu Quan

Inventor before: Peng Qingsong

Inventor before: Liu Qichun