CN111221909A - Database modification column synchronization method and device based on log analysis - Google Patents

Database modification column synchronization method and device based on log analysis Download PDF

Info

Publication number
CN111221909A
CN111221909A CN201911422394.9A CN201911422394A CN111221909A CN 111221909 A CN111221909 A CN 111221909A CN 201911422394 A CN201911422394 A CN 201911422394A CN 111221909 A CN111221909 A CN 111221909A
Authority
CN
China
Prior art keywords
column
database
log
synchronization
transaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911422394.9A
Other languages
Chinese (zh)
Other versions
CN111221909B (en
Inventor
付铨
孙峰
彭青松
余院兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dameng Database Co Ltd
Original Assignee
Wuhan Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dameng Database Co Ltd filed Critical Wuhan Dameng Database Co Ltd
Priority to CN201911422394.9A priority Critical patent/CN111221909B/en
Publication of CN111221909A publication Critical patent/CN111221909A/en
Application granted granted Critical
Publication of CN111221909B publication Critical patent/CN111221909B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275Synchronous replication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of databases, and provides a database modification column synchronization method and device based on log analysis. The method comprises the steps of judging that the DDL affair is the operation of a modified column when the collected log content corresponding to the DDL affair is the INSERT operation of a user table, and constructing a table building operation according to the column information of the table to which the modified column operation belongs and sending the table building operation to a destination end; constructing a synchronous system execution of query insertion operation and sending to a destination end database, and copying data in a corresponding original table of the destination end database to a new table of the destination end database; and the source end database synchronization system continues to capture the log until the submitted log is captured, so that the synchronization system of the source end database informs the synchronization system of the destination end database to delete the original table, and the DDL transaction synchronization is completed. The invention can basically ensure that the source database and the destination database carry out the operation of modifying the list at almost the same time.

Description

Database modification column synchronization method and device based on log analysis
[ technical field ] A method for producing a semiconductor device
The invention relates to the technical field of databases, in particular to a database modification column synchronization method and device based on log analysis.
[ background of the invention ]
In a database real-time synchronization system based on a log parsing architecture (supporting heterogeneous), synchronization of modified column operations of table modification tends to slow down performance of synchronization. Mainly, in an actual application system, the data volume of some tables is too large, if the table structure is modified on the table with a large number of records, for example, a certain column is modified, the table of the database is reconstructed, and due to the large number of records of the tables, the database consumes a large amount of time and generates a large amount of database logs during reconstruction.
In the traditional synchronization method, a source end log analysis service needs to capture a submission log of a source end database table modification operation and then begin to analyze and synchronize the table modification operation, before the operation submission log is captured, a large amount of logs generated by the operation need to be collected and cached, the caching operation can generate a large amount of IO (input/output) to influence the performance of a source end database, the table modification operation is sent to a target end, the database of the target end can be completed by consuming a large amount of time when the modification operation is executed, and during the period, when the data synchronization service synchronizes the subsequent operation of the table, the data synchronization service can be performed only after the target end database completes the table modification, so that great synchronization delay is caused.
For example, if an a-table with 1 hundred million rows of records in the source database modifies the data type definition of 1 column, this operation takes approximately 60 minutes to execute in the source database, and the synchronization to the destination database also takes at least 60 minutes to complete, resulting in a delay of at least 60 minutes in data synchronization.
In an environment of building main-standby synchronization, a target-end database is often used as a query library to share the pressure of the main library, when a table modifies a column, query operation for the table is blocked, long-time table modification operation of the target-end database affects the query function of an application, and how to shorten synchronization delay of large table modification operation and reduce the influence of synchronization on an application program on the target-end database becomes a technical problem to be solved in the industry.
[ summary of the invention ]
The technical problem to be solved by the present invention is that in an environment where a main-standby synchronization is established, a target-side database is often used as a query library to share the pressure of the main library, when a table modifies a column, query operation for the table will be blocked, long-time table modification operation of the target-side database will affect the query function of an application, and how to shorten synchronization delay of large table modification operation and reduce the influence of synchronization on an application program on the target-side database becomes a technical problem to be solved in the industry.
The invention adopts the following technical scheme:
in a first aspect, the present invention provides a log parsing-based database modified column synchronization method, including:
a synchronization system of a source end database judges that an operation to be synchronized in a log is a DDL transaction, and then collects the subsequent log belonging to the DDL transaction according to the transaction number;
when the collected log content corresponding to the DDL transaction is the user table INSERT operation, judging that the DDL transaction is the operation of modifying the column, constructing a table building operation according to the column information of the table to which the operation of modifying the column belongs, and sending the table building operation to a destination end synchronous system for execution; the table building operation comprises building a table name of a new table and building a mapping relation with an original table name, and column information of the new table is set by referring to column information after column modification operation is completed corresponding to the original table;
constructing a synchronous system execution of query insertion operation and sending to a destination end database, and copying data in a corresponding original table of the destination end database to a new table of the destination end database;
and the source end database synchronization system continues to capture the log until the submission log of the DDL transaction is captured, so that the synchronization system of the source end database informs the synchronization system of the destination end database to delete the original table, and the table name of the newly built table is replaced by the table name of the original table in the destination end database, thereby completing the DDL transaction synchronization.
Preferably, the method for synchronizing a DDL transaction in a log includes that a synchronization system of a source database determines that an operation to be synchronized in the log is a DDL transaction, and then collects subsequent logs belonging to the DDL transaction according to the transaction number, and further includes:
when the log content corresponding to the DDL transaction is collected firstly, the log content is not the user table INSERT operation but the log is submitted, the DDL transaction is judged to be the operation of a modified column, and the operation of the modified column is sent to a destination end synchronization system.
Preferably, after receiving the modified column operation of the synchronization system of the source database, the synchronization system of the destination database directly executes the modified column operation on the table on the destination database, thereby completing the DDL transaction synchronization.
Preferably, before the DDL, if a query request is received by the destination database, the method further includes:
the original table of the destination database is locked so as to be in an inaccessible state; and after the DDL things are synchronized, the query request is completed through the new table replacing the original table.
Preferably, the source database synchronization system continues to capture the log, further comprising:
when the rollback log of the DDL transaction is captured, the synchronization system of the source database constructs and deletes the operation of newly building the table of the destination and sends the operation to the destination for execution, and the DDL transaction synchronization is completed.
Preferably, before completing the DDL transaction synchronization, if the destination receives a query request, the method further includes:
and the destination database provides the original table as a queried object of the query request, wherein the query request process does not influence the process of copying the original table data into the newly built table.
Preferably, when the collected log content corresponding to the DDL transaction is a user table INSERT operation, determining whether the DDL transaction is a column modification operation, specifically including:
aiming at DDL affairs, the source end database can generate a log of a system table and a log corresponding to a common table;
after the log of the system table is collected, the operation to which the DDL transaction belongs is reversely deduced through the log of the system table;
the condition for collecting the log of the system table is as follows: generating a commit or rollback message for the DDL transaction, and/or generating a log of a common table corresponding to the DDL transaction.
Preferably, the reversely deducing the operation to which the DDL transaction belongs through the system table log specifically includes:
according to the updating operation in the capture sys. col $ table, one or more items in default formulas for describing each column name, each column unit value and each column assignment of the modification table are obtained and compared with the definition description of the column of the original table; and if one or more items in the default formula for the column names, the column unit values and the column assignments are found to be changed, determining the operation as column modification.
Preferably, the database comprises an Oracle and/or damming database; the operation of modifying the column is specifically column data type adjustment.
Preferably, in the method, synchronization systems are deployed in the source end database and the destination end database, the source end database synchronization system reads logs from the source end database, and the destination end database synchronization system is responsible for applying synchronization operations sent by the source end to the destination end database.
In a second aspect, the present invention further provides a log parsing-based database modification column synchronization method, where the method includes:
the source end database synchronization system judges whether the operation in the log belongs to the system table operation, if so, the operation is classified into a DDL transaction, the transaction number of the DDL operation is obtained, and the subsequent log belonging to the transaction is collected according to the transaction number;
when the commit log of the DDL transaction is collected, the modified column is synchronized in a traditional way;
when the INSERT operation log of the user table related to the DDL transaction is collected, judging whether the operation in the DDL transaction is the operation of modifying the column, if so, restoring the original operation of modifying the column according to the previously collected system table operation log;
sending the modified column operation to a synchronous system execution of a destination end database;
the source end database synchronization system continues to capture logs, and subsequent DML logs belonging to the modified column transaction can be directly discarded until the commit or rollback log of the transaction is captured;
when capturing the rollback log of the modified column, because the target end has already executed the operation of the modified column, the restore operation of the original modified column is constructed and sent to the target end for execution, and the transaction synchronization is completed; the transaction synchronization is completed when a commit log is captured that takes the modified column.
Preferably, in the method, synchronization systems are deployed in the source end database and the destination end database, the source end database synchronization system reads logs from the source end database, and the destination end database synchronization system is responsible for applying synchronization operations sent by the source end to the destination end database.
Preferably, the modify column operation includes: modifying one or more of the column name, modifying the assignment in the column, modifying the default formula for the assignment in the column.
In a third aspect, the present invention further provides a database modified column synchronization apparatus based on log parsing, which is used to implement the database modified column synchronization method based on log parsing in the first aspect, and the apparatus includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the log resolution-based database modified column synchronization method of the first aspect or the second aspect.
In a fourth aspect, the present invention further provides a non-transitory computer storage medium storing computer-executable instructions for execution by one or more processors to perform the log parsing based database modified column synchronization method according to the first aspect or the second aspect.
The invention captures the user table operation log corresponding to the DDL things as the basis for triggering the target end to carry out new table operation, further carries out the original table modification of the target end database only when capturing the submission log or rollback log of the affair, and replaces the new table name with the deleted original table, thereby basically ensuring that the source and target databases carry out the operation of modifying columns at almost the same time, and the target end adopts the mode of reserving the original table to create a new table when modifying the columns, the query operation of the original table is not influenced by the operation synchronization of modifying columns, thus shortening the time delay caused by the operation synchronization of modifying columns and reducing the influence of the operation synchronization of modifying columns on the application of the target end.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic flowchart of a database modified column synchronization method based on log parsing according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a more complicated database modified column synchronization method based on log parsing according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a log parsing-based database modified column synchronization method according to an embodiment of the present invention, which is further complicated in the related art;
FIG. 4 is a flowchart of a method for analyzing operations of DDL things according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating another database modification column synchronization method based on log parsing according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of a database modified column synchronization method based on log parsing in an SQL scenario according to an embodiment of the present invention;
fig. 7 is a schematic flowchart of a database modified column synchronization method based on log parsing according to an embodiment of the present invention;
fig. 8 is a schematic flowchart of a database modified column synchronization method based on log parsing according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a database modified column synchronization apparatus based on log parsing according to an embodiment of the present invention.
[ detailed description ] embodiments
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
embodiment 1 of the present invention provides a method for synchronizing a modified column of a database based on log parsing, where the database includes an Oracle and/or dreams database, and the embodiment of the present invention is particularly suitable for synchronizing the operation of the modified column, specifically, adjusting the column data type. The embodiment of the invention deploys the synchronization system in the source end database and the target end database, the source end database synchronization system reads the logs from the source end database, and the target end database synchronization system is responsible for applying the synchronization operation sent by the source end to the target end database. As shown in fig. 1, includes:
in step 201, the synchronization system of the source database determines that the operation to be synchronized in the log is a DDL transaction, and then collects subsequent logs belonging to the DDL transaction according to the transaction number.
Aiming at DDL affairs, a source end database generates a log of a system table and then generates a log corresponding to a common table; and step 201 of the embodiment of the present invention mainly refers to collecting the log of the system table.
In step 202, when the collected log content corresponding to the DDL transaction is a user table INSERT operation (when the table is in a modified state, for example, a modified column in the embodiment of the present invention, the table is locked inside the database, other transactions cannot operate the table, and only after the DDL is completed, the table can be accessed by other transactions, but the INSERT operation is a boundary ridge and belongs to the DDL operation before the INSERT operation), determining that the DDL transaction is an operation of the modified column, and constructing a table building operation according to the column information of the table to which the modified column operation belongs and sending the table building operation to a destination synchronization system for execution; and the table building operation comprises building a mapping relation between a table name of a new table and an original table name, and the column information of the new table refers to the column information set after the column modifying operation is finished corresponding to the original table.
In the prior art, it is common that the source database sends the existing modified column operation to the destination database after the modified column operation is completed, which is why "the operation takes about 60 minutes to execute on the source database, and the operation takes at least 60 minutes to complete when synchronized to the destination database, which causes a delay of at least 60 minutes. In the embodiment of the present invention, the difference is that once the modified column operation is collected by the log (at this time, the operation process of modifying the column is just started for the source), a table building operation is sent to the synchronization system of the destination database, where the table building operation is actually prepared to be completed in advance by delaying the execution for 60 minutes (the execution from the source database takes about 60 minutes) as exemplified in the background art.
The practical significance of the improvement of the above steps in the embodiment of the present invention is that in specific applications, once the column modification operation is executed in the source database, the probability of more than 99% of the column modification operation is normally completed except for the respective reasons of downtime, network interruption, etc. Because of the objective factors of stable completion, the above-mentioned "when the collected log content corresponding to the DDL transaction is the user table INSERT operation, the DDL transaction is judged to be the operation of modifying the column, and according to the column information of the table to which the operation of modifying the column belongs, a table building operation is constructed and sent to the destination synchronous system to be executed" becomes an effective and feasible operation mode, so that the fundamental place that the operation of basically ensuring that the source database and the destination database modify the column at almost the same time "can be finally obtained is realized.
In step 203, a query insertion operation is constructed and sent to the synchronization system of the destination database to execute, and the data in the corresponding original table of the destination database is copied to the new table of the destination database.
It should be emphasized that the operation of step 203 does not lock the original table in the destination database as in the prior art, and therefore, the query operation on the original table can be performed normally while step 203 is executed. Since the synchronization operation between the source-side database and the destination-side database is performed by the synchronization systems on both sides, and the synchronization system is an independent layer with respect to the database itself, it can be understood that the synchronization system is a control subject that sends a specified DML operation to the database, and therefore, there is a space for implementing the improvement steps proposed in the above embodiment of the present invention.
In step 204, the source database synchronization system continues to capture the log until capturing the commit log of the DDL transaction, so the synchronization system of the source database notifies the synchronization system of the destination database to delete the original table, and replaces the table name of the newly created table with the table name of the original table in the destination database, thereby completing the DDL transaction synchronization.
Before completing the DDL transaction synchronization, if a query request is received by the destination, the method further comprises: and the destination database provides the original table as a queried object of the query request, wherein the query request process does not influence the process of copying the original table data into the newly built table.
The embodiment of the invention captures the user table operation log corresponding to the DDL things as the basis for triggering the target end to carry out new table operation, further carries out original table modification of the target end database only when capturing the submission log or rollback log of the affairs, and replaces the new table name with the deleted original table, thereby basically ensuring that the source and the target database carry out column modification operation at almost the same time, and when modifying the column, the target end adopts a mode of reserving the original table to create a new table, the query operation of the original table is not influenced by the synchronization of the column modification operation, so that the time delay caused by the synchronization of the column modification operation is shortened, and the influence of the synchronization of the column modification operation on the application of the target end is reduced.
As shown in fig. 2, after the synchronization system of the source database in step 201 determines that the operation to be synchronized in the log is a DDL transaction, and collects subsequent logs belonging to the DDL transaction according to the transaction number, in addition to the above-mentioned case in step 202, another case may be encountered, specifically:
in step 205, when the log content corresponding to the DDL transaction is collected first, instead of the user table INSERT operation, but a commit log, it is determined that the DDL transaction is a column modification operation, and then a column modification operation is sent to the destination synchronization system. Here, it is considered that if the commit log in step 205 is collected instead of the INSERT operation in step 202, it can be determined that the original table to which the modified column operation needs to be performed is an empty table, and there is no delay described in the background art by using the conventional mechanism, because the creation of the empty table can be completed quickly by using the conventional database mechanism, and at this time, the process of step 203 and 204 proposed by the embodiment of the present invention is somewhat redundant; therefore, in this case, the modified column operation instruction between the two-side synchronization systems can be directly used for synchronization.
Further from the destination database, after receiving the modified column operation from the source database in step 205, the synchronization system of the destination database directly executes the modified column operation on the table on the destination database after receiving the modified column operation of the synchronization system of the source database, thereby completing the DDL transaction synchronization. And on the destination database, the process of executing the operation of modifying the column comprises the following steps: locking the original table of the destination database, creating a table name of a newly created table and establishing a mapping relation with the original table name, keeping the column information of the newly created table consistent with the column information of the corresponding original table after column modification operation is completed, and modifying the original table of the destination database after the new creation is completed
In this embodiment of the present invention, before completing the DDL transaction synchronization, if a query request is received by a destination database, the method further includes:
the original table of the destination database is locked so as to be in an inaccessible state; and after the DDL things are synchronized, the query request is completed through the new table replacing the original table.
As shown in fig. 3, the source-side database synchronization system continues to capture logs, and includes the case of rolling back logs in addition to the commit log described in step 204, specifically:
in step 206, when the rollback log of the DDL transaction is captured, the synchronization system of the source database constructs an operation of deleting the newly-built table of the destination and sends the operation to the destination for execution, thereby completing the DDL transaction synchronization.
In this embodiment of the present invention, when the collected log content corresponding to the DDL transaction is a user table INSERT operation, determining whether the DDL transaction is a column modification operation, as shown in fig. 4, specifically includes:
in step 301, for a DDL transaction, the source database generates a log of a system table and then generates a log corresponding to a common table.
In step 302, after the log of the system table is collected, the operation to which the DDL transaction belongs is reversely deduced through the log of the system table; for example, in embodiment 1 of the present invention, the inverse push out of the INSERT oplog through the user table is a modified column operation. The reverse reasoning here is usually to get one or more items in default formulas for describing the column names, column unit values and column assignments of the modification table according to the update operation in the capture sys. col $ table, as described in example 3, and compare with the definition description of the column in the original table; and if one or more items in the default formula for the column names, the column unit values and the column assignments are found to be changed, determining the operation as column modification.
The condition for collecting the log of the system table is as follows: generating a commit or rollback message for the DDL transaction, and/or generating a log of a common table corresponding to the DDL transaction.
Example 2:
the embodiment of the present invention describes the implementation process of embodiment 1 of the present invention from a relatively complete method step process, and compared with embodiment 1, the process of the embodiment of the present invention is more complete in performance and more instantiated. In the embodiment of the present invention, synchronization systems are deployed in a source database and a destination database, the source database synchronization system reads a log from the source database, and the destination database synchronization system is responsible for applying a synchronization operation sent by a source to the destination database, as shown in fig. 5, including the following steps:
in step 401, the source database synchronization system determines whether the operation in the log belongs to a system table operation, if so, the operation is classified as a DDL transaction, obtains a transaction number of the DDL operation, and collects subsequent logs belonging to the transaction according to the transaction number.
When the commit log of the above-described DDL transaction is collected, the modified column is synchronized in step 402 using conventional means. The conventional manner described herein may be a modified column synchronization mechanism provided by each platform itself, which is not described herein.
In step 403, when the user table INSERT operation log related to the DDL transaction is collected, it needs to determine whether the DDL transaction is a column modification operation, and according to the column information of the table to which the column modification operation belongs, a table building operation is constructed and sent to a destination terminal for synchronous service execution, so as to ensure that the table name a _ BAK of the table building is different from the original table name a, but the column information is consistent with the original table.
In step 404, a query insertion operation is constructed and sent to the synchronization system of the destination database to execute, and the data in the A table is copied to the A _ BAK table.
In step 405, the source-side database synchronization system continues to capture logs, and subsequent DML logs belonging to the modified-column transaction may be discarded directly until a commit or rollback log of the transaction is captured.
When the rollback log of the modified column is captured, in step 406, the operation to construct the delete table a _ BAK is sent to the target for execution, since the target has already performed the create table a _ BAK operation, completing the transaction synchronization.
In step 407, when the commit log of the modified column is captured, the destination synchronization service is notified to perform a renaming operation on the newly created table, the destination synchronization service modifies the original table a, and then the new table a _ BAK is renamed to a, thereby completing the transaction synchronization.
When the scheme is operated, the operation of modifying the columns of the table can be basically ensured to be carried out on the source database and the target database at the same time, and the target end is carried out in a mode of reserving the original table to create a new table when modifying the columns, and the query operation of the original table is not influenced by the synchronization of the operation of modifying the columns, so that the delay caused by the synchronization of the operation of modifying the columns is shortened, and the influence of the synchronization of the operation of modifying the columns on the application of the target end is reduced.
Example 3:
in the embodiment of the present invention, when the source database is ORACLE, the method content in the above embodiment 2 is further developed and instantiated as follows:
all object information of the source database is stored in the related system dictionary tables (such as sys. obj $, sys. col $, sys. icol $), and taking ORACLE as an example, the table definition of the table is stored in the sys. obj $ system table, and the column definition is placed in the sys. col system $ table. When ORACLE modifies column C for table a, it will modify a row definition of the table in the sys.col $ system table first, and the synchronization service can acquire definition information of modified C column in the log by capturing the modified column operation of the sys.col $ system table; after the database constructs the dictionary of the A table, the database constructs the data recorded in the A table, the operation generates a large amount of INSERT logs aiming at the A table in the logs, the synchronization service can know that the log operation of the system table is finished when acquiring the insertion operation of the A table, and the original modified column operation can be restored according to the collected system table operation.
Examples are as follows:
the existing table a (C INT, COL INT) of the source database, a, has 1 row of data, and performs a modify column operation at the source end:
ALTER TABLE A MODIFY COL VARCHAR(100);
as shown in fig. 6, the synchronization process is as follows:
in step 501, the source data synchronization service collects a system table operation log for the modified column.
In step 502, when the INSERT operation log of A is found in the DDL transaction of the modified column, resolving the log restore of the system TABLE results in the operation ALTER TABLE A MODIFY COL VARCHAR of the modified column (100).
In step 503, the source data synchronization constructs a TABLE build operation to the destination synchronization service, and the TABLE structure uses the modified column, and the TABLE build SQL is CREATE TABLE A _ BAK (C INT, COL VARCHAR (100)).
In step 504, the source data synchronization construct query insertion operation is sent to the synchronization system of the destination database to copy the data of the a table to a _ BAK, and the query insertion SQL is INSERT INTOA _ BAK (C, COL) SELECT C, CAST (COL AS VARCHAR (100)) COL FROM a.
In step 505, if the source modified column operation is rolled back, the SQL constructed by the deletion table in the above example is DROP table _ BAK, and needs to be sent to the target for execution, so as to delete the newly created table a _ BAK.
In step 506, if the source modified column operation commits, the renamed end database a _ BAK is needed to be TABLE a, so a modified TABLE operation is first sent to modify the old TABLE a, the SQL to modify the TABLE is DROPTABLE a, the TABLE a _ BAK is named a, and the SQL to modify the TABLE is named ALTER TABLE a _ BAK RENAME TOA.
In step 507, synchronization is completed.
The embodiment of the invention further deletes the original table of the database at the destination end only when capturing the submission log or the rollback log of the transaction by capturing the user table operation log corresponding to the DDL object as the basis for triggering the destination end to carry out the operation of newly creating the table, and replaces the name of the newly created table with the deleted original table, thereby basically ensuring that the source database and the destination database carry out the operation of modifying columns at almost the same time.
Example 4:
according to the change of the records of the database system dictionary table, the embodiment of the invention tracks the DML operation of the corresponding system dictionary table in the log and deduces the operation of a modification column related to the modification of the table by a user. The embodiment of the invention is particularly suitable for the synchronous operation process of modifying one or more items in the default formulas for column name modification, assignment in the modified column and assignment of the modified column. The embodiment of the invention optimizes the synchronization of the modified columns in a targeted manner, thereby shortening the synchronization time delay of the operation of the modified columns.
Based on the foregoing practical situation, an embodiment of the present invention provides a method for synchronizing operations of a modified column of a source-side database based on log parsing, where as shown in fig. 7, the method includes:
601. and a synchronization system is deployed in the source end database and the destination end database, the source end database synchronization system reads logs from the source end database, and the destination end database synchronization system is responsible for transmitting synchronization operations sent by the source end database to the destination end database.
602. The source end database synchronization system judges whether the operation in the log belongs to the system table operation, if so, the operation is classified into a DDL transaction, the transaction number of the DDL operation is obtained, and the subsequent log belonging to the transaction is collected according to the transaction number.
603. When the commit log of the DDL transaction is collected, the modified column is synchronized using conventional means.
604. When the user table INSERT operation log related to the DDL transaction is collected, it needs to be judged whether the user table INSERT operation log is an operation for modifying a column in the DDL transaction, and if so, the original operation for modifying the column is restored according to the previously collected system table operation log.
605. And sending the modified column operation to a synchronous system execution of a destination end database.
606. The source database synchronization system continues to capture logs, and subsequent DML logs belonging to the modified column transaction can be directly discarded until a commit or rollback log of the transaction is captured.
607. When the rollback log of the modified column is captured, since the target end has already executed the operation of the modified column, the restore operation for constructing the original modified column is sent to the target end for execution, and the transaction synchronization is completed.
608. The transaction synchronization is completed when a commit log is captured that takes the modified column.
When the scheme is operated, the operation of modifying the columns of the table by the source database and the target database can be basically ensured to be almost performed at the same time, and the delay caused by the synchronization of the operation of modifying the columns is shortened.
Example 5:
the basic steps of the above embodiment 4, when the source database is ORACLE, as shown in fig. 8, can be explained as follows:
first, all object information of the source database is stored in the related system dictionary tables (e.g., sys. obj $, sys. col $, sys. icol $), and taking ORACLE as an example, the table definitions of the tables are stored in the sys. obj $ system table, and the column definitions are set in the sys. col $ system table. When ORACLE modifies the data type of column C for table a, it will modify a row and column definition in the sys.col $ system table first, and the synchronization service can acquire the definition information of the column in the log by capturing the operation of the sys.col $ system table; after the database constructs the dictionary of the A table, the database constructs the data recorded in the A table, the operation generates a large amount of INSERT logs aiming at the A table in the logs, when the synchronous service captures the insertion operation of the A table, the completion of the log operation of the system table can be known, the original modified column operation can be restored according to the collected system table operation, and the modified table structure can be obtained.
Examples are as follows:
the source database existing table a (col date), a having 1 line of data, performs a modify column operation at the source end:
ALTER TABLE A MODIFY COL VARCHAR(30);
the synchronization process is as follows:
in step 701, the source database synchronization system collects a system table operation log for the modified column.
In step 702, when the INSERT operation log of A is found in the DDL transaction of the modified column, the log restore of the parsing system TABLE gets the operation ALTER TABLE A MODIFY COL VARCHAR of the modified column (30).
In step 703, the source database synchronization system sends the modified column operation to the destination database for execution, which is ALTER TABLE A MODIFY COL VARCHAR (30).
In step 704, if the modified column operation of the source database is rolled back, the SQL constructed to restore the modified column is sent to the target for execution, and the SQL constructed in the above example is the ALTER TABLE a modified column.
In step 705, synchronization is complete if the modify column operation of the source database commits.
With the combination of embodiment 5 and embodiment 1 of the present invention, there is also a preferred implementation manner, that is, it is determined whether the modification type is one or more items in a default formula for modifying the column name, assigning values in the modified column, and assigning values to the modified column, or is the column data type adjustment; if the scheme is adopted in the former embodiment 5, and if the scheme is adopted in the latter embodiment (column data type adjustment), the scheme in the embodiment 1 is adopted, so that a more efficient synchronization effect is achieved, and the performance advantages of the two are integrated.
Example 6:
fig. 9 is a schematic diagram illustrating an architecture of a database modified column synchronization apparatus based on log parsing according to an embodiment of the present invention. The log resolution-based database modification column synchronization apparatus of the present embodiment includes one or more processors 21 and a memory 22. In fig. 9, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or other means, and fig. 9 illustrates the connection by a bus as an example.
The memory 22, which is a non-volatile computer-readable storage medium, can be used to store a non-volatile software program and a non-volatile computer-executable program, such as the log-parsing-based database modified column synchronization method in embodiment 1. Processor 21 executes the log resolution based database modified column synchronization method by executing non-volatile software programs and instructions stored in memory 22.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the log resolution-based database modified column synchronization method of embodiment 1 described above, for example, perform the steps shown in fig. 1 to 8 described above.
It should be noted that, for the information interaction, execution process and other contents between the modules and units in the apparatus and system, the specific contents may refer to the description in the embodiment of the method of the present invention because the same concept is used as the embodiment of the processing method of the present invention, and are not described herein again.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A database modification column synchronization method based on log analysis is characterized by comprising the following steps:
a synchronization system of a source end database judges that an operation to be synchronized in a log is a DDL transaction, and then collects the subsequent log belonging to the DDL transaction according to the transaction number;
when the collected log content corresponding to the DDL transaction is the user table INSERT operation, judging that the DDL transaction is the operation of modifying the column, constructing a table building operation according to the column information of the table to which the operation of modifying the column belongs, and sending the table building operation to a destination end synchronous system for execution; the table building operation comprises building a table name of a new table and building a mapping relation with an original table name, and column information of the new table is set by referring to column information after column modification operation is completed corresponding to the original table;
constructing a synchronous system execution of query insertion operation and sending to a destination end database, and copying data in a corresponding original table of the destination end database to a new table of the destination end database;
and the source end database synchronization system continues to capture the log until the submission log of the DDL transaction is captured, so that the synchronization system of the source end database informs the synchronization system of the destination end database to delete the original table, and the table name of the newly built table is replaced by the table name of the original table in the destination end database, thereby completing the DDL transaction synchronization.
2. The log resolution-based database modification column synchronization method of claim 1, wherein the source-side database synchronization system continues to capture logs, further comprising:
when the rollback log of the DDL transaction is captured, the synchronization system of the source database constructs and deletes the operation of newly building the table of the destination and sends the operation to the destination for execution, and the DDL transaction synchronization is completed.
3. The log resolution-based database modification column synchronization method according to claim 1, wherein before completing the DDL transaction synchronization, if a query request is received by a destination, the method further comprises:
and the destination database provides the original table as a queried object of the query request, wherein the query request process does not influence the process of copying the original table data into the newly built table.
4. The method for synchronizing modified columns of a database based on log parsing according to any one of claims 1 to 3, wherein when the collected log content corresponding to the DDL transaction is a user table INSERT operation, determining whether the DDL transaction is a column modification operation specifically includes:
aiming at DDL affairs, the source end database can generate a log of a system table and a log corresponding to a common table;
after the log of the system table is collected, the operation to which the DDL transaction belongs is reversely deduced through the log of the system table;
the condition for collecting the log of the system table is as follows: generating a commit or rollback message for the DDL transaction, and/or generating a log of a common table corresponding to the DDL transaction.
5. The log resolution-based database modification column synchronization method according to claim 4, wherein the backward-deriving the operation to which the DDL transaction belongs through the system table log specifically comprises:
according to the updating operation in the capture sys. col $ table, one or more items in default formulas for describing each column name, each column unit value and each column assignment of the modification table are obtained and compared with the definition description of the column of the original table; and if one or more items in the default formula for the column names, the column unit values and the column assignments are found to be changed, determining the operation as column modification.
6. The log resolution based database modification column synchronization method of any one of claims 1 to 3, wherein the database comprises an Oracle and/or Daemon database; the operation of modifying the column is specifically column data type adjustment.
7. A method for synchronizing modified columns of a database based on log parsing according to any of claims 1-3, wherein synchronization systems are deployed in the source database and the destination database, the source database synchronization system reads logs from the source database, and the destination database synchronization system is responsible for applying synchronization operations sent from the source to the destination database.
8. A log parsing-based database modification column synchronization method is characterized by comprising the following steps:
the source end database synchronization system judges whether the operation in the log belongs to the system table operation, if so, the operation is classified into a DDL transaction, the transaction number of the DDL operation is obtained, and the subsequent log belonging to the transaction is collected according to the transaction number;
when the commit log of the DDL transaction is collected, the modified column is synchronized in a traditional way;
when the INSERT operation log of the user table related to the DDL transaction is collected, judging whether the operation in the DDL transaction is the operation of modifying the column, if so, restoring the original operation of modifying the column according to the previously collected system table operation log;
sending the modified column operation to a synchronous system execution of a destination end database;
the source database synchronization system continues to capture the log until the commit or rollback log of the transaction is captured;
when capturing the rollback log of the modified column, because the target end has already executed the operation of the modified column, the restore operation of the original modified column is constructed and sent to the target end for execution, and the transaction synchronization is completed; the transaction synchronization is completed when a commit log is captured that takes the modified column.
9. The log resolution-based database modified column synchronization method of claim 7, wherein the modified column operation comprises: modifying one or more of the column name, modifying the assignment in the column, modifying the default formula for the assignment in the column.
10. An apparatus for synchronizing modified columns of a database based on log parsing, the apparatus comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the log resolution based database modified column synchronization method of any of claims 1-9.
CN201911422394.9A 2019-12-31 2019-12-31 Database modification column synchronization method and device based on log analysis Active CN111221909B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911422394.9A CN111221909B (en) 2019-12-31 2019-12-31 Database modification column synchronization method and device based on log analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911422394.9A CN111221909B (en) 2019-12-31 2019-12-31 Database modification column synchronization method and device based on log analysis

Publications (2)

Publication Number Publication Date
CN111221909A true CN111221909A (en) 2020-06-02
CN111221909B CN111221909B (en) 2021-05-28

Family

ID=70828036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911422394.9A Active CN111221909B (en) 2019-12-31 2019-12-31 Database modification column synchronization method and device based on log analysis

Country Status (1)

Country Link
CN (1) CN111221909B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858505A (en) * 2020-06-04 2020-10-30 武汉达梦数据库有限公司 Parallel execution method and data synchronization system based on log analysis synchronization
CN112433992A (en) * 2020-11-16 2021-03-02 武汉船舶通信研究所(中国船舶重工集团公司第七二二研究所) Data synchronization log optimization method

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050234934A1 (en) * 2004-04-14 2005-10-20 Blackrock System and method for controlling the release of updates to a database configuration
CN102752372A (en) * 2012-06-18 2012-10-24 天津神舟通用数据技术有限公司 File based database synchronization method
CN104820701A (en) * 2015-05-11 2015-08-05 北京瑞星信息技术有限公司 Method and system for recording and synchronizing data
CN105005618A (en) * 2015-07-21 2015-10-28 杭州合众数据技术有限公司 Data synchronization method and system among heterogeneous databases
CN105205053A (en) * 2014-05-30 2015-12-30 阿里巴巴集团控股有限公司 Method and system for analyzing database incremental logs
US20160147788A1 (en) * 2014-11-20 2016-05-26 Ricoh Company, Ltd. Pre-synchronization analysis between local documents and an online document management system
CN108920698A (en) * 2018-07-16 2018-11-30 北京京东金融科技控股有限公司 A kind of method of data synchronization, device, system, medium and electronic equipment
CN109189852A (en) * 2018-08-01 2019-01-11 武汉达梦数据库有限公司 A kind of method that data are synchronous and the device synchronous for data
CN109558452A (en) * 2018-11-19 2019-04-02 武汉达梦数据库有限公司 It is a kind of to inquire the synchronous method for building table handling
CN109656934A (en) * 2018-11-19 2019-04-19 武汉达梦数据库有限公司 Source oracle database DDL synchronous method and equipment based on log parsing
CN109933630A (en) * 2019-03-19 2019-06-25 武汉达梦数据库有限公司 Database data real-time synchronization method and equipment

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050234934A1 (en) * 2004-04-14 2005-10-20 Blackrock System and method for controlling the release of updates to a database configuration
CN102752372A (en) * 2012-06-18 2012-10-24 天津神舟通用数据技术有限公司 File based database synchronization method
CN105205053A (en) * 2014-05-30 2015-12-30 阿里巴巴集团控股有限公司 Method and system for analyzing database incremental logs
US20160147788A1 (en) * 2014-11-20 2016-05-26 Ricoh Company, Ltd. Pre-synchronization analysis between local documents and an online document management system
CN104820701A (en) * 2015-05-11 2015-08-05 北京瑞星信息技术有限公司 Method and system for recording and synchronizing data
CN105005618A (en) * 2015-07-21 2015-10-28 杭州合众数据技术有限公司 Data synchronization method and system among heterogeneous databases
CN108920698A (en) * 2018-07-16 2018-11-30 北京京东金融科技控股有限公司 A kind of method of data synchronization, device, system, medium and electronic equipment
CN109189852A (en) * 2018-08-01 2019-01-11 武汉达梦数据库有限公司 A kind of method that data are synchronous and the device synchronous for data
CN109558452A (en) * 2018-11-19 2019-04-02 武汉达梦数据库有限公司 It is a kind of to inquire the synchronous method for building table handling
CN109656934A (en) * 2018-11-19 2019-04-19 武汉达梦数据库有限公司 Source oracle database DDL synchronous method and equipment based on log parsing
CN109933630A (en) * 2019-03-19 2019-06-25 武汉达梦数据库有限公司 Database data real-time synchronization method and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YUBIAO WANG: "Research on Incremental Heterogeneous database Synchronization Update Based on Web Service", 《2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN)》 *
刘传尧: "基于日志解析的Oracle远程数据库同步方法研究及实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858505A (en) * 2020-06-04 2020-10-30 武汉达梦数据库有限公司 Parallel execution method and data synchronization system based on log analysis synchronization
CN111858505B (en) * 2020-06-04 2024-04-16 武汉达梦数据库股份有限公司 Parallel execution method and data synchronization system based on log analysis synchronization
CN112433992A (en) * 2020-11-16 2021-03-02 武汉船舶通信研究所(中国船舶重工集团公司第七二二研究所) Data synchronization log optimization method
CN112433992B (en) * 2020-11-16 2023-06-02 武汉船舶通信研究所(中国船舶重工集团公司第七二二研究所) Data synchronization log optimization method

Also Published As

Publication number Publication date
CN111221909B (en) 2021-05-28

Similar Documents

Publication Publication Date Title
CN111221907B (en) Database added column synchronization method and device based on log analysis
CN111241094B (en) Database deleted column synchronization method and device based on log analysis
CN109241185B (en) Data synchronization method and data synchronization device
US11704290B2 (en) Methods, devices and systems for maintaining consistency of metadata and data across data centers
CN109739935B (en) Data reading method and device, electronic equipment and storage medium
CN110196856B (en) Distributed data reading method and device
EP3803618B1 (en) Distributed transactions in cloud storage with hierarchical namespace
US10430298B2 (en) Versatile in-memory database recovery using logical log records
CN110262929B (en) Method for ensuring consistency of copying affairs and corresponding copying device
US20170192863A1 (en) System and method of failover recovery
CN109189852B (en) Data synchronization method and device for data synchronization
WO2019231689A1 (en) Multi-protocol cloud storage for big data and analytics
US20120259863A1 (en) Low Level Object Version Tracking Using Non-Volatile Memory Write Generations
US20120191679A1 (en) Database server apparatus, method for updating database, and recording medium for database update program
JP7220807B2 (en) Data reading method, device, computer device and computer program
CN109710388A (en) Method for reading data, device, electronic equipment and storage medium
CN111221909B (en) Database modification column synchronization method and device based on log analysis
CN111797121A (en) Strong consistency query method, device and system for read-write separation architecture service system
US20200034472A1 (en) Asynchronous cache coherency for mvcc based database systems
CN111930850A (en) Data verification method and device, computer equipment and storage medium
CN114741453A (en) Method, system and computer readable storage medium for data synchronization
US20200097581A1 (en) Systems and methods of managing manifest refresh in a database
CN111241193B (en) Database added column synchronization method and device based on log analysis
US9020905B2 (en) Synchronizing database and non-database resources without a commit coordinator
CN111858504A (en) Operation merging execution method based on log analysis synchronization and data synchronization system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Applicant after: Wuhan dream database Co., Ltd

Address before: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Applicant before: WUHAN DAMENG DATABASE Co.,Ltd.

CB02 Change of applicant information
CB03 Change of inventor or designer information

Inventor after: Sun Feng

Inventor after: Peng Qingsong

Inventor after: Yu Yuanlan

Inventor before: Fu Quan

Inventor before: Sun Feng

Inventor before: Peng Qingsong

Inventor before: Yu Yuanlan

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant