Database deleted column synchronization method and device based on log analysis
[ technical field ] A method for producing a semiconductor device
The invention relates to the technical field of databases, in particular to a database deleted column synchronization method and device based on log analysis.
[ background of the invention ]
In a database real-time synchronization system (supporting heterogeneous) based on a log resolution architecture, deletion column operation synchronization of table modification tends to slow down the performance of synchronization. Mainly, in an actual application system, the data volume of some tables is too large, if the table structure is modified on the table with a large number of records, such as deleting a certain column, the table structure of the database can be reconstructed, and due to the large number of records of the tables, the database consumes a large amount of time and generates a large amount of database logs during reconstruction.
In the traditional synchronization method, a source end log analysis service needs to capture a submission log of a source end database table modification operation and then begin to analyze and synchronize the table modification operation, before the operation submission log is captured, a large amount of logs generated by the operation need to be collected and cached, the caching operation can generate a large amount of IO (input/output) to influence the performance of a source end database, the table modification operation is sent to a target end, the database of the target end can be completed by consuming a large amount of time when the modification operation is executed, and during the period, when the data synchronization service synchronizes the subsequent operation of the table, the data synchronization service can be performed only after the target end database completes the table modification, so that great synchronization delay is caused.
For example, if an a-table deletion of 1 hundred million rows of records in the source database takes approximately 60 minutes to execute in the source database, and if synchronization to the destination database also takes at least 60 minutes to complete, then data synchronization is delayed by at least 60 minutes.
In an environment of building main-standby synchronization, a target-end database is often used as a query library to share the pressure of the main library, when a table deletes a column, query operation for the table is blocked, long-time table modification operation of the target-end database affects the query function of an application, and how to shorten synchronization delay of large table modification operation and reduce the influence of synchronization on an application program on the target-end database becomes a technical problem to be solved in the industry.
[ summary of the invention ]
The technical problem to be solved by the present invention is that in an environment where a main-standby synchronization is established, a target-side database is often used as a query library to share the pressure of the main library, when a table deletes a column, query operation for the table will be blocked, long-time table modification operation of the target-side database will affect the query function of an application, and how to shorten synchronization delay of large table modification operation and reduce the influence of synchronization on an application program on the target-side database becomes a technical problem to be solved in the industry.
The invention adopts the following technical scheme:
in a first aspect, the present invention provides a database deletion column synchronization method based on log parsing, including:
a synchronization system of a source end database judges that an operation to be synchronized in a log is a DDL transaction, and then collects the subsequent log belonging to the DDL transaction according to the transaction number;
when the collected log content corresponding to the DDL transaction is the user table INSERT operation, judging that the DDL transaction is a column deleting operation, and constructing a table building operation according to the column information of a table to which the column deleting operation belongs and sending the table building operation to a destination end synchronous system for execution; the table building operation comprises building a table name of a new table and building a mapping relation with an original table name, and column information of the new table is set by referring to column information corresponding to the original table after column deletion operation is completed;
constructing a query insertion operation, sending the query insertion operation to a target end synchronous service execution, and copying data in a corresponding original table of a target end database to a new table of the target end database;
and the source end database synchronization system continues to capture the log until the submission log of the DDL transaction is captured, so that the synchronization system of the source end database informs the synchronization system of the destination end database to delete the original table, and the table name of the newly built table is replaced by the table name of the original table in the destination end database, thereby completing the DDL transaction synchronization.
Preferably, the method for synchronizing a DDL transaction in a log includes that a synchronization system of a source database determines that an operation to be synchronized in the log is a DDL transaction, and then collects subsequent logs belonging to the DDL transaction according to the transaction number, and further includes:
when the log content corresponding to the DDL transaction is collected firstly, the log content is not the user table INSERT operation but the log is submitted, the DDL transaction is judged to be the operation of deleting the column, and the operation of deleting the column is sent to a destination end synchronization system.
Preferably, after receiving the delete column operation of the synchronization system of the source database, the synchronization system of the destination database directly executes the delete column operation on the table on the destination database, thereby completing the DDL transaction synchronization.
Preferably, before the DDL, if a query request is received by the destination database, the method further includes:
the original table of the destination database is locked so as to be in an inaccessible state; and after the DDL things are synchronized, the query request is completed through the new table replacing the original table.
Preferably, the source database synchronization system continues to capture the log, further comprising:
when the rollback log of the DDL transaction is captured, the synchronization system of the source database constructs and deletes the operation of newly building the table of the destination and sends the operation to the destination for execution, and the DDL transaction synchronization is completed.
Preferably, before completing the DDL transaction synchronization, if the destination receives a query request, the method further includes:
and the destination database provides the original table as a queried object of the query request, wherein the query request process does not influence the process of copying the original table data into the newly built table.
Preferably, when the collected log content corresponding to the DDL transaction is a user table INSERT operation, determining whether the DDL transaction is a column deletion operation, specifically including:
aiming at DDL affairs, a source end database generates a log of a system table and then generates a log corresponding to a common table;
after the log of the system table is collected, the operation to which the DDL transaction belongs is reversely deduced through the log of the system table;
the condition for collecting the log of the system table is as follows: generating a commit or rollback message for the DDL transaction, and/or generating a log of a common table corresponding to the DDL transaction.
Preferably, the database comprises an Oracle and/or damming database.
Preferably, in the method, synchronization systems are deployed in the source end database and the destination end database, the source end database synchronization system reads logs from the source end database, and the destination end database synchronization system is responsible for applying synchronization operations sent by the source end to the destination end database.
In a second aspect, the present invention further provides a database deleted column synchronization apparatus based on log parsing, which is used to implement the database deleted column synchronization method based on log parsing in the first aspect, and the apparatus includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor for performing the log parsing based database delete column synchronization method of the first aspect.
In a third aspect, the present invention further provides a non-volatile computer storage medium, where the computer storage medium stores computer-executable instructions, which are executed by one or more processors, for performing the log parsing-based database deletion column synchronization method according to the first aspect.
The invention captures the user table operation log corresponding to the DDL things as the basis for triggering the target end to carry out new table operation, further deletes the original table of the target end database only when capturing the submission log or rollback log of the affair, and replaces the new table name with the deleted original table, thereby basically ensuring that the original and target databases carry out the operation of deleting columns at almost the same time.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic flowchart of a database deletion column synchronization method based on log parsing according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a more complicated database deletion column synchronization method based on log parsing according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a complicated database deletion column synchronization method based on log parsing according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for analyzing operations of DDL things according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating another database deletion column synchronization method based on log parsing according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of a database deleted column synchronization method based on log parsing in an SQL scenario according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a database deletion column synchronization apparatus based on log parsing according to an embodiment of the present invention.
[ detailed description ] embodiments
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
the embodiment 1 of the invention provides a database deleted column synchronization method based on log analysis, wherein the database comprises an Oracle database and/or a Damomeng database. As shown in fig. 1, includes:
in step 201, the synchronization system of the source database determines that the operation to be synchronized in the log is a DDL transaction, and then collects subsequent logs belonging to the DDL transaction according to the transaction number.
Aiming at DDL affairs, a source end database generates a log of a system table and then generates a log corresponding to a common table; and step 201 of the embodiment of the present invention mainly refers to collecting the log of the system table.
In step 202, when the collected log content corresponding to the DDL transaction is a user table INSERT operation (when the table is in a modified state, for example, a delete column in the embodiment of the present invention, the table is locked inside the database, other transactions cannot operate the table, and only after the DDL is completed, the table can be accessed by other transactions, but the INSERT operation is a boundary ridge and belongs to the DDL operation before the INSERT operation), it is determined that the DDL transaction is an operation of deleting the column, and a table creating operation is constructed according to the column information of the table to which the delete column operation belongs, and sent to the destination synchronization system for execution; the table building operation comprises building a mapping relation between a table name of a new table and an original table name, and column information of the new table is set by referring to column information corresponding to the original table after the column deleting operation is completed.
In the prior art, it is common that the source database sends the existing deleted column operation to the destination database after the deleted column operation is completed, which is why "the operation takes about 60 minutes to execute on the source database, and the operation takes at least 60 minutes to complete when synchronized to the destination database, which causes a delay of at least 60 minutes. In the embodiment of the present invention, it is different that once the operation of deleting the column is collected through the log (at this time, the operation process of deleting the column is just started for the source), a table building operation is sent to the synchronization system of the destination database, where the operation is actually performed after delaying for 60 minutes (taking about 60 minutes from the source database execution) as exemplified in the background art, and is prepared to be completed in advance.
The practical significance of the improvement of the steps in the embodiment of the present invention is that in specific applications, once the deletion column operation is executed in the source database, the probability of more than 99% of the deletion column operation is normally completed except for the respective reasons of downtime, network interruption, etc. Because of the objective factors for stably completing, it is realized that the above-mentioned "when the collected log content corresponding to the DDL transaction is the user table INSERT operation, the DDL transaction is judged to be the operation of deleting the column, and according to the column information of the table to which the operation of deleting the column belongs, a table building operation is constructed and sent to the destination synchronization system to be executed" becomes an effective and feasible operation mode, thereby becoming the root place capable of finally obtaining "the operation of deleting the column from the table at almost the same time in the source and destination databases can be basically ensured".
In step 203, a query insertion operation is constructed and sent to the target synchronization service execution, and the data in the corresponding original table of the target database is copied to the new table of the target database.
It should be emphasized that the operation of step 203 does not lock the original table in the destination database as in the prior art, and therefore, the query operation on the original table can be performed normally while step 203 is executed. Since the synchronization operation between the source-side database and the destination-side database is performed by the synchronization systems on both sides, and the synchronization system is an independent layer with respect to the database itself, it can be understood that the synchronization system is a control subject that sends a specified DML operation to the database, and therefore, there is a space for implementing the improvement steps proposed in the above embodiment of the present invention.
In step 204, the source database synchronization system continues to capture the log until capturing the commit log of the DDL transaction, so the synchronization system of the source database notifies the synchronization system of the destination database to delete the original table, and replaces the table name of the newly created table with the table name of the original table in the destination database, thereby completing the DDL transaction synchronization.
Before completing the DDL transaction synchronization, if a query request is received by the destination, the method further comprises: and the destination database provides the original table as a queried object of the query request, wherein the query request process does not influence the process of copying the original table data into the newly built table.
The embodiment of the invention captures the user table operation log corresponding to the DDL things as the basis for triggering the target end to carry out new table operation, further deletes the original table of the target end database only when capturing the submission log or rollback log of the affair, and replaces the new table name with the deleted original table, thereby basically ensuring that the source and the target database almost carry out column deletion operation at the same time.
As shown in fig. 2, after the synchronization system of the source database in step 201 determines that the operation to be synchronized in the log is a DDL transaction, and collects subsequent logs belonging to the DDL transaction according to the transaction number, in addition to the above-mentioned case in step 202, another case may be encountered, specifically:
in step 205, when the log content corresponding to the DDL transaction is collected first, instead of the user table INSERT operation, but a commit log, it is determined that the DDL transaction is a delete column operation, and then a delete column operation is sent to the destination synchronization system. Here, it is considered that if the commit log in step 205 is collected instead of the INSERT operation in step 202, it can be determined that the original table on which the delete column operation needs to be performed is an empty table, and there is no delay described in the background art by using the conventional mechanism, because the creation of the empty table can be completed quickly by using the conventional database mechanism, and at this time, there is some redundancy by using the method process of step 203 and 204 proposed in the embodiment of the present invention; therefore, in this case, synchronization can be performed directly using the delete column operation instruction between the two-side synchronization systems.
Further, from the destination database, after receiving the delete column operation from the source database in step 205, the synchronization system of the destination database directly executes the delete column operation on the table on the destination database after receiving the delete column operation of the synchronization system of the source database, thereby completing the DDL transaction synchronization. And on the destination database, the process of executing the column deletion operation comprises the following steps: locking the original table of the target end database, creating a table name of a newly created table and establishing a mapping relation with the original table name, keeping the column information of the newly created table consistent with the column information of the corresponding original table after the column deleting operation is finished, and deleting the original table of the target end database after the new created table is established
In this embodiment of the present invention, before completing the DDL transaction synchronization, if a query request is received by a destination database, the method further includes:
the original table of the destination database is locked so as to be in an inaccessible state; and after the DDL things are synchronized, the query request is completed through the new table replacing the original table.
As shown in fig. 3, the source-side database synchronization system continues to capture logs, and includes the case of rolling back logs in addition to the commit log described in step 204, specifically:
in step 206, when the rollback log of the DDL transaction is captured, the synchronization system of the source database constructs an operation of deleting the newly-built table of the destination and sends the operation to the destination for execution, thereby completing the DDL transaction synchronization.
In this embodiment of the present invention, when the collected log content corresponding to the DDL transaction is a user table INSERT operation, determining whether the DDL transaction is a column deletion operation, as shown in fig. 4, specifically includes:
in step 301, for a DDL transaction, the source database generates a log of a system table and then generates a log corresponding to a common table.
In step 302, after the log of the system table is collected, the operation to which the DDL transaction belongs is reversely deduced through the log of the system table; for example, in embodiment 1 of the present invention, the inverse push out of the INSERT operation log by the user table is a delete column operation. Col analysis determines that a column in the table is less defined by one column, and thus a delete operation, for example, as described in example 3.
The condition for collecting the log of the system table is as follows: generating a commit or rollback message for the DDL transaction, and/or generating a log of a common table corresponding to the DDL transaction.
Example 2:
the embodiment of the present invention describes the implementation process of embodiment 1 of the present invention from a relatively complete method step process, and compared with embodiment 1, the process of the embodiment of the present invention is more complete in performance and more instantiated. In the embodiment of the present invention, synchronization systems are deployed in a source database and a destination database, the source database synchronization system reads a log from the source database, and the destination database synchronization system is responsible for applying a synchronization operation sent by a source to the destination database, as shown in fig. 5, including the following steps:
in step 401, the source database synchronization system determines whether the operation in the log belongs to a system table operation, if so, the operation is classified as a DDL transaction, obtains a transaction number of the DDL operation, and collects subsequent logs belonging to the transaction according to the transaction number.
When the commit log of the above-described DDL transaction is collected, the delete column is synchronized in step 402 using conventional means. The conventional manner described herein may be a delete column synchronization mechanism provided by each platform itself, which is not described herein for brevity.
In step 403, when the user table INSERT operation log related to the DDL transaction is collected, it needs to determine whether the DDL transaction is a column deletion operation, and according to the column information of the table to which the column deletion operation belongs, a table creation operation is constructed and sent to the destination for synchronous service execution, so as to ensure that the table name a _ BAK of the table creation is different from the original table name a, but the column information is consistent with the original table.
In step 404, a query insertion operation is constructed and sent to the target synchronization service for execution, and the data in the a table is copied to the a _ BAK table.
In step 405, the source-side database synchronization system continues to capture logs, and subsequent DML logs belonging to the delete column transaction may be discarded directly until a commit or rollback log of the transaction is captured.
When the rollback log of the delete column is captured, in step 406, the operation to construct the delete table a _ BAK is sent to the target for execution, since the target has already performed the create table a _ BAK operation, completing the transaction synchronization.
In step 407, when the commit log of the deleted column is captured, the destination synchronization service is notified to perform a renaming operation on the newly created table, the destination synchronization service deletes the original table a, and then renames the new table a _ BAK to a, thereby completing the transaction synchronization.
When the scheme is operated, the operation of deleting the columns of the table can be basically ensured to be carried out by the source database and the target database at the same time, and the target end is carried out in a mode of reserving the original table to create a new table when deleting the columns, and the query operation of the original table is not influenced by the synchronization of the column deleting operation, so that the delay caused by the synchronization of the column deleting operation is shortened, and the influence of the synchronization of the column deleting operation on the application of the target end is reduced.
Example 3:
in the embodiment of the present invention, when the source database is ORACLE, the method content in the above embodiment 2 is further developed and instantiated as follows:
all object information of the source database is stored in the related system dictionary tables (such as sys. obj $, sys. col $, sys. icol $), and taking ORACLE as an example, the table definition of the table is stored in the sys. obj $ system table, and the column definition is placed in the sys. col system $ table. When ORACLE deletes the column C from the table a, it deletes a row definition of the table in the sys.col $ system table first, and the synchronization service can acquire definition information of the column C to be deleted in the log by capturing the column deletion operation of the sys.col $ system table; after the database constructs the dictionary of the A table, the database constructs the data recorded in the A table, the operation generates a large amount of INSERT logs aiming at the A table in the logs, the synchronization service can know that the log operation of the system table is finished when acquiring the insertion operation of the A table, and the original deletion column operation can be restored according to the collected system table operation.
Examples are as follows:
the existing table a (C INT, COL INT) of the source database, a, has 1 row of data, and executes delete column operation at the source end:
ALTER TABLE A DROP COLUMN COL;
as shown in fig. 6, the synchronization process is as follows:
in step 501, the source data synchronization service collects a system table operation log for the deleted column.
In step 502, when the INSERT operation log of A is found in the DDL transaction of the deleted COLUMN, the log recovery of the parsing system TABLE obtains the operation ALTER TABLE A DROP COLUMN COL of the deleted COLUMN.
In step 503, the source data synchronization constructs a TABLE build operation to the destination synchronization service, and the TABLE structure uses the deleted column, and the TABLE build SQL is CREATE TABLE a _ bak (c int).
In step 504, the operation of query insertion is constructed by the source data synchronization to send to the target synchronization service, so as to copy the data of the a table to a _ BAK, and the query insertion SQL is INSERT intra a _ BAK (C) SELECT C FROM a.
In step 505, if the source delete column operation is rolled back, the SQL constructed by the delete TABLE in the above example is DROP TABLE a _ BAK, and needs to be sent to the target for execution in order to delete the newly created TABLE a _ BAK.
In step 506, if the source delete column operation commits, the renamed end database a _ BAK needs TO be TABLE a, so a delete TABLE operation needs TO be sent first TO delete the old TABLE a, the SQL TO delete the TABLE is DROP TABLE a, the TABLE a _ BAK is renamed TO a, and the SQL TO RENAME the TABLE is ALTER TABLE a _ BAK restore TO a.
In step 507, synchronization is completed.
The embodiment of the invention captures the user table operation log corresponding to the DDL things as the basis for triggering the target end to carry out new table operation, further deletes the original table of the target end database only when capturing the submission log or rollback log of the affair, and replaces the new table name with the deleted original table, thereby basically ensuring that the source and the target database almost carry out column deletion operation at the same time.
Example 4:
fig. 7 is a schematic diagram illustrating an architecture of a database deleted column synchronization apparatus based on log parsing according to an embodiment of the present invention. The database deletion column synchronization device based on log parsing of the present embodiment includes one or more processors 21 and a memory 22. In fig. 7, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or other means, and fig. 7 illustrates the connection by a bus as an example.
The memory 22, which is a non-volatile computer-readable storage medium, may be used to store a non-volatile software program and a non-volatile computer-executable program, such as the database deletion column synchronization method based on log parsing in embodiment 1. The processor 21 executes the log resolution based database delete column synchronization method by executing non-volatile software programs and instructions stored in the memory 22.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and, when executed by the one or more processors 21, perform the log parsing-based database deletion column synchronization method of embodiment 1 described above, for example, perform the steps shown in fig. 1 to 6 described above.
It should be noted that, for the information interaction, execution process and other contents between the modules and units in the apparatus and system, the specific contents may refer to the description in the embodiment of the method of the present invention because the same concept is used as the embodiment of the processing method of the present invention, and are not described herein again.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.