CN109189852B - Data synchronization method and device for data synchronization - Google Patents

Data synchronization method and device for data synchronization Download PDF

Info

Publication number
CN109189852B
CN109189852B CN201810860116.0A CN201810860116A CN109189852B CN 109189852 B CN109189852 B CN 109189852B CN 201810860116 A CN201810860116 A CN 201810860116A CN 109189852 B CN109189852 B CN 109189852B
Authority
CN
China
Prior art keywords
log
definition information
serial number
data
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810860116.0A
Other languages
Chinese (zh)
Other versions
CN109189852A (en
Inventor
付铨
孙峰
陈艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Dream Database Co ltd
Original Assignee
Wuhan Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Dameng Database Co Ltd filed Critical Wuhan Dameng Database Co Ltd
Priority to CN201810860116.0A priority Critical patent/CN109189852B/en
Publication of CN109189852A publication Critical patent/CN109189852A/en
Application granted granted Critical
Publication of CN109189852B publication Critical patent/CN109189852B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of database synchronization, and provides a data synchronization method and a device for data synchronization, wherein the data synchronization method comprises the steps of establishing an off-line dictionary containing object definition information; acquiring an operation log in a source database log file; acquiring corresponding object definition information from an offline dictionary according to a log serial number of the operation log, and analyzing the operation log according to the acquired object definition information; and sending the data obtained by log analysis to a destination database. The data synchronization method can establish complementary analysis with the operation log only containing local key information in the prior art by accessing the off-line dictionary, thereby not only ensuring the correctness of the definition information of the object in the operation log, but also avoiding the condition that the performance of the database is influenced by accessing the database in real time.

Description

Data synchronization method and device for data synchronization
[ technical field ] A method for producing a semiconductor device
The present invention relates to the field of database synchronization technologies, and in particular, to a data synchronization method and a data synchronization device.
[ background of the invention ]
The database synchronization technology is widely applied to various emerging products such as distributed systems, big data, cloud databases and the like at present, and the huge pressure brought to the systems by the high-volume concurrent access of the big data can be effectively relieved through the redundancy scheme of the databases. The data operation of the source database, including INSERT, DELETE, UPDATE, is obtained by extracting the transaction log of the source database, and then sent to the destination database through the network, and the destination database is restored to the original SQL (Structured Query Language, abbreviated SQL) statement and executed on the destination database, which is a commonly used database synchronization technology. The database synchronization technology has the advantages of having small influence on the performance and data mode of a source database, being capable of configuring various topological structures in a very flexible mode, supporting the copying of an operation system and a database platform which are different in cross-structure, and the like.
Currently, the log file of the database uses the object identifier to represent the operation object, and does not record the object definition information corresponding to each object. In the process of analyzing the transaction log by adopting the log analysis technology, accurate object definition information can be acquired by inquiring and accessing the database in real time through the object identification, and then correct log analysis work can be completed.
However, when the object definition information is acquired by connecting the databases, there are problems in that: on one hand, because only the definition information of the current object (the latest definition information at the current time) is recorded in the database system dictionary, the database may not be queried in real time during the log analysis process to obtain the correct definition information of the target operation object recorded by the operation log. For example, before the object definition modification occurs to a DML (Data management Language) operation of a certain object, the definition information of the object corresponding to the operation log in which the DML operation is recorded is different from the definition information of the object recorded in the database at the current time, and therefore, at this time, by accessing the object definition in the database system dictionary, the correct definition of the object in the operation log cannot be obtained. On the other hand, the database is queried in real time in the log analysis process, which has an influence on the database performance. Moreover, when there is an abnormality in the database, it is easy to cause failure in normally acquiring the object definition information.
Therefore, the traditional log analysis technology utilizes a mode of inquiring the database in real time to obtain the object definition information, which cannot ensure the correctness of the object definition, further influences the accuracy of data synchronization, and influences the performance of the database.
In view of the above, overcoming the drawbacks of the prior art is an urgent problem in the art.
[ summary of the invention ]
The technical problem to be solved by the invention is that when data synchronization is performed by adopting a log analysis mode, correct definition information of an operation object corresponding to an operation log cannot be acquired, so that data synchronization is wrong.
The embodiment of the invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for data synchronization, where the method for data synchronization includes: establishing an off-line dictionary containing object definition information; acquiring an operation log in a source database log file; acquiring corresponding object definition information from the offline dictionary according to the log serial number of the operation log, and analyzing the operation log according to the acquired object definition information; and sending the data obtained by log analysis to a destination database.
Preferably, after the establishing the offline dictionary containing the object definition information, the method further includes: acquiring a data synchronization point of a destination database, and determining an initial log serial number according to the data synchronization point; the obtaining the operation log in the source database log file includes: and obtaining an operation log of which the log serial number is not less than the initial log serial number in the source database log file.
Preferably, the establishing an offline dictionary containing object definition information includes: acquiring object definition information in a dictionary of a source-end database system and a current log serial number corresponding to a source-end database at the current moment; integrating the object definition information and the current log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk; analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation.
Preferably, the obtaining object definition information in the source database system dictionary and the current log sequence number corresponding to the source database at the current time includes: querying a source database system dictionary based on a structured query language; acquiring definition information of an object matched with the structured query language and a current log serial number corresponding to a source database at the current moment; the definition information comprises one or more items of mode name, object name and partition position.
Preferably, the data operation is creating an object; analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises: judging and determining whether the object created by the source database is an object needing synchronization; if the object needs to be synchronized, acquiring definition information of the created object and a log serial number corresponding to the operation of the created object; and integrating the definition information of the created object and the corresponding log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk.
Preferably, the data operation is a modification object; analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises: acquiring definition information of the modified object and a log serial number corresponding to the operation of the modified object; and integrating the definition information of the modified object and the corresponding log serial number into a next version offline dictionary file.
Preferably, the data operation is a delete object; analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises: acquiring a log serial number corresponding to the operation of the deleted object; judging and determining whether the log serial number corresponding to the operation of the deleted object is smaller than the initial log serial number; and if the number of the deleted objects is smaller than the preset number, deleting the offline dictionary file corresponding to the deleted objects from the local disk.
Preferably, the acquiring corresponding object definition information from the offline dictionary according to the log serial number of the operation log, and analyzing the operation log according to the acquired object definition information includes: analyzing the operation log to obtain a target operation object; acquiring log serial numbers corresponding to all versions of offline dictionary files containing definition information of the target operation object; comparing the log serial number of the operation log with the log serial numbers corresponding to all versions of the off-line dictionary files; and determining the corresponding version of the offline dictionary file based on the principle more than the latest principle so as to acquire the definition information of the target operation object.
Preferably, the object is one or more of a table, an index and a view.
In a second aspect, the present invention provides an apparatus for data synchronization, comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform the method of data synchronization of the first aspect.
In a third aspect, the present invention also provides a non-transitory computer storage medium storing computer-executable instructions for execution by one or more processors for performing the method of data synchronization of the first aspect.
Compared with the prior art, the embodiment of the invention has the beneficial effects that: when data synchronization is carried out by adopting a log analysis mode, an offline dictionary containing object definition information is firstly established, the offline dictionary can be accessed during log analysis, and complementary analysis with an operation log only containing local key information in the prior art is established, so that the accuracy of the definition information of the object in the operation log is ensured, and the condition that the performance of the database is influenced by accessing the database in real time is avoided.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for data synchronization according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart showing the detail of step 10 in FIG. 1;
FIG. 3 is a schematic flow chart diagram illustrating one embodiment of step 103 in FIG. 2;
FIG. 4 is a schematic flow chart of another embodiment of step 103 in FIG. 2;
FIG. 5 is a schematic flow chart of a further embodiment of step 103 in FIG. 2;
fig. 6 is a schematic structural diagram of an apparatus for data synchronization according to an embodiment of the present invention.
[ detailed description ] embodiments
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1:
in an actual application scenario, when data synchronization is performed by using log analysis, correct object definition information must be used, and analysis or synchronization errors may be caused when the object definition information is too new or too old. In this embodiment, the object definition information in the dictionary of the source database system is stored locally, and during the log analysis process, the local offline dictionary is accessed without accessing the database in real time, so that not only can the influence on the performance of the database be reduced, but also the correct object definition information can be obtained, and further the accuracy of data synchronization or log analysis is ensured. The following describes a method for data synchronization using an offline dictionary.
Step 10: an offline dictionary containing object definition information is established.
In this embodiment, the source database first establishes an offline dictionary containing object definition information according to actual requirements.
The source database comprises any one of an SQLSERVER database, a MYSQL database, a DB2 database, a DM6 database, a DM7 database, a POSTGRESQL database or an ORACLE database, the type of the database is not specifically limited, and the type of the database can be selected according to actual conditions.
Wherein the object comprises one or more of a table, an index, and a view.
Specifically, a table is composed of rows and columns, the columns are composed of the same kind of information, each column is also called a field, the header of each column is called a field name, and the rows include a plurality of columns of information items. A line of data is called a record or record and it expresses a meaningful combination of information. A table may consist of one or more records, and a table with no records is referred to as an empty table.
Indexes are built according to the order of the designated database table columns, provide a way to quickly access data, and can supervise the data of the tables so that the data in the columns to which the indexes point are not duplicated, such as clustered indexes.
A view looks just like a table, with a set of named fields and data items, but a view is actually a virtual table and does not actually exist in a database. Views are generated by querying database tables, which limits the data that a user can see and modify. The view may be used to control user access to the data and to simplify the display of the data, i.e. to display only those data information that is required through the view.
Referring to fig. 2, a method for creating an offline dictionary is described in detail.
Step 101: and acquiring object definition information in a dictionary of the source database system and a current log serial number corresponding to the source database at the current moment.
In this embodiment, definition information of an object matching a Structured Query Language (SQL) is obtained based on a SQL source-side database system dictionary. In an actual application scenario, SQL statements may be designed according to actual requirements to obtain definition information of corresponding objects. The definition information of the object comprises one or more items of a mode name, an object name and a partition position. In an alternative embodiment, the definition information of the object includes a mode name, an object name, and a partition position. For example, when the object is a table, the definition information of the object includes a schema name, a table name, a column name, and a partition position of the table.
In an actual application scenario, when the types of the database systems are different, the manner of acquiring the object definition information is also different. For convenience of explanation, the manner of obtaining the object definition information is explained by taking the database system as an ORACLE database and the object as a table as an example.
First, the initialization information of the table is obtained by using a select statement, a form statement and a where statement. For example, in an actual application scenario, the initialization information of the table is obtained as follows: a selection sc.obj #, sc.interactive #, sc.name, sc.type #, sc.charset form (a selection dataobj # from sys. obj # and sl. interactive #) local, sc.length, sc.precision #, sc.scale, sc.null, sc.default #, sc.pro, sc.property, (a selection toid. system, sc.internal, sc.1, sc.60, sc.surface, sc.pro, sc.60, sc.100, sc.internal, sc.100, sc.60, sc.1, sc.internal, sc.60, sc.100, 13, 2, 60, 183, 60, 2, 60, 2, 60, 2, 60, 2, 60, 2, 60.
Then, the column information of the table is obtained by adopting a select statement, a form statement and a where statement. For example, in an actual application scenario, the following method is adopted to obtain the column information of the table: a selection sc.obj #, sc.intcol #, sc.name, sc.type #, sc.charstform, (a selection dataobj # from sys.obj # and sl.intcol #) combined, sc.length, sc.precision #, sc.scale, sc.null, sc.default, sc.property, (a selection toid. sys.obj # and sl.intcol #) combined, sc.length, sc.integer # and sc.intcol # combined, sc.1, sc.1000, sc.kernel #, sc.32, sc.100, sc.60, sc.100, 13, etc. (sc.100, 13, 60, 183, 13, 60, etc.), and 60, 183, 60, 2, 60, 2, 60, 2, 60, 2, 60.
And finally, obtaining the partition information of the table by adopting a select statement, a form statement and a where statement. For example, in an actual application scenario, the following manner is adopted to obtain the partition information of the table: a selection tp.obj #, tp.databobj #, tp.bo #, tab.subnet from system.tabpart #, a.bbo # tp, a.bbo from system.tab from system.obj #, c.subnet from system.tabsub.p.a.tab.b.sub.p.a.tab.b.sub.b, a.ob.b.p.b.p.b.p.b.p.b.p.b.p.b.p.b.p.b.p.b.p.b.p.b.p.b.b.p.b.o.b.o.b.p.b.and a.obj # c.o.b.p.u.n.p.o.p.obj #, tcp.ob.p.p.p.p.b.p.p.p.b.p.p.b.p.p.b.p.p.b.p.p.b.p.p.b.p.p.b.p.p.b.p.p.b.p.p.p.b.p.p.b.p.p.b.p.p.b.p.p.p.b.p.p.b.p
// obtaining partition information
select obj#,parttype,mod(SPARE2,256)from sys.partobj$
// partition table Large object
select a.dataobj#,b.tabfragobj#,a.name from sys.obj$a left join sys.lobfrag$b on(a.obj#=b.fragobj#)where(a.type#=40or a.type#=41)。
The above-mentioned method for acquiring the definition information of the specification table is explained only according to a specific example, and in an actual application scenario, a corresponding SQL statement can be written according to the type of the database, so as to acquire the definition information of the table.
In this embodiment, the offline dictionary includes at least one offline dictionary file, and each offline dictionary file corresponds to definition information of one object, that is, the offline dictionary may include definition information of at least one object. Since the definition information of an object may be modified in an actual application scenario, in order to distinguish the definition information of an object at different times, in the present embodiment, each offline dictionary file has at least one version of the offline dictionary file. In order to distinguish the offline dictionary files of different versions of the same object, in this embodiment, the log sequence number is configured as a version number of the offline dictionary file.
Specifically, when the object definition information is acquired, a current log serial number corresponding to the source-end database at the current time is acquired, where the current log serial number may be a maximum log serial number in a log file of the source-end database at the current time. The log sequence number is a numerical value automatically maintained by a database system, and has the characteristics of automatic increment and global uniqueness.
In an actual application scenario, because a large number of objects exist in a database system, an operation log is correspondingly generated when each object is operated, and meanwhile, the system automatically generates a log serial number to distinguish different operations. The log sequence number is used to represent one physical transaction generated inside the database system, and has global uniqueness in the log file of the database system to distinguish different physical transactions. Currently, most database management systems use LSN (Log sequence number, LSN) to represent a physical transaction generated inside the database system, for example, SQLSERVER, MYSQL, DB2, DM6, DM7, and other database management systems. However, in the ORACLE database, SCN (System Change Number, SCN for short) is used as the identifier for generating a physical transaction, and LSN in ORACLE is only used as a serial Number for log switching, so SCN can be used as the log serial Number in the ORACLE database.
In addition, when the types of the database systems are different, the mode of obtaining the log serial number is correspondingly different. For example, when the database system is an ORACLE database, the SCN is used as a log serial number, and can be queried through an SQL statement, specifically, a select dbms _ flash. When the database system is a DM7 database, the log sequence number LSN is used as the log sequence number, and the query can be performed through SQL statements, specifically, a select cur _ LSN from sys.v. $ rlog command can be used to obtain the LSN value (log sequence number). When the database system is a DM6 database, the log sequence number LSN is used as a log sequence number, and the query may be performed through an SQL statement, specifically, a SELECT GET _ LAST _ LSN ('database name') command may be used to obtain an LSN value (log sequence number). When the database system is a MYSQL database, the log sequence number LSN is used as a log sequence number, and the query can be performed through SQL statements, and specifically, a show binary logs command can be used to obtain an LSN value (log sequence number). When the database system is a POSTGRESQL database, the log sequence number LSN is used as the log sequence number, and query may be performed through an SQL statement, and specifically, a SELECT cast (pg _ current _ xlog _ insert _ location () -cast ('0/0' as pg _ LSN) as binary) command may be used to obtain the LSN value (log sequence number). When the database system is an SQLSERVER database, the log sequence number LSN is used as the log sequence number, and the query can be performed through an SQL statement, specifically, a select sys. When the database system is a DB2 database, the log serial number LSN is used as a log serial number, and can be acquired through an Api interface, and specifically, a DB2ReadLog function can be used to acquire an LSN value (log serial number).
The foregoing is only a list of simple description of the manner of obtaining the log serial number, and specifically, the log serial number may be obtained by using an SQL statement or an Api interface according to an actual application scenario, which is not described herein again.
Step 102: integrating the object definition information and the current log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk.
In this embodiment, after the definition information of the object and the current log serial number are obtained, the definition information corresponding to the object is integrated into a first version offline dictionary file, and the offline dictionary file is stored in a local disk. The definition information of the object in the first version offline dictionary file can be used as the reference definition information of the object. When the object definition information is modified, the modified object definition information may be added to the first version offline dictionary file, and the log sequence number corresponding to the object modification operation is configured as the version number of the version to distinguish.
Specifically, the definition information of each object may be integrated into an offline dictionary file in txt text format. Since the number of the objects is multiple, and correspondingly, the number of the offline dictionary files in the offline dictionary is also multiple, in order to distinguish different offline dictionary files, in this embodiment, a unique file identification number is further allocated to the offline dictionary file corresponding to each object, and after the unique file identification number is allocated to the offline dictionary file corresponding to each object, the offline dictionary file is stored in the local disk. The file identification number is automatically allocated by the database system and has global uniqueness.
After the definition information corresponding to the object is integrated into an offline dictionary file, configuring the log serial number as the version number of the offline dictionary file, and finally storing the offline dictionary file with the configured version number in a local disk.
Step 103: and analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation.
In this embodiment, when the source database performs data operation, the definition information of the object may be changed. In order to ensure the accuracy of the definition information of the object, the embodiment further needs to analyze the data operation of the database system, and update the offline dictionary file of the object corresponding to the data operation.
In particular, the data operation is a DDL operation, wherein the DDL operation includes creating an object, modifying an object, and deleting an object.
In an alternative embodiment, the data operation is a create object. Referring to fig. 3, step 103: analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises the following steps:
step 1131: judging and determining whether the object created by the source database is an object needing synchronization.
In this embodiment, the object created by the source database belongs to a newly added object, and the local existing offline dictionary does not include a corresponding offline dictionary file. First, it is determined whether an object created by the source database is an object that needs to be synchronized, and specifically, it may be determined whether the created object is an object that needs to be synchronized according to the SQL statement that acquires the reference object definition information. If the object needs to be synchronized, step 1132 is executed, and if the object does not need to be synchronized, step 1134 is executed.
Step 1132: and if the object needs to be synchronized, acquiring definition information of the created object and a log serial number corresponding to the operation of the created object.
In the database system, each transaction is generated, and a corresponding log serial number is correspondingly distributed so as to distinguish different operation logs. In this embodiment, first, definition information of a created object and a log serial number corresponding to an operation of the created object are acquired.
Step 1133: and integrating the definition information of the created object and the corresponding log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk.
And integrating the definition information of the created object and the corresponding log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk. Please refer to step 102 for a specific method for creating an offline dictionary file, which is not described herein again.
Step 1134: and if the synchronous object is not needed, the offline dictionary is not updated.
And if the synchronous object is not needed, not establishing the off-line dictionary file corresponding to the created object.
The embodiment mainly illustrates how to update the offline dictionary when creating an object, so as to ensure the accuracy and comprehensiveness of the object definition information stored in the offline dictionary.
In an alternative embodiment, the data operation is a modified object. Referring to fig. 4, step 103: analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises the following steps:
step 1231: and acquiring definition information of the modified object and a log serial number corresponding to the operation of the modified object.
In this embodiment, when an object is modified, definition information of the object is modified. For example, the object is a table, and in an actual application scenario, after a table name or a column name is modified, definition information of the object is changed accordingly. At this time, the definition information of the modified object and the log serial number corresponding to the operation of the modified object in the log file of the database system are obtained.
Step 1232: and integrating the definition information of the modified object and the corresponding log serial number into a next version offline dictionary file.
In this embodiment, when the offline dictionary file corresponding to the modified object is updated, the offline dictionary file of the previous version is not covered, but the updated offline dictionary file of the updated version is added to the end of the offline dictionary file corresponding to the object. In order to distinguish the offline dictionary files with different versions of the same object, the log sequence number is configured to be the version number of the updated offline dictionary file. In the log analysis, corresponding definition information can be obtained according to the version number of the offline dictionary file. The following will explain in detail how to obtain the corresponding definition information according to the version number of the offline dictionary file.
In an alternative embodiment, the data operation is a delete object. Referring to fig. 5, step 103: analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises the following steps:
step 1331: acquiring a log serial number corresponding to the operation of the deleted object;
in this embodiment, after an object is deleted, an offline dictionary file corresponding to the deleted object and a log serial number corresponding to the operation of the deleted object in a log file of a database system are obtained. In an optional embodiment, the special character is used to mark that the offline dictionary file has been deleted, so that an object to be deleted is determined according to the special character in a subsequent operation, and the offline dictionary file corresponding to the deleted object is removed from the local disk, so as to save disk space.
Step 1332: and judging and determining whether the log sequence number corresponding to the operation of the deleted object is smaller than the initial log sequence number.
Specifically, when the source database receives a data synchronization point of the destination database, the source database determines an initial log serial number according to the data synchronization point, and determines data to be synchronized according to the log serial number. Specifically, the starting log sequence number may be a maximum corresponding log sequence number in a log file of the source-end database, and the maximum log sequence number is set as the starting log sequence number, where the first synchronization request that the source-end database receives the destination-end database is received by the source-end database.
In this embodiment, the source database determines whether the log sequence number corresponding to the delete object operation is smaller than the start log sequence number, and if so, it indicates that the object has been deleted before data synchronization, that is, the deleted object does not need to be synchronized to the destination database, and then step 1333 is executed. If the data needed to be synchronized may exist from the synchronization point to the time point when the object is deleted, the offline dictionary file is reserved, and the offline dictionary file corresponding to the deleted object is deleted from the local disk after the data synchronization is completed.
Step 1333: and if the number of the deleted objects is smaller than the preset number, deleting the offline dictionary file corresponding to the deleted object from the local disk.
The foregoing describes in detail a process of updating an offline dictionary according to different data operations (DDL operations) to ensure the accuracy and comprehensiveness of offline dictionary files in the offline dictionary, and further provide accurate and comprehensive definition information of objects for log analysis or data synchronization.
Step 11: and acquiring an operation log in a log file of a source database.
In a specific application scenario, first, a data synchronization point of a destination database is obtained, and a start log sequence number is determined according to the data synchronization point, where the start log sequence number may be a maximum log sequence number corresponding to a log file of the destination database when the source database receives a first synchronization request of the destination database, and the maximum log sequence number is set as a start log sequence number.
Then, obtaining an operation log of which the log serial number is not less than the initial log serial number in the source database log file, and performing log analysis on the operation log meeting the conditions.
Step 12: and acquiring corresponding object definition information from the offline dictionary according to the log serial number of the operation log, and analyzing the operation log according to the acquired object definition information.
In this embodiment, since there are a large number of operation logs in the log file, and the operation object corresponding to each operation log may be different, the source database parses the operation logs to obtain the target operation object. In an alternative embodiment, the operation object name may be obtained, and then the corresponding offline dictionary file may be obtained from the offline dictionary according to the operation object name, so as to obtain the correct object definition information.
Specifically, as can be seen from the foregoing analysis, each object may correspond to multiple versions of the offline dictionary file, in order to determine the object definition corresponding to the operated object at the operation time point. In this embodiment, log sequence numbers corresponding to all versions of offline dictionary files containing definition information of the target operation object are obtained, the log sequence numbers of the operation logs are compared with the log sequence numbers corresponding to all versions of offline dictionary files, and the offline dictionary files of corresponding versions are determined based on a rule greater than the latest rule, so as to obtain the definition information of the target operation object. The reason that the log sequence number corresponding to the operation log is possibly greater than the log sequence numbers corresponding to the offline dictionary files of the multiple versions is based on the principle that the log sequence number is greater than the most recent principle, and the offline dictionary file which is greater than the log sequence number corresponding to one of the multiple versions and has the smallest difference value is selected as the target version offline dictionary file. Then, correct object definition information is acquired from the target version offline dictionary file.
To more clearly illustrate the aforementioned process of acquiring the correct definition information, a specific application scenario is first illustrated.
Now, it is assumed that the offline dictionary file corresponding to the target operation object includes two versions of offline dictionary files, and the log sequence number corresponding to the first version of offline dictionary file is l0After the database synchronization service is started, a defined modification operation is performed on the object, and then an offline dictionary file corresponding to definition information after the modification operation is performed is added after the first version offline dictionary file, that is, a second version offline dictionary file, where a log sequence number corresponding to the second version offline dictionary file is l1
In the database synchronization process, the log serial number is used for identifying the data synchronization sequence, and when the source end database performs data synchronization, the initial log serial number is l0I.e. from0I.e. synchronizing the source database data from the beginning. Three operation logs are assumed in a source database log file, and the log sequence number corresponding to the first operation log is ll-1The log sequence number corresponding to the second operation log is ll1And the log sequence number corresponding to the third operation log is ll2And ll is-1<l0<ll1<l1<ll2
When the data of the source end database is synchronized, comparing the log serial number corresponding to each operation log with the initial log serial number, because ll-1Identified oplogs are synchronized in the database (corresponding to l)0) Before, ll was not analyzed-1Carrying out corresponding operation; when the log analysis reaches ll1When due to l0<ll1<l1That is, the second operation log occurs at the time point corresponding to the first version of the offline dictionary file and the time point corresponding to the second version of the offline dictionary file, and l in the offline dictionary file is selected based on the rule greater than the latest rule0Analyzing the definition corresponding to the corresponding operation object; when the log analysis reaches ll2When due to l1<ll2That is, the third operation log occurs after the time point corresponding to the second version of the offline dictionary file, and is selected from the offline dictionary file based on the rule greater than the latest rule1And analyzing the definition corresponding to the object. Therefore, the offline dictionary file with the correct version can be selected in the offline dictionary according to the principle that the log serial number of the current operation log is larger than the latest principle so as to obtain the correct definition information.
Step 13: and sending the data obtained by log analysis to a destination database.
In this embodiment, the data obtained by performing log analysis is sent to the destination database, but it is needless to say that the operation logs and the object definition information corresponding to each operation log may also be sent to the destination database, and the destination database restores the operation logs into SQL statements for data synchronization. The design may be specifically designed according to actual conditions, and is not specifically limited herein.
In an actual application scenario, in order to reduce the data volume of the source end database synchronized to the destination end database, the synchronization efficiency is improved. In an optional embodiment, before data is synchronized, it is determined whether an object that needs to be synchronized has been deleted before a synchronization point, and if the object has been deleted, log analysis on a plurality of DML operation logs corresponding to the object is not performed during data synchronization, and meanwhile, data corresponding to the deleted object does not need to be synchronized to a destination database. This reduces the amount of data and improves the efficiency of the synchronization.
Specifically, if the log sequence number corresponding to the last version of the offline dictionary file of the object is less than the start log sequence number determined according to the synchronization point, it is determined that the object to be synchronized has been deleted before the synchronization point, and the data corresponding to the object does not need to be synchronized to the destination-side database.
Different from the prior art, when the data synchronization is carried out in a log analysis mode, the method firstly establishes the offline dictionary containing the object definition information, can access the offline dictionary during log analysis, and establishes the complementary analysis with the operation log only containing the local key information in the prior art, thereby not only ensuring the correctness of the object definition information in the operation log, but also avoiding the condition that the performance of the database is influenced by accessing the database in real time.
Example 2
Referring to fig. 6, fig. 6 is a schematic structural diagram of an apparatus for data synchronization according to an embodiment of the present invention. The means for data synchronization of the present embodiment comprises one or more processors 61 and a memory 62. In fig. 6, one processor 61 is taken as an example.
The processor 61 and the memory 62 may be connected by a bus or other means, such as the bus connection in fig. 6.
The memory 62, which is a non-volatile computer-readable storage medium based on data synchronization, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the method for data synchronization in embodiment 1and corresponding program instructions. The processor 61 executes various functional applications of the method of performing data synchronization and data processing, that is, functions of the method of data synchronization of embodiment 1, by executing nonvolatile software programs, instructions, and modules stored in the memory 62.
The memory 62 may include, among other things, high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 62 may optionally include memory located remotely from the processor 61, and these remote memories may be connected to the processor 61 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Please refer to fig. 1 to 5 and the related text description for a data synchronization method, which will not be described again.
It should be noted that, for the information interaction, execution process and other contents between the modules and units in the apparatus and system, the specific contents may refer to the description in the embodiment of the method of the present invention because the same concept is used as the embodiment of the processing method of the present invention, and are not described herein again.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for data synchronization, the method for data synchronization comprising:
establishing an offline dictionary containing object definition information, comprising: acquiring object definition information in a source database system dictionary and a current log serial number corresponding to the source database at the current moment, integrating the object definition information and the current log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk;
acquiring an operation log in a source database log file;
acquiring corresponding object definition information from the offline dictionary according to the log serial number of the operation log, and analyzing the operation log according to the acquired object definition information;
and sending the data obtained by log analysis to a destination database.
2. The method for data synchronization according to claim 1, wherein the establishing of the offline dictionary containing the object definition information further comprises:
acquiring a data synchronization point of a destination database, and determining an initial log serial number according to the data synchronization point;
the obtaining the operation log in the source database log file includes:
and obtaining an operation log of which the log serial number is not less than the initial log serial number in the source database log file.
3. The method of claim 2, wherein the creating an offline dictionary containing object definition information comprises:
analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation.
4. The method of claim 1, wherein the obtaining object definition information in a dictionary of a source database system and a current log sequence number corresponding to a source database at a current time comprises:
querying a source database system dictionary based on a structured query language;
acquiring definition information of an object matched with the structured query language and a current log serial number corresponding to a source database at the current moment; the definition information comprises one or more items of mode name, object name and partition position.
5. The method of data synchronization of claim 3, wherein the data operation is creating an object;
analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises:
judging and determining whether the object created by the source database is an object needing synchronization;
if the object needs to be synchronized, acquiring definition information of the created object and a log serial number corresponding to the operation of the created object;
and integrating the definition information of the created object and the corresponding log serial number into a first version offline dictionary file, and storing the first version offline dictionary file in a local disk.
6. The method of data synchronization of claim 3, wherein the data operation is a modification object;
analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises:
acquiring definition information of the modified object and a log serial number corresponding to the operation of the modified object;
and integrating the definition information of the modified object and the corresponding log serial number into a next version offline dictionary file.
7. The method of data synchronization of claim 3, wherein the data operation is a delete object;
analyzing the data operation of the source database, and updating the offline dictionary file of the object corresponding to the data operation comprises:
acquiring a log serial number corresponding to the operation of the deleted object;
judging and determining whether the log serial number corresponding to the operation of the deleted object is smaller than the initial log serial number;
and if the number of the deleted objects is smaller than the preset number, deleting the offline dictionary file corresponding to the deleted object from the local disk.
8. The data synchronization method according to any one of claims 3 to 5, wherein the acquiring corresponding object definition information from the offline dictionary according to the log serial number of the operation log, and analyzing the operation log according to the acquired object definition information includes:
analyzing the operation log to obtain a target operation object;
acquiring log serial numbers corresponding to all versions of offline dictionary files containing definition information of the target operation object;
comparing the log serial number of the operation log with the log serial numbers corresponding to all versions of the off-line dictionary files;
and determining the corresponding version of the offline dictionary file based on the principle more than the latest principle so as to acquire the definition information of the target operation object.
9. The method of data synchronization of claim 1, wherein the object is one or more of a table, an index, and a view.
10. An apparatus for data synchronization, comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform a method of data synchronization according to any of claims 1 to 9.
CN201810860116.0A 2018-08-01 2018-08-01 Data synchronization method and device for data synchronization Active CN109189852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810860116.0A CN109189852B (en) 2018-08-01 2018-08-01 Data synchronization method and device for data synchronization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810860116.0A CN109189852B (en) 2018-08-01 2018-08-01 Data synchronization method and device for data synchronization

Publications (2)

Publication Number Publication Date
CN109189852A CN109189852A (en) 2019-01-11
CN109189852B true CN109189852B (en) 2021-05-28

Family

ID=64937608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810860116.0A Active CN109189852B (en) 2018-08-01 2018-08-01 Data synchronization method and device for data synchronization

Country Status (1)

Country Link
CN (1) CN109189852B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119406B (en) * 2019-04-09 2021-06-01 口碑(上海)信息技术有限公司 Method and device for checking real-time task records
CN110347746B (en) * 2019-06-13 2020-06-26 武汉达梦数据库有限公司 Heterogeneous database synchronous data consistency checking method and device
CN110727548B (en) * 2019-09-29 2022-03-04 上海英方软件股份有限公司 Continuous data protection method and device based on database DML synchronization
CN111159020B (en) * 2019-12-20 2021-11-05 武汉达梦数据库股份有限公司 Method and device applied to synchronous software test
CN111241094B (en) * 2019-12-31 2021-06-08 武汉达梦数据库有限公司 Database deleted column synchronization method and device based on log analysis
CN111221909B (en) * 2019-12-31 2021-05-28 武汉达梦数据库股份有限公司 Database modification column synchronization method and device based on log analysis
CN111221907B (en) * 2019-12-31 2021-03-30 武汉达梦数据库股份有限公司 Database added column synchronization method and device based on log analysis
CN111930828B (en) * 2020-05-29 2024-01-19 武汉达梦数据库股份有限公司 Data synchronization method and data synchronization system based on log analysis
CN112307118B (en) * 2020-09-30 2024-03-22 武汉达梦数据库股份有限公司 Method for guaranteeing data consistency based on log analysis synchronization and synchronization system
CN113761052A (en) * 2020-11-27 2021-12-07 北京沃东天骏信息技术有限公司 Database synchronization method and device
CN112612859A (en) * 2020-12-31 2021-04-06 上海英方软件股份有限公司 DDL analysis method and device based on log analysis
CN116881371B (en) * 2023-09-07 2023-11-14 北京逐风科技有限公司 Data synchronization method, device, equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930443A (en) * 2009-06-22 2010-12-29 金蝶软件(中国)有限公司 Data service system and data service realization method
EP2302538A1 (en) * 2009-09-16 2011-03-30 Business Objects Software Ltd. Method and system for capturing change of data
CN103617176A (en) * 2013-11-04 2014-03-05 广东电子工业研究院有限公司 Method for achieving automatic synchronization of multi-source heterogeneous data resources
CN104881418A (en) * 2014-02-28 2015-09-02 阿里巴巴集团控股有限公司 Method and device for quickly reclaiming rollback space in MySQL
CN105005618A (en) * 2015-07-21 2015-10-28 杭州合众数据技术有限公司 Data synchronization method and system among heterogeneous databases
CN105243067A (en) * 2014-07-07 2016-01-13 北京明略软件系统有限公司 Method and apparatus for realizing real-time increment synchronization of data
CN105320680A (en) * 2014-07-15 2016-02-10 中国移动通信集团公司 Data synchronization method and device
CN107943979A (en) * 2017-11-29 2018-04-20 山东鲁能软件技术有限公司 The quasi real time synchronous method and device of data between a kind of database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10762107B2 (en) * 2016-11-29 2020-09-01 Sap Se Synchronization mechanism for serialized data log replay in database systems

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930443A (en) * 2009-06-22 2010-12-29 金蝶软件(中国)有限公司 Data service system and data service realization method
EP2302538A1 (en) * 2009-09-16 2011-03-30 Business Objects Software Ltd. Method and system for capturing change of data
CN103617176A (en) * 2013-11-04 2014-03-05 广东电子工业研究院有限公司 Method for achieving automatic synchronization of multi-source heterogeneous data resources
CN104881418A (en) * 2014-02-28 2015-09-02 阿里巴巴集团控股有限公司 Method and device for quickly reclaiming rollback space in MySQL
CN105243067A (en) * 2014-07-07 2016-01-13 北京明略软件系统有限公司 Method and apparatus for realizing real-time increment synchronization of data
CN105320680A (en) * 2014-07-15 2016-02-10 中国移动通信集团公司 Data synchronization method and device
CN105005618A (en) * 2015-07-21 2015-10-28 杭州合众数据技术有限公司 Data synchronization method and system among heterogeneous databases
CN107943979A (en) * 2017-11-29 2018-04-20 山东鲁能软件技术有限公司 The quasi real time synchronous method and device of data between a kind of database

Also Published As

Publication number Publication date
CN109189852A (en) 2019-01-11

Similar Documents

Publication Publication Date Title
CN109189852B (en) Data synchronization method and device for data synchronization
CN109241175B (en) Data synchronization method and device, storage medium and electronic equipment
CN106933703B (en) Database data backup method and device and electronic equipment
US10031935B1 (en) Customer-requested partitioning of journal-based storage systems
EP2891994A1 (en) Method for achieving automatic synchronization of multisource heterogeneous data resources
KR20180021679A (en) Backup and restore from a distributed database using consistent database snapshots
CN112286941B (en) Big data synchronization method and device based on Binlog + HBase + Hive
CN103617176A (en) Method for achieving automatic synchronization of multi-source heterogeneous data resources
US8832022B2 (en) Transaction processing device, transaction processing method and transaction processing program
CN113204571B (en) SQL execution method and device related to write-in operation and storage medium
CN111143470A (en) Cross-platform database data synchronization comparison method and device
US10489356B1 (en) Truncate and append database operation
CN111930850A (en) Data verification method and device, computer equipment and storage medium
CN113868028A (en) Method for replaying log on data node, data node and system
CN114741453A (en) Method, system and computer readable storage medium for data synchronization
KR101424568B1 (en) Client and database server for resumable transaction and method thereof
KR101989074B1 (en) Migration based on replication log in database sharding environment
CN111914028A (en) Method and device for synchronizing data relation of heterogeneous data sources based on graph increment
CN114579559A (en) Data synchronization method and device, computer equipment and storage medium
CN112395284A (en) Data processing method and device, database system, electronic equipment and storage medium
CN116049306A (en) Data synchronization method, device, electronic equipment and readable storage medium
CN114691307A (en) Transaction processing method and computer system
US9916154B2 (en) File-based transport of table content
CN113778975A (en) Data processing method and device based on distributed database
CN112000671A (en) Block chain-based database table processing method, device and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Fu Quan

Inventor after: Sun Feng

Inventor after: Chen Yan

Inventor before: Fu Quan

Inventor before: Sun Feng

Inventor before: Chen Yan

Inventor before: Yang Chun

CB03 Change of inventor or designer information
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Patentee after: Wuhan dream database Co.,Ltd.

Address before: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Patentee before: WUHAN DAMENG DATABASE Co.,Ltd.

CP01 Change in the name or title of a patent holder
TR01 Transfer of patent right

Effective date of registration: 20220908

Address after: 430073 16-19 / F, building C3, future science and technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee after: Wuhan dream database Co.,Ltd.

Patentee after: HUAZHONG University OF SCIENCE AND TECHNOLOGY

Address before: 430000 16-19 / F, building C3, future technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan, Hubei Province

Patentee before: Wuhan dream database Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230728

Address after: 16-19/F, Building C3, Future Science and Technology Building, No. 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan City, Hubei Province, 430206

Patentee after: Wuhan dream database Co.,Ltd.

Address before: 430073 16-19 / F, building C3, future science and technology building, 999 Gaoxin Avenue, Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee before: Wuhan dream database Co.,Ltd.

Patentee before: HUAZHONG University OF SCIENCE AND TECHNOLOGY

TR01 Transfer of patent right