WO2017080431A1 - Database replication method and apparatus based on log parsing - Google Patents

Database replication method and apparatus based on log parsing

Info

Publication number
WO2017080431A1
WO2017080431A1 PCT/CN2016/105007 CN2016105007W WO2017080431A1 WO 2017080431 A1 WO2017080431 A1 WO 2017080431A1 CN 2016105007 W CN2016105007 W CN 2016105007W WO 2017080431 A1 WO2017080431 A1 WO 2017080431A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
data
merged
statement
row
Prior art date
Application number
PCT/CN2016/105007
Other languages
English (en)
French (fr)
Inventor
祖立军
李戈
刘国宝
Original Assignee
中国银联股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国银联股份有限公司
Publication of WO2017080431A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • the present application relates to the field of database replication technologies, and in particular, to a database replication method and apparatus based on log parsing.
  • in the prior art, log-parsing-based database replication works as follows: the primary end parses the log generated by the source database engine into database playback statements and sends the large number of parsed playback statements toward the target database over the remote (off-site) network; the standby end then performs database playback according to the received playback statements and stores the replayed data in the target database, thereby copying the data of the source database to the target database.
  • the problem with the above method is that when the number of database playback statements parsed by the primary end is very large, the transmission of the off-site network will be relatively slow, resulting in a very long time for the entire database to be replicated off-site, reducing the efficiency of remote database replication.
  • the present invention provides a database replication method based on log parsing, which solves the technical problem that the replication efficiency of the remote database is low when the number of database playback statements obtained by the main end parsing is very large.
  • a database replication method based on log parsing includes:
  • parsing the database log into row objects according to the database log in the source database;
  • merging, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement;
  • sending the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • optionally, parsing the database log into row objects according to the database log in the source database includes: parsing the database log in the source database into at least one row object table, where a row object table contains at least one row object and a row object corresponds to one database operation statement of one row;
  • merging the statements corresponding to row objects having the same primary key identifier into one statement, and generating the merged row object according to the primary key identifier and the merged statement, includes:
  • for each row object table, assigning a thread to the row object table, and calling the thread to merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement and to generate a merged row object according to the primary key identifier and the merged statement.
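  • the preset rule comprises one or any combination of the following:
  • multiple data insertion statements are merged into one data insertion statement;
  • multiple data deletion statements are merged into one data deletion statement;
  • multiple data update statements are merged into one data update statement;
  • two statements that first insert data and then delete data are merged into one data deletion statement;
  • two statements that first insert data and then update data are merged into one data update statement;
  • two statements that first delete data and then insert data are merged into one data insertion statement;
  • two statements that first delete data and then update data are merged into one data update statement;
  • two statements that first update data and then delete data are merged into one data deletion statement.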
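  • before sending the merged row object to the target database, the method further includes: performing data encryption and compression on the merged row object.
  • the method further includes: parsing the Binary Log of a source MySQL database into row objects; merging row objects having the same primary key identifier according to a preset rule; and sending the merged row objects to a target MySQL database, so that the target MySQL database performs MySQL database playback according to the merged row objects.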
  • a database replication device based on log parsing includes:
  • a parsing unit configured to parse the database log into a row object according to a database log in the source database
  • a merging unit configured to merge the statements corresponding to the row objects having the same primary key identifier into one statement according to a preset rule, and generate the merged row objects according to the primary key identifier and the merged statement;
  • a sending unit configured to send the merged row object to the target database, so that the target database performs database playback according to the merged row object.
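  • the parsing unit is specifically configured to: parse the database log in the source database into at least one row object table, where a row object table contains at least one row object and a row object corresponds to one database operation statement of one row;
  • for each row object table, a thread is assigned to the row object table and called to merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement, and to generate the merged row object according to the primary key identifier and the merged statement.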
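  • the preset rule comprises one or any combination of the following:
  • multiple data insertion statements are merged into one data insertion statement;
  • multiple data deletion statements are merged into one data deletion statement;
  • multiple data update statements are merged into one data update statement;
  • two statements that first insert data and then delete data are merged into one data deletion statement;
  • two statements that first insert data and then update data are merged into one data update statement;
  • two statements that first delete data and then insert data are merged into one data insertion statement;
  • two statements that first delete data and then update data are merged into one data update statement;
  • two statements that first update data and then delete data are merged into one data deletion statement.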
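  • the device further includes an encryption unit, specifically configured to: perform data encryption and compression on the merged row object.
  • the sending unit is further configured to: parse the Binary Log of a source MySQL database into row objects; merge row objects having the same primary key identifier according to a preset rule; and send the merged row objects to a target MySQL database, so that the target MySQL database performs MySQL database playback according to the merged row objects.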
  • a database replication apparatus based on log parsing includes a communication interface, a processor, a memory, and a bus system;
  • the processor is configured to: parse the database log into row objects according to the database log in the source database; merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement, and generate a merged row object according to the primary key identifier and the merged statement; and send the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • the processor is configured to: parse the database log in the source database into at least one row object table, where a row object table contains at least one row object and a row object corresponds to one database operation statement of one row; and, for each row object table, assign a thread to the row object table and call the thread to merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement and to generate a merged row object according to the primary key identifier and the merged statement.
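  • the preset rule comprises one or any combination of the following:
  • multiple data insertion statements are merged into one data insertion statement;
  • multiple data deletion statements are merged into one data deletion statement;
  • multiple data update statements are merged into one data update statement;
  • two statements that first insert data and then delete data are merged into one data deletion statement;
  • two statements that first insert data and then update data are merged into one data update statement;
  • two statements that first delete data and then insert data are merged into one data insertion statement;
  • two statements that first delete data and then update data are merged into one data update statement;
  • two statements that first update data and then delete data are merged into one data deletion statement.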
  • the processor is further configured to: perform data encryption and compression on the merged row object.
  • the processor is further configured to: parse the Binary Log of a source MySQL database into row objects; merge row objects having the same primary key identifier according to a preset rule; and send the merged row objects to the target MySQL database, so that the target MySQL database performs MySQL database playback according to the merged row objects.
  • in the embodiments of the present application, the database log is parsed into row objects, the statements corresponding to row objects having the same primary key identifier are merged into one statement according to a preset rule, a merged row object is generated according to the primary key identifier and the merged statement, and the merged row object is sent to the target database, so that the target database performs database playback according to the merged row object.
  • because the row objects are merged before they are sent, the amount of data is greatly reduced, which speeds up sending row objects from the source database to the target database and improves database replication efficiency.
  • FIG. 1 is a flowchart of a database replication method based on log parsing according to an embodiment of the present application
  • FIG. 2 is a detailed flowchart of a database replication method based on log parsing according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a database replication device based on log parsing according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of a database replication device based on log parsing according to an embodiment of the present application.
  • the database replication method based on log parsing includes:
  • Step 101 Parse the database log into a row object according to the database log in the source database.
  • Step 102 Merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement, and generate a merged row object according to the primary key identifier and the merged statement;
  • Step 103 Send the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • the source database may be an Oracle database, a SQL (Structured Query Language) database, a MySQL database, and so on.
  • the target database is of the same type as the source database and may correspondingly be an Oracle database, a SQL database, a MySQL database, and so on.
  • the database log records database statements, such as query data statements, delete data statements, and update data statements.
  • in step 101, the database log is parsed into row objects, where one row object corresponds to one database statement and multiple row objects corresponding to the same row have the same primary key identifier.
  • for example, suppose the first row of database table A, whose primary key is A1, has three database statements in chronological order.
  • the first database statement inserts data into the first row, so the first row corresponds to row object 1, whose primary key identifier is A1 and whose database statement is the insert operation on the first row; the second database statement deletes the data of the first row, so the first row corresponds to row object 2, whose primary key identifier is A1 and whose database statement is the delete operation on the first row; the third database statement updates the data of the first row, so the first row corresponds to row object 3, whose primary key identifier is A1 and whose database statement is the update operation on the first row.
  • the database log can be parsed into multiple row objects by reading the database log.
  • in step 102, the statements corresponding to row objects having the same primary key identifier are merged into one statement according to a preset rule, and the merged row object is generated according to the primary key identifier and the merged statement.
  • in the example above, the first row of table A corresponds to three database operations.
  • the statements of the three row objects, which share the primary key identifier A1, can be merged; that is, the multiple operation statements of the same row are merged.
  • after merging, one row corresponds to only one row object and only one merged statement, which greatly reduces the total number of row objects.
  • Step 103 Send the merged row object to the target database, so that the target database performs database playback according to the merged row object. Since the row objects are merged in step 102, the number of row objects is greatly reduced, thereby increasing the speed of transmission when transmitting to the target database.
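The table A example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `RowObject` fields, the SQL strings, and the `keep_latest` policy (the later statement simply wins) are all hypothetical stand-ins, and the preset rules of the application would replace `keep_latest` in practice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class RowObject:
    """One row-level statement taken from the log (field names are illustrative)."""
    table: str
    primary_key: str
    op: str         # "insert", "update" or "delete"
    statement: str  # statement text, treated as opaque here


def keep_latest(earlier: RowObject, later: RowObject) -> RowObject:
    """Placeholder merge policy (an assumption): the later statement wins."""
    return later


def merge_by_primary_key(
    rows: Iterable[RowObject],
    merge_pair: Callable[[RowObject, RowObject], RowObject] = keep_latest,
) -> List[RowObject]:
    """Step 102: collapse rows sharing (table, primary key) into one row object."""
    merged: Dict[Tuple[str, str], RowObject] = {}
    for row in rows:  # rows are assumed to arrive in log (chronological) order
        key = (row.table, row.primary_key)
        merged[key] = merge_pair(merged[key], row) if key in merged else row
    return list(merged.values())


# Step 101 output for the example: three statements on row A1, one on row A2.
parsed = [
    RowObject("A", "A1", "insert", "INSERT INTO A (id, v) VALUES ('A1', 1)"),
    RowObject("A", "A1", "delete", "DELETE FROM A WHERE id = 'A1'"),
    RowObject("A", "A2", "insert", "INSERT INTO A (id, v) VALUES ('A2', 5)"),
    RowObject("A", "A1", "update", "UPDATE A SET v = 2 WHERE id = 'A1'"),
]

merged = merge_by_primary_key(parsed)
print(f"{len(parsed)} row objects reduced to {len(merged)}")
```

Only the merged list would then be handed to the sending step, so the number of objects crossing the network is bounded by the number of distinct rows touched rather than by the number of logged statements.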
  • the method provided by the embodiment of the present application parses the database log into row objects according to the database log in the source database, merges the statements corresponding to row objects having the same primary key identifier into one statement according to a preset rule, generates the merged row object according to the primary key identifier and the merged statement, and sends the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • because the method merges the row objects before sending them to the target database, the amount of data is greatly reduced, which speeds up sending row objects from the source database to the target database and improves database replication efficiency.
  • when parsing the database log, the log can be parsed into multiple tables, and then each row of a table can be treated as an object, where one object corresponds to one database operation statement or to the database operation statements of one row; alternatively, a whole table can be treated as an object. The invention does not specifically limit the parsing method.
  • optionally, parsing the database log into row objects according to the database log in the source database includes:
  • parsing the database log in the source database into at least one row object table;
  • a row object table contains at least one row object, and a row object corresponds to one database operation statement of one row. For example, if the second row of table B has two operations in a time period, a delete statement and an update statement, the row is parsed into two row objects, one for each of the two database operation statements.
  • the method parses the database log using one database operation statement of one row as the minimum unit, which provides sufficiently detailed information for subsequent database replication, thereby facilitating replication and improving efficiency.
  • row objects having the same primary key identifier can be merged according to the following manner.
  • merging the statements corresponding to row objects having the same primary key identifier into one statement, and generating the merged row object according to the primary key identifier and the merged statement, includes:
  • for each row object table, assigning a thread to the row object table, and calling the thread to merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement and to generate a merged row object according to the primary key identifier and the merged statement.
  • because a separate thread is allocated to each row object table to merge the row objects in that table, the row objects parsed from the source database can be merged concurrently, which improves the efficiency of row object merging and of the system as a whole.
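The one-thread-per-table arrangement might look like the following sketch, which uses Python's standard `concurrent.futures.ThreadPoolExecutor`. The simplified `(primary_key, op)` row shape, the function names, and the last-operation-wins placeholder inside `merge_table` are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Row = Tuple[str, str]  # (primary_key, op): a deliberately simplified row object


def merge_table(rows: List[Row]) -> Dict[str, str]:
    """Merge one table's rows; placeholder policy keeps the last op per key."""
    merged: Dict[str, str] = {}
    for primary_key, op in rows:  # rows are assumed to be in log order
        merged[primary_key] = op
    return merged


def merge_all_tables(tables: Dict[str, List[Row]]) -> Dict[str, Dict[str, str]]:
    """Run merge_table for every row object table, one worker thread per table."""
    if not tables:
        return {}
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        futures = {name: pool.submit(merge_table, rows) for name, rows in tables.items()}
        return {name: future.result() for name, future in futures.items()}


tables = {
    "A": [("A1", "insert"), ("A1", "delete"), ("A1", "update")],
    "B": [("B2", "delete"), ("B2", "update"), ("B3", "insert")],
}
result = merge_all_tables(tables)
print(result)
```

Because each worker touches only its own table, no locking is needed between threads. In CPython, pure-Python merging is limited by the global interpreter lock, so a process pool or a language with native threads may be a better fit when merging is CPU-bound.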
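  • the preset rule comprises one or any combination of the following:
  • multiple data insertion statements are merged into one data insertion statement;
  • multiple data deletion statements are merged into one data deletion statement;
  • multiple data update statements are merged into one data update statement;
  • two statements that first insert data and then delete data are merged into one data deletion statement;
  • two statements that first insert data and then update data are merged into one data update statement;
  • two statements that first delete data and then insert data are merged into one data insertion statement;
  • two statements that first delete data and then update data are merged into one data update statement;
  • two statements that first update data and then delete data are merged into one data deletion statement.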
  • with the above rules, any statements corresponding to row objects having the same primary key identifier can be merged. For example, if a row is operated on by delete, insert, and update in that order, the delete, insert, and update statements are merged into one statement.
  • specifically, the delete statement and the insert statement are first merged into an insert statement, and then that insert statement and the update statement are merged into an insert statement, so only one insert statement remains after merging. The specific content of the merged insert statement is not limited by the invention and depends on the actual application.
  • after the statements corresponding to row objects having the same primary key identifier are merged, a new row object is generated according to the merged statement and the primary key identifier of the row concerned; this new row object is the merged row object.
  • the above rules merge multiple row objects with the same primary key identifier into one row object, which greatly reduces the amount of data; because less data has to be transferred to the target database, transmission efficiency improves.
  • before the merged row object is sent, it may be pre-processed: it may be encrypted or left unencrypted, and compressed or left uncompressed, depending on actual needs.
  • optionally, before sending the merged row object to the target database, the method provided by the embodiment of the present application further includes: performing data encryption and compression on the merged row object.
  • performing encryption and compression before the merged row object is sent to the target database improves the security of data transmission and further reduces the amount of data transmitted.
  • optionally, the method further includes: parsing the Binary Log of the source MySQL database into row objects;
  • merging, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement, and generating the merged row object according to the primary key identifier and the merged statement; and sending the merged row object to the target MySQL database;
  • the above method is a database replication method in which the source database and the target database are both MySQL databases and the database log is the Binary Log.
  • the Binary Log can be pulled to the local machine by simulating the MySQL dump protocol and parsed into row objects according to the Binary Log format; the statements corresponding to row objects having the same primary key identifier are then merged into one statement according to a preset rule, and the merged row object is generated according to the primary key identifier and the merged statement.
  • the merged row object is then sent to the target MySQL database, so that the target MySQL database performs MySQL database playback based on the merged row object.
  • Step 201 The primary end parses the database log.
  • Step 202 The primary end performs a database row object merge operation.
  • Step 203 The primary end serializes the data for transmission.
  • Step 204 The primary end encrypts and compresses the data.
  • Step 205 The primary end transmits the data over the remote network.
  • Step 206 The standby end receives the data from the remote network.
  • Step 207 The standby end decompresses and decrypts the data.
  • Step 208 The standby end deserializes the data.
  • Step 209 The standby end performs a data playback operation and stores the played-back data in the target database.
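The serialization and compression steps on both ends (Steps 203 to 208) can be sketched as a round trip with Python's standard `json` and `zlib` modules. The text does not name a wire format, so JSON, zlib, the dictionary layout, and the function names are assumptions. The encryption and decryption halves of Steps 204 and 207 are left out because the Python standard library does not ship a symmetric cipher; a vetted cryptography library would be applied to the serialized bytes at that point.

```python
import json
import zlib
from typing import Dict, List


def pack(rows: List[Dict[str, str]]) -> bytes:
    """Primary end, Steps 203-204 without encryption: serialize, then compress."""
    serialized = json.dumps(rows, ensure_ascii=False).encode("utf-8")
    return zlib.compress(serialized)


def unpack(payload: bytes) -> List[Dict[str, str]]:
    """Standby end, Steps 207-208 without decryption: decompress, then deserialize."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


# A merged row object in a hypothetical dictionary form.
merged_rows = [
    {
        "table": "A",
        "primary_key": "A1",
        "op": "update",
        "statement": "UPDATE A SET v = 2 WHERE id = 'A1'",
    }
]

payload = pack(merged_rows)   # bytes that would be sent in Step 205
restored = unpack(payload)    # what the standby end replays in Step 209
print(restored == merged_rows)
```

The standby end undoes the primary end's steps in reverse order, which is why decompression precedes deserialization here.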
  • the method provided by the embodiment of the present application parses the database log into row objects according to the database log in the source database, merges the statements corresponding to row objects having the same primary key identifier into one statement according to a preset rule, generates the merged row object according to the primary key identifier and the merged statement, and sends the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • because the method merges the row objects before sending them to the target database, the amount of data is greatly reduced, which speeds up sending row objects from the source database to the target database and improves database replication efficiency.
  • the embodiment of the present application further provides a database replication apparatus based on log parsing.
  • the database replication device based on log parsing provided by the embodiment of the present application is as shown in FIG. 3 .
  • the parsing unit 301 is configured to parse the database log into a row object according to a database log in the source database;
  • the merging unit 302 is configured to merge the statements corresponding to the row objects having the same primary key identifier into one statement according to a preset rule, and generate the merged row objects according to the primary key identifier and the merged statement;
  • the sending unit 304 is configured to send the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • the parsing unit 301 is configured to: parse the database log in the source database into at least one row object table, where a row object table contains at least one row object and a row object corresponds to one database operation statement of one row; and, for each row object table, assign a thread to the row object table and call the thread to merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement and to generate the merged row object according to the primary key identifier and the merged statement.
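  • the preset rule comprises one or any combination of the following:
  • multiple data insertion statements are merged into one data insertion statement;
  • multiple data deletion statements are merged into one data deletion statement;
  • multiple data update statements are merged into one data update statement;
  • two statements that first insert data and then delete data are merged into one data deletion statement;
  • two statements that first insert data and then update data are merged into one data update statement;
  • two statements that first delete data and then insert data are merged into one data insertion statement;
  • two statements that first delete data and then update data are merged into one data update statement;
  • two statements that first update data and then delete data are merged into one data deletion statement.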
  • the device further includes an encryption unit 303, specifically configured to: perform data encryption and compression on the merged row object.
  • the sending unit 304 is further configured to: parse the Binary Log of a source MySQL database into row objects; merge row objects having the same primary key identifier according to a preset rule; and send the merged row objects to the target MySQL database, so that the target MySQL database performs MySQL database playback according to the merged row objects.
  • the method provided by the embodiment of the present application parses the database log into row objects according to the database log in the source database, merges the statements corresponding to row objects having the same primary key identifier into one statement according to a preset rule, generates the merged row object according to the primary key identifier and the merged statement, and sends the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • because the method merges the row objects before sending them to the target database, the amount of data is greatly reduced, which speeds up sending row objects from the source database to the target database and improves database replication efficiency.
  • a database replication device based on log parsing provided by the embodiment of the present application, as shown in FIG. 4, includes a communication interface 401, a processor 402, a memory 403, and a bus system 404;
  • the memory 403 is used to store a program.
  • the program can include program code, the program code including computer operating instructions.
  • the memory 403 may be a random access memory (RAM) or a non-volatile memory, such as at least one disk storage. Only one memory is shown in the figure; the memory can also be set to multiple as needed. The memory 403 can also be a memory in the processor 402.
  • the memory 403 stores the following elements, executable modules or data structures, or a subset thereof, or an extended set thereof:
  • Operation instructions: include various operation instructions for implementing various operations.
  • Operating system: includes a variety of system programs for implementing various basic services and handling hardware-based tasks.
  • the processor 402 controls the operation of the database replication device, and the processor 402 may also be referred to as a CPU (Central Processing Unit).
  • the components of the database replication device are coupled together by a bus system 404.
  • the bus system 404 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus.
  • for clarity of description, the various buses are all labeled as bus system 404 in the figure, and FIG. 4 shows them only schematically.
  • Processor 402 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the foregoing method may be completed by an integrated logic circuit of hardware in the processor 402 or an instruction in a form of software.
  • the processor 402 described above may be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general purpose processor may be a microprocessor or the processor or any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly implemented by the hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory 403, and the processor 402 reads the information in the memory 403 and performs the following steps in conjunction with its hardware:
  • parsing the database log into row objects according to the database log in the source database; merging, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement; and sending the merged row object to the target database, so that the target database performs database playback according to the merged row object.
  • the processor 402 is configured to: parse the database log in the source database into at least one row object table, where a row object table contains at least one row object and a row object corresponds to one database operation statement of one row; and, for each row object table, assign a thread to the row object table and call the thread to merge, according to a preset rule, the statements corresponding to row objects having the same primary key identifier into one statement and to generate the merged row object according to the primary key identifier and the merged statement.
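  • the preset rule comprises one or any combination of the following:
  • multiple data insertion statements are merged into one data insertion statement;
  • multiple data deletion statements are merged into one data deletion statement;
  • multiple data update statements are merged into one data update statement;
  • two statements that first insert data and then delete data are merged into one data deletion statement;
  • two statements that first insert data and then update data are merged into one data update statement;
  • two statements that first delete data and then insert data are merged into one data insertion statement;
  • two statements that first delete data and then update data are merged into one data update statement;
  • two statements that first update data and then delete data are merged into one data deletion statement.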
  • the processor 402 is further configured to: perform data encryption and compression on the merged row object.
  • the processor 402 is further configured to: parse the Binary Log of a source MySQL database into row objects; merge row objects having the same primary key identifier according to a preset rule; and send the merged row objects to a target MySQL database, so that the target MySQL database performs MySQL database playback according to the merged row objects.
  • embodiments of the present invention can be provided as a method, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computing Systems (AREA)

Abstract

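A database replication method and apparatus based on log parsing, relating to the field of database replication technologies, and used to solve the technical problem in the prior art that off-site database replication is inefficient when the primary end parses out a very large number of database replay statements. The method includes: parsing, according to a database log in a source database, the database log into row objects (101); merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement (102); and sending the merged row object to a target database, so that the target database performs database replay according to the merged row object (103). Because the row objects are merged before they are sent to the target database, the amount of data is greatly reduced, which speeds up the transfer of row objects from the source database to the target database and improves database replication efficiency.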

Description

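Database replication method and apparatus based on log parsing
This application claims priority to Chinese Patent Application No. 201510776844.X, filed with the Intellectual Property Office of the People's Republic of China on November 12, 2015 and entitled "Database replication method and apparatus based on log parsing", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of database replication technologies, and in particular, to a database replication method and apparatus based on log parsing.
Background
As application systems continue to develop, they depend more and more on databases. The growing variety of database application scenarios also drives increasing demand and stricter requirements for replication between databases. Database replication is a set of technologies that copy and distribute data and database objects between databases and synchronize them to ensure consistency.
In the prior art, log-parsing-based database replication works as follows: the primary end parses the log generated by the source database engine into database replay statements and sends the resulting large number of replay statements to the target database over an off-site network; the standby end then performs database replay according to the received replay statements and stores the replayed data in the target database, thereby replicating the data of the source database to the target database.
The problem with this approach is that when the primary end parses out a very large number of database replay statements, transmission over the off-site network is slow, so the overall off-site replication takes a very long time and off-site database replication efficiency drops.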
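Summary
The present application provides a database replication method based on log parsing, to solve the technical problem in the prior art that off-site database replication is inefficient when the primary end parses out a very large number of database replay statements.
In a first aspect, an embodiment of the present application provides a database replication method based on log parsing, including:
parsing, according to a database log in a source database, the database log into row objects;
merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement; and
sending the merged row object to a target database, so that the target database performs database replay according to the merged row object.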
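Optionally, parsing, according to the database log in the source database, the database log into row objects includes:
parsing the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row;
and merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement includes:
for each row object table, allocating a thread to the row object table and invoking the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.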
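Optionally, the preset rules include one of the following or any combination thereof:
merging multiple data insertion statements into one data insertion statement;
merging multiple data deletion statements into one data deletion statement;
merging multiple data update statements into one data update statement;
merging a data insertion statement followed by a data deletion statement into one data deletion statement;
merging a data insertion statement followed by a data update statement into one data update statement;
merging a data deletion statement followed by a data insertion statement into one data insertion statement;
merging a data deletion statement followed by a data update statement into one data update statement;
merging a data update statement followed by a data deletion statement into one data deletion statement.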
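Optionally, before the merged row object is sent to the target database, the method further includes:
performing data encryption and compression on the merged row object.
Optionally, before the database log in the source database is parsed into row objects, the method further includes:
parsing, according to the Binary Log of a source MySQL database, the Binary Log into row objects;
merging, according to preset rules, row objects that have the same primary key identifier; and
sending the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.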
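In a second aspect, an embodiment of the present application provides a database replication apparatus based on log parsing, including:
a parsing unit, configured to parse, according to a database log in a source database, the database log into row objects;
a merging unit, configured to merge, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement; and
a sending unit, configured to send the merged row object to a target database, so that the target database performs database replay according to the merged row object.
Optionally, the parsing unit is specifically configured to:
parse the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row; and
for each row object table, allocate a thread to the row object table and invoke the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.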
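Optionally, the preset rules include one of the following or any combination thereof:
merging multiple data insertion statements into one data insertion statement;
merging multiple data deletion statements into one data deletion statement;
merging multiple data update statements into one data update statement;
merging a data insertion statement followed by a data deletion statement into one data deletion statement;
merging a data insertion statement followed by a data update statement into one data update statement;
merging a data deletion statement followed by a data insertion statement into one data insertion statement;
merging a data deletion statement followed by a data update statement into one data update statement;
merging a data update statement followed by a data deletion statement into one data deletion statement.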
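Optionally, the apparatus further includes an encryption unit, specifically configured to:
perform data encryption and compression on the merged row object.
Optionally, the sending unit is further configured to:
parse, according to the Binary Log of a source MySQL database, the Binary Log into row objects;
merge, according to preset rules, row objects that have the same primary key identifier; and
send the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.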
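In a third aspect, an embodiment of the present application provides a database replication apparatus based on log parsing, including a communication interface, a processor, a memory, and a bus system;
the processor is configured to: parse, according to a database log in a source database, the database log into row objects; merge, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generate a merged row object according to the primary key identifier and the merged statement; and send the merged row object to a target database, so that the target database performs database replay according to the merged row object.
Optionally, the processor is specifically configured to: parse the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row; and for each row object table, allocate a thread to the row object table and invoke the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.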
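Optionally, the preset rules include one of the following or any combination thereof:
merging multiple data insertion statements into one data insertion statement;
merging multiple data deletion statements into one data deletion statement;
merging multiple data update statements into one data update statement;
merging a data insertion statement followed by a data deletion statement into one data deletion statement;
merging a data insertion statement followed by a data update statement into one data update statement;
merging a data deletion statement followed by a data insertion statement into one data insertion statement;
merging a data deletion statement followed by a data update statement into one data update statement;
merging a data update statement followed by a data deletion statement into one data deletion statement.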
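Optionally, the processor is further configured to perform data encryption and compression on the merged row object.
Optionally, the processor is further configured to: parse, according to the Binary Log of a source MySQL database, the Binary Log into row objects; merge, according to preset rules, row objects that have the same primary key identifier; and send the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.
In the embodiments of the present application, the database log in the source database is parsed into row objects; according to preset rules, the statements corresponding to row objects that have the same primary key identifier are merged into one statement, and a merged row object is generated according to the primary key identifier and the merged statement; and the merged row object is sent to the target database, so that the target database performs database replay according to the merged row object. Because the row objects are merged before being sent to the target database, the amount of data is greatly reduced, which increases the speed at which row objects are sent from the source database to the target database and improves database replication efficiency.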
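Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the log-parsing-based database replication method provided by an embodiment of the present application;
FIG. 2 is a detailed flowchart of the log-parsing-based database replication method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the log-parsing-based database replication apparatus provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the log-parsing-based database replication apparatus provided by an embodiment of the present application.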
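Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The embodiments of the present application are described in further detail below with reference to the accompanying drawings.
As shown in FIG. 1, the log-parsing-based database replication method provided by an embodiment of the present application includes:
Step 101: parse, according to a database log in a source database, the database log into row objects;
Step 102: merge, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generate a merged row object according to the primary key identifier and the merged statement;
Step 103: send the merged row object to a target database, so that the target database performs database replay according to the merged row object.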
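The source database may be an Oracle database, an SQL (Structured Query Language) database, a MySQL (My Structured Query Language) database, and so on. The target database is of the same type as the source database and may correspondingly be an Oracle database, an SQL database, a MySQL database, and so on.
The database log records the database statements that operate on data, such as data query statements, data deletion statements, and data update statements.
In step 101, the database log in the source database is parsed into row objects. One row object corresponds to one database statement, and multiple row objects that correspond to the same row have the same primary key identifier. For example, suppose the first row of database table A, whose primary key identifier is A1, is the target of three database statements in chronological order. The first statement inserts data into the first row, so the first row corresponds to row object 1, whose primary key identifier is A1 and whose database statement is the insertion into the first row. The second statement deletes the data of the first row, so the first row corresponds to row object 2, whose primary key identifier is A1 and whose database statement is the deletion of the first row's data. The third statement updates the data of the first row, so the first row corresponds to row object 3, whose primary key identifier is A1 and whose database statement is the update of the first row's data. In step 101, the database log is thus read and parsed into multiple row objects.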
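The table A example can be pictured with a small data structure. The following is only a minimal sketch: the `RowObject` class, its field names, the `parse_log` helper, and the already-decoded `log_entries` tuples are all hypothetical illustrations, since the text does not prescribe a concrete log format or object layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowObject:
    """Hypothetical row object: one database statement on one row."""
    table: str        # table the row belongs to
    primary_key: str  # primary key identifier of the row
    op: str           # "insert", "delete" or "update"
    statement: str    # the database statement recorded in the log


def parse_log(entries):
    """Turn already-decoded log entries into row objects, keeping log order."""
    return [RowObject(table, pk, op, stmt) for table, pk, op, stmt in entries]


# Assumed, simplified log content for row A1 of table A (insert, delete, update).
log_entries = [
    ("A", "A1", "insert", "INSERT INTO A (id, v) VALUES ('A1', 1)"),
    ("A", "A1", "delete", "DELETE FROM A WHERE id = 'A1'"),
    ("A", "A1", "update", "UPDATE A SET v = 2 WHERE id = 'A1'"),
]
row_objects = parse_log(log_entries)
```

All three row objects carry the same primary key identifier `A1`, which is what later allows them to be merged.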
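In step 102, according to preset rules, the statements corresponding to row objects that have the same primary key identifier are merged into one statement, and a merged row object is generated according to the primary key identifier and the merged statement. In the example above, the first row of table A has three database operations; the statements of the row objects that share its primary key identifier, that is, the multiple operation statements on the same row, can be merged. After merging, each row corresponds to only one row object with a single merged statement, which greatly reduces the total number of row objects.
In step 103, the merged row object is sent to the target database, so that the target database performs database replay according to the merged row object. Because step 102 merges the row objects and thereby greatly reduces their number, transmission to the target database is faster.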
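Replay on the target side can be illustrated as executing one statement per merged row object. The sketch below is an assumption-laden stand-in: it uses an in-memory SQLite database from the Python standard library in place of the real target database, and the dictionary layout of a merged row object (`primary_key`, `statement`, `params`) is hypothetical.

```python
import sqlite3


def replay(conn, merged_rows):
    """Execute each merged row object's statement, in order, in one transaction."""
    with conn:  # commits on success, rolls back if a statement fails
        for row in merged_rows:
            conn.execute(row["statement"], row["params"])


# SQLite stands in for the target database purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id TEXT PRIMARY KEY, v INTEGER)")

# One merged row object per primary key identifier (hypothetical layout).
merged = [
    {"primary_key": "A1", "statement": "INSERT INTO a (id, v) VALUES (?, ?)", "params": ("A1", 2)},
    {"primary_key": "A2", "statement": "INSERT INTO a (id, v) VALUES (?, ?)", "params": ("A2", 5)},
]
replay(conn, merged)
rows = conn.execute("SELECT id, v FROM a ORDER BY id").fetchall()
```

Since each row contributes a single statement, the target executes far fewer statements than the source originally logged.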
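With the method provided by the embodiments of the present application, the database log in the source database is parsed into row objects; according to preset rules, the statements corresponding to row objects that have the same primary key identifier are merged into one statement, and a merged row object is generated according to the primary key identifier and the merged statement; and the merged row object is sent to the target database, so that the target database performs database replay according to the merged row object. Because the row objects are merged before being sent to the target database, the amount of data is greatly reduced, which increases the speed at which row objects are sent from the source database to the target database and improves database replication efficiency.
Specifically, when the database log is parsed, it may be parsed into multiple tables, and each row of a table may then be treated as an object, where one object corresponds to one database operation statement of a row or to multiple database operation statements of a row; alternatively, an entire table may be treated as one object. The present invention does not specifically limit the parsing method.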
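Optionally, parsing, according to the database log in the source database, the database log into row objects includes:
parsing the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row.
In the above method, the database log in the source database is parsed into at least one row object table, where one row object table contains at least one row object and one row object corresponds to one database operation statement of one row. A row object therefore corresponds to the database operation statement of a single operation on a single row. For example, if row 2 of table B has two operations within a time period, a delete statement and an update statement, that row is parsed into two row objects, one for each of the two database operation statements on that row. Because this method parses the database log with one database operation statement of one row as the smallest unit, it provides sufficiently detailed information for subsequent database replication, which facilitates replication and improves efficiency.
Based on the above log parsing approach, row objects that have the same primary key identifier can be merged as follows.
Optionally, merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement includes:
for each row object table, allocating a thread to the row object table and invoking the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.
In the above method, one thread can be allocated to each row object table to merge the statements corresponding to its row objects, so that the row objects parsed from the source database are merged concurrently, which improves the efficiency of row object merging and of the system as a whole.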
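The one-thread-per-table arrangement might look like the following sketch. It is illustrative only: row objects are plain tuples, the names `merge_table` and `merge_all_tables` are hypothetical, and the per-table merge uses a placeholder rule in which the later statement for a primary key simply replaces the earlier one.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def merge_table(rows):
    """Keep one entry per primary key; placeholder rule: the later statement wins."""
    merged = {}
    for primary_key, op, statement in rows:  # rows arrive in log order
        merged[primary_key] = (primary_key, op, statement)
    return list(merged.values())


def merge_all_tables(row_objects):
    """Group row objects by table, then merge every table in its own worker thread."""
    tables = defaultdict(list)
    for table, primary_key, op, statement in row_objects:
        tables[table].append((primary_key, op, statement))
    if not tables:
        return {}
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        futures = {name: pool.submit(merge_table, rows) for name, rows in tables.items()}
        return {name: future.result() for name, future in futures.items()}


parsed = [
    ("A", "A1", "insert", "INSERT INTO A (id, v) VALUES ('A1', 1)"),
    ("B", "B2", "update", "UPDATE B SET v = 7 WHERE id = 'B2'"),
    ("A", "A1", "update", "UPDATE A SET v = 2 WHERE id = 'A1'"),
    ("B", "B2", "delete", "DELETE FROM B WHERE id = 'B2'"),
]
merged_tables = merge_all_tables(parsed)
```

Because each table is merged independently, no locking between threads is needed. In CPython, threads mainly help when the work involves I/O; a CPU-bound merge may need processes or another runtime to gain real parallelism.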
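There are many possible rules for merging the statements corresponding to row objects that have the same primary key identifier. The rules provided by the embodiments of the present application are as follows:
Optionally, the preset rules include one of the following or any combination thereof:
merging multiple data insertion statements into one data insertion statement;
merging multiple data deletion statements into one data deletion statement;
merging multiple data update statements into one data update statement;
merging a data insertion statement followed by a data deletion statement into one data deletion statement;
merging a data insertion statement followed by a data update statement into one data update statement;
merging a data deletion statement followed by a data insertion statement into one data insertion statement;
merging a data deletion statement followed by a data update statement into one data update statement;
merging a data update statement followed by a data deletion statement into one data deletion statement.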
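The eight rules listed above can be written down as a lookup table and applied pairwise to the time-ordered operations on one row. This is a minimal sketch that tracks only the statement type; the names `MERGE_RULES`, `merge_ops`, and `merge_row_objects` are hypothetical, the table applies the listed rules literally, and raising an error for a pair that the list does not cover (an update followed by an insert) is an assumption of this sketch. How the content of the merged statement is built is left open by the text and is not modeled here.

```python
# (earlier statement type, later statement type) -> merged statement type,
# transcribed from the eight listed rules.
MERGE_RULES = {
    ("insert", "insert"): "insert",
    ("delete", "delete"): "delete",
    ("update", "update"): "update",
    ("insert", "delete"): "delete",
    ("insert", "update"): "update",
    ("delete", "insert"): "insert",
    ("delete", "update"): "update",
    ("update", "delete"): "delete",
}


def merge_ops(ops):
    """Fold a time-ordered list of statement types for one row into one type."""
    if not ops:
        raise ValueError("no operations to merge")
    result = ops[0]
    for later in ops[1:]:
        try:
            result = MERGE_RULES[(result, later)]
        except KeyError:
            raise ValueError(f"no merge rule for {result!r} followed by {later!r}") from None
    return result


def merge_row_objects(row_objects):
    """Group (primary_key, op) pairs by primary key and fold each group."""
    grouped = {}
    for primary_key, op in row_objects:
        grouped.setdefault(primary_key, []).append(op)
    return {pk: merge_ops(ops) for pk, ops in grouped.items()}


merged_types = merge_row_objects(
    [("A1", "insert"), ("A2", "update"), ("A1", "update"), ("A1", "delete")]
)
```

Here the three operations on `A1` collapse to a single delete and `A2` keeps its single update, so four row objects become two.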
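With the above method, any statements corresponding to row objects that have the same primary key identifier can be merged by combining the above rules. For example, if a row undergoes the operations delete, insert, and update in sequence, the three corresponding statements can be merged into one: the delete statement and the insert statement are first merged into an insert statement, and that insert statement and the update statement are then merged into an insert statement, so that only one insert statement remains after merging. The specific content of the merged insert statement depends on the actual application and is not limited by the present invention. Thus, with the preset rules, the statements corresponding to row objects that have the same primary key identifier can be merged, and a new row object is then generated from the merged statement and the primary key identifier of the row; this new row object is the merged row object for that row. It follows that the above rules allow multiple row objects with the same primary key identifier to be merged into one row object, which greatly reduces the amount of data, and the reduced data volume makes sending the merged row objects to the target database more efficient.
In addition, before the merged row objects are sent to the target database, they may be preprocessed, for example encrypted or left unencrypted, and compressed or left uncompressed, depending on actual needs.
Optionally, in the method provided by the embodiments of the present application, before the merged row object is sent to the target database, the method further includes:
performing data encryption and compression on the merged row object.
In this method, the merged row objects are encrypted and compressed before being sent to the target database, which improves the security of data transmission and further reduces the amount of data sent.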
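Optionally, before the database log in the source database is parsed into row objects, the method further includes:
parsing, according to the Binary Log of a source MySQL database, the Binary Log into row objects;
merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement; and
sending the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.
The above method is a database replication method for the scenario in which both the source database and the target database are MySQL databases and the database log is the Binary Log. In this method, the Binary Log can be pulled to the local machine by emulating the MySQL dump protocol and parsed into row objects according to the Binary Log format; then, according to preset rules, the statements corresponding to row objects that have the same primary key identifier are merged into one statement, a merged row object is generated according to the primary key identifier and the merged statement, and the merged row object is sent to the target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.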
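The log-parsing-based database replication method provided by the embodiments of the present application is described in detail below.
Step 201: the primary end parses the database log;
Step 202: the primary end merges the database row objects;
Step 203: the primary end serializes the data for transmission;
Step 204: the primary end encrypts and compresses the data;
Step 205: the primary end sends the data over the off-site network;
Step 206: the standby end receives the data from the off-site network;
Step 207: the standby end decompresses and decrypts the data;
Step 208: the standby end deserializes the data;
Step 209: the standby end performs the data replay operation and stores the replayed data in the target database.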
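Steps 203, 204, 207, and 208 form a round trip that can be sketched with standard-library tools. Everything concrete here is an assumption: JSON as the serialization format, zlib as the compressor, the function names `pack` and `unpack`, and in particular `no_cipher`, which is a placeholder that performs no encryption and only marks where a real cipher would be plugged in. The network transfer of steps 205 and 206 is omitted.

```python
import json
import zlib


def no_cipher(data: bytes) -> bytes:
    """Placeholder only: returns the data unchanged and does NOT encrypt it."""
    return data


def pack(row_objects, encrypt=no_cipher):
    """Primary end: serialize (step 203), then encrypt and compress (step 204)."""
    serialized = json.dumps(row_objects).encode("utf-8")
    return zlib.compress(encrypt(serialized))


def unpack(payload, decrypt=no_cipher):
    """Standby end: decompress and decrypt (step 207), then deserialize (step 208)."""
    serialized = decrypt(zlib.decompress(payload))
    return json.loads(serialized.decode("utf-8"))


merged = [
    {"table": "A", "primary_key": "A1", "op": "update",
     "statement": "UPDATE A SET v = 2 WHERE id = 'A1'"},
]
payload = pack(merged)      # bytes that would be sent over the off-site network
restored = unpack(payload)  # what the standby end hands to the replay step
```

The order follows step 207, which decompresses before decrypting, so the primary end encrypts before compressing. With a real cipher the encrypted bytes usually compress poorly, so an implementation may prefer to compress first and reverse the order on the standby end accordingly.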
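With the method provided by the embodiments of the present application, the database log in the source database is parsed into row objects; according to preset rules, the statements corresponding to row objects that have the same primary key identifier are merged into one statement, and a merged row object is generated according to the primary key identifier and the merged statement; and the merged row object is sent to the target database, so that the target database performs database replay according to the merged row object. Because the row objects are merged before being sent to the target database, the amount of data is greatly reduced, which increases the speed at which row objects are sent from the source database to the target database and improves database replication efficiency.
Based on the same technical concept, an embodiment of the present application further provides a database replication apparatus based on log parsing. As shown in FIG. 3, the apparatus includes:
a parsing unit 301, configured to parse, according to a database log in a source database, the database log into row objects;
a merging unit 302, configured to merge, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement; and
a sending unit 304, configured to send the merged row object to a target database, so that the target database performs database replay according to the merged row object.
Optionally, the parsing unit 301 is specifically configured to: parse the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row; and for each row object table, allocate a thread to the row object table and invoke the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.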
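Optionally, the preset rules include one of the following or any combination thereof:
merging multiple data insertion statements into one data insertion statement;
merging multiple data deletion statements into one data deletion statement;
merging multiple data update statements into one data update statement;
merging a data insertion statement followed by a data deletion statement into one data deletion statement;
merging a data insertion statement followed by a data update statement into one data update statement;
merging a data deletion statement followed by a data insertion statement into one data insertion statement;
merging a data deletion statement followed by a data update statement into one data update statement;
merging a data update statement followed by a data deletion statement into one data deletion statement.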
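Optionally, the apparatus further includes an encryption unit 303, specifically configured to perform data encryption and compression on the merged row object.
Optionally, the sending unit 304 is further configured to: parse, according to the Binary Log of a source MySQL database, the Binary Log into row objects; merge, according to preset rules, row objects that have the same primary key identifier; and send the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.
With the method provided by the embodiments of the present application, the database log in the source database is parsed into row objects; according to preset rules, the statements corresponding to row objects that have the same primary key identifier are merged into one statement, and a merged row object is generated according to the primary key identifier and the merged statement; and the merged row object is sent to the target database, so that the target database performs database replay according to the merged row object. Because the row objects are merged before being sent to the target database, the amount of data is greatly reduced, which increases the speed at which row objects are sent from the source database to the target database and improves database replication efficiency.
Based on the same technical concept, an embodiment of the present application provides a database replication apparatus based on log parsing which, as shown in FIG. 4, includes a communication interface 401, a processor 402, a memory 403, and a bus system 404;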
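The memory 403 is configured to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory 403 may be a random access memory (RAM) or a non-volatile memory, for example at least one disk memory. Only one memory is shown in the figure; of course, multiple memories may be provided as needed. The memory 403 may also be a memory in the processor 402.
The memory 403 stores the following elements, executable modules or data structures, or a subset or an extended set thereof:
Operation instructions: including various operation instructions, used to implement various operations.
Operating system: including various system programs, used to implement various basic services and to process hardware-based tasks.
The processor 402 controls the operation of the database replication apparatus; the processor 402 may also be referred to as a CPU (Central Processing Unit). In a specific application, the components of the database replication apparatus are coupled together through the bus system 404, which may include a power bus, a control bus, a status signal bus, and the like in addition to a data bus. For clarity of description, however, the various buses are all labeled as the bus system 404 in the figure. For ease of representation, FIG. 4 shows it only schematically.
The method disclosed in the above embodiments of the present application may be applied to the processor 402 or implemented by the processor 402. The processor 402 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method may be completed by an integrated logic circuit of hardware in the processor 402 or by instructions in the form of software. The processor 402 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 403; the processor 402 reads the information in the memory 403 and, in combination with its hardware, performs the following steps: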
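parsing, according to a database log in a source database, the database log into row objects; merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement; and sending the merged row object to a target database, so that the target database performs database replay according to the merged row object.
Optionally, the processor 402 is specifically configured to: parse the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row; and for each row object table, allocate a thread to the row object table and invoke the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.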
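Optionally, the preset rules include one of the following or any combination thereof:
merging multiple data insertion statements into one data insertion statement;
merging multiple data deletion statements into one data deletion statement;
merging multiple data update statements into one data update statement;
merging a data insertion statement followed by a data deletion statement into one data deletion statement;
merging a data insertion statement followed by a data update statement into one data update statement;
merging a data deletion statement followed by a data insertion statement into one data insertion statement;
merging a data deletion statement followed by a data update statement into one data update statement;
merging a data update statement followed by a data deletion statement into one data deletion statement.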
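Optionally, the processor 402 is further configured to perform data encryption and compression on the merged row object.
Optionally, the processor 402 is further configured to: parse, according to the Binary Log of a source MySQL database, the Binary Log into row objects; merge, according to preset rules, row objects that have the same primary key identifier; and send the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.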
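A person skilled in the art should understand that the embodiments of the present invention may be provided as a method or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present invention. It should be understood that computer program instructions can implement each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, a person skilled in the art can make additional changes and modifications to these embodiments once the basic inventive concept is known. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, a person skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.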

Claims (10)

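  1. A database replication method based on log parsing, comprising:
    parsing, according to a database log in a source database, the database log into row objects;
    merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement; and
    sending the merged row object to a target database, so that the target database performs database replay according to the merged row object.
  2. The method according to claim 1, wherein parsing, according to the database log in the source database, the database log into row objects comprises:
    parsing the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row;
    and merging, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and generating a merged row object according to the primary key identifier and the merged statement comprises:
    for each row object table, allocating a thread to the row object table and invoking the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.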
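  3. The method according to claim 1 or 2, wherein the preset rules comprise one of the following or any combination thereof:
    merging multiple data insertion statements into one data insertion statement;
    merging multiple data deletion statements into one data deletion statement;
    merging multiple data update statements into one data update statement;
    merging a data insertion statement followed by a data deletion statement into one data deletion statement;
    merging a data insertion statement followed by a data update statement into one data update statement;
    merging a data deletion statement followed by a data insertion statement into one data insertion statement;
    merging a data deletion statement followed by a data update statement into one data update statement;
    merging a data update statement followed by a data deletion statement into one data deletion statement.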
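  4. The method according to claim 1, wherein before the merged row object is sent to the target database, the method further comprises:
    performing data encryption and compression on the merged row object.
  5. The method according to any one of claims 1, 2, or 4, wherein before the database log in the source database is parsed into row objects, the method further comprises:
    parsing, according to the Binary Log of a source MySQL database, the Binary Log into row objects;
    merging, according to preset rules, row objects that have the same primary key identifier; and
    sending the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.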
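  6. A database replication apparatus based on log parsing, comprising:
    a parsing unit, configured to parse, according to a database log in a source database, the database log into row objects;
    a merging unit, configured to merge, according to preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement; and
    a sending unit, configured to send the merged row object to a target database, so that the target database performs database replay according to the merged row object.
  7. The apparatus according to claim 6, wherein the parsing unit is specifically configured to:
    parse the database log in the source database into at least one row object table, wherein one row object table contains at least one row object, and one row object corresponds to one database operation statement of one row; and
    for each row object table, allocate a thread to the row object table and invoke the thread to merge, according to the preset rules, the statements corresponding to row objects that have the same primary key identifier into one statement, and to generate a merged row object according to the primary key identifier and the merged statement.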
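  8. The apparatus according to claim 6 or 7, wherein the preset rules comprise one of the following or any combination thereof:
    merging multiple data insertion statements into one data insertion statement;
    merging multiple data deletion statements into one data deletion statement;
    merging multiple data update statements into one data update statement;
    merging a data insertion statement followed by a data deletion statement into one data deletion statement;
    merging a data insertion statement followed by a data update statement into one data update statement;
    merging a data deletion statement followed by a data insertion statement into one data insertion statement;
    merging a data deletion statement followed by a data update statement into one data update statement;
    merging a data update statement followed by a data deletion statement into one data deletion statement.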
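  9. The apparatus according to claim 6, wherein the apparatus further comprises an encryption unit, specifically configured to:
    perform data encryption and compression on the merged row object.
  10. The apparatus according to any one of claims 6, 7, or 9, wherein the sending unit is further configured to:
    parse, according to the Binary Log of a source MySQL database, the Binary Log into row objects;
    merge, according to preset rules, row objects that have the same primary key identifier; and
    send the merged row object to a target MySQL database, so that the target MySQL database performs MySQL database replay according to the merged row object.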
PCT/CN2016/105007 2015-11-12 2016-11-08 Database replication method and apparatus based on log parsing WO2017080431A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510776844.X 2015-11-12
CN201510776844.XA CN105955970A (zh) 2015-11-12 2015-11-12 Database replication method and apparatus based on log parsing

Publications (1)

Publication Number Publication Date
WO2017080431A1 true WO2017080431A1 (zh) 2017-05-18

Family

ID=56917189

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/105007 WO2017080431A1 (zh) 2015-11-12 2016-11-08 Database replication method and apparatus based on log parsing

Country Status (3)

Country Link
CN (1) CN105955970A (zh)
TW (1) TWI628551B (zh)
WO (1) WO2017080431A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
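CN110674147A (zh) * 2019-08-28 2020-01-10 视联动力信息技术股份有限公司 Data processing method, apparatus, and computer-readable storage medium
CN111444199A (zh) * 2019-01-17 2020-07-24 阿里巴巴集团控股有限公司 Data processing method and apparatus, storage medium, and processor
CN113111050A (zh) * 2021-04-27 2021-07-13 山东福生佳信科技股份有限公司 Database comparison method and apparatus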

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
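CN105955970A (zh) * 2015-11-12 2016-09-21 中国银联股份有限公司 Database replication method and apparatus based on log parsing
CN106844574A (zh) * 2017-01-05 2017-06-13 中国银联股份有限公司 Method and apparatus for remote data synchronization
CN107122431A (zh) * 2017-04-14 2017-09-01 浙江数链科技有限公司 Real-time computing platform and data computing method based on the real-time computing platform
CN107169094B (zh) * 2017-05-12 2020-10-13 北京小米移动软件有限公司 Information aggregation method and apparatus
CN109101627B (zh) * 2018-08-14 2022-03-22 交通银行股份有限公司 Heterogeneous database synchronization method and apparatus
CN109408589B (zh) * 2018-09-14 2020-08-14 新华三大数据技术有限公司 Data synchronization method and apparatus
CN109656935B (zh) * 2018-11-23 2023-12-01 创新先进技术有限公司 Method and system for data replay for a database
CN111382152B (zh) * 2018-12-27 2023-10-20 杭州海康威视数字技术股份有限公司 Data table processing method, apparatus, and storage medium
CN110134653B (zh) * 2019-05-17 2021-09-07 杭州安恒信息技术股份有限公司 Log-assisted database auditing method and system
CN110297866A (zh) * 2019-05-20 2019-10-01 平安普惠企业管理有限公司 Data synchronization method and data synchronization apparatus based on log analysis
CN110569223A (zh) * 2019-09-16 2019-12-13 京东数字科技控股有限公司 Database log processing method and apparatus
CN114647659A (zh) * 2020-12-17 2022-06-21 金篆信科有限责任公司 Data processing method, apparatus, electronic device, and storage medium
CN113220646A (zh) * 2021-06-03 2021-08-06 北京锐安科技有限公司 Data parsing method, apparatus, computer device, and storage medium
CN114969200B (zh) * 2022-04-18 2023-09-19 中移互联网有限公司 Data synchronization method, apparatus, electronic device, and storage medium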

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
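CN101719149A (zh) * 2009-12-03 2010-06-02 联动优势科技有限公司 Data synchronization method and apparatus
CN102129478A (zh) * 2011-04-26 2011-07-20 广州从兴电子开发有限公司 Database synchronization method and system
CN103793514A (zh) * 2014-02-11 2014-05-14 华为技术有限公司 Database synchronization method and database
CN104933127A (zh) * 2015-06-12 2015-09-23 北京京东尚科信息技术有限公司 MariaDB-based cross-data-center database synchronization device and method
CN105955970A (zh) * 2015-11-12 2016-09-21 中国银联股份有限公司 Database replication method and apparatus based on log parsing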

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8245203B2 (en) * 2007-06-29 2012-08-14 Alcatel Lucent Logging system and method for computer software
CN102156720A (zh) * 2011-03-28 2011-08-17 中国人民解放军国防科学技术大学 Method, apparatus, and system for data recovery
CN102841897B (zh) * 2011-06-23 2016-03-02 阿里巴巴集团控股有限公司 Method, apparatus, and system for incremental data extraction
CN102346775A (zh) * 2011-09-26 2012-02-08 苏州博远容天信息科技有限公司 Log-based heterogeneous multi-source database synchronization method
CN103246745B (zh) * 2013-05-22 2016-03-09 中国工商银行股份有限公司 Data-warehouse-based data processing apparatus and method

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
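CN111444199A (zh) * 2019-01-17 2020-07-24 阿里巴巴集团控股有限公司 Data processing method and apparatus, storage medium, and processor
CN111444199B (zh) * 2019-01-17 2023-11-14 阿里巴巴集团控股有限公司 Data processing method and apparatus, storage medium, and processor
CN110674147A (zh) * 2019-08-28 2020-01-10 视联动力信息技术股份有限公司 Data processing method, apparatus, and computer-readable storage medium
CN113111050A (zh) * 2021-04-27 2021-07-13 山东福生佳信科技股份有限公司 Database comparison method and apparatus
CN113111050B (zh) * 2021-04-27 2023-07-07 山东福生佳信科技股份有限公司 Database comparison method and apparatus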

Also Published As

Publication number Publication date
CN105955970A (zh) 2016-09-21
TWI628551B (zh) 2018-07-01
TW201717074A (zh) 2017-05-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16863607; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 16863607; Country of ref document: EP; Kind code of ref document: A1)