CN116821237A - Database incremental data synchronization method, device, computer equipment and storage medium - Google Patents

Database incremental data synchronization method, device, computer equipment and storage medium

Info

Publication number
CN116821237A
Authority
CN
China
Prior art keywords
data
database
incremental data
incremental
synchronized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310723655.0A
Other languages
Chinese (zh)
Inventor
田钺
孔庆波
王益彰
孙收余
李文科
甘润东
缪新萍
董若烟
姚舜
朱昌会
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Power Grid Co Ltd
Original Assignee
Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Power Grid Co Ltd
Priority to CN202310723655.0A
Publication of CN116821237A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/219 Managing data history or versioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a database incremental data synchronization method and apparatus, a computer device, a storage medium, and a computer program product, in the technical field of databases. The method comprises: querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization; querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database; loading the incremental data to be synchronized into a temporary table preset in the destination database; and performing deduplication on the incremental data in the temporary table, and synchronizing the deduplicated incremental data to the destination database. The method enables incremental data synchronization between heterogeneous databases.

Description

Database incremental data synchronization method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of database technologies, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for synchronizing incremental data of a database.
Background
In the current era of big data, data volumes are growing explosively and data types and data processing requirements are increasingly diverse, which has given rise to a wide variety of databases. The two common ways of synchronizing data between databases are full data synchronization and incremental data synchronization.
To be applicable to synchronization between heterogeneous databases, incremental data synchronization conventionally requires the incremental data to be screened out by a timestamp field in the database table. However, heterogeneous data sources are designed by different parties without a uniform standard, and table structure design focuses on implementing service functions, so not every database table has a timestamp field. If incremental data synchronization is still desired, the structure of the original database table has to be changed, which is difficult to implement.
Disclosure of Invention
In view of the above technical problem that incremental data synchronization is difficult to implement, it is necessary to provide a database incremental data synchronization method, apparatus, computer device, computer-readable storage medium, and computer program product.
In a first aspect, the present application provides a database incremental data synchronization method. The method comprises the following steps:
querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization;
querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database;
loading the incremental data to be synchronized into a temporary table preset in the destination database;
and performing deduplication on the incremental data in the temporary table, and synchronizing the deduplicated incremental data to the destination database.
In one embodiment, the incremental data to be synchronized includes incremental data in at least one table of the source database; the loading the incremental data to be synchronized into a temporary table preset in the target database comprises the following steps:
acquiring an archive log corresponding to the incremental data to be synchronized;
converting incremental data of at least one table to be synchronized in the archive log into a structured file to obtain at least one structured file;
and loading the at least one structured file into the temporary table in sequence.
In one embodiment, the loading the at least one structured file into the temporary table sequentially includes:
recording the time of generating each structured file;
and sequentially loading each structured file into a preset temporary table according to the time sequence of generating each structured file.
In one embodiment, the performing deduplication on the incremental data in the temporary table includes:
determining an identification field in the source database that can uniquely identify each piece of data;
grouping the incremental data in the temporary table based on the identification field to obtain grouped data;
and screening out the latest data in each group of the grouped data, and determining the latest data as the deduplicated incremental data.
In one embodiment, the screening out the latest data in each group of the grouped data and determining the latest data as the deduplicated incremental data includes:
querying the operation log, and determining a timestamp corresponding to the grouped data;
and screening out the latest data in each group of the grouped data according to the timestamp, and determining the latest data as the deduplicated incremental data.
In one embodiment, the synchronizing the deduplicated incremental data to the destination database includes:
acquiring a data change type of the deduplicated incremental data;
synchronizing the deduplicated incremental data to a formal table of the destination database according to the identification field and the data change type; the formal table is a synchronized data table whose table structure is consistent with that of the table in the source database.
In a second aspect, the application further provides a database incremental data synchronization apparatus. The apparatus comprises:
a timestamp query module, configured to query a synchronization log between a source database and a destination database and determine a historical timestamp of the last execution of database data synchronization;
an incremental data determining module, configured to query an operation log of the source database according to the historical timestamp and determine incremental data to be synchronized in the source database;
a temporary table loading module, configured to load the incremental data to be synchronized into a temporary table preset in the destination database;
and a data synchronization module, configured to perform deduplication on the incremental data in the temporary table and synchronize the deduplicated incremental data to the destination database.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which, when executing the computer program, performs the following steps:
querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization;
querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database;
loading the incremental data to be synchronized into a temporary table preset in the destination database;
and performing deduplication on the incremental data in the temporary table, and synchronizing the deduplicated incremental data to the destination database.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the following steps:
querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization;
querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database;
loading the incremental data to be synchronized into a temporary table preset in the destination database;
and performing deduplication on the incremental data in the temporary table, and synchronizing the deduplicated incremental data to the destination database.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the following steps:
querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization;
querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database;
loading the incremental data to be synchronized into a temporary table preset in the destination database;
and performing deduplication on the incremental data in the temporary table, and synchronizing the deduplicated incremental data to the destination database.
With the above database incremental data synchronization method, apparatus, computer device, storage medium, and computer program product, the synchronization log between the source database and the destination database is queried, and the historical timestamp of the last execution of database synchronization is determined, so as to determine the timestamp range of the incremental data for the current synchronization. Then, the operation log of the source database is queried according to the historical timestamp, and the incremental data to be synchronized in the source database is determined, so that incremental data synchronization, which saves time compared with full data synchronization, can be performed. Next, the incremental data to be synchronized is loaded into a temporary table preset in the destination database, and the incremental data in the temporary table is deduplicated to avoid redundant data. Finally, the deduplicated incremental data is synchronized to the destination database, achieving data synchronization between the source database and the destination database. This method not only avoids the inefficient full data synchronization mode, but also achieves incremental data synchronization between heterogeneous databases without changing the structure of source database tables that lack a timestamp field, which improves the generality of incremental data synchronization between heterogeneous databases.
Drawings
FIG. 1 is a diagram of an application environment for a database incremental data synchronization method in one embodiment;
FIG. 2 is a flow diagram of a method of database incremental data synchronization in one embodiment;
FIG. 3 is a flow chart of a method of database incremental data synchronization in another embodiment;
FIG. 4 is a flowchart of a specific example of a database incremental data synchronization method in one embodiment;
FIG. 5 is a block diagram of a database incremental data synchronization apparatus in one embodiment;
FIG. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The database incremental data synchronization method provided by the embodiments of the application can be applied to the application environment shown in FIG. 1, in which the server 102 communicates with a source database 104 and a destination database 106. The source database 104 and the destination database 106 may be integrated on the server 102, or may be located in the cloud or on other network servers. The source database 104 and the destination database 106 may be heterogeneous databases. The server 102 queries the synchronization log stored in its data storage system, determines the historical timestamp of the last execution of database data synchronization between the source database 104 and the destination database 106, queries the operation log of the source database 104 based on the historical timestamp, determines the incremental data to be synchronized in the source database 104, loads the incremental data to be synchronized into the temporary table of the destination database 106, deduplicates the incremental data in the temporary table, and synchronizes the deduplicated incremental data to the destination database 106. The server 102 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In one embodiment, as shown in FIG. 2, a database incremental data synchronization method is provided. The method is described by taking its application to the server 102 in FIG. 1 as an example, and includes the following steps:
Step S201, querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization.
The synchronization log is a log that records historical data synchronization information between the source database and the destination database.
Illustratively, the server 102 starts the current database incremental data synchronization according to an incremental data synchronization period set by the user. The server 102 first queries the synchronization log to determine the historical timestamp of the last data synchronization between the source database and the destination database, that is, the timestamp in the source database up to which the current data of the destination database has been synchronized. In addition, after the current incremental data synchronization is completed, the server 102 records information about it in the synchronization log, including the timestamp of the synchronized data in the source database, to facilitate the next incremental data synchronization.
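As a minimal sketch of step S201, the synchronization log can be modeled as a small table that stores, per database pair, the source timestamp reached by each completed synchronization. The table name sync_log, its columns, and the use of SQLite are assumptions made for illustration only; the application does not specify how the synchronization log is stored.

```python
import sqlite3

def last_sync_timestamp(conn, source, destination):
    """Return the newest recorded source timestamp for the pair, or None."""
    row = conn.execute(
        "SELECT MAX(synced_up_to) FROM sync_log "
        "WHERE source_db = ? AND destination_db = ?",
        (source, destination),
    ).fetchone()
    return row[0]

def record_sync(conn, source, destination, synced_up_to):
    """Append one entry after a synchronization has completed."""
    conn.execute(
        "INSERT INTO sync_log VALUES (?, ?, ?)",
        (source, destination, synced_up_to),
    )
    conn.commit()

# Hypothetical in-memory log with two completed synchronizations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sync_log "
    "(source_db TEXT, destination_db TEXT, synced_up_to INTEGER)"
)
record_sync(conn, "src", "dst", 100)
record_sync(conn, "src", "dst", 250)
history_ts = last_sync_timestamp(conn, "src", "dst")
```

When no synchronization has been recorded for a pair, the function returns None, which a caller could treat as the signal to perform a full synchronization first.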
Step S202, querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database.
Illustratively, the operation log of the source database records information about operations on the database, including a timestamp that increases as operations occur. The server 102 queries the operation log of the source database according to the historical timestamp, determines the operation information recorded after the historical timestamp, and, based on that operation information, determines the incremental data of the source database to be synchronized in the current incremental data synchronization. For example, the incremental data is the data of row a in table a in the source database.
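Step S202 amounts to filtering the operation log by the historical timestamp. The sketch below assumes a simplified, hypothetical log entry type (LogEntry and its fields are not taken from the application) and assumes that an entry whose timestamp equals the historical timestamp has already been synchronized.

```python
from dataclasses import dataclass

@dataclass
class LogEntry:
    timestamp: int      # operation timestamp, increasing over time
    table: str
    row_id: str
    change_type: str    # "insert", "update" or "delete"

def entries_after(operation_log, history_ts):
    """Keep only operations recorded strictly after the last synchronization."""
    recent = [entry for entry in operation_log if entry.timestamp > history_ts]
    return sorted(recent, key=lambda entry: entry.timestamp)

operation_log = [
    LogEntry(90, "a", "row-1", "insert"),    # already synchronized
    LogEntry(120, "a", "row-1", "update"),
    LogEntry(110, "a", "row-2", "insert"),
]
to_sync = entries_after(operation_log, history_ts=100)
```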
Step S203, loading the incremental data to be synchronized into a temporary table preset in the destination database.
Illustratively, the server 102 creates a temporary table in the destination database in advance and loads the incremental data to be synchronized from the source database into the temporary table, in preparation for the subsequent deduplication of the incremental data. In addition to the full content of the incremental data, the temporary table contains the corresponding timestamp, the data change type, and the like. After use, the temporary table may be deleted directly or its contents may be emptied.
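The temporary table of step S203 can be sketched as follows, with SQLite standing in for the destination database. The table name staging and its columns are illustrative assumptions; the only points taken from the text are that the table holds the row content plus a timestamp and a data change type, and that it is emptied or dropped after use.

```python
import sqlite3

destination = sqlite3.connect(":memory:")
# Row content (id, a) plus the metadata needed later for deduplication.
destination.execute(
    "CREATE TEMP TABLE staging "
    "(id TEXT, a INTEGER, change_type TEXT, op_ts INTEGER)"
)

increments = [
    ("row-1", 15, "update", 110),
    ("row-1", 8, "update", 120),
    ("row-2", 3, "insert", 115),
]
destination.executemany("INSERT INTO staging VALUES (?, ?, ?, ?)", increments)
staged = destination.execute("SELECT COUNT(*) FROM staging").fetchone()[0]

# After the synchronization, the table is emptied for the next run.
destination.execute("DELETE FROM staging")
remaining = destination.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
```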
Step S204, performing deduplication on the incremental data in the temporary table, and synchronizing the deduplicated incremental data to the destination database.
In a single incremental data synchronization, the incremental data obtained from the operation log of the source database includes all database changes made in the interval between two incremental data synchronizations. These changes may include several pieces of change data caused by multiple changes to the same row of the same table. Such change data needs to be deduplicated: the latest change data is screened out as the deduplicated incremental data and is synchronized to the corresponding position in the destination database. For example, at the historical timestamp there is a piece of data with a=10; in the current incremental data synchronization, the piece of data is found to have undergone two changes, with a=15 after the first change and a=8 after the second; after deduplication, a=8 is determined to be synchronized to the destination database.
In the above database incremental data synchronization method, the synchronization log between the source database and the destination database is queried, and the historical timestamp of the last execution of database synchronization is determined, so as to determine the timestamp range of the incremental data for the current synchronization. Then, the operation log of the source database is queried according to the historical timestamp, and the incremental data to be synchronized in the source database is determined, so that incremental data synchronization, which saves time compared with full data synchronization, can be performed. Next, the incremental data to be synchronized is loaded into a temporary table preset in the destination database, and the incremental data in the temporary table is deduplicated to avoid redundant data. Finally, the deduplicated incremental data is synchronized to the destination database, achieving data synchronization between the source database and the destination database. This method not only avoids the inefficient full data synchronization mode, but also achieves incremental data synchronization between heterogeneous databases without changing the structure of source database tables that lack a timestamp field, which improves the generality of incremental data synchronization between heterogeneous databases.
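The a=10, a=15, a=8 example can be worked through in a few lines. This is only a sketch of the "latest change wins" idea; the dictionary keys id, a, and ts are hypothetical names, and the timestamps are made-up values.

```python
def deduplicate(changes):
    """Keep, for each row id, only the change with the newest timestamp."""
    latest = {}
    for change in sorted(changes, key=lambda c: c["ts"]):
        latest[change["id"]] = change   # later changes overwrite earlier ones
    return list(latest.values())

# The row held a=10 at the historical timestamp, then changed twice.
changes = [
    {"id": "row-1", "a": 15, "ts": 110},
    {"id": "row-1", "a": 8, "ts": 120},
]
result = deduplicate(changes)
```

Only the second change, with a=8, remains and would be written to the destination database.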
In one embodiment, the incremental data to be synchronized includes incremental data in at least one table of the source database; in step S203, loading the incremental data to be synchronized into a temporary table preset in the destination database specifically includes: acquiring an archive log corresponding to the incremental data to be synchronized; converting the incremental data of the at least one table to be synchronized in the archive log into structured files to obtain at least one structured file; and loading the at least one structured file into the temporary table in sequence.
The archive log is a backup of inactive redo logs that can be used to fully restore the database.
Illustratively, the archive log corresponding to the incremental data to be synchronized is acquired; it records the images before and after each data change, the data change type, information on the transaction that caused the change, information on the table containing the data, and the like. The source database stores at least one table, and the data changes of the tables are independent of one another. According to the archive log, the incremental data of each table is converted into a corresponding structured file, and the structured files may be stored outside the source database and the destination database. The structured files corresponding to the tables are then loaded into the temporary table of the destination database one after another; that is, the next table is not loaded into the temporary table until synchronization of the previous table has been completed.
In this embodiment, structured files of the incremental data to be synchronized are constructed from the archive log corresponding to the incremental data to be synchronized of the source database, and the structured files are loaded into the temporary table of the destination database, which facilitates the subsequent deduplication. Meanwhile, a separate structured file is constructed for each table of the source database, and the structured files are loaded into the temporary table in sequence, so that only the data of one table is handled during deduplication, which can effectively improve deduplication efficiency.
Based on the above embodiment, further, the loading the at least one structured file into the temporary table in sequence specifically includes: recording the time at which each structured file is generated; and loading the structured files into the preset temporary table in the order of their generation times.
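The per-table structured files and their time-ordered loading can be sketched as below. CSV is used as the structured format and the generation time is kept in an in-memory manifest; both choices, and the helper names write_structured_file and load_in_order, are assumptions for illustration. Parsing a real archive log is outside the scope of this sketch.

```python
import csv
import tempfile
from pathlib import Path

def write_structured_file(directory, table, rows, generated_at):
    """Write one table's incremental rows to CSV and record its generation time."""
    path = Path(directory) / f"{table}_{generated_at}.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return {"table": table, "path": path, "generated_at": generated_at}

def load_in_order(manifest, load_one):
    """Load the files one at a time, oldest generation time first."""
    for item in sorted(manifest, key=lambda m: m["generated_at"]):
        with item["path"].open(newline="") as f:
            load_one(item["table"], list(csv.DictReader(f)))

loaded = []
with tempfile.TemporaryDirectory() as directory:
    manifest = [
        write_structured_file(directory, "orders", [{"id": "1", "qty": "5"}], 2),
        write_structured_file(directory, "stock", [{"id": "A", "qty": "10"}], 1),
    ]
    load_in_order(manifest, lambda table, rows: loaded.append((table, rows)))
```

Here load_one stands in for "load into the temporary table, deduplicate, and synchronize", so each table is fully handled before the next file is read.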
In one embodiment, performing deduplication on the incremental data in the temporary table in step S204 specifically includes: determining an identification field in the source database that can uniquely identify each piece of data; grouping the incremental data in the temporary table based on the identification field to obtain grouped data; and screening out the latest data in each group of the grouped data, and determining the latest data as the deduplicated incremental data.
For example, in the interval between two incremental data synchronizations, one piece of data in the source database may be changed multiple times; the operation log records each of these changes, so the incremental data obtained from the operation log may include multiple pieces of incremental data for that one piece of data. Therefore, an identification field that can uniquely identify each piece of data in the source database is determined; the identification field need not be a field set as a primary key in the source database. The incremental data in the temporary table is grouped based on the identification field to obtain grouped data, that is, the data in each group shares the same value of the identification field. The latest data in each group is determined as the deduplicated incremental data. For example, in a data table recording commodity inventory, the commodity model is used as the identification field, and the inventory information of commodity A is changed twice: the first change is that the remaining quantity of commodity A after goods leave the warehouse is 10, corresponding to the first piece of incremental data of commodity A; the second change is that the remaining quantity of commodity A after goods enter the warehouse is 20, corresponding to the second piece of incremental data of commodity A. After grouping by commodity model, the two pieces of incremental data of commodity A fall in the same group, and the second piece is determined to be the latest incremental data and is used as the deduplicated incremental data.
In this embodiment, the multiple pieces of incremental data generated by multiple changes to the same piece of data are identified based on the identification field, and the latest one is screened out as the incremental data finally used for data synchronization, which improves data synchronization efficiency and reduces data redundancy.
Based on the above embodiment, further, the screening out the latest data in each group of the grouped data and determining the latest data as the deduplicated incremental data specifically includes: querying the operation log, and determining a timestamp corresponding to the grouped data; and screening out the latest data in each group of the grouped data according to the timestamp, and determining the latest data as the deduplicated incremental data.
For example, for a source database in which no timestamp field is built, a timestamp is matched to each piece of incremental data based on the operation log of the source database, and a timestamp field may be built in the temporary table, so that each piece of incremental data there carries its corresponding timestamp. The latest data in each group is then determined based on the timestamp.
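The grouping by identification field and the timestamp-based selection can also be expressed inside the database holding the temporary table. The sketch below reuses the commodity inventory example with SQLite as a stand-in; the table name staging, the column names, and the timestamp values are assumptions, and the application does not state which SQL, if any, is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging "
    "(model TEXT, remaining INTEGER, change_type TEXT, op_ts INTEGER)"
)
conn.executemany(
    "INSERT INTO staging VALUES (?, ?, ?, ?)",
    [
        ("A", 10, "update", 101),   # commodity A after goods leave the warehouse
        ("A", 20, "update", 102),   # commodity A after goods enter the warehouse
        ("B", 5, "insert", 103),
    ],
)

# Group by the identification field and keep the row with the newest timestamp.
LATEST_PER_GROUP = """
SELECT s.model, s.remaining, s.change_type, s.op_ts
FROM staging AS s
JOIN (SELECT model, MAX(op_ts) AS latest_ts FROM staging GROUP BY model) AS g
  ON s.model = g.model AND s.op_ts = g.latest_ts
ORDER BY s.model
"""
deduplicated = conn.execute(LATEST_PER_GROUP).fetchall()
```

If two changes to the same row could share a timestamp, this query would return both, so a tie-breaker such as a log sequence number would be needed in that case.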
In one embodiment, the synchronizing the deduplicated incremental data to the destination database specifically includes: acquiring a data change type of the deduplicated incremental data; and synchronizing the deduplicated incremental data to a formal table of the destination database according to the identification field and the data change type; the formal table is a synchronized data table whose table structure is consistent with that of the table in the source database.
Illustratively, the data change type of the deduplicated incremental data, such as deletion, update, or insertion, is obtained from the operation log, and the deduplicated incremental data is synchronized to the formal table of the destination database in a manner that depends on its data change type. The formal table is a table whose structure is consistent with that of the table storing the incremental data to be synchronized in the source database, and it contains the previously synchronized data. Further, for incremental data whose data change type is deletion, the data in the formal table of the destination database is deleted based on the identification field; for incremental data whose data change type is update, the data in the formal table of the destination database is modified based on the identification field; and for incremental data whose data change type is insertion, the incremental data is added to the formal table of the destination database based on the identification field.
In this embodiment, incremental data is synchronized to the destination database in different manners according to its data change type, thereby achieving data synchronization between heterogeneous databases.
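Applying the deduplicated incremental data by data change type can be sketched as a small dispatch keyed on the identification field. The formal table stock, its columns, and the change dictionaries are hypothetical; SQLite again stands in for the destination database.

```python
import sqlite3

def apply_change(conn, change):
    """Apply one deduplicated change to the formal table by its change type."""
    key = change["model"]   # identification field
    if change["change_type"] == "delete":
        conn.execute("DELETE FROM stock WHERE model = ?", (key,))
    elif change["change_type"] == "update":
        conn.execute(
            "UPDATE stock SET remaining = ? WHERE model = ?",
            (change["remaining"], key),
        )
    elif change["change_type"] == "insert":
        conn.execute(
            "INSERT INTO stock (model, remaining) VALUES (?, ?)",
            (key, change["remaining"]),
        )
    else:
        raise ValueError(f"unknown change type: {change['change_type']}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (model TEXT PRIMARY KEY, remaining INTEGER)")
conn.executemany("INSERT INTO stock VALUES (?, ?)", [("A", 10), ("B", 3)])

for change in [
    {"model": "A", "change_type": "update", "remaining": 20},
    {"model": "B", "change_type": "delete"},
    {"model": "C", "change_type": "insert", "remaining": 7},
]:
    apply_change(conn, change)
rows = conn.execute("SELECT model, remaining FROM stock ORDER BY model").fetchall()
```

One case the sketch does not handle: a row inserted and then updated within the same interval survives deduplication as an update for a row the formal table does not yet contain, so a practical implementation might fall back to an insert when the update matches no row.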
In another embodiment, as shown in FIG. 3, a database incremental data synchronization method is provided, including the following steps:
Step S301, querying a synchronization log between a source database and a destination database, and determining a historical timestamp of the last execution of database data synchronization;
Step S302, querying an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database;
Step S303, acquiring an archive log corresponding to the incremental data to be synchronized, and converting the incremental data of at least one table to be synchronized in the archive log into structured files to obtain at least one structured file;
Step S304, recording the time at which each structured file is generated, and loading the structured files into a temporary table preset in the destination database in the order of their generation times;
Step S305, determining an identification field in the source database that can uniquely identify each piece of data;
Step S306, grouping the incremental data in the temporary table based on the identification field to obtain grouped data;
Step S307, querying the operation log, and determining a timestamp corresponding to the grouped data; screening out the latest data in each group of the grouped data according to the timestamp, and determining the latest data as the deduplicated incremental data;
Step S308, acquiring a data change type of the deduplicated incremental data, and synchronizing the deduplicated incremental data to a formal table of the destination database according to the identification field and the data change type.
In this embodiment, the database incremental data synchronization method not only avoids the inefficient full data synchronization mode, but also achieves incremental data synchronization between heterogeneous databases without changing the structure of source database tables that lack a timestamp field, which improves the generality of incremental data synchronization between heterogeneous databases. In addition, deduplicating the incremental data to be synchronized one table at a time avoids mixing data of different tables and improves deduplication efficiency; and performing incremental data synchronization in different manners based on the data change type reduces redundant operations, so that incremental data synchronization is performed efficiently.
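The flow of steps S301 to S308 can be condensed into one in-memory sketch for a single table. Everything here is a simplification under stated assumptions: the synchronization log is a list of timestamps, the operation log is a list of dictionaries, the temporary table is a sorted list, the formal table is a dictionary keyed by the identification field, and insertion and update are collapsed into a single write. The function name sync_once and all field names are hypothetical.

```python
def sync_once(operation_log, formal_table, sync_log):
    """Run one incremental synchronization and return the number of rows applied."""
    history_ts = max(sync_log, default=0)                                 # S301
    increments = [e for e in operation_log if e["ts"] > history_ts]       # S302
    if not increments:
        return 0
    staging = sorted(increments, key=lambda e: e["ts"])                   # S303-S304
    latest = {}
    for entry in staging:                                                 # S305-S307
        latest[entry["id"]] = entry       # newest change per identification field
    for key, entry in latest.items():                                     # S308
        if entry["type"] == "delete":
            formal_table.pop(key, None)
        else:                             # insert and update both write the value
            formal_table[key] = entry["value"]
    sync_log.append(staging[-1]["ts"])    # record how far this run reached
    return len(latest)

formal_table = {"A": 10}
sync_log = [100]
operation_log = [
    {"ts": 90, "id": "A", "type": "update", "value": 10},   # already synchronized
    {"ts": 110, "id": "A", "type": "update", "value": 15},
    {"ts": 120, "id": "A", "type": "update", "value": 8},
    {"ts": 130, "id": "B", "type": "insert", "value": 4},
    {"ts": 140, "id": "B", "type": "delete", "value": None},
]
applied = sync_once(operation_log, formal_table, sync_log)
applied_again = sync_once(operation_log, formal_table, sync_log)
```

The second call finds nothing newer than the recorded timestamp and applies no changes, which is the behavior the synchronization log is meant to guarantee.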
In order to facilitate understanding of the embodiments of the present application by those skilled in the art, the present application is described below with reference to the specific example of fig. 4; in this example the source database is an Oracle database and the destination database is a Greenplum database. The specific steps are as follows:
Step S401, creating, in the Greenplum database, an empty formal table with the same table structure as the table in the Oracle database.
Step S402, exporting the full data of the Oracle database into a full external data file.
Step S403, loading the full external data file into the empty formal table created in step S401.
Step S404, determining an identification field capable of uniquely identifying each piece of data in the Oracle database.
Step S405, parsing the data in the Oracle database, table by table and based on the archive log of each table, into incremental external data files.
Step S406, scanning the incremental external data files, and loading the data in them into the temporary table of the Greenplum database in the order in which the files were generated.
Step S407, de-duplicating the data in the temporary table according to the data change type, the identification field and the timestamp stored in the temporary table, and synchronizing the de-duplicated data into the corresponding formal table of the Greenplum database.
Step S408, clearing the incremental external data files whose synchronization has been completed.
Steps S401 to S403 are initialization steps, that is, full data synchronization between the two databases; steps S404 to S408 are the steps performed for each subsequent incremental data synchronization, and each incremental data synchronization may be a periodic process whose time interval is set by the user, or an ad hoc process performed in response to a user's demand.
Further, the number of empty formal tables created in step S401 needs to correspond to the number of data tables in the Oracle database (the source database).
In step S402, a tool such as sqluldr/dump may be used to export the full data in the current Oracle database as a structured data file (i.e., the full external data file).
In step S405, the GoldenGate tool of the Oracle database may be used to extract the incremental change data of the Oracle database into a trail file (an intermediate file whose data format can be quickly and accurately converted between heterogeneous databases) according to the SCN number (i.e., timestamp) in the operation log. The replication process of the Oracle database is then configured to convert the incremental data corresponding to the change events of a table to be synchronized in the trail file into a structured csv file (the incremental external data csv file). For each data change event of the data to be synchronized, the csv file contains the timestamp, the data change type (insert, update, or delete), and the entire content of the changed data. The csv file may be named in the form pump_<table schema name>_<table name>_<file generation time>.
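As an illustration of how such file names could drive the loading order, the following sketch assumes that the naming scheme is pump_<table schema name>_<table name>_<file generation time>, that the generation time is written as a 14-digit `YYYYMMDDHHMMSS` string, and that the file has a `.csv` extension. The application does not specify the time format or extension, and the helper names are hypothetical.

```python
from datetime import datetime
from typing import List

# Assumed format; the source does not say how the generation time is encoded.
TIME_FORMAT = "%Y%m%d%H%M%S"


def build_name(schema: str, table: str, generated_at: datetime) -> str:
    """Compose a file name following the assumed naming scheme."""
    return f"pump_{schema}_{table}_{generated_at.strftime(TIME_FORMAT)}.csv"


def generation_time(file_name: str) -> datetime:
    """Read the generation time back from the last '_'-separated component."""
    stem = file_name.rsplit(".", 1)[0]
    return datetime.strptime(stem.rsplit("_", 1)[1], TIME_FORMAT)


def in_load_order(file_names: List[str]) -> List[str]:
    """Sort files so that the oldest is loaded first."""
    return sorted(file_names, key=generation_time)


names = [
    build_name("hr", "emp", datetime(2023, 6, 16, 10, 5, 0)),
    build_name("hr", "emp", datetime(2023, 6, 16, 9, 0, 0)),
]
ordered = in_load_order(names)
```

Taking the time from the last underscore-separated component means schema or table names that themselves contain underscores do not disturb the ordering.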
In step S406, compared with the formal table of step S401, the temporary table in the Greenplum database additionally contains the timestamp and the data change type (insert, update, or delete) of each data change event of the data to be synchronized. The temporary table loads only one incremental external data file at a time, in the order in which the incremental external data files were generated; the next incremental external data file is loaded only after the synchronization of the current one has completed.
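The relationship between the formal table and the temporary table can be sketched as follows. An in-memory SQLite database is used here purely as a stand-in for the destination database, and the table names `emp` and `emp_tmp`, the column names, and the csv column order are assumptions for illustration rather than details from the application.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the destination database

# Formal table: same structure as the (hypothetical) source table.
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
# Temporary table: the formal-table columns plus timestamp and change type.
conn.execute(
    "CREATE TABLE emp_tmp (change_ts INTEGER, change_type TEXT, "
    "id INTEGER, name TEXT)"
)


def load_file(connection: sqlite3.Connection, csv_text: str) -> int:
    """Load the rows of one incremental csv file into the temporary table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    connection.executemany("INSERT INTO emp_tmp VALUES (?, ?, ?, ?)", rows)
    return len(rows)


# Hypothetical file content: timestamp, change type, then the row itself.
csv_text = "100,insert,1,alice\n105,update,1,alicia\n"
loaded = load_file(conn, csv_text)
staged = conn.execute(
    "SELECT change_ts, change_type, id, name FROM emp_tmp ORDER BY change_ts"
).fetchall()
```

Calling `load_file` once per file, in generation order, and synchronizing before the next call mirrors the one-file-at-a-time behaviour described above.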
In step S407, the data in the temporary table is de-duplicated based on the identification field and the timestamp recorded in the temporary table; the de-duplicated data in the temporary table is then synchronized into the corresponding formal table of the Greenplum database according to the data change type recorded in the temporary table.
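One possible way to apply the de-duplicated changes is sketched below, again with an in-memory SQLite database standing in for the destination database and with hypothetical table and column names. The sketch assumes that a surviving insert or update is applied as "delete the old row, then insert the new one", and that a surviving delete only removes the row; the application does not prescribe these exact statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the destination database
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE emp_tmp (change_ts INTEGER, change_type TEXT, "
    "id INTEGER, name TEXT)"
)
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO emp_tmp VALUES (?, ?, ?, ?)", [
    (100, "update", 1, "alicia"),
    (101, "insert", 3, "carol"),
    (102, "update", 3, "caroline"),
    (103, "delete", 2, "bob"),
])


def sync_temp_to_formal(connection: sqlite3.Connection) -> None:
    """De-duplicate the temporary table and apply it to the formal table."""
    latest = {}
    query = ("SELECT change_ts, change_type, id, name "
             "FROM emp_tmp ORDER BY change_ts")
    # Rows arrive oldest first, so the last row seen per id is the newest.
    for _ts, change_type, row_id, name in connection.execute(query):
        latest[row_id] = (change_type, name)
    with connection:  # commit on success, roll back on error
        for row_id, (change_type, name) in latest.items():
            connection.execute("DELETE FROM emp WHERE id = ?", (row_id,))
            if change_type in ("insert", "update"):
                connection.execute(
                    "INSERT INTO emp (id, name) VALUES (?, ?)", (row_id, name)
                )
        connection.execute("DELETE FROM emp_tmp")  # empty the temporary table


sync_temp_to_formal(conn)
final = conn.execute("SELECT id, name FROM emp ORDER BY id").fetchall()
```

Treating insert and update alike matters when a record is inserted and then updated within the same file: only the update survives de-duplication, yet the formal table does not contain the row yet.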
In step S408, after each incremental external data file has been successfully synchronized, the file is immediately removed from the task queue of data files to be synchronized, and the next incremental external data file is then loaded; if an abnormal error occurs while synchronizing an incremental external data file, the synchronization task for that file is terminated, and the id of the file whose synchronization failed is recorded for subsequent error correction by the user.
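The queue handling described for step S408 might look like the following sketch. The function name `drain_queue` and the callback `sync_one` are hypothetical, and the sketch assumes that processing stops at the first failing file so that later files are not applied out of order; the application only states that the failing file's task is terminated and its id recorded.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Tuple


def drain_queue(queue: Deque[str],
                sync_one: Callable[[str], None]
                ) -> Tuple[List[str], Optional[str]]:
    """Synchronize queued files in order; return (done files, failed id)."""
    done: List[str] = []
    while queue:
        file_id = queue[0]
        try:
            sync_one(file_id)
        except Exception:
            # Leave the failed file at the head of the queue and report its id.
            return done, file_id
        queue.popleft()  # remove the file only after it synchronized cleanly
        done.append(file_id)
    return done, None


def fake_sync(file_id: str) -> None:
    """Hypothetical synchronization step that fails for one file."""
    if file_id == "f2":
        raise RuntimeError("load failed")


pending = deque(["f1", "f2", "f3"])
done, failed = drain_queue(pending, fake_sync)
```

Because the failed file stays at the head of the queue, the run can be resumed from the same file once the user has corrected the error.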
It should be understood that, although the steps in the flowcharts related to the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution order of the steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed one after another but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides a database incremental data synchronization device for implementing the database incremental data synchronization method described above. The solution provided by the device is implemented in a manner similar to that described for the method, so for the specific limitations in the one or more embodiments of the database incremental data synchronization device provided below, reference may be made to the limitations of the database incremental data synchronization method above, which are not repeated here.
In one embodiment, as shown in fig. 5, there is provided a database incremental data synchronization apparatus, including: a timestamp query module 501, an incremental data determination module 502, a temporary table loading module 503, and a data synchronization module 504, wherein:
the timestamp query module 501 is configured to query a synchronization log between a source database and a destination database, and determine a historical timestamp of a last execution of database data synchronization;
the incremental data determining module 502 is configured to query an operation log of the source database according to the historical timestamp, and determine incremental data to be synchronized in the source database;
a temporary table loading module 503, configured to load incremental data to be synchronized into a temporary table preset in a destination database;
the data synchronization module 504 is configured to perform deduplication processing on the incremental data in the temporary table, and synchronize the deduplicated incremental data to the destination database.
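The cooperation of the timestamp query module and the incremental data determination module can be sketched as two small functions. The log layouts, the field names `synced_up_to` and `ts`, and the function names are assumptions for illustration; the application does not define the format of the synchronization log or the operation log.

```python
from typing import Any, Dict, List


def last_sync_timestamp(sync_log: List[Dict[str, Any]]) -> int:
    """Return the historical timestamp of the most recent synchronization."""
    return max((entry["synced_up_to"] for entry in sync_log), default=0)


def incremental_data(operation_log: List[Dict[str, Any]],
                     since: int) -> List[Dict[str, Any]]:
    """Select the operation-log entries recorded after the last sync."""
    return [entry for entry in operation_log if entry["ts"] > since]


# Hypothetical logs (not data from the application).
sync_log = [{"run": 1, "synced_up_to": 100}, {"run": 2, "synced_up_to": 200}]
operation_log = [
    {"ts": 150, "table": "emp", "op": "insert"},
    {"ts": 210, "table": "emp", "op": "update"},
    {"ts": 220, "table": "dept", "op": "delete"},
]
since = last_sync_timestamp(sync_log)
pending = incremental_data(operation_log, since)
```

With an empty synchronization log the sketch falls back to timestamp 0, so every entry of the operation log is treated as not yet synchronized.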
In one embodiment, the incremental data to be synchronized includes incremental data in at least one table of the source database, and the temporary table loading module 503 is further configured to obtain an archive log corresponding to the incremental data to be synchronized; converting incremental data of at least one table to be synchronized in the archive log into a structured file to obtain at least one structured file; at least one structured file is loaded into the temporary table in sequence.
In one embodiment, the temporary table loading module 503 is further configured to record a time of generating each structured file; and sequentially loading each structured file into a preset temporary table according to the time sequence of generating each structured file.
In one embodiment, the data synchronization module 504 is further configured to determine an identification field in the source database that can uniquely identify each piece of data; group the incremental data in the temporary table based on the identification field to obtain grouped data; and select the latest data in each group of data and determine the latest data as the de-duplicated incremental data.
In one embodiment, the data synchronization module 504 is further configured to query the operation log to determine the timestamp corresponding to each group of data; and select the latest data in each group of data according to the timestamp and determine the latest data as the de-duplicated incremental data.
In one embodiment, the data synchronization module 504 is further configured to obtain a data change type of the deduplicated incremental data; synchronizing the increment data after the duplication removal to a formal table of a target database according to the identification field and the data change type; the formal table is a synchronized data table consistent with the table structure in the source database.
The modules in the database incremental data synchronization apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used to store a synchronization log between the source database and the destination database. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a database incremental data synchronization method.
It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region.
Those skilled in the art will appreciate that all or part of the methods described above may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the computer program, when executed, may include the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric random access memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory may include random access memory (Random Access Memory, RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples illustrate only a few embodiments of the application, and although they are described specifically and in detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the application shall be determined by the appended claims.

Claims (10)

1. A method for synchronizing incremental data of a database, the method comprising:
inquiring a synchronous log between a source database and a target database, and determining a historical time stamp for last executing database data synchronization;
inquiring an operation log of the source database according to the historical timestamp, and determining incremental data to be synchronized in the source database;
loading the incremental data to be synchronized into a temporary table preset in the target database;
and performing de-duplication processing on the incremental data in the temporary table, and synchronizing the de-duplicated incremental data to the target database.
2. The method of claim 1, wherein the incremental data to be synchronized comprises incremental data in at least one table of the source database; and the loading the incremental data to be synchronized into a temporary table preset in the target database comprises:
acquiring an archive log corresponding to the incremental data to be synchronized;
converting incremental data of at least one table to be synchronized in the archive log into a structured file to obtain at least one structured file;
and loading the at least one structured file into the temporary table in sequence.
3. The method of claim 2, wherein sequentially loading the at least one structured file into the temporary table comprises:
recording the time of generating each structured file;
and sequentially loading each structured file into a preset temporary table according to the time sequence of generating each structured file.
4. The method of claim 1, wherein the performing de-duplication processing on the incremental data in the temporary table comprises:
determining an identification field in the source database that can uniquely identify each piece of data;
grouping the incremental data in the temporary table based on the identification field to obtain grouped data;
and selecting the latest data in each group of the grouped data, and determining the latest data as the de-duplicated incremental data.
5. The method of claim 4, wherein the selecting the latest data in each group of the grouped data and determining the latest data as the de-duplicated incremental data comprises:
querying the operation log, and determining a timestamp corresponding to each group of the grouped data;
and selecting the latest data in each group of the grouped data according to the timestamp, and determining the latest data as the de-duplicated incremental data.
6. The method of claim 4, wherein the synchronizing the de-duplicated incremental data to the target database comprises:
acquiring the data change type of the de-duplicated incremental data;
synchronizing the de-duplicated incremental data to a formal table of the target database according to the identification field and the data change type; the formal table is a synchronized data table consistent with the table structure in the source database.
7. A database incremental data synchronization apparatus, the apparatus comprising:
the timestamp query module is used for querying a synchronization log between a source database and a target database and determining a historical timestamp of the last execution of database data synchronization;
the incremental data determining module is used for inquiring the operation log of the source database according to the historical timestamp and determining incremental data to be synchronized in the source database;
the temporary table loading module is used for loading the incremental data to be synchronized into a temporary table preset in the target database;
and the data synchronization module is used for carrying out de-duplication processing on the incremental data in the temporary table and synchronizing the de-duplicated incremental data into the target database.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202310723655.0A 2023-06-16 2023-06-16 Database incremental data synchronization method, device, computer equipment and storage medium Pending CN116821237A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310723655.0A CN116821237A (en) 2023-06-16 2023-06-16 Database incremental data synchronization method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116821237A true CN116821237A (en) 2023-09-29

Family

ID=88142242

Country Status (1)

Country Link
CN (1) CN116821237A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination