CN113761052A - Database synchronization method and device - Google Patents

Publication number
CN113761052A
Authority
CN
China
Prior art keywords
data record
data
version number
merged
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011363440.5A
Other languages
Chinese (zh)
Inventor
杨雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202011363440.5A
Publication of CN113761052A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/219: Managing data history or versioning
    • G06F16/23: Updating
    • G06F16/2365: Ensuring data consistency and integrity

Abstract

The invention discloses a database synchronization method and device, and relates to the technical field of computers. One embodiment of the method comprises: acquiring an application operation log of an application, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number; acquiring a database log of a source database of the application, wherein the database log comprises one or more second data records, and each second data record comprises a data index and a second version number; and performing a synchronization operation on a target database according to a first data record and a second data record that have the same data index. This embodiment enables more accurate synchronization between the target database and the source database.

Description

Database synchronization method and device
Technical Field
The invention relates to the technical field of computers, in particular to a database synchronization method and a database synchronization device.
Background
Different databases have different functional characteristics, so different databases can be used to manage different aspects of the data. Application requirements often call for data synchronization between different databases. However, when the database types differ, or when the service systems at the source data end and the target data end differ substantially, the two ends cannot be synchronized accurately.
Disclosure of Invention
In view of this, embodiments of the present invention provide a database synchronization method and apparatus, which can implement accurate synchronization between a target database and a source database.
In a first aspect, an embodiment of the present invention provides a database synchronization method, including:
acquiring an application operation log of an application, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number;
acquiring a database log of a source database of the application, wherein the database log comprises one or more second data records, and each second data record comprises a data index and a second version number;
and performing a synchronization operation on a target database according to the first data record and the second data record, wherein the first data record and the second data record have the same data index.
Optionally,
performing the synchronization operation on the target database according to the first data record and the second data record includes:
merging the first data record and the second data record with the same data index to obtain a merged data record, wherein the data index of the merged data record is the same as the data index of the first data record or the second data record, and the reference version number of the merged data record is determined according to the first version number of the first data record and the second version number of the second data record;
and performing the synchronization operation on the target database according to the merged data record.
Optionally,
the merging the first data record and the second data record with the same data index to obtain a merged data record includes:
for each second data record to be merged, determining a target second data record of which the second version number is greater than or equal to the reference version number of the merged data record in the second data records;
and updating the merged data record according to the target second data record, and updating the reference version number of the data merging queue to be the second version number of the target second data record.
Optionally,
the merging the first data record and the second data record with the same data index to obtain a merged data record includes:
determining the maximum second version number of each second data record to be merged;
and under the condition that the maximum second version number is greater than or equal to the reference version number of the merged data record, updating the merged data record according to the second data record with the maximum second version number, and updating the reference version number of the data merging queue to the maximum second version number.
Optionally,
after updating the reference version number of the data merge queue according to the second version number, the method further includes:
for each first data record to be merged, determining a target first data record of which the first version number is greater than the reference version number of the merged data record in the first data record;
and updating the merged data record with the target first data record according to the sequence of the first version number of the target first data record, and updating the reference version number of the data merging queue to the first version number of the target first data record.
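As an illustration only, the two-stage merge described above (database-log records first, then application-log records) might be sketched in Python as follows. The names LogRecord, MergedRecord, and merge, the initial reference version of -1, the use of dict.update to apply field values, and the ascending processing order for database-log records are assumptions made for this sketch; the disclosure does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    data_index: int
    version: int
    fields: dict  # field values carried by this log record


@dataclass
class MergedRecord:
    data_index: int
    reference_version: int = -1  # assumed initial value, below any real version
    fields: dict = field(default_factory=dict)


def merge(merged, db_records, app_records):
    # Second (database-log) records: accept versions >= the reference version.
    for rec in sorted(db_records, key=lambda r: r.version):
        if rec.data_index == merged.data_index and rec.version >= merged.reference_version:
            merged.fields.update(rec.fields)
            merged.reference_version = rec.version
    # First (application-log) records: accept only versions > the reference
    # version, applied in ascending order of version number.
    for rec in sorted(app_records, key=lambda r: r.version):
        if rec.data_index == merged.data_index and rec.version > merged.reference_version:
            merged.fields.update(rec.fields)
            merged.reference_version = rec.version
    return merged


# Hypothetical sample data for data index 65535.
db_log = [LogRecord(65535, 12, {"field1": "a", "field3": 30})]
app_log = [LogRecord(65535, 12, {"field3": 31}), LogRecord(65535, 13, {"field3": 32})]
result = merge(MergedRecord(65535), db_log, app_log)
```

In this sketch the application-log record with version 12 is skipped because the database log has already supplied that version, while the record with version 13 is applied on top of it.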
Optionally,
after updating the reference version number of the data merge queue according to the second version number, the method further includes:
determining the maximum first version number of each first data record to be merged;
and under the condition that the maximum first version number is larger than the reference version number of the merged data record, updating the merged data record according to the first data record with the maximum first version number, and updating the reference version number of the data merging queue to be the maximum first version number.
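The maximum-version variant can be sketched in a similar way. In the following Python sketch, which is illustrative only, the function name merge_by_max, the representation of each record as a (version, fields) tuple, and the overlay of field values with dictionary unpacking are assumptions and are not taken from the disclosure.

```python
def merge_by_max(reference_version, merged_fields, db_records, app_records):
    """Merge using only the highest-version record from each log.

    Each record is a (version, fields) tuple for one data index.
    """
    if db_records:
        version, fields = max(db_records, key=lambda r: r[0])
        # Database log: update when the maximum version >= the reference version.
        if version >= reference_version:
            merged_fields = {**merged_fields, **fields}
            reference_version = version
    if app_records:
        version, fields = max(app_records, key=lambda r: r[0])
        # Application log: update only when the maximum version > the reference version.
        if version > reference_version:
            merged_fields = {**merged_fields, **fields}
            reference_version = version
    return reference_version, merged_fields


# Hypothetical sample data.
db_log = [(11, {"f1": 1, "f3": 29}), (12, {"f1": 1, "f3": 30})]
app_log = [(12, {"f3": 31}), (13, {"f3": 32})]
ref, fields = merge_by_max(-1, {}, db_log, app_log)
```

One design consideration: because only the highest-version record of each log is applied, this sketch fits best when that record carries the complete state of the data. If application-log records hold only the changed fields, intermediate changes to other fields would not be picked up.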
Optionally,
receiving a query request for the merged data record;
determining whether a first data record to be merged exists, wherein a first version number of the first data record to be merged is greater than a reference version number of the merged data record;
and if so, updating the merged data record according to the first data record, and updating the reference version number of the data merging queue to be the maximum first version number of the first data record to be merged.
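A query-time merge of this kind could look like the following Python sketch. The class name MergedStore, its attributes, the (version, fields) tuple layout, and the choice to clear the pending list after a query are all assumptions introduced for illustration.

```python
class MergedStore:
    """Hypothetical in-memory store used only to illustrate query-time merging."""

    def __init__(self):
        self.merged = {}       # data_index -> (reference_version, fields)
        self.pending_app = {}  # data_index -> list of (version, changed_fields)

    def query(self, data_index):
        ref, fields = self.merged.get(data_index, (-1, {}))
        # Pick the first data records whose version exceeds the reference version.
        newer = sorted(
            (r for r in self.pending_app.get(data_index, []) if r[0] > ref),
            key=lambda r: r[0],
        )
        for version, changed in newer:
            fields = {**fields, **changed}
            ref = version  # ends at the maximum first version number
        self.merged[data_index] = (ref, fields)
        self.pending_app[data_index] = []  # everything left is merged or stale
        return ref, fields


store = MergedStore()
store.merged[65535] = (13, {"field3": 32})
store.pending_app[65535] = [(15, {"field3": 40}), (12, {"field3": 31}), (14, {"field2": "x"})]
result = store.query(65535)
```

Here the pending record with version 12 is ignored as stale, and versions 14 and 15 are folded into the merged record before the result is returned.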
Optionally,
the merging the first data record and the second data record with the same data index to obtain a merged data record includes:
recording tasks for obtaining merged data records using one or more queues, wherein each task is represented in the queue by the data index;
and determining a task to be executed from the queue so as to merge the first data record and the second data record with the same data index to obtain a merged data record.
In a second aspect, an embodiment of the present invention provides another database synchronization method, including:
generating an application operation log after an operation on the source database succeeds, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
In a third aspect, an embodiment of the present invention provides a database synchronization apparatus, including:
a first obtaining unit, configured to obtain an application operation log of an application, where the application operation log includes one or more first data records, and each first data record includes a data index and a first version number;
a second obtaining unit, configured to obtain a database log of a source database of the application, where the database log includes one or more second data records, and the second data records have a data index and a second version number;
and the synchronization unit is used for performing synchronization operation on a target database according to the first data record and the second data record, wherein the first data record and the second data record have the same data index.
In a fourth aspect, an embodiment of the present invention provides another database synchronization apparatus, including:
the log generation unit is used for generating an application operation log after the operation on the source database is successful, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
In a fifth aspect, an embodiment of the present invention provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of the embodiments described above.
In a sixth aspect, the present invention provides a computer-readable medium, on which a computer program is stored, where the computer program is executed by a processor to implement the method of any one of the above embodiments.
One embodiment of the above invention has the following advantages or benefits: data synchronization between the source database and the target database is achieved by using the application operation log and the database log of the source database. The application operation log and the database log share the same data index, which identifies both the data records in the database log and the data records in the application operation log. The database log records the data changes of the source database and is highly accurate. The application operation log records the operations that the application performs on the source database and, compared with the database log, is more timely. By using both logs, the method of the embodiment of the invention can capture data changes in the source database both quickly and accurately, so that accurate synchronization between the source database and the target database can be achieved.
In addition, a synchronization center can be arranged between the source database end and the target database end. The synchronization center obtains the application operation log and the database log of the source database, and performs the synchronization operation on the target database according to them. Compared with a synchronization mode in which the source database end sends data change messages directly to the target database end, the service systems at the source database end and the target database end only need to attend to their own service logic and need not handle data synchronization. This reduces the coupling between the source database end and the target database end and makes the overall synchronization system easier to extend and maintain.
In addition, each data record in the application operation log and the database log carries a version number, and the data records in the logs can be processed according to their version numbers. This prevents data inconsistency caused by concurrency and further guarantees synchronization accuracy.
Further effects of the above non-conventional optional implementations are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 2 is a schematic diagram illustrating a flow of a database synchronization method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another database synchronization scenario provided by an embodiment of the invention;
FIG. 4 is a schematic diagram of a flow of another database synchronization method provided by an embodiment of the invention;
FIG. 5 is a schematic diagram illustrating a flow of a data query method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating a flow of another database synchronization method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram illustrating a flow of another database synchronization method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a database synchronization apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of another database synchronization apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is an exemplary system architecture diagram in which embodiments of the present invention may be employed. As shown in FIG. 1, the method of the embodiment of the present invention mainly involves three parties: a source database end, a synchronization center, and a target database end. The source database end includes a server on which the source database is installed (referred to as the source database server in this disclosure) and an application server that uses the source database. At least one source database is deployed on the source database server, and at least one application that needs to operate on the source database is deployed on the application server. One application may use one or more source databases, and one source database may be used by multiple different applications. The source database server and the application server may be deployed on separate servers or clusters, or on the same server or cluster.
The synchronization center obtains the application operation log of the application and the database log of the source database, and performs the synchronization operation on the target database according to them. Data records in the application operation log and the database log that refer to the same data have the same data index.
The target database end deploys at least one database that needs to be synchronized with the source database. At least one target database at the target database end can be deployed on a plurality of servers or clusters, and can also be deployed on the same server or cluster. The method of the embodiment of the invention does not limit the specific deployment forms of the source database end and the target database end.
In one embodiment, in an initial phase, the target database side may send a subscription request to the synchronization center. As shown in table 1, the subscription request information includes a source database identifier and a target database identifier. The subscription request information in table 1 is used to inform the synchronization center that the database corresponding to the target database identifier needs to perform synchronization operation with the database corresponding to the source database identifier.
Table 1 subscription request information
Source database/source application identification | Target database/target application identification
Y0001 | M0001
Y0002 | M0002
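A minimal sketch of how a synchronization center might keep the subscription information of table 1 is given below. The class SubscriptionRegistry and its methods are hypothetical names introduced for illustration; only the identifier pairs come from table 1.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Hypothetical registry mapping each source identifier to its subscribers."""

    def __init__(self):
        self._targets_by_source = defaultdict(set)

    def subscribe(self, source_id, target_id):
        # Record that target_id needs to be synchronized with source_id.
        self._targets_by_source[source_id].add(target_id)

    def targets_for(self, source_id):
        # Return the subscribed targets, or an empty list if there are none.
        return sorted(self._targets_by_source.get(source_id, ()))


registry = SubscriptionRegistry()
registry.subscribe("Y0001", "M0001")
registry.subscribe("Y0002", "M0002")
```

With such a mapping, the synchronization center can look up which target databases to update whenever a merged data record for a given source becomes available.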
Table 2 provides another format for the subscription request information. As shown in table 2, the subscription request may include a source database identifier, a source table identifier, and other information. The table corresponding to the source table identifier is a data table in the source database, and the table corresponding to the target table identifier is a data table in the target database. The subscription request information in table 2 informs the synchronization center that the target table corresponding to the target table identifier needs to be synchronized with the table corresponding to the source table identifier.
Table 2 subscription request information
[Table 2 is reproduced as an image in the original publication: BDA0002804689340000071]
As shown in FIG. 1, an application at the source database end completes its service logic and persists the data changes (for example, inserts, updates, and deletes) to the database. After the database operation completes, the application generates an application operation log and submits it to the synchronization center, which processes it. This path provides real-time synchronization of the data.
The database log of the source database is also sent to the synchronization center, which applies the changes it records, so that the data is guaranteed to be eventually consistent. If a real-time synchronization request fails, this path still ensures eventual consistency, which allows real-time synchronization to degrade gracefully.
Both the application operation log and the database log are recorded as modification logs and are merged periodically.
It should be understood that the number of databases and applications on the source database side and the target database side in fig. 1 is merely illustrative. Different numbers of databases and applications may be deployed, as desired for implementation.
In the method provided by the embodiment of the invention, a synchronization center is arranged between the source database end and the target database end. This reduces the coupling between the service systems at the two ends: each service system only needs to attend to its own service logic and need not handle data synchronization. The embodiment of the invention provides a general technical scheme that can be applied to various data synchronization scenarios and is easy to adopt. Data consistency can also be ensured in a distributed environment with multiple data sources and under a large number of concurrent requests.
Fig. 2 is a flowchart of a database synchronization method according to an embodiment of the present invention. As shown in fig. 2, the method includes:
Step 201: obtain an application operation log of an application, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
Step 202: obtain a database log of a source database of the application, wherein the database log comprises one or more second data records, and each second data record comprises a data index and a second version number.
The method of the embodiment of the invention can be applied to the synchronization center. The data recorded by the synchronization center may comprise two parts: application operation logs and database logs. The application operation log is generated by the application that uses the source database. It records the data changes made by the application and can include information such as data additions and deletions. The application server can push the application operation log to the synchronization center periodically, or the synchronization center can actively pull the application operation log from the application server.
The database log is generated automatically by the source database. Its format and the way it is generated are determined by the type of the source database. The types of source databases may include Redis, MySQL, Db2, and the like. The source database can push the database log to the synchronization center periodically, or the synchronization center can actively pull the database log from the source database.
It should be noted that each record in the application operation log and the database log must include a data index and a version number. The data index uniquely identifies a data record in a source or target database. The version number records the version of the data record and is incremented each time the data changes; the data record with the largest version number holds the latest content of the data.
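The role of the data index and the version number can be illustrated with a short Python sketch. The type DataRecord, the helper latest, and the sample values are assumptions made for illustration and do not appear in the original disclosure.

```python
from typing import NamedTuple


class DataRecord(NamedTuple):
    data_index: int  # uniquely identifies the data record
    version: int     # incremented each time the data changes
    fields: dict     # field values at this version


def latest(records, data_index):
    """Return the record with the largest version number for data_index, if any."""
    candidates = [r for r in records if r.data_index == data_index]
    return max(candidates, key=lambda r: r.version) if candidates else None


# Hypothetical log contents.
log = [
    DataRecord(65535, 12, {"field3": 30}),
    DataRecord(65535, 13, {"field3": 32}),
    DataRecord(65536, 25, {"field3": 7}),
]
newest = latest(log, 65535)
```

The record returned for data index 65535 is the one with version number 13, which under the rule above holds the latest content of that data.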
As shown in table 3, the application operation log may include all fields in one data record, or may only include the information of the changed field, for example, the latter record in table 3 represents that the field 3 of the data record with data index 65535 is modified from 30 to 32.
Table 3 application oplog format
[Table 3 is reproduced as an image in the original publication: BDA0002804689340000091]
Table 3 shows two operations performed by the application identified as 005 on the data record with data index 65535, whose current version number is 13. Table 4 shows the database log contents corresponding to table 3. As shown in table 4, the database log contains all the fields of a piece of data, and each field has a field value under the corresponding version. From table 4, the latest version number, 13, and the contents of all fields at that version can be determined for the data record with index identifier 65535.
Table 4 format of database log
[Table 4 is reproduced as an image in the original publication: BDA0002804689340000092]
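The relationship between a change-only application operation log and a full-row database log can be illustrated as follows. In this Python sketch, the function replay, the (version, changed_fields) layout, and all field values other than the change of field 3 from 30 to 32 are hypothetical.

```python
def replay(base_fields, change_entries):
    """Apply (version, changed_fields) entries in ascending version order.

    Returns the last version applied (None if there were no entries) and the
    resulting full set of field values.
    """
    state = dict(base_fields)
    last_version = None
    for version, changed in sorted(change_entries, key=lambda e: e[0]):
        state.update(changed)  # only the changed fields are overwritten
        last_version = version
    return last_version, state


# Hypothetical starting row and change-only log entries.
base_row = {"field1": "a", "field2": "b", "field3": 29}
app_entries = [(13, {"field3": 32}), (12, {"field3": 30})]
version, row = replay(base_row, app_entries)
```

Replaying the change-only entries in version order reconstructs the same kind of full row that a database log entry would carry for version 13.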
The application operation log and the database log may be stored in a database or file deployed by the synchronization center. For each data index, the synchronization center stores the corresponding application operation log and the corresponding database log.
An ordered collection automatically sorts its records by a sort field; examples include zset and TreeMap. The database of the synchronization center may store the application operation logs and database logs in ordered collections. When the same data has been modified several times, several modification logs are recorded, and the version number in each log serves as the sort field. The ordered collection thus keeps the stored application operation logs and database logs sorted by version number, which reduces the time required by the subsequent log merging process.
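As a stand-in for zset or TreeMap, the following Python sketch keeps log records ordered by version number using the standard bisect module. The class VersionOrderedLog and its methods are hypothetical; a real deployment would more likely rely on the ordered collection of the chosen storage system.

```python
import bisect


class VersionOrderedLog:
    """Keeps log records sorted by version number as they are added."""

    def __init__(self):
        self._versions = []  # sorted version numbers
        self._records = []   # records, kept parallel to _versions

    def add(self, version, record):
        # Insert at the position that preserves ascending version order.
        pos = bisect.bisect_right(self._versions, version)
        self._versions.insert(pos, version)
        self._records.insert(pos, record)

    def newer_than(self, reference_version):
        # Binary search for the first version greater than reference_version.
        pos = bisect.bisect_right(self._versions, reference_version)
        return list(zip(self._versions[pos:], self._records[pos:]))


log = VersionOrderedLog()
log.add(13, "c")
log.add(11, "a")
log.add(12, "b")
pending = log.newer_than(11)
```

Because the records are already sorted, selecting everything newer than a reference version needs only a binary search and no sorting at merge time.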
Optionally, the application operation log and the database log are stored separately, and each uses a different ordered set to perform the sorting process. The purpose of storing the application operation log and the database log separately is to reduce the merging operation in the query phase.
Step 203: perform a synchronization operation on the target database according to the first data record and the second data record, wherein the first data record and the second data record have the same data index.
The application operation log records the data changes made by the application. Records in the application operation log and the database log that refer to the same data record have the same data index. Together, the two logs record the data changes completely and accurately. The target database can therefore be synchronized accurately with the source database according to the result of merging the application operation log and the database log, which solves the prior-art problem that the target database cannot be accurately synchronized with the source database.
In addition, the method of the embodiment of the invention is based on incremental logs, which ensures that write operations are not overwritten. Data is written in version order according to the data version, which prevents data inconsistency caused by concurrency.
In one embodiment of the present invention, step 203 may comprise:
merging the first data record and the second data record with the same data index to obtain a merged data record, wherein the data index of the merged data record is the same as that of the first data record or the second data record, and the reference version number of the merged data record is determined according to the first version number of the first data record and the second version number of the second data record;
and carrying out synchronous operation on the target database according to the merged data record.
The real-time operation logs and the database change logs keep accumulating, so they need to be merged. The logs can be merged in various ways.
In the first mode, the first data records in the application operation log are first merged in version-number order to generate a first merged data record, and the maximum version number among the first data records is used as the version number of the first merged data record. Then the second data records whose version numbers are larger than that of the first merged data record are found and merged into the first merged data record to generate the merged data record, and the maximum version number among those second data records is used as the reference version number of the merged data record.
In the second mode, the second data records in the database log are first merged to generate a second merged data record, and the maximum version number among the second data records is used as the version number of the second merged data record. Then the first data records whose version numbers are larger than that of the second merged data record are found and merged into the second merged data record to generate the merged data record, and the maximum version number among those first data records is used as the reference version number of the merged data record.
In the third mode, for each version number, a record to be merged is selected at random from the first data record and the second data record that have that version number; the records to be merged for all version numbers are then merged to generate the merged data record, and the maximum version number among the records to be merged is used as the reference version number of the merged data record.
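The three modes can be compared side by side in a short Python sketch. The function names, the (version, fields) tuple layout, the use of dict.update to merge field values, and the behavior when no newer record exists (the reference version simply stays unchanged) are assumptions made for illustration.

```python
import random


def _apply(merged, reference, records):
    """Apply (version, fields) records to merged in ascending version order."""
    for version, fields in sorted(records, key=lambda r: r[0]):
        merged.update(fields)
        reference = version
    return reference


def _newer(records, reference):
    return [r for r in records if reference is None or r[0] > reference]


def merge_mode_one(app_records, db_records):
    merged = {}
    reference = _apply(merged, None, app_records)  # first merged data record
    reference = _apply(merged, reference, _newer(db_records, reference))
    return reference, merged


def merge_mode_two(app_records, db_records):
    merged = {}
    reference = _apply(merged, None, db_records)  # second merged data record
    reference = _apply(merged, reference, _newer(app_records, reference))
    return reference, merged


def merge_mode_three(app_records, db_records, rng=random):
    by_version = {}
    for version, fields in list(app_records) + list(db_records):
        by_version.setdefault(version, []).append(fields)
    # For each version number, pick one of the available records at random.
    chosen = [(v, rng.choice(candidates)) for v, candidates in by_version.items()]
    merged = {}
    reference = _apply(merged, None, chosen)
    return reference, merged


# Hypothetical logs for one data index.
app_log = [(12, {"field3": 30}), (13, {"field3": 32})]
db_log = [(13, {"field1": "a", "field3": 32}), (14, {"field1": "a", "field3": 35})]
results = [f(app_log, db_log) for f in (merge_mode_one, merge_mode_two, merge_mode_three)]
```

For this sample input all three modes end with reference version 14 and the same field values, since the records that share version 13 agree with each other.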
The reference version number identifies the version of the data as of the previous merge; subsequent modification operations occur after that version. Data is written in version order according to the data version, which prevents data inconsistency caused by concurrency.
The synchronization center needs to record the reference version number of each data record in the source database and the synchronization information of each target database. Tables 5 and 6 show two ways of recording the reference version number of each data record and the synchronization information of each target database in the synchronization center.
As shown in table 5, the reference version number of the data record with index identifier 65535 is 13, and target database 1 and target database 4 have not yet synchronized the merged data record of this version, so target database 1 and target database 4 need to be synchronized according to the merged data record with reference version number 13. Once target databases 1 and 4 complete the synchronization operation, their states are changed to synchronized.
TABLE 5 data index version record
[Table 5 is reproduced as an image in the original publication: BDA0002804689340000111]
If the synchronization center performs the merging process again for index identifier 65535 starting from reference version number 13, and the reference version number of the resulting merged data record becomes 30, then the reference version number of the data record with index identifier 65535 is changed to 30, and the status of every target database becomes unsynchronized.
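The status-based bookkeeping of table 5 might be sketched as follows. The class SyncStatus, its method names, and the target names db1 to db4 are hypothetical; only the version numbers 13 and 30 and the pending targets 1 and 4 are taken from the description.

```python
class SyncStatus:
    """Tracks, for one data index, which targets have synced the current version."""

    def __init__(self, targets, reference_version=-1):
        self.reference_version = reference_version
        self.synced = {t: False for t in targets}

    def on_merge(self, new_reference_version):
        # A newer merged record makes every target unsynchronized again.
        if new_reference_version > self.reference_version:
            self.reference_version = new_reference_version
            for target in self.synced:
                self.synced[target] = False

    def mark_synced(self, target):
        self.synced[target] = True

    def pending_targets(self):
        return [t for t, done in self.synced.items() if not done]


status = SyncStatus(["db1", "db2", "db3", "db4"], reference_version=13)
status.mark_synced("db2")
status.mark_synced("db3")
before = status.pending_targets()  # db1 and db4 still need version 13
status.on_merge(30)
after = status.pending_targets()   # all targets need version 30
```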
As shown in table 6, the synchronization center records not only the reference version number of the merged data record, but also the current version number of each target database. If the reference version number of the data record with index identifier 65536 is 25 and the current version numbers of target database 1 and target database 2 are less than 25, target database 1 and target database 2 need to be synchronized. Once target databases 1 and 2 complete the synchronization operation, their version numbers are changed to 25.
TABLE 6 data index version record
[Table 6 is reproduced as an image in the original publication: BDA0002804689340000121]
If the sync center performs the merging process again on the basis of the index identifier 65535 and the reference version number 13, and the reference version number of the generated merged data record becomes 30, the reference version number of the data record with the index identifier 65535 is changed to 30, and the version information of all the target databases does not need to be changed.
Compared with the mode shown in Table 5, recording the version of each target database as in Table 6 helps the system and the user control data versions more precisely and better addresses the dirty-data problem caused by multi-threaded concurrency. For example, in the query stage, log records that have not yet been merged can be looked up in the synchronization center according to the version of the target database, and the query result is returned after merging, which avoids returning out-of-date data directly.
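The Table 6 bookkeeping can be illustrated with a minimal Python sketch. The class name, field names and the starting versions 20 and 22 are assumptions made for illustration only; the description gives just the data index 65536, the reference version 25, and the fact that target databases 1 and 2 are behind.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IndexVersionRecord:
    """Hypothetical per-index record kept by the synchronization center."""
    data_index: int
    reference_version: int
    target_versions: Dict[str, int] = field(default_factory=dict)

    def targets_needing_sync(self) -> List[str]:
        # A target is stale when its version is below the reference version.
        return sorted(name for name, version in self.target_versions.items()
                      if version < self.reference_version)

    def mark_synced(self, target: str) -> None:
        # After synchronizing, the target takes the reference version.
        self.target_versions[target] = self.reference_version

    def on_merge(self, new_reference_version: int) -> None:
        # A new merge only moves the reference version; target versions
        # are left unchanged, as described for Table 6.
        self.reference_version = max(self.reference_version,
                                     new_reference_version)


record = IndexVersionRecord(
    data_index=65536,
    reference_version=25,
    target_versions={"target_db_1": 20, "target_db_2": 22, "target_db_3": 25},
)
pending = record.targets_needing_sync()
for name in pending:
    record.mark_synced(name)
record.on_merge(30)
pending_after = record.targets_needing_sync()
```

With per-target versions, staleness is derived by comparison rather than stored as a flag, so a new merge does not require rewriting the state of every target database.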
It should be noted that, in the initial or default state, the reference version number set by the synchronization center should be sufficiently small (for example, a negative value) that it is smaller than the version numbers of all records among the first data records and the second data records. Because merging selects, from the application operation log and the database log, the data records whose version number is greater than the reference version number, the reference version number needs to be set small enough to avoid discarding data records that need to be merged, thereby ensuring that the merging processing completes correctly.
In one embodiment of the present invention, step 203 may comprise:
merging the first data record and the second data record with the same data index to obtain a merged data record, comprising:
recording the tasks of merging data records using one or more queues, wherein each task is represented in the queue by a data index;
and determining a task to be executed from the queue so as to carry out merging processing on the first data record and the second data record with the same data index and obtain a merged data record.
The data indexes corresponding to the merge tasks are distributed across several different queues. A target queue may be selected according to a predetermined rule, and the merge tasks for the data indexes in that queue are then executed. A dedicated thread may also be assigned to each queue, so that the merge tasks in the queues are executed in parallel by multiple threads, which increases the merging speed.
Fig. 3 is a schematic diagram of another database synchronization scenario provided by an embodiment of the present invention. The data area records the data as of the previous merge; subsequent modification operations are applied on top of that version. The log area stores the application operation log (oplog) and the database log (binlog). The merge task queue in the merge task area records the data indexes (bizId) currently waiting to be merged. The tasks can be split into several lists (merge_task_1 and merge_task_2) according to a rule such as taking the data index modulo the number of lists, and the lists are processed in parallel by multiple merging threads. Arrow 1 represents merging the application operation log into the data area, and arrow 2 represents merging the database log into the data area.
Because the reference version of any single piece of business data is written by a single merging thread, there are no concurrent writes and no locking is needed, which preserves the real-time performance and efficiency of data synchronization.
It should be noted that the same data index can appear in only one task queue and needs to appear only once; an ordered set can be used to deduplicate entries. The purpose of log merging is to merge the application operation log and the database log into the reference version, and log entries that have been merged can be cleared once merging is complete.
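The queue distribution described above can be sketched as follows. This is an in-process illustration under stated assumptions: the function names are hypothetical, the queue count of 2 mirrors merge_task_1 and merge_task_2, and a Python dict (unique keys, insertion ordered) stands in for the deduplicating ordered set mentioned in the text.

```python
from typing import Dict, List, Optional

NUM_QUEUES = 2  # assumed; mirrors merge_task_1 and merge_task_2

# One insertion-ordered, duplicate-free container per merge thread.
merge_queues: List[Dict[int, None]] = [dict() for _ in range(NUM_QUEUES)]


def enqueue_merge_task(biz_id: int) -> int:
    """Route a data index to exactly one queue by modulo; ignore duplicates."""
    queue_no = biz_id % NUM_QUEUES
    merge_queues[queue_no].setdefault(biz_id, None)
    return queue_no


def next_task(queue_no: int) -> Optional[int]:
    """Pop the oldest pending data index from one queue, or None if empty."""
    queue = merge_queues[queue_no]
    if not queue:
        return None
    biz_id = next(iter(queue))
    del queue[biz_id]
    return biz_id


# The same index enqueued twice is kept once, and always in the same queue.
enqueue_merge_task(65535)
enqueue_merge_task(65536)
enqueue_merge_task(65535)
```

Because a given data index always maps to the same queue, and each queue is drained by one thread, no two threads write the reference version of the same data index.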
Fig. 4 is a flowchart of a database synchronization method according to another embodiment of the present invention. As shown in fig. 4, the method includes:
Step 401: obtaining an application operation log of an application, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
Step 402: acquiring a database log of a source database of the application, wherein the database log comprises one or more second data records, and each second data record comprises a data index and a second version number.
Step 403: merging the second data records with the same data index to obtain a merged data record, and determining the reference version number of the merged data record according to the second version numbers of the second data records.
First, the second data record in the database log is merged. The version number of the second data record needs to be greater than or equal to the reference version number. And determining the maximum version number corresponding to the merged second data record, and determining the maximum version number as the reference version number of the merged record.
After the second data records are merged, the database log including the second data records that have been merged is deleted. On one hand, the storage resources of the system can be saved, and on the other hand, the merged second data record is deleted, so that the influence on subsequent merging or query processing can be reduced.
Step 404: and merging the first data record with the same data index and the merged data record to update the merged data record, and determining the reference version number of the merged data record according to the first version number of the first data record.
On the basis of the merged data record generated in step 403, further merging is performed using the first data records in the application operation log. The version number of a first data record needs to be greater than the reference version number. The maximum version number among the merged first data records is determined and taken as the reference version number of the merged data record.
After the merging of the first data records is completed, the application operation log entries containing the merged first data records are deleted. On the one hand this saves storage resources of the system; on the other hand, deleting the merged first data records reduces their influence on subsequent merging or query processing.
Step 405: and carrying out synchronous operation on the target database according to the merged data record.
The merging operation of the logs can be executed periodically by the scheduling thread, and the scheduling period can be configured according to needs, such as 1 second, 2 seconds, 5 seconds and the like.
The scheduling thread first obtains a data index to be merged from the task queue, and queries the reference version number, the application operation log and the database log according to that data index. Generally, a database log message for a given version arrives later than the corresponding application operation log message, but the database log message carries complete and accurate data. The database log messages are therefore merged into the reference version first: a database log message is merged when its version is greater than or equal to the reference version. If its version is smaller than the reference version, the message has expired and is discarded.
After the database log messages have been processed, the application operation log is merged: an application operation log record is merged when its version number is greater than the reference version number. If its version number is equal to the reference version number, the database log of that version has already been merged and the latest data is already in the reference version, so the application operation log record does not need to be merged.
Compared with an application operation log, the data records of the database log have more accurate data information. In the embodiment of the invention, the database logs of the same version are merged prior to the application operation log in the log merging process, so that the synchronization accuracy of the data in the target database and the data in the source database can be ensured to be higher.
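The merge order described above (database log first with a greater-than-or-equal test, then application operation log with a strictly-greater test) can be sketched in Python as follows. The record layout, a (version, fields) tuple, and the sample field names are assumptions for illustration; binlog records are assumed to carry the full row and oplog records only the changed fields.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, Dict[str, object]]  # (version number, field values)


def merge_logs(reference_version: int,
               merged: Dict[str, object],
               binlog_records: List[Record],
               oplog_records: List[Record]) -> Tuple[int, Dict[str, object]]:
    # Database log first: complete rows, merged when version >= reference.
    for version, row in sorted(binlog_records, key=lambda r: r[0]):
        if version >= reference_version:
            merged = dict(row)
            reference_version = version
        # Otherwise the record has expired and is discarded.
    # Application operation log second: merged only when version > reference,
    # because an equal version was already covered by the database log.
    for version, changes in sorted(oplog_records, key=lambda r: r[0]):
        if version > reference_version:
            merged = {**merged, **changes}
            reference_version = version
    return reference_version, merged


new_reference, new_merged = merge_logs(
    reference_version=13,
    merged={"status": "new", "qty": 1},
    binlog_records=[(12, {"status": "stale", "qty": 0}),
                    (14, {"status": "paid", "qty": 1})],
    oplog_records=[(14, {"status": "paid"}), (15, {"qty": 2})],
)
```

In this example the version-12 database log record is discarded, the version-14 application record is skipped because the database log already supplied version 14, and only the version-15 change is layered on top.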
Step 403 may be implemented in two ways:
In the first mode, for each second data record to be merged, determining, among the second data records, a target second data record whose second version number is greater than or equal to the reference version number of the merged data record;
and updating the merged data record according to the target second data record, and updating the reference version number of the data merging queue to the second version number of the target second data record.
If the second version number of a second data record is greater than or equal to the reference version number, the second data record is merged into the merged data record, and its second version number becomes the reference version number of the merged data record. If the second version number of a second data record is less than the reference version number, the second data record has expired and should be discarded.
In the first mode, the second data records are merged one by one in version order to generate the merged data record. The first mode is better suited to scenarios with strict timeliness requirements, in which the synchronization center obtains the application operation log and the database log soon after the data in the source database changes. The synchronization center can then merge the acquired logs into the merged data record quickly and synchronize the target database according to the merged data record.
In the second mode, determining the maximum second version number among the second data records to be merged;
and, in the case that the maximum second version number is greater than or equal to the reference version number of the merged data record, updating the merged data record according to the second data record with the maximum second version number, and updating the reference version number of the data merging queue to the maximum second version number.
Because each second data record in the database log stores complete data information, that is, the data of every field, the second data record with the maximum version number may be selected directly from the second data records whose version numbers are greater than or equal to the reference version number; the merged data record is updated with that second data record, and the maximum version number is taken as the reference version number.
In an embodiment of the invention, the second data record with the largest version among the plurality of second data records is selected for updating the merged data record. Compared with the first mode, the second mode does not need to merge records one by one, which improves merging efficiency and reduces system resource consumption.
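The second mode for step 403 can be sketched as follows, assuming (as the description states) that every database log record carries the complete row. The function name and the (version, row) tuple layout are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, Dict[str, object]]  # (second version number, full row)


def merge_binlog_max_only(reference_version: int,
                          merged: Dict[str, object],
                          records: List[Record]
                          ) -> Tuple[int, Dict[str, object]]:
    """Apply only the record with the maximum version, if it is not expired."""
    if not records:
        return reference_version, merged
    max_version, row = max(records, key=lambda r: r[0])
    if max_version >= reference_version:
        # A complete row replaces the merged record outright, so the
        # intermediate versions never need to be applied.
        return max_version, dict(row)
    return reference_version, merged


ref_after, row_after = merge_binlog_max_only(
    13,
    {"status": "new", "qty": 1},
    [(14, {"status": "paid", "qty": 1}), (16, {"status": "shipped", "qty": 1})],
)
ref_stale, row_stale = merge_binlog_max_only(
    13, {"status": "new", "qty": 1}, [(11, {"status": "old", "qty": 0})]
)
```

The shortcut is safe only because the rows are complete; it performs one write per data index regardless of how many log records accumulated during the merge period.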
Step 404 may be implemented in two ways:
In the first mode, for each first data record to be merged, determining, among the first data records, a target first data record whose first version number is greater than the reference version number of the merged data record;
and updating the merged data record with the target first data records in the order of their first version numbers, and updating the reference version number of the data merging queue to the first version number of the target first data record.
If the first version number of a first data record is greater than the reference version number, the first data record is merged into the merged data record and its first version number becomes the reference version number of the merged data record. If the first version number of a first data record is less than or equal to the reference version number, the merged data record already contains the data of a second data record of the same or a later version, and the first data record should be discarded.
In the first mode, the first data records are merged one by one in version order to generate the merged data record. The first mode is better suited to scenarios with strict timeliness requirements, in which the synchronization center obtains the application operation log and the database log soon after the data in the source database changes. The synchronization center can then merge the acquired logs into the merged data record quickly and synchronize the target database according to the merged data record.
In the second mode, determining the maximum first version number among the first data records to be merged;
and, in the case that the maximum first version number is greater than the reference version number of the merged data record, updating the merged data record according to the first data record having the maximum first version number, and updating the reference version number of the data merging queue to the maximum first version number.
A first data record in the application operation log may store only the changed fields, or it may store the complete data information. When the first data records hold complete data information, the first data record with the maximum version number can be selected directly from the first data records whose version numbers are greater than the reference version number; the merged data record is updated with that first data record, and the maximum version number is taken as the reference version number.
In an embodiment of the invention, the first data record with the largest version among the plurality of first data records is selected for updating the merged data record. Compared with the first mode, the second mode does not need to merge records one by one, which improves merging efficiency and reduces system resource consumption.
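The two modes of step 404 differ in what the first data records contain, which the following sketch makes explicit. The `records_are_complete` flag is a hypothetical parameter introduced here; the description only says that the second mode applies when the first data records hold complete data information.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, Dict[str, object]]  # (first version number, field values)


def merge_oplog(reference_version: int,
                merged: Dict[str, object],
                records: List[Record],
                records_are_complete: bool
                ) -> Tuple[int, Dict[str, object]]:
    # Only records strictly newer than the reference version qualify.
    candidates = [r for r in records if r[0] > reference_version]
    if not candidates:
        return reference_version, merged
    if records_are_complete:
        # Second mode: the newest complete record replaces the merged record.
        version, row = max(candidates, key=lambda r: r[0])
        return version, dict(row)
    # First mode: changed fields are layered on one by one, in version order.
    for version, changes in sorted(candidates, key=lambda r: r[0]):
        merged = {**merged, **changes}
        reference_version = version
    return reference_version, merged


base = {"status": "paid", "qty": 1}
partial = merge_oplog(14, base, [(16, {"qty": 3}), (15, {"status": "packed"})],
                      records_are_complete=False)
complete = merge_oplog(14, base,
                       [(15, {"status": "packed", "qty": 1}),
                        (16, {"status": "packed", "qty": 3})],
                       records_are_complete=True)
```

When records carry only changed fields, skipping to the maximum version would lose the intermediate change to `status`, which is why the one-by-one mode is needed in that case.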
Fig. 5 is a schematic diagram illustrating a flow of a data query method according to an embodiment of the present invention. As shown in fig. 5, the method includes:
Step 501: receiving a query request for a merged data record.
Step 502: determining whether a first data record to be merged exists, wherein the first version number of the first data record to be merged is greater than the reference version number of the merged data record.
Step 503: if so, updating the merged data record according to the first data record, and updating the reference version number of the data merging queue to the maximum first version number of the first data records to be merged.
Because logs are merged periodically, a query may arrive while the application operation log, which is more timely, still contains records that have not been merged into the merged data record. A query therefore needs to read both the merged data record and the application operation log, merge the pending application operation log records into the merged data record, and then return the result.
Because only the application operation log needs to be merged at query time, the application operation log and the database log are stored separately, which helps the query stage merge and respond quickly.
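Steps 501 to 503 can be sketched as a read path that folds in any pending application operation log records before answering. The in-memory dictionaries standing in for the data area and the log area, and all names below, are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, Dict[str, object]]  # (first version number, changed fields)


def query_merged(biz_id: int,
                 merged_store: Dict[int, Dict[str, object]],
                 reference_versions: Dict[int, int],
                 oplog_store: Dict[int, List[Record]]) -> Dict[str, object]:
    merged = dict(merged_store.get(biz_id, {}))
    reference = reference_versions.get(biz_id, -1)  # -1: assumed initial value
    # Step 502: find application operation log records newer than the reference.
    pending = sorted((r for r in oplog_store.get(biz_id, [])
                      if r[0] > reference), key=lambda r: r[0])
    # Step 503: merge them and advance the reference version.
    for version, changes in pending:
        merged.update(changes)
        reference = version
    merged_store[biz_id] = merged
    reference_versions[biz_id] = reference
    return merged


merged_store = {65535: {"status": "paid", "qty": 1}}
reference_versions = {65535: 13}
oplog_store = {65535: [(13, {"status": "paid"}), (14, {"qty": 2})]}
result = query_merged(65535, merged_store, reference_versions, oplog_store)
```

Only the application operation log is consulted here, which is why keeping it separate from the database log makes the query-time lookup cheap.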
In order to ensure the accuracy of the query, a merging process may be required during the query. Whether the merging process affects the response efficiency of the system can be analyzed from the following two aspects:
Probability of merging: this can be calculated from the merge period. If the merge period is 1 s, merging is needed only for queries that arrive within 1 s after the data changes. Since most business data has a long life cycle and is changed a limited number of times during it, the probability that a query requires merging is very low. For example, with 100 changes per day and queries uniformly distributed over time:
$$P = \frac{100 \times 1\ \text{s}}{24 \times 3600\ \text{s}} \approx 0.1157\%$$
The probability that merging needs to be performed is then 0.1157%.
Calculating according to the read-write ratio: if the system read-write ratio is 500:1, again with queries uniformly distributed:
$$P = \frac{1}{500} = 0.2\%$$
The probability that merging needs to be performed is then 0.2%.
Effect of the merge operation: the merge itself is very simple, namely re-assigning values in version order so that newer values overwrite older ones. At query time, only the fields requested by the query need to be overwritten, whereas the merging process of the synchronization stage described above must process every modified field. The merging process of the query stage therefore involves far fewer operations than that of the synchronization stage.
From the above analysis results, the probability of merging required in the query stage is very low, about 0.1% to 0.2%, and the merging processing process is very simple, has small influence, and can be suitable for application in many different service scenarios.
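The two estimates above can be reproduced with a few lines of Python. The function names are illustrative; the inputs (100 changes per day, a 1 s merge period, a 500:1 read-write ratio, uniformly distributed queries) are the ones stated in the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def merge_probability_by_period(changes_per_day: int,
                                merge_period_s: float) -> float:
    """Fraction of the day during which a change is still unmerged."""
    return changes_per_day * merge_period_s / SECONDS_PER_DAY


def merge_probability_by_rw_ratio(reads_per_write: int) -> float:
    """At most one read per write can hit the unmerged window."""
    return 1 / reads_per_write


p_period = merge_probability_by_period(changes_per_day=100, merge_period_s=1)
p_ratio = merge_probability_by_rw_ratio(reads_per_write=500)
print(f"{p_period:.4%}  {p_ratio:.1%}")
```

Both estimates scale linearly with their inputs, so a longer merge period or a more write-heavy workload raises the share of queries that need a merge proportionally.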
Fig. 6 is a schematic diagram illustrating a flow of another database synchronization method according to an embodiment of the present invention. As shown in fig. 6, the method includes:
Step 601: after the operation on the source database succeeds, generating an application operation log, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
The method of this embodiment of the invention can be applied at the source database end. After the application successfully operates on the source database, an application operation log is generated. The application operation log may reflect the content of only the changed fields of a data record in the source database, or the content of all fields of that data record. Each first data record in the application operation log has a data index and a first version number. The data index is used to identify the data record in the source database. The first version number records the change version of the data record and increases progressively as the application performs operations on the data record.
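A minimal sketch of step 601 is shown below. The class, its method names, and the choice to start versions at 1 and increment by 1 are assumptions; the description only requires that the first version number increases with each operation on the data record and that a log entry is written after the database operation succeeds.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[int, int, Dict[str, object]]  # (data index, version, changes)


class OplogWriter:
    """Hypothetical application-side writer of the application operation log."""

    def __init__(self) -> None:
        self._versions: Dict[int, int] = {}
        self.oplog: List[Record] = []

    def apply(self, data_index: int, changes: Dict[str, object],
              write_to_source_db: Callable[[int, Dict[str, object]], bool]
              ) -> bool:
        # Log only after the operation on the source database succeeded.
        if not write_to_source_db(data_index, changes):
            return False
        version = self._versions.get(data_index, 0) + 1
        self._versions[data_index] = version
        self.oplog.append((data_index, version, dict(changes)))
        return True


writer = OplogWriter()
writer.apply(65535, {"status": "paid"}, lambda idx, ch: True)
writer.apply(65535, {"qty": 2}, lambda idx, ch: False)   # failed write: no entry
writer.apply(65535, {"qty": 3}, lambda idx, ch: True)
```

Here the entries store only the changed fields; storing the full row instead would allow the max-version shortcut during merging at the cost of larger log entries.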
The source database end can push the application operation log to the synchronization center periodically, and the synchronization center can also actively pull the application operation log from the source database end.
Fig. 7 is a schematic diagram illustrating a flow of another database synchronization method according to an embodiment of the present invention. As shown in fig. 7, the method includes:
Step 701: performing a synchronization operation on the target database according to the merged data record, wherein the merged data record is generated by merging an application operation log of the application and a database log of the source database, and the data records in the application operation log and the database log have the same data index.
The method of the embodiment of the invention can be applied to a target database terminal. The merged data record is generated by merging the application operation log of the application and the database log of the source database. The application operation log stores operation records of the application for the source database. The application operation log and the database log have the same data index, and the data index is used for identifying data records in the source database. By applying the operation log and the database log, the data change condition in the source database can be completely and accurately recorded, so that the source database and the target database are accurately synchronized.
The synchronization center can periodically push the merged data record to the target database end, and the target database end can also periodically and actively pull the merged data record from the synchronization center.
As shown in fig. 8, an embodiment of the present invention provides a database synchronization apparatus, including:
a first obtaining unit 801, configured to obtain an application operation log of an application, where the application operation log includes one or more first data records, and the first data records have data indexes and first version numbers;
a second obtaining unit 802, configured to obtain a database log of a source database of the application, where the database log includes one or more second data records, and the second data records have data indexes and second version numbers;
a synchronization unit 803, configured to perform a synchronization operation on a target database according to the first data record and the second data record, where the first data record and the second data record have the same data index.
In an embodiment of the present invention, the synchronization unit 803 is specifically configured to: merging the first data record and the second data record with the same data index to obtain a merged data record, wherein the data index of the merged data record is the same as the data index of the first data record or the second data record, and the reference version number of the merged data record is determined according to the first version number of the first data record and the second version number of the second data record;
and carrying out synchronous operation on the target database according to the merged data record.
In an embodiment of the present invention, the synchronization unit 803 is specifically configured to: for each second data record to be merged, determining a target second data record of which the second version number is greater than or equal to the reference version number of the merged data record in the second data records;
and updating the merged data record according to the target second data record, and updating the reference version number of the data merging queue to be the second version number of the target second data record.
In an embodiment of the present invention, the synchronization unit 803 is specifically configured to: determining the maximum second version number of each second data record to be merged;
and under the condition that the maximum second version number is greater than or equal to the reference version number of the merged data record, updating the merged data record according to the second data record with the maximum second version number, and updating the reference version number of the data merging queue to the maximum second version number.
In an embodiment of the present invention, the synchronization unit 803 is specifically configured to: for each first data record to be merged, determining a target first data record of which the first version number is greater than the reference version number of the merged data record in the first data record;
and updating the merged data record with the target first data record according to the sequence of the first version number of the target first data record, and updating the reference version number of the data merging queue to the first version number of the target first data record.
In an embodiment of the present invention, the synchronization unit 803 is specifically configured to: determining the maximum first version number of each first data record to be merged;
and under the condition that the maximum first version number is larger than the reference version number of the merged data record, updating the merged data record according to the first data record with the maximum first version number, and updating the reference version number of the data merging queue to be the maximum first version number.
In one embodiment of the invention, the apparatus further comprises:
the querying unit 804 is configured to: receiving a query request for the merged data record;
determining whether a first data record to be merged exists, wherein a first version number of the first data record to be merged is greater than a reference version number of the merged data record;
and if so, updating the merged data record according to the first data record, and updating the reference version number of the data merging queue to be the maximum first version number of the first data record to be merged.
The synchronization unit 803 is specifically configured to: record the tasks of merging data records using one or more queues, wherein each task is represented in the queue by the data index;
and determining a task to be executed from the queue so as to merge the first data record and the second data record with the same data index to obtain a merged data record.
As shown in fig. 9, an embodiment of the present invention provides a database synchronization apparatus, including:
the log generating unit 901 is configured to generate an application operation log after the operation on the source database is successful, where the application operation log includes one or more first data records, and the first data records have a data index and a first version number.
An embodiment of the present invention provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method of any of the embodiments described above.
Referring now to FIG. 10, a block diagram of a computer system 100 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 10, the computer system 100 includes a Central Processing Unit (CPU)101 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)102 or a program loaded from a storage section 108 into a Random Access Memory (RAM) 103. In the RAM 103, various programs and data necessary for the operation of the system 100 are also stored. The CPU 101, ROM 102, and RAM 103 are connected to each other via a bus 104. An input/output (I/O) interface 105 is also connected to bus 104.
The following components are connected to the I/O interface 105: an input portion 106 including a keyboard, a mouse, and the like; an output section 107 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 108 including a hard disk and the like; and a communication section 109 including a network interface card such as a LAN card, a modem, or the like. The communication section 109 performs communication processing via a network such as the internet. A drive 110 is also connected to the I/O interface 105 as needed. A removable medium 111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 110 as necessary, so that a computer program read out therefrom is mounted into the storage section 108 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 109, and/or installed from the removable medium 111. The above-described functions defined in the system of the present invention are executed when the computer program is executed by the Central Processing Unit (CPU) 101.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising a first obtaining unit, a second obtaining unit, and a synchronization unit. The names of these modules do not, in some cases, limit the modules themselves; for example, the first obtaining unit may also be described as "a module for obtaining an application operation log of an application, the application operation log including one or more first data records, each first data record having a data index and a first version number".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform:
acquiring an application operation log of an application, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number;
acquiring a database log of a source database of the application, wherein the database log comprises one or more second data records, and each second data record comprises a data index and a second version number;
and performing a synchronization operation on a target database according to the first data record and the second data record, wherein the first data record and the second data record have the same data index.
According to the technical solution of the embodiments of the present invention, data synchronization between the source database and the target database is achieved using the application operation log and the database log of the source database. The application operation log stores records of the operations that the application performs on the source database. Records in the application operation log and in the database log share the same data index, which identifies data records in the source database. Using the application operation log together with the database log, data changes in the source database can be recorded completely and accurately, so that the source database and the target database are synchronized accurately.
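The pairing of the two logs by data index can be illustrated with a minimal Python sketch. The names `Record`, `payload`, and `pair_by_index` are hypothetical and do not appear in the embodiments; the sketch only assumes that each record carries a data index and a version number, as stated above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    data_index: str  # identifies the data record in the source database
    version: int     # first version number (application log) or second version number (database log)
    payload: dict = field(default_factory=dict)  # hypothetical: the changed values


def pair_by_index(app_log, db_log):
    """Group first records (application operation log) and second records
    (database log) that share the same data index."""
    groups = defaultdict(lambda: {"first": [], "second": []})
    for rec in app_log:
        groups[rec.data_index]["first"].append(rec)
    for rec in db_log:
        groups[rec.data_index]["second"].append(rec)
    return dict(groups)
```

Each resulting group holds the first and second data records for one data index, which is the input a synchronization step would then work from.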
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A database synchronization method, comprising:
acquiring an application operation log of an application, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number;
acquiring a database log of a source database of the application, wherein the database log comprises one or more second data records, and each second data record comprises a data index and a second version number;
and performing a synchronization operation on a target database according to the first data record and the second data record, wherein the first data record and the second data record have the same data index.
2. The method of claim 1,
wherein performing the synchronization operation on the target database according to the first data record and the second data record comprises:
merging the first data record and the second data record with the same data index to obtain a merged data record, wherein the data index of the merged data record is the same as the data index of the first data record or the second data record, and the reference version number of the merged data record is determined according to the first version number of the first data record and the second version number of the second data record;
and performing the synchronization operation on the target database according to the merged data record.
3. The method of claim 2,
the merging the first data record and the second data record with the same data index to obtain a merged data record includes:
for each second data record to be merged, determining a target second data record of which the second version number is greater than or equal to the reference version number of the merged data record in the second data records;
and updating the merged data record according to the target second data record, and updating the reference version number of the data merging queue to be the second version number of the target second data record.
4. The method of claim 2,
the merging the first data record and the second data record with the same data index to obtain a merged data record includes:
determining the maximum second version number of each second data record to be merged;
and under the condition that the maximum second version number is greater than or equal to the reference version number of the merged data record, updating the merged data record according to the second data record with the maximum second version number, and updating the reference version number of the data merging queue to the maximum second version number.
5. The method according to claim 3 or 4,
after updating the reference version number of the data merge queue according to the second version number, the method further includes:
for each first data record to be merged, determining a target first data record of which the first version number is greater than the reference version number of the merged data record in the first data record;
and updating the merged data record with the target first data record according to the sequence of the first version number of the target first data record, and updating the reference version number of the data merging queue to the first version number of the target first data record.
6. The method according to claim 3 or 4,
after updating the reference version number of the data merge queue according to the second version number, the method further includes:
determining the maximum first version number of each first data record to be merged;
and under the condition that the maximum first version number is larger than the reference version number of the merged data record, updating the merged data record according to the first data record with the maximum first version number, and updating the reference version number of the data merging queue to be the maximum first version number.
7. The method of claim 1, comprising:
receiving a query request for the merged data record;
determining whether a first data record to be merged exists, wherein a first version number of the first data record to be merged is greater than a reference version number of the merged data record;
and if so, updating the merged data record according to the first data record, and updating the reference version number of the data merging queue to be the maximum first version number of the first data record to be merged.
8. The method of claim 2,
the merging the first data record and the second data record with the same data index to obtain a merged data record includes:
recording, using one or more queues, tasks of obtaining merged data records, wherein each task is represented in the queue by the data index;
and determining a task to be executed from the queue so as to merge the first data record and the second data record with the same data index to obtain a merged data record.
9. A database synchronization method, comprising:
generating an application operation log after an operation on the source database succeeds, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
10. A database synchronization apparatus, comprising:
a first obtaining unit, configured to obtain an application operation log of an application, where the application operation log includes one or more first data records, and each first data record has a data index and a first version number;
a second obtaining unit, configured to obtain a database log of a source database of the application, where the database log includes one or more second data records, and the second data records have a data index and a second version number;
and the synchronization unit is used for performing synchronization operation on a target database according to the first data record and the second data record, wherein the first data record and the second data record have the same data index.
11. A database synchronization apparatus, comprising:
the log generation unit is used for generating an application operation log after the operation on the source database is successful, wherein the application operation log comprises one or more first data records, and each first data record comprises a data index and a first version number.
12. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.
13. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-9.
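The merge order set out in claims 3 and 5 (second data records first, accepted when their version number is greater than or equal to the reference version number; then first data records, accepted only when strictly greater) can be sketched in Python as follows. This is an illustrative sketch, not the claimed implementation: `MergedRecord`, `merge_records`, the `(version, values)` tuple layout, the initial reference version of 0, and processing second data records in ascending version order are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MergedRecord:
    data_index: str
    reference_version: int = 0  # assumed starting value
    values: dict = field(default_factory=dict)


def merge_records(merged, first_records, second_records):
    """Apply database-log records, then application-log records, to `merged`.

    Each record is a hypothetical (version, values) tuple.
    """
    # Second data records: accepted when version >= reference version.
    # Ascending order is an assumption; the claims do not state an order here.
    for version, values in sorted(second_records, key=lambda r: r[0]):
        if version >= merged.reference_version:
            merged.values.update(values)
            merged.reference_version = version
    # First data records: accepted only when version > reference version,
    # applied in order of their first version number.
    for version, values in sorted(first_records, key=lambda r: r[0]):
        if version > merged.reference_version:
            merged.values.update(values)
            merged.reference_version = version
    return merged
```

In this sketch a first data record whose version number is not greater than the reference version left by the database log is skipped, so a stale application-log entry does not overwrite newer data.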
CN202011363440.5A 2020-11-27 2020-11-27 Database synchronization method and device Pending CN113761052A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011363440.5A CN113761052A (en) 2020-11-27 2020-11-27 Database synchronization method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011363440.5A CN113761052A (en) 2020-11-27 2020-11-27 Database synchronization method and device

Publications (1)

Publication Number Publication Date
CN113761052A (en) 2021-12-07

Family

ID=78786163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011363440.5A Pending CN113761052A (en) 2020-11-27 2020-11-27 Database synchronization method and device

Country Status (1)

Country Link
CN (1) CN113761052A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114253944A (en) * 2021-12-16 2022-03-29 深圳壹账通科技服务有限公司 Database bidirectional synchronization method and device and electronic equipment
CN115033585A (en) * 2022-08-09 2022-09-09 北京奥星贝斯科技有限公司 Data merging processing method and device for target database

Similar Documents

Publication Publication Date Title
CN108965355B (en) Method, apparatus and computer readable storage medium for data transmission
US9934263B1 (en) Big-fast data connector between in-memory database system and data warehouse system
CN108959292B (en) Data uploading method, system and computer readable storage medium
CN112307037B (en) Data synchronization method and device
CN106874281B (en) Method and device for realizing database read-write separation
CN109144785B (en) Method and apparatus for backing up data
US20180157710A1 (en) Query and change propagation scheduling for heteogeneous database systems
CN112445626B (en) Data processing method and device based on message middleware
CN111399764B (en) Data storage method, data reading device, data storage equipment and data storage medium
CN113076304A (en) Distributed version management method, device and system
CN113761052A (en) Database synchronization method and device
CN113364877B (en) Data processing method, device, electronic equipment and medium
CN110673959A (en) System, method and apparatus for processing tasks
CN112579695A (en) Data synchronization method and device
CN113886485A (en) Data processing method, device, electronic equipment, system and storage medium
CN113468196B (en) Method, apparatus, system, server and medium for processing data
CN116244383A (en) BOM synchronous processing method, equipment and medium based on BOM middle station
CN113760950B (en) Index data query method, device, electronic equipment and storage medium
CN111405015B (en) Data processing method, device, equipment and storage medium
CN113783916B (en) Information synchronization method and device
CN114969165A (en) Data query request processing method, device, equipment and storage medium
CN115189931A (en) Distributed key management method, device, equipment and storage medium
CN112948494A (en) Data synchronization method and device, electronic equipment and computer readable medium
CN113742376A (en) Data synchronization method, first server and data synchronization system
CN112527900A (en) Method, device, equipment and medium for database multi-copy reading consistency

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination