CN114297214A - Data synchronization method and device, computer storage medium and electronic equipment - Google Patents


Info

Publication number
CN114297214A
Authority
CN
China
Prior art keywords
instruction
data
synchronization
data table
synchronized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111658188.5A
Other languages
Chinese (zh)
Other versions
CN114297214B (en
Inventor
程正武
韩方方
贺凌峰
王涛
王世伟
鲁良
李光伟
贾帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jindi Technology Co Ltd
Original Assignee
Beijing Jindi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jindi Technology Co Ltd filed Critical Beijing Jindi Technology Co Ltd
Priority to CN202111658188.5A priority Critical patent/CN114297214B/en
Publication of CN114297214A publication Critical patent/CN114297214A/en
Application granted granted Critical
Publication of CN114297214B publication Critical patent/CN114297214B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a data synchronization method and a device thereof, a computer storage medium and electronic equipment, wherein the data synchronization method comprises the following steps: performing conversion processing on an actual updating instruction included by the source synchronous instruction sequence to generate an equivalent insertion instruction and an equivalent deletion instruction, wherein the execution timestamp of the equivalent deletion instruction is earlier than that of the equivalent insertion instruction; generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence; determining an execution time stamp of each data synchronization instruction in the effective synchronization instruction sequence in the source data table; judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp; and if so, operating a data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table.

Description

Data synchronization method and device, computer storage medium and electronic equipment
Technical Field
The present application relates to the field of database technologies, and in particular, to a data synchronization method and apparatus, a computer storage medium, and an electronic device.
Background
End-to-end data synchronization is used in many scenarios, such as synchronizing data between clients or between departments. To keep synchronization real-time and efficient and to avoid backlog, data synchronization is generally performed in parallel. Parallel execution, however, inevitably makes the execution order of the synchronization instructions uncertain: when a plurality of synchronization instructions are executed in parallel, the synchronization instructions pointing to the same primary key are allocated to the same task, while the synchronization instructions pointing to different primary keys are allocated to different tasks.
Disclosure of Invention
Embodiments of the present application provide a data synchronization method and apparatus, a computer storage medium, and an electronic device, so as to overcome or alleviate the above technical problems in the prior art.
The technical scheme adopted by the application is as follows:
a method of data synchronization, comprising:
performing conversion processing on an actual updating instruction included in a source synchronization instruction sequence to generate an equivalent insertion instruction and an equivalent deletion instruction, wherein an execution timestamp of the equivalent deletion instruction is earlier than an execution timestamp of the equivalent insertion instruction;
generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence, wherein the other actual synchronous instructions include at least one of an actual deletion instruction and an actual insertion instruction;
determining an execution time stamp of each data synchronization instruction in the effective synchronization instruction sequence in the source data table;
judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp;
and if so, operating the data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table.
Optionally, if a plurality of data to be synchronized in a target data table are synchronized, the determining, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in the source data table includes: and by running a plurality of tasks in parallel, determining the execution time stamp of a plurality of data synchronization instructions in the effective synchronization instruction sequence in the source data table in parallel.
Optionally, if a plurality of data to be synchronized in a plurality of target data tables are synchronized, the determining, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in a source data table includes: by running a plurality of tasks in parallel, the execution time stamps of a plurality of data synchronization instructions of the plurality of target data tables in the source data table are determined in parallel.
Optionally, if the data synchronization instruction is an equivalent deletion instruction or an actual deletion instruction, the execution timestamp is a data deletion timestamp;
the method further comprises the following steps: determining the latest synchronization timestamp, in the target data table, of the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction, and the data synchronization instruction corresponding to the latest synchronization timestamp;
wherein, the determining whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table according to the execution timestamp includes: if the data deletion timestamp is not earlier than the latest synchronization timestamp, determining that the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table;
the executing the data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table includes:
and operating the equivalent deleting instruction or the actual deleting instruction to delete the data to be synchronized corresponding to the main key of the data synchronizing instruction in the target data table.
Optionally, if the data synchronization instruction is an equivalent insertion instruction or an actual insertion instruction, the execution timestamp is a data insertion timestamp;
the method further comprises the following steps: determining the latest synchronization timestamp of the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction in the target data table and the data synchronization instruction corresponding to the latest synchronization timestamp,
the determining whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table according to the execution timestamp includes: if the data insertion timestamp is not earlier than the latest synchronization timestamp and it is predicted that running the equivalent insertion instruction or the actual insertion instruction would result in a synchronization failure, determining that the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table;
the executing the data synchronization instruction to perform data synchronization between the source data table and the target data table includes: and responding to the fact that the data synchronization instruction corresponding to the latest synchronization timestamp is another equivalent insertion instruction or an actual insertion instruction, converting the equivalent insertion instruction or the actual insertion instruction into an equivalent updating instruction, and operating the equivalent updating instruction to update the data to be synchronized, corresponding to the main key corresponding to the data synchronization instruction, in the target data table.
Optionally, the determining whether to synchronize data to be synchronized according to the execution timestamp includes: and judging whether the data to be synchronized needs to be synchronized or not according to the execution timestamp based on the set timestamp constraint condition.
Optionally, the method further comprises: and if the data to be synchronized does not need to be synchronized, discarding the corresponding data synchronization instruction.
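The optional features above can be illustrated with a minimal Python sketch. It is one possible reading of the timestamp constraint, not a definitive implementation: the names `SyncInstruction`, `decide` and `would_conflict` are hypothetical, and the sketch assumes that an insertion for which no failure is predicted is simply run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SyncInstruction:
    kind: str         # "insert" or "delete" (actual or equivalent)
    primary_key: int  # primary key of the data to be synchronized
    timestamp: int    # execution timestamp in the source data table


def decide(instr: SyncInstruction,
           latest: Optional[SyncInstruction],
           would_conflict: bool = False) -> str:
    """Return "run", "run_as_update" or "discard" for one instruction.

    `latest` is the instruction behind the latest synchronization timestamp
    (None if nothing has been synchronized yet). `would_conflict` is a
    hypothetical flag standing for a predicted synchronization failure.
    """
    # Timestamp constraint: an instruction older than the latest one is stale.
    if latest is not None and instr.timestamp < latest.timestamp:
        return "discard"
    # An insertion that is predicted to fail after another insertion is
    # converted into an equivalent update instruction.
    if (instr.kind == "insert" and would_conflict
            and latest is not None and latest.kind == "insert"):
        return "run_as_update"
    return "run"


stale_delete = decide(SyncInstruction("delete", 1, 2),
                      SyncInstruction("insert", 2, 3))
fresh_delete = decide(SyncInstruction("delete", 1, 2),
                      SyncInstruction("insert", 1, 1))
```

Here `stale_delete` is discarded because its timestamp 2 is earlier than the latest synchronization timestamp 3, while `fresh_delete` is run.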
A data synchronization device, comprising:
the first processing unit is used for performing conversion processing on an actual updating instruction included by a source synchronization instruction sequence to generate an equivalent insertion instruction and an equivalent deletion instruction, wherein an execution timestamp of the equivalent deletion instruction is earlier than that of the equivalent insertion instruction;
the second processing unit is used for generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence, wherein the other actual synchronous instructions include at least one of an actual deletion instruction and an actual insertion instruction;
a third processing unit, configured to determine, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in a source data table;
the fourth processing unit is used for judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp;
and the fifth processing unit is used for operating the data synchronization instruction when the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table, so as to synchronize the data to be synchronized between the source data table and the target data table.
A computer storage medium having stored thereon a computer executable program, the computer executable program being operative to perform a method as in any one of the embodiments of the present application.
An electronic device comprising a memory for storing thereon a computer-executable program and a processor for executing the computer-executable program to implement the method of any of the embodiments of the present application.
According to the embodiment of the application, an equivalent insertion instruction and an equivalent deletion instruction are generated by converting an actual updating instruction included in a source synchronous instruction sequence, wherein an execution time stamp of the equivalent deletion instruction is earlier than that of the equivalent insertion instruction; generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence, wherein the other actual synchronous instructions include at least one of an actual deletion instruction and an actual insertion instruction; determining an execution time stamp of each data synchronization instruction in the effective synchronization instruction sequence in the source data table; judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp; if so, the data synchronization instruction is operated to synchronize the data to be synchronized between the source data table and the target data table, so that data synchronization between the source data table and the target data table is effectively realized, and the data consistency between the source data table and the target data table is ensured.
Drawings
FIG. 1A is a schematic diagram of an application scenario in accordance with an embodiment of the present application;
FIG. 1B is a schematic diagram of another application scenario in accordance with an embodiment of the present application;
FIG. 2 is a schematic diagram of another application scenario according to an embodiment of the present application;
FIG. 3A is a schematic flowchart of a data synchronization method according to an embodiment of the present application;
FIG. 3B is a flowchart illustrating the generation of a valid synchronization instruction sequence according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a data synchronization apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the technical problems, technical solutions and advantages to be solved by the present application clearer, the following detailed description is made with reference to the accompanying drawings and specific embodiments.
According to the embodiment of the application, an equivalent insertion instruction and an equivalent deletion instruction are generated by converting an actual updating instruction included in a source synchronous instruction sequence, wherein an execution time stamp of the equivalent deletion instruction is earlier than that of the equivalent insertion instruction; generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence, wherein the other actual synchronous instructions include at least one of an actual deletion instruction and an actual insertion instruction; determining an execution time stamp of each data synchronization instruction in the effective synchronization instruction sequence in the source data table; judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp; if so, the data synchronization instruction is operated to synchronize the data to be synchronized between the source data table and the target data table, so that data synchronization between the source data table and the target data table is effectively realized, and the data consistency between the source data table and the target data table is ensured.
Specifically, the technical solutions of the embodiments of the present application are described by taking a specific application scenario as an example, but the specific instructions and values herein are merely examples and are not the only limitations on the technical solutions of the present application.
Consider a scenario in which synchronization instructions pointing to different primary keys are assigned to different tasks and executed in parallel, but the data being synchronized has a unique index set. Take, for example, two actual insertion instructions, insert (2, 1, 3) and insert (1, 1, 1), and one actual deletion instruction, delete (1, 1, 2). The values in parentheses represent, respectively, the primary key, the data corresponding to the unique index, and the execution timestamp in the source data table. The execution order of the three synchronization instructions in the source data table is: insert (1, 1, 1), delete (1, 1, 2), insert (2, 1, 3). When tasks are allocated during synchronization, for example based on Hash partitions, insert (2, 1, 3) is allocated to one task because its primary key is different, while insert (1, 1, 1) and delete (1, 1, 2) are allocated to another task, and the following out-of-order situation can occur: insert (2, 1, 3) is executed before insert (1, 1, 1), and delete (1, 1, 2) is executed last. Successfully executing insert (2, 1, 3) inserts into the target data table a piece of data with primary key 2 and unique-index data 1. When insert (1, 1, 1) is then executed, its unique-index data is also 1, so a synchronization conflict caused by the unique index occurs and insert (1, 1, 1) fails. When delete (1, 1, 2) is finally executed, the data inserted into the target data table by insert (2, 1, 3) is deleted. As a result, the data corresponding to both insert (1, 1, 1) and insert (2, 1, 3) is lost during synchronization.
For the above out-of-order situation, according to the scheme of the embodiment of the present application, because insert (2, 1, 3) has already been executed during data synchronization, the latest synchronization timestamp in the target data table is 3. When delete (1, 1, 2) is then to be executed, its execution timestamp is 2, which is earlier than the latest synchronization timestamp 3, so delete (1, 1, 2) is directly discarded. The data written when insert (2, 1, 3) was run on the source data table is therefore retained in the target data table, which implements data synchronization between the source data table and the target data table and ensures their consistency.
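The out-of-order example can be replayed with a small Python sketch that compares naive application with timestamp-guarded application. This is only an illustration under stated assumptions: the function `replay` is hypothetical, rows are matched by their unique-index data (the text describes the delete as removing the row inserted by insert (2, 1, 3)), and the latest synchronization timestamp is tracked per unique-index value. The text does not say how insert (1, 1, 1) is treated under the scheme; the sketch assumes it is discarded by the same timestamp comparison.

```python
def replay(instructions, guarded):
    """Apply (op, primary_key, unique_value, timestamp) tuples in order."""
    rows = {}    # unique_value -> primary_key (assumed row identity)
    latest = {}  # unique_value -> latest synchronization timestamp
    for op, pk, unique_value, ts in instructions:
        if guarded and ts < latest.get(unique_value, float("-inf")):
            continue  # stale instruction: discard it
        if op == "insert":
            if unique_value in rows:
                continue  # unique index conflict: the insert fails
            rows[unique_value] = pk
        elif op == "delete":
            rows.pop(unique_value, None)
        latest[unique_value] = ts
    return rows


# Arrival order in the out-of-order situation described above.
out_of_order = [("insert", 2, 1, 3), ("insert", 1, 1, 1), ("delete", 1, 1, 2)]

naive_result = replay(out_of_order, guarded=False)   # everything is lost
guarded_result = replay(out_of_order, guarded=True)  # row of insert (2, 1, 3) kept
```

Without the guard the target table ends up empty; with the guard the row with primary key 2 and unique-index data 1 is retained, matching the final state of the source data table.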
Similarly, in another case, three synchronization instructions are executed in the source data table in the following order: insert (1, 1, 1), update (1, 2, 2), insert (2, 1, 3). insert (1, 1, 1) and update (1, 2, 2) are assigned to the same task, and insert (2, 1, 3) is assigned to another task. In an out-of-order situation, insert (1, 1, 1) is executed before insert (2, 1, 3), and insert (2, 1, 3) is executed before update (1, 2, 2); similar to the above situation, a unique index conflict then causes the synchronization to fail. For this reason, according to the scheme of the present application, update (1, 2, 2) is made equivalent to delete (1, 1, 2) and insert (1, 2, 3). During data synchronization, insert (1, 1, 1) has been executed successfully, so the latest synchronization timestamp in the target data table is 1, which is earlier than the timestamp 2 in delete (1, 1, 2). delete (1, 1, 2) is therefore executed and deletes the data with primary key 1 that insert (1, 1, 1) inserted into the target data table; insert (1, 2, 3) is then executed, and finally the synchronization instruction insert (2, 1, 3) is executed. Because insert (2, 1, 3) has primary key 2 and unique-index data 1, while insert (1, 2, 3) has primary key 1 and unique-index data 2, this is equivalent to inserting two different pieces of data, so neither a primary key conflict nor a unique index conflict occurs, which ensures that the target data table stores the data of insert (1, 2, 3) and insert (2, 1, 3).
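The second example can also be replayed in a few lines of Python. The sketch below is illustrative only: `apply_all` is a hypothetical helper that models the target data table as a mapping from primary key to unique-index data and records the instructions that fail on a primary key or unique index conflict.

```python
def apply_all(instructions):
    """Apply (op, primary_key, unique_value, timestamp) tuples in order."""
    rows = {}    # primary_key -> unique_value
    failed = []  # instructions rejected because of a conflict
    for instr in instructions:
        op, pk, unique_value, _ts = instr
        if op == "delete":
            rows.pop(pk, None)
            continue
        # Insert and update both fail if another row holds the unique value.
        taken = any(v == unique_value for k, v in rows.items() if k != pk)
        if taken or (op == "insert" and pk in rows):
            failed.append(instr)
        else:
            rows[pk] = unique_value
    return rows, failed


# Out-of-order arrival with the original update instruction.
raw_order = [("insert", 1, 1, 1), ("insert", 2, 1, 3), ("update", 1, 2, 2)]
raw_rows, raw_failed = apply_all(raw_order)

# Order described in the text after the update is replaced by its
# equivalent delete and insert instructions.
decomposed = [("insert", 1, 1, 1), ("delete", 1, 1, 2),
              ("insert", 1, 2, 3), ("insert", 2, 1, 3)]
final_rows, final_failed = apply_all(decomposed)
```

In the raw order insert (2, 1, 3) fails on the unique index and its row is missing from the target; in the decomposed order no instruction fails and both rows are present.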
FIG. 1A is a schematic diagram of an application scenario in accordance with an embodiment of the present application; as shown in FIG. 1A, the application scenario includes a source data table and a target data table, which are set up, for example, on MySQL. When data in the source data table is changed (including but not limited to adding a completely new data record, deleting an existing data record, or updating part of the data in a certain data record), the target data table is synchronized accordingly (for example, by adding a completely new data record in the target data table, deleting an existing data record, or updating a certain data record to update part of its data), so that the data in the target data table and the data in the source data table are kept consistent.
Further, when the data in the source data table is modified, a data operation record is generated for each modification of each data record in the source data table and is stored in a binlog Log (Binary Log), if the modified data needs to be synchronized into the target data table, each corresponding data operation record is converted into a data synchronization instruction, for example, an insertion instruction of the data is used to increase a brand new data record in the target data table, a deletion instruction of the data is used to delete an existing data record in the target data table, and an update instruction of the data is used to update a certain data record in the target data table, so that data consistency between the source data table and the target data table is ensured.
FIG. 1B is a schematic diagram of another application scenario in accordance with an embodiment of the present application; as shown in fig. 1B, data synchronization is required between two servers, one is a source server (for example, may be referred to as a source database instance), and the other is a target server (for example, may be referred to as a target database instance), the source server is provided with a plurality of source data tables, the target server is provided with a plurality of target data tables, and data synchronization is performed between the plurality of source data tables and the plurality of target data tables, so as to implement data synchronization between the source server and the target server (or, referred to as data synchronization between the source database instance and the target database instance).
FIG. 2 is a schematic diagram of another application scenario according to an embodiment of the present application; as shown in fig. 2, in this embodiment, in order to implement efficient synchronization between the source data table and the target data table, some middleware for instruction processing is added, including but not limited to middleware for real-time data synchronization (referred to as first instruction processing middleware in this embodiment), and middleware for cache synchronization instruction (referred to as second instruction processing middleware in this embodiment).
In this embodiment, the first instruction processing middleware is, for example, a flink middleware, the second instruction processing middleware is, for example, a kafka middleware, the kafka middleware is configured to cache a source synchronization instruction sequence, the flink middleware is configured to acquire the source synchronization instruction sequence from the kafka middleware, and perform processing on the source synchronization instruction sequence to generate an effective synchronization instruction sequence, and implement the data synchronization method based on the effective synchronization instruction sequence according to the following steps:
for each data synchronization instruction in the valid sequence of synchronization instructions, determining an execution timestamp thereof in a source data table; judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp; and if so, operating the data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table.
It should be noted that, in the embodiment of fig. 2, the flink middleware and the kafka middleware are only exemplary and not limiting. For those skilled in the art, other middleware can be selected according to the requirements of the application scenario without departing from the embodiments of the present application.
The data synchronization can be specifically applied to data synchronization between clients, data synchronization between departments and the like.
However, in application scenarios of data synchronization, synchronization often fails for data on which a unique index is set in the source data table and the target data table, which causes data inconsistency between the two tables. For example, in a specific application scenario, when a user registers an account at the front end, a corresponding data record is inserted into the source data table at the front end. The data record contains the identity card number of the user; since the identity card numbers of different users are necessarily different, a unique index is configured for the identity card number, and the identity card number is synchronized into the target data table at the back end (corresponding to executing an insertion instruction of the data). When the same user later registers an account again, the login name is changed, but the identity card number filled in is exactly the same as in the previous registration. When the later data synchronization is performed (corresponding to executing an update instruction of the data), the same identity card number already exists in the target data table, so this synchronization fails. In other words, a synchronization conflict occurs during data synchronization, and the consistency of data between the source data table and the target data table cannot be guaranteed.
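The kind of unique index conflict described here can be reproduced with a short sketch. It uses Python's built-in `sqlite3` module as a stand-in for the MySQL-based target data table; the table name `target_users`, the column names and the sample values are assumptions made only for illustration.

```python
import sqlite3

# In-memory database standing in for the target data table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target_users ("
    "id INTEGER PRIMARY KEY, login_name TEXT, id_number TEXT)"
)
# The identity card number must be unique across users.
conn.execute("CREATE UNIQUE INDEX idx_id_number ON target_users (id_number)")

# First registration is synchronized successfully.
conn.execute("INSERT INTO target_users VALUES (1, 'alice', 'ID-0001')")

# A later record with a new login name but the same identity card number
# violates the unique index, so this synchronization step fails.
try:
    conn.execute("INSERT INTO target_users VALUES (2, 'alice2', 'ID-0001')")
    conflict = False
except sqlite3.IntegrityError:
    conflict = True

row_count = conn.execute("SELECT COUNT(*) FROM target_users").fetchone()[0]
```

After the failed statement the target still holds only the first record, which is how the source and target data tables drift apart.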
Of course, the above scenario based on user registration and use of identification numbers is merely for illustration of the possibility of causing data inconsistency and is not intended to be limiting.
Fig. 3A is a schematic flowchart of a data synchronization method according to an embodiment of the present application; in this embodiment, the main execution body of the method may be the flink middleware. Specifically, as shown in fig. 3A, it includes:
S301, determining an execution timestamp of each data synchronization instruction in the effective synchronization instruction sequence in a source data table;
In this embodiment, the execution timestamp of each data synchronization instruction is, for example, the timestamp at which the operation was performed on the data in the source data table. Further, to ensure that the execution timestamp can be determined in step S301, the timestamp of each operation on the data of the source data table is recorded automatically; a timestamp field is configured in the data synchronization instruction, and the recorded timestamp is filled into that field. In addition, the data synchronization instruction includes the data ID (e.g., the primary key) of the operated data in the source data table and the corresponding indexes, including the unique index.
Optionally, in this embodiment, each data synchronization instruction is parsed by the flink middleware, and the corresponding execution timestamp is determined from it. For example, if the data synchronization instruction is an insert instruction, the execution timestamp is the timestamp of inserting the data into the source data table; if it is a delete instruction, the execution timestamp is the timestamp of deleting the data from the source data table; and if it is an update instruction, the execution timestamp is the timestamp of updating the data in the source data table.
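As a minimal sketch of what such an instruction might look like, the following Python code models a data synchronization instruction with a timestamp field and reads the execution timestamp from it. The class `DataSyncInstruction`, the dictionary layout and the field names are hypothetical; the actual message format handled by the middleware is not specified in the text.

```python
from dataclasses import dataclass

VALID_KINDS = {"insert", "delete", "update"}


@dataclass(frozen=True)
class DataSyncInstruction:
    kind: str          # "insert", "delete" or "update"
    data_id: int       # data ID (primary key) in the source data table
    unique_value: int  # data covered by the unique index
    timestamp: int     # filled in when the source data was operated on


def parse_instruction(raw: dict) -> DataSyncInstruction:
    """Build an instruction from a raw message (assumed dictionary layout)."""
    kind = raw["kind"]
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown instruction kind: {kind!r}")
    return DataSyncInstruction(
        kind=kind,
        data_id=int(raw["data_id"]),
        unique_value=int(raw["unique_value"]),
        timestamp=int(raw["timestamp"]),
    )


def execution_timestamp(instr: DataSyncInstruction) -> int:
    """Step S301: the execution timestamp is read from the timestamp field."""
    return instr.timestamp


example = parse_instruction(
    {"kind": "delete", "data_id": 1, "unique_value": 1, "timestamp": 2}
)
example_timestamp = execution_timestamp(example)
```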
Optionally, if a plurality of data to be synchronized in a target data table are synchronized, the determining, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in the source data table includes: and by running a plurality of tasks in parallel, determining the execution time stamp of a plurality of data synchronization instructions in the effective synchronization instruction sequence in the source data table in parallel.
Optionally, if a plurality of data to be synchronized in a plurality of target data tables are synchronized, the determining, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in a source data table includes: by running a plurality of tasks in parallel, the execution time stamps of a plurality of data synchronization instructions of the plurality of target data tables in the source data table are determined in parallel.
In this embodiment, by distributing the plurality of data synchronization instructions to the plurality of tasks running in parallel, the efficiency of determining the execution timestamp is improved, and the timeliness of data synchronization is further improved.
Specifically, the data IDs in the multiple data synchronization instructions may be used to determine the correspondence between the data synchronization instructions and the tasks.
Of course, this way of determining the correspondence is merely an example and is not limiting.
Specifically, when the execution timestamps of the plurality of data synchronization instructions are determined by the parallel tasks, all data synchronization instructions corresponding to the same data ID are allocated to the same task. In other words, data synchronization instructions corresponding to several different data IDs may run on the same task, but the data synchronization instructions corresponding to any one data ID always run on the same task, which determines the execution timestamps of all data synchronization instructions for that data ID.
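The allocation rule can be sketched as follows. Taking the remainder of the data ID by the number of tasks is assumed here as one possible mapping; the names `task_index` and `partition` and the `pk` key are hypothetical. Any deterministic function of the data ID would keep all instructions for one ID on one task.

```python
from collections import defaultdict
from typing import Any, Dict, List


def task_index(data_id: int, num_tasks: int) -> int:
    """Map a data ID to a task; the same ID always yields the same task."""
    return data_id % num_tasks


def partition(instructions: List[Dict[str, Any]], num_tasks: int) -> Dict[int, List[Dict[str, Any]]]:
    """Group instructions by task while preserving their original order."""
    buckets: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for instr in instructions:
        buckets[task_index(instr["pk"], num_tasks)].append(instr)
    return dict(buckets)


stream = [
    {"op": "insert", "pk": 1, "ts": 10},
    {"op": "insert", "pk": 5, "ts": 11},
    {"op": "delete", "pk": 1, "ts": 12},
    {"op": "insert", "pk": 2, "ts": 13},
]
assignment = partition(stream, num_tasks=4)
```

With four tasks, IDs 1 and 5 share a task, which illustrates that several IDs may land on one task while each ID stays on exactly one.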
Further, as previously described, the valid synchronization instruction sequence may be derived from the source synchronization instruction sequence. FIG. 3B is a flowchart illustrating the generation of a valid synchronization instruction sequence according to an embodiment of the present application; in this embodiment, before step S101, the method may further include:
S300A, converting an actual updating instruction included in the source synchronous instruction sequence to generate an equivalent inserting instruction and an equivalent deleting instruction, wherein the execution time stamp of the equivalent deleting instruction is earlier than that of the equivalent inserting instruction;
Since the execution timestamp of the equivalent delete instruction is earlier than the execution timestamp of the equivalent insert instruction, the equivalent delete instruction is executed before the equivalent insert instruction.
S300B, generating an effective synchronization instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronization instructions included in the source synchronization instruction sequence, wherein the other actual synchronization instructions include at least one of an actual deletion instruction and an actual insertion instruction.
Specifically, in this embodiment, the source synchronization instruction sequence includes at least one of an actual insertion instruction, an actual deletion instruction, and an actual update instruction. An actual insertion instruction arises, for example, when a new data record is inserted into the source data table and must be synchronized to the target data table, so that a new data record is also inserted into the target data table. An actual deletion instruction arises, for example, when an existing data record is deleted from the source data table and the same data record must be deleted from the target data table. An actual update instruction arises, for example, when a data record in the source data table is updated in place and the same data record must be updated in the target data table.
As described above, if a unique index is set on a data record to be updated in the target data table, an actual update instruction may cause a synchronization conflict. To avoid this, in this embodiment the actual update instruction is converted into an equivalent insertion instruction and an equivalent deletion instruction, and the equivalent deletion instruction is specified to execute before the equivalent insertion instruction. Consequently, when the following step S302 is executed in sequence, the data record pointed to by the primary key of the equivalent insertion instruction is guaranteed to be empty in the target data table before the equivalent insertion instruction runs, and the field on which the unique index is set is guaranteed not to contain the same data. The equivalent insertion instruction can therefore run as if a new data record were being inserted into the target data table: the updated data is inserted into the data record pointed to by the primary key, no data synchronization conflict arises from a unique index conflict, and the consistency of data synchronization is ensured.
For example, in the case of account registration by a user, when the next data synchronization is performed (corresponding to the actual update instruction, also referred to as update), the actual update instruction is converted. The equivalent deletion instruction is executed first, deleting the data record that was synchronized to the target data table during the previous registration; the equivalent insertion instruction is then executed, synchronizing the data record generated in the source data table during the later registration to the target data table. This avoids the unique index conflict and ensures that the data of the source data table and the target data table are completely consistent.
The execution subject of the steps S300A and S300B may be the flink middleware. Specifically, the actual update instruction may be determined by identifying a field in the source synchronization instruction sequence that reflects the synchronization type.
In summary, the actual update instruction is converted into an equivalent insertion instruction and an equivalent deletion instruction, which together with the actual insertion instructions and actual deletion instructions form the valid synchronization instruction sequence. The valid synchronization instruction sequence therefore contains only two types of instruction, insert and delete, which simplifies the sequence and improves data synchronization efficiency.
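Steps S300A and S300B might be sketched as below. The dictionary layout and the name `to_valid_sequence` are hypothetical, and giving the equivalent delete a timestamp one tick before the update's timestamp is an assumption: the embodiment only requires that the delete's execution timestamp be earlier than the insert's.

```python
from typing import Any, Dict, List

Instruction = Dict[str, Any]


def to_valid_sequence(source_sequence: List[Instruction]) -> List[Instruction]:
    """Rewrite every update as a delete followed by an insert."""
    valid: List[Instruction] = []
    for instr in source_sequence:
        if instr["op"] == "update":
            # Equivalent delete: assumed to carry a timestamp one tick earlier.
            valid.append({"op": "delete", "pk": instr["pk"], "ts": instr["ts"] - 1})
            # Equivalent insert: carries the updated row and the update's timestamp.
            valid.append({"op": "insert", "pk": instr["pk"], "ts": instr["ts"],
                          "row": instr["row"]})
        else:
            # Actual inserts and deletes pass through unchanged.
            valid.append(dict(instr))
    return valid


source = [
    {"op": "insert", "pk": 1, "ts": 10, "row": {"email": "a@example.com"}},
    {"op": "update", "pk": 1, "ts": 20, "row": {"email": "b@example.com"}},
]
valid_sequence = to_valid_sequence(source)
```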
S302, according to the execution timestamp, whether the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table is judged.
As mentioned above, the update instruction in the source synchronization instruction sequence is converted into the equivalent insert instruction and the equivalent delete instruction, so the data synchronization instructions in the valid synchronization instruction sequence are of only two types: insert instructions (equivalent or actual) and delete instructions (equivalent or actual). Accordingly, the determination of step S302 may be made for each data synchronization instruction.
Specifically, the operation type field in the data synchronization instruction may be parsed to determine whether the instruction is an insert instruction or a delete instruction: if the operation type field is insert, the instruction is an insert instruction, and if the operation type field is delete, the instruction is a delete instruction.
Specifically, if the data synchronization instruction is an equivalent delete instruction or an actual delete instruction, the execution timestamp is a data delete timestamp.
Further, the method further comprises: determining the latest synchronization timestamp, in the target data table, of the data to be synchronized that the data synchronization instruction targets, and the data synchronization instruction corresponding to that latest synchronization timestamp. This step may be included in step S302 or performed before step S302, as long as step S302 can be carried out.
Specifically, in this embodiment, as described above, since the data synchronization instruction includes the operation type and the execution timestamp, the flink middleware may analyze the data synchronization instructions that have already been executed successfully against the target data table. From these it determines, for the data synchronization instruction currently to be executed, the latest synchronization timestamp of the data to be synchronized in the target data table and the data synchronization instruction corresponding to that latest synchronization timestamp.
Further, in this embodiment, the determining, according to the execution timestamp, whether to synchronize the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction between the source data table and the target data table includes: if the data deletion timestamp is not earlier than the latest synchronized timestamp, determining that the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table. If the data deletion timestamp is not earlier than the latest synchronized timestamp, the data to be deleted still exists in the target data table, so the deletion instruction corresponding to the data deletion timestamp needs to be executed to delete that data from the target data table. If the data deletion timestamp is earlier than the latest synchronized timestamp, a subsequent operation (such as an insertion) was performed on the same data in the source data table after the deletion instruction corresponding to the data deletion timestamp; that deletion instruction is therefore invalid or stale, does not need to be executed, and is discarded directly.
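The delete-side decision reduces to a single comparison, sketched below. The function name `should_apply_delete` is hypothetical, and treating a missing latest synchronized timestamp (`None`) as "nothing newer has been synchronized" is an assumption not stated in the embodiment.

```python
from typing import Optional


def should_apply_delete(delete_ts: int, latest_synced_ts: Optional[int]) -> bool:
    """Return True when the delete instruction should run on the target table."""
    if latest_synced_ts is None:
        # Assumption: no earlier synchronization is recorded for this primary key.
        return True
    # "Not earlier than" the latest synchronized timestamp: run the delete.
    # Strictly earlier: the delete is stale and should be discarded.
    return delete_ts >= latest_synced_ts
```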
Optionally, in this embodiment, if the data synchronization instruction is an equivalent insertion instruction or an actual insertion instruction, the execution timestamp is a data insertion timestamp.
The method further comprises: determining the latest synchronization timestamp, in the target data table, of the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction, and the data synchronization instruction corresponding to that latest synchronization timestamp. This step may be included in step S302 or performed before step S302, as long as step S302 can be carried out.
Similar to the case where the data synchronization instruction is an equivalent deletion instruction or an actual deletion instruction, the determining, according to the execution timestamp, whether to synchronize the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction between the source data table and the target data table includes: if the data insertion timestamp is not earlier than the latest synchronized timestamp and it is predicted that a synchronization failure will occur when the equivalent insertion instruction or the actual insertion instruction is executed, determining that the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table. A synchronization failure of the equivalent or actual insertion instruction occurs, for example, when a unique index conflict arises.
If the data insertion timestamp is not earlier than the latest synchronized timestamp, the data should be inserted into the target data table. If the data insertion timestamp is earlier than the latest synchronized timestamp, a subsequent operation (such as an insertion) was performed on the same data in the source data table after the insertion instruction corresponding to the data insertion timestamp; that insertion instruction is therefore invalid or stale and does not need to be executed.
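To illustrate how the latest synchronized timestamp can be tracked per primary key and used to drop stale instructions, here is a small in-memory sketch. The target table is modelled as a dictionary, and the names `apply_with_timestamps`, `latest` and `discarded` are hypothetical; a real deployment would hold this state in the target database or in the middleware's own state.

```python
from typing import Any, Dict, List, Tuple

Instruction = Dict[str, Any]


def apply_with_timestamps(
    instructions: List[Instruction],
) -> Tuple[Dict[int, Any], List[Instruction]]:
    """Apply insert/delete instructions, discarding those older than the latest sync."""
    target: Dict[int, Any] = {}                # primary key -> row in the target table
    latest: Dict[int, Tuple[int, str]] = {}    # primary key -> (timestamp, op) last applied
    discarded: List[Instruction] = []
    for instr in instructions:
        pk, ts = instr["pk"], instr["ts"]
        last = latest.get(pk)
        if last is not None and ts < last[0]:
            # Earlier than the latest synchronized timestamp: stale, do not run.
            discarded.append(instr)
            continue
        if instr["op"] == "insert":
            target[pk] = instr["row"]
        else:
            # Any non-insert instruction is treated as a delete in this sketch.
            target.pop(pk, None)
        latest[pk] = (ts, instr["op"])
    return target, discarded


out_of_order = [
    {"op": "insert", "pk": 1, "ts": 30, "row": "new"},
    {"op": "delete", "pk": 1, "ts": 20},            # arrives late, already superseded
    {"op": "insert", "pk": 2, "ts": 5, "row": "a"},
]
final_target, dropped = apply_with_timestamps(out_of_order)
```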
Optionally, in step S302, determining whether to synchronize the data to be synchronized according to the execution timestamp includes: determining, based on a set timestamp constraint condition, whether the data to be synchronized needs to be synchronized according to the execution timestamp.
The timestamp constraint condition is, for example, a where condition requiring that the timestamp field recording the latest synchronized timestamp be less than or equal to row.update, where row.update represents the execution timestamp in step S302. Such a constraint condition allows timestamps to be compared quickly, which improves the execution speed of data synchronization.
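One way such a constraint might look in SQL is shown below using Python's built-in `sqlite3` module. The table name `target`, the column name `update_time` and the helper `guarded_delete` are assumptions for illustration, since the exact form of the condition in the embodiment cannot be recovered from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT, update_time INTEGER)"
)
# The row was last synchronized at time 100.
conn.execute("INSERT INTO target VALUES (1, 'alice', 100)")


def guarded_delete(connection: sqlite3.Connection, pk: int, exec_ts: int) -> int:
    """Delete the row only if its stored timestamp is <= the instruction's timestamp."""
    cur = connection.execute(
        "DELETE FROM target WHERE id = ? AND update_time <= ?",
        (pk, exec_ts),
    )
    return cur.rowcount  # number of rows actually deleted


stale_result = guarded_delete(conn, 1, 50)    # older than the stored timestamp
fresh_result = guarded_delete(conn, 1, 150)   # not earlier than the stored timestamp
remaining = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Folding the comparison into the statement lets the database accept or reject the instruction in one round trip, instead of reading the stored timestamp first and comparing it in the middleware.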
If synchronization is needed, step S303A is executed: the data synchronization instruction is run to synchronize the data to be synchronized between the source data table and the target data table.
When the data synchronization instruction is an equivalent deletion instruction or an actual deletion instruction and synchronization is needed, running the data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table includes: running the equivalent deletion instruction or the actual deletion instruction to delete, from the target data table, the data to be synchronized corresponding to the primary key of the data synchronization instruction.
If the data deletion timestamp is not earlier than the latest synchronized timestamp, the data that should be deleted still exists in the target data table. For this reason, in step S303A the equivalent deletion instruction or the actual deletion instruction is executed directly. Where the primary key corresponding to the data synchronization instruction has corresponding data to be synchronized in the target data table, that is, where an insertion instruction was executed earlier, this deletes the data that was inserted into the target data table during the earlier data synchronization.
Optionally, when it is determined that the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table, running the data synchronization instruction to perform data synchronization between the source data table and the target data table includes: in response to the data synchronization instruction corresponding to the latest synchronization timestamp being another equivalent insertion instruction or actual insertion instruction, converting the equivalent insertion instruction or the actual insertion instruction into an equivalent update instruction, and running the equivalent update instruction to update, in the target data table, the data to be synchronized corresponding to the primary key of the data synchronization instruction.
Specifically, where data synchronization would fail because of a unique index conflict, synchronization can be achieved only by updating in place. For this purpose, the equivalent insertion instruction or the actual insertion instruction is converted into an equivalent update instruction (also called update), so that data synchronization between the source data table and the target data table is achieved by updating the data in place.
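The conversion from an insert to an equivalent update can be sketched with `sqlite3` as follows. Note one deliberate simplification: instead of predicting the failure in advance, as the embodiment describes, this sketch attempts the insert and falls back to an update when the database reports a constraint violation. The table layout and the name `insert_or_update` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target ("
    "id INTEGER PRIMARY KEY, email TEXT UNIQUE, update_time INTEGER)"
)


def insert_or_update(connection: sqlite3.Connection, pk: int, email: str, ts: int) -> str:
    """Run an insert; on a uniqueness conflict, run an equivalent update instead."""
    try:
        connection.execute(
            "INSERT INTO target (id, email, update_time) VALUES (?, ?, ?)",
            (pk, email, ts),
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # The row for this primary key already exists (an earlier insert was
        # synchronized), so update it in place, still guarded by the timestamp.
        connection.execute(
            "UPDATE target SET email = ?, update_time = ? "
            "WHERE id = ? AND update_time <= ?",
            (email, ts, pk, ts),
        )
        return "updated"


first = insert_or_update(conn, 1, "a@example.com", 10)
second = insert_or_update(conn, 1, "b@example.com", 20)
row = conn.execute("SELECT id, email, update_time FROM target").fetchone()
```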
If it is determined in the above step S302 that data synchronization between the source data table and the target data table is not required, step S303B is executed: the data synchronization instructions that do not require data synchronization are discarded.
Further, in an embodiment, the data synchronization instructions in the valid synchronization instruction sequence may be processed in batches: each batch runs the method described in fig. 3A for a plurality of data synchronization instructions, and once that batch completes, the method is run for the next plurality of data synchronization instructions, and so on.
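Batch processing of this kind can be sketched with a small generator. The name `batches` and the batch size are illustrative assumptions; the embodiment does not say how a batch is delimited.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(instructions: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `size` instructions, in order."""
    it = iter(instructions)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Each batch would be handed to the per-instruction method before the next one starts.
grouped = list(batches(range(5), 2))
```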
Fig. 4 is a schematic structural diagram of a data synchronization apparatus according to an embodiment of the present application; as shown in fig. 4, it includes:
a first processing unit 401, configured to perform conversion processing on an actual update instruction included in a source synchronization instruction sequence to generate an equivalent insertion instruction and an equivalent deletion instruction, where an execution timestamp of the equivalent deletion instruction is earlier than an execution timestamp of the equivalent insertion instruction;
a second processing unit 402, configured to generate an effective synchronization instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction, and other actual synchronization instructions included in the source synchronization instruction sequence, where the other actual synchronization instructions include at least one of an actual deletion instruction and an actual insertion instruction;
a third processing unit 403, configured to determine, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in a source data table;
a fourth processing unit 404, configured to determine whether to synchronize, between the source data table and the target data table, to-be-synchronized data pointed by a primary key corresponding to the data synchronization instruction according to the execution timestamp;
a fifth processing unit 405, configured to run the data synchronization instruction when data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table, so as to synchronize the data to be synchronized between the source data table and the target data table.
Optionally, if a plurality of data to be synchronized in a target data table are synchronized, the third processing unit 403 is specifically configured to run a plurality of tasks in parallel to determine, in parallel, an execution timestamp of a plurality of data synchronization instructions in the valid synchronization instruction sequence in the source data table.
Optionally, if multiple pieces of data to be synchronized in multiple target data tables are synchronized, the third processing unit 403 is specifically configured to run multiple tasks in parallel to determine, in parallel, execution timestamps of multiple data synchronization instructions of the multiple target data tables in the source data table.
Optionally, if the data synchronization instruction is an equivalent deletion instruction or an actual deletion instruction, the execution timestamp is a data deletion timestamp; the third processing unit 403 is further configured to determine the latest synchronization timestamp, in the target data table, of the data to be synchronized that the data synchronization instruction targets, and the data synchronization instruction corresponding to that latest synchronization timestamp.
The fourth processing unit 404 is specifically configured to determine that data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table when the data deletion timestamp is not earlier than the latest synchronized timestamp.
The fifth processing unit 405 is specifically configured to execute the equivalent deleting instruction or the actual deleting instruction to delete the to-be-synchronized data corresponding to the primary key corresponding to the data synchronization instruction in the target data table.
Optionally, if the data synchronization instruction is an equivalent insertion instruction or an actual insertion instruction, the execution timestamp is a data insertion timestamp; the third processing unit 403 is further configured to determine a latest synchronization timestamp of the to-be-synchronized data pointed by the primary key corresponding to the data synchronization instruction in the target data table, and a data synchronization instruction corresponding to the latest synchronization timestamp.
The fourth processing unit 404 is specifically configured to determine that data to be synchronized pointed by a primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table when the data insertion timestamp is not earlier than the latest synchronized timestamp and it is predicted that synchronization failure may occur when the equivalent insertion instruction is executed or the actual insertion instruction is executed.
The fifth processing unit 405 is specifically configured to, in response to that the data synchronization instruction corresponding to the latest synchronization timestamp is another equivalent insertion instruction or an actual insertion instruction, convert the equivalent insertion instruction or the actual insertion instruction into an equivalent update instruction, and run the equivalent update instruction to update the to-be-synchronized data corresponding to the primary key corresponding to the data synchronization instruction in the target data table.
The fourth processing unit 404 is specifically configured to determine whether to synchronize data to be synchronized according to the execution timestamp based on a set timestamp constraint condition.
Optionally, the fifth processing unit 405 is further configured to discard the corresponding data synchronization instruction when the data to be synchronized does not need to be synchronized.
Embodiments of the present application further provide a computer storage medium, where a computer executable program is stored on the computer storage medium, and the computer executable program is executed to implement any of the methods described in the embodiments of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application; as shown in fig. 5, the electronic device includes a memory 501 for storing a computer-executable program and a processor 502 for executing the computer-executable program to implement the method according to any of the embodiments of the present application.
The above embodiments are merely specific embodiments of the present application, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed in the present application, any person skilled in the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of data synchronization, comprising:
performing conversion processing on an actual updating instruction included in a source synchronization instruction sequence to generate an equivalent insertion instruction and an equivalent deletion instruction, wherein an execution timestamp of the equivalent deletion instruction is earlier than an execution timestamp of the equivalent insertion instruction;
generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence, wherein the other actual synchronous instructions include at least one of an actual deletion instruction and an actual insertion instruction;
determining an execution time stamp of each data synchronization instruction in the effective synchronization instruction sequence in the source data table;
judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp;
and if so, operating the data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table.
2. The method of claim 1, wherein if synchronization is performed for a plurality of data to be synchronized in one target data table, the determining an execution timestamp in a source data table for each data synchronization instruction in the valid synchronization instruction sequence comprises: running a plurality of tasks in parallel so as to determine, in parallel, the execution timestamps in the source data table of a plurality of data synchronization instructions in the valid synchronization instruction sequence.
3. The method of claim 1, wherein if synchronization is performed for a plurality of data to be synchronized in a plurality of target data tables, the determining an execution timestamp in a source data table for each data synchronization instruction in the valid synchronization instruction sequence comprises: running a plurality of tasks in parallel so as to determine, in parallel, the execution timestamps in the source data table of a plurality of data synchronization instructions of the plurality of target data tables.
4. The method according to claim 1, wherein if the data synchronization instruction is an equivalent deletion instruction or an actual deletion instruction, the execution timestamp is a data deletion timestamp;
the method further comprises: determining the latest synchronization timestamp, in the target data table, of the data to be synchronized that the data synchronization instruction targets, and the data synchronization instruction corresponding to that latest synchronization timestamp;
wherein the determining whether to synchronize the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction between the source data table and the target data table according to the execution timestamp comprises: if the data deletion timestamp is not earlier than the latest synchronized timestamp, determining that the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table;
the running the data synchronization instruction to synchronize the data to be synchronized between the source data table and the target data table comprises:
running the equivalent deletion instruction or the actual deletion instruction to delete, from the target data table, the data to be synchronized corresponding to the primary key of the data synchronization instruction.
5. The method of claim 1, wherein the execution timestamp is a data insertion timestamp if the data synchronization instruction is an equivalent insertion instruction or an actual insertion instruction;
the method further comprises: determining the latest synchronization timestamp, in the target data table, of the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction, and the data synchronization instruction corresponding to that latest synchronization timestamp;
the determining whether to synchronize the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction between the source data table and the target data table according to the execution timestamp comprises: if the data insertion timestamp is not earlier than the latest synchronized timestamp and it is predicted that a synchronization failure will occur when the equivalent insertion instruction or the actual insertion instruction is executed, determining that the data to be synchronized pointed to by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table;
the running the data synchronization instruction to perform data synchronization between the source data table and the target data table comprises: in response to the data synchronization instruction corresponding to the latest synchronization timestamp being another equivalent insertion instruction or actual insertion instruction, converting the equivalent insertion instruction or the actual insertion instruction into an equivalent update instruction, and running the equivalent update instruction to update, in the target data table, the data to be synchronized corresponding to the primary key of the data synchronization instruction.
6. The method according to any one of claims 1 to 5, wherein the determining whether the data to be synchronized needs to be synchronized according to the execution timestamp comprises: determining, based on a set timestamp constraint condition, whether the data to be synchronized needs to be synchronized according to the execution timestamp.
7. The method of claim 6, further comprising: if the data to be synchronized does not need to be synchronized, discarding the corresponding data synchronization instruction.
8. A data synchronization apparatus, comprising:
the first processing unit is used for performing conversion processing on an actual updating instruction included by a source synchronization instruction sequence to generate an equivalent insertion instruction and an equivalent deletion instruction, wherein an execution timestamp of the equivalent deletion instruction is earlier than that of the equivalent insertion instruction;
the second processing unit is used for generating an effective synchronous instruction sequence according to the equivalent insertion instruction, the equivalent deletion instruction and other actual synchronous instructions included by the source synchronous instruction sequence, wherein the other actual synchronous instructions include at least one of an actual deletion instruction and an actual insertion instruction;
a third processing unit, configured to determine, for each data synchronization instruction in the valid synchronization instruction sequence, an execution timestamp of the data synchronization instruction in a source data table;
the fourth processing unit is used for judging whether to synchronize the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction between the source data table and the target data table or not according to the execution timestamp;
and the fifth processing unit is used for operating the data synchronization instruction when the data to be synchronized pointed by the primary key corresponding to the data synchronization instruction needs to be synchronized between the source data table and the target data table, so as to synchronize the data to be synchronized between the source data table and the target data table.
9. A computer storage medium having stored thereon a computer-executable program that is executed to implement the method of any one of claims 1-7.
10. An electronic device comprising a memory for storing thereon a computer-executable program and a processor for executing the computer-executable program to implement the method of any of claims 1-7.
CN202111658188.5A 2021-12-30 2021-12-30 Data synchronization method and device, computer storage medium and electronic equipment Active CN114297214B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111658188.5A CN114297214B (en) 2021-12-30 2021-12-30 Data synchronization method and device, computer storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111658188.5A CN114297214B (en) 2021-12-30 2021-12-30 Data synchronization method and device, computer storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114297214A (en) 2022-04-08
CN114297214B (en) 2022-09-20

Family

ID=80973247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111658188.5A Active CN114297214B (en) 2021-12-30 2021-12-30 Data synchronization method and device, computer storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114297214B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090119346A1 (en) * 2007-11-06 2009-05-07 Edwina Lu Automatic error correction for replication and instantaneous instantiation
CN108536752A (en) * 2018-03-13 2018-09-14 北京信安世纪科技有限公司 A kind of method of data synchronization, device and equipment
CN110795499A (en) * 2019-09-17 2020-02-14 中国平安人寿保险股份有限公司 Cluster data synchronization method, device and equipment based on big data and storage medium
CN113010608A (en) * 2021-04-07 2021-06-22 亿企赢网络科技有限公司 Data real-time synchronization method and device and computer readable storage medium

Also Published As

Publication number Publication date
CN114297214B (en) 2022-09-20

Similar Documents

Publication Publication Date Title
CN108121782B (en) Distribution method of query request, database middleware system and electronic equipment
CN110442579B (en) State tree data storage method, synchronization method and equipment and storage medium
CN110262929B (en) Method for ensuring consistency of copying affairs and corresponding copying device
JP3779263B2 (en) Conflict resolution for collaborative work systems
US9836361B2 (en) Data replicating system, data replicating method, node device, management device and computer readable medium
CN109379432A (en) Data processing method, device, server and computer readable storage medium
CN107391634B (en) Data migration method and device
CN106874281B (en) Method and device for realizing database read-write separation
CN112182104A (en) Data synchronization method, device, equipment and storage medium
CN109086382B (en) Data synchronization method, device, equipment and storage medium
CN112307119A (en) Data synchronization method, device, equipment and storage medium
CN115185787B (en) Method and device for processing transaction log
CN112612850A (en) Data synchronization method and device
CN104917813A (en) Resource request method and device
CN113407639B (en) Data processing method, device, system and storage medium
CN114297214B (en) Data synchronization method and device, computer storage medium and electronic equipment
CN114297216B (en) Data synchronization method and device, computer storage medium and electronic equipment
CN111078418B (en) Operation synchronization method, device, electronic equipment and computer readable storage medium
US20160019121A1 (en) Data transfers between cluster instances with delayed log file flush
CN110502584B (en) Data synchronization method and device
CN112035418A (en) Multi-computer room synchronization method, computing device and computer storage medium
EP4394619A1 (en) Data processing method and apparatus based on blockchain, and device and readable storage medium
CN114265901A (en) Data synchronization method and device, computer storage medium and electronic equipment
CN111078669B (en) Processing method, device and equipment based on name resolution tree and storage medium
CN114238507A (en) Data synchronization method and device based on multiple databases

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant