CN111177162A - Data synchronization method and device - Google Patents

Publication number
CN111177162A
Authority
CN
China
Prior art keywords
data
data source
change
source
changed
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911244776.7A
Other languages
Chinese (zh)
Inventor
曹宗南
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Priority to CN201911244776.7A
Publication of CN111177162A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

In the application, a synchronization device first accesses update information of a first data source, where the update information records the data changes that occurred in the first data source in a plurality of different time periods. By referring to the update information, the synchronization device determines the data in the first data source in which a data change has occurred, that is, the changed data. The synchronization device then synchronizes the changed data to a second data source. Because the data changes occurring in the first data source are consolidated into the update information, the synchronization device can synchronize the first data source and the second data source merely by accessing the update information. This synchronization mode places no restriction on the information that the two data sources must contain, so the data synchronization method is more flexible and efficient.

Description

Data synchronization method and device
Technical Field
The present application relates to the field of communications technologies, and in particular, to a data synchronization method and apparatus.
Background
As informatization deepens, the demand for information sharing and information interaction keeps growing. To meet this demand, data in one data source often needs to be synchronized to other data sources, either on a timed basis or in real time.
The timed data synchronization method performs data synchronization at a fixed period, that is, the synchronization device queries the changed data in the source data source at fixed time intervals and then synchronizes the changed data to the target data source.
This approach is mechanical: even a source data source whose data does not change for a long time must still be synchronized at regular intervals, which wastes resources and is inefficient.
The real-time data synchronization method relies on the change time (or change identifier) of the data recorded in the source data source: it determines the changed data according to that change time (or change identifier) and then completes the data synchronization from the source data source to the target data source.
However, this method is suitable only for a source data source that records the change time (or change identifier) of its data, so its range of application is limited.
In summary, current data synchronization methods are limited and cannot perform data synchronization flexibly and efficiently.
Disclosure of Invention
The application provides a data synchronization method and device, which are used for improving the efficiency and flexibility of data synchronization.
In a first aspect, an embodiment of the present application provides a data synchronization method, which may be performed by a synchronization device. In the method, the synchronization device first accesses update information of a first data source, where the update information records the data changes occurring in the first data source in multiple different time periods. By referring to the update information, the synchronization device determines the data in the first data source in which a data change has occurred, that is, the changed data. The synchronization device then synchronizes the changed data to the second data source.
By this method, the data changes in the first data source are consolidated into the update information, and the synchronization device can synchronize the first data source and the second data source merely by accessing the update information. This synchronization mode places no restriction on the information contained in the two data sources to be synchronized, so the method is more flexible and efficient.
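The three steps of the first aspect (access the update information, determine the changed data, synchronize it) can be sketched as follows. This is a minimal illustration only, not the patented implementation: the `Change` record, the integer timestamps, the `"upsert"` and `"delete"` labels, and the use of plain dictionaries as data sources are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One entry of the update information (hypothetical layout)."""
    start: int    # start of the time period (assumed integer timestamp)
    end: int      # end of the time period
    kind: str     # "upsert" (insertion or update) or "delete"
    keys: tuple   # indexes of the data affected in the first data source


def synchronize(update_info, first, second):
    """Apply the changes recorded for `first` to `second`, oldest period first."""
    for change in sorted(update_info, key=lambda c: c.start):
        for key in change.keys:
            if change.kind == "delete":
                # Deleted data no longer exists in `first`; use the recorded index.
                second.pop(key, None)
            elif key in first:
                # Inserted or updated data is read from the first data source.
                second[key] = first[key]
    return second


first = {1: "a2", 3: "c"}          # key 1 was updated, key 3 inserted, key 2 deleted
second = {1: "a", 2: "b"}
update_info = [
    Change(0, 10, "upsert", (1, 3)),
    Change(10, 20, "delete", (2,)),
]
synchronize(update_info, first, second)
print(second)  # {1: 'a2', 3: 'c'}
```

Note that the second data source is brought up to date without scanning the whole first data source; only the data named by the update information is touched.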
In one possible design, the first data source may include one or more data tables, and each data table may store a portion of the data in the first data source.
By the method, the synchronization between the first data source and the second data source can be the synchronization between the data table and the data table.
In one possible design, the first data source includes multiple data tables that have an association relationship, for example, a data reference relationship. In this case, when synchronizing the changed data to the second data source, the synchronization device may query the association relationship among the multiple data tables, determine the synchronization order of the multiple data tables in the second data source according to that association relationship, and then synchronize the changed data to the multiple data tables of the second data source in that order.
By this method, the synchronization device determines the synchronization order of the data tables according to the association relationship between them, so that associated data tables in the second data source can be synchronized correctly, which ensures the accuracy of data synchronization between the data sources.
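One way to derive a synchronization order from the association relationship is a topological sort. The sketch below assumes that a referenced (parent) table should be synchronized before the tables that reference it; the text does not prescribe a particular ordering rule, and the table names are hypothetical. It uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def sync_order(parents_of):
    """Return table names so that every parent table precedes its child tables.

    `parents_of` maps each table name to the set of tables it references.
    """
    return list(TopologicalSorter(parents_of).static_order())


# Hypothetical schema: order_items references orders, orders references customers.
order = sync_order({
    "customers": set(),
    "orders": {"customers"},
    "order_items": {"orders"},
})
print(order)  # ['customers', 'orders', 'order_items']
```

`TopologicalSorter` raises `graphlib.CycleError` if the references form a cycle, which a real synchronizer would have to handle separately.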
In one possible design, when determining the changed data in the first data source from the update information, the synchronization device may divide data changes of the same type that occur in consecutive time periods in the update information into one group. If the data changes are divided into multiple groups, the changed data corresponding to each group is determined according to the data changes included in that group. If the data changes are divided into a single group, the changed data is determined according to the data changes included in that group.
By the method, the synchronizer can acquire the changed data more conveniently and quickly in a grouping mode, and further the data synchronization efficiency can be improved.
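The grouping step can be sketched as below. The record layout `(start, end, kind)` is an assumption for illustration, and "consecutive" is interpreted here simply as adjacency in time order.

```python
def group_changes(records):
    """Group same-type data changes that occur in consecutive time periods.

    `records` is a list of (start, end, kind) tuples (hypothetical layout).
    Returns a list of groups, each a list of records of a single kind.
    """
    groups = []
    for rec in sorted(records, key=lambda r: r[0]):
        if groups and groups[-1][-1][2] == rec[2]:
            groups[-1].append(rec)   # same type as the previous period: extend group
        else:
            groups.append([rec])     # type changed: start a new group
    return groups


records = [
    (0, 10, "insert"),
    (10, 20, "insert"),
    (20, 30, "delete"),
    (30, 40, "insert"),
]
groups = group_changes(records)
print(len(groups))  # 3
```

Each group can then be resolved with a single lookup in the first data source instead of one lookup per time period.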
In one possible design, the first data source records the data change time. In this case, when determining the changed data in the first data source according to the data changes included in a group, the synchronization device may query the first data source for data whose data change time falls within the time periods to which the data changes in the group belong, and use the queried data as the changed data.
By the method, the synchronization device can conveniently determine the changed data according to the time period recorded in the updating information and the data change time in the first data source.
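When the first data source records a change time, the lookup reduces to a range query. The sketch below uses an in-memory SQLite database as a stand-in for the first data source; the `users` table, its `updated_at` column, and the half-open interval convention are assumptions.

```python
import sqlite3

# Stand-in for the first data source (hypothetical table and column names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ann", 5), (2, "bob", 15), (3, "cy", 25)],
)


def changed_rows(conn, start, end):
    """Return rows whose change time lies in the period [start, end)."""
    cur = conn.execute(
        "SELECT id, name FROM users "
        "WHERE updated_at >= ? AND updated_at < ? ORDER BY id",
        (start, end),
    )
    return cur.fetchall()


rows = changed_rows(conn, 10, 30)
print(rows)  # [(2, 'bob'), (3, 'cy')]
```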
In one possible design, in addition to the time period to which a data change belongs, the update information may record the index of the data in the first data source in which the data change occurred. When determining the changed data in the first data source according to the data changes included in a group, the synchronization device may then determine the changed data according to the indexes, recorded in the update information, of the data in which the data changes of the group occurred.
By this method, the synchronization device can determine the changed data more directly and conveniently by using the index of the changed data in the first data source.
In one possible design, if the data change occurring in the first data source is of the deletion type, the update information may record the index that the deleted data had in the first data source.
By the method, the updating information records the index of the deleted data in the first data source, so that the synchronization device can inquire the deleted data in the first data source, and the subsequent data synchronization is facilitated.
In one possible design, if the data change occurring in the first data source is of the deletion type, then when synchronizing the changed data to the second data source, the synchronization device may delete from the second data source the data indicated by the index recorded in the update information.
By the method, the synchronization device can inquire the index of the deleted data through the updating information, and then can accurately delete the data indicated by the index from the second data source, so that data synchronization can be accurately performed.
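Replaying a deletion in the second data source from the recorded indexes might look like this. The SQLite table and the use of the primary key `id` as the index are assumptions for illustration.

```python
import sqlite3

# Stand-in for the second data source (hypothetical table).
second = sqlite3.connect(":memory:")
second.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
second.executemany(
    "INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob"), (3, "cy")]
)


def apply_deletions(conn, deleted_ids):
    """Delete the rows whose indexes were recorded in the update information."""
    conn.executemany("DELETE FROM users WHERE id = ?", [(i,) for i in deleted_ids])
    conn.commit()


apply_deletions(second, [2, 3])
remaining = second.execute("SELECT id FROM users ORDER BY id").fetchall()
print(remaining)  # [(1,)]
```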
In one possible design, the type of the data change occurring in the first data source is a table structure change. If the table structure change is a change to a data column in a data table in the first data source, the field information of the changed data column may be recorded in the update information and, optionally, the table name of the data table may also be recorded. If the table structure change is a change to a data table itself in the first data source (for example, changing the table name of the data table, deleting the data table, or adding a data table), the table name of the changed data table may be recorded in the update information.
By the method, the data column and the related information of the data table when the data of the table structure change type is changed are recorded in the update information, so that the synchronization device can conveniently determine the changed data subsequently.
In one possible design, when the type of the data change occurring in the first data source is a table structure change, the synchronization device may synchronize the changed data to the second data source in either of the following ways.
If the table structure change is a change to a data column in a data table of the first data source, the synchronization device may perform the same table structure change on the corresponding data column in the second data source according to the field information of the changed data column.
If the table structure change is a change to a data table itself in the first data source, the synchronization device may perform the same table structure change on the corresponding data table in the second data source according to the table name of the changed data table.
By the method, the synchronization device can conveniently execute the same table structure change on the second data source according to the information recorded in the updating information, and the efficiency of data synchronization can be improved.
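The following sketch replays a recorded table structure change on the second data source. The dictionary layout of a recorded change (`op`, `table`, `field_name`, `field_type`, `old_name`, `new_name`) is hypothetical, as is the choice of SQLite; only three kinds of change are covered.

```python
import sqlite3

ALLOWED_TYPES = {"INTEGER", "TEXT", "REAL"}  # assumed whitelist of field types


def quote(identifier):
    """Quote an SQL identifier, since identifiers cannot be bound as parameters."""
    return '"' + identifier.replace('"', '""') + '"'


def apply_structure_change(conn, change):
    """Perform on the second data source the change recorded for the first."""
    op = change["op"]
    if op == "add_column":
        if change["field_type"] not in ALLOWED_TYPES:
            raise ValueError("unsupported field type")
        conn.execute(
            f"ALTER TABLE {quote(change['table'])} "
            f"ADD COLUMN {quote(change['field_name'])} {change['field_type']}"
        )
    elif op == "rename_table":
        conn.execute(
            f"ALTER TABLE {quote(change['old_name'])} "
            f"RENAME TO {quote(change['new_name'])}"
        )
    elif op == "drop_table":
        conn.execute(f"DROP TABLE {quote(change['table'])}")
    else:
        raise ValueError(f"unknown structure change: {op}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
apply_structure_change(
    conn,
    {"op": "add_column", "table": "users", "field_name": "email", "field_type": "TEXT"},
)
apply_structure_change(
    conn, {"op": "rename_table", "old_name": "users", "new_name": "members"}
)
columns = [row[1] for row in conn.execute("PRAGMA table_info(members)")]
print(columns)  # ['id', 'name', 'email']
```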
In one possible design, when determining the changed data according to the update information, the synchronization device may proceed as follows. If the first data source and the second data source have not previously been synchronized, the synchronization device may determine the changed data according to all data changes recorded in the update information. If the first data source and the second data source have been synchronized before, the synchronization device may determine, according to the update information, the data changes that occurred in the first data source between the last synchronization with the second data source and the current time, and then determine the changed data in the first data source according to those data changes.
By the method, the synchronization device can determine the changed data which needs to be synchronized at the current time, and the accuracy of the whole data synchronization process is further ensured.
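Selecting the data changes that still need to be synchronized can be sketched as a filter over the update information. The record layout and the rule that a period counts as pending when it ends after the last synchronization time are assumptions.

```python
def changes_to_apply(update_info, last_sync_time=None):
    """Return the recorded data changes that still need to be synchronized.

    `update_info` is a list of dicts with "start", "end" and "type" keys
    (hypothetical layout). With no previous synchronization, every recorded
    change is returned.
    """
    if last_sync_time is None:
        return list(update_info)
    # Assumption: a period is pending if it ends after the last synchronization.
    return [rec for rec in update_info if rec["end"] > last_sync_time]


update_info = [
    {"start": 0, "end": 10, "type": "insert"},
    {"start": 10, "end": 20, "type": "update"},
    {"start": 20, "end": 30, "type": "delete"},
]
first_run = changes_to_apply(update_info)
later_run = changes_to_apply(update_info, last_sync_time=20)
print(len(first_run), len(later_run))  # 3 1
```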
In a second aspect, an embodiment of the present application further provides a synchronization apparatus, and for beneficial effects, reference may be made to the description of the first aspect, which is not described herein again. The apparatus has the functionality to implement the actions in the method instance of the first aspect described above. The functions can be realized by hardware, and the functions can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the above-described functions. In a possible design, the structure of the apparatus includes an access unit, a determination unit, and a synchronization unit, and these units may perform corresponding functions in the method example of the first aspect, for which specific reference is made to the detailed description in the method example, and details are not repeated here.
In a third aspect, an embodiment of the present application further provides a computing device, where the computing device includes a processor and a memory, and may further include a communication interface, and the processor executes program instructions in the memory to perform the method provided in the first aspect or any possible implementation manner of the first aspect. The memory is coupled to the processor and holds the program instructions and data necessary to perform data synchronization. The communication interface is used for communicating with other equipment, such as obtaining update information from other equipment.
In a fourth aspect, the present application provides a computing device system comprising at least one computing device. Each computing device includes a memory and a processor. A processor of at least one computing device is configured to access code in the memory to perform the method provided by the first aspect or any one of its possible implementations.
In a fifth aspect, the present application provides a non-transitory readable storage medium that stores a program which, when executed by a computing device, performs the method provided in the first aspect or any possible implementation manner of the first aspect. The storage medium includes, but is not limited to, volatile memory such as random access memory, and non-volatile memory such as flash memory, a hard disk drive (HDD), and a solid state drive (SSD).
In a sixth aspect, the present application provides a computer program product comprising computer instructions that, when executed by a computing device, perform the method provided in the first aspect or any possible implementation manner of the first aspect. The computer program product may be a software installation package that can be downloaded and executed on a computing device when the method provided in the first aspect or any possible implementation manner of the first aspect is to be used.
Drawings
FIG. 1 is a block diagram of a system according to the present application;
FIG. 2 is a schematic diagram of another system configuration provided herein;
FIG. 3 is a schematic diagram of another system configuration provided herein;
FIG. 4 is a schematic diagram of a data synchronization method provided in the present application;
FIGS. 5A-5B are schematic diagrams of update information provided herein;
FIG. 6 is a schematic structural diagram of a synchronization apparatus provided in the present application;
FIG. 7 is a schematic diagram of a computing device provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of a computing device in a computing device system according to an embodiment of the present application.
Detailed Description
Fig. 1 shows a system architecture suitable for the embodiment of the present application, which includes a synchronization apparatus 100 and at least two data sources.
Data is stored in the data source. The embodiment of the present application does not limit the type of the data source or the type of the data stored in it. For example, the data source may be an application programming interface (API), including but not limited to RESTful (representational state transfer), Webservice, or another custom API form. The data source may also be a message queue (MQ), such as Kafka, RabbitMQ, or Java Message Service (JMS); a database (DB), including but not limited to Oracle Database, SQLServer (structured query language server), MySQL, DB2, and PostgreSQL; a big data platform, such as the Hadoop distributed file system, Hive, Flink, HBase, Presto, or Spark; or a file store, such as File Transfer Protocol (FTP), HDFS (Hadoop distributed file system), or an object store.
In Fig. 1, the system is shown, by way of example, with two data sources, a first data source 110 and a second data source 120, which may be of the same type or of different types.
In the embodiment of the present application, the synchronization apparatus 100 may synchronize the data of the first data source 110 to the second data source 120 by executing the data synchronization method provided in the embodiment of the present application. Specifically, the synchronization apparatus 100 may access update information that records the data changes occurring in the first data source 110 in different time periods, determine the data that has changed in the first data source 110 (referred to as changed data for short) according to the update information, and then synchronize the determined changed data to the second data source 120. This data synchronization method places no restriction on the information that the first data source 110 must contain, and the synchronization apparatus 100 can synchronize the first data source 110 and the second data source 120 merely by accessing the update information, which is more flexible and efficient.
The synchronization apparatus 100 may be a hardware apparatus, such as: a server, a terminal computing device, etc., or a software device, specifically a set of software systems running on a hardware computing device. The deployed position of the synchronization apparatus 100 is not limited in the embodiment of the present application. For example, as shown in fig. 2, the synchronization apparatus 100 may operate on a cloud computing device system (including at least one cloud computing device, such as a server, etc.), may also operate on an edge computing device system (including at least one edge computing device, such as a server, a desktop, etc.), and may also operate on various terminal computing devices, such as: notebook computers, personal desktop computers, and the like.
The synchronization apparatus 100 may also be logically configured by a plurality of parts, for example, the synchronization apparatus 100 may include an access unit, a determination unit, and a synchronization unit, and each component in the synchronization apparatus 100 may be respectively deployed in different systems or servers. For example, as shown in fig. 3, each part of the apparatus may operate in three environments, namely, a cloud computing device system, an edge computing device system, or a terminal computing device, respectively, or may operate in any two of the three environments. The cloud computing device system, the edge computing device system and the terminal computing device are connected through communication paths, and can communicate with each other and transmit data. The data synchronization method provided by the embodiment of the application is cooperatively executed by the combined parts of the synchronization device 100 operating in three environments (or any two of the three environments).
A data synchronization method provided in the embodiment of the present application is described below with reference to fig. 4. As shown in fig. 4, the method includes:
Step 401: the synchronization apparatus 100 generates update information of the first data source 110, wherein the update information records data changes occurring in the first data source 110 over a plurality of different time periods.
The data changes involved in the embodiments of the present application may be divided into four types: insertion (insert), update (update), deletion (delete), and table structure change.
1. Insertion.
Insertion refers to inserting new data into the data of the first data source 110.
2. Update.
Update refers to changing one or more pieces of data in the first data source 110 to other data.
3. Deletion.
Deletion refers to removing one or more pieces of data from the first data source 110.
4. Table structure change.
When the first data source 110 includes a data table, the data table includes a plurality of data columns, and each data column is provided with a field that identifies an attribute of the data column, such as a data type of the data column. Changes to the structure of the data table in the first data source 110 include, but are not limited to: changing fields of one or more data columns in a data table, deleting one or more data columns in the data table (including deleting fields of the one or more data columns and data stored in the one or more data columns), adding one or more fields in the data table, adding a new data table, deleting a data table, changing a table name of a data table.
In the embodiment of the present application, the number of data tables in the first data source 110 is not limited and may be one or more. An association relationship is allowed between the data tables, that is, data in one table (referred to as the child table in this association) may reference data in another table (referred to as the parent table in this association).
The synchronization apparatus 100 may record a data change when it occurs in the first data source 110, or after it has occurred, and compile the recorded data changes into the update information of the first data source 110. For example, when a data change occurs in the first data source 110, the synchronization apparatus 100 may record the time of the data change, the type of the data change, or the index of the data in which the data change occurred, and then consolidate the data changes into a plurality of different time periods according to the times at which they occurred.
The synchronization apparatus 100 may also create a trigger, which may monitor the first data source 110 in real time, and when a data change occurs in the first data source 110, the trigger determines a time period to which the time of the data change occurs, records the data change and the time period to which the data change belongs, and generates the update information.
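As one possible realization of such a trigger, the sketch below uses database triggers in SQLite that write to an update information table whenever a row is inserted or deleted. The table names, the column layout, and the fixed 60-second time period are assumptions; the text does not specify how the trigger is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);

-- Hypothetical layout of the update information.
CREATE TABLE update_info (
    table_name   TEXT,
    period_start INTEGER,  -- start of the 60-second period, in Unix seconds
    change_type  TEXT,
    row_index    INTEGER
);

CREATE TRIGGER users_after_insert AFTER INSERT ON users
BEGIN
    INSERT INTO update_info
    VALUES ('users', (CAST(strftime('%s', 'now') AS INTEGER) / 60) * 60,
            'insert', NEW.id);
END;

CREATE TRIGGER users_after_delete AFTER DELETE ON users
BEGIN
    INSERT INTO update_info
    VALUES ('users', (CAST(strftime('%s', 'now') AS INTEGER) / 60) * 60,
            'delete', OLD.id);
END;
""")

conn.execute("INSERT INTO users VALUES (1, 'ann')")
conn.execute("DELETE FROM users WHERE id = 1")
recorded = conn.execute(
    "SELECT table_name, change_type, row_index, period_start "
    "FROM update_info ORDER BY rowid"
).fetchall()
for row in recorded:
    print(row[:3])
```

The delete trigger captures `OLD.id`, so the index of the deleted data remains available in the update information after the row itself is gone.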
In the embodiment of the present application, the specific length of a time period is not limited, and the time periods may be of the same or different lengths. When the update information is generated, if insertion and update data changes occur within the same time period, the insertion and the update may be merged into one record. If insertion (or update) and deletion data changes occur within the same time period, the time period needs to be divided into a time period in which the insertion (or update) data changes occur and a time period in which the deletion data changes occur. Likewise, if insertion (or update) and structure change data changes occur within the same time period, the time period needs to be divided into a time period in which the insertion (or update) data changes occur and a time period in which the structure change data changes occur. If deletion and structure change data changes occur within the same time period, the time period needs to be divided into a time period in which the deletion data changes occur and a time period in which the structure change data changes occur. In other words, deletion and structure change data changes need to be recorded separately, whereas insertion and update data changes can be merged.
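The rule for dividing a time period can be sketched as follows. The event layout `(time, kind)`, the merged label `"insert/update"`, and the use of the first and last event times as sub-period boundaries are assumptions for illustration.

```python
MERGEABLE = {"insert", "update"}  # types that may share one record


def split_period(events):
    """Divide one time period into sub-periods holding a single kind of change.

    `events` is a list of (time, kind) tuples sorted by time (hypothetical
    layout). Insertions and updates are merged under one label; deletions and
    structure changes each get their own sub-period.
    Returns a list of (start, end, label) tuples.
    """
    segments = []
    for time, kind in events:
        label = "insert/update" if kind in MERGEABLE else kind
        if segments and segments[-1][2] == label:
            start, _, _ = segments[-1]
            segments[-1] = (start, time, label)   # extend the current sub-period
        else:
            segments.append((time, time, label))  # kind changed: new sub-period
    return segments


events = [(1, "insert"), (2, "update"), (3, "delete"), (4, "structure"), (5, "insert")]
segments = split_period(events)
for seg in segments:
    print(seg)
```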
It should be noted that in some scenarios, insertion and update data changes are made with the same statement, for example a merge statement or a replace statement. When such a statement is executed, an update is performed if the target data is found by the query and an insertion is performed otherwise, so insertion and update data changes may be merged. In scenarios where insertion and update data changes use different statements, however, they may be left unmerged.
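A replace statement of this kind can be illustrated with SQLite's `REPLACE INTO`, which inserts a row when the key is absent and otherwise replaces the existing row (SQLite implements the replacement as a delete followed by an insert). The `users` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")


def upsert(conn, row_id, name):
    """Insert the row if its key is new, otherwise overwrite the existing row."""
    conn.execute("REPLACE INTO users (id, name) VALUES (?, ?)", (row_id, name))


upsert(conn, 1, "ann")    # key 1 absent: behaves as an insertion
upsert(conn, 1, "anna")   # key 1 present: behaves as an update
upsert(conn, 2, "bob")
rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(rows)  # [(1, 'anna'), (2, 'bob')]
```

Because the same statement covers both cases, the recorder does not need to distinguish insertions from updates, which is why the two types can share one record.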
For different ways of recording data in the first data source 110, the ways of recording data changes of the first data source 110 in a plurality of different time periods by the update information are also different, which are respectively described as follows:
In the first mode, the first data source 110 does not record the update time of the data, that is, the time at which data in the first data source 110 is updated or inserted is not recorded. The update information may record the data changes occurring in the first data source 110 during a plurality of different time periods, together with the index of the data in which each data change occurred. The index indicates the data in the first data source 110 in which the data change occurred. The embodiment of the present application does not limit the specific form of the index, which may be, for example, a primary key in a data table, a row number in the data table, a unique key, or a data positioning identifier.
For a data change of the deletion type, the data has already been deleted from the first data source 110. To make it easy to later determine which data was deleted, the update information may record the index that the deleted data had in the first data source 110 when the deletion occurred.
For a data change of the table structure change type, the update information needs to record the field information of the one or more data columns changed in the first data source 110, or the table name of the data table in which the data change occurred. The field information recorded in the update information includes but is not limited to the field name, field type, and field length.
If the table structure change occurring in the first data source 110 changes the fields of one or more data columns in a data table, the field name, field length, and so on of the one or more data columns before and after the change may be recorded in the update information. Optionally, the table name of the data table in which the one or more data columns are located may also be recorded.
If the table structure change occurring in the first data source 110 deletes one or more data columns in a data table, the field names of the deleted data columns are recorded in the update information. Optionally, the table name of the data table in which the one or more data columns are located may also be recorded.
If the table structure change occurring in the first data source 110 adds one or more fields to a data table, the field names of the added data columns are recorded in the update information. Optionally, the table name of the data table in which the one or more fields are located may also be recorded.
If the table structure change occurring in the first data source 110 adds a data table to the first data source 110, the table name of the added data table is recorded in the update information. Optionally, the table structure of the data table may also be recorded, such as the number of data columns included in the data table and the field name of each data column.
If the table structure change occurring in the first data source 110 deletes a data table from the first data source 110, the table name of the deleted data table is recorded in the update information.
If the table structure change occurring in the first data source 110 changes the table name of a data table in the first data source 110, the table names of the data table before and after the change may be recorded in the update information.
Fig. 5A is a schematic diagram of update information according to an embodiment of the present application, where each row records data changes occurring in a time period, and each row records a table name of a data table in which the data changes occur, a start time of the time period, an end time of the time period, a type of the data changes occurring, and an index of changed data.
For example, suppose that in the same time period data changes of the update type occur in rows 1 to 10 of the data table, and that the index of the data table is the row number. The row numbers of rows 1 to 10 may then be recorded individually in the update information, or they may be merged and recorded as the range of rows 1 to 10.
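Merging consecutive row numbers into ranges, as in the example of rows 1 to 10, can be sketched like this. The `(first, last)` tuple representation of a range is an assumption.

```python
def compress_rows(row_numbers):
    """Merge consecutive row numbers into inclusive (first, last) ranges."""
    ranges = []
    for n in sorted(set(row_numbers)):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)  # extends the current range
        else:
            ranges.append((n, n))            # gap found: start a new range
    return ranges


merged = compress_rows(range(1, 11))
print(merged)                               # [(1, 10)]
print(compress_rows([1, 2, 3, 7, 9, 10]))   # [(1, 3), (7, 7), (9, 10)]
```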
In the second mode, the first data source 110 records the update time of the data, and the update information may record data changes occurring in the first data source 110 in a plurality of different time periods, and optionally, may record an index of the changed data.
Since the first data source 110 records the update time of the data, the update information only needs to record the time period in which the data change occurred; the changed data can then be determined as the data whose update time falls within that time period.
For data changes of the deletion type and of the table structure change type, the information recorded in the update information is the same as in the first mode; reference may be made to the foregoing description, and details are not repeated here.
Fig. 5B is a schematic diagram of update information according to an embodiment of the present application, where each row records data changes occurring in a time period, and each row records a table name of a data table in which the data changes occur, a start time of the time period, an end time of the time period, and a type of the data changes occurring. Optionally, an index of the changed data may also be recorded.
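As an illustration only, one row of the update information described for fig. 5A and fig. 5B could be modeled as follows. The class name, the field names, the table name `orders`, and the date are hypothetical and are not taken from the application; the sketch merely shows that the index is mandatory in the first mode and optional in the second.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class UpdateRecord:
    """One row of update information (hypothetical field names)."""
    table_name: str
    start: datetime                               # start time of the time period
    end: datetime                                 # end time of the time period
    change_type: str                              # "insert", "update", "delete" or "schema"
    row_index: Optional[Tuple[int, int]] = None   # inclusive row range, if recorded


# First mode: the index is recorded; rows 1 to 10 are merged into one range.
mode_one = UpdateRecord("orders", datetime(2019, 12, 6, 9, 30),
                        datetime(2019, 12, 6, 9, 31), "update", (1, 10))

# Second mode: the data source stores the update time, so the index may be omitted.
mode_two = UpdateRecord("orders", datetime(2019, 12, 6, 9, 30),
                        datetime(2019, 12, 6, 9, 31), "update")
```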
Step 402: the synchronization apparatus 100 accesses the update information of the first data source 110.
The embodiment of the present application does not limit the manner of executing step 402, and the synchronization apparatus 100 may periodically access the update information of the first data source 110, may access the update information of the first data source 110 in real time, or may access the update information of the first data source 110 after receiving an instruction for instructing to perform data synchronization triggered by a user.
Step 403: the synchronization apparatus 100 determines the changed data in the first data source 110 based on the update information.
When the synchronization apparatus 100 executes step 403, if data synchronization has not yet been performed between the first data source 110 and the second data source 120, the synchronization apparatus 100 may determine, according to the update information, all data changes that occurred in the first data source 110 before the current time. If the first data source 110 and the second data source 120 have already performed data synchronization, the synchronization apparatus 100 may determine, according to the update information, all data changes that occurred in the first data source 110 after the last data synchronization between the first data source 110 and the second data source 120 and before the current time.
As a possible implementation manner, the synchronization apparatus 100 may delete the record of the synchronized change data in the update information after synchronizing the data of the first data source 110 to the second data source 120 each time. In this way, when the synchronization apparatus 100 determines changed data in the first data source 110 based on the update information, the changed data can be determined based on all information currently held by the update information.
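The selection of data changes for step 403 can be sketched as a simple filter, assuming each record of the update information is a `(start, end, change_type)` tuple. That layout and the function name `pending_changes` are assumptions made for illustration. When synchronized records are deleted after each synchronization, as in the implementation just described, the filter reduces to taking every remaining record.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (start of period, end of period, change type); hypothetical record layout
Record = Tuple[datetime, datetime, str]


def pending_changes(records: List[Record],
                    last_sync: Optional[datetime],
                    now: datetime) -> List[Record]:
    """Return the data changes that still have to be synchronized.

    If no synchronization has been performed yet (last_sync is None), every
    change before the current time is returned; otherwise only changes that
    occurred after the last synchronization and before the current time.
    """
    return [r for r in records
            if (last_sync is None or r[0] >= last_sync) and r[1] <= now]
```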
When determining the changed data in the first data source 110, the synchronization apparatus 100 may determine the changed data corresponding to the data change occurring in each time period, or may merge a plurality of consecutive time periods and determine the changed data corresponding to the data changes occurring in the merged time period. When merging consecutive time periods, if the types of data changes occurring in the consecutive time periods are the same, the synchronization apparatus 100 may merge those time periods into one group and determine the changed data in the first data source 110 according to the data changes recorded in the group. If the types of data changes occurring in two adjacent time periods are different, the two time periods are not merged, and the synchronization apparatus 100 may separately determine the changed data corresponding to the data changes occurring in each of the two time periods.
Taking the update information shown in fig. 5A as an example, the types of data changes occurring in the first time period (9:30 to 9:31) and the second time period (9:35 to 9:36) are both insertion (or both update), so the first time period and the second time period can be merged, and the synchronization apparatus 100 can determine the changed data corresponding to the data changes occurring within the merged time period (9:30 to 9:36). The type of data change occurring in the third time period (9:38 to 9:39) is deletion, which differs from the type of data change occurring in the second time period, so the second time period and the third time period cannot be merged. The synchronization apparatus 100 therefore separately determines the changed data for the merged time period (9:30 to 9:36) and for the third time period (9:38 to 9:39).
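The merging of consecutive time periods can be sketched as below, using the three time periods of the fig. 5A example. The tuple layout and the function name `merge_periods` are assumptions made for illustration, and the sketch merges only periods whose recorded types are identical.

```python
from typing import List, Tuple

# (start "HH:MM", end "HH:MM", change type); hypothetical layout
Period = Tuple[str, str, str]


def merge_periods(periods: List[Period]) -> List[Period]:
    """Merge adjacent time periods whose data change type is the same."""
    merged: List[Period] = []
    for start, end, kind in periods:
        if merged and merged[-1][2] == kind:
            # Same type as the previous period: extend the existing group.
            prev_start, _, _ = merged[-1]
            merged[-1] = (prev_start, end, kind)
        else:
            # Different type (or first period): start a new group.
            merged.append((start, end, kind))
    return merged


periods = [("09:30", "09:31", "insert"),
           ("09:35", "09:36", "insert"),
           ("09:38", "09:39", "delete")]
groups = merge_periods(periods)
# groups == [("09:30", "09:36", "insert"), ("09:38", "09:39", "delete")]
```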
The following describes the changed data corresponding to the four different data change types:
1. Changed data corresponding to data changes of the insertion and update types.
The changed data corresponding to data changes of the insertion and update types is the newly inserted or updated data in the first data source 110.
If the first data source 110 includes the data change time, that is, the time at which data is inserted or updated is recorded in the first data source 110, the synchronization apparatus 100 may query the first data source 110 for data whose change time falls within the time period, recorded in the update information, to which the insertion or update type data change belongs, and use that data as the changed data.
If the first data source 110 does not include the data change time, the synchronization apparatus 100 may query the first data source 110 for the data indicated by the index, recorded in the update information, of the data that has undergone the insertion or update type data change, and use the data indicated by the index as the changed data.
2. Changed data corresponding to data changes of the deletion type.
The changed data corresponding to a deletion-type data change is the data deleted from the first data source 110.
The synchronization apparatus 100 may determine, according to the index in the first data source 110 that is recorded in the update information for the data that has undergone the deletion-type data change, the data indicated by the index, that is, the changed data corresponding to the deletion-type data change.
It should be noted that the first data source 110 deletes the data indicated by the index when the deletion-type data change occurs. If the first data source 110 and the second data source 120 have previously synchronized the data indicated by the index, that data can be queried from the second data source 120. If the first data source 110 and the second data source 120 have not previously synchronized the data indicated by the index, the data indicated by the index may not be used as the changed data. This is the case, for example, when the first data source 110 and the second data source 120 have never been synchronized, or when the data indicated by the index was inserted after the last data synchronization between the first data source 110 and the second data source 120 and was subsequently deleted.
3. Changed data corresponding to data changes of the table structure change type.
The changed data corresponding to the data change of the table structure change type is fields of one or more data columns changed in the first data source 110 or a data table changed in the first data source 110.
The synchronization apparatus 100 may determine the changed data according to field information of one or more data columns changed in the first data source 110 recorded in the update information or a table name of a data table changed in the first data source 110.
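For the insertion and update case described above, the two ways of locating changed data can be sketched as follows. The first data source is represented here as a list of dictionaries, and the keys `changed_at` and `index` as well as the function name `changed_rows` are hypothetical. Times are zero-padded "HH:MM" strings so that plain string comparison orders them correctly.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def changed_rows(rows: List[Row], period: Tuple[str, str],
                 indexes: Optional[List[int]] = None) -> List[Row]:
    """Select changed rows from the first data source.

    If the rows carry a change time, select by the time period recorded in
    the update information; otherwise select by the recorded indexes.
    """
    start, end = period
    if rows and "changed_at" in rows[0]:
        return [r for r in rows if start <= r["changed_at"] <= end]
    wanted = set(indexes or [])
    return [r for r in rows if r["index"] in wanted]


# The data source records the change time (second mode).
with_time = [{"index": 1, "changed_at": "09:30"},
             {"index": 2, "changed_at": "09:50"}]
by_time = changed_rows(with_time, ("09:30", "09:36"))

# The data source does not record the change time (first mode).
without_time = [{"index": 1}, {"index": 2}, {"index": 3}]
by_index = changed_rows(without_time, ("09:30", "09:36"), indexes=[2, 3])
```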
After the changed data is determined, step 404 may be performed.
Step 404: the synchronization apparatus 100 synchronizes the change data to the second data source 120.
The following describes how the changed data corresponding to the four different data change types is synchronized:
1. Changed data corresponding to data changes of the insertion and update types.
If the first data source 110 includes the data change time, the second data source 120 also includes the data change time accordingly.
The synchronization apparatus 100 may query the second data source 120 for data whose change time falls within the time period, recorded in the update information, to which the insertion or update type data change belongs. If such data can be found in the second data source 120, the type of the data change is update, and the synchronization apparatus 100 synchronizes the corresponding data in the second data source 120 to the changed data. If no such data can be found in the second data source 120, the type of the data change is insertion; the synchronization apparatus 100 inserts the changed data into the second data source 120 and sets the change time of the changed data in the second data source 120 to the data change time of the changed data in the first data source 110.
If the first data source 110 does not include the data change time, the synchronization apparatus 100 may query the second data source 120 for the data indicated by the index, recorded in the update information, of the data that has undergone the insertion or update type data change in the first data source 110. If the data indicated by the index can be found in the second data source 120, the type of the data change is update, and the synchronization apparatus 100 synchronizes the data indicated by the index in the second data source 120 to the changed data. If the data indicated by the index cannot be found in the second data source 120, the type of the data change is insertion; the synchronization apparatus 100 inserts the changed data into the second data source 120 and sets the index of the inserted data in the second data source 120 to the index recorded in the update information.
It should be noted that, since synchronization is required between the first data source 110 and the second data source 120, that is, the index of the same data in the first data source 110 is the same as the index in the second data source 120, the synchronization apparatus 100 may query the second data source 120 for the data indicated by the index according to the index of the data in the first data source 110.
2. Changed data corresponding to data changes of the deletion type.
The synchronization apparatus 100 may query the second data source 120 for the data indicated by the index according to the index of the data, which is recorded in the update information and has undergone the deletion-type data change, in the first data source 110, and delete the data indicated by the index in the second data source 120.
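The index-based synchronization of insertion, update, and deletion changes can be sketched as below, with the second data source modeled as a dictionary keyed by index. The names `apply_row_changes` and `Change` and the tuple layout are hypothetical. As in the description, whether a change is treated as an update or an insertion is decided by whether the index is already present in the second data source, and a deletion of data that was never synchronized is simply skipped.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Change = Tuple[str, int, Row]   # (change type, index, row data); hypothetical layout


def apply_row_changes(target: Dict[int, Row], changes: List[Change]) -> List[str]:
    """Apply row-level changes to the second data source and report each action."""
    actions: List[str] = []
    for change_type, index, row in changes:
        if change_type == "delete":
            removed = target.pop(index, None)
            actions.append("deleted" if removed is not None else "skipped")
        elif index in target:
            target[index] = dict(row)     # index found: treat as an update
            actions.append("updated")
        else:
            target[index] = dict(row)     # index not found: treat as an insertion
            actions.append("inserted")
    return actions


target = {1: {"name": "a"}}
actions = apply_row_changes(target, [
    ("update", 1, {"name": "b"}),
    ("insert", 2, {"name": "c"}),
    ("delete", 1, {}),
    ("delete", 9, {}),   # never synchronized, so nothing to delete
])
```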
3. Changed data corresponding to data changes of the table structure change type.
The synchronization apparatus 100 may perform the same table structure change on the data table in the second data source 120 according to the field information of the one or more data columns changed in the first data source 110 recorded in the update information or the table name of the data table added in the first data source 110.
If a data change of the type of the table structure change occurring in the first data source 110 changes fields of one or more data columns in the changed data table, the synchronization apparatus 100 may change fields of the same one or more data columns in the data table in the second data source 120 according to field information of the one or more data columns before and after the change recorded in the update information.
If the data change of the table structure change type occurred in the first data source 110 is to delete one or more data columns in the data table, the synchronization apparatus 100 may delete one or more data columns of the same field in the data table in the second data source 120 according to the field information of the deleted one or more data columns recorded in the update information.
If the data change of the table structure change type occurred in the first data source 110 is to add one or more fields in the data table, the synchronization apparatus 100 adds one or more data columns of the same fields in the data table in the second data source 120 according to the added one or more fields recorded in the update information.
If the data change of the table structure change type occurring in the first data source 110 is to add a data table to the first data source 110, the synchronization apparatus 100 adds a data table of the same table name to the second data source 120 according to the table name of the added data table recorded in the update information. The synchronization apparatus 100 may also set the same table structure for the added data table in the second data source 120 according to the table structure of the data table recorded in the update information (e.g. the number of data columns included in the data table, the field name of each data column, etc.).
If the data change of the table structure change type occurring in the first data source 110 is to delete a data table from the first data source 110, the synchronization apparatus 100 may delete the data table of the same table name from the second data source 120.
If the data change of the table structure change type occurring in the first data source 110 is to change the table name of a data table, the synchronization apparatus 100 may change the table name of the same data table in the second data source 120 according to the table names of the data table before and after the change recorded in the update information.
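One way to replay the table structure changes listed above on the second data source is to translate each recorded change into a statement in the target's data definition language. The sketch below assumes an SQLite target and a hypothetical dictionary layout for the recorded change (`kind`, `table`, `columns`, `new_name`, `old`, `new`); none of these names come from the application. Column types are omitted because the description records only field names, which SQLite permits.

```python
import sqlite3
from typing import Any, Dict, List


def q(name: str) -> str:
    """Quote an identifier for SQLite."""
    return '"' + name.replace('"', '""') + '"'


def schema_change_sql(change: Dict[str, Any]) -> List[str]:
    """Translate one recorded table structure change into SQLite statements."""
    kind, table = change["kind"], q(change["table"])
    if kind == "add_table":
        cols = ", ".join(q(c) for c in change["columns"])
        return [f"CREATE TABLE {table} ({cols})"]
    if kind == "drop_table":
        return [f"DROP TABLE {table}"]
    if kind == "rename_table":
        return [f"ALTER TABLE {table} RENAME TO {q(change['new_name'])}"]
    if kind == "add_columns":
        return [f"ALTER TABLE {table} ADD COLUMN {q(c)}" for c in change["columns"]]
    if kind == "drop_columns":
        return [f"ALTER TABLE {table} DROP COLUMN {q(c)}" for c in change["columns"]]
    if kind == "rename_column":
        return [f"ALTER TABLE {table} RENAME COLUMN "
                f"{q(change['old'])} TO {q(change['new'])}"]
    raise ValueError(f"unknown table structure change: {kind}")


# Replay three recorded changes on an in-memory second data source.
conn = sqlite3.connect(":memory:")
for change in [
    {"kind": "add_table", "table": "orders", "columns": ["id", "amount"]},
    {"kind": "add_columns", "table": "orders", "columns": ["status"]},
    {"kind": "rename_table", "table": "orders", "new_name": "orders_v2"},
]:
    for statement in schema_change_sql(change):
        conn.execute(statement)
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders_v2)")]
```

Other database systems use different syntax for some of these statements, so the mapping would need to be adapted to the actual type of the second data source.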
As a possible implementation manner, when the first data source 110 includes multiple data tables and there is an association relationship between the multiple data tables, the synchronization apparatus 100 may also determine a synchronization order of the same multiple data tables in the second data source 120 when synchronizing the changed data to the second data source 120.
For the data change of the insertion or update type, if the changed data corresponding to the data change of the insertion or update type exists in the child table and the parent table at the same time, the synchronization apparatus 100 may synchronize the data in the parent table in the second data source 120 first, and then synchronize the data in the child table in the second data source 120.
For a deletion type data change, if the changed data corresponding to the deletion type data change exists in the child table and the parent table at the same time, the synchronization apparatus 100 may synchronize the data in the child table in the second data source 120 first, and then synchronize the data in the parent table in the second data source 120.
For a data change of the table structure change type, the synchronization apparatus 100 may not set the synchronization order of the same plurality of data tables in the second data source 120.
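The ordering rule for associated tables (parent tables first for insertion and update, child tables first for deletion) can be sketched with a topological sort. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the function name `sync_order` and the table names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def sync_order(parents: Dict[str, Set[str]], change_type: str) -> List[str]:
    """Return the order in which associated tables are synchronized.

    `parents` maps each table to the set of parent tables it references.
    """
    # static_order() yields every table after all of its parents.
    parent_first = list(TopologicalSorter(parents).static_order())
    if change_type == "delete":
        return parent_first[::-1]   # child tables first for deletions
    return parent_first             # parent tables first for insert/update


relations = {"order_items": {"orders"}, "orders": {"customers"}, "customers": set()}
insert_order = sync_order(relations, "insert")
delete_order = sync_order(relations, "delete")
# insert_order == ["customers", "orders", "order_items"]
# delete_order == ["order_items", "orders", "customers"]
```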
Based on the same inventive concept as the method embodiment, the embodiment of the present application further provides a synchronization apparatus, where the synchronization apparatus is configured to execute the method executed by the synchronization apparatus 100 in the method embodiment. As shown in fig. 6, the synchronization apparatus 600 includes an accessing unit 601, a determining unit 602, and a synchronizing unit 603, which may be software modules. Specifically, in the synchronization apparatus 600, the modules are connected to each other through a communication path.
The accessing unit 601 is configured to access update information of the first data source, where the update information is used to record data changes occurring in the first data source in multiple different time periods, for example, to perform step 402 in the method embodiment shown in fig. 4.
The determining unit 602 is configured to determine, according to the update information, changed data in the first data source, for example, to perform step 403 in the method embodiment shown in fig. 4.
The synchronizing unit 603 is configured to synchronize the changed data to the second data source, for example, to perform step 404 in the method embodiment shown in fig. 4.
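When the three units are implemented as software modules, their cooperation can be pictured as a small pipeline. The following is only a structural sketch: the class name `SyncApparatus`, its method `run`, and the use of plain callables for the units are assumptions made for illustration.

```python
from typing import Any, Callable, List


class SyncApparatus:
    """Chains the three units in the order of steps 402, 403 and 404."""

    def __init__(self,
                 access: Callable[[], List[Any]],
                 determine: Callable[[List[Any]], List[Any]],
                 synchronize: Callable[[List[Any]], None]) -> None:
        self.access = access              # accessing unit 601 (step 402)
        self.determine = determine        # determining unit 602 (step 403)
        self.synchronize = synchronize    # synchronizing unit 603 (step 404)

    def run(self) -> List[Any]:
        update_info = self.access()
        changed = self.determine(update_info)
        self.synchronize(changed)
        return changed
```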
As a possible implementation, the first data source includes at least one data table, and the data table is used for storing data.
As a possible embodiment, in the case that the first data source includes a plurality of data tables, there may be an association relationship among the plurality of data tables, and when synchronizing the changed data to the second data source, the synchronizing unit 603 may determine the synchronization order of the plurality of data tables in the second data source according to the association relationship among the plurality of data tables; and then, synchronizing the changed data to a plurality of data tables of the second data source according to the synchronization sequence.
As a possible embodiment, the determining unit 602, when determining changed data in the first data source according to the update information, may divide data changes of the same type occurring in consecutive time periods in the update information into a group. If the data changes are divided into a plurality of groups, the determining unit 602 may determine the changed data corresponding to each group according to the data changes included in that group. If the data changes are divided into a single group, the determining unit 602 may determine the changed data according to the data changes included in the group.
As a possible embodiment, when the data change time is included in the first data source, and the determination unit 602 determines changed data in the first data source according to the data change included in the group, data in the first data source whose data change time is in a time period to which the data change included in the group belongs may be regarded as changed data.
As a possible embodiment, the update information may further include, in addition to the time period to which the data change belongs, an index of the data in which the data change has occurred in the first data source, and the determination unit 602 may, when determining the changed data in the first data source from the data change included in the group, take, as the changed data, data indicated by the index of the data in which the data change included in the group has occurred in the first data source.
As a possible implementation manner, when the type of the data change occurring in the first data source is deletion, the update information may record the index, in the first data source, of the data deleted by the deletion-type data change.
As a possible implementation manner, when the type of the data change occurring in the first data source is deletion, the synchronization unit 603 may delete the data indicated by the index in the second data source when synchronizing the changed data to the second data source.
As a possible implementation manner, when the type of the data change occurring in the first data source is a table structure change, the update information includes field information of a data column in the data table of the first data source that has been changed and/or a table name of the data table in which the first data source has been changed.
As a possible implementation manner, when the type of the data change occurring in the first data source is a table structure change, the synchronizing unit 603, when synchronizing the changed data to the second data source, may perform the table structure change on the data column in the second data source according to the field information of the changed data column in the data table of the first data source, or may perform the table structure change on the data table in the second data source according to the table name of the changed data table in the first data source.
As a possible embodiment, when determining the changed data according to the update information, the determining unit 602 may first determine, according to the update information, the data changes that have occurred in the first data source from the time of the last synchronization with the second data source to the current time, and then determine the changed data from the first data source according to the determined data changes.
The division of the modules in the embodiments of the present application is schematic and is only a division by logical function; in actual implementation, other division manners are possible. In addition, the functional modules in the embodiments of the present application may be integrated in one processor, may exist alone physically, or two or more modules may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part thereof that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a terminal device (which may be a personal computer, a mobile phone, or a network device) or a processor to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
An embodiment of the present application further provides a computing device, such as the computing device 700 shown in fig. 7. The computing device 700 includes a bus 701, a processor 702, a communication interface 703, and a memory 704. The processor 702, the memory 704, and the communication interface 703 communicate over the bus 701.
The processor 702 may be a central processing unit (CPU). The memory 704 may include volatile memory, such as random access memory (RAM). The memory 704 may also include non-volatile memory, such as a read-only memory (ROM), a flash memory, an HDD, or an SSD. The memory 704 stores executable code that the processor 702 executes to perform the aforementioned data synchronization method.
Specifically, the memory 704 stores the modules of the synchronization apparatus 600. In addition to the aforementioned modules, the memory 704 may include other software modules required for running a process, such as an operating system. The operating system may be LINUX™, UNIX™, WINDOWS™, and the like.
The present application also provides a computing device system that includes at least one computing device 800 as shown in fig. 8. The computing device 800 includes a bus 801, a processor 802, a communication interface 803, and a memory 804. The processor 802, memory 804, and communication interface 803 communicate over a bus 801. At least one computing device 800 in the system of computing devices communicates with each other via a communication path.
The processor 802 may be a CPU. The memory 804 may include volatile memory, such as random access memory. The memory 804 may also include non-volatile memory, such as a read-only memory, a flash memory, an HDD, or an SSD. The memory 804 stores executable code that the processor 802 executes to perform any part or all of the aforementioned data synchronization method.
Specifically, the memory 804 stores any one or more modules of the synchronization apparatus 600. In addition to the one or more modules described above, the memory 804 may include other software modules required to run a process, such as an operating system. The operating system may be LINUX™, UNIX™, WINDOWS™, and the like.
The at least one computing device 800 in the computing device system, each of which runs any one or more modules of the synchronization apparatus 600, establish communication with each other over a communication network. The at least one computing device 800 collectively performs the aforementioned data synchronization operations.
The descriptions of the flows corresponding to the above-mentioned figures have respective emphasis, and for parts not described in detail in a certain flow, reference may be made to the related descriptions of other flows.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product for data synchronization comprises one or more computer program instructions for data synchronization which, when loaded and executed on a computer, produce, in whole or in part, the flows or functions of data synchronization according to the embodiments of the present application.
The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center over a wired link (e.g., coaxial cable, optical fiber, or digital subscriber line) or a wireless link (e.g., infrared, radio, or microwave). The computer-readable storage medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., SSD).

Claims (24)

1. A method for synchronizing data, the method comprising:
accessing update information of a first data source, the update information being used to record data changes occurring in the first data source over a plurality of different time periods;
determining changed data in the first data source according to the updating information;
synchronizing the changed data to a second data source.
2. The method of claim 1, wherein the first data source includes at least one data table for storing data.
3. The method of claim 2, wherein the first data source includes a plurality of data tables, and wherein synchronizing the change data to the second data source includes:
determining the synchronization sequence of the plurality of data tables in the second data source according to the association relationship among the plurality of data tables;
and synchronizing the changed data to a plurality of data tables of the second data source according to the synchronization sequence.
4. A method according to any one of claims 1 to 3, wherein said determining changed data in said first data source based on said update information comprises:
dividing the same type of data changes occurring in successive time periods in the update information into a group;
determining the changed data according to the data changes included in the group.
5. The method of claim 4, wherein the first data source includes a data change time therein, and wherein determining changed data in the first data source based on the data changes included in the group comprises:
taking, as the changed data, data in the first data source whose data change time falls within the time period to which the data changes included in the group belong.
6. The method of claim 4, wherein the update information includes an index in the first data source of data for which the data change occurred, and wherein determining changed data in the first data source from the data changes included in the group comprises:
taking, as the changed data, the data indicated by the index, in the first data source, of the data in which the data changes included in the group occurred.
7. The method according to any one of claims 1 to 3, wherein the type of the data change of the first data source is deletion, and the update information includes an index of the deleted data in the first data source when the data change of the deletion type of the first data source occurs.
8. The method of claim 7, wherein the synchronizing the change data to the second data source comprises:
deleting the data indicated by the index in the second data source.
9. The method according to any one of claims 1 to 3, wherein the type of the data change of the first data source is a table structure change, and the update information includes field information of a data column of the data table of the first data source which is changed and/or a table name of the data table of the first data source which is changed.
10. The method of claim 9, wherein the synchronizing the change data to the second data source comprises:
performing table structure change on the data column in the second data source according to the field information of the changed data column in the data table of the first data source; or
performing table structure change on the data table in the second data source according to the table name of the changed data table in the first data source.
11. The method of claim 1, wherein said determining change data based on said update information comprises:
determining data change of the first data source from the last synchronization time of the second data source to the current time according to the updating information;
and determining the changed data from the first data source according to the determined data change.
12. A synchronization apparatus, characterized in that the apparatus comprises an access unit, a determination unit, and a synchronization unit:
the access unit is used for accessing the updating information of a first data source, and the updating information is used for recording data changes occurring in the first data source in a plurality of different time periods;
the determining unit is used for determining changed data in the first data source according to the updating information;
the synchronization unit is configured to synchronize the changed data to a second data source.
13. The apparatus of claim 12, wherein the first data source comprises at least one data table for storing data.
14. The apparatus of claim 13, wherein the first data source comprises a plurality of data tables, and the synchronization unit synchronizes the change data to the second data source, and is specifically configured to:
determining the synchronization sequence of the plurality of data tables in the second data source according to the association relationship among the plurality of data tables;
and synchronizing the changed data to a plurality of data tables of the second data source according to the synchronization sequence.
15. The apparatus according to any one of claims 12 to 14, wherein the determining unit determines the changed data in the first data source according to the update information, and is specifically configured to:
dividing the same type of data changes occurring in successive time periods in the update information into a group;
determining the changed data according to the data changes included in the group.
16. The apparatus according to claim 15, wherein the first data source comprises a data change time, and the determining unit, when determining the changed data in the first data source according to the data changes included in the group, is specifically configured to:
take, as the changed data, the data in the first data source whose data change time falls within the time periods of the data changes included in the group.
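As a sketch of claim 16, the function below selects rows by their change time. It assumes each row stores a `changed_at` timestamp and each time period is a half-open `(start, end)` pair; both names are hypothetical. Because the periods in a group are successive, the earliest start and the latest end bound the whole group.

```python
from datetime import datetime


def rows_changed_in_group(rows, group_periods):
    """Return the rows whose change time falls inside the span covered by the group."""
    start = min(period_start for period_start, _ in group_periods)
    end = max(period_end for _, period_end in group_periods)
    return [row for row in rows if start <= row["changed_at"] < end]


# Hypothetical rows of the first data source.
rows = [
    {"id": 1, "changed_at": datetime(2019, 12, 6, 9, 30)},
    {"id": 2, "changed_at": datetime(2019, 12, 6, 11, 15)},
    {"id": 3, "changed_at": datetime(2019, 12, 6, 13, 0)},
]

# Three successive one-hour periods that were grouped together.
group_periods = [
    (datetime(2019, 12, 6, 9, 0), datetime(2019, 12, 6, 10, 0)),
    (datetime(2019, 12, 6, 10, 0), datetime(2019, 12, 6, 11, 0)),
    (datetime(2019, 12, 6, 11, 0), datetime(2019, 12, 6, 12, 0)),
]

changed = rows_changed_in_group(rows, group_periods)
print([row["id"] for row in changed])
```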
17. The apparatus according to claim 15, wherein the update information includes an index of the data in the first data source on which the data change occurred, and the determining unit, when determining the changed data in the first data source according to the data changes included in the group, is specifically configured to:
take, as the changed data, the data in the first data source indicated by the indexes of the data changes included in the group.
18. The apparatus according to any one of claims 12 to 14, wherein the type of the data change in the first data source is deletion, and the update information includes an index of the data deleted from the first data source by the deletion-type data change.
19. The apparatus of claim 18, wherein the synchronization unit, when synchronizing the changed data to the second data source, is specifically configured to:
delete the data in the second data source indicated by the index.
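Claims 18 and 19 imply that deleted rows cannot be read back from the first data source, so the recorded index is the only handle on them. The sketch below assumes the second data source is modeled as a dictionary keyed by that index; `apply_deletions` is a hypothetical helper, not an API named in the patent.

```python
def apply_deletions(target_rows, deleted_indexes):
    """Delete the rows named by `deleted_indexes` from `target_rows` in place.

    Returns the number of rows actually removed. Indexes that are not present
    are skipped, so replaying the same deletion twice is harmless.
    """
    removed = 0
    for index in deleted_indexes:
        if index in target_rows:
            del target_rows[index]
            removed += 1
    return removed


# Hypothetical contents of the second data source, keyed by primary key.
target = {101: {"name": "a"}, 102: {"name": "b"}, 103: {"name": "c"}}

# Indexes recorded in the update information for deletion-type changes.
removed = apply_deletions(target, [102, 999])
print(removed, sorted(target))
```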
20. The apparatus according to any one of claims 12 to 14, wherein the type of the data change in the first data source is a table structure change, and the update information includes field information of a changed data column in a data table of the first data source and/or a table name of a changed data table of the first data source.
21. The apparatus of claim 20, wherein the synchronization unit, when synchronizing the changed data to the second data source, is specifically configured to:
perform a table structure change on the data column in the second data source according to the field information of the changed data column in the data table of the first data source; or
perform a table structure change on the data table in the second data source according to the table name of the changed data table in the first data source.
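The two branches of claim 21 can be sketched against an in-memory SQLite database standing in for the second data source. The `change` dictionary layout (`kind`, `table`, `column`, `column_type`, `new_name`) is an assumed encoding of the field information and table name carried in the update information; the patent does not specify one.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for use in an SQLite statement."""
    return '"' + name.replace('"', '""') + '"'


def apply_structure_change(conn, change):
    """Replay one table structure change on the second data source."""
    table = quote_identifier(change["table"])
    if change["kind"] == "add_column":
        column = quote_identifier(change["column"])
        # column_type is assumed to come from trusted update information.
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {change['column_type']}")
    elif change["kind"] == "rename_table":
        new_name = quote_identifier(change["new_name"])
        conn.execute(f"ALTER TABLE {table} RENAME TO {new_name}")
    else:
        raise ValueError(f"unsupported structure change: {change['kind']}")


target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")

apply_structure_change(
    target,
    {"kind": "add_column", "table": "users", "column": "email", "column_type": "TEXT"},
)
apply_structure_change(
    target,
    {"kind": "rename_table", "table": "users", "new_name": "accounts"},
)

# PRAGMA table_info returns one row per column; the name is the second field.
columns = [row[1] for row in target.execute("PRAGMA table_info(accounts)")]
print(columns)
```

Only adding a column and renaming a table are covered here; other structure changes would need their own branches.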
22. The apparatus according to claim 12, wherein the determining unit, when determining the changed data according to the update information, is specifically configured to:
determine, according to the update information, the data changes that occurred in the first data source from the last synchronization time of the second data source to the current time; and
determine the changed data from the first data source according to the determined data changes.
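The first step of claim 22 amounts to filtering the update information by a time window. The sketch below assumes each entry records the start and end of its time period as numeric timestamps, and treats an entry as relevant when its period overlaps the window from the last synchronization time to the current time. The field names and the overlap rule are assumptions.

```python
def changes_since(update_info, last_sync_time, now):
    """Return the entries whose time period overlaps (last_sync_time, now]."""
    return [
        entry
        for entry in update_info
        if entry["period_end"] > last_sync_time and entry["period_start"] <= now
    ]


# Hypothetical update information with three successive time periods.
update_info = [
    {"period_start": 0, "period_end": 10, "change_type": "insert"},
    {"period_start": 10, "period_end": 20, "change_type": "update"},
    {"period_start": 20, "period_end": 30, "change_type": "delete"},
]

pending = changes_since(update_info, last_sync_time=12, now=25)
print([entry["change_type"] for entry in pending])
```

A period that is only partly covered by the window is included in full, so the second data source may receive some changes again. That is safe only if applying a change twice has no further effect.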
23. A computing device system comprising at least one computing device, each computing device comprising a memory and a processor, wherein the memory of the at least one computing device is configured to store computer instructions; and
the processor of the at least one computing device executes the computer instructions stored in the memory to perform the method of any one of claims 1 to 11.
24. A non-transitory readable storage medium, wherein the non-transitory readable storage medium stores instructions that, when executed by a computing device, cause the computing device to perform the method of any one of claims 1 to 11.
CN201911244776.7A 2019-12-06 2019-12-06 Data synchronization method and device Pending CN111177162A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911244776.7A CN111177162A (en) 2019-12-06 2019-12-06 Data synchronization method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911244776.7A CN111177162A (en) 2019-12-06 2019-12-06 Data synchronization method and device

Publications (1)

Publication Number Publication Date
CN111177162A true CN111177162A (en) 2020-05-19

Family

ID=70650172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911244776.7A Pending CN111177162A (en) 2019-12-06 2019-12-06 Data synchronization method and device

Country Status (1)

Country Link
CN (1) CN111177162A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656221A (en) * 2021-08-18 2021-11-16 中国邮政储蓄银行股份有限公司 Data processing method and device, computer readable storage medium and processor

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038195A (en) * 2015-12-17 2017-08-11 阿里巴巴集团控股有限公司 Method of data synchronization and device
CN108563658A (en) * 2017-12-29 2018-09-21 邵阳学院 A kind of method and apparatus of multi-platform data synchronization updating
CN109885581A (en) * 2019-03-14 2019-06-14 苏州达家迎信息技术有限公司 Synchronous method, device, equipment and the storage medium of database
US10353907B1 (en) * 2016-03-30 2019-07-16 Microsoft Technology Licensing, Llc Efficient indexing of feed updates for content feeds

Similar Documents

Publication Publication Date Title
US11226948B2 (en) Index maintenance based on a comparison of rebuild vs. update
US9336227B2 (en) Selective synchronization in a hierarchical folder structure
CN112307037B (en) Data synchronization method and device
JP6521402B2 (en) Method for updating data table of KeyValue database and apparatus for updating table data
EP3125501B1 (en) File synchronization method, server, and terminal
CN108121782B (en) Distribution method of query request, database middleware system and electronic equipment
US10803079B2 (en) Timing-based system-period temporal table in a database system
US11188423B2 (en) Data processing apparatus and method
CN106874281B (en) Method and device for realizing database read-write separation
CN109145060B (en) Data processing method and device
US11599425B2 (en) Method, electronic device and computer program product for storage management
CN111177162A (en) Data synchronization method and device
CN113761052A (en) Database synchronization method and device
CN109165259B (en) Index table updating method based on network attached storage, processor and storage device
CN116775712A (en) Method, device, electronic equipment, distributed system and storage medium for inquiring linked list
CN114490865A (en) Database synchronization method, device, equipment and computer storage medium
CN112948494A (en) Data synchronization method and device, electronic equipment and computer readable medium
CN108573042B (en) Report synchronization method, electronic equipment and computer readable storage medium
CN113742376A (en) Data synchronization method, first server and data synchronization system
EP3214549A1 (en) Information processing device, method, and program
US20190121894A1 (en) Parallel map and reduce on hash chains
CN110362706B (en) Data searching method and device, storage medium and electronic device
CN113950145B (en) Data processing method and device
CN113127164B (en) Method, apparatus and computer program product for managing tasks in application nodes
CN117573679A (en) Data synchronization method, device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220214

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Applicant after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Applicant before: HUAWEI TECHNOLOGIES Co.,Ltd.
