CN112783980B

CN112783980B - Data synchronous processing method, device, electronic equipment and computer readable medium

Info

Publication number: CN112783980B
Application number: CN202110137823.9A
Authority: CN
Inventors: 王浩; 王道龙
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-02-01
Filing date: 2021-02-01
Publication date: 2024-05-10
Anticipated expiration: 2041-02-01
Also published as: CN112783980A

Abstract

The application discloses a data synchronization method and device, and relates to the technical field of databases. One embodiment of the method includes determining data storage structures of at least one source database, generating a data structure table corresponding to each storage structure; for each source database, responsive to receiving a database log of the source database, copying data of the database log into a data structure table corresponding to the source database; based on the change of the data in the data structure table corresponding to each source database, the data change flow of each data structure table is recorded in a preset data change table.

Description

Data synchronous processing method, device, electronic equipment and computer readable medium

Technical Field

The present application relates to the field of computer technology, in particular to the field of database technology, and more specifically to a data synchronization processing method, apparatus, electronic device, computer readable medium, and computer program product.

Background

Service data across the internet is currently growing rapidly and must be stored. The data to be stored may come from several different channels, for example from a downstream system or from different types of databases, and when the volume of stored data is large, the service data is stored by incremental synchronization.

Incremental synchronous storage can be done in several ways. One way is to interface with the downstream system so that it sends a real-time notification whenever data is added or changed. When data changes frequently, the receiver is kept busy with synchronization operations, which consumes network resources and places strict requirements on the receiver's performance; in extreme cases the receiver can be brought down. Another way is to use timed tasks that scan the full or incremental data at regular intervals and compare the differences. This approach cannot guarantee timeliness, and a poorly designed difference comparison can occupy a large amount of system resources.

Disclosure of Invention

A data synchronization processing method, apparatus, electronic device, computer readable medium, and computer program product are provided.

According to a first aspect, there is provided a data synchronization processing method, the method comprising: determining data storage structures of at least one source database, and generating a data structure table corresponding to each storage structure; for each source database, responsive to receiving a database log of the source database, copying data of the database log into a data structure table corresponding to the source database; and recording the data change flow of each data structure table in a preset data change table based on the change of the data in the data structure table corresponding to each source database.

According to a second aspect, there is provided a data synchronization processing apparatus comprising: a generation unit configured to determine data storage structures of at least one source database, generating a data structure table corresponding to each storage structure; a copying unit configured to copy, for each source database, data of a database log into a data structure table corresponding to the source database in response to receiving the database log of the source database; and the recording unit is configured to record the data change flow of each data structure table in a preset data change table based on the change of the data in the data structure table corresponding to each source database.

According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect.

According to a fourth aspect, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a method as described in any implementation of the first aspect.

According to a fifth aspect, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.

The embodiments of the application provide a data synchronization processing method and device. First, the data storage structure of at least one source database is determined, and a data structure table corresponding to each storage structure is generated. Second, for each source database, in response to receiving a database log of the source database, the data of the database log is copied into the data structure table corresponding to that source database. Finally, based on changes of the data in the data structure table corresponding to each source database, the data change flow of each data structure table is recorded in a preset data change table. Data from multiple different databases can therefore be stored by incremental synchronization, and the data change flow of the data changed during incremental synchronization is recorded automatically, which improves the flexibility of data change recording, saves data storage space, and improves data storage efficiency.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.

Drawings

The drawings are included to provide a better understanding of the present application and are not to be construed as limiting the application. Wherein:

FIG. 1 is a flow chart of one embodiment of a data synchronization processing method in accordance with the present application;

FIG. 2 is a schematic diagram of a structure for data synchronization processing in accordance with the present application;

FIG. 3 is a flow chart of another embodiment of a data synchronization processing method according to the present application;

FIG. 4 is a flow chart of a method of querying a data change pipeline of data corresponding to all subscription fields in a data change table according to the present application;

FIG. 5 is a schematic diagram of an embodiment of a data synchronization processing apparatus according to the present application;

Fig. 6 is a block diagram of an electronic device for implementing a data synchronization processing method according to an embodiment of the present application.

Detailed Description

Exemplary embodiments of the present application will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present application are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 shows a flow 100 of one embodiment of a data synchronization processing method according to the present application. The data synchronous processing method comprises the following steps:

Step 101, determining data storage structures of at least one source database, and generating a data structure table corresponding to each storage structure.

In this embodiment, the source database is the database on the data source side. Because different kinds of source databases store data in different storage structures, the data storage structure of each source database is determined first, either manually or by other means (for example, by communicating with a server that stores the data storage structure of the source database in advance). A source database stores its data in database tables, so determining the storage structure amounts to determining the table structure of the tables in the source database. The table structure includes at least one table field, and each table field records different data. The data structure table is a blank table with the same table fields as the storage structure, and it is used to synchronize the changed data of the source database.

In this embodiment, the execution body on which the data synchronization processing method runs may be a database, and data synchronization and replication between the source database and the execution body may follow a conventional synchronization and replication process between two databases.
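As an illustration of step 101, the following minimal Python sketch builds a blank data structure table from a described storage structure. It uses an in-memory SQLite database as a stand-in for the execution body. The helper create_structure_table, the table name orders_mirror, and the field list are hypothetical names chosen for the example and do not come from the application.

```python
import sqlite3


def create_structure_table(conn, table_name, fields):
    """Create a blank table whose columns mirror the source table's fields."""
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in fields)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({columns})')


# Hypothetical storage structure of one source table: (field name, type) pairs.
source_fields = [
    ("order_id", "INTEGER PRIMARY KEY"),
    ("status", "INTEGER"),
    ("amount", "REAL"),
]

conn = sqlite3.connect(":memory:")
create_structure_table(conn, "orders_mirror", source_fields)

# The data structure table starts empty and exposes the same fields as the source.
mirror_columns = [row[1] for row in conn.execute("PRAGMA table_info(orders_mirror)")]
row_count = conn.execute("SELECT COUNT(*) FROM orders_mirror").fetchone()[0]
```

In practice the field list would be read from the source database's own catalog rather than written by hand; the hand-written list simply keeps the sketch self-contained.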

Step 102, for each source database, in response to receiving the database log of the source database, copying the data of the database log into the data structure table corresponding to the source database.

In this embodiment, the execution body on which the data synchronization processing method operates receives the data of the database log generated in the native synchronization manner of the source database, copies the data of the database log into the data structure table corresponding to the source database, and may record all the data in the database log of the source database in real time in the data structure table.

The way a database log is received differs by source database type. As shown in Fig. 2, for a MySQL source database, the MySQL binlog of the source database (a binary log file that records data updates or potential updates in MySQL) may be received through a command for viewing log files. Specifically, the size and usage of each database log file in the instance may be obtained with the dbcc sqlperf(logspace) command. Further, the dbcc sqlperf(logspace) command may be executed dynamically and the database log data inserted into the data structure table, which achieves the copying of the database log data. For an Oracle source database (oracle in Fig. 2), the database log of the source database may be received through a database synchronization tool (for example, the SharePlex tool, which can read all changed data of the database from the database log of the Oracle source database). For other types of databases (other databases in Fig. 2), a synchronization manner suited to that type may be used to obtain the database log.
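The copying in step 102 can be pictured as replaying log entries against the data structure table. The sketch below assumes the database log has already been decoded into simple Python dictionaries; that event format, the function apply_log_event, and the orders_mirror table are assumptions made for illustration, not the actual format of a MySQL binlog or of any synchronization tool's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_mirror (order_id INTEGER PRIMARY KEY, status INTEGER)")


def apply_log_event(conn, event):
    """Replay one already-decoded log event against the data structure table."""
    op = event["op"]
    if op == "insert":
        conn.execute(
            "INSERT INTO orders_mirror (order_id, status) VALUES (?, ?)",
            (event["order_id"], event["status"]),
        )
    elif op == "update":
        conn.execute(
            "UPDATE orders_mirror SET status = ? WHERE order_id = ?",
            (event["status"], event["order_id"]),
        )
    elif op == "delete":
        conn.execute("DELETE FROM orders_mirror WHERE order_id = ?", (event["order_id"],))
    else:
        raise ValueError(f"unknown operation: {op}")


# Hypothetical decoded log entries, in the order the source database produced them.
log_events = [
    {"op": "insert", "order_id": 1, "status": 1},
    {"op": "insert", "order_id": 2, "status": 1},
    {"op": "update", "order_id": 1, "status": 2},
    {"op": "delete", "order_id": 2},
]
for event in log_events:
    apply_log_event(conn, event)

mirror_rows = conn.execute(
    "SELECT order_id, status FROM orders_mirror ORDER BY order_id"
).fetchall()
```

After the replay, the data structure table holds only the latest state of each row, which is the behavior the embodiment describes for this table.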

Step 103, based on the change of the data in the data structure table corresponding to each source database, recording the data change flow of each data structure table in a preset data change table.

In this embodiment, the data structure table records only the final value of the data corresponding to each field, whereas the data change table records every value the data takes, forming a data change flow. For example, if an integer changes from 1 to 2, from 2 to 3, and from 3 to 5, the data structure table records only the value 5, while the data change table records the data change flow formed by the values 1, 2, 3, and 5.

In this embodiment, the data change table may be set up in advance according to the data structure of each data structure table, and it may be a single table or multiple tables. When there are multiple data change tables, each may correspond to one data structure table and record the data change flow of that table. When there is a single data change table, it may record the data change flow of all data structure tables.

As shown in Table 1, the data change table includes table creation content (e.g., id, table_name, create_time) and data change content (modify_content, type). In the data change table, the change content modify_content stores at least one table field in a multi-column manner, and the data types of the table fields may be the same or different. Because the auto-increment primary key id in the data change table increases automatically and never repeats, whether the data change table holds new change content modify_content can be determined by checking whether the latest auto-increment primary key id is the same as the id observed at the previous check. If the two differ, the data change table holds new records of the data change flow.

TABLE 1

Name	Type	Length	Description
id	int	20	Primary key
table_name	varchar	50	Table name of the change log table
type	int	2	Change type: 1 = add; 2 = modify; 3 = delete
modify_content	varchar	255	Change content, multi-field storage
create_time	datetime	0	Table creation time
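The layout in Table 1 and the check on the auto-increment primary key can be sketched as follows. The column names follow Table 1, while the SQLite column types, the table name data_change, and the helper last_change_id are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Columns follow Table 1; the SQLite types are an assumption of this sketch.
conn.execute("""
    CREATE TABLE data_change (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT NOT NULL,
        type INTEGER NOT NULL,          -- 1 = add, 2 = modify, 3 = delete
        modify_content TEXT,
        create_time TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")


def last_change_id(conn):
    """Return the largest auto-increment id, or 0 when the table is empty."""
    return conn.execute("SELECT COALESCE(MAX(id), 0) FROM data_change").fetchone()[0]


previous_id = last_change_id(conn)
conn.execute(
    "INSERT INTO data_change (table_name, type, modify_content) VALUES (?, ?, ?)",
    ("orders_mirror", 2, '{"status": 2}'),
)

# A differing id means the data change table holds new change records.
has_new_records = last_change_id(conn) != previous_id
```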

In this embodiment, a trigger may be set on the data structure table corresponding to each source database, so that whenever the data in a data structure table changes, the changed data is recorded in the preset data change table. A trigger is an operation that is executed automatically when a specified event occurs. These events include insert (INSERT), update (UPDATE), and delete (DELETE) statements executed on the respective data structure tables. When the database system executes such a statement, the trigger is activated and inserts the changed data into the data change table.

In this embodiment, the execution body of the data synchronization processing method may add a trigger to each data structure table, so that the trigger fires automatically whenever a change operation occurs on that table and stores the resulting data change flow in the data change table.
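The trigger mechanism can be sketched with SQLite triggers driven from Python. The table names counter_mirror and data_change, the trigger names, and the 'value=...' text written into modify_content are hypothetical choices for this example; the application does not prescribe a particular database product or content encoding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE counter_mirror (item_id INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE data_change (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT, type INTEGER, modify_content TEXT,
        create_time TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- One trigger per change type: 1 = add, 2 = modify, 3 = delete.
    CREATE TRIGGER counter_added AFTER INSERT ON counter_mirror BEGIN
        INSERT INTO data_change (table_name, type, modify_content)
        VALUES ('counter_mirror', 1, 'value=' || NEW.value);
    END;
    CREATE TRIGGER counter_modified AFTER UPDATE ON counter_mirror BEGIN
        INSERT INTO data_change (table_name, type, modify_content)
        VALUES ('counter_mirror', 2, 'value=' || NEW.value);
    END;
    CREATE TRIGGER counter_deleted AFTER DELETE ON counter_mirror BEGIN
        INSERT INTO data_change (table_name, type, modify_content)
        VALUES ('counter_mirror', 3, 'value=' || OLD.value);
    END;
""")

# An integer starts at 1 and then changes to 2, 3, and 5.
conn.execute("INSERT INTO counter_mirror (item_id, value) VALUES (1, 1)")
for new_value in (2, 3, 5):
    conn.execute("UPDATE counter_mirror SET value = ? WHERE item_id = 1", (new_value,))

final_value = conn.execute("SELECT value FROM counter_mirror").fetchone()[0]
change_flow = [
    row[0] for row in conn.execute("SELECT modify_content FROM data_change ORDER BY id")
]
```

The data structure table ends up holding only the final value, while the data change table keeps one record for each value the integer took.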

In some optional implementations of this embodiment, the recording the data change flow of each data structure table in the preset data change table includes: and recording the data change flow of the data corresponding to at least one table field in the data structure table corresponding to each source database in the data change table in real time.

In this optional implementation manner, the trigger may record the change condition of the data of any one or more table fields in the data structure table, so as to obtain a data change pipeline of the data corresponding to at least one table field in the data structure table.

The method for recording the data change flow provided by this optional implementation can record the data change flow of an entire data structure table or of individual table fields. The granularity goes down to the column level of the data structure table, which improves the flexibility of data change recording.

In the data synchronization processing method provided by this embodiment of the application, first, the data storage structure of at least one source database is determined, and a data structure table corresponding to each storage structure is generated. Second, for each source database, in response to receiving a database log of the source database, the data of the database log is copied into the data structure table corresponding to that source database. Finally, based on changes of the data in the data structure table corresponding to each source database, the data change flow of each data structure table is recorded in a preset data change table. Data from multiple different databases can therefore be stored by incremental synchronization, and the data change flow of the data changed during incremental synchronization is recorded automatically, which improves the flexibility of data change recording, saves data storage space, and improves data storage efficiency.
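Column-level recording can be sketched with a trigger that is restricted to a single table field. The example uses SQLite's UPDATE OF clause together with a WHEN condition so that only real changes to the status field are recorded; the table, column, and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_mirror (order_id INTEGER PRIMARY KEY, status INTEGER, note TEXT);
    CREATE TABLE data_change (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT, type INTEGER, modify_content TEXT
    );
    -- Fires only for updates that name the status column and actually change it.
    CREATE TRIGGER status_modified AFTER UPDATE OF status ON orders_mirror
    WHEN OLD.status IS NOT NEW.status
    BEGIN
        INSERT INTO data_change (table_name, type, modify_content)
        VALUES ('orders_mirror', 2, 'status=' || NEW.status);
    END;
    INSERT INTO orders_mirror VALUES (1, 1, 'created');
""")

conn.execute("UPDATE orders_mirror SET note = 'reviewed' WHERE order_id = 1")  # not recorded
conn.execute("UPDATE orders_mirror SET status = 2 WHERE order_id = 1")         # recorded

recorded = [
    row[0] for row in conn.execute("SELECT modify_content FROM data_change ORDER BY id")
]
```

Only the change to the status field reaches the data change table; the update to the note field leaves it untouched.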

Fig. 3 shows a flow 300 of another embodiment of a data synchronization processing method according to the application. The data synchronous processing method comprises the following steps:

Step 301, determining data storage structures of at least one source database, and generating a data structure table corresponding to each storage structure.

Step 302, for each source database, in response to receiving the database log of the source database, copying the data of the database log into the data structure table corresponding to the source database.

Step 303, based on the change of the data in the data structure table corresponding to each source database, recording the data change stream of each data structure table in the preset data change table.

It should be understood that the operations and features in steps 301-303 described above correspond to those in steps 101-103, respectively, and thus the descriptions of the operations and features in steps 101-103 described above are equally applicable to steps 301-303, and are not repeated herein.

Step 304, a subscription table is obtained that includes at least one subscription field of the subscriber.

In this embodiment, the subscriber is a service party that needs to subscribe to changed data. The service party records its subscription content (for example, subscription fields and an index identifier) in the subscription table, in an application system that supports the subscription table. After obtaining the subscription table, the execution body on which the data synchronization processing method of this embodiment runs can extract the subscription content and, based on it, extract from the data change table the data change flow of the data corresponding to the subscription content.

As shown in Table 2, the subscription table has table creation content (id, table_name) and subscription content (table_column, index_id). The auto-increment primary key id of the subscription table also increases automatically and never repeats. The index identifier index_id is an auto-increment primary key of the data change table, namely the one reached at the last synchronization with the data change table. The subscription column name table_column, also called the subscription field, is the same as at least one table field in the change content of the data change table. The subscription field supports multi-column storage, and it determines which fields' data change flow is to be subscribed to.

TABLE 2

Name	Type	Length	Description
id	int	20	Auto-increment primary key
table_name	varchar	50	Table name of the subscription table
table_column	varchar	255	Subscribed column names, supporting multi-column storage
index_id	int	20	Index identifier: the id at the last synchronization with the data change table

It should be noted that Table 2 is only one example format of the subscription table. In other embodiments, depending on the requirements of the subscriber, the subscription content may include only the index identifier, only the subscription field, or neither.

Step 305, based on the subscription table, query the data change stream of the data corresponding to all the subscription fields in the data change table.

In this embodiment, the subscription field in the subscription table is the same as at least one table field in the data change table, and the data change stream of the data in the data change table can be quickly determined through the subscription field.

To obtain a more useful data change flow, either the entire data change flow of a subscription field or only part of it may be obtained; the partial data change flow can be selected through the auto-increment primary key in the data change table.

In some optional implementations of the present embodiment, the subscription table includes an index identifier, and the data change table includes an auto-increment primary key; the above-mentioned querying, based on the subscription table, the data change flow of the data corresponding to all subscription fields in the data change table includes:

In response to determining that the last auto-increment primary key in the data change table differs from the index identifier recorded in the subscription table, querying the data change flow of the data corresponding to all subscription fields between the index identifier recorded in the subscription table and the last auto-increment primary key in the data change table; and replacing the index identifier recorded in the subscription table with the last auto-increment primary key.

In this optional implementation, the auto-increment primary key in the data change table increases (for example, by one) with each recorded data change, so every auto-increment primary key is unique. The last auto-increment primary key is the one generated most recently, and its value is generally greater than that of every other auto-increment primary key in the table.

In this optional implementation, the index identifier in the subscription table is an auto-increment primary key of the data change table, specifically the one reached at the last synchronization with the data change table. After each query of the data change table, the last auto-increment primary key found in that query replaces the index identifier, and the new index identifier is recorded in the subscription table. The next query then continues from this index identifier and obtains the data change flow of the data corresponding to all subscription fields up to the then-last auto-increment primary key in the data change table.

In this optional implementation, when the last auto-increment primary key in the data change table differs from the index identifier recorded in the subscription table, it is determined that the data change table holds data change flow beyond the index identifier recorded in the subscription table, and this is the data change flow the subscriber needs. Replacing the index identifier recorded in the subscription table with the last auto-increment primary key then keeps the subscription up to date, which improves the real-time subscription effect.
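The incremental query driven by the index identifier can be sketched as follows. Several details are assumptions made only for this illustration: modify_content is stored as a JSON object of field names and values, table_column holds a comma-separated list of subscribed field names, table_name in the subscription table names the data structure table being subscribed to, and fetch_new_changes is a hypothetical helper.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_change (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT, type INTEGER, modify_content TEXT
    );
    CREATE TABLE subscription (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT, table_column TEXT, index_id INTEGER DEFAULT 0
    );
""")


def fetch_new_changes(conn, subscription_id):
    """Return changes recorded after index_id, then advance index_id."""
    table_name, columns, index_id = conn.execute(
        "SELECT table_name, table_column, index_id FROM subscription WHERE id = ?",
        (subscription_id,),
    ).fetchone()
    last_id = conn.execute("SELECT COALESCE(MAX(id), 0) FROM data_change").fetchone()[0]
    if last_id == index_id:
        return []  # nothing new since the last synchronization
    wanted = set(columns.split(","))
    rows = conn.execute(
        "SELECT id, modify_content FROM data_change "
        "WHERE table_name = ? AND id > ? AND id <= ? ORDER BY id",
        (table_name, index_id, last_id),
    ).fetchall()
    changes = []
    for change_id, content in rows:
        # Keep only the subscribed fields of each change record.
        fields = {k: v for k, v in json.loads(content).items() if k in wanted}
        if fields:
            changes.append((change_id, fields))
    # Replace the recorded index identifier with the last auto-increment key.
    conn.execute(
        "UPDATE subscription SET index_id = ? WHERE id = ?", (last_id, subscription_id)
    )
    return changes


conn.executemany(
    "INSERT INTO data_change (table_name, type, modify_content) VALUES (?, ?, ?)",
    [
        ("orders_mirror", 1, '{"status": 1, "amount": 10}'),
        ("orders_mirror", 2, '{"status": 2}'),
        ("orders_mirror", 2, '{"amount": 12}'),
    ],
)
conn.execute(
    "INSERT INTO subscription (table_name, table_column) VALUES ('orders_mirror', 'status')"
)

first_batch = fetch_new_changes(conn, 1)
second_batch = fetch_new_changes(conn, 1)
```

The second call returns nothing because the index identifier already equals the last auto-increment primary key, which is the condition the embodiment uses to decide whether new data change flow exists.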

Step 306, pushing the queried data change stream of the data corresponding to all the subscription fields to the subscriber.

In this embodiment, depending on the communication mode between the execution body and the subscriber, the queried data change flow may be pushed to the subscriber in different ways, for example through a web page or through short messages.

The data synchronization processing method of the embodiment can be realized through an independent system or logic, and the subscriber can directly register and use in the system or the system where the independent logic is located, so that the docking is simple and convenient, and the complexity of system business can be effectively reduced.

According to the data synchronization processing method provided by the embodiment, after the data change flow of each data structure table is recorded in the preset data change table, the subscription table configured by the subscriber for the data change table is obtained, the data change flow of the data corresponding to all subscription fields in the data change table is queried based on the subscription table, and the queried data change flow of the data corresponding to all subscription fields is pushed to the subscriber, so that the data change flow can be quickly pushed to the subscriber through the subscription table configured by the subscriber, and the subscriber can determine the data change condition in real time.

In some optional implementations of the present embodiment, when the subscription table includes an index identifier and the data change table includes an auto-increment key, the present embodiment further provides a method for querying a data change stream of data corresponding to all subscription fields in the data change table, and fig. 4 shows a flow 400 of one embodiment of the method for querying a data change stream of data corresponding to all subscription fields in the data change table according to the present application. The method for querying the data change stream of the data corresponding to all the subscription fields in the data change table comprises the following steps:

Step 401, in response to determining that the last self-increasing main key in the data change table is different from the index identifier recorded in the subscription table, locating invalid data in the data change pipeline of data corresponding to all subscription fields between the last self-increasing main key and the index identifier recorded in the subscription table.

In this optional implementation, invalid data may be determined according to the data synchronization requirement or the subscription requirement of the subscriber. For example, if, between the index identifier recorded in the subscription table and the last auto-increment primary key in the data change table, the data changes away from a value and later returns to that same value, all values recorded in between are invalid data, whatever the intermediate changes were. Alternatively, the values lying between two records that have the same value in the data change flow may be treated as valid data, and the data other than the valid data treated as invalid data.

Step 402, invalid data is removed.

In this optional implementation manner, removing the invalid data refers to deleting the invalid data in the located data change stream, so that the located data change stream does not include any invalid data. In this embodiment, the invalid data may be at least one segment of the data change pipeline, or may be a data value.

Step 403, querying the data change stream of the data corresponding to all subscription fields between the last auto-increment key in the data change table and the index identifier recorded in the subscription table.

In this optional implementation manner, the data change flow of the data corresponding to all the subscription fields between the last auto-increment key in the data change table and the index identifier recorded in the subscription table is the data change flow of the data corresponding to the subscription fields after the invalid data is removed.

In this optional implementation manner, after the data change stream of the data corresponding to all the subscription fields between the last auto-increase main key in the data change table and the index identifier recorded in the subscription table is queried, the data change stream of the data corresponding to all the subscription fields between the last auto-increase main key in the queried data change table and the index identifier recorded in the subscription table is pushed to the subscriber.

In this embodiment, invalid data may be determined based on the requirements of the subscriber. For example, all data for which the change between adjacent values in the data change flow equals a set value (for example, 5) may be treated as invalid data. For data stored by incremental synchronization, all intermediate data of a change sequence that returns to an earlier value may be treated as invalid data. For example, if the data change flow of an integer is from 1 to 2, from 2 to 4, from 4 to 7, and from 7 to 2, then the changes from 2 to 4, from 4 to 7, and from 7 to 2 are all invalid data.

In some optional implementations of this embodiment, locating invalid data in the data change pipeline of data corresponding to all subscription fields between the last auto-increase primary key and index identifiers recorded in the subscription table includes: arranging all the self-increasing main keys between the last self-increasing main key and index marks recorded in the subscription table in descending order to obtain data change pipeline segments of data corresponding to all subscription fields; detecting whether the same value exists in the data change pipeline segment; in response to determining that there are identical values in the data change pipeline segment, all data between the two identical values is determined to be invalid data.

In this optional implementation, suppose the value of the data corresponding to a subscription field, within the range between the index identifier recorded in the subscription table and the last auto-increment primary key, is initially 1, changes to 2, and then changes back to 1. The data can then be regarded as unchanged, so these records are invalid data and need not be pushed to the subscriber. To screen the data change table, its auto-increment primary keys are arranged in descending order, so that the topmost record carries the last auto-increment primary key, that is, the most recent modification. For example, after the data value corresponding to the last auto-increment primary key of a subscription field is found, the data change table is searched for other modification records of that subscription field. If such records exist, the data values are compared to detect whether they are identical. If two values are identical, the data change flow segment between the two identical values of the subscription field can be removed; if no identical value exists, the data change flow segment is pushed to the subscriber.

In this optional implementation, the data change flow segment of the data corresponding to all subscription fields is determined first, and the segment is checked for identical values. When identical values exist in the segment, all data between the two identical values is determined to be invalid data. For data stored by incremental synchronization, this provides a reliable means of removing invalid data and ensures the validity of the data change flow pushed to the subscriber.
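One possible way to remove the data between two identical values is sketched below. It is a simplified variant rather than the exact procedure of the embodiment: it walks the values of a single subscription field in ascending key order instead of the descending order described above, and the function name remove_invalid_data is hypothetical.

```python
def remove_invalid_data(values):
    """Drop every run of changes that ends by returning to an earlier value.

    `values` lists the successive values of one subscription field in the
    order they were recorded. The result keeps only the net change flow.
    """
    kept = []
    position = {}  # value -> its index in `kept`
    for value in values:
        if value in position:
            # The field returned to an earlier value: everything recorded
            # after that earlier occurrence is invalid and is discarded.
            cut = position[value]
            for dropped in kept[cut + 1:]:
                del position[dropped]
            del kept[cut + 1:]
        else:
            position[value] = len(kept)
            kept.append(value)
    return kept


# 1 -> 2 -> 4 -> 7 -> 2: the changes 2->4, 4->7, 7->2 cancel out.
net_flow = remove_invalid_data([1, 2, 4, 7, 2])
# 1 -> 2 -> 1: the field is effectively unchanged.
unchanged = remove_invalid_data([1, 2, 1])
# 1 -> 2 -> 3 -> 5: no repeated value, so nothing is removed.
untouched = remove_invalid_data([1, 2, 3, 5])
```

With this rule, a flow that returns to its starting value collapses to that single value, so the caller can treat a one-element result as nothing to push.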

Step 404, replacing the index identifier recorded in the subscription table with the last auto-increase primary key.

In the alternative implementation mode, the last self-increasing main key is adopted to replace the index mark recorded in the subscription table and is used for updating the index mark in the subscription table, so that the index mark in the subscription table is synchronous with the last self-increasing main key of the data change table, and the reliability of the data change running water in the data change table queried next time is ensured.

According to the method for determining the data change flow of the data corresponding to all subscription fields provided in this embodiment, when the last auto-increment primary key in the data change table differs from the index identifier recorded in the subscription table, invalid data is located in the data change flow of the data corresponding to all subscription fields between the last auto-increment primary key and the index identifier recorded in the subscription table, and the invalid data is removed. This improves the quality of the subscription data obtained through the subscription table and ensures the validity of the data obtained by the subscriber.

With further reference to fig. 5, as an implementation of the method shown in the foregoing figures, the present application provides an embodiment of a data synchronization processing apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 1, and the apparatus is particularly applicable to various electronic devices.

As shown in fig. 5, the data synchronization processing apparatus 500 provided in this embodiment includes: a generating unit 501, a copying unit 502, and a recording unit 503. The generating unit 501 may be configured to determine a data storage structure of at least one source database, and generate a data structure table corresponding to each storage structure. The copying unit 502 may be configured to, for each source database, copy, in response to receiving a database log of the source database, the data of the database log into the data structure table corresponding to the source database. The recording unit 503 may be configured to record the data change flow of each data structure table in a preset data change table based on changes of the data in the data structure table corresponding to each source database.

In the present embodiment, in the data synchronization processing apparatus 500: the specific processing of the generating unit 501, the copying unit 502, and the recording unit 503 and the technical effects thereof may refer to the descriptions related to the steps 101, 102, and 103 in the corresponding embodiment of fig. 1, and are not repeated herein.
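The division of work among the three units can be pictured with a small in-memory Python sketch. It only illustrates the idea under stated assumptions: the class `SyncStore`, its method names, and the simplified log entry format `(primary key, {field: value})` are hypothetical, and real database logs have a different, engine-specific format.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class SyncStore:
    # source name -> {"fields": [...], "rows": {pk: row}}
    tables: dict = field(default_factory=dict)
    # Stand-in for the preset data change table.
    change_table: list = field(default_factory=list)
    _next_id: count = field(default_factory=lambda: count(1))

    def generate(self, source, fields):
        """Generating unit: blank table whose fields mirror the source structure."""
        self.tables[source] = {"fields": list(fields), "rows": {}}

    def copy_log(self, source, log_entries):
        """Copying unit: apply simplified log entries to the structure table."""
        table = self.tables[source]
        for pk, values in log_entries:
            row = table["rows"].setdefault(pk, dict.fromkeys(table["fields"]))
            for name, new in values.items():
                old = row[name]
                if old != new:
                    row[name] = new
                    self._record(source, pk, name, old, new)

    def _record(self, source, pk, name, old, new):
        """Recording unit: append one change record with an auto-increment key."""
        self.change_table.append(
            {"id": next(self._next_id), "source": source, "pk": pk,
             "field": name, "old": old, "new": new}
        )


store = SyncStore()
store.generate("orders_db", ["status", "amount"])
store.copy_log(
    "orders_db",
    [(1, {"status": "new", "amount": 10}), (1, {"status": "paid"})],
)
```

After these calls the change table holds three records with keys 1 to 3, the last of which records `status` changing from `"new"` to `"paid"`.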

In some optional implementations of this embodiment, the recording unit 503 is further configured to record, in the data change table and in real time, the data change flow of the data corresponding to at least one table field in the data structure table corresponding to each source database.

In some optional implementations of this embodiment, the apparatus 500 further includes: an acquiring unit (not shown in the figure), a querying unit (not shown in the figure), and a pushing unit (not shown in the figure). The acquiring unit may be configured to acquire a subscription table including at least one subscription field of the subscriber. The querying unit may be configured to query, based on the subscription table, the data change flow of the data corresponding to all subscription fields in the data change table. The pushing unit may be configured to push the queried data change flow of the data corresponding to all subscription fields to the subscriber.
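A minimal Python sketch of the acquire, query, and push sequence follows. The dictionary layouts, the function `push_subscribed_changes`, and the `push` callback are assumptions made for illustration; the application does not specify a transport or a record format.

```python
def push_subscribed_changes(change_table, subscription, push):
    """Query the change flow for the subscribed fields and push it."""
    wanted = set(subscription["fields"])
    # Keep only change records whose field appears in the subscription table.
    flow = [c for c in change_table if c["field"] in wanted]
    if flow:
        push(subscription["subscriber"], flow)
    return len(flow)


# Hypothetical change records and subscription, for illustration only.
change_table = [
    {"id": 1, "field": "status", "new": "new"},
    {"id": 2, "field": "amount", "new": 10},
    {"id": 3, "field": "status", "new": "paid"},
]
subscription = {"subscriber": "billing", "fields": ["status"]}
delivered = []
pushed = push_subscribed_changes(
    change_table, subscription, lambda who, flow: delivered.append((who, flow))
)
```

Here the subscriber `billing` receives only the two `status` records; the `amount` record is not pushed because that field is not subscribed.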

In some optional implementations of this embodiment, the subscription table includes an index identifier, and the data change table includes an auto-increment primary key; the querying unit includes: a query module (not shown) and a replacement module (not shown). The query module may be configured to, in response to determining that the last auto-increment primary key in the data change table is different from the index identifier recorded in the subscription table, query the data change flow of the data corresponding to all subscription fields between the last auto-increment primary key in the data change table and the index identifier recorded in the subscription table. The replacement module may be configured to replace the index identifier recorded in the subscription table with the last auto-increment primary key.

In some optional implementations of this embodiment, the apparatus 500 further includes: a positioning unit (not shown in the figure) and a removing unit (not shown in the figure). The positioning unit may be configured to locate invalid data in the data change flow of the data corresponding to all subscription fields between the last auto-increment primary key and the index identifier recorded in the subscription table. The removing unit may be configured to remove the invalid data.

In some optional implementations of the present embodiment, the positioning unit includes: a sorting module (not shown), a detecting module (not shown), and a determining module (not shown). The sorting module may be configured to arrange all the auto-increment primary keys between the last auto-increment primary key and the index identifier recorded in the subscription table in descending order, so as to obtain the data change flow segments of the data corresponding to all subscription fields. The detecting module may be configured to detect whether identical values exist in a data change flow segment. The determining module may be configured to determine, in response to determining that identical values exist in the data change flow segment, that all data between the two identical values is invalid data.

In the data synchronization processing apparatus provided by this embodiment of the present application, the generating unit 501 first determines the data storage structure of at least one source database and generates a data structure table corresponding to each storage structure; next, for each source database, the copying unit 502 copies, in response to receiving a database log of the source database, the data of the database log into the data structure table corresponding to the source database; finally, the recording unit 503 records the data change flow of each data structure table in the preset data change table based on changes of the data in the data structure table corresponding to each source database. In this way, data from a plurality of different databases can be stored through incremental synchronization, and the data change flow of the data changed during incremental synchronization is recorded automatically, which improves the flexibility of data change recording, saves data storage space, and improves data storage efficiency.

According to embodiments of the present application, the present application also provides an electronic device, a readable storage medium and a computer program product.

Fig. 6 shows a schematic block diagram of an example electronic device 600 that may be used to implement an embodiment of the application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.

As shown in fig. 6, the device 600 includes a computing unit 601 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 may also be stored. The computing unit 601, ROM 602, and RAM 603 are connected to each other by a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Various components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, mouse, etc.; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 601 performs the respective methods and processes described above, such as a data synchronization processing method. For example, in some embodiments, the data synchronization processing method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When a computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the data synchronization processing method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the data synchronization processing method in any other suitable way (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SoCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out methods of the present application may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of the present application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, so long as the desired result of the technical solution of the present disclosure is achieved, and the present disclosure is not limited herein.

The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.

Claims

1. A method of data synchronization processing, the method comprising:

Determining data storage structures of at least one source database, and generating a data structure table corresponding to each storage structure, wherein the data structure table is a blank table with table fields identical to the storage structures;

For each source database, responding to receiving the database log of the source database, and copying the data of the database log into a data structure table corresponding to the source database;

Recording, based on changes of the data in the data structure table corresponding to each source database, the data change flow of each data structure table in a preset data change table, which comprises: recording, in real time, in the data change table, the data change flow of the data corresponding to at least one table field in the data structure table corresponding to each source database.

2. The method of claim 1, the method further comprising:

Acquiring a subscription table comprising at least one subscription field of a subscriber;

Inquiring data change flow of data corresponding to all subscription fields in the data change table based on the subscription table;

Pushing the data change flow of the data corresponding to all the queried subscription fields to the subscriber.

3. The method of claim 2, wherein the subscription table includes an index identifier, and the data change table includes an auto-increment primary key;

The querying, based on the subscription table, of the data change flow of data corresponding to all subscription fields in the data change table includes:

In response to determining that the last auto-increment primary key in the data change table is different from the index identifier recorded in the subscription table, querying the data change flow of data corresponding to all subscription fields between the last auto-increment primary key in the data change table and the index identifier recorded in the subscription table;

Replacing the index identifier recorded in the subscription table with the last auto-increment primary key.

4. A method according to claim 3, the method further comprising:

Locating invalid data in the data change flow of data corresponding to all subscription fields between the last auto-increment primary key and the index identifier recorded in the subscription table, and removing the invalid data.

5. The method of claim 4, wherein locating invalid data in the data change flow of data corresponding to all subscription fields between the last auto-increment primary key and the index identifier recorded in the subscription table comprises:

Arranging all the auto-increment primary keys between the last auto-increment primary key and the index identifier recorded in the subscription table in descending order to obtain data change flow segments of data corresponding to all subscription fields;

Detecting whether identical values exist in the data change flow segments;

In response to determining that identical values exist in the data change flow segments, determining that all data between the two identical values is invalid data.

6. A data synchronization processing apparatus, the apparatus comprising:

A generating unit configured to determine data storage structures of at least one source database, generate data structure tables corresponding to the respective storage structures, the data structure tables being blank tables having the same table fields as the storage structures;

A copying unit configured to copy, for each source database, data of a database log of the source database into a data structure table corresponding to the source database in response to receiving the database log;

A recording unit configured to record the data change flow of each data structure table in a preset data change table based on changes of the data in the data structure table corresponding to each source database; the recording unit is further configured to record, in real time, in the data change table, the data change flow of the data corresponding to at least one table field in the data structure table corresponding to each source database.

7. The apparatus of claim 6, the apparatus further comprising:

An acquisition unit configured to acquire a subscription table including at least one subscription field of a subscriber;

A querying unit configured to query, based on the subscription table, the data change flow of the data corresponding to all the subscription fields in the data change table;

A pushing unit configured to push the data change flow of the data corresponding to all the queried subscription fields to the subscriber.

8. The apparatus of claim 7, wherein the subscription table comprises an index identifier, and the data change table comprises an auto-increment primary key; the querying unit includes:

A query module configured to, in response to determining that the last auto-increment primary key in the data change table is different from the index identifier recorded in the subscription table, query the data change flow of data corresponding to all subscription fields between the last auto-increment primary key in the data change table and the index identifier recorded in the subscription table;

A replacement module configured to replace the index identifier recorded in the subscription table with the last auto-increment primary key.

9. The apparatus of claim 8, the apparatus further comprising:

A positioning unit configured to locate invalid data in the data change flow of data corresponding to all subscription fields between the last auto-increment primary key and the index identifier recorded in the subscription table;

A removing unit configured to remove the invalid data.

10. The apparatus of claim 9, wherein the positioning unit comprises:

A sorting module configured to arrange all the auto-increment primary keys between the last auto-increment primary key and the index identifier recorded in the subscription table in descending order to obtain data change flow segments of data corresponding to all subscription fields;

A detecting module configured to detect whether identical values exist in the data change flow segments;

A determining module configured to determine, in response to determining that identical values exist in the data change flow segments, that all data between the two identical values is invalid data.

11. An electronic device, comprising:

at least one processor; and

A memory communicatively coupled to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-5.

13. A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-5.