CN109739929B - Data synchronization method, device and system - Google Patents

Data synchronization method, device and system

Publication number
CN109739929B
Authority
CN
China
Prior art keywords
data
message
partition
synchronized
data change
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811552328.9A
Other languages
Chinese (zh)
Other versions
CN109739929A (en
Inventor
张娜
鹿慧
何栋
黎锦康
尚玲瑞
李子旺
于灏
欧创新
刘震
杨猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peoples Insurance Company of China
Original Assignee
Peoples Insurance Company of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peoples Insurance Company of China
Priority to CN201811552328.9A
Publication of CN109739929A
Application granted
Publication of CN109739929B
Legal status: Active
Anticipated expiration

Abstract

The application discloses a data synchronization method comprising the following steps: acquiring data to be synchronized in a source database, and generating a first data change message for the data to be synchronized; determining a message topic of the first data change message according to the database table to which the data to be synchronized belongs in the source database; pushing the first data change message, according to the message topic, into a first message queue that corresponds to the message topic and comprises only one partition; when a trigger condition is met, distributing the first data change message in the first message queue to a first partition of a second message queue according to a preset partition strategy; wherein the second message queue is used by a data synchronization unit to acquire the data change message and to synchronize the data to be synchronized to a target database according to the data change message. The method addresses the problem of how to synchronize changed data of the source database to the target database more accurately and quickly.

Description

Data synchronization method, device and system
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data synchronization method, an apparatus, and a data synchronization system.
Background
For industries with very large data volumes and diverse business requirements, a data processing system built on a microservice architecture can adapt to rapidly growing business requirements and user numbers. An existing data processing system built on a microservice architecture comprises a main system and a plurality of subsystems. The main system processes basic data based on a main system database that stores the basic data, and the main system database serves as the source database. Each subsystem provides services externally based on its own independent database, and each subsystem database serves as a target database. After the basic data of the source database changes, it must be synchronized to each target database in time to maintain data consistency. How to synchronize the changed data of the source database to the target databases more quickly and accurately, so that the data of each subsystem becomes consistent in time, is a problem to be solved.
Disclosure of Invention
The application provides a data synchronization method that aims to solve the problem of how to synchronize changed data of a source database to a target database quickly and accurately.
The application provides a data synchronization method, which comprises the following steps:
acquiring data to be synchronized in a source database, and generating a first data change message for the data to be synchronized;
determining a message topic of the first data change message according to the database table to which the data to be synchronized belongs in the source database;
pushing the first data change message into a first message queue corresponding to the message topic according to the message topic; wherein the first message queue is a single-partition message queue comprising only one partition;
when the triggering condition is met, distributing the first data change message in the first message queue to a first partition of a second message queue according to a preset partition strategy; the second message queue is a multi-partition message queue comprising a plurality of partitions, each of the plurality of partitions for storing data change messages; the first partition is a partition in which a first data change message is stored in the plurality of partitions; and the second message queue is used for the data synchronization unit to acquire the data change message and synchronize the data to be synchronized to the target database according to the data change message.
Optionally, the obtaining data to be synchronized in the source database, and generating a first data change message for the data to be synchronized includes:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
Optionally, the generating a first data change message by using the data to be synchronized includes:
and formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data.
Optionally, the obtaining data to be synchronized in the source database, and generating a first data change message for the data to be synchronized includes:
determining a message primary key for generating the first data change message;
and generating the first data change message according to the message primary key and the data to be synchronized.
Optionally, when the trigger condition is met, distributing the first data change message in the first message queue to the first partition of the second message queue according to a preset partition policy, where the method includes:
when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition.
Optionally, the determining, according to the message primary key and the preset partition policy, the partition identifier corresponding to the first data change message includes:
obtaining a hash value of the message primary key;
acquiring the number of partitions of the second message queue;
taking the hash value of the message primary key modulo the number of partitions to obtain a modulo result, and taking the modulo result as the preset partition strategy;
and obtaining the partition identification according to the preset partition strategy.
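The hash-and-modulo steps above can be illustrated with a minimal Python sketch. The function name `partition_for` is hypothetical, and CRC32 is only an assumed stand-in, since the application does not specify which hash function is used.

```python
import zlib


def partition_for(message_key: str, num_partitions: int) -> int:
    """Return a partition identifier for a message primary key."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is an assumed stand-in for the unspecified hash function.
    # It is stable across processes, unlike Python's built-in hash()
    # for strings, which is randomized per interpreter run.
    key_hash = zlib.crc32(message_key.encode("utf-8"))
    # The modulo result is used directly as the partition identifier.
    return key_hash % num_partitions
```

Because the same primary key always yields the same partition identifier, all changes to one record would land in one partition and keep their relative order there.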
Optionally, when the trigger condition is met, distributing the first data change message in the first message queue to the first partition of the second message queue according to a preset partition policy, where the method includes:
when a trigger condition is met, acquiring the service type information or the service area information of the data to be synchronized corresponding to the first data change message;
taking the corresponding relation between the service type information or the service area information and at least one partition of a second message queue as the preset partition strategy;
assigning a partition identifier corresponding to the first data change message according to the preset partition strategy;
and sending the first data change message to a partition indicated by the partition identifier in a second message queue according to the partition identifier.
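A strategy based on service type or service area amounts to a lookup table from that attribute to a partition. The sketch below is one possible reading in Python; the table contents, the `service_type` field name and the function name `assign_partition` are all hypothetical and do not come from the application.

```python
from typing import Mapping

# Hypothetical correspondence between service type and partition identifier.
SERVICE_TYPE_TO_PARTITION = {"auto": 0, "property": 1, "life": 2}


def assign_partition(message: Mapping[str, object],
                     strategy: Mapping[str, int]) -> int:
    """Assign a partition identifier from the message's service type."""
    # Assumed field name: the message carries its service type explicitly.
    service_type = str(message["service_type"])
    if service_type not in strategy:
        raise ValueError(f"no partition configured for {service_type!r}")
    return strategy[service_type]
```

The same shape would work for service area information by keying the table on a region code instead.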
Optionally, the method further includes:
starting a server for providing data change messages aiming at the second message queue; the server is used for providing the data change message to the data synchronization unit.
The application also provides a data synchronization method, which comprises the following steps:
the data synchronization unit acquires indication information for indicating that a first data change message is acquired in a first partition of a second message queue; the data synchronization unit is configured to synchronize data to be synchronized in a source database to a target database, the second message queue includes a plurality of partitions, each of the plurality of partitions is configured to store a data change message, the first partition is a partition that stores the first data change message and corresponds to the data synchronization unit, and the first data change message is a data change message for the data to be synchronized;
the data synchronization unit acquires the first data change message from the first partition;
the data synchronization unit acquires the data to be synchronized according to the first data change message;
and the data synchronization unit synchronizes the data to be synchronized to the target database.
Optionally, the method further includes:
the data synchronization unit acquires the data change message from a second message queue by using at least one thread;
the data synchronization unit acquires the first data change message from the first partition, and includes:
the data synchronization unit obtains the first data change message from the first partition using a first thread of the at least one thread.
Optionally, the obtaining, by the data synchronization unit, the data change message from the second message queue by using at least one thread includes:
the data synchronization unit starts a corresponding thread for each partition in the plurality of partitions of the second message queue;
and acquiring the data change message from the corresponding partition by using the started thread.
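The one-thread-per-partition arrangement can be sketched with the Python standard library alone. Here each partition is modeled as an in-memory `queue.Queue`, which is an assumption made for illustration; a real deployment would read from the message queue server instead. The names `make_partition`, `consume_partition` and `run_consumers` are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of a partition in this sketch


def make_partition(messages):
    """Build an in-memory stand-in for one partition of the second queue."""
    partition = queue.Queue()
    for message in messages:
        partition.put(message)
    partition.put(STOP)
    return partition


def consume_partition(partition_id, partition, handled):
    """Thread body: read one partition in order until the sentinel."""
    while True:
        message = partition.get()
        if message is STOP:
            break
        # Each thread appends only to its own list, so no lock is needed.
        handled[partition_id].append(message)


def run_consumers(partitions):
    """Start one thread per partition and wait for all of them."""
    handled = {pid: [] for pid in range(len(partitions))}
    threads = [
        threading.Thread(target=consume_partition, args=(pid, part, handled))
        for pid, part in enumerate(partitions)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled
```

Messages in different partitions are handled in parallel, while messages within one partition are still handled in their original order by a single thread.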
Optionally, the method further includes:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring, from the server, the data change messages in the partition corresponding to the thread by using the client parameters and the server information.
Optionally, the obtaining, by the data synchronization unit, the data change message from the second message queue by using at least one thread includes:
determining a message topic processed by the at least one thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
obtaining the monitored data change message using the at least one thread.
Optionally, the synchronizing the data to be synchronized to the target database by the data synchronizing unit includes:
the data synchronization unit generates a database synchronization instruction for synchronizing data to a first database table in a target database according to the first data change message;
synchronizing the data to be synchronized to the first database table using the database synchronization instruction; wherein the structure of the first database table is different from the structure of the database table to which the data to be synchronized belongs in the source database.
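One way to picture a database synchronization instruction for a target table whose structure differs from the source table is a column mapping that is applied when the SQL statement is built. The Python sketch below uses SQLite's `INSERT OR REPLACE` purely for illustration. The column names, the message layout (`change_type` and `data` fields) and the function name `build_sync_instruction` are assumptions, and the sketch further assumes that each message carries the full row.

```python
import sqlite3

# Hypothetical mapping from source-table columns to target-table columns.
COLUMN_MAP = {"POLICY_NO": "policy_id", "HOLDER_NAME": "customer_name"}


def build_sync_instruction(message, target_table, column_map, key_column):
    """Return (sql, params) that applies one data change message."""
    row = message["data"]
    # Table and column names come from trusted configuration, not from
    # message content; only the values are bound as parameters.
    if message["change_type"] == "DELETE":
        sql = f"DELETE FROM {target_table} WHERE {column_map[key_column]} = ?"
        return sql, (row[key_column],)
    sources = [src for src in column_map if src in row]
    targets = ", ".join(column_map[src] for src in sources)
    marks = ", ".join("?" for _ in sources)
    # INSERT OR REPLACE is SQLite syntax; it covers both inserts and updates
    # when the target table has a primary key on the mapped key column.
    sql = f"INSERT OR REPLACE INTO {target_table} ({targets}) VALUES ({marks})"
    return sql, tuple(row[src] for src in sources)
```

A caller would pass the returned pair to `connection.execute(sql, params)` on the target database connection.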
The present application further provides a database synchronization system, comprising: the system comprises a source database, a message queue flow module, a data synchronization unit and a target database;
the source database is used for providing data to be synchronized which needs to be synchronized to the target database, and the data to be synchronized comprises basic data used for insurance business;
the message queue flow module is used for acquiring data to be synchronized in a source database and generating a first data change message for the data to be synchronized; determining a message topic of the first data change message according to the database table to which the data to be synchronized belongs in the source database; pushing the first data change message into a first message queue corresponding to the message topic according to the message topic, wherein the first message queue is a single-partition message queue comprising only one partition; when a trigger condition is met, according to a preset partition strategy, distributing the first data change message in the first message queue to a first partition of a second message queue, wherein the second message queue is a multi-partition message queue comprising a plurality of partitions, each partition in the plurality of partitions is used for storing data change messages, the first partition is a partition which stores the first data change message and corresponds to the data synchronization unit, and the second message queue is used by the data synchronization unit to acquire the data change message and to synchronize the data to be synchronized to the target database according to the data change message;
the data synchronization unit is configured to start, for each partition of a plurality of partitions of a second message queue, a thread corresponding to each partition, acquire the data change message from the corresponding partition using the started thread, and synchronize data to be synchronized in a source database to a target database, where the second message queue includes the plurality of partitions, and each partition of the plurality of partitions is used to store the data change message; acquiring indication information for indicating that a first data change message is acquired in a first partition of a second message queue, and acquiring the first data change message from the first partition by using a first thread in started threads, wherein the first data change message is a data change message for data to be synchronized; acquiring the data to be synchronized according to the first data change message, and synchronizing the data to be synchronized to the target database;
and the target database is used for receiving the data to be synchronized from the data synchronization unit.
Optionally, the message queue flow module includes a change data capture sub-module, where the change data capture sub-module is configured to:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
Optionally, the change data capturing sub-module is further configured to:
acquiring the data to be synchronized according to a first database table in the source database, and generating a first data change message by using the data to be synchronized;
the data synchronization unit is further configured to:
generating a database synchronization instruction aiming at a second database table in the target database according to the first data change message;
synchronizing the data to be synchronized to the second database table according to the database synchronization instruction;
wherein the structure of the second database table is different from that of the first database table.
Optionally, the message queue flow module is specifically configured to:
determining a message primary key for generating the first data change message;
and generating the first data change message according to the message primary key and the data to be synchronized.
Optionally, the message queue flow module is specifically configured to:
when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition.
Optionally, the message queue flow module is specifically configured to:
obtaining a hash value of the message primary key;
acquiring the number of partitions of the second message queue;
taking the hash value of the message primary key modulo the number of partitions to obtain a modulo result, and taking the modulo result as the preset partition strategy;
and obtaining the partition identification according to the preset partition strategy.
Optionally, the message queue flow module includes a preset partition policy determining unit, where the preset partition policy determining unit is configured to: when a trigger condition is met, acquiring the service type information or the service area information of the data to be synchronized corresponding to the first data change message;
taking the corresponding relation between the service type information or the service area information and at least one partition of a second message queue as the preset partition strategy;
the message queue flow module is specifically configured to: assigning a partition identifier corresponding to the first data change message according to the preset partition strategy;
and sending the first data change message to a partition indicated by the partition identifier in a second message queue according to the partition identifier.
Optionally, the message queue flow module further includes a service terminal module, where the service terminal module is configured to: starting a server for providing data change messages aiming at the second message queue; the server is used for providing the data change message to a data synchronization unit;
the data synchronization unit is further configured to:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring, from the server, the data change messages in the partition corresponding to the thread by using the client parameters and the server information.
Optionally, the data synchronization unit is further configured to:
determining a message topic processed by the started thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
and acquiring the monitored data change message by using the started thread.
Optionally, the system further includes: a marketing platform and a telephone traffic service subsystem;
the marketing platform is used for processing the basic data of the source database and changing the basic data; calling an application programming interface of the telephone traffic service subsystem to process the telephone traffic data of the target database;
the telephone traffic service subsystem is used for providing an application programming interface and responding to the calling of the marketing platform aiming at the application programming interface to process the telephone traffic data of the target database;
the source database also stores basic data containing the data to be synchronized, receives a request of the marketing platform for changing the basic data, and provides the data to be synchronized aiming at the request of changing the basic data;
and the target database also stores the traffic data, and stores the received data to be synchronized as incremental traffic data.
Optionally, the source database includes: a primary database and a backup database;
the main database is used for receiving a request of the marketing platform for changing the basic data and synchronizing the changed data generated by the request for changing the basic data to the backup database;
the backup database is used for receiving the changed data synchronized by the main database and providing the data to be synchronized for the changed data to the message queue flow module.
The present application further provides a data synchronization apparatus, including:
a data change message generating unit, configured to acquire data to be synchronized in a source database and generate a first data change message for the data to be synchronized;
a message topic determining unit, configured to determine a message topic of the first data change message according to the database table to which the data to be synchronized belongs in the source database;
a first message queue unit, configured to push the first data change message into a first message queue corresponding to the message topic according to the message topic; wherein the first message queue is a single-partition message queue comprising only one partition;
the partition unit is used for distributing the first data change message in the first message queue to a first partition of a second message queue according to a preset partition strategy when the trigger condition is met; the second message queue is a multi-partition message queue comprising a plurality of partitions, each of the plurality of partitions for storing data change messages; the first partition is a partition in which a first data change message is stored in the plurality of partitions; and the second message queue is used for the data synchronization unit to acquire the data change message and synchronize the data to be synchronized to the target database according to the data change message.
The present application further provides a data synchronization apparatus applied to a data synchronization unit, the data synchronization unit includes:
the indication information acquisition subunit is used for acquiring indication information used for indicating to acquire the first data change message in the first partition of the second message queue; the data synchronization unit is configured to synchronize data to be synchronized in a source database to a target database, the second message queue includes a plurality of partitions, each of the plurality of partitions is configured to store a data change message, the first partition is a partition that stores the first data change message and corresponds to the data synchronization unit, and the first data change message is a data change message for the data to be synchronized;
a first data change message acquiring subunit, configured to acquire the first data change message from the first partition;
a data to be synchronized acquiring subunit, configured to acquire the data to be synchronized according to the first data change message;
and the target database synchronization subunit is used for synchronizing the data to be synchronized to the target database.
Compared with the prior art, the method has the following advantages:
In the method, a first data change message is generated for data to be synchronized in a source database, and a message topic of the first data change message is determined according to the database table to which the data to be synchronized belongs in the source database. According to the message topic, the first data change message is pushed into a first message queue comprising only one partition and is then distributed to a first partition of a second message queue comprising a plurality of partitions. The second message queue is used by the data synchronization unit to acquire data change messages and to synchronize the data to be synchronized to a target database according to the data change messages. Data synchronization can thus be performed in real time through a message queue mechanism. Because the message topic is determined according to the database table, the data change messages corresponding to the data to be synchronized of each database table are assigned to the same message topic, which ensures the ordering of the data to be synchronized. Through the plurality of partitions of the second message queue, the data change messages carrying the data to be synchronized can be processed in parallel, which improves processing efficiency and allows data to be synchronized and updated in time. This solves the problem of synchronizing the changed data of the source database to the target database quickly and accurately.
Drawings
FIG. 1 is a process flow diagram of a data synchronization method according to a first embodiment of the present application;
FIG. 2 is a block diagram of republishing data change messages from a single-partition message queue to a multi-partition message queue, as included in the first embodiment of the present application;
FIG. 3 is a process flow diagram of a data synchronization method according to a second embodiment of the present application;
FIG. 4 is a diagram of a data synchronization system according to a third embodiment of the present application;
FIG. 5 is a schematic diagram of a system for synchronizing data of a source database to a target database of a branch office according to a third embodiment of the present application;
FIG. 6 is a diagram of a data synchronization apparatus according to a fourth embodiment of the present application;
FIG. 7 is a schematic diagram of a data synchronization apparatus according to a fifth embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
The application provides a data synchronization method, a data synchronization device and a data synchronization system. Each is described in detail in the following embodiments.
A first embodiment of the present application provides a data synchronization method.
For ease of understanding, a practical application scenario of the data synchronization method is briefly described first. In practical applications, a message queue flow module and a data synchronization unit, which is also called a sorting module, may be deployed on a computing device between a source database and a target database. The computing device may be a server node of a cluster. The message queue flow module pushes data to be synchronized, in the form of data change messages, into a single-partition message queue that has only one partition, and then places the data change messages from the single-partition message queue into a plurality of partitions according to a preset partition strategy, so that each of the plurality of partitions stores data change messages. The message queue flow module may be implemented based on a message queue mechanism. In the embodiments of the present application, a message queue (Message Queue) mechanism is a distributed message-driven system that transmits data between computing devices in units of messages, for example a Kafka message queue. The data synchronization unit then acquires the data change messages from the plurality of partitions of the message queue, obtains the data to be synchronized from them, and synchronizes the data to the target database. The data synchronization method provided by the first embodiment may be deployed in the message queue flow module and is used to synchronize changed data of a source database to a target database quickly and accurately.
A data synchronization method provided in a first embodiment of the present application is described below with reference to fig. 1 and fig. 2.
The data synchronization method shown in fig. 1 includes: step S101 to step S104.
Step S101, acquiring data to be synchronized in a source database, and generating a first data change message for the data to be synchronized.
The data to be synchronized is the changed data content that must be synchronized to the target database after the data in the source database changes. For example, the source database is an Oracle database A storing basic data for insurance business. A user can operate on the basic data through a marketing platform, which changes the basic data, and the changed data content is the data to be synchronized.
The data change message is a message carrying data content that can be processed based on a message queue mechanism, and may be a byte array that can store any object. For example, it stores the content of the data to be synchronized and the type of change that produced the data to be synchronized. The first data change message is a data change message containing the data to be synchronized.
In the embodiment of the application, the data change message is published based on a message queue mechanism, and the data synchronization unit acquires the data change message from a plurality of partitions of the message queue for processing and synchronizes the data to be synchronized to the target database. Specifically, the message queue mechanism employs a Kafka message queue. The Kafka message queue can serve as a message buffer pool for data change messages, buffering them when the message processing rates on the source database side and the target database side differ greatly. In addition, the partition feature of the Kafka message queue can be used to publish data change messages to different partitions, so that the data synchronization unit can process messages in parallel with a plurality of consumers, which improves data processing efficiency.
In the embodiment of the present application, a first data change message for the data to be synchronized is generated according to the data to be synchronized by the following processing:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
Continuing the example of Oracle database A: CDC (Change Data Capture) may be connected to Oracle database A to obtain the data to be synchronized. CDC here refers to an incremental log analysis tool that can connect to an Oracle database and capture change data in real time.
Preferably, generating the first data change message by using the data to be synchronized further includes: formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data. The preset data format may be String, JSON, or the like. For example, a data change message is generated from the data to be synchronized in JSON format, and the generated data change message includes the content and the change type of the data to be synchronized.
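As an illustration of the JSON option, the following Python sketch serializes a changed row together with its change type into a byte array. The field names (`table`, `change_type`, `data`), the set of change types and the function name `build_change_message` are assumptions for this sketch; the application does not prescribe a message schema.

```python
import json

# Assumed set of change types; the application only says a change type is carried.
CHANGE_TYPES = {"INSERT", "UPDATE", "DELETE"}


def build_change_message(table: str, change_type: str, row: dict) -> bytes:
    """Format data to be synchronized as a JSON data change message."""
    if change_type not in CHANGE_TYPES:
        raise ValueError(f"unknown change type: {change_type!r}")
    payload = {"table": table, "change_type": change_type, "data": row}
    # Messages travel as byte arrays, so the JSON text is UTF-8 encoded.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True).encode("utf-8")
```

A consumer would reverse the step with `json.loads(message.decode("utf-8"))` to recover the row and its change type.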
Preferably, acquiring the data to be synchronized in the source database and generating a first data change message for the data to be synchronized includes: determining a message primary key for generating the first data change message; and generating the first data change message according to the message primary key and the data to be synchronized. A message primary key is bound to each data change message so that, in subsequent processing steps, the message can be sent to a specified location, which facilitates partition processing.
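How the message primary key is derived is not specified. One plausible choice, sketched below in Python, is to build it from the table name and the row's key columns so that every change to the same record carries the same key. The key format and the names `message_primary_key` and `build_keyed_message` are hypothetical.

```python
def message_primary_key(table: str, row: dict, key_columns: list) -> str:
    """Derive a message primary key from the table name and key column values."""
    parts = [str(row[column]) for column in key_columns]
    return f"{table}:" + "|".join(parts)


def build_keyed_message(table: str, row: dict, key_columns: list) -> tuple:
    """Bind the message primary key to the data to be synchronized."""
    return message_primary_key(table, row, key_columns), row
```

For example, `message_primary_key("T_POLICY", {"POLICY_NO": "P1"}, ["POLICY_NO"])` returns `"T_POLICY:P1"`.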
Step S102, determining a message topic of the first data change message according to a database table to which the data to be synchronized belongs in the source database.
The message topic (Topic) is the category to which each message posted to the message queue belongs. The message topic makes it easy to determine which messages are to be consumed.
Taking a data change message based on a Kafka message queue as an example, the Kafka message queue involves a message producer and a message consumer. The message producer is the party that issues data change messages to the Kafka message queue cluster, for example, a module that generates a first data change message and pushes it into a first message queue. The message consumer is the party that obtains data change messages from the Kafka message queue cluster for processing, such as a terminal or a service on which the data synchronization unit is deployed. Data change messages belonging to the same message topic may actually be stored on one or more cluster servers. When a message consumer subscribes to a message topic, it only needs to specify the message topic of the data change messages to receive all data change messages of that topic across nodes, without needing to know where the data is actually stored.
In the embodiment of the present application, preferably, one database table corresponds to one message topic. Correspondingly, if the messages of one message topic need to be processed by a plurality of consumption groups, the messages of that topic are copied according to the number of consumption groups and provided to each consumption group, so that every consumption group obtains the same complete set of data, which realizes message broadcasting. Alternatively, a plurality of database tables may correspond to one message topic, and the ordering of the messages within the partitions can still be maintained.
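The table-to-topic correspondence can be sketched as a small lookup. The naming scheme (`sync.<schema>.<table>`), the function name `topic_for_table`, and the `shared_topics` override used to map several tables onto one topic are hypothetical; the application only states that a topic is determined from the database table.

```python
def topic_for_table(schema, table, shared_topics=None, prefix="sync"):
    """Map a source database table to a message topic name.

    By default each table gets its own topic. `shared_topics` (hypothetical)
    lets several tables be mapped onto one common topic instead.
    """
    shared_topics = shared_topics or {}
    if (schema, table) in shared_topics:
        return shared_topics[(schema, table)]
    return f"{prefix}.{schema}.{table}".lower()


# One table, one topic.
customer_topic = topic_for_table("CRM", "CUSTOMER")

# Two tables mapped to the same topic.
shared = {("CRM", "ORDER_HEAD"): "sync.crm.orders",
          ("CRM", "ORDER_LINE"): "sync.crm.orders"}
head_topic = topic_for_table("CRM", "ORDER_HEAD", shared)
line_topic = topic_for_table("CRM", "ORDER_LINE", shared)
```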
Step S103, pushing the first data change message into a first message queue corresponding to the message topic according to the message topic; wherein the first message queue is a single-partition message queue comprising only one partition.
A partition is the minimum unit for maintaining message data in a message queue mechanism, and the data change messages within the same partition are ordered. For example, the order of the data change messages in a partition is maintained by assigning each message a unique offset. Different partitions may physically reside on different cluster servers. Each message topic may contain one or more partitions. A single-partition message queue is a message queue that includes only one partition, and the message queue serves as a container for storing messages.
In this step, the data change message is pushed into the corresponding single-partition message queue according to its message topic. In practical application, because the service system of the source database operates on the source database many times, the same record in the source database may be changed several times in succession, and each change generates a data change message containing the change type and the changed content of the data to be synchronized. Because operations on the source database depend on their order, one thread produces the data change messages of a given message topic and pushes them into the single-partition message queue, so that their order is strictly the same as the order of the operations. When the data synchronization unit processes the data change messages in the order in which they were generated, the data to be synchronized can be correctly synchronized to the target database.
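The ordering guarantee of a single partition can be pictured with a minimal in-memory stand-in. This is only a sketch: the class `SinglePartitionQueue` and its methods are invented for illustration and do not correspond to any Kafka API; the offset is simply the position of a message in an append-only list.

```python
class SinglePartitionQueue:
    """In-memory stand-in for a topic that has exactly one partition."""

    def __init__(self, topic):
        self.topic = topic
        self._log = []  # append-only list; the index is the offset

    def append(self, message):
        """Append a message and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    def read_from(self, offset):
        """Return all messages at or after `offset`, in append order."""
        return self._log[offset:]


queue = SinglePartitionQueue("sync.crm.customer")
# A single producer appends changes in the order the operations occurred.
offsets = [queue.append(op) for op in ("INSERT id=1", "UPDATE id=1", "DELETE id=1")]
```

Because only one producer thread appends and there is only one log, reading by increasing offset replays the operations in their original order.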
Step S104, when the triggering condition is met, distributing the first data change message in the first message queue to a first partition of a second message queue according to a preset partition strategy; the second message queue is a multi-partition message queue comprising a plurality of partitions, each of the plurality of partitions for storing data change messages; the first partition is a partition in which a first data change message is stored in the plurality of partitions; and the second message queue is used for the data synchronization unit to acquire the data change message and synchronize the data to be synchronized to the target database according to the data change message.
The multi-partition message queue is a message queue comprising a plurality of partitions.
In this step, the data change messages in the single partition of the first message queue are issued to the plurality of partitions of the second message queue. During this issuing from the single partition to the multiple partitions, each data change message is assigned, according to a preset partition strategy, to one of the partitions of the second message queue. Specifically, the embodiment of the present application includes the following processing: first, the data change messages for the data to be synchronized at the source database end are pushed to a first message queue that has only one partition; the first message queue is then continuously monitored, and whenever a data change message is detected in the first message queue, it is pulled and pushed to the corresponding partition of the second message queue according to the preset partition strategy.
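The pull-and-redistribute processing described in this step might look roughly like the sketch below. It uses plain Python containers rather than a real message queue; the function name `drain_to_partitions`, the `(key, value)` message shape, and the toy partition policy are assumptions for illustration.

```python
from collections import deque


def drain_to_partitions(first_queue, partitions, choose_partition):
    """Move every pending message from the single-partition queue to the
    partition selected by `choose_partition`, oldest message first."""
    moved = 0
    while first_queue:  # trigger condition: a message is pending
        key, value = first_queue.popleft()
        index = choose_partition(key, len(partitions))
        partitions[index].append((key, value))
        moved += 1
    return moved


first_queue = deque([("a", "INSERT"), ("b", "INSERT"), ("a", "UPDATE")])
partitions = [[], []]
# Toy policy for the demonstration only: route by the key's first character.
moved = drain_to_partitions(first_queue, partitions, lambda k, n: ord(k[0]) % n)
```

Since messages are taken oldest first and each key always maps to the same partition, the relative order of messages with the same key is unchanged after redistribution.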
Referring to fig. 2, fig. 2 is a schematic diagram illustrating the issuing of messages from the single partition of the first message queue to the multiple partitions of the second message queue, where 201 is the single partition of the first message queue, which maintains the data change messages in the order in which they were generated, that order depending on the operation sequence; 202 denotes issuing the data change messages of the single partition to the multiple partitions; and 203 is the multiple partitions of the second message queue, the number of partitions being n. Each partition in the second message queue preserves the order that its data change messages had in the first message queue.
In an implementation manner of the embodiment of the present application, the preset partition policy is determined according to the message primary key. Specifically, when the trigger condition is met, distributing the first data change message in the first message queue to the first partition of the second message queue according to a preset partition policy includes:
when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition.
The first partition stores the first data change message and is used for the data synchronization unit to acquire the first data change message and synchronize corresponding data to be synchronized to a target database according to the first data change message.
Preferably, the partition identifier corresponding to the first data change message is determined by the following processing:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
taking the hash value of the message primary key modulo the number of partitions to obtain a modulo result, and taking the modulo result as the preset partition strategy;
and obtaining the partition identification according to the preset partition strategy.
For example, the preset partition policy is determined according to the following formula:
partition value = hash(key) % partitionNum;
where key is the message primary key;
partitionNum is the number of partitions of the multi-partition of the second message queue.
The partition value may serve as the partition identifier of the specified partition. With this preset partition strategy, data change messages that have the same message primary key are pushed to the same partition among the plurality of partitions of the second message queue, so the order that the data change messages had in the single partition is preserved when they are issued from the single partition to the multiple partitions.
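The formula above can be tried out with a short sketch. The application does not specify the hash function; CRC32 is used here purely as an assumption because it gives the same result in every process, whereas Python's built-in `hash()` on strings varies between runs. The function name `partition_for` and the sample keys are illustrative.

```python
import zlib


def partition_for(key, partition_num):
    """Compute partition value = hash(key) % partitionNum.

    Assumption: CRC32 stands in for the unspecified hash function because
    it is stable across processes.
    """
    if partition_num <= 0:
        raise ValueError("partition_num must be positive")
    return zlib.crc32(key.encode("utf-8")) % partition_num


keys = ["CUSTOMER:1", "CUSTOMER:2", "CUSTOMER:1"]
assigned = [partition_for(k, 4) for k in keys]
# The first and third messages share a key, so they land in the same partition.
```

Note that the mapping depends on `partition_num`: if the number of partitions of the second message queue changes, the same key may map to a different partition.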
In another implementation manner of the embodiment of the present application, a preset partition policy is specified for each data change message. Preferably, the partition of the second message queue to which the data change message is issued is designated according to the service type information or the service area information to which the data change message belongs. Specifically, when the trigger condition is met, distributing the first data change message in the first message queue to the first partition of the second message queue according to a preset partition policy includes:
when a trigger condition is met, acquiring the service type information or the service area information of the data to be synchronized corresponding to the first data change message;
taking the corresponding relation between the service type information or the service area information and at least one partition of a second message queue as the preset partition strategy;
assigning a partition identifier corresponding to the first data change message according to the preset partition strategy;
and sending the first data change message to a partition indicated by the partition identifier in a second message queue according to the partition identifier.
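A correspondence-based policy of this kind can be sketched as a lookup table. The region names, the partition numbers, and the fallback partition for unmapped regions are all hypothetical; the application only requires that a correspondence between service type or service area information and at least one partition exists.

```python
# Hypothetical correspondence between service area and partition.
REGION_TO_PARTITION = {"north": 0, "south": 1, "east": 2}
DEFAULT_PARTITION = 3  # hypothetical fallback for unmapped regions


def partition_for_region(region):
    """Return the partition identifier assigned to a service area."""
    return REGION_TO_PARTITION.get(region, DEFAULT_PARTITION)


messages = [("north", "INSERT policy 1"), ("south", "INSERT policy 2"),
            ("north", "UPDATE policy 1")]
routed = [(partition_for_region(region), body) for region, body in messages]
```

With such a table, all messages of one service area go to one partition, so their relative order is kept within that partition.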
When data change messages are issued from the single partition of the first message queue to the plurality of partitions of the second message queue according to the preset partition strategy, the data change messages of each message topic are distributed across the partitions in such a way that all data change messages related to one record enter the same partition of the second message queue. The partitions can then be processed in parallel: one thread acquires and processes the data change messages of each partition and synchronizes the corresponding data to be synchronized to the target database according to those messages, which improves data synchronization efficiency.
In the embodiment of the present application, in practical implementation, the method further includes:
starting a server for providing data change messages aiming at the second message queue; the server is used for providing the data change message to the data synchronization unit.
Based on the foregoing embodiments, a second embodiment of the present application provides a data synchronization method. The data synchronization method provided by the second embodiment may be deployed in a data synchronization unit. And the data synchronization unit is used as a processor of the data change message, acquires the data change message from a plurality of partitions of the message queue, further acquires the data to be synchronized, and synchronizes the data to the target database.
A data synchronization method provided in a second embodiment of the present application is described below with reference to fig. 3.
The data synchronization method shown in fig. 3 includes: step S301 to step S304.
Step S301, the data synchronization unit acquires indication information for indicating that a first data change message is acquired in a first partition of a second message queue; the data synchronization unit is configured to synchronize data to be synchronized in a source database to a target database, the second message queue includes a plurality of partitions, each of the plurality of partitions is configured to store a data change message, the first partition is a partition that stores the first data change message and corresponds to the data synchronization unit, and the first data change message is a data change message for the data to be synchronized.
In this step, indication information for indicating that the first data change message is to be acquired from the first partition of the second message queue is acquired. For example, the indication information is acquired according to a service processing trigger.
In this embodiment of the present application, the data synchronization unit uses at least one thread to obtain data change messages from the second message queue. After the indication information is acquired, the first data change message is acquired from the first partition by the following processing: the data synchronization unit obtains the first data change message from the first partition using a first thread of the at least one thread. In practical implementation, the number of threads of the data synchronization unit is not less than the number of partitions of the second message queue, so that the data is processed in parallel by multiple threads and efficiency is improved.
Preferably, the data synchronization unit acquires the data change message from the second message queue using at least one thread, and includes: the data synchronization unit starts a corresponding thread for each partition in the plurality of partitions of the second message queue; and acquiring the data change message from the corresponding partition by using the started thread.
For example, if the message queue flow module providing the second message queue is a computing module implemented based on the Kafka message queue mechanism, each thread started by the data synchronization unit may be a consumer thread of a consumer module that integrates the Kafka message queue, and consumes the data change messages of one of the partitions of the second message queue. The consumer module is a client that reads data change messages from a broker of the Kafka cluster. In practical implementations, the data synchronization unit may include a plurality of consumption groups, and each consumption group may include a plurality of consumer threads. Each consumer thread in a consumption group processes the data change messages of one partition, and each data change message is processed by only one consumer thread within the same consumption group. The data change messages in each partition of the second message queue are ordered, that order preserves the sequence of operations performed at the source database end on the same record, and the data change messages related to the same record are issued to the same partition of the second message queue. Therefore, when a consumer thread of the data synchronization unit processes the data change messages in its corresponding partition one by one, the original order of the data change messages is not disturbed, the data to be synchronized can be correctly synchronized to the target database, and data consistency is maintained. Because multiple consumer threads process the data change messages of multiple partitions respectively, the data change messages are processed in parallel and data synchronization efficiency is improved.
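The one-thread-per-partition arrangement can be illustrated with standard-library threads. This is a simplified sketch, not a Kafka consumer: partitions are plain lists, the function name `consume_partition` is invented, and "applying" a message just appends it to a shared list under a lock.

```python
import threading


def consume_partition(partition, applied, lock):
    """Process the messages of one partition strictly in order."""
    for message in partition:
        with lock:  # protect the shared result list
            applied.append(message)


# All changes to one record sit in one partition, in operation order.
partitions = [
    [("rec-1", "INSERT"), ("rec-1", "UPDATE")],
    [("rec-2", "INSERT"), ("rec-2", "DELETE")],
]
applied = []
lock = threading.Lock()
threads = [
    threading.Thread(target=consume_partition, args=(p, applied, lock))
    for p in partitions
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The threads may interleave, but the order per record is preserved.
rec1_ops = [op for key, op in applied if key == "rec-1"]
rec2_ops = [op for key, op in applied if key == "rec-2"]
```

The global interleaving of `applied` is not deterministic, which is acceptable here because only the order of changes to the same record matters for consistency.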
It should be noted that, in the message queue flow module providing the second message queue, the data change messages of the same message topic are copied according to the number of consumption groups of the data synchronization unit, and each consumption group obtains the same complete set of data change messages, so that the data change messages are broadcast and horizontal scaling is facilitated.
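The broadcast behaviour across consumption groups can be modelled in a few lines. As a simplifying assumption, the sketch keeps one shared log and an independent read position per group instead of physically copying messages; the class `Topic` and its methods are invented for illustration.

```python
class Topic:
    """Toy topic in which every consumption group sees every message."""

    def __init__(self):
        self.log = []
        self.group_offsets = {}  # group name -> next offset to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, group):
        """Return the messages this group has not yet seen and advance
        the group's offset."""
        start = self.group_offsets.get(group, 0)
        self.group_offsets[group] = len(self.log)
        return self.log[start:]


topic = Topic()
topic.publish("m1")
topic.publish("m2")
group_a = topic.poll("group-a")
group_b = topic.poll("group-b")
again_a = topic.poll("group-a")  # nothing new for group-a
```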
Preferably, the number of consumers is greater than the number of partitions of the second message queue, ensuring that data change messages for each partition will be processed, providing high availability.
Step S302, the data synchronization unit acquires the first data change message from the first partition.
The data synchronization unit obtains and processes a first data change message from a first partition of a plurality of partitions of a second message queue of the message queue flow module.
In an implementation manner of the embodiment of the present application, the data synchronization unit, acting as a client, pulls (pull mode) data change messages from the second message queue, which includes the following processing:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring the data change messages in the partition corresponding to the thread from the server by using the client parameters and the server information.
For example, the first thread of the data synchronization unit, which corresponds to the first partition, acts as a client and pulls and processes data change messages from the first partition one by one.
In another implementation manner of the embodiment of the present application, the data synchronization unit monitors the second message queue in real time to obtain data change messages, which includes the following processing:
determining a message topic processed by the at least one thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
obtaining the monitored data change message using the at least one thread.
Step S303, the data synchronization unit obtains the data to be synchronized according to the first data change message.
In this step, the data to be synchronized included in the first data change message is obtained.
In the embodiment of the present application, specifically, the following processing is included: and acquiring the data to be synchronized from the first data change message according to a preset data format. For example, if the format adopted when the first data change message is generated is the JSON format, the content and the change type of the data to be synchronized are analyzed according to the JSON format.
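Parsing a JSON-formatted data change message might look like the following sketch. The field names `change_type` and `content` and the function name `parse_change_message` are assumptions; they must match whatever format was used when the message was generated.

```python
import json


def parse_change_message(raw):
    """Extract (change_type, content) from a JSON data change message."""
    message = json.loads(raw)
    return message["change_type"], message["content"]


change_type, content = parse_change_message(
    '{"change_type": "INSERT", "content": {"ID": 7, "NAME": "Li"}}'
)
```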
Step S304, the data synchronization unit synchronizes the data to be synchronized to the target database.
In this embodiment, the source database and the target database may be heterogeneous databases having different database management systems. Specifically, the source database and the target database are heterogeneous in any one of the following ways: they run on different computer architectures, have different underlying operating systems, or have different DBMSs (Database Management Systems). The data synchronization method is applicable to both of the following scenarios: the target database table in the target database, to which the data to be synchronized is synchronized, may have a table structure different from that of the source database table to which the data belongs in the source database; alternatively, the target database table and the source database table may have the same table structure.
The data synchronization unit synchronizes the data to be synchronized to the target database, including:
the data synchronization unit generates a database synchronization instruction for synchronizing data to a first database table in a target database according to the first data change message;
synchronizing the data to be synchronized to the first database table using the database synchronization instruction; wherein the structure of the first database table is different from the structure of the database table to which the data to be synchronized belongs in the source database. In this way, synchronization between heterogeneous databases is achieved.
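When the target table structure differs from the source table structure, one possible approach is a column mapping applied before the synchronization instruction is generated. The mapping below, including all column names and the rule of dropping unmapped columns, is hypothetical and only illustrates the idea.

```python
# Hypothetical mapping from source column names to target column names.
COLUMN_MAP = {"CUST_ID": "customer_id", "CUST_NAME": "full_name"}


def to_target_row(source_row, column_map):
    """Rename source columns to target columns; unmapped columns are dropped."""
    return {column_map[c]: v for c, v in source_row.items() if c in column_map}


source_row = {"CUST_ID": 7, "CUST_NAME": "Li", "LEGACY_FLAG": "Y"}
target_row = to_target_row(source_row, COLUMN_MAP)
```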
Based on the foregoing embodiments, a third embodiment of the present application provides a data synchronization system.
Referring to FIG. 4, a schematic diagram of the data synchronization system is shown. Since the third embodiment is based on the above embodiments, it is described only briefly; for the relevant details, refer to the corresponding descriptions of the above embodiments.
The data synchronization system shown in fig. 4 includes: the system comprises a source database, a message queue flow module, a data synchronization unit and a target database;
the source database 401 is configured to provide data to be synchronized, which needs to be synchronized to the target database, where the data to be synchronized includes basic data for insurance services.
The source database may be a database, such as an Oracle database, that stores the underlying data.
In an implementation manner of the embodiment of the present application, the data synchronization system further includes: a marketing platform and a telephone traffic service subsystem; wherein:
the marketing platform is used for processing the basic data of the source database and changing the basic data; calling an application programming interface of the telephone traffic service subsystem to process the telephone traffic data of the target database;
the telephone traffic service subsystem is used for providing an application programming interface and responding to the calling of the marketing platform aiming at the application programming interface to process the telephone traffic data of the target database;
correspondingly, the source database also stores basic data containing the data to be synchronized, receives a request of the marketing platform for changing the basic data, and provides the data to be synchronized aiming at the request of changing the basic data; and the target database also stores the traffic data, and stores the received data to be synchronized as incremental traffic data.
For example, when the marketing platform operates on the source database, the basic data in the source database is changed, and the changed data content and the changed type can be used as the data to be synchronized.
Preferably, the source database includes: a primary database and a backup database; the primary database is used for processing the basic data in cooperation with the marketing platform, receiving the changes made to the basic data by the marketing platform, and synchronizing the changed data to the backup database; the backup database is used for receiving the changed data synchronized by the primary database and providing, for the changed basic data, the data to be synchronized to the message queue flow module. The backup database is also referred to as the backup library. Using the backup database to provide the data to be synchronized to the message queue flow module reduces the performance impact of data synchronization on the primary database.
The target database 402 is configured to receive the data to be synchronized by the data synchronization unit.
The target database may be a database that is heterogeneous to the source database. For example, the source database is an Oracle database, and the target database may be a Gauss database (Gauss DB). The data synchronization system can realize data synchronization between two heterogeneous databases.
The message queue flow module 403 is configured to obtain data to be synchronized in a source database, and generate a first data change message for the data to be synchronized; determining a message subject of the first data change message according to a database table to which the data to be synchronized belongs in the source database; pushing the first data change message into a first message queue corresponding to the message topic according to the message topic, wherein the first message queue is a single-partition message queue comprising only one partition; when a trigger condition is met, according to a preset partition strategy, a first data change message in the first message queue is distributed to a first partition of a second message queue, the second message queue is a multi-partition message queue comprising a plurality of partitions, each partition in the plurality of partitions is used for storing the data change message, the first partition is a partition which stores the first data change message and corresponds to the data synchronization unit, the second message queue is used for a data synchronization unit to acquire the data change message, and data to be synchronized is synchronized to the target database according to the data change message.
In an implementation manner of the embodiment of the present application, the message queue flow module includes a change data capture submodule, where the change data capture submodule is configured to generate a first data change message for data to be synchronized at a source database, and specifically includes the following processing:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
In practical applications, the source database and the target database may be heterogeneous databases having different database management systems. Specifically, the source database and the target database are heterogeneous in any one of the following ways: they run on different computer architectures, have different underlying operating systems, or have different DBMSs (Database Management Systems). For example, the source database is an Oracle database and the target database is a Gauss database (Gauss DB). Preferably, the change data capture submodule is further configured to: acquire the data to be synchronized according to a first database table in the source database, and generate a first data change message by using the data to be synchronized; correspondingly, the data synchronization unit is further configured to perform:
generating a database synchronization instruction aiming at a second database table in the target database according to the first data change message;
synchronizing the data to be synchronized to the second database table according to the database synchronization instruction;
wherein the second database table structure is different from the first database table.
In an implementation manner provided in the embodiment of the present application, the message queue flow module determines a preset partition policy according to the message primary key of the data change message, and selects, according to the preset partition policy, the specific partition of the message queue to which the data change message is to be issued. Therefore, the message queue flow module is further specifically configured to: determine a message primary key for generating the first data change message; and generate the first data change message according to the message primary key and the data to be synchronized. Further, when a trigger condition is met, the message queue flow module acquires the message primary key of the first data change message; determines a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy; and sends the first data change message to the partition indicated by the partition identifier in the second message queue; the partition indicated by the partition identifier is the first partition.
Wherein the partition identification is determined specifically by the following processing:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
taking the hash value of the message primary key modulo the number of partitions to obtain a modulo result, and taking the modulo result as the preset partition strategy;
and obtaining the partition identification according to the preset partition strategy.
In another specific implementation manner of the embodiment of the present application, the specific partition issued to the second message queue is determined according to the service type or the service area to which the data change message belongs. Specifically, the message queue flow module includes a preset partition policy determining unit, where the preset partition policy determining unit is configured to: when a trigger condition is met, acquiring the service type information or the service area information of the data to be synchronized corresponding to the first data change message;
taking the corresponding relation between the service type information or the service area information and at least one partition of a second message queue as the preset partition strategy;
the message queue flow module is specifically configured to: assigning a partition identifier corresponding to the first data change message according to the preset partition strategy;
and sending the first data change message to a partition indicated by the partition identifier in a second message queue according to the partition identifier.
In another implementation manner of the embodiment of the present application, the message queue flow module further includes a service terminal module, where the service terminal module is configured to: starting a server for providing data change messages aiming at the second message queue; the server is used for providing the data change message to the data synchronization unit.
The data synchronization unit 404 is configured to, for each partition in a plurality of partitions of a second message queue, start a thread corresponding to each partition, acquire the data change message from the corresponding partition using the started thread, and synchronize data to be synchronized in a source database to a target database, where the second message queue includes the plurality of partitions, and each partition in the plurality of partitions is used to store the data change message; acquiring indication information for indicating that a first data change message is acquired in a first partition of a second message queue, and acquiring the first data change message from the first partition by using a first thread in started threads, wherein the first data change message is a data change message for data to be synchronized; and acquiring the data to be synchronized according to the first data change message, and synchronizing the data to be synchronized to the target database.
In this embodiment, the data synchronization unit is also referred to as a score clearing module, and processes, through multiple threads, the data change messages in the multiple partitions of the second message queue of the message queue flow module. In one embodiment, the data synchronization unit includes a multi-thread consumption submodule, and the multi-thread consumption submodule is configured to perform:
starting a corresponding thread for each of a plurality of partitions of a second message queue;
acquiring the data change message from the corresponding partition by using the started thread; and acquiring the first data change message from the first partition by using a first thread in the started threads. In practical implementation, the number of threads of the data synchronization unit is not less than the number of partitions of the second message queue, so that the data is processed in parallel by multiple threads and efficiency is improved.
In an implementation manner of the embodiment of the present application, the message queue flow module, as described above, further includes a service terminal module, and correspondingly, the data synchronization unit is further configured to:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring the data change messages in the partition corresponding to the thread from the server by using the client parameters and the server information.
In another implementation manner of the embodiment of the present application, the data synchronization unit monitors the second message queue, and acquires and processes the monitored data change messages, which specifically includes:
determining a message topic processed by the started thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
and acquiring the monitored data change message by using the started thread.
Preferably, the data synchronization unit acquires the content and the change type of the data to be synchronized according to the data change message, generates an SQL statement, connects to the target database through JDBC, executes the SQL statement, and synchronizes the data to be synchronized to the target database.
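The preferred approach above (derive an SQL statement from the content and change type of a data change message, then execute it against the target database) can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the application: an in-memory `sqlite3` database stands in for the JDBC connection to the target database, and the message fields `table`, `type`, `key` and `data`, as well as the function name `build_statement`, are hypothetical.

```python
import json
import sqlite3


def build_statement(message):
    """Turn one data change message into a parameterized SQL statement.

    Table and column names are interpolated into the SQL text, so they are
    assumed to come from a trusted configuration; values are bound as parameters.
    """
    table, key, data = message["table"], message["key"], message.get("data", {})
    where = " AND ".join(f"{column} = ?" for column in key)
    if message["type"] == "INSERT":
        row = {**key, **data}
        columns = ", ".join(row)
        marks = ", ".join("?" for _ in row)
        return f"INSERT INTO {table} ({columns}) VALUES ({marks})", list(row.values())
    if message["type"] == "UPDATE":
        sets = ", ".join(f"{column} = ?" for column in data)
        return (
            f"UPDATE {table} SET {sets} WHERE {where}",
            list(data.values()) + list(key.values()),
        )
    if message["type"] == "DELETE":
        return f"DELETE FROM {table} WHERE {where}", list(key.values())
    raise ValueError(f"unknown change type: {message['type']}")


# Demo: apply an insert followed by an update to a stand-in target database.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
raw_messages = [
    '{"table": "customer", "type": "INSERT", "key": {"id": 1}, "data": {"name": "Li"}}',
    '{"table": "customer", "type": "UPDATE", "key": {"id": 1}, "data": {"name": "Wang"}}',
]
for raw in raw_messages:
    sql, params = build_statement(json.loads(raw))
    target.execute(sql, params)
target.commit()
rows = target.execute("SELECT id, name FROM customer").fetchall()
```

Binding values as parameters rather than concatenating them into the statement keeps the generated SQL independent of the data content.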
Examples are as follows. Fig. 5 shows a system for synchronizing data from a source database of a certain group company to a target database of a branch company. In the figure, a marketing platform 501 is connected with a source database 502 in which basic data are stored, and processes the basic data; the source database 502 is an Oracle database, which includes a main database and a backup database;
the telephone traffic service subsystem 503 is connected with a target database 504 in which telephone traffic data is stored; the telephone traffic service provided by the telephone traffic service subsystem 503 depends on the basic data, so after the marketing platform 501 operates on the source database and changes the basic data, the change of the basic data needs to be synchronized to the target database 504 and stored as incremental telephone traffic data; the target database 504 is a Gauss database;
a backup library of a source database 502 is connected to a message queue flow module 505, the message queue flow module 505 acquires data to be synchronized of the backup library, generates a data change message, first pushes the data change message into a first message queue 505-1 including only one partition, and issues the data change message in a single partition to a second message queue 505-2 including a plurality of partitions when a trigger condition is met;
the score clearing module 506 comprises a plurality of consumer threads, each consumer thread processing data change messages of a partition of the second message queue 505-2 and synchronizing corresponding data to be synchronized into the target database 504.
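The two-stage flow in this example (a single-partition first queue that is drained into a multi-partition second queue once a trigger condition holds) can be sketched as follows. The sketch makes several assumptions that are not taken from the application: the queues are plain in-memory containers, the trigger condition is "at least three messages are waiting", the partition is chosen by CRC32 of the message key modulo the partition count, and the name `dispatch` is hypothetical.

```python
import zlib
from collections import deque


def dispatch(first_queue, partitions, trigger):
    """Move messages from the single-partition queue to the partitions once trigger() holds.

    Messages are taken in arrival order, so messages that share a key keep
    their relative order inside the partition they are routed to.
    """
    moved = 0
    if not trigger(first_queue):
        return moved
    while first_queue:
        key, body = first_queue.popleft()
        index = zlib.crc32(key.encode("utf-8")) % len(partitions)
        partitions[index].append((key, body))
        moved += 1
    return moved


# Demo: three messages, two of which concern the same record.
first_queue = deque(
    [("customer:1", "insert"), ("customer:2", "insert"), ("customer:1", "update")]
)
partitions = [[] for _ in range(4)]
moved = dispatch(first_queue, partitions, trigger=lambda q: len(q) >= 3)
```

Routing by key means that the insert and the later update of the same record land in the same partition and are therefore consumed by the same thread, in order.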
Corresponding to the data synchronization method provided in the first embodiment of the present application, a fourth embodiment of the present application also provides a data synchronization apparatus.
Referring to fig. 6, a schematic diagram of the data synchronization apparatus is shown. Since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant parts, reference is made to the corresponding description of the method embodiment.
A fourth embodiment provides a data synchronization apparatus, including:
a data change message generating unit 601, configured to acquire data to be synchronized in a source database, and generate a first data change message for the data to be synchronized;
a message topic determining unit 602, configured to determine a message topic of the first data change message according to a database table to which the data to be synchronized belongs in the source database;
a first message queue unit 603, configured to push the first data change message into a first message queue corresponding to the message topic according to the message topic; wherein the first message queue is a single partition message queue comprising only one partition;
a partitioning unit 604, configured to, when a trigger condition is met, distribute, according to a preset partitioning policy, a first data change message in the first message queue to a first partition of a second message queue; the second message queue is a multi-partition message queue comprising a plurality of partitions, each of the plurality of partitions for storing data change messages; the first partition is a partition in which a first data change message is stored in the plurality of partitions; and the second message queue is used for the data synchronization unit to acquire the data change message and synchronize the data to be synchronized to the target database according to the data change message.
The data change message generating unit 601 is specifically configured to:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
The data change message generating unit 601 is specifically configured to: and formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data.
The data change message generating unit 601 is specifically configured to: determining a message primary key for generating the first data change message; and generating the first data change message according to the message primary key and the data to be synchronized.
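A data change message built from a message primary key and formatted data might look like the following sketch. JSON is used as the preset data format, as specified in the claims of this application; the key layout `table:primary_key`, the field names `table`, `type` and `data`, and the function name `build_change_message` are assumptions made for illustration only.

```python
import json


def build_change_message(table, change_type, primary_key, row):
    """Return (message_key, body) for one changed record.

    The key identifies the record, and the JSON body carries the content
    and the change type of the data to be synchronized.
    """
    message_key = f"{table}:{primary_key}"
    body = json.dumps(
        {"table": table, "type": change_type, "data": row},
        sort_keys=True,
        ensure_ascii=False,
    )
    return message_key, body


# Demo: an update to one row of a hypothetical "customer" table.
key, body = build_change_message("customer", "UPDATE", 1, {"id": 1, "name": "Wang"})
```

Deriving the key from the record's identity means that all changes to the same record share one key, which a key-based partition strategy can then use to keep them together.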
The partition unit 604 is specifically configured to:
when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition.
Wherein the partition unit 604 comprises a partition identification determination subunit configured to:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
taking the hash value of the message primary key modulo the number of partitions to obtain a modulo result, and taking the modulo result as the preset partition strategy;
and obtaining the partition identification according to the preset partition strategy.
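The hash-and-modulo steps above can be sketched as follows. The application does not name a hash function, so CRC32 from the Python standard library is used here purely as an assumption (it is stable across runs, unlike Python's built-in `hash` for strings); the function name `partition_for` is hypothetical.

```python
import zlib


def partition_for(message_key: str, partition_count: int) -> int:
    """Return the partition identifier for a message primary key.

    The identifier is hash(key) mod partition_count, so it always lies in
    the range 0 .. partition_count - 1 and is the same for equal keys.
    """
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    return zlib.crc32(message_key.encode("utf-8")) % partition_count


# Demo: equal keys always map to the same partition of a 4-partition queue.
first = partition_for("customer:1", 4)
second = partition_for("customer:1", 4)
```

Note that the mapping depends on the partition count: if the number of partitions of the second message queue changes, the same key may be routed to a different partition.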
The partition unit 604 is specifically configured to:
when a trigger condition is met, acquiring the service type information or the service area information of the data to be synchronized corresponding to the first data change message;
taking the corresponding relation between the service type information or the service area information and at least one partition of a second message queue as the preset partition strategy;
assigning a partition identifier corresponding to the first data change message according to the preset partition strategy;
and sending the first data change message to a partition indicated by the partition identifier in a second message queue according to the partition identifier.
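The alternative strategy above, in which the partition is assigned from the service type or service area of the data, can be sketched as a simple lookup. The attribute name `service_area`, the area values and the partition numbers below are invented for illustration and do not come from the application.

```python
# Hypothetical correspondence between service areas and partitions of the
# second message queue; several areas may share one partition.
AREA_TO_PARTITION = {"beijing": 0, "shanghai": 1, "guangdong": 2, "shenzhen": 2}


def assign_partition(message, attribute, mapping):
    """Look up the partition identifier configured for the message's attribute value."""
    value = message[attribute]
    if value not in mapping:
        raise KeyError(f"no partition configured for {attribute}={value!r}")
    return mapping[value]


# Demo: a change message tagged with its service area.
partition_id = assign_partition(
    {"service_area": "shanghai", "type": "UPDATE"}, "service_area", AREA_TO_PARTITION
)
```

The same function works for service type information by passing a different attribute name and mapping.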
Wherein the apparatus further comprises a server-side unit configured to: start, for the second message queue, a server for providing data change messages; the server is used for providing the data change message to the data synchronization unit.
Corresponding to the data synchronization method provided in the second embodiment of the present application, a fifth embodiment of the present application further provides a data synchronization apparatus.
Referring to fig. 7, a schematic diagram of the data synchronization apparatus is shown. Since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant parts, reference is made to the corresponding description of the method embodiment.
A fifth embodiment provides a data synchronization apparatus applied to a data synchronization unit, including:
an indication information obtaining subunit 701, configured to obtain indication information used for indicating that a first data change message is obtained in a first partition of a second message queue; the data synchronization unit is configured to synchronize data to be synchronized in a source database to a target database, the second message queue includes a plurality of partitions, each of the plurality of partitions is configured to store a data change message, the first partition is a partition that stores the first data change message and corresponds to the data synchronization unit, and the first data change message is a data change message for the data to be synchronized;
a first data change message acquiring subunit 702, configured to acquire the first data change message from the first partition;
a to-be-synchronized data obtaining subunit 703, configured to obtain the to-be-synchronized data according to the first data change message;
a target database synchronization subunit 704, configured to synchronize the data to be synchronized to the target database.
Wherein the apparatus further comprises a multithreading subunit to: obtaining the data change message from a second message queue using at least one thread;
correspondingly, the first data change message acquiring subunit 702 is further configured to: obtaining, using a first thread of the at least one thread, the first data change message from the first partition.
Wherein the multithreading subunit is specifically configured to: starting a corresponding thread for each of a plurality of partitions of a second message queue; and acquiring the data change message from the corresponding partition by using the started thread.
Wherein the apparatus further comprises a client unit for:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring the data change message in the partition corresponding to the thread from the server by using the client parameters and the server information.
Wherein the multithreading subunit is specifically configured to:
determining a message topic processed by the at least one thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
obtaining the monitored data change message using the at least one thread.
The target database synchronization subunit 704 is specifically configured to:
generating a database synchronization instruction for synchronizing data to a first database table of a target database according to the first data change message;
synchronizing the data to be synchronized to the first database table using the database synchronization instruction; wherein the structure of the first database table is different from the structure of the database table to which the data to be synchronized belongs in the source database.
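When the target table has a different structure from the source table, the synchronization instruction has to be generated from remapped columns. The following sketch shows one way this could be done, assuming a per-table column mapping is configured in advance; the column names and the function name `to_target_row` are hypothetical.

```python
# Hypothetical mapping from source-table columns to target-table columns.
# Source columns that are not listed are not synchronized.
CUSTOMER_COLUMN_MAP = {"cust_id": "id", "cust_name": "name"}


def to_target_row(source_row, column_map):
    """Rename the mapped source columns to their target names and drop the rest."""
    return {
        target: source_row[source]
        for source, target in column_map.items()
        if source in source_row
    }


# Demo: the source row carries a column that the target table does not have.
target_row = to_target_row(
    {"cust_id": 1, "cust_name": "Wang", "internal_flag": "Y"}, CUSTOMER_COLUMN_MAP
)
```

The resulting dictionary can then be used to build the SQL statement for the first database table of the target database.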
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Although the present application has been described with reference to the preferred embodiments, it is not intended to limit the present application, and those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application, therefore, the scope of the present application should be determined by the claims that follow.

Claims (21)

1. A method of data synchronization, comprising:
acquiring data to be synchronized in a source database, and generating a first data change message for the data to be synchronized;
determining a message topic of the first data change message according to a database table to which the data to be synchronized belongs in the source database;
pushing the first data change message into a first message queue corresponding to the message topic according to the message topic; wherein the first message queue is a single partition message queue comprising only one partition;
when the triggering condition is met, distributing the first data change message in the first message queue to a first partition of a second message queue according to a preset partition strategy; the second message queue is a multi-partition message queue comprising a plurality of partitions, each of the plurality of partitions for storing data change messages; the first partition is a partition in which a first data change message is stored in the plurality of partitions; the second message queue is used for the data synchronization unit to acquire the data change message and synchronize the data to be synchronized to the target database according to the data change message;
the acquiring data to be synchronized in a source database, and generating a first data change message for the data to be synchronized, includes:
determining a message primary key for generating the first data change message;
generating the first data change message according to the message primary key and the data to be synchronized;
when the triggering condition is met, distributing the first data change message in the first message queue to the first partition of the second message queue according to a preset partition strategy, including:
when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message main key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition;
the determining, according to the message primary key and the preset partition policy, a partition identifier corresponding to the first data change message includes:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
utilizing the hash value of the message main key to perform modulo on the partition number to obtain a modulo result, and taking the modulo result as the preset partition strategy;
obtaining the partition identification according to the preset partition strategy;
the data change message refers to a message carrying data content and capable of being processed based on a message queue mechanism;
the generating a first data change message for the data to be synchronized further comprises: formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data;
the preset data format comprises: and generating a data change message by the data to be synchronized in a JSON format, wherein the generated data change message comprises the content and the change type of the data to be synchronized.
2. The method according to claim 1, wherein the obtaining data to be synchronized in a source database, and generating a first data change message for the data to be synchronized comprises:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
3. The method of claim 2, wherein generating a first data change message using the data to be synchronized comprises:
and formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data.
4. The method of claim 1, further comprising:
starting a server for providing data change messages aiming at the second message queue; the server is used for providing the data change message to the data synchronization unit.
5. A method of data synchronization, comprising:
the data synchronization unit acquires indication information for indicating that a first data change message is acquired in a first partition of a second message queue; the data synchronization unit is configured to synchronize data to be synchronized in a source database to a target database, the second message queue includes a plurality of partitions, each of the plurality of partitions is configured to store a data change message, the first partition is a partition that stores the first data change message and corresponds to the data synchronization unit, and the first data change message is a data change message for the data to be synchronized;
the data synchronization unit acquires the first data change message from the first partition;
the data synchronization unit acquires the data to be synchronized according to the first data change message;
the data synchronization unit synchronizes the data to be synchronized to the target database;
the first data change message in the first partition is generated according to a message main key and the data to be synchronized and is distributed from a first message queue to a first partition of a second message queue according to a preset partition strategy;
the distributing from the first message queue to the first partition of the second message queue according to the preset partition policy comprises: when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message main key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition;
the determining, according to the message primary key and the preset partition policy, a partition identifier corresponding to the first data change message includes:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
utilizing the hash value of the message main key to perform modulo on the partition number to obtain a modulo result, and taking the modulo result as the preset partition strategy;
obtaining the partition identification according to the preset partition strategy;
the data change message refers to a message carrying data content and capable of being processed based on a message queue mechanism;
the method for generating the first data change message further includes: formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data;
the preset data format comprises: and generating a data change message by the data to be synchronized in a JSON format, wherein the generated data change message comprises the content and the change type of the data to be synchronized.
6. The method of claim 5, further comprising:
the data synchronization unit acquires the data change message from a second message queue by using at least one thread;
the data synchronization unit acquires the first data change message from the first partition, and includes:
the data synchronization unit obtains the first data change message from the first partition using a first thread of the at least one thread.
7. The method of claim 6, wherein the data synchronization unit obtains the data change message from a second message queue using at least one thread, comprising:
the data synchronization unit starts a corresponding thread for each partition in the plurality of partitions of the second message queue;
and acquiring the data change message from the corresponding partition by using the started thread.
8. The method of claim 7, further comprising:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring data change information in the partition corresponding to the thread from the server by using the client parameter and the server information.
9. The method of claim 6, wherein the data synchronization unit obtains the data change message from a second message queue using at least one thread, comprising:
determining a message topic processed by the at least one thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
obtaining the monitored data change message using the at least one thread.
10. The method of claim 5, wherein the data synchronization unit synchronizes the data to be synchronized to the target database, comprising:
the data synchronization unit generates a database synchronization instruction for synchronizing data to a first database table in a target database according to the first data change message;
synchronizing the data to be synchronized to the first database table using the database synchronization instruction; and the structure of the first database table is different from the structure of the database table to which the synchronous data belongs in the source database.
11. A database synchronization system, comprising: the system comprises a source database, a message queue flow module, a data synchronization unit and a target database;
the source database is used for providing data to be synchronized which needs to be synchronized to the target database, and the data to be synchronized comprises basic data used for insurance business;
the message queue flow module is used for acquiring data to be synchronized in a source database and generating a first data change message for the data to be synchronized; determining a message topic of the first data change message according to a database table to which the data to be synchronized belongs in the source database; pushing the first data change message into a first message queue corresponding to the message topic according to the message topic, wherein the first message queue is a single-partition message queue comprising only one partition; when a trigger condition is met, according to a preset partition strategy, distributing a first data change message in the first message queue to a first partition of a second message queue, wherein the second message queue is a multi-partition message queue comprising a plurality of partitions, each partition in the plurality of partitions is used for storing the data change message, the first partition is a partition which stores the first data change message and corresponds to the data synchronization unit, the second message queue is used for a data synchronization unit to acquire the data change message, and data to be synchronized is synchronized to the target database according to the data change message;
the data synchronization unit is configured to start, for each partition of a plurality of partitions of a second message queue, a thread corresponding to each partition, acquire the data change message from the corresponding partition using the started thread, and synchronize data to be synchronized in a source database to a target database, where the second message queue includes the plurality of partitions, and each partition of the plurality of partitions is used to store the data change message; acquiring indication information for indicating that a first data change message is acquired in a first partition of a second message queue, and acquiring the first data change message from the first partition by using a first thread in started threads, wherein the first data change message is a data change message for data to be synchronized; acquiring the data to be synchronized according to the first data change message, and synchronizing the data to be synchronized to the target database;
the message queue flow module is used for determining a message primary key for generating the first data change message;
generating the first data change message according to the message primary key and the data to be synchronized;
when the triggering condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message main key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition;
the message queue flow module is further configured to determine, according to the message primary key and the preset partition policy, a partition identifier corresponding to the first data change message, including:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
utilizing the hash value of the message main key to perform modulo on the partition number to obtain a modulo result, and taking the modulo result as the preset partition strategy;
obtaining the partition identification according to the preset partition strategy;
the data change message refers to a message carrying data content and capable of being processed based on a message queue mechanism;
the generating a first data change message for the data to be synchronized further comprises: formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data;
the preset data format comprises: generating a data change message by the data to be synchronized in a JSON format, wherein the generated data change message comprises the content and the change type of the data to be synchronized;
and the target database is used for receiving the data to be synchronized by the data synchronization unit.
12. The system of claim 11, wherein the message queue flow module comprises a change data capture submodule configured to:
acquiring a log of a source database;
acquiring record change information according to the log;
acquiring the data to be synchronized according to the record change information;
and generating a first data change message by using the data to be synchronized.
13. The system of claim 12, wherein the change data capture sub-module is further configured to:
acquiring the data to be synchronized according to a first database table in the source database, and generating a first data change message by using the data to be synchronized;
the data synchronization unit is further configured to:
generating a database synchronization instruction aiming at a second database table in the target database according to the first data change message;
synchronizing the data to be synchronized to the second database table according to the database synchronization instruction;
wherein the second database table structure is different from the first database table.
14. The system according to claim 11, wherein the message queue flow module comprises a pre-partition policy determination unit, and the pre-partition policy determination unit is configured to: when a trigger condition is met, acquiring the service type information or the service area information of the data to be synchronized corresponding to the first data change message;
taking the corresponding relation between the service type information or the service area information and at least one partition of a second message queue as the preset partition strategy;
the message queue flow module is specifically configured to: assigning a partition identifier corresponding to the first data change message according to the preset partition strategy;
according to the partition identification, the first data change message is sent to the partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition.
15. The system of claim 11, wherein the message queue flow module further comprises a service side module configured to: starting a server for providing data change messages aiming at the second message queue; the server is used for providing the data change message to a data synchronization unit;
the data synchronization unit is further configured to:
taking the thread as a client;
configuring the client parameters and server information corresponding to a second message queue;
and acquiring data change information in the partition corresponding to the thread from the server by using the client parameter and the server information.
16. The system of claim 11, wherein the data synchronization unit is further configured to:
determining a message topic processed by the started thread;
monitoring at least one partition of a plurality of partitions of the second message queue according to the message topic;
and acquiring the monitored data change message by using the started thread.
17. The system of claim 11, further comprising: a marketing platform and a telephone traffic service subsystem;
the marketing platform is used for processing the basic data of the source database and changing the basic data; calling an application programming interface of the telephone traffic service subsystem to process the telephone traffic data of the target database;
the telephone traffic service subsystem is used for providing an application programming interface and responding to the calling of the marketing platform aiming at the application programming interface to process the telephone traffic data of the target database;
the source database also stores basic data containing the data to be synchronized, receives a request of the marketing platform for changing the basic data, and provides the data to be synchronized aiming at the request of changing the basic data;
and the target database also stores the traffic data, and stores the received data to be synchronized as incremental traffic data.
18. The system of claim 17, wherein the source database comprises: a primary database and a backup database;
the main database is used for receiving a request of the marketing platform for changing the basic data and synchronizing the changed data generated by the request for changing the basic data to the backup database;
the backup database is used for receiving the changed data synchronized by the main database and providing the data to be synchronized for the changed data to the message queue flow module.
19. The system of claim 11, wherein the source database and the target database are heterogeneous databases having different database management systems.
20. A data synchronization apparatus, comprising:
a data change message generating unit, configured to acquire data to be synchronized in a source database and to generate a first data change message for the data to be synchronized;
a message topic determining unit, configured to determine a message topic of the first data change message according to a database table to which the data to be synchronized belongs in the source database;
a first message queue unit, configured to push the first data change message into a first message queue corresponding to the message topic according to the message topic; wherein the first message queue is a single partition message queue comprising only one partition;
a partition unit, configured to distribute, when a trigger condition is met, the first data change message in the first message queue to a first partition of a second message queue according to a preset partition strategy; wherein the second message queue is a multi-partition message queue comprising a plurality of partitions, each of the plurality of partitions being used for storing data change messages; the first partition is the partition, among the plurality of partitions, in which the first data change message is stored; and the second message queue is used by the data synchronization unit to acquire the data change message and synchronize the data to be synchronized to the target database according to the data change message;
a second message queue unit, configured to determine a message primary key for generating the first data change message, and to generate the first data change message according to the message primary key and the data to be synchronized;
when the triggering condition is met, distributing the first data change message in the first message queue to the first partition of the second message queue according to a preset partition strategy, including:
when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition;
the second message queue unit is further configured to determine, according to the message primary key and the preset partition policy, a partition identifier corresponding to the first data change message, and includes:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
taking the hash value of the message primary key modulo the partition number to obtain a modulo result, and using the modulo result as the preset partition strategy;
obtaining the partition identification according to the preset partition strategy;
the data change message refers to a message carrying data content and capable of being processed based on a message queue mechanism;
the generating a first data change message for the data to be synchronized further comprises: formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data;
the preset data format comprises: a JSON format in which the data change message is generated from the data to be synchronized, wherein the generated data change message comprises the content and the change type of the data to be synchronized.
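The flow recited in claim 20 (JSON formatting, a single-partition first queue per topic, then hash-modulo distribution into the second queue) might look roughly like the sketch below. Several details are assumptions made for illustration: the claim does not name a hash function, so `zlib.crc32` is swapped in; the topic is taken to be the source table name; and the JSON field names `table`, `key`, `type`, `data`, the class `Pipeline`, and the partition count of 4 are all hypothetical.

```python
import json
import zlib
from collections import deque

NUM_PARTITIONS = 4  # assumed partition count of the second message queue


def build_change_message(table, primary_key, change_type, row):
    """Format the data to be synchronized as a JSON data change message."""
    return json.dumps(
        {"table": table, "key": primary_key, "type": change_type, "data": row},
        sort_keys=True,
    )


def partition_for(primary_key, num_partitions):
    """Partition identifier = hash(message primary key) mod partition count."""
    # crc32 is an assumed choice: it is stable across runs, whereas Python's
    # built-in hash() is salted per process for strings.
    return zlib.crc32(str(primary_key).encode("utf-8")) % num_partitions


class Pipeline:
    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.first_queues = {}  # topic -> single-partition FIFO queue
        self.second_queue = [deque() for _ in range(num_partitions)]

    def push(self, message):
        """Push into the first queue whose topic is the source table name."""
        topic = json.loads(message)["table"]
        self.first_queues.setdefault(topic, deque()).append(message)

    def dispatch(self, topic):
        """Drain one first queue in order into the second queue's partitions."""
        first_queue = self.first_queues.get(topic, deque())
        while first_queue:
            message = first_queue.popleft()
            key = json.loads(message)["key"]
            pid = partition_for(key, len(self.second_queue))
            self.second_queue[pid].append(message)
```

Since the partition identifier depends only on the message primary key, every change to the same record lands in the same partition, and the single-partition first queue keeps those changes in their original order.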
21. A data synchronization apparatus, applied to a data synchronization unit, the data synchronization unit comprising:
an indication information acquiring subunit, configured to acquire indication information indicating that the first data change message is to be acquired from the first partition of the second message queue; wherein the data synchronization unit is configured to synchronize data to be synchronized in a source database to a target database, the second message queue comprises a plurality of partitions, each of the plurality of partitions is configured to store data change messages, the first partition is the partition that stores the first data change message and corresponds to the data synchronization unit, and the first data change message is a data change message for the data to be synchronized;
a first data change message acquiring subunit, configured to acquire the first data change message from the first partition;
a data to be synchronized acquiring subunit, configured to acquire the data to be synchronized according to the first data change message;
a target database synchronization subunit, configured to synchronize the data to be synchronized to the target database;
the first data change message in the first partition is generated according to a message primary key and the data to be synchronized, and is distributed from a first message queue to the first partition of the second message queue according to a preset partition strategy;
the distributing from the first message queue to the first partition of the second message queue according to the preset partition policy comprises: when a trigger condition is met, acquiring a message primary key of the first data change message;
determining a partition identifier corresponding to the first data change message according to the message primary key and the preset partition strategy;
sending the first data change message to a partition indicated by the partition identification in a second message queue; the partition indicated by the partition identification is the first partition;
the determining, according to the message primary key and the preset partition policy, a partition identifier corresponding to the first data change message includes:
obtaining a hash value of the message primary key;
acquiring the partition number of the second message queue;
taking the hash value of the message primary key modulo the partition number to obtain a modulo result, and using the modulo result as the preset partition strategy;
obtaining the partition identification according to the preset partition strategy;
the data change message refers to a message carrying data content and capable of being processed based on a message queue mechanism;
the method for generating the first data change message further includes: formatting the data to be synchronized according to a preset data format, and generating the first data change message by using the formatted data;
the preset data format comprises: a JSON format in which the data change message is generated from the data to be synchronized, wherein the generated data change message comprises the content and the change type of the data to be synchronized.
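On the consuming side, the subunits of claim 21 (acquire the message from the partition, recover the data to be synchronized, write it to the target database) could be approximated as below. This is a minimal sketch, not the patented implementation: the target database is modelled as a nested dictionary, the change types `insert`, `update`, and `delete` are assumed values, and the JSON field names and the functions `apply_change` and `sync_partition` are hypothetical.

```python
import json
from collections import deque


def apply_change(target, message):
    """Apply one JSON data change message to an in-memory target table."""
    change = json.loads(message)
    table = target.setdefault(change["table"], {})
    key = str(change["key"])  # normalise the primary key to a string
    if change["type"] in ("insert", "update"):
        # Merge the changed columns into the existing row, if any.
        table[key] = {**table.get(key, {}), **change["data"]}
    elif change["type"] == "delete":
        table.pop(key, None)
    else:
        raise ValueError(f"unknown change type: {change['type']}")


def sync_partition(partition, target):
    """Drain one partition in order and apply every message to the target."""
    applied = 0
    while partition:
        apply_change(target, partition.popleft())
        applied += 1
    return applied


# Example: a partition holding changes for two records of a "policy" table.
example_partition = deque(
    json.dumps(m)
    for m in (
        {"table": "policy", "key": 1, "type": "insert", "data": {"holder": "A"}},
        {"table": "policy", "key": 1, "type": "update", "data": {"holder": "B"}},
        {"table": "policy", "key": 2, "type": "insert", "data": {"holder": "C"}},
        {"table": "policy", "key": 2, "type": "delete", "data": {}},
    )
)
example_target = {}
sync_partition(example_partition, example_target)
```

Applying the messages strictly in partition order is what makes the final state of each record match the source, which is why changes to one primary key need to share a partition.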

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811552328.9A CN109739929B (en) 2018-12-18 2018-12-18 Data synchronization method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811552328.9A CN109739929B (en) 2018-12-18 2018-12-18 Data synchronization method, device and system

Publications (2)

Publication Number Publication Date
CN109739929A CN109739929A (en) 2019-05-10
CN109739929B CN109739929B (en) 2021-03-16

Family

ID=66360484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811552328.9A Active CN109739929B (en) 2018-12-18 2018-12-18 Data synchronization method, device and system

Country Status (1)

Country Link
CN (1) CN109739929B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110471896B (en) * 2019-06-21 2023-11-28 陕西融华电子科技有限公司 Data processing method, system and server
CN110321387B (en) * 2019-07-10 2022-02-01 中国联合网络通信集团有限公司 Data synchronization method, equipment and terminal equipment
CN110516124B (en) * 2019-08-09 2022-04-22 济南浪潮数据技术有限公司 File analysis method and device and computer readable storage medium
CN110807067B (en) * 2019-09-29 2023-12-22 北京淇瑀信息科技有限公司 Data synchronization method, device and equipment for relational database and data warehouse
CN110852778B (en) * 2019-09-30 2021-03-26 口口相传(北京)网络技术有限公司 Data processing method and device for business object
CN110941623A (en) * 2019-11-12 2020-03-31 北京达佳互联信息技术有限公司 Data synchronization method and device
CN111013133A (en) * 2019-11-29 2020-04-17 北京奇艺世纪科技有限公司 Data processing method and device
CN111026774A (en) * 2019-12-03 2020-04-17 深圳前海环融联易信息科技服务有限公司 Data sequence synchronization method and device, computer equipment and storage medium
CN111209126A (en) * 2020-01-03 2020-05-29 北京明略软件系统有限公司 Data transmission method and device between microservices and electronic equipment
CN111400407B (en) * 2020-04-10 2023-09-26 浙江大华技术股份有限公司 Data synchronization method and device, storage medium and electronic device
CN111752910A (en) * 2020-06-24 2020-10-09 上海微盟企业发展有限公司 Data synchronization method, system and related device for heterogeneous platform
CN111831748A (en) * 2020-06-30 2020-10-27 北京小米松果电子有限公司 Data synchronization method, device and storage medium
CN111813868B (en) * 2020-08-13 2023-11-10 中国工商银行股份有限公司 Data synchronization method and device
CN112333083B (en) * 2020-10-30 2023-04-28 平安付科技服务有限公司 Transaction information processing method, device, computer equipment and computer readable medium
CN112463886A (en) * 2020-11-30 2021-03-09 浙江大华技术股份有限公司 Data processing method and device, electronic equipment and storage medium
CN112612799B (en) * 2020-12-08 2022-10-18 福建天泉教育科技有限公司 Data synchronization method and terminal
CN112612583A (en) * 2020-12-16 2021-04-06 平安消费金融有限公司 Data synchronization method and device, computer equipment and readable storage medium
CN112597247B (en) * 2020-12-25 2022-05-31 杭州数梦工场科技有限公司 Data synchronization method and device
CN112685426A (en) * 2021-01-21 2021-04-20 浪潮云信息技术股份公司 NiFi-based Kafka consumption NewSQL CDC stream data conversion method
CN112925743A (en) * 2021-02-07 2021-06-08 中国工商银行股份有限公司 File generation method and device and storage medium
CN113064950B (en) * 2021-03-18 2024-04-16 北京沃东天骏信息技术有限公司 Data synchronization method, device, equipment and storage medium
CN112905706A (en) * 2021-03-19 2021-06-04 平安消费金融有限公司 Database synchronization method and device, storage medium and computer equipment
CN113220791B (en) * 2021-06-03 2023-07-28 西安热工研究院有限公司 Data cascading synchronization system and method
CN113312192A (en) * 2021-06-07 2021-08-27 平安证券股份有限公司 Data synchronization method and device based on window, electronic equipment and storage medium
CN113407637A (en) * 2021-07-13 2021-09-17 上海浦东发展银行股份有限公司 Data synchronization method and device, electronic equipment and storage medium
CN113609199B (en) * 2021-07-27 2023-09-12 远景智能国际私人投资有限公司 Database system, server, and storage medium
CN113722390A (en) * 2021-09-03 2021-11-30 小马国炬(玉溪)科技有限公司 Data storage method and system
CN115391361A (en) * 2022-08-24 2022-11-25 国任财产保险股份有限公司 Real-time data processing method and device based on distributed database
CN117349384B (en) * 2023-12-04 2024-03-15 四川才子软件信息网络有限公司 Database synchronization method, system and equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005284781A (en) * 2004-03-30 2005-10-13 Nomura Research Institute Ltd Mq data synchronizing system and mq data synchronizing program
CN103905503A (en) * 2012-12-27 2014-07-02 中国移动通信集团公司 Data storage method, data scheduling method, device and system
CN104572689A (en) * 2013-10-17 2015-04-29 腾讯科技(深圳)有限公司 Data synchronizing method, device and system
WO2017141229A1 (en) * 2016-02-21 2017-08-24 Geir Christian Karlsen System and method for securely exchanging data between devices
CN107783975A (en) * 2016-08-24 2018-03-09 北京京东尚科信息技术有限公司 The method and apparatus of distributed data base synchronization process
CN107844524A (en) * 2017-10-12 2018-03-27 金蝶软件(中国)有限公司 Data processing method, data processing equipment, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109739929A (en) 2019-05-10

Similar Documents

Publication Publication Date Title
CN109739929B (en) Data synchronization method, device and system
CN110019240B (en) Service data interaction method, device and system
CN108280080B (en) Data synchronization method and device and electronic equipment
EP3258396A1 (en) Data synchronization method, device and system
CN112307037B (en) Data synchronization method and device
CN110334070A (en) Data processing method, system, equipment and storage medium
CN106899654B (en) Sequence value generation method, device and system
CN112559475B (en) Data real-time capturing and transmitting method and system
CN110245134B (en) Increment synchronization method applied to search service
CN111399764B (en) Data storage method, data reading device, data storage equipment and data storage medium
CN110928851B (en) Method, device and equipment for processing log information and storage medium
WO2019057193A1 (en) Data deletion method and distributed storage system
CN106375360B (en) Graph data updating method, device and system
CN112579692B (en) Data synchronization method, device, system, equipment and storage medium
CN111680017A (en) Data synchronization method and device
CN107040576A (en) Information-pushing method and device, communication system
CN114416868B (en) Data synchronization method, device, equipment and storage medium
US11243777B2 (en) Process stream replication for content management system synchronization
CN113761052A (en) Database synchronization method and device
CN109189864B (en) Method, device and equipment for determining data synchronization delay
CN111552701A (en) Method for determining data consistency in distributed cluster and distributed data system
CN116304390A (en) Time sequence data processing method and device, storage medium and electronic equipment
CN111147226A (en) Data storage method, device and storage medium
CN111522688B (en) Data backup method and device for distributed system
CN113297327A (en) System and method for generating distributed ID

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant