CN117171132A - Data synchronization method, device and medium - Google Patents

Publication number
CN117171132A
Authority
CN
China
Prior art keywords
data
source
information
synchronization
binlog
Prior art date
Legal status
Pending
Application number
CN202311125525.3A
Other languages
Chinese (zh)
Inventor
张德亮
Current Assignee
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd
Priority to CN202311125525.3A
Publication of CN117171132A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a data synchronization method, device and medium in the field of computer technology, which address the problem of migrating data without downtime. The method comprises: configuring, in association with each other through a synchronization information table, the source end information and target end information to be synchronized; configuring a binlog parsing component according to the source end information to be synchronized in the synchronization information table, parsing the binlog of the source end with the binlog parsing component to obtain the data changes of the source end, and storing the data changes in an intermediate device; and, according to the target end information associated in the synchronization information table with the source end information to be synchronized, starting a consumer program to synchronize the data changes stored in the intermediate device to the target end. The invention allows an efficient source-to-target data synchronization strategy to be formulated, with the intermediate device acquiring, storing and synchronizing data changes according to that strategy, so that data can be migrated from the source end to the target end flexibly, controllably and without downtime.

Description

Data synchronization method, device and medium
Technical Field
The present invention relates to the field of computer technology, and in particular to a data synchronization method, an intermediate device for data synchronization, and a computer-readable storage medium.
Background
In a conventional data migration process, writes to the source end must be stopped while data is migrated in order to keep the data consistent. The service is therefore suspended during migration, which degrades service performance and user experience.
How to migrate data without downtime while preserving service performance and user experience is therefore a technical problem to be solved in this field.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data synchronization method, an intermediate device for data synchronization and a computer-readable storage medium that migrate data without downtime while preserving service performance and user experience.
In a first aspect, the present invention provides a data synchronization method, applied to an intermediate device connecting a source end and a target end, comprising:
configuring, in association with each other through a synchronization information table, source end information and target end information to be synchronized;
configuring a binlog parsing component according to the source end information to be synchronized in the synchronization information table, parsing the binlog of the source end with the binlog parsing component to obtain data changes of the source end, and storing the data changes in the intermediate device;
according to the target end information that is associated, in the synchronization information table, with the source end information to be synchronized, starting a consumer program to synchronize the data changes stored in the intermediate device to the target end;
wherein binlog is a binary log file.
Further, configuring the source end information and target end information to be synchronized in association with each other through the synchronization information table specifically comprises:
setting up a source end information table, a target end information table and a synchronization information table;
writing a plurality of source end database identifiers into the source end information table, and setting, in association with each source end database identifier, a first storage location for storing a BinlogTopic and a second storage location for storing the name and primary key of a source end data table;
writing a plurality of target end addresses into the target end information table, and setting, in association with each target end address, a third storage location for storing target end partition information;
writing, in association with each other in the synchronization information table, the source end database identifier to be synchronized and the target end address to which the source end data is to be synchronized;
wherein the BinlogTopic is the unique identifier of the binlog parsing component.
Further, configuring the binlog parsing component according to the source end information to be synchronized in the synchronization information table, parsing the binlog of the source end with the binlog parsing component to obtain the data changes of the source end, and storing the data changes in the intermediate device specifically comprises:
configuring the binlog parsing component according to the source end database identifier to be synchronized, generating the BinlogTopic of the binlog parsing component, writing the BinlogTopic into the first storage location, and setting, in the intermediate device and based on the BinlogTopic, a fourth storage location for storing the data changes of the source end;
parsing the binlog of the source end with the binlog parsing component to obtain the data changes of the source end, obtaining the name and primary key of the source end data table from the data changes, writing the name and primary key of the source end data table into the second storage location, and writing the data changes into the fourth storage location.
Further, starting the consumer program to synchronize the data changes stored in the intermediate device to the target end, according to the target end information associated in the synchronization information table with the source end information to be synchronized, specifically comprises:
obtaining the BinlogTopic and the name and primary key of the source end data table according to the source end database identifier to be synchronized, setting a partition at the target end according to the name and primary key of the source end data table and the target end address to which the source end data is to be synchronized, and writing the target end partition information into the third storage location;
starting the consumer program to obtain, according to the BinlogTopic, the data changes stored in the intermediate device, obtaining the target end partition information according to the target end address that is associated with the source end information to be synchronized, and synchronizing the obtained data changes to the target end according to the target end address and the target end partition information.
Further, the source end is specifically a MYSQL database, and the target end is specifically at least one of a MYSQL database, a KAFKA cluster, an MQ cluster and an ES cluster;
wherein MYSQL is a relational database management system, KAFKA is an open source stream processing platform, MQ is a message queue, and ES is a search server.
Further, the consumption program is in the form of one of a virtual machine, a physical machine and a container.
Further, the method further comprises:
in the synchronization information table, configuring, in association with the source end information to be synchronized, a storage period for which the data changes are kept in the intermediate device, and configuring, in association with the target end information to be synchronized, a start time of the consumer program, wherein the start time falls within the storage period; and/or configuring, in association with the target end information to be synchronized, a filter for the target end.
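The relationship between the storage period, the start time and the filter can be illustrated with a minimal Python sketch. The field names (`retention`, `start_time`, `table_filter`), the `ts` timestamp on each change and the dictionary layout of a change are assumptions made for illustration; the patent text does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class SyncLink:
    """One row of the synchronization information table (hypothetical fields)."""
    source_id: str
    target_address: str
    retention: timedelta      # storage period of changes on the intermediate device
    start_time: datetime      # point from which the consumer program starts
    table_filter: Optional[Callable[[dict], bool]] = None  # optional target filter


def validate_start_time(link: SyncLink, now: datetime) -> None:
    """Reject a start time that falls outside the storage period."""
    oldest_kept = now - link.retention
    if not (oldest_kept <= link.start_time <= now):
        raise ValueError("start_time must fall within the storage period")


def select_changes(link: SyncLink, changes: List[dict], now: datetime) -> List[dict]:
    """Return stored changes at or after start_time that pass the filter."""
    validate_start_time(link, now)
    keep = link.table_filter or (lambda change: True)
    return [c for c in changes if c["ts"] >= link.start_time and keep(c)]
```

Checking the start time against the storage period before consuming avoids silently skipping changes that the intermediate device has already discarded.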
Further, the method further comprises:
sending an alarm in response to detecting that the performance of the intermediate device is insufficient, that the binlog parsing component cannot complete a parsing operation, or that the consumer program cannot complete a synchronization operation.
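A minimal sketch of such monitoring is given below in Python. The measured quantities (free storage ratio, parser lag, consumer lag) and all thresholds are hypothetical stand-ins for the three alarm conditions; the patent text does not say how they are detected.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthSnapshot:
    """Hypothetical measurements taken on the intermediate device."""
    disk_free_ratio: float       # free storage on the intermediate device, 0..1
    parser_lag_seconds: float    # how far binlog parsing trails the source end
    consumer_lag_seconds: float  # how far the consumer program trails the parser


def collect_alarms(snapshot: HealthSnapshot,
                   min_disk_free: float = 0.10,
                   max_parser_lag: float = 60.0,
                   max_consumer_lag: float = 300.0) -> List[str]:
    """Return one alarm message per violated condition (thresholds are assumed)."""
    alarms = []
    if snapshot.disk_free_ratio < min_disk_free:
        alarms.append("intermediate device: storage nearly full")
    if snapshot.parser_lag_seconds > max_parser_lag:
        alarms.append("binlog parsing component: parsing is falling behind")
    if snapshot.consumer_lag_seconds > max_consumer_lag:
        alarms.append("consumer program: synchronization is falling behind")
    return alarms
```

The returned messages would then be passed to whatever alerting channel the deployment uses.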
In a second aspect, the present invention provides an intermediate device for data synchronization, the intermediate device connecting a source end and a target end and comprising:
a configuration module, configured to configure, in association with each other through a synchronization information table, source end information and target end information to be synchronized;
a parsing module, connected to the configuration module and configured to configure a binlog parsing component according to the source end information to be synchronized in the synchronization information table, parse the binlog of the source end with the binlog parsing component to obtain data changes of the source end, and store the data changes in the intermediate device;
a synchronization module, connected to the parsing module and configured to start a consumer program to synchronize the data changes stored in the intermediate device to the target end according to the target end information that is associated, in the synchronization information table, with the source end information to be synchronized;
wherein binlog is a binary log file.
In a third aspect, the present invention provides a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements a data synchronization method as described above.
With the data synchronization method, intermediate device and computer-readable storage medium provided by the invention, the associated configuration in the synchronization information table lets the intermediate device promptly obtain the source end information and target end information to be synchronized. A binlog parsing component then obtains the data changes of the source end in real time and stores them in the intermediate device, and a consumer program synchronizes the stored data changes to the target end. Because the source end and target end can be flexibly associated through the synchronization information table, an efficient source-to-target data synchronization strategy can be formulated, and the intermediate device acquires, stores and synchronizes data changes according to that strategy, so that data is migrated from the source end to the target end flexibly, controllably and without downtime.
Drawings
FIG. 1 is a schematic diagram of a data synchronization system according to an embodiment of the present invention;
FIG. 2 is a flow chart of a data synchronization method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an intermediate device for data synchronization according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of another data synchronization system according to an embodiment of the present invention.
Detailed Description
In order to make the technical scheme of the present invention better understood by those skilled in the art, the following detailed description of the embodiments of the present invention will be given with reference to the accompanying drawings.
It is to be understood that the specific embodiments and figures described herein are merely illustrative of the invention, and are not limiting of the invention.
It is to be understood that the various embodiments of the invention and the features of the embodiments may be combined with each other without conflict.
It is to be understood that only the portions relevant to the present invention are shown in the drawings for convenience of description, and the portions irrelevant to the present invention are not shown in the drawings.
It should be understood that each unit and module in the embodiments of the present invention may correspond to only one physical structure, may be formed by a plurality of physical structures, or may be integrated into one physical structure.
It will be appreciated that the functions and steps noted in the flowcharts and block diagrams of the subject invention can occur out of the order noted in the figures without conflict.
It is to be understood that the flowcharts and block diagrams of the present invention illustrate the architecture, functionality, and operation of possible implementations of systems, apparatuses, devices, methods according to various embodiments of the present invention. Where each block in the flowchart or block diagrams may represent a unit, module, segment, code, or the like, which comprises executable instructions for implementing the specified functions. Moreover, each block or combination of blocks in the block diagrams and flowchart illustrations can be implemented by hardware-based systems that perform the specified functions, or by combinations of hardware and computer instructions.
It should be understood that the units and modules related in the embodiments of the present invention may be implemented by software, or may be implemented by hardware, for example, the units and modules may be located in a processor.
Example 1:
This embodiment provides a data synchronization method, applied to an intermediate device 2 connecting a source end 1 and a target end 3 as shown in FIG. 1. As shown in FIG. 2, the method comprises the following steps:
S21: configuring, in association with each other through a synchronization information table, source end information and target end information to be synchronized;
S22: configuring a binlog parsing component according to the source end information to be synchronized in the synchronization information table, parsing the binlog of the source end with the binlog parsing component to obtain data changes of the source end, and storing the data changes in the intermediate device;
S23: according to the target end information that is associated, in the synchronization information table, with the source end information to be synchronized, starting a consumer program to synchronize the data changes stored in the intermediate device to the target end;
wherein binlog is a binary log file.
Specifically, in this embodiment, a data synchronization system as shown in FIG. 1 is first constructed by adding an intermediate device between the source end and the target end. The structure of the intermediate device is shown in FIG. 3: it includes a configuration module 21 that implements step S21, a parsing module 22 that implements step S22, and a synchronization module 23 that implements step S23. The configuration module 21 performs the associated configuration in the synchronization information table, so that the intermediate device promptly obtains the source end information and target end information to be synchronized. The parsing module 22 then uses the binlog parsing component to obtain the data changes of the source end in real time and stores them in the intermediate device, and the consumer program then synchronizes the stored data changes to the target end. Because the source end and target end can be flexibly associated through the synchronization information table, an efficient source-to-target data synchronization strategy can be formulated, and the intermediate device acquires, stores and synchronizes data changes according to that strategy, so that data is migrated flexibly and controllably. In addition, the intermediate device stores only the data changes of the source end rather than the full data, so the data volume is small and real-time synchronization is fast. The data changes are stored on the host of the intermediate device, which offers large capacity and reliable storage, and multiple target ends can be synchronized at the same time by configuring the table association information.
Further, configuring the source end information and target end information to be synchronized in association with each other through the synchronization information table specifically comprises:
setting up a source end information table, a target end information table and a synchronization information table;
writing a plurality of source end database identifiers into the source end information table, and setting, in association with each source end database identifier, a first storage location for storing a BinlogTopic and a second storage location for storing the name and primary key of a source end data table;
writing a plurality of target end addresses into the target end information table, and setting, in association with each target end address, a third storage location for storing target end partition information;
writing, in association with each other in the synchronization information table, the source end database identifier to be synchronized and the target end address to which the source end data is to be synchronized;
wherein the BinlogTopic is the unique identifier of the binlog parsing component.
Specifically, in this embodiment, the data synchronization system may take the structure shown in FIG. 4. The source end may be a source end database, the target end may be a target end cluster, and the intermediate device may include a binlog parsing component, a local storage module and a consumer program cluster. The binlog parsing component of the intermediate device monitors the binlog of the source end, obtains the data changes of the source end in real time from the binlog, and stores them in the local storage module. The consumer program cluster then migrates the data changes of the source end from the intermediate device to the target end. The local storage module also stores the table association policy that controls the data migration process. The table association policy comprises three tables: the source end information table, the target end information table and the synchronization information table. The synchronization information table associates the source end information and target end information to be synchronized: the source end and target end to be synchronized are first identified from the synchronization information table, and their detailed information is then obtained from the source end information table and the target end information table respectively. Multiple sets of associated source end and target end information can be recorded, so that multiple synchronization tasks can be configured in the system.
More specifically, consider a scenario in which a service is deployed in a single region. If data is migrated with downtime, the overall migration may take several days, depending on the data volume and network latency, and a multi-day outage has a severe impact on the service. Moreover, as the service grows rapidly and the number of users increases, cross-region access suffers from high latency and the service cannot be accessed in real time, which limits the service's ability to scale. Force majeure events such as power failures and network interruptions may also make the data unavailable. A high-performance data migration method is therefore necessary. Incremental data synchronization is one possible solution. For example, a massive platform serves as the source database and a Hadoop (a distributed system infrastructure developed by the Apache Foundation) big data platform serves as the target database; MapReduce (a programming model for parallel computation over large data sets, larger than 1 TB) is the big data computation engine, HDFS (a distributed file system) stores unstructured and semi-structured data, and HBase (a distributed, open-source database) stores structured data; an automatic incremental import method developed in Java (an object-oriented programming language) then imports increments from the source database into the target database. Although this approach achieves incremental synchronization and avoids the long outage caused by migrating all data at once, its real-time performance is poor and cannot meet the real-time requirements of the service.
Another data synchronization method is based on Canal (a database synchronization tool). MYSQL (a relational database management system), Oracle (a relational database management system) and the like serve as source end databases, and KAFKA (an open source stream processing platform), MYSQL and the like serve as target ends. The tool simulates the dump protocol between a MYSQL slave and a MYSQL master and parses the binlog (a log file in binary format) to meet the basic requirements of data synchronization. However, a separate set of synchronization programs must be deployed for each target end: if the service expands to 31 provinces, 31 program clusters must be deployed for data synchronization, and even more clusters are needed if data is synchronized to different target ends. This wastes cluster resources, and because the source end database must be queried continuously, the database load is high, which in severe cases affects normal use of the service.
In view of the above scenario and the shortcomings of the available options, this embodiment provides a real-time data synchronization system and method that migrates data without downtime and with high synchronization efficiency. The method is based on the database binlog. The binlog is parsed in the intermediate device: the intermediate device obtains the incremental data of the source end by parsing the binlog and then synchronizes the incremental data to the target end. Because the intermediate device only needs to store incremental data, storage space is saved and synchronization efficiency is improved. Specifically, the data synchronization system includes a source end RDS (relational database service) database such as MYSQL, a database binlog parsing component, a target end cluster, and a consumer program cluster that synchronizes the transaction data of the source end database to the target end in real time. The consumer program cluster obtains a data synchronization control policy from the table association policy and uses it to control the synchronization time and the target ends to be synchronized. The table association policy associates newly created or modified entries of the source end information table, the target end information table and the synchronization information table. Each of the three tables has an id (identity) field that uniquely identifies a link, and the three tables are associated by id. The source end information table stores the source end database information, BinlogTopic, database name, table name, primary key and so on. The target end information table stores different information for different target ends: for KAFKA it stores the KAFKA cluster information, topic, partition and so on, and for a database it stores the data table information corresponding to the source end database and so on. The synchronization information table stores the source end to be synchronized and its corresponding target end, the consumption time point at which data synchronization starts, and so on.
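The three-table association described above can be sketched with an in-memory SQLite database in Python. The column names, the sample rows and the use of foreign keys from the synchronization information table are assumptions chosen for illustration; this is one possible reading of "associated by id", not the schema of the patent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_info (
    id INTEGER PRIMARY KEY,
    database_id TEXT NOT NULL,   -- source end database identifier
    binlog_topic TEXT,           -- BinlogTopic of the parsing component
    db_name TEXT,
    table_name TEXT,
    primary_key TEXT
);
CREATE TABLE target_info (
    id INTEGER PRIMARY KEY,
    address TEXT NOT NULL,       -- target end address
    kind TEXT NOT NULL,          -- e.g. 'kafka' or 'mysql'
    partition_info TEXT          -- topic/partition details, if any
);
CREATE TABLE sync_info (
    id INTEGER PRIMARY KEY,      -- uniquely identifies one link
    source_id INTEGER NOT NULL REFERENCES source_info(id),
    target_id INTEGER NOT NULL REFERENCES target_info(id),
    start_time TEXT              -- consumption time point
);
""")

# Hypothetical sample data: one source end synchronized to two target ends.
conn.execute(
    "INSERT INTO source_info VALUES (1, 'mysql-prod-01', 'binlog-mysql-prod-01', "
    "'shop', 'orders', 'order_id')"
)
conn.executemany(
    "INSERT INTO target_info VALUES (?, ?, ?, ?)",
    [(1, "kafka-a:9092", "kafka", "topic=orders;partitions=3"),
     (2, "mysql-b:3306", "mysql", None)],
)
conn.executemany(
    "INSERT INTO sync_info VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "2023-09-01T00:00:00"), (2, 1, 2, "2023-09-01T00:00:00")],
)


def resolve_links(connection, database_id):
    """Look up every target end associated with one source end database."""
    return connection.execute(
        """
        SELECT s.binlog_topic, s.table_name, s.primary_key, t.address, t.kind
        FROM sync_info AS y
        JOIN source_info AS s ON s.id = y.source_id
        JOIN target_info AS t ON t.id = y.target_id
        WHERE s.database_id = ?
        ORDER BY y.id
        """,
        (database_id,),
    ).fetchall()
```

Adding a target end for an existing source end is then a matter of inserting one more row into the synchronization information table, without deploying another parsing component.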
Further, configuring the binlog parsing component according to the source end information to be synchronized in the synchronization information table, parsing the binlog of the source end with the binlog parsing component to obtain the data changes of the source end, and storing the data changes in the intermediate device specifically comprises:
configuring the binlog parsing component according to the source end database identifier to be synchronized, generating the BinlogTopic of the binlog parsing component, writing the BinlogTopic into the first storage location, and setting, in the intermediate device and based on the BinlogTopic, a fourth storage location for storing the data changes of the source end;
parsing the binlog of the source end with the binlog parsing component to obtain the data changes of the source end, obtaining the name and primary key of the source end data table from the data changes, writing the name and primary key of the source end data table into the second storage location, and writing the data changes into the fourth storage location.
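The two steps above can be sketched in Python as follows. The sketch does not read a real binlog: it assumes that each binlog event has already been decoded into a dictionary with `op`, `table`, `primary_key` and `row` entries, and the class and function names are hypothetical. The dictionaries stand in for the first, second and fourth storage locations.

```python
import uuid


class IntermediateStore:
    """In-memory stand-in for the storage locations on the intermediate device."""

    def __init__(self):
        self.binlog_topics = {}  # first location: source database id -> BinlogTopic
        self.table_meta = {}     # second location: source database id -> {table: primary key}
        self.changes = {}        # fourth location: BinlogTopic -> list of data changes


def configure_parser(store, source_db_id):
    """Generate a unique BinlogTopic and reserve storage for its data changes."""
    topic = f"binlog-{source_db_id}-{uuid.uuid4().hex[:8]}"
    store.binlog_topics[source_db_id] = topic
    store.table_meta[source_db_id] = {}
    store.changes[topic] = []
    return topic


def handle_event(store, source_db_id, event):
    """Record table name and primary key, then append the data change."""
    topic = store.binlog_topics[source_db_id]
    store.table_meta[source_db_id][event["table"]] = event["primary_key"]
    store.changes[topic].append(
        {"op": event["op"], "table": event["table"], "row": dict(event["row"])}
    )
```

Keying the stored changes by BinlogTopic lets several consumer programs read the same stream of changes independently.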
Specifically, in the binlog-based data synchronization system of this embodiment, the source end database is a MYSQL database. The database binlog parsing component parses the database binlog information, parses, filters and normalizes the format of the source end data, and stores it locally. The consumer program cluster consumes the locally stored data in real time, that is, it processes the source end data produced by the database binlog parsing component and synchronizes it to the target end cluster in real time. The consumer program cluster relies on the table association policy for data synchronization; the table association policy associates newly created or modified source end information, target end information and the information of the data tables to be synchronized. The target end cluster includes a MYSQL database, a KAFKA cluster, an MQ cluster and the like.
Further, starting the consumer program to synchronize the data changes stored in the intermediate device to the target end, according to the target end information associated in the synchronization information table with the source end information to be synchronized, specifically comprises:
obtaining the BinlogTopic and the name and primary key of the source end data table according to the source end database identifier to be synchronized, setting a partition at the target end according to the name and primary key of the source end data table and the target end address to which the source end data is to be synchronized, and writing the target end partition information into the third storage location;
starting the consumer program to obtain, according to the BinlogTopic, the data changes stored in the intermediate device, obtaining the target end partition information according to the target end address that is associated with the source end information to be synchronized, and synchronizing the obtained data changes to the target end according to the target end address and the target end partition information.
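The text does not say how a partition is derived from the table name and primary key. One common choice, shown here purely as an assumption, is to hash the table name together with the primary key value, so that all changes to the same row land in the same partition and keep their order. The function name is hypothetical.

```python
import zlib


def choose_partition(table_name, primary_key_value, partition_count):
    """Map a row to a partition with a stable hash (assumed routing rule)."""
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    routing_key = f"{table_name}:{primary_key_value}".encode("utf-8")
    # crc32 is stable across processes, unlike Python's built-in hash() for str
    return zlib.crc32(routing_key) % partition_count
```

A stable hash matters here because the consumer program may be restarted or run on several hosts, and every instance must route a given row to the same partition.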
Specifically, in this embodiment, the overall operation flow of the system is as follows. The intermediate device obtains the identifier of the source MYSQL database for which the service side requires data synchronization, configures a database binlog analysis component according to that identifier, and generates a unique database binlog analysis ID for the database (hereinafter referred to as BinlogTopic). Because MYSQL records every operation in the binlog, the analysis component can monitor real-time changes of the source MYSQL database, parse the binlog of that database in real time, format the real-time transaction data (the user's operations on the source database, including insert, update, delete, etc.) and store the formatted data locally, that is, on the cluster host where the analysis component runs.
The table association information for data synchronization is then configured. The user first enters the source database information, the target synchronization information and the information on the tables to be synchronized into the storage module of the consumption program cluster. The MYSQL database required by the service side and the names of the data tables to be synchronized are determined from the table association information; each synchronized data table must have a primary key that uniquely identifies one piece of data. The target end to be synchronized to is also determined: if the target end is a database, a data table consistent with the source table structure must have been created in the target database; if the target end is a KAFKA or MQ cluster, the corresponding topic and partitions must have been created and must be reachable over the network from the consumption program cluster.
The consumption program is then started. It obtains the source database information from the table association information and, according to the BinlogTopic information in that source database information, obtains the transaction information stored locally under the BinlogTopic corresponding to the source MYSQL database. It obtains the data synchronization table information from the table association information and screens and processes the transaction information according to the tables listed there. It then obtains the target synchronization information from the table association information and synchronizes the screened and processed data to the target end in real time according to the configured target address. The transmitted data may take a key-value form, in which the key is a field name of the table and the value is the corresponding field value.
Through the above flow, this embodiment solves the problem of service suspension caused by large data volumes and delay during user migration. Change data of multiple tables can be synchronized concurrently, the synchronization link as a whole is simple and efficient, synchronization efficiency is higher, and services continue to run normally during synchronization. One source database can be configured to synchronize to multiple target ends concurrently, for example to a MYSQL database and a KAFKA cluster at the same time, which avoids the resource waste of deploying separate clusters and reduces the performance impact on the source database. Cross-domain synchronization and multi-site active-active deployment can also be realized, so that a single copy of data is synchronously backed up to multiple stores.
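The consumption step described above (screen the stored changes by the configured tables, then deliver each change to every configured target) can be illustrated with a minimal sketch. It is not the patented implementation: the names `ChangeEvent`, `TargetWriter` and `consume` are hypothetical, and the targets are plain Python callables standing in for real MYSQL, KAFKA, MQ or ES clients.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass(frozen=True)
class ChangeEvent:
    """One formatted binlog change; every field name here is illustrative."""
    table: str               # source table the change belongs to
    op: str                  # "insert", "update" or "delete"
    row: Dict[str, object]   # key-value payload: field name -> field value


# A target writer is any callable that accepts one event. A real writer would
# wrap a MYSQL, KAFKA, MQ or ES client; none of those clients is modelled here.
TargetWriter = Callable[[ChangeEvent], None]


def consume(events: Iterable[ChangeEvent],
            sync_tables: Set[str],
            targets: List[TargetWriter]) -> int:
    """Screen events by the configured tables and fan them out to all targets.

    Returns the number of events that passed the table screen.
    """
    delivered = 0
    for event in events:
        if event.table not in sync_tables:
            continue  # table is not configured for synchronization
        for write in targets:
            write(event)  # the same change goes to every configured target
        delivered += 1
    return delivered


if __name__ == "__main__":
    mysql_sink: List[ChangeEvent] = []
    kafka_sink: List[ChangeEvent] = []
    stored = [
        ChangeEvent("users", "insert", {"id": 1, "name": "alice"}),
        ChangeEvent("audit_log", "insert", {"id": 7}),
    ]
    count = consume(stored, {"users"}, [mysql_sink.append, kafka_sink.append])
    print(count, len(mysql_sink), len(kafka_sink))
```

Keeping the targets as interchangeable callables mirrors the idea that one source database can feed several target ends at once without the consumer knowing which kind of system sits behind each address.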
Further, the source end is specifically a MYSQL database, and the target end is specifically at least one of a MYSQL database, a KAFKA cluster, an MQ cluster and an ES cluster;
wherein MYSQL is a relational database management system, KAFKA is an open source stream processing platform, MQ is a message queue, and ES is a search server.
Specifically, in this embodiment, the source end may be a MYSQL database, and the target end may include a MYSQL database, a KAFKA cluster, an MQ (Message Queue) cluster, an ES (search server) cluster, and the like. Source data may be synchronized to multiple target ends at the same time, for example to an RDS database and a KAFKA cluster.
Further, the consumption program is in the form of one of a virtual machine, a physical machine and a container.
Specifically, in the present embodiment, the consumer program cluster may be composed of virtual machines, physical machines, k8s (Kubernetes, a completely new distributed architecture solution based on container technology) containers, and the like.
Further, the method further comprises:
in the synchronization information table, configuring, in association with the source end information to be subjected to data synchronization, a storage period for which the data changes are stored in the intermediate device, and configuring, in association with the target end information to be subjected to data synchronization, a start time of the consumption program, wherein the start time falls within the storage period; and/or configuring, in association with the target end information to be subjected to data synchronization, a filter condition for the target end.
Specifically, in this embodiment, the table association policy for data synchronization may be organized as three tables: a source end information table, which stores source RDS database information, program deployment cluster information, and the like; a target end information table, which stores the target end address, user name and password, attribute information, and the like; and a data synchronization information table, which stores source database information, synchronization table information, the synchronization mode (such as the consumption time point and the API (Application Programming Interface) address), and the like. When data are transmitted to a target end, field conditions can be set to filter the data. The data storage period can be set to 7 days, the consumption time point can be configured dynamically so that synchronization starts from a chosen time, and the tables to be synchronized can also be configured dynamically. Breakpoint resumption is thereby possible: because changes to the synchronized tables of the source database are retained for 7 days, synchronization can start from any consumption time point within those 7 days simply by adjusting the consumption time point field in the data synchronization information table. If a downstream system temporarily stops a data synchronization task for service reasons, the task can be resumed within 7 days from the time at which consumption was last suspended, which prevents data loss. Dynamically configured field-condition filtering at the synchronization target also allows data to be distributed flexibly to each province.
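The breakpoint-resumption rule can be sketched as follows: a consumer may restart from any consumption time point that still lies inside the storage period, and a field condition may narrow what is sent to a given target. This is only an illustration under assumptions. The function `replay_from`, the `Change` tuple layout and the `province` field are hypothetical; only the 7-day period comes from the embodiment.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional, Tuple

RETENTION = timedelta(days=7)  # storage period used in the embodiment

# (commit time, key-value row); this layout is an assumption for illustration.
Change = Tuple[datetime, Dict[str, object]]
RowFilter = Callable[[Dict[str, object]], bool]


def replay_from(changes: List[Change],
                consume_from: datetime,
                now: datetime,
                field_filter: Optional[RowFilter] = None
                ) -> List[Dict[str, object]]:
    """Return stored rows committed at or after consume_from.

    Raises ValueError when the requested consumption time point is no longer
    (or not yet) covered by the storage period.
    """
    if consume_from < now - RETENTION or consume_from > now:
        raise ValueError("consumption time point is outside the storage period")
    rows = [row for committed_at, row in changes if committed_at >= consume_from]
    if field_filter is not None:
        # Field-condition filtering configured for this target end.
        rows = [row for row in rows if field_filter(row)]
    return rows


if __name__ == "__main__":
    now = datetime(2023, 9, 8, 12, 0)
    stored = [
        (now - timedelta(days=3), {"id": 1, "province": "A"}),
        (now - timedelta(hours=2), {"id": 2, "province": "B"}),
    ]
    # Resume from two days ago and keep only rows for province "B".
    print(replay_from(stored, now - timedelta(days=2), now,
                      lambda row: row["province"] == "B"))
```

Rejecting an out-of-range time point explicitly, rather than silently returning a partial result, makes it visible when a paused task has waited longer than the storage period and data may already have been discarded.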
Further, the method further comprises:
and sending an alarm in response to monitoring that the performance of the intermediate device is insufficient, the binlog analysis component cannot complete analysis operation, or the consumption program cannot complete synchronous operation.
Specifically, in this embodiment, auxiliary steps may also be provided: monitoring and alarming for the database binlog analysis component service, for example whether the BinlogTopic is operating normally and whether delay occurs, where delay means that the source end has a large volume of data changes and the corresponding BinlogTopic cannot parse all operations in time; monitoring and alarming for the generated data synchronization tasks, that is, the tasks that synchronize data to the target end, for example whether a task has failed and whether delay occurs, where delay means that the source end performs operations on a large volume of data that cannot be synchronized to the target end in time; and monitoring and alarming for the host where the binlog analysis component service runs and the hosts where the consumption program cluster runs, for example whether the disk of the host running the binlog analysis component is full or its memory is insufficient. In addition, the service side may periodically check whether the MYSQL database and the target end cluster are inconsistent and, if so, investigate the cause and repair the data.
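The three kinds of checks listed above (parser delay, synchronization task failure or delay, and host resources) can be reduced to a simple threshold evaluation. The sketch below is an assumption-laden illustration: the `HealthSample` fields, the alarm texts and both default thresholds are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthSample:
    """One monitoring snapshot; fields and units are illustrative."""
    parse_lag_seconds: float   # how far the binlog parser is behind the source
    sync_lag_seconds: float    # how far the consumption program is behind
    disk_used_ratio: float     # 0.0 to 1.0 on the parser host
    task_failed: bool          # whether a synchronization task has failed


def evaluate(sample: HealthSample,
             max_lag_seconds: float = 60.0,
             max_disk_ratio: float = 0.9) -> List[str]:
    """Return the alarms raised by one snapshot (an empty list means healthy)."""
    alarms: List[str] = []
    if sample.parse_lag_seconds > max_lag_seconds:
        alarms.append("binlog parsing delayed")
    if sample.task_failed:
        alarms.append("synchronization task failed")
    if sample.sync_lag_seconds > max_lag_seconds:
        alarms.append("synchronization delayed")
    if sample.disk_used_ratio > max_disk_ratio:
        alarms.append("host disk nearly full")
    return alarms


if __name__ == "__main__":
    for alarm in evaluate(HealthSample(120.0, 5.0, 0.95, False)):
        print("ALARM:", alarm)
```

In practice the thresholds would be tuned per deployment, and a memory check could be added in the same way as the disk check.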
Example 2:
As shown in Figs. 1 and 3, the present invention provides an intermediate device 2 for data synchronization. The intermediate device 2 connects a source end 1 and a target end 3, and includes:
a configuration module 21, configured to associate and configure source side information and target side information to be subjected to data synchronization through a synchronization information table;
the analysis module 22 is connected with the configuration module 21 and is used for configuring a binlog analysis component according to the source end information to be subjected to data synchronization in the synchronization information table, parsing the binlog of the source end based on the binlog analysis component to acquire the data changes of the source end, and storing the data changes in the intermediate device;
the synchronization module 23 is connected with the analysis module 22 and is used for starting a consumption program to synchronize the data changes stored in the intermediate device to the target end according to the target end information that is associated, in the synchronization information table, with the source end information to be subjected to data synchronization;
where binlog is a binary log file.
Further, the configuration module 21 specifically includes:
the table setting unit is used for setting a source end information table, a target end information table and a synchronous information table;
the source end information table unit is used for writing a plurality of source end database identifiers into the source end information table and, in association with each source end database identifier, setting a first storage location for storing the BinlogTopic and a second storage location for storing the name and primary key of the source end data table;
the target end information table unit is used for writing a plurality of target end addresses into the target end information table and setting a third storage position for storing target end partition information in association with each target end address;
the synchronous information table unit is used for writing a source end database identifier to be subjected to data synchronization and a target end address to which the source end data are to be synchronized in a correlated manner in the synchronous information table;
wherein BinlogTopic is the unique identification of the binlog parsing component.
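One way to picture the three tables maintained by the configuration module is as plain records. The sketch below is only a reading aid under assumptions: the class names, field names and example addresses are hypothetical, and the comments map each field to the storage location it is meant to represent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SourceEntry:
    """One row of the source end information table (names are illustrative)."""
    database_id: str
    binlog_topic: Optional[str] = None                      # first storage location
    table_keys: Dict[str, str] = field(default_factory=dict)  # second storage location:
                                                              # table name -> primary key


@dataclass
class TargetEntry:
    """One row of the target end information table."""
    address: str
    partitions: List[str] = field(default_factory=list)    # third storage location


@dataclass
class SyncEntry:
    """One row of the synchronization information table: source -> target."""
    source_database_id: str
    target_address: str


def targets_for(sync_table: List[SyncEntry], source_database_id: str) -> List[str]:
    """Return every target address associated with one source database."""
    return [entry.target_address
            for entry in sync_table
            if entry.source_database_id == source_database_id]


if __name__ == "__main__":
    sync_table = [
        SyncEntry("orders_db", "mysql://target-a"),
        SyncEntry("orders_db", "kafka://target-b"),
    ]
    print(targets_for(sync_table, "orders_db"))
```

Because the association lives in its own table, adding a second target for an existing source is a matter of adding one `SyncEntry` row, without touching the source or target records.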
Further, the parsing module 22 specifically includes:
the BinlogTopic unit is used for configuring a binlog analysis component according to the source end database identifier to be subjected to data synchronization, generating the BinlogTopic of the binlog analysis component, writing the BinlogTopic into the first storage location, and setting, in the intermediate device and based on the BinlogTopic, a fourth storage location for storing the data changes of the source end;
the data change information unit is used for parsing the binlog of the source end based on the binlog analysis component to acquire the data changes of the source end, acquiring the name and primary key of the source end data table according to the data changes, writing the name and primary key of the source end data table into the second storage location, and writing the data changes into the fourth storage location.
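The bookkeeping performed by the two units of the parsing module can be sketched with an in-memory store. This is a simplified stand-in, not the patented component: `ChangeStore` and its methods are hypothetical, no real binlog is parsed, and generating the BinlogTopic from a random UUID is an assumption, since the text only requires the identifier to be unique.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Tuple

Change = Tuple[str, str, Dict[str, object]]  # (table name, operation, row)


class ChangeStore:
    """In-memory stand-in for the storage kept on the intermediate device."""

    def __init__(self) -> None:
        self._topic_by_db: Dict[str, str] = {}  # database id -> BinlogTopic
        # Changes grouped per BinlogTopic (the "fourth storage location").
        self._changes: Dict[str, List[Change]] = defaultdict(list)
        # Table name -> primary key (the "second storage location").
        self.primary_keys: Dict[str, str] = {}

    def register(self, database_id: str) -> str:
        """Create, or return the existing, BinlogTopic for a source database."""
        if database_id not in self._topic_by_db:
            # Assumption: uniqueness is obtained from a random UUID suffix.
            self._topic_by_db[database_id] = f"{database_id}-{uuid.uuid4().hex}"
        return self._topic_by_db[database_id]

    def append(self, topic: str, table: str, primary_key: str,
               op: str, row: Dict[str, object]) -> None:
        """Record one parsed change under its BinlogTopic."""
        if topic not in self._topic_by_db.values():
            raise KeyError(f"unknown BinlogTopic: {topic}")
        self.primary_keys[table] = primary_key
        self._changes[topic].append((table, op, row))

    def read(self, topic: str) -> List[Change]:
        """Return a copy of the changes stored under one BinlogTopic."""
        return list(self._changes.get(topic, []))


if __name__ == "__main__":
    store = ChangeStore()
    topic = store.register("orders_db")
    store.append(topic, "orders", "order_id", "insert", {"order_id": 1})
    print(store.read(topic))
```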
Further, the synchronization module 23 specifically includes:
the target end partition information unit is used for acquiring the BinlogTopic and the name and primary key of the source end data table according to the source end database identifier to be subjected to data synchronization, setting a partition at the target end according to the name and primary key of the source end data table and the target end address to which the source end data are to be synchronized, and writing the target end partition information into the third storage location;
the target end partition synchronization unit is used for starting a consumption program to acquire, according to the BinlogTopic, the data changes stored in the intermediate device, acquiring the target end partition information according to the target end address that is associated with the source end information to be subjected to data synchronization and to which the source end data are to be synchronized, and synchronizing the acquired data changes to the target end according to the target end address and the target end partition information.
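The text does not say how a partition is derived from the table name and primary key. One common approach, shown here purely as an assumption, is to hash the table name together with the primary key value and take the result modulo the number of partitions; the function `partition_for` is hypothetical.

```python
import hashlib


def partition_for(table: str, primary_key_value: object, num_partitions: int) -> int:
    """Map one row to a partition index in the range [0, num_partitions).

    Assumption: hashing is used. A cryptographic digest is chosen instead of
    Python's built-in hash() because hash() of a str varies between runs.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    key = f"{table}:{primary_key_value}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    # The same row always lands in the same partition.
    print(partition_for("orders", 1001, 4) == partition_for("orders", 1001, 4))
```

Under this scheme all changes to one row go to the same partition, so a target that preserves order within a partition will apply that row's inserts, updates and deletes in the order they were stored.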
Further, the source end is specifically a MYSQL database, and the target end is specifically at least one of a MYSQL database, a KAFKA cluster, an MQ cluster and an ES cluster;
wherein MYSQL is a relational database management system, KAFKA is an open source stream processing platform, MQ is a message queue, and ES is a search server.
Further, the consumption program is in the form of one of a virtual machine, a physical machine and a container.
Further, the synchronization information table unit is further configured to:
in the synchronization information table, configuring, in association with the source end information to be subjected to data synchronization, a storage period for which the data changes are stored in the intermediate device, and configuring, in association with the target end information to be subjected to data synchronization, a start time of the consumption program, wherein the start time falls within the storage period; and/or configuring, in association with the target end information to be subjected to data synchronization, a filter condition for the target end.
Further, the intermediate device 2 further includes:
and the monitoring and alarming unit is used for sending out an alarm in response to monitoring that the performance of the intermediate device is insufficient, that the binlog analysis component cannot complete the parsing operation, or that the consumption program cannot complete the synchronization operation.
Example 3:
Embodiment 3 of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data synchronization method described in embodiment 1.
Computer-readable storage media include volatile or nonvolatile, removable or non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, computer program modules or other data. Computer-readable storage media include, but are not limited to, RAM (Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory or other memory technology, CD-ROM (Compact Disc Read-Only Memory), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
In addition, the present invention may also provide a computer apparatus including a memory and a processor, the memory storing a computer program, the processor executing the data synchronization method as described in embodiment 1 when the processor runs the computer program stored in the memory.
The memory is connected to the processor. The memory may be a flash memory, a read-only memory or another type of memory, and the processor may be a central processing unit or a microcontroller.
Furthermore, the present invention may also provide a data synchronization system as shown in fig. 1 or fig. 4, in which the intermediate device performs the data synchronization method as described in embodiment 1, and the remaining components and functions thereof are as described in embodiment 1.
Embodiments 1 to 3 of the present invention provide a data synchronization method, an intermediate device for data synchronization, and a computer-readable storage medium. Through the association configuration in the synchronization information table, the intermediate device can promptly obtain the source end information and target end information to be subjected to data synchronization. A binlog analysis component then acquires the data changes of the source end in real time and stores them in the intermediate device, and a consumption program synchronizes the stored data changes to the target end. Because the synchronization information table allows the source end and the target end to be associated flexibly, a data synchronization policy from the source end to the target end can be formulated, and according to that policy the intermediate device can migrate data from the source end to the target end without downtime and in a flexible, controllable manner.
It is to be understood that the above embodiments are merely illustrative of the application of the principles of the present invention, but not in limitation thereof. Various modifications and improvements may be made by those skilled in the art without departing from the spirit and substance of the invention, and are also considered to be within the scope of the invention.

Claims (10)

1. A data synchronization method, characterized in that the method is applied to an intermediate device connecting a source end and a target end, and comprises the following steps:
source terminal information and target terminal information to be subjected to data synchronization are configured in an associated mode through a synchronization information table;
configuring a binlog analysis component according to source information to be subjected to data synchronization in a synchronization information table, analyzing binlog of a source based on the binlog analysis component to obtain data change of the source, and storing the data change in an intermediate device;
according to target end information to be subjected to data synchronization, which is associated with source end information to be subjected to data synchronization, in the synchronization information table, starting a consumption program to synchronize data changes stored in the intermediate equipment to the target end;
where binlog is a binary log file.
2. The method according to claim 1, wherein the source side information and the target side information to be subjected to data synchronization are configured in a correlated manner through a synchronization information table, specifically comprising:
setting a source end information table, a target end information table and a synchronous information table;
writing a plurality of source end database identifiers into the source end information table and, in association with each source end database identifier, setting a first storage location for storing the BinlogTopic and a second storage location for storing the name and primary key of the source end data table;
writing a plurality of target end addresses into a target end information table, and setting a third storage position for storing target end partition information in association with each target end address;
writing a source end database identifier to be subjected to data synchronization and a target end address to which source end data are to be synchronized in a related manner in a synchronization information table;
wherein BinlogTopic is the unique identification of the binlog parsing component.
3. The method according to claim 2, wherein the binlog parsing component is configured according to source information to be data synchronized in the synchronization information table, and the binlog of the source is parsed by the binlog parsing component to obtain data changes of the source, and the data changes are stored in the intermediate device, specifically including:
configuring a binlog analysis component according to the source end database identifier to be subjected to data synchronization, generating the BinlogTopic of the binlog analysis component, writing the BinlogTopic into the first storage location, and setting, in the intermediate device and based on the BinlogTopic, a fourth storage location for storing the data changes of the source end;
and parsing the binlog of the source end based on the binlog analysis component to acquire the data changes of the source end, acquiring the name and primary key of the source end data table according to the data changes, writing the name and primary key of the source end data table into the second storage location, and writing the data changes into the fourth storage location.
4. A method according to claim 3, wherein starting the consumption program to synchronize the data change stored in the intermediate device to the target according to the target information to be synchronized associated with the source information to be synchronized in the synchronization information table, specifically comprises:
acquiring the BinlogTopic and the name and primary key of the source end data table according to the source end database identifier to be subjected to data synchronization, setting a partition at the target end according to the name and primary key of the source end data table and the target end address to which the source end data are to be synchronized, and writing the target end partition information into the third storage location;
and starting a consumption program to acquire, according to the BinlogTopic, the data changes stored in the intermediate device, acquiring the target end partition information according to the target end address that is associated with the source end information to be subjected to data synchronization and to which the source end data are to be synchronized, and synchronizing the acquired data changes to the target end according to the target end address and the target end partition information.
5. The method according to any one of claims 1-4, wherein the source end is a MYSQL database, and the target end is at least one of a MYSQL database, a KAFKA cluster, an MQ cluster, and an ES cluster;
wherein MYSQL is a relational database management system, KAFKA is an open source stream processing platform, MQ is a message queue, and ES is a search server.
6. The method according to any of claims 1-4, wherein the consumption program is in the form of one of a virtual machine, a physical machine, a container.
7. The method according to any one of claims 1-4, further comprising:
in the synchronization information table, configuring, in association with the source end information to be subjected to data synchronization, a storage period for which the data changes are stored in the intermediate device, and configuring, in association with the target end information to be subjected to data synchronization, a start time of the consumption program, wherein the start time falls within the storage period; and/or configuring, in association with the target end information to be subjected to data synchronization, a filter condition for the target end.
8. The method according to any one of claims 1-4, further comprising:
and sending an alarm in response to monitoring that the performance of the intermediate device is insufficient, the binlog analysis component cannot complete analysis operation, or the consumption program cannot complete synchronous operation.
9. An intermediate device for data synchronization, wherein the intermediate device connects a source end and a target end, and comprises:
the configuration module is used for associating and configuring source terminal information and target terminal information to be subjected to data synchronization through the synchronization information table;
the analysis module is connected with the configuration module and is used for configuring a binlog analysis component according to the source end information to be subjected to data synchronization in the synchronization information table, parsing the binlog of the source end based on the binlog analysis component to acquire the data changes of the source end, and storing the data changes in the intermediate device;
the synchronization module is connected with the analysis module and is used for starting a consumption program to synchronize the data changes stored in the intermediate device to the target end according to the target end information that is associated, in the synchronization information table, with the source end information to be subjected to data synchronization;
where binlog is a binary log file.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when being executed by a processor, implements the data synchronization method according to any of claims 1-8.
CN202311125525.3A 2023-09-01 2023-09-01 Data synchronization method, device and medium Pending CN117171132A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311125525.3A CN117171132A (en) 2023-09-01 2023-09-01 Data synchronization method, device and medium

Publications (1)

Publication Number Publication Date
CN117171132A true CN117171132A (en) 2023-12-05

Family

ID=88938932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311125525.3A Pending CN117171132A (en) 2023-09-01 2023-09-01 Data synchronization method, device and medium

Country Status (1)

Country Link
CN (1) CN117171132A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination