CN112100275A - Data synchronization method, system and electronic equipment

Info

Publication number: CN112100275A
Application number: CN202010907131.3A
Authority: CN
Inventors: 林大; 王星宇; 师文庆
Original assignee: Shanghai Weiyi Intelligent Manufacturing Technology Co ltd; Changzhou Weiyizhi Technology Co Ltd
Current assignee: Shanghai Weiyi Intelligent Manufacturing Technology Co ltd; Changzhou Weiyizhi Technology Co Ltd
Priority date: 2020-09-02
Filing date: 2020-09-02
Publication date: 2020-12-18

Abstract

The invention discloses a data synchronization method, a data synchronization system and electronic equipment. The data synchronization method comprises the following steps: monitoring data changes in the Mysql database through a Canal server and determining whether the data has changed; if so, storing the changed complete data into a Canal connector; configuring a mapping relation between the Mysql database and ElasticSearch in a configuration file; performing an incremental synchronization operation on the changed complete data; and performing a full synchronization operation on the full Mysql data. The invention supports both incremental synchronization and full synchronization of service data, realizes incremental synchronization without the service being aware of it, and is completely decoupled from the service logic.

Description

Data synchronization method, system and electronic equipment

Technical Field

The present invention relates to the field of data synchronization technologies, and in particular, to a data synchronization method, system and electronic device.

Background

The full-text retrieval function of the distributed search middleware ElasticSearch greatly facilitates searching over large data volumes and conveniently meets specific search requirements. Once service data is synchronized from Mysql or another database into ElasticSearch and an inverted index is built, the full-text retrieval function can be provided.

In prior-art synchronization methods, either ElasticSearch is integrated into the service logic, or the data is fully synchronized into ElasticSearch during a period of low service pressure, such as the early morning. The former ensures the timeliness of data synchronization but is highly coupled to the service logic; the latter ensures data consistency, but data newly added before the full synchronization runs cannot be retrieved. The prior art therefore cannot provide both incremental synchronization and full synchronization of service data, so improvement is urgently needed.

Disclosure of Invention

In view of the above drawbacks of the prior art, an object of the present invention is to provide a data synchronization method, system and electronic device, which solve the problem that the prior art cannot provide both incremental synchronization and full synchronization of service data.

To achieve the above and other related objects, the present invention provides a data synchronization method, including:

monitoring data changes in the Mysql database through a Canal server, determining whether the data has changed, and if so, storing the changed complete data into a Canal connector;

configuring a mapping relation between a Mysql database and an ElasticSearch in a configuration file;

performing incremental synchronization operation on the changed complete data;

and carrying out full-scale synchronization operation on Mysql full-scale data.

In an embodiment of the present invention, the data synchronization method further includes:

deploying a Canal server and configuring, in a configuration file, the information of the Mysql database to be monitored.

In an embodiment of the present invention, the step of monitoring data changes in the Mysql database through the Canal server, determining whether the data has changed, and if so, storing the changed complete data into the Canal connector includes:

setting the format of the Mysql log file named binlog to row mode;

the log file records the changed complete data in the Mysql database and sends the changed complete data to a Canal server;

and receiving the log file through the Canal server, and storing the changed complete data into a Canal connector.

In an embodiment of the present invention, the step of performing an incremental synchronization operation on the changed complete data includes:

integrating a Canal client in a data synchronization system;

the Canal client acquires the changed complete data from the Canal connector at regular intervals;

analyzing and processing the changed complete data to obtain analyzed and processed data;

acquiring a database name and a table name of the Mysql database from the analyzed and processed data;

and synchronizing the analyzed and processed data into ElasticSearch according to the mapping relation between the Mysql database and ElasticSearch, so as to realize the incremental synchronization operation.

In an embodiment of the present invention, the step of performing full-scale synchronization operation on Mysql full-scale data includes:

establishing a full synchronization task of the ElasticSearch;

and adding the full synchronization task into a distributed task scheduling service system to realize full synchronization operation.

In an embodiment of the present invention, the mapping relationship includes a mapping relationship between a database name and a table name of the Mysql database and an index and a type of the ElasticSearch, and a mapping relationship between a primary key column of a data row record of the Mysql database and a unique code of the ElasticSearch.

In an embodiment of the present invention, the step of the Canal client acquiring the changed complete data from the Canal connector at regular intervals includes:

starting a timing task;

and periodically checking whether data exists in the Canal connector, and if so, acquiring, by the Canal client, the changed complete data from the Canal connector.

In an embodiment of the present invention, an event publishing and event listening mechanism is adopted to analyze and process the changed complete data.

In an embodiment of the present invention, the step of synchronizing the parsed and processed data into an ElasticSearch according to the mapping relationship between the Mysql database and the ElasticSearch to implement an incremental synchronization operation includes:

searching the index and the type of the ElasticSearch in the mapping relation between the Mysql database and the ElasticSearch;

converting the row record data of the Mysql database into a storage structure form, so as to construct a transmission structure form that ElasticSearch can store;

and storing the parsed and processed data into ElasticSearch in the storable transmission structure form, so as to realize the incremental synchronization operation from the data in the Mysql database to ElasticSearch.

In an embodiment of the present invention, the step of establishing a full-scale synchronization task of the ElasticSearch includes:

reading a database name and a table name of the Mysql database from the mapping relation between the Mysql database and the ElasticSearch;

defining a new index name for the data in each table of the Mysql database in turn, setting the analyzer, properties and shards in the mappings and settings of the index, and completing creation of the index in ElasticSearch;

synchronizing the data of the table to be synchronized to the new ElasticSearch index in batches;

and binding the ElasticSearch index that has completed data synchronization to the alias used for retrieval, and unbinding the alias from all other indexes, so that the establishment of the full synchronization task is completed.

The present invention also provides a data synchronization system, comprising:

the Canal server module is used for monitoring data changes in the Mysql database, determining whether the data has changed, and if so, storing the changed complete data into the Canal connector;

the configuration module is used for configuring the mapping relation between the Mysql database and the ElasticSearch in the configuration file;

the incremental synchronization operation module is used for performing the incremental synchronization operation on the changed complete data;

and the full synchronization operation module is used for performing the full synchronization operation on the full Mysql data.

The invention also provides an electronic device, which comprises a processor and a memory, wherein the memory stores program instructions, and the processor runs the program instructions to realize the data synchronization method.

As described above, the data synchronization method, system and electronic device of the present invention have the following advantages:

By integrating the Canal component, the data synchronization method supports both incremental synchronization and full synchronization of service data, realizes incremental synchronization without the service being aware of it, and is completely decoupled from the service logic. It is also schedulable: the full synchronization strategy can be adjusted at any time according to the actual service pressure, so the configuration is flexible.

By integrating the Canal component, the Canal server senses data changes in the target Mysql database in a timely manner and the Canal client responds quickly, which effectively shortens the time interval between a change in Mysql and its synchronization to ElasticSearch and improves the timeliness of data synchronization.

The data synchronization method adopts an event publishing and event listening mechanism to analyze and process changed data, and assigns different listeners to different types of changed data, thereby decoupling the processing methods, allowing parallel processing and further improving data synchronization efficiency.

When full data synchronization is carried out, the data synchronization method switches the retrieval index through the alias mechanism without the retrieval side being aware of it, and does not affect the normal retrieval service.

The data synchronization method carries out full data synchronization by establishing tasks and adding them to the distributed task scheduling service, thereby realizing flexible configuration and adjustment of the full synchronization tasks.

Drawings

Fig. 1 is a flowchart illustrating a data synchronization method according to an embodiment of the present application.

Fig. 2 is a flowchart illustrating a data synchronization method according to another embodiment of the present application.

Fig. 3 is a flowchart of an operation of step S1 of the data synchronization method in fig. 1 according to an embodiment of the present application.

Fig. 4 is a flowchart of an operation of step S3 of the data synchronization method in fig. 1 according to an embodiment of the present application.

Fig. 5 is a flowchart of an operation of step S4 of the data synchronization method in fig. 1 according to an embodiment of the present application.

Fig. 6 is a flowchart of an operation of step S32 of the data synchronization method in fig. 4 according to an embodiment of the present application.

Fig. 7 is a flowchart of an operation of step S35 of the data synchronization method in fig. 4 according to an embodiment of the present application.

Fig. 8 is a flowchart of an operation of step S41 of the data synchronization method in fig. 5 according to an embodiment of the present application.

Fig. 9 is a schematic block diagram of a structure of a data synchronization system according to an embodiment of the present application.

Fig. 10 is a schematic block diagram of a structure of an electronic device according to an embodiment of the present application.

Fig. 11 is a schematic block diagram of a structure of a computer-readable storage medium according to an embodiment of the present application.

Description of the element reference numerals

10 Canal server module

20 configuration module

30 incremental synchronization operation module

40 full synchronization operation module

50 processor

60 memory

70 computer instructions

701 computer readable storage medium

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.

It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the drawings only show the components related to the present invention rather than the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.

Referring to fig. 1 and fig. 2, fig. 1 is a flowchart illustrating a data synchronization method according to an embodiment of the present application, and fig. 2 is a flowchart illustrating a data synchronization method according to another embodiment of the present application. The invention provides a data synchronization method that synchronizes data in a Mysql database into ElasticSearch in a schedulable manner, and comprises the following steps. Step S0, deploying a Canal server (Canal Server) and configuring, in a configuration file, the information of the Mysql database to be monitored. Step S1, monitoring data changes in the Mysql database through the Canal server and determining whether the data has changed; if the data has changed, storing the changed complete data in a Canal connector (Canal Connector), and if the data has not changed, performing no processing. Step S2, configuring the mapping relation between the Mysql database and ElasticSearch in the configuration file. Step S3, performing an incremental synchronization operation on the changed complete data. Step S4, performing a full synchronization operation on the full Mysql data.

Referring to fig. 3, fig. 3 is a flowchart illustrating step S1 of the data synchronization method in fig. 1 according to an embodiment of the present disclosure. In step S1, the step of monitoring data changes in the Mysql database through the Canal server, determining whether the data has changed, and if so, storing the changed complete data in the Canal connector includes the following. Step S11, the format of the Mysql log file named binlog is set to row (row) mode. Step S12, the log file records the changed complete data in the Mysql database and sends the changed complete data to the Canal server. Step S13, the Canal server receives the log file and stores the changed complete data into the Canal connector.
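Steps S12 and S13 can be pictured with a small in-memory model. The following is only a sketch under stated assumptions: `RowChange`, `ConnectorStub` and `on_binlog_event` are hypothetical names introduced for illustration and are not part of the Canal API; the real Canal server parses the binlog stream itself.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Optional


@dataclass
class RowChange:
    """One row-level change, shaped the way a row-mode binlog describes it (hypothetical)."""
    database: str
    table: str
    event_type: str          # "INSERT", "UPDATE" or "DELETE"
    row: Dict[str, object]   # complete row after the change (last known row for DELETE)


class ConnectorStub:
    """In-memory stand-in for the Canal connector: a FIFO buffer of changes."""

    def __init__(self) -> None:
        self._queue: Deque[RowChange] = deque()

    def put(self, change: RowChange) -> None:
        self._queue.append(change)

    def get(self, batch_size: int) -> List[RowChange]:
        batch: List[RowChange] = []
        while self._queue and len(batch) < batch_size:
            batch.append(self._queue.popleft())
        return batch


def on_binlog_event(connector: ConnectorStub, database: str, table: str,
                    before: Optional[dict], after: Optional[dict]) -> bool:
    """Decide whether the row changed and, if so, buffer the complete row."""
    if before == after:
        return False  # nothing changed, so nothing is stored
    if before is None:
        event_type, row = "INSERT", after
    elif after is None:
        event_type, row = "DELETE", before
    else:
        event_type, row = "UPDATE", after
    connector.put(RowChange(database, table, event_type, dict(row)))
    return True
```

Row mode matters here because only a row-format log carries the complete column values of each changed row, which is what the buffer stores.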

Referring to fig. 4, fig. 4 is a flowchart illustrating step S3 of the data synchronization method in fig. 1 according to an embodiment of the present disclosure. The step of performing an incremental synchronization operation on the changed complete data in step S3 includes the following. Step S31, integrating a Canal client (Canal Client) in the data synchronization system. Step S32, the Canal client acquires the changed complete data from the Canal connector at regular intervals. Step S33, analyzing and processing the changed complete data to obtain analyzed and processed data. Step S34, obtaining the database name (database) and table name (table) of the Mysql database from the analyzed and processed data. Step S35, synchronizing the analyzed and processed data into ElasticSearch according to the mapping relation between the Mysql database and ElasticSearch, so as to realize the incremental synchronization operation. Specifically, in step S33, an event publishing and event listening mechanism is used to analyze and process the changed complete data: different events are published for the operations of adding, deleting and modifying data, and the analysis and processing of the data are completed by different listeners. The processing methods for different events are thereby decoupled and can run in parallel, which improves data synchronization efficiency.
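The publish and listen mechanism of step S33 might look roughly like the sketch below. It is an illustration only: `EventBus`, the listener functions and the `index_state` dictionary (a stand-in for documents held in ElasticSearch) are hypothetical names, and the listeners run sequentially here for simplicity.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Listener = Callable[[dict], None]


class EventBus:
    """Minimal publish/subscribe registry keyed by event type."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def subscribe(self, event_type: str, listener: Listener) -> None:
        self._listeners[event_type].append(listener)

    def publish(self, event_type: str, payload: dict) -> int:
        """Deliver payload to every listener of event_type; return how many ran."""
        listeners = self._listeners.get(event_type, [])
        for listener in listeners:
            listener(payload)
        return len(listeners)


# Stand-in for the documents held in the search index, keyed by id.
index_state: Dict[int, dict] = {}


def on_insert(row: dict) -> None:
    index_state[row["id"]] = dict(row)


def on_update(row: dict) -> None:
    # The changed data is the complete row, so an update replaces the document.
    index_state[row["id"]] = dict(row)


def on_delete(row: dict) -> None:
    index_state.pop(row["id"], None)


bus = EventBus()
bus.subscribe("INSERT", on_insert)
bus.subscribe("UPDATE", on_update)
bus.subscribe("DELETE", on_delete)
```

Because each listener only knows about its own event type, a new kind of change can be supported by registering another listener without touching the existing ones. Running listeners in parallel, as the description suggests, would require dispatching them to a worker pool instead of the plain loop used in `publish`.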

Referring to fig. 5, fig. 5 is a flowchart illustrating step S4 of the data synchronization method in fig. 1 according to an embodiment of the present disclosure. The step of performing a full synchronization operation on the full Mysql data in step S4 includes the following. Step S41, establishing a full synchronization task for ElasticSearch. Step S42, adding the full synchronization task to the distributed task scheduling service system to realize the full synchronization operation.

Referring to fig. 6, fig. 6 is a flowchart illustrating step S32 of the data synchronization method in fig. 4 according to an embodiment of the present disclosure. The step of acquiring, by the Canal client, the changed complete data from the Canal connector at regular intervals in step S32 includes the following. Step S321, starting a timing task. Step S322, periodically checking whether data exists in the Canal connector; if so, the Canal client acquires the changed complete data from the Canal connector, and if not, no processing is performed.
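A timed check of this kind can be sketched as follows. The functions `poll_once` and `start_polling` and the `fetch`/`handle` callables are hypothetical; `fetch` stands in for whatever call the Canal client uses to read from the connector, which the description does not specify.

```python
import threading
from typing import Callable, List


def poll_once(fetch: Callable[[int], List[dict]],
              handle: Callable[[List[dict]], None],
              batch_size: int = 100) -> int:
    """S322: check the connector; hand data over if present, otherwise do nothing."""
    batch = fetch(batch_size)
    if not batch:
        return 0
    handle(batch)
    return len(batch)


def start_polling(fetch: Callable[[int], List[dict]],
                  handle: Callable[[List[dict]], None],
                  interval_seconds: float,
                  stop: threading.Event,
                  batch_size: int = 100) -> threading.Thread:
    """S321: start a timed task that calls poll_once until stop is set."""

    def loop() -> None:
        # Event.wait returns False on timeout and True once stop has been set.
        while not stop.wait(interval_seconds):
            poll_once(fetch, handle, batch_size)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The polling interval bounds how stale the search index can be, so a shorter interval trades more frequent connector checks for lower synchronization delay.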

Referring to fig. 7, fig. 7 is a flowchart illustrating step S35 of the data synchronization method in fig. 4 according to an embodiment of the present disclosure. The step S35 of synchronizing the analyzed and processed data into ElasticSearch according to the mapping relation between the Mysql database and ElasticSearch to realize the incremental synchronization operation includes the following. Step S351, finding the index (index) and type (type) of ElasticSearch in the mapping relation between the Mysql database and ElasticSearch. Step S352, converting the row record data (rowData) of the Mysql database into a storage structure form, so as to construct a transmission structure form that ElasticSearch can store. Specifically, in step S352, the row record data is converted into Map<key, value> form. Step S353, storing the analyzed and processed data into ElasticSearch in the storable transmission structure form, so as to realize the incremental synchronization operation from the data in the Mysql database to ElasticSearch. Specifically, in step S353, the storable transmission structure form may be JSON; the data is stored in ElasticSearch, and the incremental synchronization from the data in the Mysql database to ElasticSearch is completed without the service layer being aware of it.
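As a rough illustration of steps S351 to S353, the sketch below turns a row into a key/value map and then into a JSON document addressed to the mapped index. The `MAPPING` contents, the function names and the shape of the returned dictionary are assumptions made for this example; the actual write to ElasticSearch is left out.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical mapping: (database, table) -> target index, type and primary-key column.
MAPPING = {
    ("shop", "orders"): {"index": "orders_idx", "type": "_doc", "id_column": "id"},
}


def columns_to_map(columns: List[Tuple[str, object]]) -> Dict[str, object]:
    """S352: turn row record data (column name/value pairs) into a key/value map."""
    return {name: value for name, value in columns}


def build_document(database: str, table: str,
                   columns: List[Tuple[str, object]]) -> Dict[str, str]:
    """S351 and S353: look up the target and serialise the row as JSON."""
    target = MAPPING[(database, table)]
    row = columns_to_map(columns)
    return {
        "index": target["index"],
        "type": target["type"],
        # The primary key value becomes the document id, so a later change to
        # the same row overwrites the same document.
        "id": str(row[target["id_column"]]),
        "source": json.dumps(row, sort_keys=True, ensure_ascii=False),
    }
```

For example, `build_document("shop", "orders", [("id", 42), ("status", "paid")])` yields a document with id `"42"` destined for the `orders_idx` index.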

Referring to fig. 8, fig. 8 is a flowchart illustrating step S41 of the data synchronization method in fig. 5 according to an embodiment of the present disclosure. The step of establishing the full synchronization task for ElasticSearch in step S41 includes the following. Step S411, reading the database name and the table name of the Mysql database from the mapping relation between the Mysql database and ElasticSearch. Step S412, defining a new index name (indexName) for the data in each table of the Mysql database in turn, setting the analyzer (analyzer), properties (properties) and shards (shard) in the mappings and settings of the index, and completing creation of the index in ElasticSearch. Step S413, synchronizing the data of the table to be synchronized to the new ElasticSearch index in batches. Step S414, binding the ElasticSearch index that has completed data synchronization to the alias (alias) used for retrieval, and unbinding the alias from all other indexes, so that the establishment of the full synchronization task is completed. The method switches the retrieval index without the retrieval side being aware of it and does not affect the normal retrieval service. Because the full synchronization task is added to the distributed task scheduling service, the full synchronization strategy can be adjusted at any time and configured flexibly according to the actual service pressure.
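The alias switch in step S414 corresponds to a single request to the ElasticSearch `_aliases` endpoint, whose `actions` list is applied atomically. The helper below only builds the request body; the function name and the index names in the usage note are hypothetical, and sending the request is left to whichever client the system uses.

```python
from typing import Dict, Iterable, List


def build_alias_switch(alias: str, new_index: str,
                       current_indices: Iterable[str]) -> Dict[str, List[dict]]:
    """Build the body of an `_aliases` request that moves alias to new_index."""
    # Unbind the alias from every index other than the freshly synchronized one.
    actions: List[dict] = [
        {"remove": {"index": old, "alias": alias}}
        for old in current_indices
        if old != new_index
    ]
    # Bind the alias to the new index in the same request.
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}
```

Calling `build_alias_switch("orders", "orders_20200902", ["orders_20200901"])` produces one `remove` action followed by one `add` action. Since both are applied together, queries against the alias never see a moment with no index behind it, which is what makes the switch invisible to the retrieval service.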

As shown in fig. 1 to 8, the data synchronization method may be applied to a schedulable system for synchronizing Mysql data to ElasticSearch, and specifically includes the following steps. Step one, deploying a Canal server, which is responsible for monitoring data changes of the Mysql database; once data in the database changes, the changed complete data is put into a Canal connector to be read by a Canal client. Step two, configuring the mapping relation from the Mysql database to ElasticSearch in the configuration file of the data synchronization system, integrating a Canal client in the system, starting a timing task, acquiring the data stored by the Canal server from the Canal connector, and analyzing and processing the data. Step three, acquiring the database name and the table name of the Mysql database from the analyzed and processed data, and synchronizing the data analyzed and processed in step two into ElasticSearch according to the mapping relation between the Mysql database and ElasticSearch, so as to realize incremental synchronization that the service layer is unaware of. Step four, establishing an ElasticSearch full synchronization task based on the alias mechanism, and adding the full synchronization task to a distributed task scheduling service system for management, so as to realize full synchronization of the data.

As shown in fig. 1 to 8, the specific method of step one is as follows. Firstly, the information of the Mysql database to be monitored is configured in the configuration file of the Canal server, including basic connection information such as the IP address of the database host, the port number, the user name and the password. Secondly, the database operation and maintenance personnel set the binlog format to row mode, so that the binlog records the complete data after the database data is changed and sends it to the Canal server. Finally, after receiving the binlog, the Canal server reads the changed data and puts it into the Canal connector.
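Before the Canal server is pointed at a database, the prerequisites listed above can be checked mechanically. The sketch below is an assumption-laden illustration: `MysqlSource` and `check_prerequisites` are hypothetical names, and `server_variables` is assumed to hold the values of the MySQL system variables `log_bin` and `binlog_format` as reported by `SHOW VARIABLES`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MysqlSource:
    """Basic connection information for the database to be monitored."""
    host: str
    port: int
    username: str
    password: str


def check_prerequisites(source: MysqlSource,
                        server_variables: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the source looks usable."""
    problems: List[str] = []
    if not source.host:
        problems.append("host is empty")
    if not 0 < source.port < 65536:
        problems.append(f"port {source.port} is out of range")
    if not source.username:
        problems.append("username is empty")
    if server_variables.get("log_bin", "").upper() != "ON":
        problems.append("binary logging is disabled (log_bin)")
    fmt = server_variables.get("binlog_format", "")
    if fmt.upper() != "ROW":
        problems.append(f"binlog_format must be ROW, found {fmt!r}")
    return problems
```

A source whose `binlog_format` is `MIXED` or `STATEMENT` would be rejected here, because those formats do not guarantee that the log carries the complete changed rows.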

As shown in fig. 1 to 8, the specific method of step two is as follows. Firstly, in the data synchronization system, the mapping relation from the Mysql database to ElasticSearch is configured in a configuration file; it comprises a mapping relation between the database name (database) and table name (table) of the database and the index (index) and type (type) of ElasticSearch, and a mapping relation between the primary key column (column) of a data row record and the unique code (id) of ElasticSearch. Secondly, a Canal client is integrated in the data synchronization system and a timing task is started to check whether data exists in the Canal connector; if data exists, it is read from the Canal connector. Finally, the read data is analyzed and processed using an event publishing (publish) and event listening (eventlistener) mechanism: different events are published for adding, deleting and modifying data, and the analysis and processing of the data are completed by different listeners, so that the processing methods of different events are decoupled, the data can be processed in parallel, and the data synchronization efficiency is improved.
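The description does not fix a file format for the mapping relation. As one possible shape, the sketch below assumes a JSON list in which each entry names the database, table, index, type and primary key column; `TableMapping`, `load_mappings` and `EXAMPLE_CONFIG` are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TableMapping:
    index: str       # ElasticSearch index the table is written to
    type: str        # ElasticSearch type
    id_column: str   # primary key column used as the unique code (id)


def load_mappings(config_text: str) -> Dict[Tuple[str, str], TableMapping]:
    """Parse the mapping configuration into a lookup keyed by (database, table)."""
    mappings: Dict[Tuple[str, str], TableMapping] = {}
    for entry in json.loads(config_text):
        key = (entry["database"], entry["table"])
        if key in mappings:
            raise ValueError(f"duplicate mapping for {key[0]}.{key[1]}")
        mappings[key] = TableMapping(entry["index"], entry["type"], entry["id_column"])
    return mappings


EXAMPLE_CONFIG = """
[
  {"database": "shop", "table": "orders", "index": "orders_idx", "type": "_doc", "id_column": "id"},
  {"database": "shop", "table": "users",  "index": "users_idx",  "type": "_doc", "id_column": "user_id"}
]
"""
```

Keying the lookup by database name and table name matches how the changed data is identified later: each change carries those two names, so finding its target index is a single dictionary access.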

As shown in fig. 1 to 8, the specific method of step three is as follows. Firstly, the corresponding database name, table name and row record data are acquired from the processed data, and the index and type of the corresponding ElasticSearch target are found in the mapping relation from the Mysql database to ElasticSearch. Secondly, the row record data is converted into Map<key, value> form, a JSON form that ElasticSearch can store is constructed, and the data is stored into ElasticSearch, completing the incremental synchronization from Mysql data to ElasticSearch without the service layer being aware of it.

As shown in fig. 1 to 8, the specific method of step four is as follows. Firstly, the database name and the table name of the data to be synchronized are read from the mapping relation from the Mysql database to ElasticSearch, a new index name is defined for the data in each table in turn, the analyzer, properties and shards are set in the mappings and settings, and the index is created in ElasticSearch. Secondly, the data of the table to be synchronized is synchronized from the Mysql database to the ElasticSearch index in batches. Thirdly, the index that has completed synchronization is bound to the alias used for retrieval and the alias is unbound from the other indexes, which completes the establishment of the full data synchronization task; the retrieval index is thereby switched without the retrieval side being aware of it, and the normal retrieval service is not affected. Finally, the full data synchronization task is added to the distributed task scheduling service.
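The batch synchronization in step four can be sketched with the ElasticSearch `_bulk` endpoint, whose body is newline-delimited JSON: one action line followed by one source line per document, ending with a newline. The helper names `batches` and `bulk_body`, and the index name used in the usage note, are assumptions for illustration; reading the rows from Mysql and sending the request are omitted.

```python
import json
from typing import Iterable, Iterator, List


def batches(rows: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Split the rows of a table into lists of at most size rows."""
    batch: List[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def bulk_body(index: str, id_column: str, rows: List[dict]) -> str:
    """Build the newline-delimited JSON body of a `_bulk` request for one batch."""
    lines: List[str] = []
    for row in rows:
        # Action line: index this document under the row's primary key.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(row[id_column])}}))
        # Source line: the row itself.
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines) + "\n"
```

With five rows and a batch size of two, `batches` yields batches of 2, 2 and 1 rows, and `bulk_body("orders_v2", "id", batch)` produces two lines per row. Writing in batches keeps each request bounded in size while the new index is being filled, before the alias is switched to it.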

Referring to fig. 9, fig. 9 is a schematic structural block diagram of a data synchronization system according to an embodiment of the present application. The invention also provides a data synchronization system, which synchronizes data in the Mysql database into ElasticSearch in a schedulable manner, and includes but is not limited to a Canal server module 10, a configuration module 20, an incremental synchronization operation module 30 and a full synchronization operation module 40. The Canal server module 10 is configured to monitor data changes in the Mysql database and determine whether the data has changed, and if so, store the changed complete data in the Canal connector. The configuration module 20 is configured to configure the mapping relation between the Mysql database and ElasticSearch in a configuration file. The incremental synchronization operation module 30 is configured to perform the incremental synchronization operation on the changed complete data. The full synchronization operation module 40 is configured to perform the full synchronization operation on the full Mysql data. Specifically, the Canal server module 10 includes a Canal server, the Canal server is connected to the Canal connector, and the Canal connector is connected to the Canal client.

Referring to fig. 10, fig. 10 is a schematic structural block diagram of an electronic device according to an embodiment of the present disclosure. The invention further provides an electronic device, which includes a processor 50 and a memory 60, where the memory 60 stores program instructions, and the processor 50 executes the program instructions to implement the data synchronization method. The processor 50 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The memory 60 may include a Random Access Memory (RAM), and may also include a non-volatile memory, such as at least one disk memory. The memory 60 may also be an internal memory of the Random Access Memory (RAM) type, and the processor 50 and the memory 60 may be integrated into one or more independent circuits or hardware, such as an Application Specific Integrated Circuit (ASIC). It should be noted that the computer program in the memory 60 may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to perform all or part of the steps of the method according to the embodiments of the present invention.

Referring to fig. 11, fig. 11 is a schematic block diagram illustrating a structure of a computer-readable storage medium according to an embodiment of the present disclosure. The present invention further provides a computer-readable storage medium 701, where the computer-readable storage medium 701 stores computer instructions 70, and the computer instructions 70 are configured to cause a computer to execute the data synchronization method described above. The computer-readable storage medium 701 may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system or a propagation medium. The computer-readable storage medium 701 may also include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a Random Access Memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Optical disks may include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-RW), and DVD.

In summary, by integrating the Canal component, the data synchronization method of the present invention supports both incremental synchronization and full synchronization of service data, realizes incremental synchronization without the service being aware of it, and is completely decoupled from the service logic. The method is also schedulable: the full synchronization strategy can be adjusted at any time according to the actual service pressure, so the configuration is flexible.

The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims

1. A method of data synchronization, comprising:

performing incremental synchronization operation on the changed complete data;

and carrying out full-scale synchronization operation on Mysql full-scale data.

2. A data synchronization method according to claim 1, wherein the data synchronization method further comprises:

3. The data synchronization method according to claim 1, wherein the step of monitoring data change in the Mysql database through the Canal server, and determining whether the data change, if yes, storing the changed complete data in the Canal connector comprises:

4. A method for data synchronization as claimed in claim 1, wherein the step of performing incremental synchronization operation on the changed complete data comprises:

integrating a Canal client in a data synchronization system;

5. The data synchronization method of claim 1, wherein the step of performing a full-scale synchronization operation on Mysql full-scale data comprises:

establishing a full synchronization task of the ElasticSearch;

6. A method of data synchronization according to claim 1, characterized by: the mapping relation comprises a mapping relation between the database name and the table name of the Mysql database and the index and the type of the ElasticSearch, and a mapping relation between the primary key column of the data row record of the Mysql database and the unique code of the ElasticSearch.

7. The data synchronization method according to claim 4, wherein the step of the Canal client acquiring the changed complete data from the Canal connector at regular intervals comprises:

starting a timing task;

8. A method of data synchronization according to claim 4, wherein: the changed complete data is analyzed and processed by adopting an event publishing and event listening mechanism.

9. A method of data synchronization according to claim 4, wherein: the step of synchronizing the analyzed and processed data into ElasticSearch according to the mapping relation between the Mysql database and ElasticSearch to realize the incremental synchronization operation comprises the following steps:

10. The data synchronization method according to claim 5, wherein the step of establishing the full synchronization task of the ElasticSearch comprises:

11. A data synchronization system, comprising:

12. An electronic device comprising a processor and a memory, the memory storing program instructions, characterized in that: the processor executes program instructions to implement the data synchronization method of any one of claims 1 to 10.