CN116662448A - Automatic data synchronization method and device, electronic equipment and storage medium - Google Patents

Automatic data synchronization method and device, electronic equipment and storage medium

Info

Publication number
CN116662448A
Authority
CN
China
Prior art keywords
message
data
format data
field
synchronization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310695544.3A
Other languages
Chinese (zh)
Inventor
杜均
蔡满天
凌海挺
张茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202310695544.3A priority Critical patent/CN116662448A/en
Publication of CN116662448A publication Critical patent/CN116662448A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the fields of financial technology and data processing, and discloses an automatic data synchronization method comprising the following steps: connecting a synchronization tool with a source database, and monitoring a preset message group in real time by using the synchronization tool; if the message group changes at the current time t, converting the table structure modification message into first format data and converting the field modification message into second format data; when the first format data is judged to be of the message type of the table structure modification message, updating the corresponding table structure of the target database by using the first format data; and when the first field number is larger or smaller than the second field number, writing the second format data at time t into the target database. Applied to the field of financial technology, the invention automatically recognizes changes to the table fields of the source database and automatically synchronizes them to the target database, thereby improving the real-time performance and consistency of data synchronization.

Description

Automatic data synchronization method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of financial technology, and in particular to an automatic data synchronization method and device, an electronic device, and a storage medium.
Background
In the field of financial technology, changes to the table field messages (Schema) of a financial institution's source database are synchronized to a target database by synchronization tools. However, existing synchronization tools such as Sqoop, DataX, and Kettle are implemented in a query-based mode and process requests through offline scheduling of query jobs. As a result, these tools cannot automatically recognize changes to the table field messages of the source database and synchronize them to the target database during data synchronization. If the table field messages of the source database change, the data synchronization task must be stopped, the table field changes must be applied manually in the target database, and the synchronization task must then be restarted.
For example, in the field of financial technology, a financial institution may modify, delete, or add fields in a financial balance sheet, a financial profit sheet, a financial cash flow sheet, and the like in the source database.
Because the existing synchronization tools cannot automatically identify changes to the table field messages of the source database and synchronize them to the target database, the source database and the target database fall out of sync, and the information available to the participants in the financial market becomes asymmetric, which leads to unbalanced or irrational transactions. Because such tools depend on manual operation and maintenance, the real-time performance and consistency of data synchronization tend to be poor.
Disclosure of Invention
In view of the foregoing, it is necessary to provide an automatic data synchronization method, which aims to solve the technical problem of poor real-time performance and consistency of data synchronization in the prior art.
The automatic data synchronization method provided by the invention comprises the following steps:
connecting a preset synchronization tool with a preset source database, and monitoring whether a preset message group of the source database changes or not by utilizing the synchronization tool in real time;
if the message group changes at the current time t, converting the table structure modification message in the message group into first format data and converting the field modification message in the message group into second format data;
judging the message type of the first format data, and when judging that the first format data is the message type of the table structure modification message, updating metadata of a corresponding table structure of a target database by using the first format data;
and acquiring the first field number of the second format data at time t, acquiring the second field number of the second format data at time t-1, comparing the two, and writing the second format data at time t into the target database when the first field number is larger or smaller than the second field number.
Optionally, the connecting the preset synchronization tool with a preset source database includes:
and establishing connection between the synchronization tool and a data stream interface of the source database through a connector.
Optionally, the monitoring, by using the synchronization tool, whether the preset message group of the source database changes in real time includes:
taking a table field type message of the source database as the message group;
automatically mapping a message group of the source database into a circular log file of the synchronization tool using the connector;
and monitoring in real time whether the table field type message of the message group changes.
Optionally, after monitoring whether the preset message group of the source database changes in real time by using the synchronization tool, the method includes:
writing the changed message group into a circular log file of the synchronization tool, wherein the circular log file comprises a table structure modification message and a field modification message.
Optionally, the converting the table structure modification message in the message group into the first format data includes:
and acquiring the database name, the table name, the field name and the SQL statement of the operation that are contained in the table structure modification message, and converting the database name, the table name and the field name into the first format data.
Optionally, the converting the field modification message in the message group into the second format data includes:
parsing the field modification message by using the message of the circular log file;
and converting the data result obtained after the parsing into the second format data.
Optionally, the method further comprises:
and if the first field number is equal to the second field number, skipping the operation on the second format data at time t.
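The write-or-skip rule for the second format data can be sketched as follows. This is a minimal illustration, assuming each piece of second format data is modelled as a Python dict whose keys are the field names; the function names `should_write` and `sync_step` are hypothetical and do not come from the patent.

```python
def should_write(first_field_count: int, second_field_count: int) -> bool:
    """Return True when the field count at time t differs from the count at t-1."""
    return first_field_count != second_field_count


def sync_step(record_t: dict, record_t_minus_1: dict, target: list) -> bool:
    """Append record_t to the target when its field count changed; otherwise skip it.

    Assumption: a record is a dict keyed by field name, and `target` is a plain
    list standing in for the target database.
    """
    if should_write(len(record_t), len(record_t_minus_1)):
        target.append(record_t)
        return True
    return False


# Usage: a table that grew from 10 to 11 fields is written, an unchanged one is skipped.
target_db: list = []
old_row = {f"field_{i}": i for i in range(10)}
new_row = {**old_row, "field_10": 10}
sync_step(new_row, old_row, target_db)   # written
sync_step(new_row, new_row, target_db)   # skipped
```

"Larger or smaller" collapses to a single inequality test, so the equal case is the only one that skips the write.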
In order to solve the above problems, the present invention also provides an automatic data synchronization device, the device comprising:
the monitoring module is used for connecting a preset synchronization tool with a preset source database, and monitoring in real time, by using the synchronization tool, whether a preset message group of the source database changes;
the conversion module is used for converting, if the message group changes at the current time t, the table structure modification message in the message group into first format data and the field modification message in the message group into second format data;
the judging module is used for judging the message type of the first format data, and updating metadata of the corresponding table structure of a target database by using the first format data when the first format data is judged to be of the message type of the table structure modification message;
and the calculation module is used for acquiring the first field number of the second format data at time t and the second field number of the second format data at time t-1, and writing the second format data at time t into the target database when the first field number is larger or smaller than the second field number.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
at least one processor; and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a data auto-synchronization program executable by the at least one processor, the data auto-synchronization program being executable by the at least one processor to enable the at least one processor to perform the data auto-synchronization method described above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having stored thereon a data automatic synchronization program executable by one or more processors to implement the above-mentioned data automatic synchronization method.
Compared with the prior art, the method monitors the change of the table field of the source database in real time through the synchronization tool, converts the change information of the table structure of the source database and the change information of the table field into the first format data and the second format data which are adaptive to the target database, writes the change information into the target database after judging or comparing the first format data and the second format data, ensures the correctness and consistency of the table structure synchronization of the source database and the target database of the financial institution, improves the safety of data transaction in the financial field, and avoids the occurrence of information asymmetry problem in the financial market.
Drawings
FIG. 1 is a flowchart of an automatic data synchronization method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data auto-synchronization device according to an embodiment of the invention;
FIG. 3 is a schematic structural diagram of an electronic device for implementing an automatic data synchronization method according to an embodiment of the present invention;
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the description of "first", "second", etc. in this disclosure is for descriptive purposes only and is not to be construed as indicating or implying a relative importance or implying an indication of the number of technical features being indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when the technical solutions are contradictory or cannot be realized, the combination should be considered absent and outside the scope of protection claimed in the present invention.
Along with the rapid development of the financial science and technology field, the invention provides a data automatic synchronization method which can be applied to the financial science and technology field, monitors the change of the table field of the source database in real time through a synchronization tool, converts the change information of the table structure of the source database and the change information of the table field into the first format data and the second format data which are matched with the target database, writes the change information into the target database after judging or comparing the first format data and the second format data, ensures the correctness and consistency of the table structure synchronization of the source database and the target database of a financial institution, improves the safety of data transaction in the financial field, and avoids the occurrence of information asymmetry problem in the financial market.
The method can solve the technical problem that the change of the table field message of the source database cannot be automatically identified and automatically synchronized to the target database in the synchronous process of the financial data.
Referring to fig. 1, a flow chart of a method for automatically synchronizing data according to an embodiment of the invention is shown. The method is performed by an electronic device.
In this embodiment, the data automatic synchronization method includes:
S1, connecting a preset synchronization tool with a preset source database, and monitoring whether a preset message group of the source database changes or not in real time by using the synchronization tool.
In this embodiment, the synchronization tool is a tool that can capture data changes in the source database and can realize integrated full and incremental real-time synchronization. The synchronization tool may be the Flink CDC synchronization tool, or any other synchronization tool having the above functions, which is not limited herein.
The Flink CDC synchronization tool integrates full and incremental real-time synchronization. It differs from tools such as Sqoop, DataX, and Kettle, which are implemented in a query-based mode and process requests through offline scheduling of query jobs.
The source database refers to a self-contained database that directly provides raw materials or specific data, so that users do not need to refer to other information sources. The source database may include a numeric database, a text-numeric database, a full-text database, a term database, an image database, an audio-visual database, etc., and the type of the source database may include MySQL, Oracle, PostgreSQL, etc. For example, in the field of financial technology, the source database may be the primary database of a medical institution, a financial institution, or an insurance institution.
The preset message group of the source database refers to the combination of messages produced when a user modifies the table structure and the table fields of the source database. The user's operations on the source database, or the resulting message group, are stored in a binary log (binlog) of the source database.
For example, after user A of a financial institution modifies the table structure and fields of the financial asset liability table in a financial source database, user A's operations on the source database, or the resulting message group, are stored in the binary log of the source database, and the Flink CDC synchronization tool can obtain from the binary log the message combination describing the modified table structure and fields of the financial asset liability table.
In this embodiment, the message group includes a table structure modification message (DDL message) and a field modification message (DML message).
The table structure modification message refers to a change message generated when a user operates on a table structure of the source database, for example by adding a field to a financial balance sheet, a financial profit sheet, or a financial cash flow sheet in the field of financial technology.
The field modification message refers to a change message generated when a user operates on the data in the fields of a table of the source database, for example by adding, deleting, or modifying the contents/data of the above-described financial balance sheet, financial profit sheet, financial cash flow sheet, etc.
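The distinction between the two message kinds can be illustrated with a small classifier. This is only a sketch, assuming each captured message carries the text of the SQL statement that caused it (row-based binary logs may not); the names `classify_statement`, `DDL_KEYWORDS` and `DML_KEYWORDS` are hypothetical.

```python
# Leading keywords that mark a table structure modification (DDL) message.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP", "TRUNCATE"}
# Leading keywords that mark a field modification (DML) message.
DML_KEYWORDS = {"INSERT", "UPDATE", "DELETE"}


def classify_statement(sql: str) -> str:
    """Classify a SQL statement as 'DDL', 'DML' or 'UNKNOWN' by its first keyword."""
    stripped = sql.strip()
    if not stripped:
        return "UNKNOWN"
    keyword = stripped.split(None, 1)[0].upper()
    if keyword in DDL_KEYWORDS:
        return "DDL"
    if keyword in DML_KEYWORDS:
        return "DML"
    return "UNKNOWN"


# Usage: adding a column is a table structure change, inserting a row is a data change.
kind_a = classify_statement("ALTER TABLE balance_sheet ADD COLUMN interest_payable DECIMAL(18, 2)")
kind_b = classify_statement("INSERT INTO balance_sheet (asset) VALUES (100)")
```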
According to the type of the source database and the database expression, the synchronization tool provides a corresponding connector to establish connection with the source database, reads the data of the source database through the connector, and monitors a predefined message group in the read data in real time by utilizing the synchronization tool.
For example, the table structure modification message and the field modification message in the predefined source database are used as the preset message group.
And monitoring the message group in real time by utilizing a synchronization tool to judge whether the table structure modification message and the field modification message of the source database are operated by a user or not.
Because whether the table field message (Schema) of the source database has changed is closely related to the table structure modification message and the field modification message in the source database, monitoring the table field message is one of the main technical means for achieving the correctness and consistency of data synchronization.
In one embodiment, the connecting the preset synchronization tool with the preset source database includes:
and establishing connection between the synchronization tool and a data stream interface of the source database through a connector.
The connector of the synchronization tool refers to an operator (Source) that reads data from the Source database, and is also the input to read and process the data. The connector may be a Flink Source connector.
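As an illustration of what configuring such a connector might look like, the sketch below builds a Flink SQL `CREATE TABLE` statement for the Flink CDC MySQL connector (`'connector' = 'mysql-cdc'`). The helper name `mysql_cdc_source_ddl`, the table, the columns and all connection values are hypothetical placeholders, and the patent does not state which options its connector uses; the option keys shown are those commonly documented for the MySQL CDC connector and should be checked against the connector version in use.

```python
def mysql_cdc_source_ddl(table: str, columns: dict, primary_key: str, hostname: str,
                         port: int, username: str, password: str, database: str) -> str:
    """Build a Flink SQL DDL string that declares a MySQL CDC source table.

    The function only assembles text; submitting the statement to Flink is out of scope.
    """
    column_lines = [f"  `{name}` {sql_type}" for name, sql_type in columns.items()]
    # Flink SQL declares primary keys as NOT ENFORCED.
    column_lines.append(f"  PRIMARY KEY (`{primary_key}`) NOT ENFORCED")
    options = {
        "connector": "mysql-cdc",
        "hostname": hostname,
        "port": str(port),
        "username": username,
        "password": password,
        "database-name": database,
        "table-name": table,
    }
    option_lines = [f"  '{key}' = '{value}'" for key, value in options.items()]
    return (
        f"CREATE TABLE `{table}` (\n"
        + ",\n".join(column_lines)
        + "\n) WITH (\n"
        + ",\n".join(option_lines)
        + "\n)"
    )


# Usage with placeholder values.
ddl = mysql_cdc_source_ddl(
    "balance_sheet", {"id": "INT", "asset": "DECIMAL(18, 2)"}, "id",
    "localhost", 3306, "user", "secret", "finance",
)
```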
For example, after the connector establishes a connection with the source database, it first reads the table snapshot blocks of the source database and then reads the binary log of the source database. In both the table snapshot stage and the binary log stage, the connector reads the data of the source database accurately during processing, and it can still do so even if a task fails.
The connector acquires the table field type messages of the source database from the table snapshot blocks and the binary log as the message group, and synchronizes the message group in real time into the circular log file of the synchronization tool.
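The reading order described above (snapshot first, then binary log) can be sketched as follows. This is an assumption-laden illustration: the circular log file is modelled as a bounded in-memory buffer that drops its oldest entry when full, which the patent does not specify, and the names `read_source` and `fill_circular_log` are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator


def read_source(snapshot_rows: Iterable[dict], binlog_events: Iterable[dict]) -> Iterator[dict]:
    """Yield all snapshot rows first, then all binary log events, tagging each with its phase."""
    for row in snapshot_rows:
        yield {"phase": "snapshot", "payload": row}
    for event in binlog_events:
        yield {"phase": "binlog", "payload": event}


def fill_circular_log(messages: Iterable[dict], capacity: int) -> deque:
    """Store messages in a bounded buffer; once full, the oldest message is overwritten."""
    log: deque = deque(maxlen=capacity)
    for message in messages:
        log.append(message)
    return log


# Usage: three messages go into a log that holds two, so the first snapshot row is dropped.
circular_log = fill_circular_log(
    read_source([{"id": 1}, {"id": 2}], [{"ddl": "ALTER TABLE t ADD COLUMN c INT"}]),
    capacity=2,
)
```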
In one embodiment, the monitoring, by the synchronization tool, whether the preset message group of the source database changes in real time includes:
taking a table field type message of the source database as the message group;
automatically mapping a message group of the source database into a circular log file of the synchronization tool using the connector;
and monitoring in real time whether the table field type message of the message group changes.
In one embodiment, after monitoring whether a preset message group of the source database changes in real time by using the synchronization tool, the method includes:
writing the changed message group into a circular log file of the synchronization tool, wherein the circular log file comprises a table structure modification message and a field modification message.
The table field type messages formed when a user modifies the table structure and the table fields of the source database are taken as the message group. A basic package of the connector (Flink connector base) is loaded into a local file on the server of the source database and, after loading, is connected with the CDC component of the source database.
An automatic mapping relation is established between the table field type messages of the source database and the circular log file of the synchronization tool, so that the synchronization tool monitors in real time whether the table field type message of the message group changes.
In other embodiments, if the message group of a source database such as MySQL or Oracle changes, the Flink CDC synchronization tool can capture the table structure modification message and parse it directly.
However, if the preset message group of a PostgreSQL source database changes, the Flink CDC synchronization tool must be configured manually to monitor changes to the table structure modification message.
In step S1, the field modification message and the table structure modification message of the source database are closely related to changes to the table field messages of the target database. Monitoring both the table structure modification message and the field modification message captures changes not only to the table structure of the source database but also to its table data.
The full and incremental integrated real-time synchronization between the source database and the target database can be realized through the Flink CDC synchronization tool, and the technical problem that the synchronization tool in the prior art needs to process requests in an offline operation mode is solved.
S2, if the message group changes at the current time t, converting the table structure modification message in the message group into first format data, and converting the field modification message in the message group into second format data.
In this embodiment, the time t refers to the current time, and is a time when the message state is monitored in real time. The synchronization tool monitors in real time that the message group has changed, that is, the user has updated the table structure modification message and/or the field modification message of the source database.
The table structure modification message is converted to first format data and the field modification message is converted to second format data based on the change in the message group.
By converting the table field message of the source database into the first format data and the second format data, the synchronization and the fast writing of the data to the target database are facilitated.
For example, when user A adds a field to the financial liability table K of the source database (originally 10 fields), and the synchronization tool detects in real time that the financial liability table K has changed from 10 fields to 11 fields, the table structure modification message of the financial liability table K is converted into the first format data and the field modification message of the financial liability table K is converted into the second format data, thereby converting the financial data table of the source database into the standard format of the Flink CDC synchronization tool.
In one embodiment, the converting the table structure modification message in the message group into the first format data includes:
and acquiring the database name, the table name, the field name and the SQL statement of the operation that are contained in the table structure modification message, and converting the database name, the table name and the field name into the first format data.
The first format data (Ddl Row Data) is a kind of internal table structure data. The first format data can be written directly into the target database through the basic interface of the target database by means of a Flink SQL statement.
The table structure modification message is parsed to obtain the database name, table name, field name, SQL statement of the operation, and other information contained in it. This information represents each row of data in the table structure and also represents the change description type of the circular log file, and it is converted into the first format data through a preset character string conversion format.
In other embodiments, the DDL operation type contained in the table structure modification message (for example, CREATE, ALTER, DROP, or TRUNCATE) may also be obtained and converted into the first format data.
Because row data (Row data) does not exist directly in the source database, the invention obtains the database name, table name, field name, SQL statement of the operation, and other information contained in the table structure modification message by parsing it according to the change description type of the circular log file. This information is converted into the first format data because the first format data can be automatically adapted to the corresponding table structure of the target database, so that the update is synchronized automatically.
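A conversion of this kind can be sketched as below. The sketch assumes the table structure modification message carries the raw SQL text and handles only a simple `ALTER TABLE ... ADD [COLUMN] ...` form; the `FirstFormatData` class, its field names and `parse_add_column` are hypothetical and are not the patent's actual first format.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstFormatData:
    """Hypothetical container for the values extracted from a DDL message."""
    database: str
    table: str
    field: str
    ddl_type: str
    sql: str


# Matches "ALTER TABLE [db.]table ADD [COLUMN] field"; backticks are removed before matching.
_ALTER_ADD = re.compile(
    r"^\s*ALTER\s+TABLE\s+(?:(\w+)\.)?(\w+)\s+ADD\s+(?:COLUMN\s+)?(\w+)",
    re.IGNORECASE,
)


def parse_add_column(sql: str, default_database: str) -> FirstFormatData:
    """Extract database name, table name and field name from a simple add-column statement."""
    match = _ALTER_ADD.match(sql.replace("`", ""))
    if match is None:
        raise ValueError(f"unsupported DDL statement: {sql!r}")
    database, table, field = match.groups()
    return FirstFormatData(database or default_database, table, field, "ALTER", sql)


# Usage with a hypothetical statement against a hypothetical "finance" database.
first_format = parse_add_column(
    "ALTER TABLE finance.balance_sheet ADD COLUMN interest_payable DECIMAL(18, 2)",
    default_database="finance",
)
```

A regular expression keeps the sketch short, but it would misread forms such as `ADD PRIMARY KEY`; a production parser would need a real SQL grammar.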
In one embodiment, the converting the field modification message in the message group into the second format data includes:
parsing the field modification message by using the message of the circular log file;
and converting the data result obtained after the parsing into the second format data.
The message of the synchronization tool refers to the message produced by Debezium, which Flink CDC encapsulates at its bottom layer, and the message covers a full phase (all records in the current table are queried) and an incremental phase (change data is consumed from the binlog).
The field modification message is parsed by means of the message of the circular log file to obtain a data result covering the full and incremental data of the table of the source database, and the data result is converted into the second format data. This is done because the second format data can be automatically adapted to the fields of the corresponding table of the target database, so that the field modification message is synchronized and updated automatically.
In step S2, the present invention recognizes that a change in the table fields of the source database is typically caused by a change in the table structure modification message and the field modification message. If the table structure modification message and the field modification message of the source database can be captured, they can be converted into data that is automatically adapted for writing into the target database.
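The parsing step can be sketched against a Debezium-style change event, whose envelope carries `before`, `after`, `source` and `op` entries (with `op` being `r` for a snapshot read, `c` for create, `u` for update and `d` for delete). The output layout below, including the `field_count` entry and the function name `to_second_format`, is a hypothetical stand-in, since the patent does not define the second format.

```python
# Debezium operation codes mapped to readable operation names.
_OPERATIONS = {"r": "read", "c": "insert", "u": "update", "d": "delete"}


def to_second_format(event: dict) -> dict:
    """Flatten a Debezium-style change event into a hypothetical second-format record."""
    op = event["op"]
    # Deletes only carry the old row image; all other operations carry the new one.
    row = event["before"] if op == "d" else event["after"]
    source = event.get("source", {})
    return {
        "database": source.get("db"),
        "table": source.get("table"),
        "phase": "full" if op == "r" else "incremental",
        "operation": _OPERATIONS[op],
        "fields": dict(row),
        "field_count": len(row),
    }


# Usage with a hand-written event; the values are illustrative only.
insert_event = {
    "before": None,
    "after": {"id": 1, "asset": 100, "liability": 40},
    "source": {"db": "finance", "table": "balance_sheet"},
    "op": "c",
}
second_format = to_second_format(insert_event)
```

Carrying `field_count` in the record makes the later comparison between time t and time t-1 a plain integer comparison.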
The method and the device can solve the technical problems that in the prior art, data after the table field of the source database is changed cannot be automatically adapted to the corresponding table structure of the target database, and manual operation and maintenance of a user are required, and realize better transition of synchronous data types between the source database and the target database.
S3, judging the message type of the first format data, and when judging that the first format data is the message type of the table structure modification message, updating metadata of a corresponding table structure of a target database by using the first format data.
In this embodiment, the judgment of the message type of the first format data is to solve two problems:
1. Confirming whether the table structure of the source database has been modified by a user. If the first format data is judged to be of the message type of the table structure modification message of the source database, this proves that the table structure of the source database has been modified by the user, and the first format data needs to be written into the target database for synchronization in the next step.
2. Confirming whether the message type of the first format data can be automatically adapted for writing into the corresponding table structure of the target database.
After these two confirmations, the first format data can be written into the target database, and the table structure in the target database that corresponds to the modified table structure of the source database is updated synchronously.
The metadata of the corresponding table structure of the target database is updated by using the first format data. For example, the first format data comprises the database name, table name, field names, and SQL statement of the operation from the table structure modification message of the financial asset liability table K. The metadata of the corresponding table structure of the target database can then be updated with the database name (XX database), the table name (financial asset liability table K), and the field names (asset, liability, stakeholder equity, interest payable, equity payable equity, and the like), so that the first format data is synchronized into the metadata of the corresponding table structure of the target database, and the safety of data transaction in the financial field is improved.
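Updating the target table structure can be illustrated with an in-memory SQLite database standing in for the target database. This substitution is an assumption made so that the sketch runs with the standard library only; the patent's target database and write interface are not specified, and the names `column_names` and `apply_add_column` are hypothetical.

```python
import sqlite3


def column_names(conn: sqlite3.Connection, table: str) -> list:
    """Return the column names of a table, read from SQLite's table metadata."""
    # The identifier is interpolated directly, so it must come from a trusted source.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def apply_add_column(conn: sqlite3.Connection, table: str, field: str, sql_type: str) -> bool:
    """Add a column to the target table unless it already exists; return True if added."""
    if field in column_names(conn, table):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {field} {sql_type}")
    conn.commit()
    return True


# Usage: the target table gains the field that was added in the source.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE balance_sheet (asset REAL, liability REAL)")
added = apply_add_column(target, "balance_sheet", "interest_payable", "REAL")
```

Checking the existing columns first makes the update idempotent, so replaying the same table structure modification message does not fail.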
In one embodiment, the determining the message type of the first format data includes:
analyzing the first format data by using a preset operator to obtain an analysis value;
and determining the message type of the first format data according to the mapping relation between the preset message type table and the analysis value.
The preset operator is a user-defined abstract operator, for example a Ddl Process Function. After analyzing the first format data, it executes the corresponding DDL statement in the target database, which ensures that the table structure of the target database stays consistent with that of the source database.
The preset operator handles different message types differently. For example, the operator does not process the second format data, which is released directly to the subsequent processing stage; the operator analyzes the table structure data of the first format data (the stream element carrying the table structure) to obtain the analysis value of the first format data.
The message type of the first format data is then determined from the mapping relation between the preset message type table and the analysis value.
The message type table is a table predefined according to the attributes of the different data in the source database. For example, the data of the source database is classified by two attributes, structured and unstructured, and the structured data can be further predefined by type (for example, WORD, XML, EXCEL and the like).
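The two sub-steps above (analysis by an operator, then lookup in a message type table) can be sketched as follows. This is only an illustrative sketch: the record layout, the `analyze` function, the contents of `MESSAGE_TYPE_TABLE` and the use of the leading SQL keyword as the analysis value are all assumptions made for the example and are not specified by the embodiment.

```python
from typing import Optional

# Hypothetical message type table: analysis value -> message type name.
MESSAGE_TYPE_TABLE = {
    "ALTER": "table_structure_modification",
    "CREATE": "table_structure_modification",
    "DROP": "table_structure_modification",
}


def analyze(record: dict) -> Optional[str]:
    """Return an analysis value for a record.

    Assumption: first format data carries an "sql" entry, and its leading
    keyword serves as the analysis value. Records without SQL are treated
    as second format data and yield None (released without processing).
    """
    tokens = (record.get("sql") or "").split()
    if not tokens:
        return None
    return tokens[0].upper()


def message_type(record: dict) -> str:
    """Map the analysis value to a message type via the lookup table."""
    value = analyze(record)
    if value is None:
        return "pass_through"
    return MESSAGE_TYPE_TABLE.get(value, "unknown")
```

A record such as `{"database": "XX", "table": "K", "sql": "ALTER TABLE K ADD COLUMN c INT"}` would be classified as a table structure modification under these assumptions, while a record without an `sql` entry would simply be passed through.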
In one embodiment, when the first format data is determined to be the message type of the table structure modification message, updating metadata of a table structure corresponding to the target database with the first format data includes:
according to the mapping relation, obtaining the message type of the first format data and the message type of the table structure corresponding to the target database;
and when the analysis value of the message type of the first format data is the same as the analysis value of the message type of the table structure, updating the metadata of the table structure by using the first format data.
According to information such as the database name and the table name contained in the first format data, the message type (such as database name and table name) of the table structure corresponding to the first format data is obtained from the target database, and the analysis values of the message type of the first format data and of the message type of the table structure are read.
A preset if statement in the code judges whether the two analysis values are the same. If they are, the metadata of the table structure in the target database is updated by using the database name, the table name, the field names, the executed SQL statement and other information contained in the first format data.
If the two analysis values differ, this indicates that the table structure modification message (DDL message) of the source database carries no update, and the metadata of the table structure in the target database does not need to be updated.
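The comparison described above can be illustrated with a short sketch. The in-memory `catalog` dictionary standing in for the target database metadata, the `analysis_value` key and the function name `update_table_metadata` are hypothetical and chosen only for this example.

```python
def update_table_metadata(first_format: dict, catalog: dict) -> bool:
    """Update target metadata only when the two analysis values match.

    `catalog` is a hypothetical stand-in for the target database metadata,
    keyed by (database name, table name). Returns True if an update was made.
    """
    key = (first_format["database"], first_format["table"])
    entry = catalog.get(key)
    if entry is None:
        return False  # no corresponding table structure in the target
    # The "preset if statement": compare the two analysis values.
    if first_format["analysis_value"] != entry["analysis_value"]:
        return False  # values differ, so the metadata is left untouched
    entry["fields"] = list(first_format["fields"])
    entry["last_sql"] = first_format["sql"]
    return True
```

Returning a boolean rather than raising keeps the non-matching case a normal outcome, which mirrors the embodiment's statement that no update is needed in that case.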
In step S3, table structure modifications made by the user to the source database can be captured in real time and automatically synchronized to the metadata of the corresponding table structure in the target database.
This addresses the prior-art problem that, when a table field of the source database changes, the data synchronization task has to be stopped, the table field changed manually in the target database, and the synchronization task restarted. It improves data synchronization efficiency and ensures the accuracy and consistency of table structure synchronization between the source database and the target database.
S4: acquiring a first field number of the second format data at time t and a second field number of the second format data at time t-1, comparing the two, and writing the second format data at time t into the target database when the first field number is greater than or less than the second field number.
In this embodiment, after the user modifies a field of any table in the source database, the Flink Source end of the source database generates a piece of second format data.
In most cases, multiple write tasks run simultaneously against the target database. In the prior art, when a new write message arrives for the target database, only one task can be notified and the other tasks cannot, so not every task at the Flink Sink end of the target database learns of the write message for the second format data.
By examining the last two pieces of second format data sent by the Flink Source end and comparing their field numbers, the prior-art problem that only one subtask is notified to update the metadata, while the other subtasks are not updated and the data is therefore incomplete, can be solved.
For example, the second format data of the financial asset liability table K has 10 fields at time t-1 (the previous time) and 11 fields at time t (the current time). Comparing the field numbers shows that the fields of the table in the source database changed at time t, so the second format data of the financial asset liability table K at time t is written into the target database.
The method can ensure that the second format data can be synchronized into the target database, improves the safety of data transaction in the financial field, and avoids the occurrence of information asymmetry problem in the financial market.
When the first field number is greater than the second field number, this indicates that the user has added a field to the table of the source database.
When the first field number is less than the second field number, this indicates that the user has deleted a field from the table of the source database.
In one embodiment, the method further comprises:
and if the number of the first fields is equal to the number of the second fields, skipping the operation of the second format data at the time t.
When the first field number is equal to the second field number, the field numbers of the last two pieces of second format data sent by the Flink Source end (the second format data at time t and at time t-1) are consistent, which indicates that the fields of the table of the source database have not been modified by the user. The operation on the second format data at time t is therefore skipped, and the second format data at time t does not need to be written into the target database.
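The field-number comparison of step S4 can be sketched as below. Representing each piece of second format data as a dictionary of field names to values, and the target database as a plain list, are assumptions made purely for illustration; the function names are hypothetical.

```python
def field_change(prev_record: dict, curr_record: dict) -> str:
    """Compare the field number at time t with that at time t-1."""
    first = len(curr_record)   # first field number (time t)
    second = len(prev_record)  # second field number (time t-1)
    if first > second:
        return "added"      # a field was added in the source table
    if first < second:
        return "deleted"    # a field was deleted from the source table
    return "unchanged"      # equal numbers: the record is skipped


def sync_if_changed(prev_record: dict, curr_record: dict, target: list) -> str:
    """Write the time-t record into `target` only when the field number changed."""
    change = field_change(prev_record, curr_record)
    if change != "unchanged":
        target.append(curr_record)
    return change
```

With a 10-field record at time t-1 and an 11-field record at time t, as in the financial asset liability table K example, `sync_if_changed` would report an added field and write the time-t record. Note that a pure count comparison, as described in the embodiment, would not detect a field being replaced by another field in the same step.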
In step S4, field modifications made by the user to the tables of the source database can be captured in real time and automatically synchronized to the metadata of the corresponding table structure in the target database.
As in step S3, this addresses the prior-art problem that, when a table field of the source database changes, the data synchronization task has to be stopped, the table field changed manually in the target database, and the synchronization task restarted. It improves data synchronization efficiency and ensures the accuracy and consistency of table structure synchronization between the source database and the target database.
In the steps S1-S4, the method monitors the source database in real time through the synchronization tool, converts the acquired change information of the table structure of the source database and the change information of the fields of the table into the first format data and the second format data which are adaptive to the target database, writes the change information into the target database after judging or comparing the first format data and the second format data, and ensures the correctness and consistency of the table structure synchronization of the source database and the target database.
Compared with the prior art, the invention also realizes the following technical effects:
1. Automatic handling of table field (Schema) changes: the method can automatically identify a change of a table field in the source database and automatically update the table field of the target database, thereby improving the accuracy and consistency of data synchronization.
2. Improved compatibility: the synchronization tool (Flink CDC), used as a stream processing engine, has better compatibility and performance and can better handle data synchronization tasks.
3. The invention has rich application scenes in the field of financial science and technology:
1) In a banking scenario, the method of the present invention may be used to build a real-time data warehouse to support business analysis and risk control.
Banks typically process large amounts of transaction data and customer information that need to be processed and analyzed in real-time in order to quickly make decisions and provide better service.
By using the method of the invention, the bank can automatically track the structural change of the source database and automatically reflect the structural change into the target database, thereby reducing errors and delays caused by inconsistent data structures of the source database and the target database and ensuring the accuracy and instantaneity of data synchronization.
In addition, the bank can process and analyze large-scale data in real time by utilizing the strong real-time computing capability of the Flink CDC, so that the bank can quickly know the service condition and control the risk.
2) In the insurance finance industry, the method of the invention can be used for constructing a real-time risk management system to support insurance product design and claim settlement.
Insurance companies often need to track and analyze and process large amounts of policy and claim data in real time. By using the method of the invention, the insurance company can automatically track the structural change of the source database and automatically reflect the structural change into the target database, thereby ensuring data consistency, reliability and real-time performance. Meanwhile, the method can improve the efficiency and reliability of data synchronization, reduce the requirement of manual intervention and reduce the operation risk and cost.
In addition, the insurance company can process and analyze large-scale data in real time by utilizing the strong real-time computing capability of the Flink CDC, so that the insurance company can quickly know the service condition and perform risk management and claim settlement.
In summary, with the method of the present invention, when the structure and format of the source database change, the changes can be automatically synchronized to the target database without manually stopping the synchronization task, manually updating the data structure and format of the target database, and then restarting the task.
The efficiency and the accuracy of data synchronization and migration can be greatly improved, and the workload and the error rate of operators can be reduced.
Fig. 2 is a schematic block diagram of an automatic data synchronization device according to an embodiment of the invention.
The data automatic synchronization device 100 of the present invention may be installed in an electronic apparatus. Depending on the functions implemented, the data auto-synchronization device 100 may include a listening module 110, a conversion module 120, a determination module 130, and a calculation module 140. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
a monitoring module 110, configured to connect a preset synchronization tool with a preset source database, and monitor, in real time, whether a preset message group of the source database changes by using the synchronization tool;
the conversion module 120 is configured to convert the table structure modification message in the message group into the first format data and convert the field modification message in the message group into the second format data if the message group changes at the current time t;
a judging module 130, configured to judge a message type of the first format data, and when judging that the first format data is a message type of the table structure modification message, update metadata of a table structure corresponding to a target database by using the first format data;
the calculation module 140 is configured to obtain the first field number of the second format data at the time t, and obtain the second field number of the second format data at the time t-1, and write the second format data at the time t into the target database when the first field number is greater than or less than the second field number.
In one embodiment, the connecting the preset synchronization tool with the preset source database includes:
And establishing connection between the synchronization tool and a data stream interface of the source database through a connector.
In one embodiment, the monitoring, by the synchronization tool, whether the preset message group of the source database changes in real time includes:
taking a table field type message of the source database as the message group;
automatically mapping a message group of the source database into a circular log file of the synchronization tool using the connector;
and monitoring in real time whether the table field type message of the message group changes.
In one embodiment, after monitoring whether a preset message group of the source database changes in real time by using the synchronization tool, the method includes:
writing the changed message group into the circular log file of the synchronization tool, wherein the circular log file comprises a table structure modification message and a field modification message.
In one embodiment, the converting the table structure modification message in the message group into the first format data includes:
and acquiring a database name, a table name, a field name and an SQL statement operated on the table structure modification message, which are contained in the table structure modification message, and converting the database name, the table name and the field name into the first format data.
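As a rough illustration of this conversion, the sketch below packs the four items named above into a single serialized record. The `FirstFormatData` class, the input dictionary keys and the choice of JSON as the "first format" are assumptions for the example only; the embodiment does not fix a concrete format here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FirstFormatData:
    """Hypothetical container for the items extracted from a DDL message."""
    database: str
    table: str
    fields: List[str]
    sql: str


def to_first_format(ddl_message: dict) -> str:
    """Extract database name, table name, field names and SQL, then serialize."""
    data = FirstFormatData(
        database=ddl_message["database"],
        table=ddl_message["table"],
        fields=list(ddl_message["fields"]),
        sql=ddl_message["sql"],
    )
    # ensure_ascii=False keeps non-ASCII table or field names readable.
    return json.dumps(asdict(data), ensure_ascii=False)
```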
In one embodiment, the converting the field modification message in the message group into the second format data includes:
analyzing the field modification message by using the messages of the circular log file;
and converting the data result obtained after the analysis into the second format data.
In one embodiment, the method further comprises:
and if the number of the first fields is equal to the number of the second fields, skipping the operation of the second format data at the time t.
Fig. 3 is a schematic structural diagram of an electronic device for implementing an automatic data synchronization method according to an embodiment of the present invention.
In the present embodiment, the electronic device 1 includes, but is not limited to, a memory 11, a processor 12, and a network interface 13, which are communicably connected to each other via a system bus, and the memory 11 stores therein a data automatic synchronization program 10, the data automatic synchronization program 10 being executable by the processor 12. Fig. 3 shows only the electronic device 1 with the components 11-13 and the data automatic synchronization program 10. It will be appreciated by those skilled in the art that the structure shown in Fig. 3 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
The memory 11 comprises an internal memory and at least one type of readable storage medium. The internal memory provides a buffer for the operation of the electronic device 1; the readable storage medium may be a storage medium such as flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1; in other embodiments, the readable storage medium may also be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like provided on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store an operating system and various application software installed in the electronic device 1, for example, the code of the data automatic synchronization program 10 in one embodiment of the present invention. Further, the memory 11 may be used to temporarily store various types of data that have been output or are to be output.
Processor 12 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the electronic device 1, such as performing control and processing related to data interaction or communication with other devices, etc. In this embodiment, the processor 12 is configured to execute the program code stored in the memory 11 or process data, such as running the data auto-synchronization program 10.
The network interface 13 may comprise a wireless network interface or a wired network interface, the network interface 13 being used for establishing a communication connection between the electronic device 1 and a terminal (not shown).
Optionally, the electronic device 1 may further comprise a user interface, which may comprise a Display (Display), an input unit such as a Keyboard (Keyboard), and optionally a standard wired interface and a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device 1 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited by this configuration.
The data auto-synchronization program 10 stored in the memory 11 of the electronic device 1 is a combination of instructions that, when executed in the processor 12, may implement:
connecting a preset synchronization tool with a preset source database, and monitoring whether a preset message group of the source database changes or not by utilizing the synchronization tool in real time;
if the message group changes at the current time t, converting the table structure modification message in the message group into first format data and converting the field modification message in the message group into second format data;
judging the message type of the first format data, and when judging that the first format data is the message type of the table structure modification message, updating metadata of a corresponding table structure of a target database by using the first format data;
and acquiring the first field number of the second format data at the time t, acquiring the second field number of the second format data at the time t-1, calculating, and writing the second format data at the time t into the target database when the first field number is larger or smaller than the second field number.
Specifically, the specific implementation method of the above-mentioned data auto-synchronization program 10 by the processor 12 may refer to the description of the related steps in the corresponding embodiment of fig. 1, which is not repeated herein.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable medium may be volatile or nonvolatile. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
The computer readable storage medium stores the data automatic synchronization program 10, where the data automatic synchronization program 10 may be executed by one or more processors, and the specific implementation of the computer readable storage medium is substantially the same as the above embodiments of the data automatic synchronization method, and will not be described herein.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. A method for automatically synchronizing data, the method comprising:
connecting a preset synchronization tool with a preset source database, and monitoring whether a preset message group of the source database changes or not by utilizing the synchronization tool in real time;
if the message group changes at the current time t, converting the table structure modification message in the message group into first format data and converting the field modification message in the message group into second format data;
judging the message type of the first format data, and when judging that the first format data is the message type of the table structure modification message, updating metadata of a corresponding table structure of a target database by using the first format data;
and acquiring the first field number of the second format data at the time t, acquiring the second field number of the second format data at the time t-1, calculating, and writing the second format data at the time t into the target database when the first field number is larger or smaller than the second field number.
2. The method for automatically synchronizing data according to claim 1, wherein said connecting a predetermined synchronization tool to a predetermined source database comprises:
and establishing connection between the synchronization tool and a data stream interface of the source database through a connector.
3. The method for automatically synchronizing data according to claim 1 or 2, wherein monitoring whether a preset message group of the source database is changed in real time by using the synchronization tool comprises:
taking a table field type message of the source database as the message group;
automatically mapping a message group of the source database into a circular log file of the synchronization tool using the connector;
and monitoring in real time whether the table field type message of the message group changes.
4. The method for automatically synchronizing data according to claim 3, wherein, after monitoring in real time whether a preset message group of the source database has changed by using the synchronization tool, the method further comprises:
writing the changed message group into the circular log file of the synchronization tool, wherein the circular log file comprises a table structure modification message and a field modification message.
5. The method for automatically synchronizing data according to claim 1, wherein said converting the table structure modification message in the message group into the first format data comprises:
and acquiring a database name, a table name, a field name and an SQL statement operated on the table structure modification message, which are contained in the table structure modification message, and converting the database name, the table name and the field name into the first format data.
6. The method for automatically synchronizing data according to claim 1, wherein said converting the field modification message in the message group into the second format data comprises:
analyzing the field modification message by using the messages of the circular log file;
and converting the data result obtained after the analysis into the second format data.
7. The method for automatically synchronizing data according to claim 1, further comprising:
and if the number of the first fields is equal to the number of the second fields, skipping the operation of the second format data at the time t.
8. An automatic data synchronization device, the device comprising:
the monitoring module is used for connecting a preset synchronizing tool with a preset source database and monitoring whether a preset message group of the source database changes or not in real time by using the synchronizing tool;
the conversion module is used for converting the table structure modification message in the message group into first format data and converting the field modification message in the message group into second format data if the message group changes at the current time t;
the judging module is used for judging the message type of the first format data, and when judging that the first format data is the message type of the table structure modification message, updating the metadata of the corresponding table structure of the target database by using the first format data;
the calculation module is used for obtaining the first field number of the second format data at the time t and obtaining the second field number of the second format data at the time t-1 to calculate, and when the first field number is larger or smaller than the second field number, the second format data at the time t is written into the target database.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a data auto-synchronization program executable by the at least one processor to enable the at least one processor to perform the data auto-synchronization method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a data auto-synchronization program executable by one or more processors to implement the data auto-synchronization method of any of claims 1 to 7.
CN202310695544.3A 2023-06-12 2023-06-12 Automatic data synchronization method and device, electronic equipment and storage medium Pending CN116662448A (en)

Publications (1)

Publication Number Publication Date
CN116662448A true CN116662448A (en) 2023-08-29

Family

ID=87718823

Country Status (1)

Country Link
CN (1) CN116662448A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination