CN108052681B - Method and system for synchronizing structured data between relational databases - Google Patents


Info

Publication number
CN108052681B
Authority
CN
China
Prior art keywords
data
database
format
export
json
Prior art date
Legal status
Active
Application number
CN201810030156.2A
Other languages
Chinese (zh)
Other versions
CN108052681A (en)
Inventor
毛彬
罗威
谭玉珊
罗准辰
牛海波
张吉才
武帅
叶宇铭
田昌海
尹忠博
Current Assignee
Military Science Information Research Center of Military Academy of the Chinese PLA
Original Assignee
Military Science Information Research Center of Military Academy of the Chinese PLA
Priority date
Filing date
Publication date
Application filed by Military Science Information Research Center Of Military Academy Of Chinese Pla filed Critical Military Science Information Research Center Of Military Academy Of Chinese Pla
Priority to CN201810030156.2A priority Critical patent/CN108052681B/en
Publication of CN108052681A publication Critical patent/CN108052681A/en
Application granted granted Critical
Publication of CN108052681B publication Critical patent/CN108052681B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases


Abstract

The invention discloses a method for synchronizing structured data between relational databases, which is used for realizing data synchronization between a source database and a target database. The method comprises the following steps: step 1) according to actual service application requirements, extracting key-value pair information from the structured original data to be exported in the source database and from the source database's log information, rewriting it into intermediate data meeting the format requirements by combining the data operation type and the data entry mark format, and storing it in JSON format as intermediate export data; step 2) according to the data import strategy of the target database, mapping the JSON intermediate export data from basic data into target data through data cleaning and database format conversion operations, converting it into data to be imported that conforms to the corresponding database import format, and importing it into the target database; step 3) performing a reverse parsing operation on the data to be imported generated in step 2), in combination with the target database type, to generate backup recovery data for rolling back the data version.

Description

Method and system for synchronizing structured data between relational databases
Technical Field
The invention relates to the field of data synchronization processing, in particular to a method and a system for synchronizing structured data among relational databases.
Background
In the field of data processing, operations such as data extraction, mapping conversion, and synchronization are usually required. Raw data is normally stored in conventional relational databases such as MySQL, MariaDB, and SQL Server, while further data processing stores the data in retrieval-oriented databases such as Elasticsearch and Solr, which serve as dedicated data retrieval engines supporting large-scale retrieval requirements; the data therefore needs to be mapped and synchronized among these different databases. In addition, for enterprises, companies, data analysis organizations, and other providers of large-scale data applications, access control over data resources is usually paired with security isolation between internal and external networks, which imposes an offline, unidirectional transmission requirement on data synchronization.
Traditional data synchronization schemes can handle synchronization between databases of the same type well, but they sidestep the requirement of mapping and synchronizing data between different database types. Likewise, bidirectional communication within a single network can maintain consistency between synchronized copies of the data, but under the restriction of unidirectional communication it is difficult to resolve, in a timely and effective way, the data backup and recovery problems caused by data version conflicts.
Disclosure of Invention
The invention aims to provide a method and a system for synchronizing structured data among relational databases, addressing both the synchronization of structured data between different relational databases and offline data processing requirements such as synchronization over unidirectional communication. The method and system solve the problem of data synchronization between databases and support generalized synchronization, that is, both mapping synchronization with complete data consistency and synchronization with incomplete consistency after data cleaning operations are added. They are particularly suitable for completing data synchronization under offline conditions, add a data reverse parsing operation for data version rollback, and ensure the reliability and stability of the data.
In order to achieve the above object, the present invention provides a method for synchronizing structured data between relational databases, which is used for implementing data synchronization between a source database and a target database. It covers, but is not limited to, fully consistent synchronization of data between databases of the same type and of different types, as well as incompletely consistent synchronization combined with data cleaning operations. The method comprises the following steps:
step 1) according to actual service application requirements, extracting key-value pair information from the structured original data to be exported in the source database and from the source database's log information, rewriting it into intermediate data meeting the format requirements by combining the data operation type and the data entry mark format, and storing it in JSON format as intermediate export data;
step 2) according to the data import strategy of the target database, mapping the JSON intermediate export data from basic data into target data through data cleaning and database format conversion operations, converting it into data to be imported that conforms to the corresponding database import format, and importing it into the target database;
step 3) performing a reverse parsing operation on the data to be imported generated in step 2), in combination with the target database type, to generate backup recovery data for rolling back the data version.
As an improvement of the above method, the step 1) specifically includes:
step 1-1) dividing structured original data needing to be exported in a source database into: data table information, full field data and partial data;
step 1-2) exporting data table information: exporting structural information of a data table needing to be exported in a source database, extracting key value pair information, converting the key value pair information into a json format and storing the json format; the structure information of the data table includes: database name, table name, code, and field name, type, length of all fields;
step 1-3) exporting the full-field data: intercepting the selected entries in the source database according to query statements generated over a set interval and, for the 'delete' operation, specifically parsing the entries requiring data deletion from the source database log information; converting the structured original data of the selected entries into key-value pair form; generating a unique identifier according to the specified data entry mark format to serve as the data identification code of each data entry; marking the data operation type; and storing the result in JSON format to generate the JSON intermediate export data. If no data entry mark format is specified, the data identification code is set to null. The data operation type of entries selected from the source database defaults to 'add', while that of entries selected from the source database log information is 'delete'; the corresponding data operation code is set according to the corresponding database type;
step 1-4) exporting partial data: generating query statements according to the field list specified by the user and the set query interval, and matching data in the source database to obtain the selected entries; for the 'delete' operation, specifically parsing the entries requiring data deletion from the source database log information; converting the structured original data of the selected entries into key-value pair form; generating a unique identifier according to the specified data entry mark format to serve as the data identification code of each data entry; marking the data operation type; and storing the result in JSON format to generate the JSON intermediate export data. If no data entry mark format is specified, the data identification code is set to null. The data operation type of entries selected from the source database defaults to 'modify', while that of entries selected from the source database log information is 'delete'; different data operation codes are set according to the corresponding database type.
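The intermediate export entries described in steps 1-3) and 1-4) can be sketched in Python as follows. This is only an illustrative rendering: the field names (`_id`, `_op`, `fields`) and the MD5-based identifier are assumptions, not the patent's exact layout.

```python
import hashlib
import json

def make_export_entry(row, op_type, db_type, mark_format=None):
    """Wrap one source-database row as a JSON intermediate export entry."""
    # Operation codes differ per database type, as described in steps 1-3/1-4.
    op_codes = {
        "sql":           {"add": "insert", "modify": "update_set", "delete": "delete"},
        "elasticsearch": {"add": "index",  "modify": "update",     "delete": "delete"},
        "solr":          {"add": "add",    "modify": "add_field",  "delete": "delete"},
    }
    if mark_format:
        # Data identification code: unique id derived from the marked key fields.
        key = "|".join(str(row[f]) for f in mark_format)
        data_id = hashlib.md5(key.encode("utf-8")).hexdigest()
    else:
        data_id = None  # no mark format specified -> identifier set to null
    return {"_id": data_id, "_op": op_codes[db_type][op_type], "fields": row}

entry = make_export_entry({"id": 7, "title": "report"}, "add", "sql", mark_format=["id"])
print(json.dumps(entry))
```

Keeping the operation code database-specific at export time lets the import side replay the entry without re-deriving the operation from context.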
As an improvement of the above method, the step 2) specifically includes:
step 2-1) dividing the JSON intermediate export data into the following data types: data table creation and data import;
step 2-2) data table creation: restoring the structure information of the data table in the JSON intermediate export data into key-value pairs, filling in the format required by the target database for creating a new table according to the target database type, and creating the new table in the target database;
step 2-3) data import: generating a data processing strategy by combining the target database type with the data format adjustment and data cleaning strategies specified by the user;
step 2-4) performing data format adjustment and data cleaning according to the data processing strategy generated in step 2-3), in combination with the data operation code and the allocated data identification code, generating the final data import statement, and importing it into the target database.
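Steps 2-3) and 2-4) can be sketched as follows, assuming an SQL target database; the entry layout and the per-field cleaning-rule interface are illustrative assumptions rather than the patent's concrete format.

```python
def clean(fields, rules):
    """Apply user-specified cleaning functions field by field (step 2-3 strategy)."""
    return {k: rules.get(k, lambda v: v)(v) for k, v in fields.items()}

def to_sql(table, entry, rules):
    """Generate the final import statement for one entry (step 2-4)."""
    fields = clean(entry["fields"], rules)
    if entry["_op"] == "insert":
        cols = ", ".join(fields)
        vals = ", ".join(repr(v) for v in fields.values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals});"
    if entry["_op"] == "delete":
        # Delete by the allocated data identification code.
        return f"DELETE FROM {table} WHERE data_id = {entry['_id']!r};"
    raise ValueError(entry["_op"])

stmt = to_sql("docs", {"_id": "a1", "_op": "insert", "fields": {"title": " Report "}},
              {"title": str.strip})
print(stmt)  # INSERT INTO docs (title) VALUES ('Report');
```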
As an improvement of the above method, the step 3) specifically includes:
step 3-1) dividing the data to be imported generated in the step 2) into reverse analysis of data table information and reverse analysis of content data according to data types;
step 3-2) reverse analysis of data table information: reading the data to be imported for creating a new table of the target database generated in the step 2-2), decoding the data into corresponding table deletion statements, and generating backup recovery data for rolling back of a subsequent data version;
step 3-3) reverse analysis of content data: and reading the data to be imported generated in the step 2-4) and used for importing the target database, and performing corresponding reverse analysis operation by combining the type of the target database to generate backup recovery data used for rolling back the subsequent data version.
As an improvement of the above method, the reverse parsing operation includes: reading a data entry to be imported and reversely mapping its data operation code: the 'add' operation is replaced with a 'delete' operation, the 'modify' operation is replaced with a 'modify' operation that restores the corresponding original data, and the 'delete' operation is replaced with an 'add' operation for the corresponding data; backup recovery data for rolling back the subsequent data version is then generated by combining the entry content.
The invention also provides a system for synchronizing the structured data among the relational databases, which comprises: the data synchronization engine 10, the data processing module 20, the message scheduling module 30, the data backup repository 40 and the log management repository 50;
the data synchronization engine 10 is used for taking charge of interaction between a user and a system, including task customization, authority management of the user, uploading and downloading of data and expansion and connection of other external interfaces;
the data processing module 20 is configured to receive the data processing task commands sent by the message scheduling module 30, read the data required by a task from the log management library 50 according to the task command, export the data from the source database, and import the data into the target database; it also performs the reverse parsing operation on the data for rolling back the data version;
the message scheduling module 30 is configured to obtain a task request configured by a user and a configured normalized data synchronization task request from the data synchronization engine 10, and transmit a data processing task command to the data processing module 20;
the data backup repository 40 is used for storing the data format adjustment and data cleaning processing units uploaded by the user in the data synchronization engine 10, storing the export data packets generated by the data processing module 20, the uploaded import data packets, and the generated reverse recovery data packets, storing the log files generated during system operation, and storing the log information files of all deletion operations obtained by periodically extracting and analyzing the operation log files of each database;
the log management library 50 is configured to manage all information interaction logs generated during the system operation process, including task customization information, user usage records, uploaded data and code storage path registration, information entry of an external interface in the data synchronization engine 10, data import and export condition records in the data processing module 20, association information records between import and export data packets, reverse recovery data packets and a task database, and task execution information records of the message scheduling module 30.
As an improvement of the above system, the data synchronization engine 10 includes:
the task customization unit, used for configuring normalized timing tasks and one-time temporary tasks; the configuration includes the merging of multiple subtasks, data operation code assignment during export, data identification code format configuration, attribute configuration of the source data, data format adjustment during import, uploading of the data cleaning processing unit, attribute configuration of the target data, and the call configuration of external interfaces;
the user authority management unit, used for implementing the administrator's highest authority over data synchronization, the execution authority of maintenance personnel, and the temporary configuration and calling authority of data application personnel;
the interactive interface, used for uploading and downloading data, modifying and uploading the data cleaning processing unit, and interactions such as task customization and attribute configuration;
the external interface, used for expanding the application range of the data synchronization system and adding extensible interface management for intelligent system services, such as adding and changing data import and data export units, or docking with an optical-disc ferry system controlled by the data flow direction.
As an improvement of the above system, the data processing module 20 includes: the device comprises a data export operation unit A, a data import operation unit B and a reverse analysis recovery unit C;
the data export operation unit A is configured to export original data from the source database into JSON intermediate export data, read the source database log information about the 'delete' operation in the data backup repository 40, parse the entries requiring data deletion to generate export data flagged with the delete operation, and merge it into the JSON intermediate export data; the original data is divided, by data type, into export of the data table, export of full-field data, and export of partial data; according to the export strategy of the data export task, the table creation information of the original data to be exported is converted to JSON, and the original data of full-field and partial data is normalized in combination with the data operation type and the data entry mark format to generate the JSON intermediate export data;
the data import operation unit B is used for importing JSON intermediate export data into the target database; the JSON intermediate export data is divided, by data type, into creation of a data table and import of data to be imported; the data table information to be imported is formatted according to the new-table creation requirements of the target database, and the content data to be imported is mapped according to the data cleaning and import requirements of the target database, so as to generate the final import statement and import it into the target database;
and the reverse analysis recovery unit C is used for performing reverse analysis operation on the data to be imported to generate backup recovery data for rolling back the data version.
The invention has the advantages that:
the method and the system of the invention enable the implementation of data synchronization business of users between different relational databases to be more convenient, solve the data recovery problem of coping with data version conflicts under the limitation of one-way communication by using the data recovery strategy, effectively cope with various different data synchronization scenes, and lay the foundation for a more intelligent data synchronization system through interface expansion.
Drawings
FIG. 1 is a schematic diagram of the process of the present invention;
FIG. 2 is a schematic diagram of the system of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments.
As shown in fig. 1, data synchronization is mainly divided into two steps. The first is data export, which exports data from the source database: according to actual service application requirements, key-value pair information is extracted from the structured data to be exported in the source database, rewritten into intermediate data meeting the format requirements, and stored as JSON text for subsequent use or backup. The second is data import, which imports the data into the target database and generates an execution script for data recovery: according to the communication conditions and the data import strategy of the target database, the data in the JSON intermediate text is imported into the database, and at the same time the text data is reverse-parsed to generate backup recovery data that can be used for rolling back the data version.
(1) Data export
Data synchronization migrates all data in the source database to the target database in a consistent manner, maintaining the consistency of the basic data of the two databases to the maximum extent. The exported data is mainly divided into three types: data table structure export, full-field data export, and partial data export. Data export generally involves the three operations 'add', 'delete', and 'modify', and the operation information of these three operations needs to be preserved during export.
The table structure export part exports the creation code information of the data tables to be exported in the source database, extracts the key-value pair information, and converts it into JSON format for storage. The structure information of a data table mainly comprises: the database name, the table name, the character code, and the name, type, and length of all fields, for example the table creation information in SQL, the type mapping information in Elasticsearch, and the core schema in Solr.
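As an illustration, the exported table-structure information might be serialized as the following JSON; the exact key names are hypothetical, not the patent's specified format.

```python
import json

# Hypothetical JSON rendering of exported table-structure information:
# database name, table name, character code, and per-field name/type/length.
table_info = {
    "database": "source_db",
    "table": "articles",
    "code": "utf8mb4",
    "fields": [
        {"name": "id",    "type": "int",     "length": 11},
        {"name": "title", "type": "varchar", "length": 255},
    ],
}
text = json.dumps(table_info)
restored = json.loads(text)  # restored to key-value pairs on the import side
print(restored["table"])
```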
Since data deletion operations on the database do not pass through the data synchronization system, but deletion operations have obvious characteristics and are easy to distinguish and extract in the database log, the deletion log can be collected by directed analysis of the database log. From it, an execution script is generated containing the data deletion operation information of a specific table in the source database, and at the same time a distinct data operation code marking the deletion operation is set according to the corresponding database type. For a specific export process, a dedicated deletion-log detection program can be set up on the database server to identify this information and merge it into the generated export text.
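A deletion-log detection program of this kind can be sketched as follows; the log line format and the regular expression are assumptions for a MySQL-style statement log, not a prescribed format.

```python
import re

# Match DELETE statements of the form "DELETE FROM <table> WHERE <cond>;".
DELETE_RE = re.compile(r"DELETE\s+FROM\s+(\w+)\s+WHERE\s+(.*?);", re.IGNORECASE)

def extract_deletes(log_lines, table):
    """Collect 'delete'-flagged operation records for one table from log lines."""
    ops = []
    for line in log_lines:
        m = DELETE_RE.search(line)
        if m and m.group(1) == table:
            ops.append({"_op": "delete", "where": m.group(2)})
    return ops

log = ["2018-01-12 INSERT INTO articles ...",
       "2018-01-12 DELETE FROM articles WHERE id = 42;"]
print(extract_deletes(log, "articles"))  # [{'_op': 'delete', 'where': 'id = 42'}]
```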
The full-field data export part intercepts the portion of the source database to be exported according to query statements generated over a set interval, converts the structured data of the selected entries into key-value pairs, and saves them in JSON format for export; meanwhile, a data identification code is allocated to each data entry according to the specified data entry mark format, and is set to null if no format is specified. The default operation information is 'add', and the corresponding data operation code is set according to the database type for distinction, such as insert in SQL, index in Elasticsearch, and add in Solr; if specially needed, for example when the export updates an existing data table, the operation can be set to 'modify', with the data operation code set as above.
The partial data export part generates the corresponding query statements according to the field list to be exported and the set interval, intercepts the portion of the source database to be exported, converts the structured data of the selected entries into key-value pairs, and saves them in JSON format for export; meanwhile, a data identification code is allocated to each data entry according to the specified data entry mark format, and is set to null if no format is specified. The default operation information is 'modify', and different data operation codes are set according to the corresponding database type for distinction, such as update_set in SQL, update in Elasticsearch, and add_field in Solr; if a new data table is created from the export data, the operation can be set to 'add', with the data operation code set as above.
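The interval-bounded query statements mentioned above can be sketched as follows, assuming an SQL source database and a numeric increment column named `id` (both assumptions for illustration).

```python
def build_export_query(table, fields, interval, id_col="id"):
    """Build the SELECT used to intercept the portion of the source DB to export."""
    cols = ", ".join(fields) if fields else "*"   # empty field list -> full fields
    lo, hi = interval                              # half-open [lo, hi) interval
    return (f"SELECT {cols} FROM {table} "
            f"WHERE {id_col} >= {lo} AND {id_col} < {hi};")

q = build_export_query("articles", ["id", "title"], (0, 1000))
print(q)  # SELECT id, title FROM articles WHERE id >= 0 AND id < 1000;
```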
(2) Data import
Data import mainly consists of parsing the data obtained in the data export link and writing it into the target database. According to the type of the data to be imported, it is divided into creating data tables in the target database based on the table structure data, and synchronizing the target database based on the table data.
The import part that creates a new table based on the table structure data parses the structure information in the JSON import data into key-value pairs, fills in the format required by the target database for creating a new table according to the target database type, and calls the corresponding table creation module to create the new table; at the same time, it generates backup recovery data for deleting the data table, to be used for rolling back the subsequent data version.
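This new-table path can be sketched in Python as follows, assuming an SQL-style target database; the structure-info keys are illustrative assumptions.

```python
def create_and_rollback(info):
    """Fill the CREATE statement for the target DB and, at the same time,
    emit the DROP statement used as backup recovery data for rollback."""
    cols = ", ".join(f"{f['name']} {f['type']}({f['length']})"
                     for f in info["fields"])
    create = f"CREATE TABLE {info['table']} ({cols});"
    drop = f"DROP TABLE {info['table']};"   # reverse-parsed rollback statement
    return create, drop

create, drop = create_and_rollback(
    {"table": "articles",
     "fields": [{"name": "id", "type": "int", "length": 11}]})
print(drop)  # DROP TABLE articles;
```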
The import part that synchronizes the target database based on the table data takes the imported JSON data and 1) calls the corresponding data import unit according to the database type of the target database, 2) processes the data entries to be imported with the specified data processing units, such as data format adjustment and data cleaning, and 3) performs data mapping synchronization according to the data operation codes and the assigned data identification codes. At the same time, corresponding data recovery data is generated by reverse parsing according to the data operation codes and data identification codes: a 'delete' operation is applied for each 'add' operation, a 'modify' operation based on the corresponding original data for each 'modify' operation, and an 'add' operation of the corresponding data for each 'delete' operation, thereby generating an execution script for data recovery.
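The reverse parsing just described can be sketched as follows; the entry layout, including an `original` snapshot carried by 'modify' entries, is an illustrative assumption.

```python
def reverse_entry(entry):
    """Map each import entry's operation onto its inverse, so that replaying
    the output rolls the data version back."""
    op = entry["_op"]
    if op == "add":                       # add    -> delete the same entry
        return {"_op": "delete", "_id": entry["_id"]}
    if op == "modify":                    # modify -> modify back to original
        return {"_op": "modify", "_id": entry["_id"],
                "fields": entry["original"]}
    if op == "delete":                    # delete -> re-add the deleted data
        return {"_op": "add", "_id": entry["_id"], "fields": entry["fields"]}
    raise ValueError(op)

rb = reverse_entry({"_op": "add", "_id": "a1", "fields": {"title": "x"}})
print(rb)  # {'_op': 'delete', '_id': 'a1'}
```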
As shown in fig. 2, the data synchronization system mainly includes a data synchronization engine 10, a data processing module 20, a message scheduling module 30, a data backup repository 40, and a log management repository 50.
(1) Data synchronization engine 10
The data synchronization engine 10 is mainly responsible for all interaction between users and data synchronization, including task customization, user authority management, uploading and downloading of data, and the expansion of other external interfaces, such as the optical-disc ferry applications involved in data synchronization over unidirectional communication.
Task customization is divided into two major categories: the method comprises the steps of configuring a normalized timing task, configuring a one-time temporary task, wherein the configuration comprises the combination of a plurality of subtasks, data operation code assignment in the export process, data identification code format configuration, attribute configuration of source data, data format adjustment in the import process, uploading of a data cleaning processing unit, attribute configuration of target data and the like, and calling configuration with an external interface;
the authority management of the user relates to the highest authority of data synchronization of an administrator, the execution authority of maintenance personnel, the temporary configuration calling authority of data application personnel and the like;
the uploading and downloading of data are the most basic system interaction interfaces; modifying and uploading the data cleaning processing unit; interaction such as task customization and attribute configuration;
the connection of other external interfaces is expandable interface management for expanding the application range of the data synchronization system and increasing the intelligent service of the system, such as the increase and the modification of a data import unit and a data export unit, the butt joint of an optical disk ferrying system for controlling the data flow direction and the like.
(2) Data processing module 20
The data processing module 20 includes: the device comprises a data export operation unit A, a data import operation unit B and a reverse analysis recovery unit C;
the data export operation unit a is configured to export original data from a source database into json intermediate export data, read periodically extracted source database log information about a deletion operation in the data backup repository 40, analyze export data of an entry requiring data deletion to generate a flag bit deletion operation, and merge the export data into the json intermediate export data; the method comprises the steps of dividing original data into a data table information derivation, a full field data derivation and a partial data derivation according to data types. Performing json datamation of table creation information on original data to be exported according to an export strategy of a data export task, and combining the original data of full-field data and partial data with data operation types and data standardization of a data entry mark format to generate json intermediate export data;
the data import operation unit B is used for importing json intermediate export data into a target database; dividing json intermediate import data into creation of a data table and import of data to be imported according to data types. Dividing json intermediate export data into: standard table information to be imported is formatted according to data required by new table creation of a target database, standard content data to be imported is subjected to data cleaning and data mapping required by import of the target database, and therefore a final import statement is generated and imported into the target database;
and the reverse analysis recovery unit C is configured to perform reverse analysis on the data to be imported, generating backup recovery data that is subsequently used for rolling back the data version. Specifically, corresponding reverse operation-type mapping is performed on the data to be imported according to the different database types, and backup recovery entry data is generated for the corresponding data items, so that data export and import can be conducted in an off-line synchronization mode over a unidirectional or bidirectional communication network.
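As an illustrative, non-limiting sketch of the data export operation unit A, the following Python fragment shows how selected entries and log-derived deletions might be normalized into json intermediate export data. The field layout of the intermediate format and the helper `make_entry_id` are assumptions for illustration only, not part of the patent.

```python
import hashlib
import json

def make_entry_id(row, mark_format=None):
    """Build a data identification code from a data entry mark format.

    The mark format is assumed here to be a comma-separated field list,
    e.g. "db,table,id"; if unspecified, the identifier is left null,
    as the description requires.
    """
    if not mark_format:
        return None
    key = "|".join(str(row.get(f, "")) for f in mark_format.split(","))
    return hashlib.md5(key.encode("utf-8")).hexdigest()

def export_rows(rows, deleted_rows, mark_format=None):
    """Merge selected entries (operation type 'add') and log-derived
    deletions (operation type 'delete') into one json intermediate
    export payload."""
    entries = []
    for row in rows:
        entries.append({"op": "add", "id": make_entry_id(row, mark_format), "data": row})
    for row in deleted_rows:
        entries.append({"op": "delete", "id": make_entry_id(row, mark_format), "data": row})
    return json.dumps(entries, ensure_ascii=False)
```

A usage example: `export_rows([{"id": 7, "name": "x"}], [], "id")` yields a one-entry json array whose entry carries `"op": "add"` and an md5-based identification code.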
(3) Message scheduling module 30
The message scheduling module 30 is responsible for collecting newly configured task requests from users together with the configured normalized data synchronization task requests acquired in the data synchronization engine 10, and for transmitting data processing task commands to the data processing module 20. The message scheduling module effectively avoids message congestion, thereby relieving system pressure and reducing errors or omissions in the execution of data processing tasks.
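A minimal, non-limiting sketch of this scheduling behavior, assuming a simple FIFO buffer between the engine and the data processing module (the task fields shown are illustrative assumptions):

```python
import queue

# Task requests are buffered instead of being pushed directly to the
# data processing module, which avoids message congestion.
task_queue = queue.Queue()
task_queue.put({"task": "sync_users", "type": "normalized"})   # scheduled task
task_queue.put({"task": "sync_orders", "type": "temporary"})   # ad-hoc task

# The scheduler dispatches commands in arrival order.
cmd = task_queue.get()
```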
(4) Data backup repository 40
The data backup repository 40 mainly stores: the data format adjustment and data cleaning processing units uploaded by users in the data synchronization engine 10; the export data packets generated in data processing tasks; the import data packets uploaded in data processing tasks; the reverse recovery data packets generated in data processing tasks; the log files generated during system operation; and the log information files of all deletion operations obtained by periodically extracting and analyzing the operation log file of each database.
(5) Log management library 50
The log management library 50 is responsible for recording all information interaction logs generated during system operation, including: task customization information; user usage records; registration of storage paths for uploaded data and code; entry of external interface information in the data synchronization engine 10; records of data import and export conditions in the data processing module 20; records of the association between import and export data packets, reverse recovery data packets, and the task database; and records of task execution information of the message scheduling module 30.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (6)

1. A method for synchronizing structured data between relational databases, used for realizing data synchronization between a source database and a target database, including but not limited to fully consistent synchronization of data between databases of the same type or of different types, and non-fully-consistent synchronization of data combined with data cleaning operations; the method comprises the following steps:
step 1) extracting key value pair information from structured original data needing to be exported in log information of a source database and a source database according to actual service application requirements, rewriting the key value pair information into intermediate data meeting format requirements by combining a data operation type and a data entry mark format, and storing the intermediate data in a json format as intermediate export data;
step 2) mapping json intermediate export data from basic data into target data according to a data import strategy of a target database by combining data cleaning operation and database format conversion operation, converting the json intermediate export data into data to be imported according with a corresponding database import format, and then importing the data into the target database;
step 3) performing reverse analysis operation on the data to be imported generated in the step 2) in combination with the type of the target database to generate backup recovery data for rolling back the data version;
the step 1) specifically comprises the following steps:
step 1-1) dividing structured original data needing to be exported in a source database into: data table information, full field data and partial data;
step 1-2) exporting data table information: exporting structural information of a data table needing to be exported in a source database, extracting key value pair information, converting the key value pair information into a json format and storing the json format; the structure information of the data table includes: database name, table name, code, and field name, type, length of all fields;
step 1-3) exporting the full field data: intercepting selected entries in the source database according to query statements generated over a set interval, and, for the "delete" operation, specifically analyzing the entries requiring data deletion in the source database log information; converting the structured original data of the selected entries into key-value pair form, generating a unique identifier according to the specified data entry mark format to serve as the data identification code of each data entry, marking the data operation type, and storing in json format to generate json intermediate export data; if no data entry mark format is specified, setting the data entry mark format to null; the data operation type of data entries selected from the source database defaults to "add", the data operation type of data entries selected from the source database log information defaults to "delete", and the corresponding data operation code is set according to the corresponding database type;
step 1-4) exporting partial data: generating query statements according to a field list specified by the user and a set query interval, matching data in the source database to obtain the selected entries, and, for the "delete" operation, specifically analyzing the entries requiring data deletion in the source database log information; converting the structured original data of the selected entries into key-value pair form, generating a unique identifier according to the specified data entry mark format to serve as the data identification code of each data entry, marking the data operation type, and storing in json format to generate json intermediate export data; if no data entry mark format is specified, setting the data entry mark format to null; the data operation type of data entries selected from the source database defaults to "modify", the data operation type of data entries selected from the source database log information defaults to "delete", and different data operation codes are set according to the corresponding database types.
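The table-information export of step 1-2) can be sketched as follows; this is an illustrative, non-limiting fragment in which the exact json field names are assumptions:

```python
import json

def export_table_info(db_name, table_name, encoding, fields):
    """Serialize data-table structure information (step 1-2) as key-value
    pairs in json. `fields` is a list of (name, type, length) tuples,
    covering the name, type, and length of every field."""
    info = {
        "database": db_name,
        "table": table_name,
        "encoding": encoding,
        "fields": [{"name": n, "type": t, "length": l} for (n, t, l) in fields],
    }
    return json.dumps(info, ensure_ascii=False)
```

For example, `export_table_info("mydb", "users", "utf8", [("name", "string", 64)])` produces a json document carrying the database name, table name, encoding, and the field list, ready to drive new-table creation on the target side.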
2. The method for synchronizing structured data between relational databases according to claim 1, wherein the step 2) specifically comprises:
step 2-1), dividing json intermediate derived data into the following data types: creating a data table and importing data;
step 2-2) data table creation: reducing the structure information of the data table in the json intermediate export data into key value pairs, filling a format required by creating a new table for the target database according to the type of the target database, and creating the new table for the target database;
step 2-3) data import: generating a data processing strategy by combining the type of the target database and a data format adjustment and data cleaning strategy specified by a user;
and 2-4) performing data format adjustment and data cleaning according to the data processing strategy generated in the step 2-3) by combining the data operation code and the allocated data identification code, generating a final data import statement, and importing the final data import statement into a target database.
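Steps 2-2) and 2-4) can be sketched as follows. This is an illustrative, non-limiting fragment: the type mapping for the target database dialect and the intermediate-data layout (`op`/`id`/`data` fields) are assumptions, and a real import would use parameterized statements plus the user-specified cleaning strategy rather than string interpolation:

```python
import json

# Hypothetical per-dialect type mapping used when filling the format
# required for new-table creation (step 2-2).
TYPE_MAP = {"mysql": {"string": "VARCHAR", "int": "INT"}}

def build_create_table(table_info_json, db_type="mysql"):
    """Restore structure information to key-value pairs and fill the
    new-table creation format for the target database (step 2-2)."""
    info = json.loads(table_info_json)
    cols = ", ".join(
        f"{f['name']} {TYPE_MAP[db_type][f['type']]}({f['length']})"
        if f["type"] == "string"
        else f"{f['name']} {TYPE_MAP[db_type][f['type']]}"
        for f in info["fields"]
    )
    return f"CREATE TABLE {info['table']} ({cols})"

def build_import_statements(entries_json, table):
    """Turn 'add' entries of the intermediate data into final import
    statements (step 2-4)."""
    stmts = []
    for e in json.loads(entries_json):
        if e["op"] == "add":
            cols = ", ".join(e["data"])
            vals = ", ".join(repr(v) for v in e["data"].values())
            stmts.append(f"INSERT INTO {table} ({cols}) VALUES ({vals})")
    return stmts
```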
3. The method for synchronizing structured data between relational databases according to claim 2, wherein the step 3) specifically comprises:
step 3-1) dividing the data to be imported generated in the step 2) into reverse analysis of data table information and reverse analysis of content data according to data types;
step 3-2) reverse analysis of data table information: reading the data to be imported for creating a new table of the target database generated in the step 2-2), decoding the data into corresponding table deletion statements, and generating backup recovery data for rolling back of a subsequent data version;
step 3-3) reverse analysis of content data: and reading the data to be imported generated in the step 2-4) and used for importing the target database, and performing corresponding reverse analysis operation by combining the type of the target database to generate backup recovery data used for rolling back the subsequent data version.
4. The method for synchronizing structured data between relational databases according to claim 3, wherein the reverse parsing operation comprises: reading a data entry to be imported and reversely mapping its data operation code: replacing an "add" operation with a "delete" operation, replacing a "modify" operation with a "modify" operation restoring the original data, and replacing a "delete" operation with an "add" operation restoring the deleted data; and generating, in combination with the entry content, backup recovery data for rolling back the subsequent data version.
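The reverse mapping of operation codes described above can be sketched as follows. This is an illustrative, non-limiting fragment that assumes the same hypothetical intermediate-entry layout (`op`/`id`/`data`); the `original` argument stands in for the pre-modification values read from backup:

```python
def reverse_entry(entry, original=None):
    """Reverse-map one data entry's operation code to produce backup
    recovery data: add -> delete, modify -> modify restoring the
    original values, delete -> add restoring the deleted values."""
    op = entry["op"]
    if op == "add":
        return {"op": "delete", "id": entry["id"], "data": entry["data"]}
    if op == "modify":
        # `original` holds the values before modification, so replaying
        # this entry rolls the data version back.
        return {"op": "modify", "id": entry["id"], "data": original}
    if op == "delete":
        return {"op": "add", "id": entry["id"], "data": entry["data"]}
    raise ValueError(f"unknown operation code: {op}")
```

Replaying the reversed entries against the target database then rolls back the corresponding data version.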
5. A system for synchronizing structured data between relational databases, the system comprising: the system comprises a data synchronization engine (10), a data processing module (20), a message scheduling module (30), a data backup warehouse (40) and a log management library (50);
the data synchronization engine (10) is used for taking charge of interactive behaviors between a user and a system, and comprises task customization, authority management of the user, uploading and downloading of data and expansion and connection of other external interfaces;
the data processing module (20) is used for receiving a data processing task command sent by the message scheduling module (30), reading data required by a task from the log management library (50) according to the task command, exporting the data from a source database, and importing the data into a target database; performing reverse analysis operation on the data for rolling back the data version;
the message scheduling module (30) is used for acquiring a task request configured by a user and a configured normalized data synchronization task request from the data synchronization engine (10), and transmitting a data processing task command to the data processing module (20);
the data backup warehouse (40) is used for storing the data format adjustment and data cleaning processing units uploaded by users in the data synchronization engine (10); storing the export data packets generated by the data processing module (20), the uploaded import data packets, and the generated reverse recovery data packets; and storing the log files generated during system operation and the log information files of all deletion operations obtained by periodically extracting and analyzing the operation log file of each database;
the log management library (50) is used for managing all information interaction logs generated in the running process of the system, and comprises task customization information, user usage records, uploaded data and code storage path registration, information entry of an external interface in a data synchronization engine (10), data import and export condition records in a data processing module (20), import and export data packets, association information records between reverse recovery data packets and a task database, and task execution information records of a message scheduling module (30);
the data processing module (20) comprises: the device comprises a data export operation unit A, a data import operation unit B and a reverse analysis recovery unit C;
the data export operation unit A is used for exporting original data from a source database into json intermediate export data, reading source database log information about 'delete' operation in a data backup warehouse (40), analyzing export data of item generation zone bit delete operation needing data delete, and merging the export data into the json intermediate export data; dividing original data into derivation of a data table, derivation of full-field data and derivation of partial data according to data types; performing json datamation of table creation information on original data to be exported according to an export strategy of a data export task, and combining the original data of full-field data and partial data with data operation types and data standardization of a data entry mark format to generate json intermediate export data;
the data import operation unit B is used for importing json intermediate export data into a target database; dividing json intermediate import data into creation of a data table and import of data to be imported according to data types; carrying out data formatting on the data table information to be imported according to the requirement of new table creation of a target database, and finishing data mapping on the content data to be imported according to data cleaning and the import requirement of the target database so as to generate a final import statement and import the final import statement into the target database;
and the reverse analysis recovery unit C is used for performing reverse analysis operation on the data to be imported to generate backup recovery data for rolling back the data version.
6. The system for synchronization of structured data between relational databases according to claim 5, wherein the data synchronization engine (10) comprises:
the task customizing unit is used for configuring the normalized timing task, configuring the one-time temporary task, and configuring the tasks including the combination of a plurality of subtasks, the data operation code assignment in the exporting process, the data identification code format configuration, the attribute configuration of source data, the data format adjustment in the importing process, the uploading of the data cleaning processing unit, the attribute configuration of target data and the calling configuration of an external interface;
the authority management unit of the user is used for realizing the highest authority setting of data synchronization of an administrator, the execution authority of maintenance personnel and the temporary configuration and calling authority of data application personnel;
the interactive interface is used for uploading and downloading data; modifying and uploading the data cleaning processing unit; task customization and attribute configuration;
and the external interface provides expandable interface management for broadening the application range of the data synchronization system and adding intelligent services to the system, including the addition and modification of data import units and data export units, and docking with an optical disk ferrying system that controls the data flow direction.
CN201810030156.2A 2018-01-12 2018-01-12 Method and system for synchronizing structured data between relational databases Active CN108052681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810030156.2A CN108052681B (en) 2018-01-12 2018-01-12 Method and system for synchronizing structured data between relational databases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810030156.2A CN108052681B (en) 2018-01-12 2018-01-12 Method and system for synchronizing structured data between relational databases

Publications (2)

Publication Number Publication Date
CN108052681A CN108052681A (en) 2018-05-18
CN108052681B true CN108052681B (en) 2020-05-26

Family

ID=62127506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810030156.2A Active CN108052681B (en) 2018-01-12 2018-01-12 Method and system for synchronizing structured data between relational databases

Country Status (1)

Country Link
CN (1) CN108052681B (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241183B (en) * 2018-08-16 2020-12-29 武汉元鼎创天信息科技有限公司 Data synchronization method and system based on socket communication
CN109359103A (en) * 2018-09-04 2019-02-19 河南智云数据信息技术股份有限公司 A kind of data aggregate cleaning method and system
CN109446262B (en) * 2018-10-31 2021-10-08 成都四方伟业软件股份有限公司 Data aggregation method and device
CN109885532A (en) * 2019-02-11 2019-06-14 中国银行股份有限公司 A kind of transaction data standardized method and device
CN110297869B (en) * 2019-05-30 2022-11-25 北京百度网讯科技有限公司 AI data warehouse platform and operation method
CN110197051A (en) * 2019-06-13 2019-09-03 浪潮软件股份有限公司 A kind of method, terminal and the computer readable storage medium of permission control
CN110413672B (en) * 2019-07-03 2023-09-19 平安科技(深圳)有限公司 Automatic data importing method and device and computer readable storage medium
CN111158642A (en) * 2019-11-25 2020-05-15 深圳壹账通智能科技有限公司 Data construction method and device, computer equipment and storage medium
CN110971685B (en) * 2019-11-29 2021-01-01 腾讯科技(深圳)有限公司 Content processing method, content processing device, computer equipment and storage medium
CN111061739B (en) * 2019-12-17 2023-07-04 医渡云(北京)技术有限公司 Method and device for warehousing massive medical data, electronic equipment and storage medium
CN111125065B (en) * 2019-12-24 2023-09-12 阳光人寿保险股份有限公司 Visual data synchronization method, system, terminal and computer readable storage medium
CN111143329B (en) * 2019-12-27 2024-02-13 中国银联股份有限公司 Data processing method and device
CN111159160B (en) * 2019-12-31 2023-06-20 卓米私人有限公司 Version rollback method and device, electronic equipment and storage medium
US20210200751A1 (en) * 2019-12-31 2021-07-01 Capital One Services, Llc Monitoring and data validation of process log information imported from multiple diverse data sources
CN111414260A (en) * 2020-03-03 2020-07-14 中国平安人寿保险股份有限公司 Software system data processing method, device and computer readable storage medium
CN111475531A (en) * 2020-04-12 2020-07-31 魏秋云 Information analysis system based on student employment data
CN111694840B (en) * 2020-04-29 2023-05-30 平安科技(深圳)有限公司 Data synchronization method, device, server and storage medium
CN111694812A (en) * 2020-05-06 2020-09-22 五八有限公司 Data migration method and data migration device
CN111881209A (en) * 2020-06-29 2020-11-03 平安国际智慧城市科技股份有限公司 Data synchronization method and device for heterogeneous database, electronic equipment and medium
CN111858632B (en) * 2020-07-22 2024-02-20 浪潮云信息技术股份公司 NiFi-based relational database incremental data warehousing method
CN111897877B (en) * 2020-08-12 2024-03-26 浪潮软件股份有限公司 High-performance high-reliability data sharing system and method based on distributed ideas
CN112364101A (en) * 2020-11-11 2021-02-12 深圳前海微众银行股份有限公司 Data synchronization method and device, terminal equipment and medium
CN112416907A (en) * 2020-12-03 2021-02-26 厦门市美亚柏科信息股份有限公司 Database table data importing and exporting method, terminal equipment and storage medium
CN112632176B (en) * 2020-12-31 2024-09-17 中国农业银行股份有限公司 Interaction method and device for supervising report database
WO2022193894A1 (en) * 2021-03-19 2022-09-22 International Business Machines Corporation Asynchronous persistency of replicated data changes in database accelerator
US11797570B2 (en) 2021-03-19 2023-10-24 International Business Machines Corporation Asynchronous persistency of replicated data changes in a database accelerator
US11500733B2 (en) 2021-03-19 2022-11-15 International Business Machines Corporation Volatile database caching in a database accelerator
CN113392081B (en) * 2021-06-10 2024-07-09 北京猿力未来科技有限公司 Data processing system and method
CN113535857A (en) * 2021-08-04 2021-10-22 阿波罗智联(北京)科技有限公司 Data synchronization method and device
CN114595291B (en) * 2022-05-10 2022-08-02 城云科技(中国)有限公司 Collection task adjusting method and device based on database annotation
CN115062033A (en) * 2022-06-21 2022-09-16 长春一汽富晟集团有限公司 Automatic importing and exporting system and method for SQL Server database data table
CN116107816B (en) * 2023-04-13 2023-08-01 山东捷瑞数字科技股份有限公司 MYSQL database back-file cloud platform

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129478B (en) * 2011-04-26 2012-10-03 广州从兴电子开发有限公司 Database synchronization method and system thereof
CN102542007B (en) * 2011-12-13 2014-06-25 中国电子科技集团公司第十五研究所 Method and system for synchronization of relational databases
CN103067483B (en) * 2012-12-25 2017-04-05 广东邮电职业技术学院 Teledata increment synchronization method based on packet
CN103678532B (en) * 2013-12-02 2017-05-10 中国移动(深圳)有限公司 Alternation statement reverse analysis method, database alternating and backspacing method and database alternating and backspacing system
CN104516989B (en) * 2015-01-26 2018-07-03 北京京东尚科信息技术有限公司 Incremental data supplying system and method
CN106951536A (en) * 2017-03-22 2017-07-14 努比亚技术有限公司 Data method for transformation and system

Also Published As

Publication number Publication date
CN108052681A (en) 2018-05-18

Similar Documents

Publication Publication Date Title
CN108052681B (en) Method and system for synchronizing structured data between relational databases
CN103617176B (en) One kind realizes the autosynchronous method of multi-source heterogeneous data resource
CN110175213A (en) A kind of oracle database synchronization system and method based on SCN mode
CN110377666A (en) Based on the synchronous method of data between CMSP message-oriented middleware progress different source data library
CN105005618A (en) Data synchronization method and system among heterogeneous databases
CN106599104A (en) Mass data association method based on redis cluster
CN109213820B (en) Method for realizing fusion use of multiple types of databases
CN112685433B (en) Metadata updating method and device, electronic equipment and computer-readable storage medium
WO2018036324A1 (en) Smart city information sharing method and device
CN104573100A (en) Step-by-step database synchronization method with autoincrement identifications
CN104598610A (en) Step-by-step database data distribution uploading and synchronizing method
CN104899274B (en) A kind of memory database Efficient Remote access method
CN105791401B (en) Client and server-side data interactive method, system under net and off-network state
CN105608126A (en) Method and apparatus for establishing secondary indexes for massive databases
CN108984725A (en) Cross-gatekeeper data synchronization method
CN107688611A (en) A kind of Redis key assignments management system and method based on saltstack
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN111913933B (en) Power grid historical data management method and system based on unified support platform
KR101357135B1 (en) Apparatus for Collecting Log Information
CN110705724A (en) Reusable automatic operation and maintenance management system
CN111090803A (en) Data processing method and device, electronic equipment and storage medium
CN114374701B (en) Transparent sharing device for sample model of multistage linkage artificial intelligent platform
CN101645073A (en) Method for guiding prior database file into embedded type database
CN109857808B (en) Vertical data synchronization system and method based on neutral data structure
CN116414801A (en) Data migration method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210406

Address after: No.26 Fucheng Road, Haidian District, Beijing 100142

Patentee after: MILITARY SCIENCE INFORMATION RESEARCH CENTER OF MILITARY ACADEMY OF THE CHINESE PLA

Address before: 100142 courtyard 26, Fucheng Road, Haidian District, Beijing

Patentee before: Mao Bin

Patentee before: MILITARY SCIENCE INFORMATION RESEARCH CENTER OF MILITARY ACADEMY OF THE CHINESE PLA