CN117743466A

CN117743466A - Cross-platform database synchronization method

Info

Publication number: CN117743466A
Application number: CN202311803752.7A
Authority: CN
Inventors: 钟继宗; 王海潮
Original assignee: Ningbo Zhenhai District Audit Bureau
Current assignee: Ningbo Zhenhai District Audit Bureau
Priority date: 2023-12-26
Filing date: 2023-12-26
Publication date: 2024-03-22

Abstract

The invention relates to the technical field of databases, and in particular to a cross-platform database synchronization method comprising the following steps: defining a unified data model in a source database and a target database; processing data conversion through middleware, which converts the data format of the source database into a format acceptable to the target database and also handles abnormal conditions arising during synchronization; implementing a data transmission mechanism to ensure the secure transmission of data between the source database and the target database; performing data verification in the target database to ensure the integrity and accuracy of the data; and introducing a dynamic synchronization adjustment mechanism based on real-time performance feedback, which adjusts the synchronization strategy in real time according to current network conditions, database load and data synchronization efficiency. The invention improves data synchronization efficiency and reliability in unstable network environments and under varying database loads.

Description

Cross-platform database synchronization method

Technical Field

The invention relates to the technical field of databases, in particular to a cross-platform database synchronization method.

Background

In the field of modern information technology, data synchronization is becoming increasingly important between different database platforms. With the rapid development of cloud computing and big data technology, enterprises and organizations increasingly rely on multiple database systems to store and process data, however, existing database synchronization techniques face a variety of challenges. These challenges include mainly the problem of data compatibility between different database platforms, security risks in the data synchronization process, and the difficulty of maintaining synchronization efficiency and stability under varying network and system load conditions.

Different database systems typically have different data storage formats and data types, and without an effective conversion and mapping mechanism, these differences result in data being prone to format errors or loss during synchronization, thereby affecting the integrity and accuracy of the data.

With the increase of network attacks and the frequent occurrence of data leakage events, security in the data synchronization process becomes a significant concern, and in addition, the data synchronization process needs to be kept efficient and stable under various network environments and different system load conditions, which is often difficult to achieve in the prior art.

Existing data synchronization methods often lack sufficient flexibility and intelligence to automatically adjust synchronization policies based on real-time network conditions or database loads, which limits their effectiveness in dynamically changing environments, especially in cross-platform application scenarios.

Disclosure of Invention

Based on the above purpose, the invention provides a cross-platform database synchronization method.

A method for synchronizing a cross-platform database, comprising the steps of:

S1: defining a unified data model in the source database and the target database to ensure the consistency of data structures across platforms;

S2: processing data conversion through middleware, wherein the middleware converts the data format of the source database into a format acceptable to the target database and also handles abnormal conditions in the data synchronization process, the abnormal conditions including data format mismatches and missing fields;

S3: implementing a data transmission mechanism to ensure the secure transmission of data between the source database and the target database;

S4: performing data verification in the target database to ensure the integrity and accuracy of the data, including a data consistency check and an integrity check;

S5: introducing a dynamic synchronization adjustment mechanism based on real-time performance feedback, wherein the mechanism adjusts the data synchronization strategy in real time according to current network conditions, database load and data synchronization efficiency, including adjusting the synchronization frequency and optimizing the data packet size.

Further, the data model defined uniformly in the step S1 comprises a field mapping mechanism, a data type conversion rule and a data relation mapping;

the field mapping mechanism: analyzing the data structures in the source database and the target database, automatically identifying the same or similar data fields, and for inconsistent fields, providing a field mapping mechanism that allows a user or an automatic algorithm to map the fields in the source database to the corresponding fields in the target database;

the data type conversion rule: establishing a set of data type conversion rules for converting data types between different database platforms, including basic data types and complex data types, wherein the rules are automatically applied to ensure correct conversion and adaptation of the data types in the data synchronization process;

data relationship mapping: the mapping of data relations is processed, relational data structures in a source database and a target database are identified, the relational data structures comprise incidence relations (one-to-one, one-to-many and many-to-many) among tables, correct processing and mapping of the incidence relations are ensured during data synchronization through a relation identification algorithm, consistency of the data relations is ensured, the relation identification algorithm uses a graph theory algorithm to identify entity relation diagrams in the database, depth-first search or breadth-first search to explore and identify the incidence relations among the tables, constraint conversion strategies are adopted in the mapping process, relation mapping is realized through an intermediate table for complex many-to-many relations, and cascade updating and deleting rules are applied to keep consistency and integrity of the data relations.
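The relation identification step can be illustrated with a minimal sketch. It assumes the foreign-key metadata has already been read from the database catalog into a plain dictionary; the table names, the `FOREIGN_KEYS` structure and the intermediate-table heuristic are hypothetical and are not taken from the patent text.

```python
from collections import deque

# Hypothetical schema metadata: table -> list of (column, referenced_table).
FOREIGN_KEYS = {
    "orders": [("customer_id", "customers")],
    "order_items": [("order_id", "orders"), ("product_id", "products")],
    "customers": [],
    "products": [],
}


def build_graph(foreign_keys):
    """Build an undirected adjacency map from foreign-key references."""
    graph = {table: set() for table in foreign_keys}
    for table, refs in foreign_keys.items():
        for _column, target in refs:
            graph.setdefault(target, set())
            graph[table].add(target)
            graph[target].add(table)
    return graph


def related_tables(graph, start):
    """Breadth-first search: tables reachable from `start`, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        table = queue.popleft()
        order.append(table)
        for neighbour in sorted(graph[table]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


def candidate_intermediate_tables(foreign_keys):
    """Heuristic (assumption): a table referencing two or more distinct
    tables may be the intermediate table of a many-to-many relation."""
    return sorted(
        table
        for table, refs in foreign_keys.items()
        if len({target for _column, target in refs}) >= 2
    )
```

The visit order gives one possible sequence for processing related tables together; a depth-first traversal would serve equally well for discovering the same connected set.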

Further, the middleware in S2 includes a data conversion engine, where the data conversion engine converts a data format in the source database into a format acceptable to the target database, including conversion of a basic data type (such as an integer, a character string), a time format, and a special data type (such as a BLOB, a CLOB), and the conversion rule is defined by a data mapping table, where the data mapping table is configured by a user or generated by automatic detection during initialization, and is used to describe a data type correspondence between the source database and the target database;

the middleware further comprises an exception handling mechanism, wherein the consistency and integrity of the data are monitored in real time during the data synchronization process, and an exception handling program is triggered when a data format mismatch or a missing field is encountered; for a data format mismatch, the exception handling mechanism attempts to apply a standby conversion rule or a preset algorithm to adjust the data so that it meets the requirements of the target database; when a field is missing, the exception handling mechanism either ignores the missing field or fills it with a default value;

after the data is converted, the middleware performs a data verification step to ensure that the converted data meets the format and integrity requirements of the target database, if the data does not pass the verification, the middleware records detailed error information, and performs retry, skip or stop synchronization operations according to the configuration.
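The verification step and its configurable failure handling can be sketched as follows. This is only an illustration under stated assumptions: the `schema` dictionary, the `convert` and `write` callables, and the policy names `"retry"`, `"skip"` and `"stop"` are hypothetical stand-ins for whatever configuration the middleware actually exposes.

```python
import logging

logger = logging.getLogger("sync.middleware")


def validate(row, schema):
    """Return a list of error messages; an empty list means the row passed."""
    errors = []
    for name, expected_type in schema.items():
        if name not in row:
            errors.append(f"missing field: {name}")
        elif not isinstance(row[name], expected_type):
            errors.append(f"wrong type for {name}: {type(row[name]).__name__}")
    return errors


def sync_rows(raw_rows, convert, schema, write, on_failure="skip", max_retries=2):
    """Convert, verify and write each row, applying the configured policy
    (retry, skip or stop) when verification fails."""
    written, failed = 0, []
    for raw in raw_rows:
        attempts = 0
        while True:
            row = convert(raw)
            errors = validate(row, schema)
            if not errors:
                write(row)
                written += 1
                break
            attempts += 1
            logger.error("verification failed for %r: %s", raw, "; ".join(errors))
            if on_failure == "retry" and attempts <= max_retries:
                continue  # run the conversion again
            failed.append(raw)
            if on_failure == "stop":
                return written, failed
            break  # skip this row and move on
    return written, failed
```

In this sketch a row that still fails after the allowed retries is recorded and skipped, so a single bad record does not block the rest of the batch unless the `"stop"` policy is chosen.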

Further, the preset algorithm comprises forced data type conversion or default value filling: when the data types of the source database and the target database are inconsistent, the forced data type conversion algorithm is applied; when a field is missing in the source database, the default value filling algorithm is applied.

Further, S3 comprises an encrypted transmission mechanism and a data integrity verification mechanism, specifically:

encrypted transmission mechanism: SSL/TLS protocol encryption is adopted during transmission to ensure data security; in the encryption process the data is encrypted into an unreadable format and is decrypted at the target database end; the encryption algorithm is a standard algorithm such as AES or RSA, and a suitable key length and encryption mode can be selected according to the security requirements;

data integrity verification mechanism: in the data transmission process, an integrity check code is added for each data packet, after the target database receives data, the check code of the data is recalculated and compared with the original check code to verify whether the data is tampered or damaged in the transmission process, and in order to prevent the check code from being tampered, the check code is also encrypted and transmitted, and at the target database, the check code is decrypted first, and then the integrity verification is executed.
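A minimal sketch of a per-packet integrity check is given below. Note that it swaps in a keyed HMAC-SHA256 tag instead of a separately encrypted check code: a keyed tag cannot be recomputed by a party without the key, which serves the same purpose of protecting the check code against tampering. The shared `key` and the packet layout are assumptions for illustration.

```python
import hashlib
import hmac


def make_packet(payload: bytes, key: bytes) -> dict:
    """Attach an HMAC-SHA256 integrity tag to the payload (sender side)."""
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_packet(packet: dict, key: bytes) -> bool:
    """Recompute the tag at the target end and compare in constant time."""
    expected = hmac.new(key, packet["payload"], hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, packet["tag"])
```

A packet whose payload or tag was altered in transit fails `verify_packet`, at which point the receiver can request retransmission.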

Further, the dynamic synchronization adjustment mechanism in S5 specifically includes:

real-time monitoring component: continuously collects and analyzes current network conditions, database load and data synchronization efficiency indexes, wherein the network conditions are evaluated through bandwidth utilization, delay and packet loss rate, and the database load is monitored through query response time, the number of waiting transactions and resource utilization;

policy adjustment algorithm: according to the collected real-time data, a strategy adjustment algorithm is applied to determine adjustment of a synchronization strategy, wherein the strategy comprises the steps of changing the frequency of data synchronization, adjusting the size of a data packet and selecting different data transmission paths;

and adopting a self-adaptive optimization strategy, and continuously optimizing and adjusting parameters according to the historical synchronous performance data by using a statistical analysis method.
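The text does not specify how the monitored parameters are combined into a single index, so the following is only one possible sketch. It assumes equal weighting, utilization and loss values expressed as fractions between 0 and 1, and hypothetical worst-case bounds (`worst_latency_ms`, `worst_response_ms`, `worst_waiting`) used for normalization.

```python
def clamp01(value: float) -> float:
    """Limit a value to the range [0, 1]."""
    return max(0.0, min(1.0, value))


def network_index(bandwidth_utilization, latency_ms, loss_rate,
                  worst_latency_ms=500.0):
    """Assumed scoring: 1.0 is an idle, fast, lossless link; 0.0 is the worst."""
    scores = [
        1.0 - clamp01(bandwidth_utilization),
        1.0 - clamp01(latency_ms / worst_latency_ms),
        1.0 - clamp01(loss_rate),
    ]
    return sum(scores) / len(scores)


def load_index(response_ms, waiting_transactions, resource_utilization,
               worst_response_ms=1000.0, worst_waiting=100):
    """Assumed scoring: 0.0 is an idle database; 1.0 is a saturated one."""
    parts = [
        clamp01(response_ms / worst_response_ms),
        clamp01(waiting_transactions / worst_waiting),
        clamp01(resource_utilization),
    ]
    return sum(parts) / len(parts)
```

With this convention a higher network index means better network conditions and a higher load index means a busier database, matching the way the two indexes are described in the decision logic.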

Further, the policy adjustment algorithm is specifically as follows:

the following parameters are defined: N is a network condition index, obtained by comprehensively evaluating parameters such as bandwidth utilization, delay and packet loss rate; D is a database load index, obtained by comprehensively evaluating query response time, the number of waiting transactions and resource utilization; and E is a data synchronization efficiency index, calculated from historical synchronization data;

a synchronization strategy adjustment function f(N, D, E) is defined, which outputs adjustment suggestions for the synchronization strategy according to the input network condition index, database load index and data synchronization efficiency index;

the algorithm formula: the synchronization strategy comprises a synchronization frequency F and a data packet size P, defined as:

synchronization frequency F = k₁/N + k₂/D;

packet size P = k₃ × E × N;

where k₁, k₂ and k₃ are preset coefficients that adjust the influence of network conditions and database load on the synchronization frequency and the data packet size;

decision logic:

when the network conditions are good (N is high) and the database load is low (D is low): increase the synchronization frequency F and increase the data packet size P;

when the network conditions are poor (N is low) or the database load is high (D is high): reduce the synchronization frequency F and reduce the data packet size P;

adjustment according to the data synchronization efficiency index E: if the historical synchronization efficiency is high, increase P.
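The two formulas can be evaluated directly, as in the sketch below. It implements them exactly as written; the `Coefficients` container, the function name `sync_policy` and the requirement that N and D be positive are assumptions added for illustration. How N and D should be scaled so that the computed values follow the decision logic is left open by the text and is not settled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coefficients:
    """Preset coefficients k1, k2 and k3 from the formulas."""
    k1: float
    k2: float
    k3: float


def sync_policy(n: float, d: float, e: float, k: Coefficients) -> tuple[float, float]:
    """Return (F, P) with F = k1/N + k2/D and P = k3 * E * N."""
    if n <= 0 or d <= 0:
        raise ValueError("N and D must be positive to evaluate F")
    frequency = k.k1 / n + k.k2 / d
    packet_size = k.k3 * e * n
    return frequency, packet_size
```

For example, with N = 2, D = 4, E = 0.5 and coefficients (1, 2, 8), the function returns F = 1.0 and P = 8.0.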

Further, the statistical analysis method specifically includes:

improved time series analysis:

where Y_{t,p} is the performance index of the p-th platform at time t, φ_{i,p} and θ_{j,p} are the model parameters for platform p, and ε_{t,p} is the error term;

improved multiple regression analysis, including cross-platform specific variables:

the multiple regression model was improved as follows:

where Z_p is a variable representing the characteristics of the different platforms and γ_p is the corresponding regression coefficient, which accounts for the influence of platform differences on synchronization performance;

improved parameter optimization algorithm:

the optimization objective function is improved as follows:

where Y_{t,p} is the actual synchronization performance index of the p-th platform at time t, and Ŷ_{t,p} is the performance index predicted by the model;

adaptive adjustment strategy: based on the analysis and optimization results, the synchronization strategy is automatically adjusted for each platform, including adjusting the synchronization frequency or the data packet size, and continuously monitoring the effect of the new strategy, and the strategy of each platform is individually adjusted by considering different performance characteristics and requirements of different platforms.
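The idea of fitting model parameters separately for each platform by minimizing prediction error can be sketched as follows. The patent's own model equations are not reproduced in this text, so the example substitutes a deliberately simple first-order autoregressive model, y[t] ≈ φ · y[t−1], fitted by least squares; the model choice, the function names and the platform labels are all assumptions.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in y[t] ~ phi * y[t-1]."""
    pairs = list(zip(series, series[1:]))
    numerator = sum(current * previous for previous, current in pairs)
    denominator = sum(previous * previous for previous, _current in pairs)
    if denominator == 0:
        raise ValueError("series needs a non-zero value before its last sample")
    return numerator / denominator


def squared_error(series, phi):
    """Sum over t of (actual - predicted)^2 for the fitted model."""
    return sum((current - phi * previous) ** 2
               for previous, current in zip(series, series[1:]))


def fit_per_platform(history):
    """Fit one parameter per platform so each platform is tuned individually."""
    return {platform: fit_ar1(series) for platform, series in history.items()}
```

Keeping a separate parameter per platform reflects the point made above that each platform's strategy is adjusted individually rather than from one pooled model.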

Further, the system also comprises a user interface design, wherein a user interface is provided for a user to monitor and manage the data synchronization process, and the user monitors the data synchronization state, views the log and manages the synchronization task through the interface.

The invention has the beneficial effects that:

the cross-platform database synchronization method remarkably improves the efficiency and flexibility of data synchronization, and can ensure the efficient processing and synchronization of data among different database systems by realizing advanced data type conversion rules and field mapping mechanisms, encryption transmission and data integrity verification, thereby not only accelerating the speed of data processing, but also reducing the synchronization failure caused by data conversion errors or incompatibilities.

The invention provides an extra security layer for the data synchronization process by the encryption transmission and the data integrity verification mechanism, and effectively prevents the data from being leaked or tampered in the transmission process by applying the strong encryption protocol in the data transmission and carrying out strict integrity check on the target end.

The dynamic synchronization adjustment mechanism automatically adjusts the synchronization strategy, including the synchronization frequency and the data packet size, according to real-time network conditions, database load and data synchronization efficiency. By means of the statistical analysis method and the adaptive optimization strategy, the mechanism continuously learns and adapts to the performance characteristics and synchronization requirements of different platforms, maintains efficient and stable data synchronization in the face of network fluctuations or performance differences between database platforms, and thereby reduces synchronization interruptions and performance degradation caused by environmental changes, improving the intelligence and robustness of cross-platform data synchronization.

Drawings

In order to more clearly illustrate the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only of the invention and that other drawings can be obtained from them without inventive effort for a person skilled in the art.

Fig. 1 is a schematic flow chart of a synchronization method according to an embodiment of the invention.

Detailed Description

The present invention will be further described in detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent.

It is to be noted that unless otherwise defined, technical or scientific terms used herein should be taken in a general sense as understood by one of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that elements or items preceding the word are included in the element or item listed after the word and equivalents thereof, but does not exclude other elements or items. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", etc. are used merely to indicate relative positional relationships, which may also be changed when the absolute position of the object to be described is changed.

As shown in Fig. 1, a cross-platform database synchronization method includes the following steps:

S5: introducing a dynamic synchronization adjustment mechanism based on real-time performance feedback, wherein the mechanism adjusts the data synchronization strategy in real time according to current network conditions, database load and data synchronization efficiency, including adjusting the synchronization frequency and optimizing the data packet size;

through the steps, the method provided by the invention not only processes the basic requirement of cross-platform data synchronization, but also effectively improves the data synchronization efficiency and reliability in an unstable network environment or in the face of different database loads through a dynamic synchronization adjustment mechanism. The steps are closely related and mutually supported, and a comprehensive and efficient cross-platform database synchronization solution is formed together.

The data model with unified definition in S1 comprises a field mapping mechanism, a data type conversion rule and a data relation mapping;

field mapping mechanism: analyzing the data structures in the source and target databases, automatically identifying the same or similar data fields, and for inconsistent fields, providing a field mapping mechanism that allows a user or an automated algorithm to map the fields in the source database to corresponding fields in the target database, e.g., a "date" field in the source database may be mapped to a "timestamp" field in the target database;

data type conversion rules: establishing a set of data type conversion rules for converting data types among different database platforms, wherein the data types comprise basic data types (such as integers, character strings and Boolean values) and complex data types (such as date and time and binary data), and the rules are automatically applied to ensure the correct conversion and adaptation of the data types in the data synchronization process;

the data type conversion rule refers to the mutual conversion of different data types in a source database and a target database when data are synchronized between different database platforms, and the following are some specific rule examples:

basic data type conversion:

Integer type: for example, INTEGER in the source database is converted to BIGINT in the target database (for database systems with larger capacity).

Floating point number type: for example, FLOAT is converted to DOUBLE to accommodate scenarios that demand higher precision.

Character string type: VARCHAR or TEXT types are converted uniformly, taking into account differences in character sets and length constraints.

Date and time type conversion: DATE, TIME, DATETIME and similar types are converted according to the format of the target database, which may involve time zone conversion and format adjustment.

Special data type conversion: for BLOBs (binary large objects) and CLOBs (character large objects), the necessary encoding or decoding is performed to ensure the integrity of the data during conversion.

Boolean value conversion: converting boolean values (TRUE/FALSE) into corresponding numerical representations (e.g., 0 and 1) or string representations (e.g., 'TRUE'/'FALSE') to accommodate the processing of different databases;

data relationship mapping: processing mapping of data relations, identifying a relational data structure in a source database and a target database, wherein the relational data structure comprises incidence relations (one-to-one, one-to-many and many-to-many) among tables, ensuring that the incidence relations are correctly processed and mapped when data are synchronized through a relation identification algorithm, ensuring consistency of the data relations, identifying entity relation diagrams in the database through the relation identification algorithm by using a graph theory algorithm, exploring and identifying the incidence relations among the tables by using depth-first search or breadth-first search, adopting a constraint conversion strategy in the mapping process, such as constraining external keys in the target database to reconstruct in a proper form, realizing relation mapping through an intermediate table for complex many-to-many relations, and applying cascade updating and deleting rules to keep consistency and integrity of the data relations;

in addition, the system supports user-defined mapping and conversion rules. The user can add or modify mapping and conversion rules according to specific requirements, so that the data model definition is more flexible and adaptive.

By means of the detailed implementation modes, the method ensures that a unified data model is effectively defined between a source database and a target database, and solves key problems in cross-platform data synchronization, including accuracy of field mapping, adaptability of data type conversion and consistency of data relations, so that a solid foundation is provided for efficient and accurate data synchronization.
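The automatic identification of identical or similar fields, combined with user-defined overrides, might look like the sketch below. It relies on name similarity from Python's standard `difflib` module; the similarity cutoff, the function name and the example field names are assumptions, and a production matcher would likely also compare data types.

```python
import difflib


def suggest_field_mapping(source_fields, target_fields, overrides=None, cutoff=0.6):
    """Map each source field to the most similar target field name.

    `overrides` holds user-defined mappings, which take precedence over
    automatic matches. Fields with no sufficiently similar target map to None.
    """
    mapping = dict(overrides or {})
    lowered = {name.lower(): name for name in target_fields}
    for field in source_fields:
        if field in mapping:
            continue
        matches = difflib.get_close_matches(
            field.lower(), list(lowered), n=1, cutoff=cutoff
        )
        mapping[field] = lowered[matches[0]] if matches else None
    return mapping
```

A `None` entry signals a field the algorithm could not place, which is where a user-supplied rule (such as mapping "date" to "timestamp") is expected to fill in.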

The middleware in S2 comprises a data conversion engine, wherein the data conversion engine converts a data format in a source database into a format acceptable by a target database, and comprises conversion of basic data types (such as integers and character strings), time formats and special data types (such as BLOB and CLOB), the conversion rule is defined by a data mapping table, and the data mapping table is configured by a user or generated through automatic detection during initialization and is used for describing the data type corresponding relation between the source database and the target database;

the middleware also comprises an exception handling mechanism, wherein the consistency and integrity of the data are monitored in real time during the data synchronization process, and an exception handling program is triggered when a data format mismatch or a missing field is encountered; for a data format mismatch, the exception handling mechanism attempts to apply a standby conversion rule or a preset algorithm to adjust the data so that it meets the requirements of the target database; when a field is missing, the exception handling mechanism either ignores the missing field or fills it with a default value;

The preset algorithm comprises data type forced conversion or default value filling, and when the data types of the source database and the target database are inconsistent, the data type forced conversion algorithm is applied;

integer and floating point conversion:

if the source data is an integer (int) but the target database requires a floating point number (float), the conversion formula is: float_value = float(int_value).

String and numeric type conversion:

this applies when a string must be converted to a numeric type, for example converting a number represented as a string into an integer or floating point number.

Date-time format conversion:

for date and time types, the conversion depends on the date and time formats of the source and target databases, for example converting the YYYY-MM-DD format to MM/DD/YYYY.
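The three forced conversions above can be sketched with standard-library calls only. The function names and the default format strings are illustrative assumptions; the format codes themselves are those of Python's `datetime` module.

```python
from datetime import datetime


def int_to_float(value: int) -> float:
    """Integer to floating point, as in float_value = float(int_value)."""
    return float(value)


def string_to_number(text: str):
    """Parse a numeric string as int when possible, otherwise as float.

    Raises ValueError if the text is not numeric at all.
    """
    stripped = text.strip()
    try:
        return int(stripped)
    except ValueError:
        return float(stripped)


def reformat_date(text: str, source_fmt: str = "%Y-%m-%d",
                  target_fmt: str = "%m/%d/%Y") -> str:
    """Convert a date string between formats, e.g. YYYY-MM-DD to MM/DD/YYYY."""
    return datetime.strptime(text, source_fmt).strftime(target_fmt)
```

Parsing through `datetime` rather than rearranging substrings means an invalid date such as 2023-02-30 is rejected instead of being passed on to the target database.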

When a field is missing in the source database, the default value filling algorithm is applied.

Value type field:

for fields of numeric type, such as integer or floating point numbers, if the field is missing, it may be filled with 0 or other preset default value.

String type field:

for a string type field, if the field is missing, it may be filled in as an empty string or other default text that is preset.

Boolean type field:

for a boolean type field, if the field is missing, it is typically filled with False unless there is a specific business requirement: bool_field = False

Date time type field:

for a date and time type field, if the field is missing, it is filled with a specific default date, such as the current date or a fixed date.
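The default value filling rules can be collected into a small lookup table, as in this sketch. The type labels, the `DEFAULTS_BY_TYPE` table and the choice to treat a `None` value the same as an absent field are assumptions; the defaults themselves follow the rules listed above, with the current date used for date fields.

```python
from datetime import date

# Default per logical type; callables are evaluated at fill time.
DEFAULTS_BY_TYPE = {
    "integer": 0,
    "float": 0.0,
    "string": "",
    "boolean": False,
    "date": date.today,
}


def fill_missing(row, schema, defaults=DEFAULTS_BY_TYPE):
    """Return a copy of `row` with absent (or None) fields set to defaults.

    `schema` maps each expected field name to a logical type label.
    """
    filled = dict(row)
    for field, type_name in schema.items():
        if filled.get(field) is None:
            default = defaults[type_name]
            filled[field] = default() if callable(default) else default
    return filled
```

Passing a different `defaults` table allows a deployment to substitute a fixed date or other preset values without changing the filling logic.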

The encryption transmission mechanism and the data integrity verification mechanism in the S3 specifically comprise:

data integrity verification mechanism: in the data transmission process, an integrity check code is added for each data packet, after the target database receives data, the check code of the data is recalculated and compared with the original check code to verify whether the data is tampered or damaged in the transmission process, and in order to prevent the check code from being tampered, the check code is also encrypted and transmitted, at the target database, the check code is decrypted firstly, and then the integrity verification is executed;

if the verification of the data integrity fails (i.e. the verification codes are not matched), the system records the transmission abnormality, processes the transmission abnormality according to a preset strategy, such as retransmitting the data, recording a log or notifying an administrator, provides a data retransmission mechanism, and can automatically or manually trigger retransmission when the verification fails to ensure the correct transmission of the data.
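Automatic retransmission after a failed integrity check can be sketched as a bounded retry loop. The `send` callable (which returns the bytes as received at the target), the attempt limit and the `on_give_up` notification hook are hypothetical; a plain SHA-256 digest stands in for the check code.

```python
import hashlib


def checksum(payload: bytes) -> str:
    """Integrity check code for a payload (SHA-256 hex digest)."""
    return hashlib.sha256(payload).hexdigest()


def transmit_with_retry(payload, send, max_attempts=3, on_give_up=None):
    """Send `payload`, retransmitting while the received copy fails verification.

    Returns the attempt number that succeeded, or None after giving up,
    in which case `on_give_up` (e.g. an administrator alert) is called.
    """
    expected = checksum(payload)
    for attempt in range(1, max_attempts + 1):
        received = send(payload)
        if checksum(received) == expected:
            return attempt
    if on_give_up is not None:
        on_give_up(payload)
    return None
```

Bounding the number of attempts keeps a persistently corrupting link from stalling the synchronization task; the failure is surfaced through the notification hook instead.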

The above encryption and integrity verification process is transparent to the end user, i.e., the user does not need to participate in or know the specific details of encryption and verification, the system automatically handles these security measures, and the encryption and integrity verification mechanism is compatible with various databases and network environments without affecting the normal operation of existing databases or applications.

The dynamic synchronization adjustment mechanism in S5 specifically includes:

policy adjustment algorithm: based on the collected real-time data, a policy adjustment algorithm is applied to determine adjustment of the synchronization policy, wherein the policy includes changing the frequency of data synchronization, adjusting the size of the data packet, selecting different data transmission paths, for example, when the network condition is good and the database load is low, the system can increase the synchronization frequency or increase the size of the data packet to improve the synchronization efficiency. Conversely, when network conditions are poor or database loading is high, the system may decrease the synchronization frequency or decrease the packet size to relieve the pressure on the network and database;

adopting a self-adaptive optimization strategy, and continuously optimizing adjustment parameters according to historical synchronous performance data by using a statistical analysis method;

exception handling and recovery: when policy adjustment is implemented, possible abnormal conditions such as data synchronization interruption or performance degradation are monitored, and once an abnormality is detected, the system can automatically fall back to the last stable synchronous configuration and notify an administrator.
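The fall-back behaviour can be sketched by remembering the last configuration that was observed to be healthy. The `SyncConfig` fields, the `PolicyManager` class and the `notify` callback are illustrative assumptions rather than components named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncConfig:
    """One synchronization strategy: frequency and packet size."""
    frequency: float
    packet_size: int


class PolicyManager:
    """Tracks the active configuration and the last known stable one."""

    def __init__(self, initial: SyncConfig, notify):
        self.current = initial
        self.last_stable = initial
        self._notify = notify

    def apply(self, candidate: SyncConfig) -> None:
        """Switch to a newly computed configuration."""
        self.current = candidate

    def report(self, healthy: bool) -> None:
        """Promote the current config when healthy; otherwise roll back and alert."""
        if healthy:
            self.last_stable = self.current
            return
        failed = self.current
        self.current = self.last_stable
        self._notify(f"rolled back from {failed} to {self.current}")
```

A configuration only becomes the new fall-back target after a healthy report, so two consecutive bad adjustments still return to the same stable point.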

The policy adjustment algorithm is specifically as follows:

synchronization frequency F = k₁/N + k₂/D;

packet size P = k₃ × E × N;

decision logic:

The implementation steps are as follows:

1. N, D, and E are calculated periodically or in real time.

2. F and P are calculated by applying the synchronization policy adjustment function f(N, D, E).

3. And adjusting the frequency and the packet size of the data synchronization according to the calculation result.

The statistical analysis method specifically comprises the following steps:

improved time series analysis (accounting for platform differences):

the multiple regression model was improved as follows:

where Z_p is a variable representing the characteristics of the different platforms and γ_p is the corresponding regression coefficient, which accounts for the influence of platform differences on synchronization performance;

improved parameter optimization algorithm (taking into account multi-platform factors):

The system also comprises a user interface design, wherein a user interface is provided for a user to monitor and manage the data synchronization process, and the user monitors the data synchronization state, views the log and manages the synchronization task through the interface.

Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the invention is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the invention, the steps may be implemented in any order and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.

The present invention is intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, improvement, etc. of the present invention should be included in the scope of the present invention.

Claims

1. A method for synchronizing a cross-platform database, comprising the steps of:

2. The method for synchronizing databases across platforms according to claim 1, wherein the defined unified data model in S1 includes a field mapping mechanism, a data type conversion rule, and a data relationship mapping;

data relationship mapping: the mapping of data relations is processed, relational data structures in a source database and a target database are identified, the relation data structures comprise relations among tables, correct processing and mapping of the relation are guaranteed when the data are synchronized through a relation identification algorithm, consistency of the data relations is guaranteed, the relation identification algorithm uses a graph theory algorithm to identify entity relation diagrams in the database, depth-first search or breadth-first search to explore and identify the relation among the tables, constraint conversion strategies are adopted in the mapping process, relation mapping is achieved through an intermediate table for complex many-to-many relations, and cascade updating and deleting rules are applied to keep consistency and integrity of the data relations.

3. The method for synchronizing databases across platforms according to claim 2, wherein the middleware in S2 includes a data conversion engine, the data conversion engine converting the data format in the source database into a format acceptable to the target database, including conversion of a basic data type, a time format, and a special data type, the conversion rule being defined by a data mapping table, the data mapping table being configured by a user at the time of initialization or being generated by automatic detection, for describing a data type correspondence between the source database and the target database;

4. A method of synchronizing databases across platforms as claimed in claim 3, wherein the preset algorithm includes a data type forced conversion or default filling, the data type forced conversion algorithm being applied when the data types of the source database and the target database are not identical, the default filling algorithm being applied when there is a field missing in the source database.

5. The method for synchronizing databases across platforms according to claim 4, wherein the step S3 comprises an encrypted transmission mechanism and a data integrity verification mechanism, which specifically comprises:

6. The method for synchronizing a cross-platform database according to claim 5, wherein the dynamic synchronization adjustment mechanism in S5 specifically includes:

7. The method for synchronizing a cross-platform database according to claim 6, wherein the policy adjustment algorithm is specifically as follows:

synchronization frequency F = k₁/N + k₂/D;

packet size P = k₃ × E × N;

decision logic:

8. The method for synchronizing databases across platforms according to claim 7, wherein the statistical analysis method specifically comprises:

improved time series analysis:

the multiple regression model was improved as follows:

where Z_p is a variable representing the characteristics of the different platforms and γ_p is the corresponding regression coefficient, which accounts for the influence of platform differences on synchronization performance;

improved parameter optimization algorithm:

9. The method of claim 8, further comprising user interface design providing a user interface for a user to monitor and manage the data synchronization process, the user monitoring data synchronization status, viewing logs, and managing synchronization tasks via the interface.