CN118170746A - Data migration method, device, electronic equipment and storage medium - Google Patents

Data migration method, device, electronic equipment and storage medium

Info

Publication number
CN118170746A
Authority
CN
China
Prior art keywords
data
migration
ddl
database
source database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410280736.2A
Other languages
Chinese (zh)
Inventor
陈思樑
林志云
林陈学
张磊
郑立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Fujian Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Fujian Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Fujian Co Ltd
Priority to CN202410280736.2A
Publication of CN118170746A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data migration method, a device, electronic equipment and a storage medium, and relates to the technical field of data processing. The method comprises the following steps: in response to monitoring a DDL change event in a source database, acquiring first data associated with the DDL change event; dynamically parsing a data dictionary corresponding to the source database to obtain a data structure view; judging, based on the data structure view and the first data, whether a preset condition for migrating and copying the first data is met; and if the preset condition is met, performing DDL replication on the first data so as to migrate the first data to a target database. In this way, highly accurate and reliable replication of complex DDL logic can be achieved in situations involving frequent DDL changes during data migration.

Description

Data migration method, device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of data processing, and in particular relates to a data migration method, a data migration device, electronic equipment and a storage medium.
Background
As technology has developed, data has become one of the core assets of enterprises, and it plays an important role in enterprise decision-making, operation management, customer relationships and other areas. To ensure the security, integrity and timeliness of data, logical transaction-level database replication techniques are receiving increasingly widespread attention. Such a technique generates a transaction file by parsing the transactions in the database log, transmits the transaction file to the target end, and finally keeps the databases at both ends consistent through various conversions. Within this, logical replication of DDL (Data Definition Language) is particularly critical, as it directly affects the stability and ease of use of the overall data migration process.
In the current digital age, with the continuous growth of data volume and the increasing complexity of enterprise application scenarios, frequent DDL changes have become the norm. For example, over successive iterations an enterprise may need to add new data fields, delete fields that are no longer used, or modify existing data structures to accommodate new business needs. In this context, a stable, efficient and easy-to-use DDL logical replication technique is particularly important. Such a technique can ensure smooth data migration and effectively reduce the potential risks caused by structure changes.
In the related art, modifications of an object structure are mainly captured by setting a trigger (Trigger) in the source database. When the structure of an object changes, the trigger is automatically activated and records the DDL, and the records are applied to the target database through the MAP function, so as to achieve DDL replication and synchronization. However, trigger-based replication has a significant impact on database performance: the trigger fires on every change to an object structure, which adds an extra database operation each time and also requires additional logging to be enabled. This increases resource consumption and easily causes performance bottlenecks.
Disclosure of Invention
The present disclosure aims to solve, at least to some extent, one of the technical problems in the related art.
An embodiment of a first aspect of the present disclosure provides a data migration method, including:
In response to monitoring a DDL change event in a source database, acquiring first data associated with the DDL change event;
Dynamically analyzing a data dictionary corresponding to the source database to obtain a data structure view;
Judging whether preset conditions for migration and copy of the first data are met or not based on the data structure view and the first data;
and if the preset condition is met, performing DDL copying on the first data so as to migrate the first data to a target database.
Embodiments of a second aspect of the present disclosure provide a data migration apparatus, including:
The first acquisition module is used for responding to the detection of a DDL change event in the source database and acquiring first data associated with the DDL change event;
The second acquisition module is used for dynamically analyzing the data dictionary corresponding to the source database so as to obtain a data structure view;
the judging module is used for judging whether preset conditions for migration and copy of the first data are met or not based on the data structure view and the first data;
And the copy migration module is used for performing DDL copying on the first data to migrate to a target database if the preset condition is met.
An embodiment of a third aspect of the present disclosure proposes an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the program, implements the data migration method according to the embodiment of the first aspect of the present disclosure.
An embodiment of a fourth aspect of the present disclosure proposes a computer readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the data migration method according to the embodiment of the first aspect of the present disclosure.
An embodiment of a fifth aspect of the present disclosure proposes a computer program product, where instructions in the computer program product, when executed by a processor, implement the data migration method according to the embodiment of the first aspect of the present disclosure.
In the embodiment of the disclosure, first data associated with a DDL change event is acquired in response to monitoring the DDL change event in a source database; a data dictionary corresponding to the source database is then dynamically parsed to obtain a data structure view; whether a preset condition for migrating and copying the first data is met is then judged based on the data structure view and the first data; and if the preset condition is met, DDL replication is performed on the first data to migrate it to a target database. In this way, highly accurate and reliable replication of complex DDL logic can be achieved in situations involving frequent DDL changes during data migration.
Additional aspects and advantages of the disclosure will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the disclosure.
Drawings
The foregoing and/or additional aspects and advantages of the present disclosure will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of a data migration method according to an embodiment of the disclosure;
Fig. 2 is a flow chart of another data migration method according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of a data migration apparatus according to an embodiment of the disclosure;
Fig. 4 is a block diagram of an electronic device for implementing a data migration method proposed by an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary and intended for the purpose of explaining the present disclosure and are not to be construed as limiting the present disclosure.
The embodiment of the disclosure provides a data migration method, a data migration device, data migration equipment and a storage medium, which are used for at least solving one of the problems.
In the database field, logical incremental replication has been a widely practiced technique whose core concept is to capture altered data from a source database and apply the alterations to a target database, rather than just replicating the entire database file. Logical incremental replication is accomplished by analyzing the operation logs of the database, which log all changes to the database. Common database operations include DML (data manipulation language) and DDL (data definition language). DML mainly involves insertion, deletion and updating of data records, while DDL involves changes to database structures, such as creating, modifying and deleting tables or columns, and this proposal is mainly directed to the replication technique of DDL.
In the related art, a trigger is usually set in the source database to capture changes of an object structure. When the structure of an object changes, the trigger is automatically activated and records the DDL, and the records are applied to the target database through the MAP function, so as to achieve DDL replication and synchronization. Trigger-based replication has a significant impact on database performance: the trigger fires on every change to an object structure, which adds an extra database operation each time and also requires additional logging to be enabled, so resource consumption increases and performance bottlenecks arise easily. Moreover, when DDL is frequent at the source database end, capturing DDL with triggers can cause memory overflow and process anomalies. In addition, some databases limit or prohibit the use of triggers in business usage scenarios, so a trigger-based replication scheme is not universally applicable.
In the embodiment of the present disclosure, the data migration method may be executed by the data migration apparatus, which may be, for example, a server; this is not limited herein.
In view of the above problems, the present disclosure proposes a data migration method and apparatus.
Data migration methods and apparatuses according to embodiments of the present disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a data migration method according to an embodiment of the disclosure.
As shown in fig. 1, the data migration method may include the steps of:
Step 101, in response to monitoring a DDL change event in a source database, acquiring first data associated with the DDL change event.
The source database may be a currently monitored database.
DDL (Data Definition Language) operations are operations for defining database structures and schemas, and may include operations to create, modify, and delete database objects (e.g., tables, indexes, views, stored procedures, etc.). DDL operations are primarily used to manage the structure of a database, e.g., creating tables, defining fields, setting constraints, creating indexes, etc., and may directly affect the structure of the database.
A DDL change event refers to a change operation performed on the database structure, including, but not limited to, operations to create, modify, and delete database objects. When these operations are performed, the structure of the database may change, potentially affecting existing data stores and queries. Therefore, monitoring and logging DDL change events is important to ensure database integrity and stability.
It will be appreciated that DDL operations directly affect the structure and data of the database. By monitoring DDL change events in the source database through in-depth log analysis, finer details of database operations, such as change types, time stamps and associated users, can be captured.
Optionally, upon detection of a DDL change event, the operation may be captured by write-ahead logging (WAL), which ensures that every transaction is recorded in the log before the actual change takes effect. This not only ensures the integrity and atomicity of transactions, but also optimizes system performance and the speed of data recovery through buffering techniques and log flush strategies. To remain efficient, the system needs to identify and capture only the new changes that have occurred since the last check, also referred to as incremental data. This avoids reprocessing data that has already been processed. The captured DDL log need not contain only the original change; it can also be enriched with metadata such as the associated application and the context of the triggering event, providing a deeper perspective for subsequent analysis.
The first data may be metadata in the source database associated with the DDL change event. Optionally, metadata monitoring may be performed on the system catalog tables (System Catalog Tables) in the source database. As the centralized storage point for metadata, the system catalog tables are critical in a relational database management system (RDBMS) and store the logical structure information of the database. Continuous monitoring of the system catalog helps track the state of the entire data ecosystem.
The first data may include a Base table (Base Tables), an index (Indexes), and a view (Views) associated with the DDL change event, which are not limited herein. Alternatively, statistical analysis may be performed on the frequency, source, and time stamp of different DDL operations in addition to monitoring DDL change activity. This helps identify potential patterns or abnormal behavior and thus take appropriate action in time to ensure the stability and reliability of the database.
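The frequency analysis mentioned above can be sketched as a simple count per operation type and source. This is only a minimal illustration: the event fields (`type`, `source`, `timestamp`) and the fixed threshold are assumptions made for the example and are not specified in the disclosure.

```python
from collections import Counter

def ddl_frequency(events):
    """Count DDL events per (operation type, source) pair."""
    return Counter((event["type"], event["source"]) for event in events)

def unusually_frequent(events, threshold):
    """Return the (type, source) pairs seen more than `threshold` times."""
    counts = ddl_frequency(events)
    return sorted(pair for pair, count in counts.items() if count > threshold)

# Hypothetical captured events; a real system would read them from the DDL log.
sample_events = [
    {"type": "ALTER", "source": "app_user", "timestamp": 1.0},
    {"type": "ALTER", "source": "app_user", "timestamp": 2.0},
    {"type": "ALTER", "source": "app_user", "timestamp": 3.0},
    {"type": "CREATE", "source": "dba", "timestamp": 4.0},
]
flagged = unusually_frequent(sample_events, threshold=2)
```

A production system would more likely compare counts within a sliding time window against a learned baseline; the fixed threshold is used here only to keep the sketch short.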
Optionally, whether the first data is processed data may be determined according to the historical DDL transaction data, and if the first data is processed data, migration copy is stopped for the first data.
The historical DDL transaction data may be data generated during historical DDL change events, such as changes after a previous check. It will be appreciated that if the first data is processed data, it is indicated that the first data is not new altered data, the same or similar event has been generated before, and the system can avoid the repeated processing of the processed data.
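One way to decide whether the first data has already been processed is to compare each event against a checkpoint kept in the historical DDL transaction data. The sketch below assumes that every event carries a monotonically increasing log position; the `DdlEvent` and `DdlHistory` names and the `position` field are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DdlEvent:
    position: int      # assumed: monotonically increasing log position
    object_name: str
    statement: str

class DdlHistory:
    """Tracks the last log position whose DDL was already replicated."""

    def __init__(self, last_position: int = 0):
        self.last_position = last_position

    def is_processed(self, event: DdlEvent) -> bool:
        return event.position <= self.last_position

    def new_events(self, events):
        """Return unprocessed events in log order; processed ones are skipped."""
        fresh = [e for e in events if not self.is_processed(e)]
        return sorted(fresh, key=lambda e: e.position)

    def commit(self, event: DdlEvent) -> None:
        """Advance the checkpoint once an event has been replicated."""
        self.last_position = max(self.last_position, event.position)

history = DdlHistory(last_position=10)
captured = [
    DdlEvent(9, "orders", "ALTER TABLE orders ADD COLUMN note TEXT"),
    DdlEvent(12, "orders", "CREATE INDEX idx_note ON orders(note)"),
    DdlEvent(11, "users", "ALTER TABLE users ADD COLUMN age INTEGER"),
]
pending = history.new_events(captured)
```

Advancing the checkpoint only in `commit` means that an event whose replication fails is picked up again on the next pass.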
It should be noted that identifying and capturing incremental data changes is critical to system efficiency. By comparing against the state at the last check, the system can accurately capture new changes, avoid repeated operations on processed data, and improve processing efficiency and performance. With this incremental processing mode, the system can manage data changes more intelligently, reduce resource consumption and improve the efficiency and accuracy of data processing. This is significant for scenarios such as large-scale data processing, ETL (Extract, Transform, Load) tasks and data synchronization, and helps optimize system performance and overall data processing efficiency.
Step 102, dynamically parsing a data dictionary corresponding to the source database to obtain a data structure view.
The data structure view obtained by dynamically parsing the data dictionary may be the logical structure (Schema) corresponding to the source database.
In the field of databases, schema (or referred to as data Schema) refers to a logical view of structured data in a database, which defines metadata information such as data tables, columns, indexes, constraints, and the like, and describes the organization structure and relationship of the data in the database.
Specifically, by analyzing the data dictionary, the structure information of objects such as table, column, index and the like in the source database can be obtained, so that the logic structure and metadata of the database can be known.
Alternatively, during dynamic parsing, a system catalog table (System Catalog Tables) or metadata table may be queried, which contains definitions and attribute information for database objects. By executing queries against these system tables, detailed information about each object in the database, such as table name, column name, data type, constraints, etc., can be obtained.
Specifically, the real-time acquisition and updating of the database schema can be realized by dynamically analyzing the data dictionary without hard coding schema information, the scene of the change of the database structure can be dynamically adapted, and meanwhile, the management and adjustment of the schema information in the database management and data processing processes are also facilitated.
The system tables of the RDBMS store rich metadata covering all database objects, from tables and indexes to views. By parsing these tables in depth, a data structure view of the entire database architecture can be obtained. Database schemas may undergo many changes over time, and dynamic parsing ensures that these changes, whether small or significant, can be tracked in real time. Beyond individual database objects, the relationships between them are also critical. Dynamic parsing can reveal dependencies, constraints and other connections between objects, providing a multi-dimensional view of the database.
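As a concrete illustration of parsing a data dictionary into a data structure view, the sketch below reads SQLite's catalog (`sqlite_master`, `PRAGMA table_info` and `PRAGMA index_list`). SQLite is used only because it ships with Python; the disclosure does not name a specific database, and the shape of the returned dictionary is an assumption made for the example.

```python
import sqlite3

def build_structure_view(conn):
    """Build {table: {"columns": {...}, "indexes": [...]}} from the catalog."""
    view = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # index_list rows: (seq, name, unique, origin, partial)
        indexes = conn.execute(f'PRAGMA index_list("{table}")').fetchall()
        view[table] = {
            "columns": {
                col[1]: {"type": col[2], "not_null": bool(col[3]), "pk": bool(col[5])}
                for col in columns
            },
            "indexes": sorted(idx[1] for idx in indexes),
        }
    return view

# Small in-memory database standing in for the source database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
source.execute("CREATE INDEX idx_orders_amount ON orders(amount)")
structure_view = build_structure_view(source)
```

Because the view is rebuilt from the catalog on every call, no schema information is hard-coded, which matches the motivation for dynamic parsing given above.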
Step 103, based on the data structure view and the first data, judging whether preset conditions for migration and copy of the first data are met.
It should be noted that, by monitoring the DDL change event in the source database and comparing the first data with the data structure view, the change of the data structure can be found in time and the real-time performance and accuracy of the data dictionary can be ensured. If the data structure view and the first data are found to be out of synchronization, timely updating of the data dictionary may help to maintain consistency of the data dictionary with the actual state of the database.
Specifically, the data structure view and the first data may be compared, if the data structure view and the first data are synchronous, it is indicated that the preset condition for migration and copy of the first data is satisfied, and if the data structure view and the first data are not synchronous, the preset condition for migration and copy of the first data is not satisfied.
Specifically, the data dictionary obtained by dynamic parsing, that is, the data structure view, can be compared with the acquired first data to find the differences between the two, including new tables, new fields, changes to field attributes, and the like. This step may be performed automatically using scripts or programs. Whether the data dictionary and the DDL change transaction data are synchronized is then judged from the comparison result. If all changes in the DDL change transaction data are contained in the data dictionary, the two may be considered synchronized. If there is a discrepancy, the data dictionary needs to be updated.
Alternatively, if the data dictionary is found to be unsynchronized with DDL change transaction data, the data dictionary needs to be updated in time to ensure that it remains consistent with the database structure.
In actual operation, scripts or development programs can be written to automatically execute the steps, so that automatic comparison and synchronization of the data dictionary and the database structure are realized, and the efficiency and accuracy of data management are improved.
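The comparison described above can be sketched as follows. The representation of the data structure view (a mapping from table name to column types) and of the DDL changes (dictionaries with a `kind` field) is hypothetical; the disclosure does not prescribe a data format.

```python
def find_unsynced_changes(structure_view, ddl_changes):
    """Return the DDL changes that the structure view does not yet reflect."""
    unsynced = []
    for change in ddl_changes:
        columns = structure_view.get(change["table"])
        kind = change["kind"]
        if kind == "add_table":
            reflected = columns is not None
        elif kind == "add_column":
            reflected = columns is not None and change["column"] in columns
        elif kind == "alter_column_type":
            reflected = (
                columns is not None
                and columns.get(change["column"]) == change["type"]
            )
        elif kind == "drop_column":
            reflected = columns is not None and change["column"] not in columns
        else:
            reflected = False  # unknown change kinds are treated as unsynced
        if not reflected:
            unsynced.append(change)
    return unsynced

def meets_preset_condition(structure_view, ddl_changes):
    """The preset condition holds when every change is already in the view."""
    return not find_unsynced_changes(structure_view, ddl_changes)

view = {"orders": {"id": "INTEGER", "note": "TEXT"}}
changes = [
    {"kind": "add_column", "table": "orders", "column": "note", "type": "TEXT"},
    {"kind": "alter_column_type", "table": "orders", "column": "id", "type": "BIGINT"},
]
unsynced = find_unsynced_changes(view, changes)
```

Any change returned by `find_unsynced_changes` indicates that the data dictionary should be refreshed before replication proceeds.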
Step 104, if the preset condition is met, performing DDL copying on the first data to migrate to the target database.
Alternatively, the timestamp and the spatial mark of the log entry associated with the DDL change event may be obtained first, then the context information associated with the log entry in the source database may be determined according to the timestamp and the spatial mark of the log entry, and finally the migration copy may be performed on the first data based on the context information.
In addition, context-aware copying is required when dealing with DDL changes. Context-aware replication refers to capturing and replicating DDL changes while ensuring that the environment and related information in which the changes occur are taken into account and that the DDL changes are continuous in the timeline without omission or conflicts. Specifically, by assigning a timestamp to each DDL operation, it can be ensured that they are applied to the target database in the correct order, thereby maintaining data consistency. In addition, considering that different instances or users in the same database may make different changes to the same data structure, it is necessary to ensure that the replication policy can adapt to different spatial contexts, so that differences between different instances or users can be identified and processed in the replication process to ensure accuracy of data synchronization.
In particular, each DDL operation may be assigned a time stamp and spatial signature to ensure that the DDL operation is unique throughout the system, thereby properly replicating and applying the changes. The time stamp and the space mark also provide important reference information for subsequent analysis, debugging and optimization.
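A minimal sketch of ordering DDL operations by time stamp and space mark is shown below. It assumes that the space mark is a string identifying the instance and user, and it adds a per-mark sequence number as a tie-breaker; both choices are illustrative assumptions rather than details from the disclosure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DdlOperation:
    timestamp: float   # time at which the DDL was logged on the source
    space_mark: str    # assumed format: "instance/user"
    sequence: int      # assumed per-space-mark counter, breaks timestamp ties
    statement: str

    @property
    def key(self):
        """Identifier intended to be unique across the whole system."""
        return (self.timestamp, self.space_mark, self.sequence)

def replay_order(operations):
    """Drop duplicate captures and return operations in replay order."""
    unique = {op.key: op for op in operations}
    return [unique[key] for key in sorted(unique)]

captured_ops = [
    DdlOperation(2.0, "inst1/alice", 1, "ALTER TABLE orders ADD COLUMN note TEXT"),
    DdlOperation(1.0, "inst2/bob", 1, "CREATE TABLE orders (id INTEGER)"),
    DdlOperation(2.0, "inst1/alice", 1, "ALTER TABLE orders ADD COLUMN note TEXT"),
]
ordered = replay_order(captured_ops)
```

Sorting on the full key gives a deterministic order even when two instances log a change at the same instant, so the target database applies changes in the same sequence on every run.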
It will be appreciated that ensuring data integrity, consistency, and contextual relevance is a very important consideration in performing an actual DDL logical copy. The replication process needs to ensure that the data structures and content are accurately replicated to maintain consistency of the target system with the source system. This includes replication of elements of tables, columns, indexes, constraints, etc. When DDL changes are applied in the target database, the applications must be in the correct order and context. Network delay and bandwidth limitations need to be taken into account in the replication process. This may have an impact on the replication speed and performance, requiring corresponding optimization strategies such as increasing bandwidth, adjusting replication frequency, or using incremental replication, etc.
To ensure the smoothness and efficiency of the migration process, data conversion, caching and optimization techniques may be used to handle data format conversion and performance optimization during data migration, improving the efficiency and accuracy of the replication process.
In the embodiment of the disclosure, first data associated with a DDL change event is acquired in response to monitoring the DDL change event in a source database; a data dictionary corresponding to the source database is then dynamically parsed to obtain a data structure view; whether a preset condition for migrating and copying the first data is met is then judged based on the data structure view and the first data; and if the preset condition is met, DDL replication is performed on the first data to migrate it to a target database. In this way, highly accurate and reliable replication of complex DDL logic can be achieved in situations involving frequent DDL changes during data migration.
Fig. 2 is a flow chart of another data migration method according to an embodiment of the disclosure.
As shown in fig. 2, the data migration method may include the steps of:
Step 201, based on a change data capture technique and a direct base table scan, obtaining first data associated with a DDL change event from a system target table of a source database.
It is important to integrate data in real time and to keep the integrated data current. In embodiments of the present disclosure, a change data capture (Change Data Capture, CDC) technique may be employed to help capture all data changes in the source database in real time.
To prevent missed or delayed changes under high load, embodiments of the present disclosure combine CDC with direct base table scanning (Direct Base Table Scanning, DBTS) to compensate for the shortcomings of CDC. By combining CDC and direct base table scanning, full-view monitoring of the source database can be realized, ensuring the continuity, consistency and accuracy of data.
CDC can capture most data changes and report insert, update and delete operations in real time, while direct base table scanning scans the base tables to catch changes that CDC may have missed. Together they provide more comprehensive data monitoring, so that any data change can be captured and synchronized in time.
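The way a direct base table scan can supplement CDC may be sketched as a reconciliation step: the state implied by the CDC events is compared with the state actually found by the scan, and any disagreement is reported as a missed change. The event format and the use of plain dictionaries for object definitions are assumptions made for this illustration.

```python
def apply_cdc(known_state, cdc_events):
    """Return the state expected after applying the captured CDC events."""
    state = dict(known_state)
    for event in cdc_events:
        if event["op"] == "delete":
            state.pop(event["name"], None)
        else:  # "insert" or "update"
            state[event["name"]] = event["definition"]
    return state

def changes_missed_by_cdc(known_state, cdc_events, scanned_state):
    """Compare the CDC-derived state with a direct scan of the base tables."""
    expected = apply_cdc(known_state, cdc_events)
    missed = []
    for name in sorted(expected.keys() | scanned_state.keys()):
        if expected.get(name) != scanned_state.get(name):
            missed.append({
                "name": name,
                "expected": expected.get(name),
                "actual": scanned_state.get(name),
            })
    return missed

known = {"orders": "v1"}
cdc = [{"op": "update", "name": "orders", "definition": "v2"}]
scanned = {"orders": "v2", "audit_log": "v1"}  # audit_log never appeared in CDC
missed = changes_missed_by_cdc(known, cdc, scanned)
```

In this sketch the scan result is treated as authoritative, so each reported entry can be fed back into the replication pipeline as if CDC had delivered it.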
Meanwhile, in order to ensure the real-time performance of the data, technologies such as compression, segmented transmission and the like can be adopted to optimize the data transmission efficiency in the data integration process, and parallel processing is used to accelerate the data processing speed. Optionally, a real-time monitoring system can be established to timely detect and alarm data synchronous delay or abnormal conditions so as to timely take measures for repairing.
Further, as service requirements and data volume grow, horizontal scaling and load balancing techniques may be considered to improve the throughput and concurrent processing capacity of the system.
It should be noted that the base table (Base Table) holds the main information stored in the database, and analyzing it in depth provides not only the surface information of the data but also reveals the complex structures, relationships and patterns behind the data. In-depth analysis of the base tables helps to better understand the nature, source and flow of the data. In addition, in-depth base table analysis can help predict future data changes, providing strong support for data management and optimization.
Step 202, dynamically analyzing a data dictionary corresponding to the source database to obtain a data structure view.
It should be noted that, the specific implementation of step 202 may refer to the above embodiments, and will not be described herein.
Step 203, comparing the data structure view with the first data to judge whether the data dictionary matches the actual state of the source database.
Specifically, the first data and the data structure view may be compared to check whether there is any difference or inconsistency, for example by comparing table structures, column definitions, index information, etc. If no difference or inconsistency is found, the data dictionary matches the actual state of the source database; otherwise, the data dictionary does not match.
Step 204, if the data dictionary does not match the actual state of the source database, updating the data dictionary.
Specifically, if the data dictionary does not match the actual state of the source database, corresponding measures may be taken to synchronize the data dictionary with the database state, such as updating metadata, recreating a table structure, adjusting an index, and the like.
Step 205, if the actual states of the data dictionary and the source database are matched, the preset condition for performing migration copy on the first data is satisfied.
Step 206, comparing the source database and the target database to determine whether there is a difference between the logical mode and the physical mode.
It is understood that in some DBMSs there may be logical-physical schema discrepancies (Logical-Physical Schema Discrepancies). Through base table analysis, logical-physical data synchronization (Logical-Physical Data Synchronization) can be realized, ensuring that the whole replication process shows no differences at either the logical or the physical level.
Alternatively, metadata information for the source database and the target database, such as table structures, indexes, constraints, triggers, etc., may be collected first. Then, the metadata information of the two databases can be compared to find out the difference between the logical mode and the physical mode, for example, the table structure, the index, the constraint and the like can be compared to determine whether the difference exists.
It should be noted that in the database field, differences in logical and physical data representations may lead to inconsistencies and confusion in the data. Wherein the logical schema describes how the data is understood and defines the structure, relationships, and constraints of the data. While the physical schema describes how data is organized on the storage device, including index, partition, storage format, and the like. Differences between logical and physical patterns may lead to inconsistencies in the data at different levels.
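The metadata comparison between the source and target databases can be sketched as below. Following the distinction drawn above, column definitions are treated as part of the logical schema and indexes as part of the physical schema; the dictionary layout used for the metadata is a hypothetical format chosen for the example.

```python
def schema_differences(source_meta, target_meta):
    """Return the tables that differ at the logical and at the physical level."""
    differences = {"logical": [], "physical": []}
    for table in sorted(source_meta.keys() | target_meta.keys()):
        src = source_meta.get(table, {})
        tgt = target_meta.get(table, {})
        if src.get("columns") != tgt.get("columns"):
            differences["logical"].append(table)
        if src.get("indexes") != tgt.get("indexes"):
            differences["physical"].append(table)
    return differences

source_meta = {
    "orders": {"columns": {"id": "INTEGER", "note": "TEXT"}, "indexes": ["idx_note"]},
    "users": {"columns": {"id": "INTEGER"}, "indexes": []},
}
target_meta = {
    "orders": {"columns": {"id": "INTEGER", "note": "TEXT"}, "indexes": []},
    "users": {"columns": {"id": "INTEGER"}, "indexes": []},
}
diffs = schema_differences(source_meta, target_meta)
```

Keeping the two kinds of difference separate lets a later synchronization step choose a different remedy for each, such as altering a table versus rebuilding an index.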
Optionally, the consistency and validity of logical and physical data can be ensured through a logical-physical data synchronization technique that provides data comparison, difference analysis and automatic repair. It will be appreciated that data comparison can detect differences between logical and physical data and determine the portions that need to be synchronized. Difference analysis helps to understand the discrepancies between the logical and physical schemas and to find potential problems and inconsistencies. Automatic repair can synchronize logical and physical data automatically to ensure that they are completely consistent. Therefore, by making the logical and physical levels conform, consistency and rationality problems in large-scale and distributed database environments can be solved, and the reliability of the data is improved.
Step 207, if there is a discrepancy, executing the synchronization operation of the logical view and the physical view according to the discrepancy.
Optionally, a synchronization plan for the logical view and the physical view can be formulated according to the differences. For example, differences in the logical schema may require adjusting table structures, adding constraints, or modifying triggers; differences in the physical schema may require reorganizing the storage structure, adjusting indexes, or optimizing query performance.
Further, the synchronization of the logical view and the physical view may be performed according to the synchronization plan. The database may be backed up before the synchronization operation is performed, to guard against data loss or inconsistency if an operation fails. After the synchronization operation is performed, verification may be performed to ensure consistency of the logical and physical schemas. After the synchronization operation is completed, a monitoring mechanism can be established to periodically check the consistency of the logical and physical schemas and to handle potential differences in a timely manner.
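The backup, synchronize, and verify sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions: SQLite stands in for the database, the synchronization plan is a list of DDL statements, and the function name `synchronize` and its parameters are hypothetical rather than taken from the disclosure.

```python
import sqlite3


def synchronize(conn, plan, expected_columns):
    """Apply the DDL statements in `plan`; restore the backup on failure.

    `expected_columns` maps a table name to the set of column names that
    must exist after synchronization. Returns True on success and False
    after the backup has been restored.
    """
    backup = sqlite3.connect(":memory:")
    conn.backup(backup)  # snapshot taken before the schema is touched
    try:
        for statement in plan:
            conn.execute(statement)
        conn.commit()
        # Verification step: every expected column must now be present.
        for table, columns in expected_columns.items():
            rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            actual = {row[1] for row in rows}
            if not columns <= actual:
                raise RuntimeError(f"verification failed for {table}")
        return True
    except (sqlite3.Error, RuntimeError):
        conn.rollback()
        backup.backup(conn)  # copy the snapshot back over the live database
        return False
    finally:
        backup.close()
```

The periodic monitoring mentioned above could reuse the same verification loop on a schedule; that part is omitted here.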
Step 208, if the preset condition is met, performing DDL copying on the first data to migrate to the target database.
It should be noted that, for the specific implementation of step 208, reference may be made to the above embodiment; details are not repeated here.
Optionally, error detection and correction can be performed on the migration copy process of the first data.
It should be noted that, in a large database system, a small error in the transmission and mapping of data may cause a large loss. Therefore, during data mapping, the base table data can be examined and analyzed in real time to ensure the integrity and correctness of the data. This process can detect surface errors in the data and can also examine the structure and logic of the data to identify deeper errors. For example, precision data mapping can be used to ensure that each step in the data transfer process is accurate and that each data point exactly matches its source data. If any error or mismatch is found in this process, the system immediately raises an alert and automatically initiates the correction procedure. In this way, the efficiency of data processing is improved and the quality and reliability of the data are ensured.
Further, in the case that it is determined that the first data has been migrated to the target database, the server may verify the target database and return a verification result.
After the replication and migration of the data, it is important to ensure that the new database state is consistent with the expected state, so the correctness of the data replication can be verified. For example, an integrity check may be performed on the database, including an in-depth check of each portion of the database (tables, indexes, triggers, etc.) to confirm the data structure, data accuracy, and integrity. Optionally, data integrity verification techniques such as hash comparison and checkpoint comparison may be used to ensure that each portion of the data matches completely.
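The hash comparison mentioned above can be sketched as follows. This is a minimal illustration, assuming SQLite connections for both databases and SHA-256 as the hash function; the names `table_digest` and `mismatched_tables` are hypothetical, and rows are sorted in memory, which would not suit very large tables.

```python
import hashlib
import sqlite3


def table_digest(conn, table):
    """Return a SHA-256 digest over all rows of `table` in a stable order."""
    digest = hashlib.sha256()
    rows = conn.execute(f'SELECT * FROM "{table}"').fetchall()
    # Sorting makes the digest independent of physical row order.
    for row in sorted(rows, key=repr):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def mismatched_tables(source, target, tables):
    """Return the tables whose contents differ between source and target."""
    return [
        table
        for table in tables
        if table_digest(source, table) != table_digest(target, table)
    ]
```

An empty result would indicate that every listed table matches; any table returned would be a candidate for the correction procedure.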
Optionally, it may also be verified that all external dependencies, including external systems and API endpoints, are satisfied and that none was missed during data replication and migration, so as to avoid problems.
Optionally, the verification result may be a comprehensive, detailed report, which may include, for example, all replicated DDL operations, the verification result of each step, the problems found, and their solutions. This helps teams understand and resolve problems and ensures the transparency and traceability of the entire migration process.
In the embodiment of the disclosure, first data associated with a DDL change event is obtained from a system target table of a source database based on a change data capture technique and a direct base table scan. A data dictionary corresponding to the source database is then dynamically parsed to obtain a data structure view, and the data structure view is compared with the first data to determine whether the data dictionary matches the actual state of the source database. If the data dictionary does not match the actual state of the source database, the data dictionary is updated; if it matches, the preset condition for migration copy of the first data is satisfied. The source database and the target database can then be compared to determine whether a difference exists between the logical schema and the physical schema, and if a difference exists, a synchronization operation of the logical view and the physical view is executed according to the difference. Finally, if the preset condition is satisfied, DDL copying is performed on the first data to migrate it to the target database. Dynamically parsing the data dictionary thus ensures that it matches the actual state of the source database, so that changes to the data structure are managed effectively and the accuracy of the data dictionary is maintained. Comparing the source database and the target database identifies differences between the logical schema and the physical schema, and executing the synchronization operation of the logical view and the physical view automatically maintains consistency between the two databases.
When the preset condition is satisfied, DDL copying is performed on the first data to migrate it to the target database. This realizes the migration and replication of the data, enhances data consistency and integrity between the source database and the target database, and reduces the possibility of data inconsistency and errors. Highly accurate and reliable replication of complex DDL logic can thus be ensured in situations involving frequent DDL changes during data migration.
In summary, the present application provides the following beneficial effects:
1. Higher real-time performance: replication is achieved by directly analyzing the database log or base table, rather than waiting for triggers to fire. Log analysis or base table analysis can capture any change in the database immediately, without waiting for a data change event to fire a specific trigger, thereby improving real-time performance.
2. Better performance and reduced overhead: the trigger need not be set and activated for each table or data change. Triggers may cause performance bottlenecks in high concurrency environments, while directly analyzing database logs or base tables may reduce additional computational and storage overhead.
3. Improved data consistency: data changes are captured directly from the root source (e.g., base table or log). Trigger-based methods may not ensure consistency of data in a high concurrency environment, while capturing changes from the root may ensure integrity of the data.
4. Simplified system complexity and maintenance: the database activity is centrally analyzed without the need to create and maintain triggers for each table and each type of data modification. The number and complexity of the triggers is reduced, thereby reducing the risk of errors and maintenance costs.
5. Enhanced extensibility: because of the direct analysis of the database activity, the proposed method is more scalable when the database is changed. When the database table or structure is changed, the trigger does not need to be adjusted or re-created, so that the system is more flexible and has strong adaptability.
Fig. 3 is a schematic structural diagram of a data migration apparatus according to an embodiment of the disclosure.
As shown in fig. 3, the data migration apparatus 300 includes:
a first obtaining module 310, configured to obtain, in response to monitoring a DDL change event in a source database, first data associated with the DDL change event;
a second obtaining module 320, configured to dynamically parse the data dictionary corresponding to the source database to obtain a data structure view;
a judging module 330, configured to judge, based on the data structure view and the first data, whether a preset condition for performing migration copy on the first data is satisfied; and
a copy migration module 340, configured to perform DDL copying on the first data to migrate it to the target database if the preset condition is satisfied.
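The cooperation of the four modules can be sketched as follows. This is only an illustrative wiring, assuming that each module is supplied as a plain callable; the class name `DataMigrationApparatus`, the method `on_ddl_change`, and the parameter names are hypothetical and are not taken from the disclosure.

```python
from typing import Any, Callable


class DataMigrationApparatus:
    """Hypothetical wiring of modules 310 to 340 as injected callables."""

    def __init__(
        self,
        first_obtaining: Callable[[Any], Any],
        second_obtaining: Callable[[], Any],
        judging: Callable[[Any, Any], bool],
        copy_migration: Callable[[Any], None],
    ) -> None:
        self.first_obtaining = first_obtaining
        self.second_obtaining = second_obtaining
        self.judging = judging
        self.copy_migration = copy_migration

    def on_ddl_change(self, event: Any) -> bool:
        """Handle one DDL change event; return True if it was replicated."""
        first_data = self.first_obtaining(event)          # module 310
        structure_view = self.second_obtaining()          # module 320
        if not self.judging(structure_view, first_data):  # module 330
            return False
        self.copy_migration(first_data)                   # module 340
        return True
```

Passing the modules in as callables keeps the sketch independent of any particular database; a real apparatus would bind them to the source and target databases.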
Optionally, the first obtaining module is further configured to:
judge, according to historical DDL transaction data, whether the first data is processed data; and
stop migration copy of the first data if the first data is processed data.
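The check against historical DDL transaction data can be sketched as follows. This is a minimal illustration, assuming that each DDL event carries a transaction identifier and a statement and that the history is kept as a set of fingerprints; the field names and the functions `ddl_fingerprint` and `filter_unprocessed` are hypothetical.

```python
import hashlib


def ddl_fingerprint(event):
    """Derive a stable key from a DDL event (field names are assumptions)."""
    raw = f"{event['transaction_id']}|{event['statement']}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def filter_unprocessed(events, history):
    """Return the events not yet in `history`, recording each one that passes."""
    pending = []
    for event in events:
        key = ddl_fingerprint(event)
        if key in history:
            continue  # processed data: migration copy is stopped for this event
        history.add(key)
        pending.append(event)
    return pending
```

Replaying the same batch against the same history would then yield no pending events, which keeps the replication idempotent.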
Optionally, the judging module is specifically configured to:
compare the data structure view with the first data to judge whether the data dictionary matches the actual state of the source database;
update the data dictionary if the data dictionary does not match the actual state of the source database; and
determine that the preset condition for migration copy of the first data is satisfied if the data dictionary matches the actual state of the source database.
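The matching logic of the judging module can be sketched as follows. This is a minimal illustration, assuming that the data dictionary maps each table name to its ordered column list and that the first data reports the table and its columns as observed in the source database; the function name `evaluate_precondition` and the field names are hypothetical.

```python
def evaluate_precondition(dictionary, first_data):
    """Return True when the dictionary already reflects `first_data`.

    On a mismatch the dictionary is updated in place and False is returned,
    so that the caller can check again before replicating.
    """
    table = first_data["table"]
    actual = list(first_data["columns"])
    # The entry parsed from the dictionary plays the role of the
    # data structure view for this table.
    view = list(dictionary.get(table, []))
    if view == actual:
        return True
    dictionary[table] = actual
    return False
```

In this sketch a second call with the same first data returns True, which corresponds to the preset condition being satisfied once the dictionary has been brought up to date.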
Optionally, the judging module is further configured to:
acquire a timestamp and a spatial mark of a log entry associated with the DDL change event;
determine context information associated with the log entry in the source database according to the timestamp and the spatial mark of the log entry; and
perform migration copy on the first data based on the context information.
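One possible way to derive context information from a timestamp and a spatial mark is sketched below. The disclosure does not define the form of the spatial mark at this point, so the sketch assumes it is a label such as a schema or tablespace name, and it treats earlier log entries with the same label inside a time window as the context; `LogEntry`, `context_for`, and the window length are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    timestamp: float  # seconds, e.g. since the epoch
    space: str        # assumed form of the spatial mark
    statement: str


def context_for(entry, log, window=5.0):
    """Return earlier entries sharing the spatial mark within `window` seconds."""
    related = [
        other
        for other in log
        if other.space == entry.space
        and 0 < entry.timestamp - other.timestamp <= window
    ]
    # Oldest first, so that the context can be replayed in order.
    return sorted(related, key=lambda other: other.timestamp)
```

The entries returned could then be used to decide the order or scope in which the first data is migrated.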
Optionally, the first obtaining module is specifically configured to:
Based on a change data capture technique and a direct base table scan, first data associated with the DDL change event is obtained from a system target table of the source database.
Optionally, the apparatus further comprises:
a correction module, configured to perform error detection and correction on the migration copy process of the first data; and
a verification module, configured to verify the target database and return a verification result in response to determining that the first data has been migrated to the target database.
Optionally, the copy migration module 340 is further configured to:
compare the source database and the target database to determine whether there is a difference between the logical schema and the physical schema; and
execute, if the difference exists, the synchronization operation of the logical view and the physical view according to the difference.
In the embodiment of the disclosure, first data associated with a DDL change event is acquired in response to monitoring the DDL change event in a source database. A data dictionary corresponding to the source database is then dynamically parsed to obtain a data structure view, and whether a preset condition for migration copy of the first data is satisfied is judged based on the data structure view and the first data. If the preset condition is satisfied, DDL copying is performed on the first data to migrate it to a target database. Thus, highly accurate and reliable replication of complex DDL logic can be achieved in situations involving frequent DDL changes during data migration.
In order to implement the above embodiment, the present application further proposes an electronic device, as shown in fig. 4, and fig. 4 is a block diagram of an electronic device for data migration according to an exemplary embodiment.
As shown in fig. 4, the electronic device 800 includes:
A memory 810, a processor 820, and a bus 830 connecting the different components (including the memory 810 and the processor 820), where the memory 810 stores a computer program that, when executed by the processor 820, implements the data migration method described in the embodiments of the present disclosure.
Bus 830 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 800 typically includes a variety of electronic device readable media. Such media can be any available media that is accessible by electronic device 800 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 810 may also include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 840 and/or cache memory 850. Electronic device 800 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 860 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard disk drive"). Although not shown in fig. 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 830 through one or more data medium interfaces. Memory 810 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the various embodiments of the disclosure.
A program/utility 880 having a set (at least one) of program modules 870 may be stored, for example, in memory 810, such program modules 870 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 870 generally perform the functions and/or methods in the embodiments described in this disclosure.
The electronic device 800 may also communicate with one or more external devices 890 (e.g., keyboard, pointing device, display, etc.), one or more devices that enable a user to interact with the electronic device 800, and/or any device (e.g., network card, modem, etc.) that enables the electronic device 800 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 892. Also, electronic device 800 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 893. As shown in fig. 4, network adapter 893 communicates with other modules of electronic device 800 over bus 830. It should be appreciated that although not shown in fig. 4, other hardware and/or software modules may be used in connection with electronic device 800, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
Processor 820 executes various functional applications and data processing by executing programs stored in memory 810.
It should be noted that, the implementation process and the technical principle of the electronic device in this embodiment refer to the foregoing explanation of the data migration method in the embodiment of the disclosure, and are not repeated herein.
In order to achieve the above embodiments, the present application further proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data migration method described in the above embodiments.
To achieve the above embodiments, the present disclosure further provides a computer program product; when instructions in the computer program product are executed by a processor, the data migration method described in the above embodiments is performed.
In the description of this specification, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, the different embodiments or examples described in this specification, and the features of the different embodiments or examples, may be combined by those skilled in the art without contradiction.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (10)

1. A method of data migration, comprising:
In response to monitoring a DDL change event in a source database, acquiring first data associated with the DDL change event;
Dynamically analyzing a data dictionary corresponding to the source database to obtain a data structure view;
Judging whether preset conditions for migration and copy of the first data are met or not based on the data structure view and the first data;
and if the preset condition is met, performing DDL copying on the first data so as to migrate the first data to a target database.
2. The method of claim 1, further comprising, after the acquiring the first data associated with the DDL change event:
Judging whether the first data is processed data or not according to historical DDL transaction data;
and stopping migration and copying of the first data if the first data is processed data.
3. The method of claim 1, wherein the determining whether a preset condition for migration replication of the first data is satisfied based on the data structure view and the first data comprises:
Comparing the data structure view with the first data to judge whether the data dictionary is matched with the actual state of the source database;
if the data dictionary is not matched with the actual state of the source database, updating the data dictionary;
and if the actual states of the data dictionary and the source database are matched, meeting the preset condition of migration and copy of the first data.
4. The method of claim 1, further comprising, after the predetermined condition for migration replication of the first data is satisfied if the actual states of the data dictionary and the source database match:
Acquiring a timestamp and a spatial mark of a log entry associated with the DDL change event;
Determining context information associated with the log entry in the source database according to the timestamp and the spatial mark of the log entry;
and performing migration copy on the first data based on the context information.
5. The method of claim 1, wherein the obtaining the first data associated with the DDL change event comprises:
Based on a change data capture technique and a direct base table scan, first data associated with the DDL change event is obtained from a system target table of the source database.
6. The method as recited in claim 1, further comprising:
performing error detection and correction on the migration copy process of the first data; and
in response to determining that the first data has been migrated into the target database, verifying the target database and returning a verification result.
7. The method of claim 1, further comprising, prior to said migration replicating said first data:
comparing the source database and the target database to determine whether there is a difference between the logical schema and the physical schema;
if the difference exists, the synchronization operation of the logical view and the physical view is executed according to the difference.
8. A data migration apparatus, comprising:
The first acquisition module is used for responding to the detection of a DDL change event in the source database and acquiring first data associated with the DDL change event;
The second acquisition module is used for dynamically analyzing the data dictionary corresponding to the source database so as to obtain a data structure view;
the judging module is used for judging whether preset conditions for migration and copy of the first data are met or not based on the data structure view and the first data;
And the copy migration module is used for performing DDL copying on the first data to migrate to a target database if the preset condition is met.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the data migration method of any one of claims 1-7 when executing the program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the data migration method according to any one of claims 1-7.
CN202410280736.2A 2024-03-12 2024-03-12 Data migration method, device, electronic equipment and storage medium Pending CN118170746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410280736.2A CN118170746A (en) 2024-03-12 2024-03-12 Data migration method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410280736.2A CN118170746A (en) 2024-03-12 2024-03-12 Data migration method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN118170746A (en) 2024-06-11

Family

ID=91348076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410280736.2A Pending CN118170746A (en) 2024-03-12 2024-03-12 Data migration method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN118170746A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination