CN107122361A - Data migration system and method

Info

Publication number: CN107122361A
Application number: CN201610102852.0A
Authority: CN
Inventors: 付大超
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2016-02-24
Filing date: 2016-02-24
Publication date: 2017-09-01
Anticipated expiration: 2036-02-24
Also published as: CN107122361B

Abstract

This application discloses a data migration system and method. It provides a general-purpose data migration system that supports both homogeneous and heterogeneous databases and can perform full migration whether the source database is shut down or not. It also provides an incremental migration function for the non-shutdown state, so that newly added data is synchronized in time.

Description

Data migration system and method

Technical Field

The present application belongs to the field of internet technologies, and in particular, relates to a data migration system and method.

Background

Typical data migration systems at present accomplish data migration through full migration or incremental migration. For example, Oracle GoldenGate is log-based structured data replication software: it obtains incremental data changes by parsing the online log or archive log of a source database and applies those changes to a target database, thereby keeping the source database and the target database synchronized. GoldenGate can replicate large volumes of data in real time, with sub-second latency, across heterogeneous IT infrastructures (including almost all common operating system platforms and database platforms), so it can be applied in scenarios such as emergency systems, online reporting, real-time data warehouse feeding, transaction tracking, data synchronization, centralization/distribution, and disaster recovery. GoldenGate's data integration technique consists mainly of three processes: a data extraction process, a transmission process, and an application process.

The data extraction process reads and parses the online log or archive log of the source database, extracts data changes such as insertions, deletions, and modifications, converts the related information into a custom intermediate format, and stores it in a queue file; the transmission process then sends the queue file to the target database over TCP/IP. Each time the data extraction process finishes reading the data changes in the log and the data has been transmitted to the target system, it writes a checkpoint that records the log position extracted so far. The checkpoint allows the data extraction process, after being suspended and resumed, to continue replicating the log from the checkpoint position. The target database receives the data changes and buffers them in a queue, which is a series of files that temporarily store the data changes until the application process reads them. The application process reads the data changes from the queue, creates the corresponding SQL statements, and executes them through the native interface of the database; after the statements are committed to the database successfully, it updates its own checkpoint to record the position up to which replication has completed.

There are also some similar tools, such as Microsoft's SQL Server Migration Assistant (SSMA), a tool set published by Microsoft to help customers migrate more easily from Oracle, Sybase, MySQL, or Access to SQL Server and SQL Azure.

Microsoft's SQL Server Migration Assistant is not a general-purpose data migration system, and Oracle's GoldenGate provides only incremental migration, not full migration.

Disclosure of Invention

In view of the above, the present application provides a data migration system and method to solve the technical problem that a general data migration system supporting full migration between various homogeneous and heterogeneous databases is absent in the prior art.

In order to solve the above technical problem, the present application discloses a data migration system, including: an entry (Portal) module, an Application Programming Interface (API) module, a distributed scheduling module, and a full migration module; the entry module is used for providing an entry for managing migration tasks, and for receiving the migration tasks created by a client together with the designated first database, second database and data to be migrated of the first database, wherein the migration tasks comprise full migration;

the application programming interface module is used for providing a service interface for the client so as to call a corresponding full migration module according to the migration type in the migration task;

the distributed scheduling module is used for monitoring the execution state of each migration task and scheduling the migration tasks according to the current load of each device so as to realize load balance of each device;

the full migration module is configured to migrate the data to be migrated from the first database to the second database, convert the data type of the data to be migrated into the data type of the second database through a preset intermediate format when the first database and the second database are heterogeneous, and write the data to be migrated into the second database according to the converted data type.
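The conversion through a preset intermediate format described above can be sketched as follows. This is a minimal illustration only: the two lookup tables, the intermediate type names (DECIMAL, STRING, DATETIME) and the function name convert_type are assumptions made for the example and are not taken from the application.

```python
# Hypothetical type tables: source type -> intermediate type -> target type.
TO_INTERMEDIATE = {
    "oracle": {"NUMBER": "DECIMAL", "VARCHAR2": "STRING", "DATE": "DATETIME"},
    "mysql": {"DECIMAL": "DECIMAL", "VARCHAR": "STRING", "DATETIME": "DATETIME"},
}
FROM_INTERMEDIATE = {
    "oracle": {"DECIMAL": "NUMBER", "STRING": "VARCHAR2", "DATETIME": "DATE"},
    "mysql": {"DECIMAL": "DECIMAL", "STRING": "VARCHAR", "DATETIME": "DATETIME"},
}


def convert_type(source_db, target_db, source_type):
    """Convert a column type from source_db to target_db via the intermediate format."""
    if source_db == target_db:
        # Homogeneous databases: the type is kept as it is.
        return source_type
    intermediate = TO_INTERMEDIATE[source_db][source_type]
    return FROM_INTERMEDIATE[target_db][intermediate]
```

One reason to route every conversion through an intermediate format is that each supported database then needs only one table into the format and one table out of it, instead of a separate table for every pair of databases.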

Wherein the system further comprises a first migration evaluation unit,

when the migration task comprises migration evaluation, the first migration evaluation unit receives the call of the application programming interface module and is used for outputting an evaluation report according to the first database and the second database before the full migration module migrates the data to be migrated, wherein the evaluation report comprises space occupation information, database, table, column and object information, executed SQL and processing performance information of the first database, and the evaluation report further comprises incompatible feature information and modification cost information of the second database.

Wherein the system further comprises a structure migration module,

when the migration task comprises structure migration, the structure migration module receives the call of the application programming interface module and is used for migrating the databases, tables, columns and objects used by the data to be migrated to the second database before the full migration module performs data migration; when the first database and the second database are heterogeneous, it determines the types of those databases, tables, columns and objects in the second database through a preset type mapping relation, and creates the databases, tables, columns and objects of the corresponding types in the second database.
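As a rough sketch of structure migration between heterogeneous databases, the following builds a CREATE TABLE statement for the second database from a preset type mapping. The mapping entries, the column tuple layout and the function name build_create_table are illustrative assumptions; a real structure migration would also cover indexes, constraints and other objects.

```python
# Hypothetical preset mapping from source column types to target column types.
TYPE_MAPPING = {"NUMBER": "DECIMAL", "VARCHAR2": "VARCHAR", "DATE": "DATETIME"}


def build_create_table(table, columns):
    """Build target DDL; columns is a list of (name, source_type, length or None)."""
    parts = []
    for name, source_type, length in columns:
        target_type = TYPE_MAPPING[source_type]
        if length is not None:
            target_type = f"{target_type}({length})"
        parts.append(f"{name} {target_type}")
    return f"CREATE TABLE {table} ({', '.join(parts)})"
```

For example, build_create_table("t", [("id", "NUMBER", None), ("name", "VARCHAR2", 32)]) yields a statement in which the two columns are declared as DECIMAL and VARCHAR(32).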

Wherein the system further comprises a second migration evaluation unit,

when the migration task includes migration evaluation, the second migration evaluation unit receives a call of the application programming interface module, and is configured to output an evaluation report according to the first database and the second database before the structure migration module migrates the library table columns and the objects used by the data to be migrated, where the evaluation report includes space occupation information, library table columns and object information, executed SQL and processing performance information of the first database, and the evaluation report further includes incompatible characteristic information and modification cost information of the second database.

Wherein,

the distributed scheduling module schedules the migration tasks by counting the number of units being executed on each device in the data migration system and the memory occupancy rate of each device, so as to realize load balance among the devices.
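One possible reading of this load-based scheduling is sketched below: a new migration task is assigned to the device that is executing the fewest units, skipping devices whose memory occupancy is too high. The Device fields, the 0.9 memory cap and the tie-breaking rule are assumptions made for the example; the application does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    running_units: int       # units currently executing on the device
    memory_occupancy: float  # fraction of memory in use, 0.0 to 1.0


def pick_device(devices, max_memory=0.9):
    """Return the least loaded device below the memory cap, or None if none qualifies."""
    eligible = [d for d in devices if d.memory_occupancy < max_memory]
    if not eligible:
        return None
    # Fewest running units first; memory occupancy breaks ties.
    return min(eligible, key=lambda d: (d.running_units, d.memory_occupancy))
```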

Wherein,

and when the full migration module writes data into the second database, it records the position information of the written data so that an interrupted transfer can resume from the breakpoint.
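Recording the position of written data for resuming from a breakpoint might look like the following sketch. It assumes the rows are read in a stable order and that the position is a simple row offset kept in a dictionary; a real system would persist the position durably. The names migrate_with_checkpoint, write and checkpoint are hypothetical.

```python
def migrate_with_checkpoint(rows, write, checkpoint, batch_size=2):
    """Copy rows in batches, resuming from checkpoint["position"] if it is present.

    rows: list of rows in a stable order; write: callable that writes one batch
    to the target; checkpoint: dict that stands in for durable position storage.
    """
    start = checkpoint.get("position", 0)
    for i in range(start, len(rows), batch_size):
        batch = rows[i:i + batch_size]
        write(batch)
        # The position is advanced only after the batch was written successfully,
        # so a failure in write() leaves the checkpoint at the last good batch.
        checkpoint["position"] = i + len(batch)
```

If write raises partway through, calling the function again with the same checkpoint continues from the first unwritten batch instead of starting over.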

Wherein the full-scale migration module is further to,

and when the first database and the second database in the full migration task are homogeneous, migrating the data to be migrated according to the current structure of the databases, tables and columns.

Wherein the full-scale migration module is further to,

and when the first database and the second database in the full migration task are heterogeneous, for data migration between heterogeneous databases, completing conversion of data types between the heterogeneous databases according to a preset intermediate format.

Wherein the full-scale migration module is further to,

in the process of full migration, if the database names, table names or column names of the first database are inconsistent with those of the second database, mapping is carried out according to preset mapping rules for databases, tables and columns, and the data is written into the corresponding database, table and column of the second database.
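The name mapping rule can be illustrated with a small lookup, shown here as a sketch. The database names reuse the example given elsewhere in this description; the table and column names and the function name map_target are invented for illustration.

```python
# Hypothetical preset mapping rules; names without a rule are kept unchanged.
DB_MAP = {"amptest1": "liuy_amptest1"}
TABLE_MAP = {("amptest1", "orders"): "orders_v2"}
COLUMN_MAP = {("amptest1", "orders", "uid"): "user_id"}


def map_target(db, table, column):
    """Return the (database, table, column) in the second database for a source column."""
    return (
        DB_MAP.get(db, db),
        TABLE_MAP.get((db, table), table),
        COLUMN_MAP.get((db, table, column), column),
    )
```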

The application discloses a data migration system, includes: an entrance (Portal) module, an Application Programming Interface (API) module, a distributed scheduling module, a full migration module and an incremental migration module;

the entry module is used for providing an entry for managing migration tasks, receiving the migration tasks created by a client and the designated first database, second database and data to be migrated of the first database, wherein the migration tasks comprise full migration and incremental migration;

the application programming interface module is used for providing a service interface for the client so as to call a corresponding full-volume migration module and an incremental migration module according to the migration type in the migration task;

the full migration module is configured to migrate the data to be migrated from the first database to the second database, convert the data type of the data to be migrated into the data type of the second database through a preset intermediate format when the first database and the second database are heterogeneous, and write the data to be migrated into the second database according to the converted data type;

the incremental migration module is configured to migrate, with a first time when the full migration starts as a reference, data to be migrated that changes after the first time to the second database, convert, when the first database is heterogeneous to the second database, a data type of the data to be migrated into a data type of the second database through a preset intermediate format, and write the data to be migrated into the second database according to the converted data type.

Wherein the system further comprises a first migration evaluation unit,

Wherein the system further comprises a structure migration module,

Wherein the system further comprises a second migration evaluation unit,

Wherein,

and when the full migration module or the incremental migration module writes data into the second database, it records the position information of the written data so that an interrupted transfer can resume from the breakpoint.

Wherein the full-scale migration module is further to,

The incremental migration module is further configured to obtain data that changes after the first time by using a log of the first database or using an operation record saved by a trigger of the first database.
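A simplified sketch of applying changes recorded after the first time is given below. The change log is modeled as an ordered list of (timestamp, operation, key, value) tuples and the second database as a dictionary; both are assumptions for illustration, standing in for a parsed database log or trigger-maintained operation records.

```python
def apply_increment(change_log, target, first_time):
    """Apply changes recorded after first_time to target; return how many were applied.

    change_log: ordered list of (timestamp, op, key, value), op being
    "insert", "update" or "delete"; target: dict standing in for the second database.
    """
    applied = 0
    for timestamp, op, key, value in change_log:
        if timestamp <= first_time:
            continue  # not a change after the first time
        if op == "delete":
            target.pop(key, None)
        else:
            # Inserts and updates are both applied as an overwrite.
            target[key] = value
        applied += 1
    return applied
```

Applying inserts and updates as overwrites, and ignoring deletes of missing keys, is a design choice of this sketch: a row changed while the full migration is running may already have been copied, so replaying the same change must be harmless.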

In order to solve the above technical problem, the present application further discloses a data migration method, including:

receiving a migration task created by a client and a designated first database, a designated second database and data to be migrated of the first database, wherein the migration task comprises full migration, and the first database is in a shutdown state;

and calling a full migration module to migrate the data to be migrated from the first database to the second database, converting the data type of the data to be migrated into the data type of the second database through a preset intermediate format when the first database and the second database are heterogeneous, and writing the data to be migrated into the second database according to the converted data type.

Wherein, when the migration task further includes a migration evaluation, before invoking a full-scale migration module, the method further comprises: and calling a first migration evaluation unit, and outputting an evaluation report according to the first database and the second database, wherein the evaluation report comprises space occupation information, list columns and object information of the first database, executed SQL (structured query language) and processing performance information, and the evaluation report further comprises incompatible characteristic information and transformation cost information of the second database.

When the migration task further comprises structure migration, before calling a full-scale migration module, the method further comprises:

and calling a structure migration module to migrate the library table list and the objects used by the data to be migrated to the second database, when the first database is heterogeneous to the second database, determining the types of the library table list and the objects in the second database according to a preset type mapping relation, and creating the library table list and the objects corresponding to the types in the second database.

Wherein, when the migration task further includes a migration evaluation, before invoking the structure migration module, the method further comprises:

and calling a second migration evaluation unit, and outputting an evaluation report according to the first database and the second database, wherein the evaluation report comprises space occupation information, list columns and object information of the first database, executed SQL (structured query language) and processing performance information, and the evaluation report further comprises incompatible characteristic information and transformation cost information of the second database.

When the full migration module is called to migrate the data to be migrated from the first database to the second database, the method further includes: recording the position information of the written data so that an interrupted transfer can resume from the breakpoint.

Wherein invoking a full migration module to migrate the data to be migrated from the first database to the second database further comprises,

The application discloses a data migration method, which is characterized by comprising the following steps:

receiving a migration task created by a client, a designated first database, a designated second database and data to be migrated of the first database, wherein the migration task comprises full migration and incremental migration, and the first database is in a running state;

calling a full migration module to migrate the data to be migrated from the first database to the second database, converting the data type of the data to be migrated into the data type of the second database through a preset intermediate format when the first database and the second database are heterogeneous, and writing the data to be migrated into the second database according to the converted data type;

and calling an incremental migration module, migrating the data to be migrated, which changes after the first moment, to the second database by taking the first moment when the full migration starts as a reference, converting the data type of the data to be migrated into the data type of the second database through a preset intermediate format when the first database and the second database are heterogeneous, and writing the data to be migrated into the second database according to the converted data type.

When the full migration module or the incremental migration module is called to migrate the data to be migrated from the first database to the second database, the method further includes: recording the position information of the written data so that an interrupted transfer can resume from the breakpoint.

Wherein invoking the full-scale migration module to write the data to be migrated into the second database further comprises,

The invoking of the incremental migration module specifically includes acquiring data that changes after the first time by using a log of the first database or using an operation record saved by a trigger of the first database.

Wherein the method further comprises: when the write operation of the client to the first database is switched to the second database, taking a second time when the client writes data to the second database for the first time as a reference, migrating the data of the second database, which changes after the second time, to the first database;

when the migrated data changed after the second moment catches up with the current data change of the second database, judging whether the data of the first database and the data of the second database are consistent or not aiming at the data changed after the second moment in real time, and recording the inconsistent data in a preset list.
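The consistency judgment on data changed after the second time could be sketched as follows. Both databases are modeled as dictionaries keyed by primary key, and the returned list plays the role of the preset list; these structures and the name find_inconsistent are assumptions made for the example.

```python
def find_inconsistent(first_db, second_db, changed_keys):
    """Return the keys changed after the second time whose values differ between the databases."""
    inconsistent = []
    for key in changed_keys:
        # A key missing on one side only also counts as inconsistent.
        if first_db.get(key) != second_db.get(key):
            inconsistent.append(key)
    return inconsistent
```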

Wherein the method further comprises: and when the access of the client to the second database is abnormal, switching the write operation of the client to the second database back to the first database.

Compared with the prior art, the application can obtain the following technical effects:

1) a general-purpose data migration system supporting both homogeneous and heterogeneous databases is provided, and the system can perform full migration in both a shutdown state and a non-shutdown state;

2) a structure migration function is provided before the full migration, which can save a large amount of time and labor when the volume of data to be migrated is large;

3) an incremental migration function in a non-shutdown state is provided, so that newly added data can be synchronized in time;

4) the data migration system also has a distributed scheduling function, which can improve the processing efficiency of data migration with large data volumes;

5) when data migration is carried out in a non-shutdown state, data synchronization among databases is realized through data backflow, so that running services can be switched at any time.

Of course, it is not necessary for any one product to achieve all of the above-described technical effects simultaneously.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a schematic architecture diagram of a data migration system according to an embodiment of the present application;

FIG. 1a is an architecture diagram of the full-scale migration module 131 on its own;

FIG. 1b is an architecture diagram of the full-scale migration module 131 and the structure migration module 135 in combination;

FIG. 1c is a schematic diagram of the architecture of the full-scale migration module 131 and the migration evaluation module 136 when they exist in combination;

FIG. 1d is an architectural diagram of the combined presence of the full-scale migration module 131 and the structure migration module 135 and the migration evaluation module 136;

FIG. 1e is an architectural diagram of the combined presence of full-scale migration module 131, structure migration module 135, and full-scale check module 133;

FIG. 1f is a schematic diagram of the architecture of the full-scale migration module 131, the migration evaluation module 136, and the full-scale check module 133 when present in combination;

FIG. 1g is a schematic diagram of the architecture of the full migration module 131 and the full verification module 133 when they exist in combination;

FIG. 1h is an architectural diagram of the full-scale migration module 131 and the combined existence of the structure migration module and migration evaluation module 136 and the full-scale check module 133;

FIG. 1i is an architecture diagram of the full-scale migration module 131 and the incremental migration module 132 in combination;

FIG. 1j is an architectural diagram of the combined presence of the full-scale migration module 131, the structure migration module 135, and the incremental migration module 132;

FIG. 1k is an architectural diagram of the full-scale migration module 131, the migration evaluation module 136, and the incremental migration module 132 when present in combination;

FIG. 1l is an architectural diagram of the combined presence of the full-scale migration module 131 and the structure migration module 135 and the incremental migration module 132 and the migration evaluation module 136;

FIG. 1m is a schematic diagram of the architecture of the full-scale migration module 131 and the structure migration module 135 in combination with the incremental migration module 132 and the full-scale check module 133;

FIG. 1n is a schematic diagram of the architecture of the full migration module 131 and the migration evaluation module 136 in combination with the incremental migration module 132 and the full check module 133;

FIG. 1o is a schematic diagram of the architecture of the full migration module 131 and the full check module 133 and the incremental migration module 132 when present in combination;

FIG. 1p is a schematic diagram of the architecture of the full-scale migration module 131 and the structure migration module 135 and the migration evaluation module 136 and the full-scale check module 133 and the incremental migration module 132 when present in combination;

FIG. 1q is a schematic diagram of the architecture when the full-scale migration module 131, the incremental migration module 132 and the incremental check module 134 exist in combination;

FIG. 1r is a schematic diagram of the architecture of the full-scale migration module 131 and the structure migration module 135 in combination with the incremental migration module 132 and the incremental verification module 134;

FIG. 1s is a schematic diagram of the architecture of the full-scale migration module 131 and the migration evaluation module 136 in combination with the incremental migration module 132 and the incremental verification module 134;

FIG. 1t is a schematic diagram of the architecture of the full-scale migration module 131 and the structure migration module 135 in combination with the incremental migration module 132 and the migration evaluation module 136 and the incremental verification module 134;

FIG. 1u is a schematic diagram of the architecture of the full-scale migration module 131 and the structure migration module 135 and the incremental migration module 132 and the full-scale check module 133 and the incremental check module 134 when they exist in combination;

FIG. 1v is a schematic diagram of the architecture of the full-scale migration module 131 and the migration evaluation module 136 in combination with the incremental migration module 132 and the full-scale verification module 133 and the incremental verification module 134;

FIG. 1w is a schematic diagram of the architecture of the full-volume migration module 131 and the full-volume check module 133 when present in combination with the incremental migration module 132 and the incremental check module 134;

FIG. 1x is a schematic diagram of the architecture of the full-scale migration module 131 and the structure migration module 135 and the migration evaluation module 136 in combination with the full-scale check module 133 and the incremental migration module 132 and the incremental check module 134;

FIG. 2 is a schematic diagram of a state change of a state machine provided by a distributed scheduling module for a migration task according to an embodiment of the present application;

FIG. 3 is a diagram illustrating an embodiment of the present application for performing type conversion between heterogeneous databases using an intermediate format;

FIG. 4 is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 5 is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 6 is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 7a is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 7b is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 8a is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 8b is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 9 is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 10 is a schematic flowchart of a data migration method according to an embodiment of the present application;

FIG. 11 is a flowchart illustrating a data migration method according to an embodiment of the present application.

Detailed Description

Embodiments of the present application will be described in detail with reference to the drawings and examples, so that how to implement technical means to solve technical problems and achieve technical effects of the present application can be fully understood and implemented.

The embodiment of the application provides a data migration system, which provides migration services such as full migration, incremental migration, full verification, incremental verification, migration evaluation, and structure migration. With the data migration system, data migration can be completed between homogeneous or heterogeneous databases, full verification can be performed on data migrated by full migration, and incremental verification can be performed on data migrated by incremental migration. Migration evaluation and structure migration can also be performed before data migration begins, so that the feasibility of the migration is analyzed and the efficiency of data migration is improved. Whether or not the source database is allowed to be shut down, the data migration system can migrate service data from the source database to the target database; when the source database is not allowed to be shut down, data synchronization between the source database and the target database can also be achieved, so that access to the service data can be switched at any time.

Fig. 1 is a schematic structural diagram of a data migration system according to an embodiment of the present Application, including an entry (Portal) module 10, an Application Programming Interface (API) module 11, a distributed scheduling module 12, a full migration module 131, an incremental migration module 132, a full verification module 133, an incremental verification module 134, a structure migration module 135, and a migration evaluation module 136.

The portal module 10 provides a portal for managing the migration task, and the client can complete the configuration of the migration task such as creation, starting, management, progress inquiry and the like through the portal module 10.

The portal module 10 may be implemented with open-source portal software, for example Liferay Portal, which supports the Asynchronous JavaScript and XML (AJAX) and Java Specification Request (JSR) 286 standards, to create a management console application for the data migration system and expose the interactive interface of that application at the client. In the interactive interface, the user can create a migration task and specify the source database, the target database, and the data to be migrated in the source database.

Before entering the management console of the data migration system, the second database in the target server needs to be created in the cloud service console. The data migration system can also create the second database in the target server automatically, in which case its name is the same as the name of the first database on the source server; a user who wants the second database to have a different name can customize it. Next, the accounts used on the source server and the target server are created, ensuring that the migration task has read permission on the source server and write permission on the target server. The source server and the target server may also be instances running on two separate servers.

After entering the management console of the data migration system, the user starts creating a migration task and first sets a task name for it, for example, "abc_beijing to hang" or "bcd_local to Qingdao".

After the task name is set, the connection information of the source server and the target server is filled in, including: the instance type of the source server, for example, "self-built database with public network IP"; the type of the first database on the source server, e.g., Oracle, MySQL, etc.; the host name or IP address and the port number of the source server, e.g., 201.165.1.112:3308; an account and a password for the source server, where the account needs to have data read permission on the source server; the instance type of the target server, e.g., a Relational Database Service (RDS) instance; the RDS instance identifier of the target server, such as "RDS 2 affbt"; and an account and a password for the target server, where the account needs to have data write permission on the target server.

After the connection information of the source server and the target server is filled in, the migration type and the data to be migrated are selected. The interactive interface offers the migration types full migration, incremental migration, full verification, incremental verification, structure migration, and migration evaluation for the user to select; the migration type can default to full migration, with other migration types added as the user needs. The data to be migrated may be one or more database schemas, or one or more tables within a schema; for example, it may be set to the schema "amptest1", or to the table "a00_full_type_table_m1b1" in the schema "amptest1". If the user wants the name of the second database created in the target server to differ from the name of the selected data to be migrated, a database name mapping needs to be configured at this point; for example, the selected data to be migrated "amptest1" is mapped to the name "liuy_amptest1".
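The task configuration collected in these steps could be represented roughly as follows. This is a sketch under assumptions: the class name MigrationTask, its field names and the six type identifiers are hypothetical; only the default of full migration and the idea of an optional name mapping come from the description.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers for the six selectable migration types.
ALLOWED_TYPES = {
    "full", "incremental", "full_verification",
    "incremental_verification", "structure", "evaluation",
}


@dataclass
class MigrationTask:
    name: str
    objects: list  # selected schemas, or "schema.table" entries
    migration_types: list = field(default_factory=lambda: ["full"])
    name_mapping: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject unknown migration types and empty selections at creation time.
        unknown = set(self.migration_types) - ALLOWED_TYPES
        if unknown:
            raise ValueError(f"unknown migration types: {sorted(unknown)}")
        if not self.objects:
            raise ValueError("no data to be migrated was selected")
```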

After the configuration is completed, the migration task is started in the management console of the data migration system, and the currently executed migration type and progress of the migration task can be seen in the interface, for example, the structure migration is completed by 80%, and the full-scale migration is completed by 100%.

The client initiates a HyperText Transfer Protocol (HTTP) request to the entry module 10; the entry module 10 receives and parses the migration task created by the client and calls the open API of the data migration system according to the migration task.

The application programming interface module 11 is configured to provide an interface of a callable migration type to the client, so as to call one or more of the corresponding full migration module 131, the incremental migration module 132, the full verification module 133, the incremental verification module 134, the structure migration module 135, and the migration evaluation module 136 according to the migration type in the migration task.

The call request may be sent to the API module 11 over Http or over Https (HyperText Transfer Protocol over Secure Socket Layer). Requests are sent with the Http Get method, and each request must specify an Action parameter, i.e., the operation to be performed, e.g., creating a database, full migration, incremental migration, etc.
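As an illustration of such a call, the following minimal sketch builds a Get request URL that carries the Action parameter. Only the parameter name Action comes from the text; the endpoint address, the action value "FullMigration", and the TaskName parameter are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; the text does not give the address of the API module.
API_ENDPOINT = "https://migration.example.com/"


def build_request_url(action, **params):
    """Build an Http Get URL whose query string carries the Action parameter."""
    if not action:
        raise ValueError("every request must specify an Action parameter")
    query = {"Action": action}
    query.update(params)  # additional, hypothetical request parameters
    return API_ENDPOINT + "?" + urlencode(query)


# "FullMigration" and "TaskName" are illustrative names only.
url = build_request_url("FullMigration", TaskName="task_1")
```

A request without an Action value is rejected before it is sent, which mirrors the requirement that every request names the operation to perform.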

The distributed scheduling module 12 is configured to monitor an execution state of each migration task, and schedule the migration tasks according to the current load of each device, so as to implement load balancing of each device.

The distributed scheduling module 12 provides a state machine to monitor the running state of each migration task. The operation diagram of the state machine is shown in fig. 2. The state machine has seven states: initialization (init), running (running), caught up (catch), paused (paused), stopped (finished), failed (failed), and successful (success). The transitions among the seven states are shown in fig. 2, where time (time) represents the duration of the incremental migration process (e.g., 15 minutes): when the duration of the incremental migration is less than the preset time, the current data changes have been caught up with and the caught-up (catch) state is reached; pause (pause) represents the client choosing to pause the task; kill (kill) represents the task being killed; and exception (unusal) represents an abnormal condition being captured, e.g., the target server instance is locked, so that the data write permission of the entered account is revoked. If the configured migration task includes full migration but not incremental migration, the state changes from running to successful when the full migration task finishes; if the configured migration task includes both full migration and incremental migration, the state changes from running to successful when the incremental migration task finishes.
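The rules stated above can be sketched as follows. Fig. 2 is not reproduced here, so the targets of the pause, kill, and exception events (paused, stopped, failed) are assumptions, as are all function names and the state labels used as enum values.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    RUNNING = "running"
    CATCH = "catch"
    PAUSED = "paused"
    STOPPED = "stopped"
    FAILED = "failed"
    SUCCESS = "success"


# Assumed event targets; the exact arrows of fig. 2 are not available here.
EVENT_TARGETS = {"pause": State.PAUSED, "kill": State.STOPPED, "unusal": State.FAILED}


def on_event(state, event):
    """Apply a pause / kill / exception event; unknown events keep the state."""
    return EVENT_TARGETS.get(event, state)


def on_incremental_progress(state, duration_s, preset_s=15 * 60):
    """running -> catch once an incremental pass takes less than the preset time."""
    if state is State.RUNNING and duration_s < preset_s:
        return State.CATCH
    return state


def on_migration_finished(state, configured, finished):
    """running -> success when the last configured migration type finishes."""
    if state is not State.RUNNING:
        return state
    last = "incremental" if "incremental" in configured else "full"
    return State.SUCCESS if finished == last else state
```

With both full and incremental migration configured, finishing the full migration alone leaves the task in the running state.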

The distributed scheduling module 12 may schedule the migration tasks according to the number of task units executing on each device in the data migration system and the memory occupancy rate of each device, so as to achieve load balancing among the devices.
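One possible reading of this scheduling rule is sketched below. The text names the two load indicators but not the selection policy, so the policy used here (fewest running tasks first, ties broken by lower memory occupancy) and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    running_tasks: int       # number of task units currently executing
    memory_occupancy: float  # fraction of memory in use, 0.0 to 1.0


def pick_device(devices):
    """Choose the least-loaded device (assumed policy: tasks, then memory)."""
    if not devices:
        raise ValueError("no device available")
    return min(devices, key=lambda d: (d.running_tasks, d.memory_occupancy))


def schedule(task_names, devices):
    """Assign each task to the device that is least loaded at that moment."""
    assignment = {}
    for task in task_names:
        device = pick_device(devices)
        assignment[task] = device.name
        device.running_tasks += 1  # the new task now counts toward the load
    return assignment
```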

The full migration module 131 is used for migrating data to be migrated from the first database to the second database. If the first database is in a halt state, carrying out full migration, and migrating the data to be migrated, which is currently stored in the first database, to a second database; and if the first database is not stopped, taking the first time when the full migration task starts to execute as a reference, and migrating the data to be migrated, which is stored at the first time, to the second database.

The first database and the second database being isomorphic means that the tables of the two databases have the same structure; the first database and the second database being heterogeneous means that the tables of the two databases have different structures.

When the first database and the second database in the full migration task are isomorphic, the data to be migrated is migrated according to the current database, table, and column structure. When the first database and the second database in the full migration task are heterogeneous, the data types are converted between the heterogeneous databases through a preset intermediate format. As shown in FIG. 3, the data type of the first database is converted into the intermediate format, and then from the intermediate format into the data type of the second database. The advantage is that no conversion method has to be developed separately from each type of database to every other type. For example, if conversion methods to every other database were developed for each of the seven databases shown in fig. 3, forty-two conversion methods in total would be required. With conversion through the intermediate format, one reading method is developed for each of the seven databases to read its data into the intermediate format, and one writing method is developed for each of the seven databases to write data in the intermediate format into that database. Only fourteen conversion methods therefore need to be developed to realize type conversion among the seven databases, which improves development efficiency and makes data type conversion between different types of databases easier to implement.
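The two counts follow from simple arithmetic: direct conversion needs one method per ordered pair of distinct databases, while the intermediate format needs one reader and one writer per database. A short check (function names are illustrative only):

```python
def direct_converters(n):
    """One conversion method for every ordered pair of distinct databases."""
    return n * (n - 1)


def intermediate_converters(n):
    """One reading method plus one writing method per database."""
    return 2 * n


seven_direct = direct_converters(7)              # 7 * 6
seven_intermediate = intermediate_converters(7)  # 7 + 7
```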

The preset intermediate format includes a String array and a Byte array of the Java language. When data is read from the first database, the binary data type of the first database is converted into a byte array form for storage, and other data types are converted into a character string array form for storage. When data is written into the second database, the data stored in the character string array and the byte array are converted into the type corresponding to the written column.

For example, in a data migration from Oracle to Mysql, when the Oracle data is read, data of the Binary Large Object (blob) type and of the binary floating-point types binary_float and binary_double is converted into Java byte (byte) arrays for storage, and data of other types, such as the Oracle types varchar2 and nvarchar2, is converted into Java string (string) arrays for storage, so that the data to be migrated in the Oracle database is held in the intermediate format. When the data to be migrated in the intermediate format is written into Mysql, the byte array corresponding to the blob type is converted into the longblob type of Mysql, the byte array corresponding to the binary_float type is converted into the decimal type of Mysql, the byte array corresponding to the binary_double type is converted into the double type of Mysql, the string array corresponding to the varchar2 type is converted into the varchar type of Mysql, the string array corresponding to the nvarchar2 type is converted into the national varchar type of Mysql, and the string arrays corresponding to other types are converted into the corresponding types of the Mysql database and written into the Mysql database.
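The read and write sides of this conversion can be sketched as follows. The text specifies Java byte arrays and string arrays; this sketch uses Python bytes and str as stand-ins. The big-endian IEEE 754 encoding of the floating-point values and all function names are assumptions, not part of the described system.

```python
import struct
from decimal import Decimal

# Oracle types stored as bytes in the intermediate format, per the example above.
BINARY_ORACLE_TYPES = {"BLOB", "BINARY_FLOAT", "BINARY_DOUBLE"}

# Target Mysql types named in the example above.
ORACLE_TO_MYSQL = {
    "BLOB": "LONGBLOB",
    "BINARY_FLOAT": "DECIMAL",
    "BINARY_DOUBLE": "DOUBLE",
    "VARCHAR2": "VARCHAR",
    "NVARCHAR2": "NATIONAL VARCHAR",
}


def to_intermediate(oracle_type, value):
    """Read side: binary types become bytes, every other type becomes str."""
    if oracle_type == "BINARY_FLOAT":
        return struct.pack(">f", value)  # assumed byte encoding
    if oracle_type == "BINARY_DOUBLE":
        return struct.pack(">d", value)  # assumed byte encoding
    if oracle_type in BINARY_ORACLE_TYPES:
        return bytes(value)
    return str(value)


def from_intermediate(oracle_type, data):
    """Write side: convert intermediate data to the type of the target column."""
    target = ORACLE_TO_MYSQL[oracle_type]
    if target == "DOUBLE":
        return target, struct.unpack(">d", data)[0]
    if target == "DECIMAL":
        return target, Decimal(str(struct.unpack(">f", data)[0]))
    return target, data
```

Because every reader produces only two shapes of data, each writer needs to handle only those two shapes regardless of the source database.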

As shown in fig. 3, the intermediate format enables type conversion among seven databases, namely Mysql, Oracle, SQLServer, the high-performance distributed database system OceanBase, the object-relational database management system PostgreSQL, the open structured data service OTS, and the distributed column-oriented database system Hbase, thereby improving development efficiency.

When the client selects full migration as a migration type, if a table to be migrated in the first database is specified, the name of the specified table is added to the white list; if the first database needs to be migrated as a whole, the white list is empty. And when the full migration is started, determining a table to be migrated or migrating the whole first database according to the white list.
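A minimal sketch of the white list rule, assuming the white list is held as a set of table names (the function name is hypothetical):

```python
def select_tables(all_tables, whitelist):
    """Return the tables to migrate; an empty white list means the whole database."""
    if not whitelist:
        return list(all_tables)
    return [table for table in all_tables if table in whitelist]
```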

In one embodiment, the table to be migrated may be split before migration. The splitting may use an existing table splitting algorithm and divides the table to be migrated into a plurality of subtasks that are processed asynchronously, which improves the efficiency of full migration. For example, the table may be split with a horizontal, vertical, or hash splitting algorithm. Horizontal splitting places the data into two or more independent tables according to certain conditions, i.e., it splits by record: different records are stored separately and every sub-table has the same number of columns. After splitting, fewer data and index pages have to be read during a query and the index has fewer levels, so queries are faster. Vertical splitting places the primary key and some columns in one table and the primary key and the other columns in another table, dividing the original table into several tables that each contain only a few columns. Each row is thus smaller, one data block (Block) can store more rows, and fewer I/O operations are needed during a query. Hash splitting uses a hash (Hash) algorithm to distribute the data among the sub-tables, so that I/O is more balanced.
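Horizontal and hash splitting of a table into asynchronously processed subtasks might look like the sketch below. The choice of CRC32 as the hash function, the thread pool, and all names are assumptions; the text only requires that some existing splitting algorithm be used.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def horizontal_split(rows, chunk_size):
    """Split by record: each chunk keeps all columns of its rows."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def hash_split(rows, key, buckets):
    """Distribute rows into buckets by a hash of the given key column."""
    parts = [[] for _ in range(buckets)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))  # assumed hash function
        parts[digest % buckets].append(row)
    return parts


def migrate_in_parallel(chunks, migrate_chunk, workers=4):
    """Run one subtask per chunk; results are returned in chunk order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(migrate_chunk, chunks))
```

For instance, `migrate_in_parallel(horizontal_split(rows, 1000), write_chunk)` would hand each block of 1000 rows to a hypothetical `write_chunk` function on its own worker thread.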

During full migration, if a database name, table name, or column name in the first database differs from that in the second database, the names are mapped according to preset database, table, and column mapping rules, and the data is written into the corresponding database, table, and column of the second database. A database name mapping can be set when the data to be migrated is selected, before the migration task is started, as described above and not repeated here. Where table names or column names differ, mapping rules for the table names and column names must likewise be preset. For example:

"mapperList=db_1;table_1;db_1;table_2;|db_1;table_2;db_1;table_2;uname". Here "|" separates two mapping rules. The rule db_1;table_1;db_1;table_2; indicates that table_1 of database db_1 is mapped to table_2, i.e., the table name changes. The rule db_1;table_2;db_1;table_2;uname indicates that the field username of table_2 of database db_1 is mapped to the column uname of table_2, i.e., the column name changes accordingly.

In one embodiment, each time data is written into the second database, the position information of the data to be migrated that has been written may be recorded for resuming from a breakpoint. For example, a table increment_trx is created in the target server. The table increment_trx is a position table that mainly records the positions reached by the full migration, so that a task can resume from the breakpoint after an abnormal restart.
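Resuming from a recorded position can be sketched with an in-memory SQLite database standing in for the target server. Only the table name increment_trx comes from the text; its columns, the use of the primary key as the position, and the simulated crash are assumptions.

```python
import sqlite3


def open_target():
    """Create a stand-in target database with a data table and a position table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tabletest (id INTEGER PRIMARY KEY, val TEXT)")
    # Hypothetical layout of the position table.
    conn.execute("CREATE TABLE increment_trx (table_name TEXT PRIMARY KEY, position INTEGER)")
    return conn


def last_position(conn, table):
    row = conn.execute(
        "SELECT position FROM increment_trx WHERE table_name = ?", (table,)).fetchone()
    return row[0] if row else 0


def migrate(conn, source_rows, batch_size=2, fail_after=None):
    """Copy rows whose id is past the recorded position, one batch at a time."""
    start = last_position(conn, "tabletest")
    pending = [row for row in source_rows if row[0] > start]  # rows sorted by id
    for n, i in enumerate(range(0, len(pending), batch_size)):
        if fail_after is not None and n == fail_after:
            raise RuntimeError("simulated abnormal restart")
        batch = pending[i:i + batch_size]
        with conn:  # the batch and its position commit together
            conn.executemany("INSERT INTO tabletest VALUES (?, ?)", batch)
            conn.execute(
                "INSERT OR REPLACE INTO increment_trx VALUES ('tabletest', ?)",
                (batch[-1][0],))
```

Writing the batch and the position in the same transaction means a restart never skips rows or writes them twice.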

The structure migration module 135 is configured to migrate the databases, tables, columns, and objects used by the data to be migrated to the second database before the full migration module 131 migrates the data. When the first database is heterogeneous to the second database, the module determines, through a preset type mapping relationship, the types of the databases, tables, columns, and objects in the second database, and creates the databases, tables, columns, and objects of the corresponding types in the second database.

When the data to be migrated is very large, the process of migrating is slow, and consumes much time and labor, for example, when data of thousands of tables of hundreds of databases are to be migrated, the above problem becomes more obvious. Therefore, when creating a migration task with a large data volume, a user may select structure migration in the migration types, or the data migration system may automatically perform structure migration before full migration when determining that the data volume of the data to be migrated configured by the migration task exceeds a preset threshold.

For example, the first database in the source server is an Oracle database, the second database in the target server is a Mysql database, and the table named "dbtest"."tabletest" in the Oracle database is specified as the data to be migrated. The structure of the table includes the following columns:

(
  "C_ID" NUMBER(20,0),
  "C_NUM_1" NUMBER,
  "C_NUM_2" NUMBER(38,0),
  "C_NUM_3" NUMBER(20,3),
  "C_FLOAT_1" BINARY_FLOAT,
  "C_FLOAT_2" BINARY_DOUBLE,
  "C_DATE_1" DATE,
  "C_DATE_2" TIMESTAMP(6),
  "C_DATE_3" TIMESTAMP(6),
  "C_DATE_4" TIMESTAMP(6) WITH TIME ZONE,
  "C_DATE_5" TIMESTAMP(6) WITH LOCAL TIME ZONE,
  "C_STRING_1" CHAR(40 CHAR),
  "C_STRING_2" CHAR(40 BYTE),
  "C_STRING_3" NCHAR(40),
  "C_STRING_4" VARCHAR2(40 BYTE),
  "C_STRING_5" VARCHAR2(40 CHAR),
  "C_STRING_6" NVARCHAR2(40),
  "C_STRING_7" CLOB,
  "C_STRING_8" NCLOB,
  "C_STRING_9" LONG,
  "C_BLOB_1" BLOB,
  "C_BLOB_2" RAW(200)
)

If the user includes structure migration among the selected migration types, the structure migration module 135 first creates a corresponding table in the Mysql database; if the user has not preset a table name mapping, a table `dbtest`.`tabletest` with the same name is created. The corresponding columns are created in the table `dbtest`.`tabletest` as well. The code for creating the table is as follows:

CREATE TABLE `dbtest`.`tabletest` (
  `C_ID` DECIMAL(20,0) NOT NULL,
  `C_NUM_1` BIGINT NOT NULL,
  `C_NUM_2` DECIMAL(38,0) NOT NULL,
  `C_NUM_3` DECIMAL(20,3) NOT NULL,
  `C_FLOAT_1` DECIMAL(65,8),
  `C_FLOAT_2` DOUBLE,
  `C_DATE_1` DATETIME,
  `C_DATE_2` DATETIME,
  `C_DATE_3` DATETIME,
  `C_DATE_4` DATETIME,
  `C_DATE_5` DATETIME,
  `C_STRING_1` CHAR(40),
  `C_STRING_2` CHAR(40),
  `C_STRING_3` VARCHAR(40),
  `C_STRING_4` VARCHAR(40),
  `C_STRING_5` VARCHAR(40),
  `C_STRING_6` VARCHAR(40),
  `C_STRING_7` LONGTEXT,
  `C_STRING_8` LONGTEXT,
  `C_STRING_9` LONG,
  `C_BLOB_1` LONGBLOB,
  `C_BLOB_2` VARBINARY(200)
) engine=INNODB charset=UTF8 COLLATE UTF8_bin;

A corresponding table `dbtest`.`tabletest` is thus created in the Mysql database, with columns corresponding to those of the table "dbtest"."tabletest" of the Oracle database, so that the structure of the table "dbtest"."tabletest" to be migrated is migrated to the Mysql database before the actual data is migrated. During this migration, the type of each column is converted according to the type mapping relationship from Oracle to Mysql. In the above example, the Oracle column "C_ID" has the type Number, and the corresponding column `C_ID` in the table created in the Mysql database has the type Decimal; the other columns are converted correspondingly and are not listed one by one here.
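The column type mapping visible in the two listings above can be sketched as a small lookup function. The rules below are inferred only from that example and are not a complete Oracle to Mysql mapping; the function name and the regular expressions are assumptions.

```python
import re

# Types that map one-to-one in the example above.
SIMPLE_TYPES = {
    "NUMBER": "BIGINT",
    "BINARY_FLOAT": "DECIMAL(65,8)",
    "BINARY_DOUBLE": "DOUBLE",
    "DATE": "DATETIME",
    "CLOB": "LONGTEXT",
    "NCLOB": "LONGTEXT",
    "LONG": "LONG",
    "BLOB": "LONGBLOB",
}


def map_oracle_type(oracle_type):
    """Map an Oracle column type to the Mysql type used in the example."""
    t = oracle_type.strip().upper()
    if t in SIMPLE_TYPES:
        return SIMPLE_TYPES[t]
    m = re.fullmatch(r"NUMBER\((\d+),\s*(\d+)\)", t)
    if m:
        return "DECIMAL(%s,%s)" % (m.group(1), m.group(2))
    if t.startswith("TIMESTAMP"):
        return "DATETIME"
    m = re.fullmatch(r"(?:N?VARCHAR2|NCHAR)\((\d+)(?:\s*(?:BYTE|CHAR))?\)", t)
    if m:
        return "VARCHAR(%s)" % m.group(1)
    m = re.fullmatch(r"CHAR\((\d+)(?:\s*(?:BYTE|CHAR))?\)", t)
    if m:
        return "CHAR(%s)" % m.group(1)
    m = re.fullmatch(r"RAW\((\d+)\)", t)
    if m:
        return "VARBINARY(%s)" % m.group(1)
    raise ValueError("no mapping known for type: " + oracle_type)
```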

The user creates a migration task, and can set a mapping relation between a table name and a column name in addition to a mapping relation between a library name when selecting data to be migrated. When a table in the first database is selected as data to be migrated, a table name of the table in the second database may be preset. In the interface for setting the table name of the table in the second database, all columns in the table are shown, and the user can set the column name of each column in the table of the second database.

For example, the user selects the table child_KEY_TABLE in the first database as the data to be migrated. After the table is selected, an edit page may be opened, in which the name child_KEY_TABLE_TEST is set for the corresponding table in the second database. The same interface also shows the columns of the table child_KEY_TABLE to be migrated, i.e., C, ID, K, PAD, etc., and the user may edit the column name of each column, e.g., change the column names to C_TEST, ID_TEST, K_TEST, and PAD_TEST.
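Applying such table and column name mappings to a row before it is written can be sketched as follows. The in-memory dictionaries are a hypothetical representation of what the user enters on the edit page.

```python
# Hypothetical in-memory form of the mappings entered on the edit page.
TABLE_NAME_MAP = {"child_KEY_TABLE": "child_KEY_TABLE_TEST"}
COLUMN_NAME_MAP = {
    "child_KEY_TABLE": {"C": "C_TEST", "ID": "ID_TEST", "K": "K_TEST", "PAD": "PAD_TEST"},
}


def map_row(table, row):
    """Return the target table name and the row keyed by target column names."""
    target_table = TABLE_NAME_MAP.get(table, table)
    columns = COLUMN_NAME_MAP.get(table, {})
    target_row = {columns.get(name, name): value for name, value in row.items()}
    return target_table, target_row
```

Tables and columns without an entry keep their original names, which matches the default behavior when no mapping is preset.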

The migration evaluation module 136 is configured to output an evaluation report according to the first database and the second database before the full migration module 131 migrates the data to be migrated, or before the structure migration module 135 migrates the databases, tables, columns, and objects used by the data to be migrated. The evaluation report includes the space occupation information of the first database, its database, table, column, and object information, the SQL it executes, and its processing performance information; the evaluation report further includes information on characteristics that are incompatible with the second database and on the cost of modification.

For example, the database sets amptest1 and amptest2 in an Oracle database are selected as the data to be migrated to the corresponding database sets liuy_amptest1 and liuy_amptest2 in the Mysql database. Before the full migration or the structure migration starts, an evaluation report with the following content is output: the Oracle database currently occupies 500 GB of space and has 1.5 TB available; it contains 200 databases and 5000 tables, with an average of 35 columns per table; it supports the where conditions of the SQL standard; and its current memory occupancy is 25%. This migration covers the two databases amptest1 and amptest2, 55 tables in total, with an average of 30 columns per table. The target Mysql databases liuy_amptest1 and liuy_amptest2 do not support the Oracle data types bfile, interval year to month, and interval day to second; if tables in amptest1 and amptest2 contain these three types, the structure migration may fail. The user is asked to write the data of these three types into liuy_amptest1 and liuy_amptest2 in advance according to business requirements; the modification is estimated to take about 24 hours.

The full verification module 133 is configured to, when the full migration module 131 completes data migration, determine whether the data of the first database and the data of the second database are consistent for the data to be migrated that the full migration module 131 has migrated, and store inconsistent data in a preset list. When inconsistent data exists in the preset list, the inconsistent data is rechecked at preset intervals, and data that is still inconsistent after a preset number of rechecks is recorded in the migration evaluation report.

The full verification module 133 also determines the tables to be verified according to the white list of the full migration module 131. When a whole database set is migrated in full, the white list is empty, and the data in the whole database set is verified. When the first database is heterogeneous to the second database, the data to be migrated is first read and converted into the preset intermediate format, converted through the intermediate format into the types of the second database, and then compared with the corresponding data in the second database.

Whether the first database is in the shutdown state or the non-shutdown state, full verification can be carried out after the full migration finishes. In the non-shutdown state, the data is migrated with the first time, at which the full migration starts, as a reference; during the full migration, new write operations may modify the data to be migrated, so that the data written into the second database is inconsistent with the modified data to be migrated.

For example, after the data in the table "dbtest"."tabletest" of the Oracle database is migrated to the table `dbtest`.`tabletest` in Mysql, the date value of the column "C_DATE_1" in Oracle is inconsistent with the date value of the corresponding column `C_DATE_1` in Mysql, possibly because the time was changed during the migration. The corresponding columns "C_DATE_1" and `C_DATE_1` are recorded in the preset list and rechecked at preset intervals (e.g., every 3 seconds); data that is still inconsistent after a preset number of rechecks (e.g., 20) is recorded in the migration report. The migration report records that the full migration from the table "dbtest"."tabletest" to the table `dbtest`.`tabletest` has completed but that inconsistent data exists in the corresponding columns "C_DATE_1" and `C_DATE_1`.
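The recheck loop can be sketched as follows, using the example values of 3 seconds and 20 rechecks as defaults. The callback-based interface and all names are assumptions; the sleep function is injectable so the loop can be exercised without waiting.

```python
import time


def verify_with_retries(read_source, read_target, keys,
                        retries=20, interval_s=3.0, sleep=time.sleep):
    """Return the keys that are still inconsistent after the preset rechecks."""
    # First pass: everything that differs goes into the preset list.
    pending = [k for k in keys if read_source(k) != read_target(k)]
    for _ in range(retries):
        if not pending:
            break
        sleep(interval_s)  # wait the preset interval before rechecking
        pending = [k for k in pending if read_source(k) != read_target(k)]
    return pending  # these entries would be recorded in the migration report
```

Only entries already in the preset list are rechecked, so each pass becomes cheaper as delayed writes arrive at the second database.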

The incremental migration module 132 is configured to, with the first time at which the full migration starts as a reference, migrate the data to be migrated that changes after the first time to the second database. When the first database and the second database are heterogeneous, the module converts the data type of the data to be migrated into the data type of the second database through the preset intermediate format and writes the data to be migrated into the second database according to the converted data type.

The incremental migration is started when the full-volume migration is finished, and may be implemented by analyzing log data or analyzing an operation record of a trigger, which will be described below.

When implemented by parsing log data, the incremental migration module 132 reads the log of the first database, for example, the binary log (binlog) when the first database is Mysql, or the redo log when the first database is Oracle.

The log content that is read is parsed, converted into the preset intermediate format (byte arrays and string arrays of the Java language), and written into a queue (Queue), from which the data in the intermediate format is then read. The data to be read is determined by the white list, which is created during the full migration. For example, if the white list includes the table "dbtest"."tabletest", the data for this table is filtered from the Oracle redo log and, after operations such as type conversion, written into the table `dbtest`.`tabletest` of Mysql. While data is read from the queue and written into the second database, multiple threads can read and write concurrently, which improves the efficiency of incremental migration.

Incremental migration implemented by parsing log data supports transactional writes. A transaction is a series of operations performed as a single logical unit of work: either all of them are performed or none is. Transaction processing ensures that data-oriented resources are not permanently updated unless all operations within the transactional unit complete successfully. To qualify as a transaction, a logical unit of work must satisfy the so-called ACID (atomicity, consistency, isolation, and durability) properties. A transaction is a logical unit of work in database operation, and the transaction management subsystem of a Database Management System (DBMS) is responsible for processing it. That is, data that the client commits to the first database in one transaction is also written to the second database in one transaction during incremental migration.

This is important for applications that cannot accept intermediate states of data, such as money transfers or the inventory information of goods during online shopping. While the user has not completed the transfer operation at the client, or the shopping flow has not finished, no incremental migration is performed even though some data in the table has changed; all the changed data is migrated incrementally only after the transfer or shopping operation finishes.
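The transactional behavior described above can be sketched as follows: parsed log events pass through a queue, changes are buffered per source transaction, and a transaction's changes are handed to the writer only when its commit event arrives. The event layout (txn, op, table keys) and the function names are hypothetical and do not reflect the actual binlog or redo log format.

```python
from collections import defaultdict
from queue import Queue


def apply_committed(events, whitelist, apply_transaction):
    """Apply whole transactions only; return changes of still-open transactions."""
    queue = Queue()
    for event in events:  # stand-in for the log parser filling the queue
        queue.put(event)

    open_txns = defaultdict(list)
    while not queue.empty():
        event = queue.get()
        txn = event["txn"]
        if event["op"] == "commit":
            changes = open_txns.pop(txn, [])
            if changes:
                apply_transaction(changes)  # written to the target as one unit
        elif event["op"] == "rollback":
            open_txns.pop(txn, None)        # nothing reaches the target
        elif not whitelist or event["table"] in whitelist:
            open_txns[txn].append(event)    # buffer until the commit arrives
    return dict(open_txns)
```

Changes that belong to a transaction without a commit event stay in the returned buffer, which corresponds to the unfinished transfer or shopping flow in the example.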

When the incremental migration module 132 is implemented by a trigger, Data Manipulation Language (DML) operations after the first time are recorded through the trigger function of the first database and written into an increment table; the information in the increment table is read, the corresponding data is looked up in the first database (a back-check), and the data found is converted into the preset intermediate format (byte arrays and string arrays of the Java language) and written into a queue; the data is then read from the queue and, after type conversion, written into the second database. The increment table stores primary key information rather than the original data, so the original data must be obtained by a back-check on the primary key information. For example, after the full migration starts, the client writes to the table "dbtest"."tabletest" of the Oracle database, and the trigger records the primary key information id = 1 in the increment table; after the incremental migration starts, the primary key information id = 1 is read from the increment table, and the back-check finds the corresponding written data "C_STRING_1" = fh2aje.

The incremental verification module 134 is configured to, when the data that changes after the first time and is migrated by the incremental migration module 132 has caught up with the current data changes of the data to be migrated, determine in real time, for the data that changes after the first time, whether the data of the first database and the data of the second database are consistent, and record inconsistent data in a preset list. When inconsistent data exists in the preset list, the inconsistent data is rechecked at preset intervals, and data that is still inconsistent after a preset number of rechecks is recorded in the migration report.

For example, data migration is performed on the table "dbtest"."tabletest" of an Oracle database. If four write operations have occurred on the table since the first time, at which the full migration started, and the incremental migration has migrated the data written by these four operations to the table `dbtest`.`tabletest` of Mysql, the caught-up (catch) state is reached, which may also be called the "tied" state. At this point, incremental verification of the incrementally migrated data starts.

Like incremental migration, incremental verification can be implemented either by parsing log data or by a trigger.

When implemented by parsing log data, the log data of the first database is read and parsed, and the data of the databases or tables involved in the incremental migration is filtered out through the white list. The columns of the corresponding database or table whose data has changed after the first time are queried; the corresponding database, table, and column in the second database are found according to the database, table, and column mapping relationship; and the data changed after the first time is compared, after type conversion, with the corresponding data of the second database. Inconsistent data is written into the preset list.

For example, the white list includes the table "dbtest"."tabletest" of the Oracle database. The data for the table "dbtest"."tabletest" is read from the redo log of the Oracle database, and a total of four write operations after the first time are found: two on the column "C_STRING_1", one on the column "C_ID", and one on the column "C_DATE_1". The current data of the columns "C_STRING_1", "C_ID", and "C_DATE_1" in the table "dbtest"."tabletest" is read. According to the database, table, and column mapping relationship, the current data of the columns `C_STRING_1`, `C_ID`, and `C_DATE_1` in the table `dbtest`.`tabletest` of the Mysql database is queried. After type conversion, the current data of the Oracle columns "C_STRING_1", "C_ID", and "C_DATE_1" is compared with the current data of the Mysql columns `C_STRING_1`, `C_ID`, and `C_DATE_1`, and any inconsistency is recorded in the preset list.

When implemented by a trigger, the information in the increment table of the first database is read first, and the corresponding actual data in the first database is obtained by a back-check on the primary key information recorded in the increment table. The actual data belonging to the tables under incremental migration is then filtered out according to the white list, and the actual data in the first database is compared with the corresponding data in the second database after table and column mapping and type conversion. Inconsistent data is written into the preset list.

For example, the increment table includes the primary keys ID 1 to 20, and four of the back-checked records relate to the table "dbtest"."tabletest" of the Oracle database in the white list: two for the column "C_STRING_1", one for the column "C_ID", and one for the column "C_DATE_1". According to the database, table, and column mapping relationship, the current data of the columns `C_STRING_1`, `C_ID`, and `C_DATE_1` in the table `dbtest`.`tabletest` of the Mysql database is read. The back-checked data of the columns "C_STRING_1", "C_ID", and "C_DATE_1" of the table "dbtest"."tabletest" of the Oracle database is type-converted and compared with the current data of the columns `C_STRING_1`, `C_ID`, and `C_DATE_1` in the table `dbtest`.`tabletest`, and any inconsistency is written into the preset list.
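The trigger mode can be illustrated with an in-memory SQLite database standing in for the first database: triggers record primary keys in an increment table, the keys are back-checked to obtain the current rows, and those rows are compared with the target. The schema, the trigger names, and the dictionary standing in for the second database are all assumptions, and type conversion is omitted.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE tabletest (id INTEGER PRIMARY KEY, c_string_1 TEXT);
-- The increment table stores primary key information, not the row data.
CREATE TABLE increment_table (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                              table_name TEXT, pk INTEGER);
CREATE TRIGGER trg_tabletest_ins AFTER INSERT ON tabletest
BEGIN
    INSERT INTO increment_table (table_name, pk) VALUES ('tabletest', NEW.id);
END;
CREATE TRIGGER trg_tabletest_upd AFTER UPDATE ON tabletest
BEGIN
    INSERT INTO increment_table (table_name, pk) VALUES ('tabletest', NEW.id);
END;
""")


def back_check(conn, whitelist):
    """Look up the current source rows for the keys in the increment table."""
    keys = conn.execute(
        "SELECT DISTINCT table_name, pk FROM increment_table ORDER BY pk").fetchall()
    rows = {}
    for table_name, pk in keys:
        if whitelist and table_name not in whitelist:
            continue  # not part of the incremental migration
        row = conn.execute(
            "SELECT id, c_string_1 FROM tabletest WHERE id = ?", (pk,)).fetchone()
        if row is not None:
            rows[pk] = row
    return rows


def inconsistent_keys(source_rows, target_rows):
    """Keys whose target row differs from the back-checked source row."""
    return [pk for pk, row in source_rows.items() if target_rows.get(pk) != row]


source.execute("INSERT INTO tabletest VALUES (1, 'initial')")
source.execute("UPDATE tabletest SET c_string_1 = 'fh2aje' WHERE id = 1")
source.execute("INSERT INTO tabletest VALUES (2, 'abc')")
target_rows = {1: (1, "fh2aje"), 2: (2, "stale")}  # stand-in for the second database
```

Because only keys are recorded, several writes to the same row collapse into one back-check that returns the latest value.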

Incremental verification rechecks inconsistent data multiple times. Because writing data to the second database may be delayed, the incremental verification module 134 rechecks the inconsistent data recorded in the preset list at preset intervals (e.g., every 3 seconds), and records the data in the migration report if it is still inconsistent after the preset number of rechecks (e.g., 20). The migration report records that the full and incremental migration from the table "dbtest"."tabletest" to the table `dbtest`.`tabletest` has completed but that inconsistent data exists, e.g., in the corresponding columns "C_DATE_1" and `C_DATE_1`.

It should be noted that, in the embodiment of the present application, each of the full migration module 131, the incremental migration module 132, the full verification module 133, the incremental verification module 134, the structure migration module 135, and the migration evaluation module 136 can work independently to realize a specific function, and multiple modules may also be combined according to the user's selection to form systems with different functions. For example, when the full migration module 131 is used as an independent module, multi-threaded distributed data migration of multiple tables across multiple databases can be realized. When the full migration module 131 is combined with the migration evaluation module 136, the first database and the second database can be evaluated before the full migration is performed, so as to learn the basic conditions of the source database and the destination database and to estimate the full migration in advance. The full migration module 131 may further be combined with the structure migration module 135: when the first database is heterogeneous to the second database and the amount of data to be migrated is large, performing structure migration in advance improves the efficiency of full migration and saves time.

The full verification module 133 depends on the full migration module 131 and can verify the data consistency of the source database and the destination database after the data has been migrated in full. The cases above describe the full migration module 131 combined with the structure migration module 135, the migration evaluation module 136, and the full verification module 133 individually; the full migration module 131 may also be combined with any two or all three of the migration evaluation module 136, the full verification module 133, and the structure migration module 135 at the same time, which gives eight optional combinations. The eight combinations are shown in fig. 1a to 1h, respectively.

FIG. 1a is an architecture diagram of the full migration module 131 used independently; FIG. 1b, of the full migration module 131 combined with the structure migration module 135; FIG. 1c, of the full migration module 131 combined with the migration evaluation module 136; FIG. 1d, of the full migration module 131 combined with the structure migration module 135 and the migration evaluation module 136; FIG. 1e, of the full migration module 131 combined with the structure migration module 135 and the full verification module 133; FIG. 1f, of the full migration module 131 combined with the migration evaluation module 136 and the full verification module 133; FIG. 1g, of the full migration module 131 combined with the full verification module 133; and FIG. 1h, of the full migration module 131 combined with the structure migration module 135, the migration evaluation module 136, and the full verification module 133.

When the modules are used in combination, the function of each module is as described in the embodiment corresponding to FIG. 1 and is not repeated here.

In addition, the incremental migration module 132 in the embodiments of the present application also depends on the full migration module 131, and incremental migration usually runs after the full migration is completed. Given the implementations of the full migration module 131 described above, the incremental migration module 132 can be used in the data migration system in the following ways: (1) combined with the full migration module 131; (2) combined with the full migration module 131 and the full verification module 133; (3) combined with the full migration module 131 and the incremental verification module 134; and (4) combined with the full migration module 131, the incremental verification module 134, and the full verification module 133. Each of these four combinations may additionally include neither, either one, or both of the structure migration module 135 and the migration evaluation module 136, giving sixteen optional combinations, as shown in FIGS. 1i to 1x:

FIG. 1i is an architecture diagram of the full migration module 131 combined with the incremental migration module 132; FIG. 1j, of the full migration module 131 combined with the structure migration module 135 and the incremental migration module 132; FIG. 1k, of the full migration module 131 combined with the migration evaluation module 136 and the incremental migration module 132; FIG. 1l, of the full migration module 131 combined with the structure migration module 135, the incremental migration module 132, and the migration evaluation module 136; FIG. 1m, of the full migration module 131 combined with the structure migration module 135, the incremental migration module 132, and the full verification module 133; FIG. 1n, of the full migration module 131 combined with the migration evaluation module 136, the incremental migration module 132, and the full verification module 133; FIG. 1o, of the full migration module 131 combined with the full verification module 133 and the incremental migration module 132; FIG. 1p, of the full migration module 131 combined with the structure migration module 135, the migration evaluation module 136, the full verification module 133, and the incremental migration module 132; FIG. 1q, of the full migration module 131 combined with the incremental migration module 132 and the incremental verification module 134; FIG. 1r, of the full migration module 131 combined with the structure migration module 135, the incremental migration module 132, and the incremental verification module 134; FIG. 1s, of the full migration module 131 combined with the migration evaluation module 136, the incremental migration module 132, and the incremental verification module 134; FIG. 1t, of the full migration module 131 combined with the structure migration module 135, the incremental migration module 132, the migration evaluation module 136, and the incremental verification module 134; FIG. 1u, of the full migration module 131 combined with the structure migration module 135, the incremental migration module 132, the full verification module 133, and the incremental verification module 134; FIG. 1v, of the full migration module 131 combined with the migration evaluation module 136, the incremental migration module 132, the full verification module 133, and the incremental verification module 134; FIG. 1w, of the full migration module 131 combined with the full verification module 133, the incremental migration module 132, and the incremental verification module 134; and FIG. 1x, of the full migration module 131 combined with the structure migration module 135, the migration evaluation module 136, the full verification module 133, the incremental migration module 132, and the incremental verification module 134. When the modules are used in combination, the function of each module is as described in the embodiment corresponding to FIG. 1 and is not repeated here.
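The combination counts stated above (eight and sixteen) can be checked by enumeration. The following is a minimal sketch that only uses the module reference numerals from the text; the function name `module_combinations` is hypothetical and not part of the application.

```python
from itertools import combinations
from typing import List, Tuple


def module_combinations(base: Tuple[int, ...], optional: Tuple[int, ...]) -> List[Tuple[int, ...]]:
    """List every combination made of the mandatory modules plus any subset of the optional ones."""
    result = []
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            result.append(base + extra)
    return result


# Full migration module 131 alone or with any of 133, 135, 136 (FIGS. 1a to 1h).
full_only = module_combinations((131,), (133, 135, 136))

# Modules 131 and 132 together, with any of 133, 134, 135, 136 (FIGS. 1i to 1x).
with_incremental = module_combinations((131, 132), (133, 134, 135, 136))
```

Here `full_only` contains 8 tuples and `with_incremental` contains 16, matching the number of figures in each group.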

The data migration system provided by the embodiments of the present application can provide migration services such as full migration, incremental migration, full verification, incremental verification, migration evaluation, and structure migration, and the user selects the migration types as required. The system is general-purpose across homogeneous and heterogeneous databases and can perform full migration with the source database either shut down or running. It provides structure migration before the full migration, which saves a large amount of time and labor when the amount of data to migrate is large. It provides full verification after the full migration, so that the user can promptly find inconsistencies introduced during migration. It provides incremental migration and incremental verification in the non-shutdown state, so that newly added data is synchronized in time, the increments are verified in real time, and inconsistencies in the incremental data are found promptly. The data migration system also has a distributed scheduling function, which improves the processing efficiency of migrations with large data volumes.

With the data migration system provided by the application, full migration can be performed when the source database is allowed to stop running; further, full verification is performed after the full migration; further, migration evaluation and structure migration are performed before the full migration as the situation requires. Taking the migration of the database amptest1 in Oracle to liuy_amptest1 in MySQL as an example, the process is shown in FIG. 4 and includes the following steps.

In step S210, the created migration task is received. The task migrates amptest1 to liuy_amptest1; its migration types include migration evaluation, structure migration, full migration, and full verification; and the Oracle database amptest1 is in a shutdown state.

In step S220, the migration task is evaluated, and an evaluation report is output.

The evaluation report includes the space usage of amptest1, its database, table, column, and object information, the SQL it executes, and its processing performance; the report further includes information on features that are incompatible with liuy_amptest1 and the cost of adapting them.

In step S230, when an instruction to continue executing the migration task is received, the tables of amptest1 are created in liuy_amptest1, and the corresponding columns are created for each table through the type mapping relationship.

Optionally, step S230 further includes step S231, in which the tables to be migrated in amptest1 are determined according to a preconfigured white list. The white list is a user-specified list of the tables to be migrated: if only certain tables in the Oracle database are to be migrated, their names are added to the white list; if the first database is to be migrated as a whole, the white list is empty.

Optionally, step S230 further includes step S232, in which, once the tables to be migrated have been determined according to the white list, those tables are split.

This step is not limited to a specific splitting algorithm; any of horizontal splitting, vertical splitting, hash splitting, and similar algorithms may be used. The splitting algorithm divides a table to be migrated into multiple subtasks that are processed asynchronously, which improves the efficiency of the full migration.
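As an illustration of one of the options named above, the following minimal sketch performs a horizontal split on an integer primary key. The `SubTask` type, the `split_table` function, and the assumption that the primary key is a dense integer range are all hypothetical; the application does not prescribe them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubTask:
    table: str
    pk_start: int  # inclusive lower bound of the primary-key range
    pk_end: int    # exclusive upper bound of the primary-key range


def split_table(table: str, min_pk: int, max_pk: int, chunk_rows: int) -> List[SubTask]:
    """Horizontally split the key range [min_pk, max_pk] into contiguous subtasks."""
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be positive")
    tasks = []
    start = min_pk
    while start <= max_pk:
        end = min(start + chunk_rows, max_pk + 1)
        tasks.append(SubTask(table, start, end))
        start = end
    return tasks
```

Each resulting subtask can then be read with a range predicate such as `pk >= pk_start AND pk < pk_end` and handed to a separate worker.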

In step S240, all data in amptest1 is migrated to the corresponding tables and columns in liuy_amptest1. Optionally, the data migration process of this embodiment further includes step S241, in which, each time data is written into liuy_amptest1, the storage location information of the data to be migrated is recorded so that the transfer can resume from the breakpoint.

Optionally, the data migration process of this embodiment further includes step S242: each time data is written into liuy_amptest1, the position (site) information of the data to be migrated is recorded for breakpoint resume.

Specifically, a position (site) table increment_trx may be created in the target MySQL server to record the positions reached by the full migration. If the server restarts abnormally, the position table allows the task to resume from the breakpoint after the restart.
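The breakpoint-resume idea can be sketched as follows. This is a minimal illustration under stated assumptions: Python's built-in `sqlite3` stands in for the target MySQL server, and the columns of `increment_trx` (`task_id`, `last_id`), the data table `t`, and the function name are hypothetical, since the application does not give the table's schema.

```python
import sqlite3


def migrate_with_checkpoint(src_rows, dst, task_id, batch_size=2, fail_after=None):
    """Copy (id, value) rows in id order, committing each batch together with its position."""
    dst.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY, value TEXT)")
    dst.execute(
        "CREATE TABLE IF NOT EXISTS increment_trx (task_id TEXT PRIMARY KEY, last_id INTEGER)"
    )
    row = dst.execute(
        "SELECT last_id FROM increment_trx WHERE task_id = ?", (task_id,)
    ).fetchone()
    last_id = row[0] if row else 0
    # Resume: skip everything at or below the recorded position.
    pending = [r for r in sorted(src_rows) if r[0] > last_id]
    written = 0
    for i in range(0, len(pending), batch_size):
        if fail_after is not None and written >= fail_after:
            raise RuntimeError("simulated abnormal restart")
        batch = pending[i:i + batch_size]
        with dst:  # one transaction: the data and its position commit or roll back together
            dst.executemany("INSERT INTO t (id, value) VALUES (?, ?)", batch)
            dst.execute(
                "INSERT OR REPLACE INTO increment_trx (task_id, last_id) VALUES (?, ?)",
                (task_id, batch[-1][0]),
            )
        written += len(batch)
    return written
```

Writing the position in the same transaction as the data is a design choice of this sketch: after a crash the recorded position can never be ahead of, or behind, the rows actually present in the target.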

In step S250, the data in amptest1 is compared with the data in the corresponding tables and columns of liuy_amptest1.

The comparison in this step determines whether the data migrated from the source database to the destination database is consistent; any inconsistent data is stored in a preset list. While the preset list contains inconsistent data, that data is re-checked at a preset interval, and if it is still inconsistent after a preset number of re-checks, it is recorded in the migration evaluation report. In this way, data inconsistencies arising during the full migration are found promptly, the data migration is verified in real time, and the accuracy of the migration is further guaranteed.

Specifically, the tables requiring full verification are determined according to the preconfigured white list; when the entire database is fully migrated and the white list is empty, all data in the database is verified. In particular, when the first database is heterogeneous to the second database, the data to be migrated must first be read and converted into a preset intermediate format, then converted from the intermediate format into the types of the second database, and only then compared with the corresponding data in the second database. For example, since the Oracle database and the MySQL database in this embodiment are heterogeneous, the data to be migrated is read from Oracle, converted into the preset intermediate format, converted into the corresponding MySQL data types, and then compared with the corresponding data in MySQL, thereby verifying the full data migration in real time.
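The role of the intermediate format in heterogeneous comparison can be sketched as below. This is a simplified illustration, not the format used by the application: it normalizes values from both sides into one neutral Python representation before comparing, and the functions `to_intermediate` and `diff_rows` and the chosen normalization rules are assumptions.

```python
from datetime import date, datetime
from decimal import Decimal


def to_intermediate(value):
    """Normalize a driver-specific value into a neutral, comparable representation."""
    if value is None:
        return None
    if isinstance(value, bool):
        return Decimal(int(value))
    if isinstance(value, (int, float, Decimal)):
        return Decimal(str(value))  # numeric types compare by value, not by source type
    if isinstance(value, datetime):  # checked before date: datetime is a subclass of date
        return value.isoformat(sep=" ")
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, bytes):
        return value.hex()
    return str(value)


def diff_rows(src_rows, dst_rows):
    """Return the primary keys whose normalized rows differ; inputs map pk -> tuple of columns."""
    mismatched = []
    for pk in sorted(set(src_rows) | set(dst_rows)):
        src, dst = src_rows.get(pk), dst_rows.get(pk)
        if src is None or dst is None:
            mismatched.append(pk)  # row missing on one side
        elif tuple(map(to_intermediate, src)) != tuple(map(to_intermediate, dst)):
            mismatched.append(pk)
    return mismatched
```

With this normalization, a source value of `Decimal("1.50")` and a target value of `1.5` compare as equal, while a genuinely different value is reported by its primary key.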

In an alternative implementation manner, with reference to fig. 5, the full migration process of the embodiment of the present application may be further implemented by the following steps:

In step S501, the tables to be migrated are determined according to a preconfigured white list, or it is determined that the entire first database is to be migrated.

In step S502, the table to be migrated is divided into a plurality of subtasks.

In this step, a large table is divided into multiple subtasks for processing, and the distributed scheduling module schedules each subtask so as to balance the load and improve the efficiency of data migration.
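One simple way to balance subtasks across workers is a greedy least-loaded assignment, sketched below. The application does not specify the scheduling policy of the distributed scheduling module, so the policy, the function `assign_subtasks`, and the use of estimated row counts as the load measure are all assumptions.

```python
import heapq
from typing import Dict, List


def assign_subtasks(subtask_sizes: Dict[str, int], workers: List[str]) -> Dict[str, List[str]]:
    """Greedy balancing: hand the largest remaining subtask to the least-loaded worker."""
    heap = [(0, worker) for worker in workers]  # (current load, worker name)
    heapq.heapify(heap)
    plan = {worker: [] for worker in workers}
    for name, size in sorted(subtask_sizes.items(), key=lambda kv: -kv[1]):
        load, worker = heapq.heappop(heap)
        plan[worker].append(name)
        heapq.heappush(heap, (load + size, worker))
    return plan
```

For subtasks of sizes 5, 4, 3, and 2 and two workers, this assigns a total load of 7 to each worker.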

In step S503, each subtask reads the data of the first database that the scheduling system has assigned to it for processing.

In step S504, when the first database is heterogeneous to the second database, the data read from the first database is converted into a preset intermediate format.

In step S505, the types that the databases, tables, columns, and objects to be migrated from the first database will take in the second database are determined according to the preset type mapping relationship.

In step S506, the databases, tables, columns, and objects corresponding to those types are created in the second database.

In step S507, the data to be migrated is migrated from the first database to the second database.

In step S508, each time data is written into the second database, the position information of the data to be migrated is recorded for breakpoint resume.

With the data migration system provided by the application, full migration and incremental migration can be performed when the source database is not allowed to stop running; further, full verification is performed after the full migration and incremental verification is performed after the incremental migration; further, migration evaluation and structure migration are performed before the full migration as the situation requires. Still taking the migration of the database amptest1 in Oracle to liuy_amptest1 in MySQL as an example, the process is shown in FIG. 6 and includes the following steps.

In step S601, the created migration task is received. The task migrates amptest1 to liuy_amptest1; its migration types include migration evaluation, structure migration, full migration, full verification, incremental migration, and incremental verification; and the Oracle database amptest1 is in a non-shutdown state.

In step S602, the current migration task is evaluated, and an evaluation report is output.

In step S603, when an instruction to continue executing the migration task is received, the tables of amptest1 are created in liuy_amptest1, and the corresponding columns are created for each table through the type mapping relationship.

In step S604, all data in amptest1 is migrated to the corresponding tables and columns in liuy_amptest1.

In step S605, the data in amptest1 is compared with the data in the corresponding tables and columns of liuy_amptest1.

In step S606, the data in amptest1 that has changed since the start of the full migration is migrated to the corresponding tables and columns in liuy_amptest1.

In step S607, when the incremental migration reaches the "caught up" state, the data in amptest1 is compared in real time with the incrementally migrated data in liuy_amptest1.

In step S608, access to amptest1 is switched to liuy_amptest1.

In step S609, reverse incremental migration is started from a second time, namely the time at which data is first written into liuy_amptest1 after the switch, and the changed data in liuy_amptest1 is migrated to the corresponding tables and columns in amptest1.

In step S610, when the reverse incremental migration started at the second time reaches the "caught up" state, the data in amptest1 that has been reverse incrementally migrated is compared in real time with the data in liuy_amptest1.

In step S611, when an abnormality occurs in liuy_amptest1, access to liuy_amptest1 is switched back to amptest1.

In this embodiment, the full migration and the incremental migration are completed while the database keeps running, and full verification and incremental verification are performed on the migrated data in real time, so that the consistency of the migrated data is verified in real time. Typical reasons for data migration are that the first database stores too much data, so part of the data must be moved to the second database, or that part of the data in the first database is accessed so heavily that normal access to other data is affected, so the heavily accessed data must be moved to the second database. Therefore, once the migrated data is completely consistent, client access to the data to be migrated can be switched from the first database to the second database.

After the switch, access may need to be switched back to the first database, for example because a bug appears in the application program during operation or because the processing capability of the second database proves insufficient. To keep the data to be migrated consistent in the two databases in real time, changes to that data in the second database are also flowed back to the first database in real time through reverse incremental migration, and the data flowed back is verified in real time through incremental verification.

For example, after write operations are switched to liuy_amptest1, the first write operation modifies the data in the column "id_type" of the table "a00_full_type_table_m1b1", and the time of that write operation is the second time. At this point, the reverse incremental migration from liuy_amptest1 to amptest1 begins.

By 25 seconds after the second time, the data in the table "a00_full_type_table_m1b1" has been modified four times; once all four modifications have been migrated to amptest1, the incremental verification of the reverse incremental migration is started to verify whether the current data of amptest1 and liuy_amptest1 is consistent.

When an abnormality occurs in liuy_amptest1, access such as write operations can be switched back to amptest1 in real time, because the reverse incremental migration has already been performed.

In the non-stop data migration method provided by this embodiment, real-time incremental migration and incremental verification are performed between the two databases, so that the two databases are kept consistent, and access to corresponding data can be switched between the two databases in real time.
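The switch and switch-back flow can be illustrated with an in-memory simulation. This is only a sketch under stated assumptions: the `Store` class, its `changelog`, and the `replay` function are hypothetical stand-ins for the two databases and the reverse incremental migration, not the mechanism of the application.

```python
class Store:
    """A toy database: current rows plus an ordered log of the changes applied to it."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.changelog = []  # ordered (op, pk, value) records

    def write(self, pk, value):
        self.rows[pk] = value
        self.changelog.append(("upsert", pk, value))

    def delete(self, pk):
        self.rows.pop(pk, None)
        self.changelog.append(("delete", pk, None))


def replay(source, target, position):
    """Apply the source changes logged after `position` to the target; return the new position."""
    for op, pk, value in source.changelog[position:]:
        if op == "upsert":
            target.rows[pk] = value  # applied directly, so the target log does not echo it back
        else:
            target.rows.pop(pk, None)
    return len(source.changelog)


def is_consistent(a, b):
    """Incremental verification reduced to its essence: both sides hold the same rows."""
    return a.rows == b.rows
```

After the switch, writes go to the new store; calling `replay(new, old, position)` repeatedly keeps the old store consistent, so access can be switched back as soon as `is_consistent(old, new)` holds.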

Specifically, with reference to fig. 7a and 7b, in an optional implementation manner, in this embodiment of the present application, the log-based incremental migration process may further be implemented by the following steps:

In step S701, the log of the first database is read and parsed to obtain incremental data.

Generally, incremental migration creates a parsing channel for the incremental task in advance; for example, the embodiment of the present application may use a topoic store to identify one data parsing channel.

The log of the first database, namely the source database, records the incremental data, so the data newly added to the source database can be obtained by parsing the log of the first database.

In step S702, the tables to be migrated in the incremental data are determined through a white list.

Incremental migration is performed after the full migration is finished. After the incremental data is obtained from the topoic store, it is filtered through the white list to obtain the tables to be migrated. The white list is the same user-specified white list that is preconfigured for the full migration.

In step S703, the data to be migrated is converted into a preset intermediate format and written into a queue (Queue).

In step S704, the data in the intermediate format is read from the Queue by multiple threads, and the database-table mapping is performed.

In step S705, the data to be migrated is written into the second database, and the location information of the data to be migrated is recorded for breakpoint transmission.

Further, when the primary key of the data table has an update operation (update) or a delete operation (delete), in step S706, the data writing process is further split into writing and deleting according to the update result of the first database.

For example, when the primary key value of a row in a data table of the first database changes from 1 to 2, a delete operation is performed in the mapped table of the second database on the row whose primary key value is 1, and a write operation is performed in that table for the row whose primary key value is 2. Of course, this is only an example and is not intended to limit the embodiments of the present application.
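The splitting of a primary key update into a delete plus a write can be sketched as follows. The change-record layout (a dict with `op`, `old`, and `new` keys and a primary key column named `id`) and the function `split_change` are assumptions made for illustration only.

```python
def split_change(change):
    """Turn one parsed source change into the operations to apply on the target."""
    op = change["op"]
    if op == "insert":
        return [("write", change["new"])]
    if op == "delete":
        return [("delete", change["old"]["id"])]
    if op == "update":
        old_pk, new_pk = change["old"]["id"], change["new"]["id"]
        if old_pk != new_pk:
            # The primary key changed: remove the old row, then write the new one.
            return [("delete", old_pk), ("write", change["new"])]
        return [("write", change["new"])]
    raise ValueError(f"unknown operation: {op}")
```

For the example above, an update that moves the primary key from 1 to 2 yields a delete for key 1 followed by a write of the row with key 2.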

In an optional implementation manner, with reference to fig. 8, in this embodiment of the present application, the incremental migration process based on the trigger manner may further be implemented by the following steps:

In step S801, an increment table and a trigger are first created.

In step S802, the information in the increment table of the first database is read, and the corresponding original data is queried back from the first database according to the primary key information recorded in the increment table.

In step S803, the data is filtered according to the white list to obtain the actual data of the tables to be incrementally migrated.

In step S804, the data to be migrated is converted into a preset intermediate format, and the database-table mapping is performed.

In step S805, the data to be migrated is written into the second database, and the location information of the processed increment is recorded for breakpoint resume.

It should be noted that, only the primary key information, not the original data, is stored in the increment table, so that before writing the data to be migrated into the second database, the source database, that is, the first database, needs to be back-checked according to the obtained information in the increment table, so as to obtain the original data to be migrated, and write the original data into the second database.

Further, if the primary key of the data table in the first database has an update operation (update) or a delete operation (delete), in step S806, the data writing process is further divided into writing and deleting according to the update result of the first database.
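The trigger-based flow can be sketched with Python's built-in `sqlite3` standing in for the source database. The table `orders`, the increment table `increment_log` with its columns, the trigger names, and both functions are hypothetical; primary key updates (step S806) are left out of this sketch for brevity.

```python
import sqlite3


def setup_source(conn):
    """Create a sample table, an increment table holding only primary keys, and its triggers."""
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
        CREATE TABLE increment_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT, pk INTEGER, op TEXT);
        CREATE TRIGGER orders_ai AFTER INSERT ON orders
            BEGIN INSERT INTO increment_log (pk, op) VALUES (NEW.id, 'I'); END;
        CREATE TRIGGER orders_au AFTER UPDATE ON orders
            BEGIN INSERT INTO increment_log (pk, op) VALUES (NEW.id, 'U'); END;
        CREATE TRIGGER orders_ad AFTER DELETE ON orders
            BEGIN INSERT INTO increment_log (pk, op) VALUES (OLD.id, 'D'); END;
    """)


def pull_increments(conn, last_seq):
    """Read increment rows after last_seq and query the source back by primary key."""
    entries = conn.execute(
        "SELECT seq, pk FROM increment_log WHERE seq > ? ORDER BY seq", (last_seq,)
    ).fetchall()
    operations = []
    for seq, pk in entries:
        row = conn.execute("SELECT id, amount FROM orders WHERE id = ?", (pk,)).fetchone()
        # The increment table has no row data, so the current source row decides the action.
        operations.append(("delete", pk) if row is None else ("write", row))
        last_seq = seq
    return operations, last_seq
```

The returned `last_seq` plays the role of the recorded position: passing it to the next call resumes after the last processed increment.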

By using the data migration system provided by the present application, the full amount check is performed after the incremental migration, specifically, the implementation process of the full amount check is as shown in fig. 9, and includes the following steps:

In step S901, the table information of the first database is read, and the tables to be verified are obtained according to a preconfigured white list.

In step S902, data is read from the first database, converted into a preset intermediate format, and mapped through the database-table mapping.

In step S903, the second database is queried, according to the database-table mapping, for the data migrated from the first database.

It should be noted that step S902 and step S903 are parallel steps, and their order is not limited in the embodiments of the present application.

In step S904, the data obtained in step S902 and step S903 are compared according to the mapping relationship of the library table, and if the results are consistent, step S905 is executed, otherwise step S906 is executed.

In step S905, the verification process ends.

In step S906, the inconsistent data is saved in a preset list.

While the preset list contains inconsistent data, that data is re-checked at a preset interval, and if it is still inconsistent after a preset number of re-checks, it is recorded in the migration evaluation report.
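The re-check loop over the preset list can be sketched as follows. The function `recheck`, its parameters, and the use of key lookups as the comparison are assumptions for illustration; the preset interval and count are passed in as `interval_s` and `retries`.

```python
import time


def recheck(pending_keys, fetch_src, fetch_dst, retries=3, interval_s=0.0, sleep=time.sleep):
    """Re-compare the keys in the preset list; return those still inconsistent after `retries` rounds."""
    remaining = list(pending_keys)
    for _ in range(retries):
        if not remaining:
            break
        sleep(interval_s)  # wait for the preset interval before the next round
        remaining = [key for key in remaining if fetch_src(key) != fetch_dst(key)]
    return remaining
```

Keys that become consistent in a later round drop out of the list; whatever is returned at the end is what would be written to the migration evaluation report.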

By adopting the data migration system provided by the application, increment verification is carried out after increment migration. The incremental verification based on the log is implemented as shown in fig. 10, and includes the following steps:

In step S1001, the log of the first database is read and its content is parsed.

In step S1002, the first database is queried according to the parsing result to obtain the incremental data to be migrated in the first database.

In step S1003, the incremental data queried from the first database is filtered by the white list to obtain the databases, tables, and data involved in the incremental migration, which are converted into a preset intermediate format.

In step S1004, the database-table mapping is performed, and the databases, tables, and data involved in the incremental migration are queried in the second database according to the mapping result.

In step S1005, the tables and data involved in the incremental migration that were obtained in step S1003 and step S1004 are compared according to the database-table mapping; if they are consistent, step S1006 is executed, otherwise step S1007 is executed.

In step S1006, the position information of the incremental verification is saved.

In step S1007, inconsistent data is saved in a preset list, and the process jumps to step S1006.
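The control flow of steps S1005 to S1007 can be sketched as below: mismatches go to the preset list, and the verification position is saved in both branches. The function `incremental_check`, the `state` dictionary, and the representation of changes as `(position, primary key)` pairs are hypothetical.

```python
def incremental_check(changes, src, dst, state):
    """Verify the keys changed after state['position']; always advance the saved position."""
    for position, pk in changes:
        if position <= state["position"]:
            continue  # already verified before a restart
        if src.get(pk) != dst.get(pk):
            state["mismatches"].append(pk)  # S1007: keep inconsistent data in the preset list
        state["position"] = position        # S1006: save the verification position
    return state
```

Because the position is saved whether or not a mismatch was found, a restarted verification continues after the last checked change instead of re-reporting earlier mismatches.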

Specifically, in the embodiment of the present application, an incremental verification implementation process based on a trigger is shown in fig. 11, and includes the following steps:

In step S1101, the increment table of the first database is read.

In step S1102, the first database is queried according to the increment table to obtain the databases, tables, and data involved in the incremental migration in the first database.

In step S1103, the incremental data queried in the first database is filtered by a white list and converted into a preset intermediate format.

In step S1104, a library table mapping is performed, and the library tables and data involved in the incremental migration in the second database are queried according to the result of the library table mapping.

In step S1105, the data acquired in step S1103 and step S1104 are compared with each other based on the table mapping relationship, and if they match, step S1106 is executed, and if they do not match, step S1107 is executed.

In step S1106, the position information of the incremental verification is saved.

In step S1107, inconsistent data is saved in a preset list, and the process jumps to step S1106.

In the embodiment, the problem of inconsistent data in the data migration process can be timely found through the full amount check and the incremental amount check, and particularly for the data of the core application, the integrity and the safety of the migrated data are ensured.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

As used in the specification and in the claims, certain terms are used to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This specification and the claims do not intend to distinguish between components that differ in name but not in function. In the description and in the claims, the terms "include" and "comprise" are used in an open-ended fashion, and thus should be interpreted to mean "include, but not limited to." "Substantially" means within an acceptable error range, within which a person skilled in the art can solve the technical problem and substantially achieve the technical effect. Furthermore, the term "coupled" is intended to encompass any direct or indirect electrical coupling. Thus, if a first device is coupled to a second device, that connection may be through a direct electrical coupling or through an indirect electrical coupling via other devices and couplings. The description above sets out preferred embodiments of the present application, but it is made for the purpose of illustrating the general principles of the application and not for the purpose of limiting its scope. The protection scope of the present application shall be subject to the definitions of the appended claims.

It is also noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that an article or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such article or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the article or system that includes the element.

The foregoing description shows and describes several preferred embodiments of the present application. As stated above, however, the application is not limited to the forms disclosed herein, which should not be construed as excluding other embodiments; the application is capable of use in various other combinations, modifications, and environments, and is capable of changes within the scope of the inventive concept expressed herein, in accordance with the above teachings or the skill or knowledge of the relevant art. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the application shall fall within the protection of the appended claims.

Claims

1. A data migration system, comprising: an entrance (Portal) module, an Application Programming Interface (API) module, a distributed scheduling module, and a full-scale migration module;

the entrance module is configured to provide an entry point for managing migration tasks, and to receive a migration task created by a client together with the designated first database, second database, and data to be migrated from the first database, wherein the migration task comprises full migration;

2. The data migration system of claim 1, wherein said system further comprises a first migration evaluation unit,

3. The data migration system of claim 1, wherein said system further comprises a structure migration module,

4. The data migration system of claim 3, wherein said system further comprises a second migration evaluation unit,

5. The data migration system of claim 1,

6. The data migration system of claim 1,

7. The data migration system of claim 1, wherein the full-scale migration module is further configured to,

8. The data migration system of claim 1, wherein the full-scale migration module is further configured to,

9. The data migration system of claim 1, wherein the full-scale migration module is further configured to,

10. A data migration system, comprising: an entrance (Portal) module, an Application Programming Interface (API) module, a distributed scheduling module, a full-scale migration module, and an incremental migration module;

11. The data migration system of claim 10, wherein said system further comprises a first migration evaluation unit,

12. The data migration system of claim 10, wherein the system further comprises a structure migration module,

13. The data migration system of claim 12, wherein said system further comprises a second migration evaluation unit,

14. The data migration system of claim 10,

15. The data migration system of claim 10,

16. The data migration system of claim 10, wherein the full-scale migration module is further configured to,

17. The data migration system of claim 10, wherein the full-scale migration module is further configured to,

18. The data migration system of claim 10, wherein the full-scale migration module is further configured to,

19. The data migration system of claim 10, wherein the incremental migration module is further configured to,

and acquiring the data which changes after the first moment by using the log of the first database or the operation record saved by the trigger of the first database.
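The trigger-based variant of this step can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes SQLite as a stand-in for both databases, and the table `users`, the table `change_log`, the trigger names, and the helper functions are all hypothetical. The "first moment" is modeled as the highest change-log sequence number recorded when the full migration begins.

```python
import sqlite3

def setup_source(conn):
    # Hypothetical schema: triggers save every operation on `users` into `change_log`.
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE change_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT, row_id INTEGER, name TEXT
        );
        CREATE TRIGGER users_ins AFTER INSERT ON users BEGIN
            INSERT INTO change_log (op, row_id, name) VALUES ('I', NEW.id, NEW.name);
        END;
        CREATE TRIGGER users_upd AFTER UPDATE ON users BEGIN
            INSERT INTO change_log (op, row_id, name) VALUES ('U', NEW.id, NEW.name);
        END;
        CREATE TRIGGER users_del AFTER DELETE ON users BEGIN
            INSERT INTO change_log (op, row_id, name) VALUES ('D', OLD.id, NULL);
        END;
    """)

def current_position(conn):
    # The "first moment" is modeled as the latest change-log sequence number.
    return conn.execute("SELECT COALESCE(MAX(seq), 0) FROM change_log").fetchone()[0]

def apply_changes_since(source, target, first_moment):
    # Replay, in order, every operation recorded after the first moment.
    rows = source.execute(
        "SELECT seq, op, row_id, name FROM change_log WHERE seq > ? ORDER BY seq",
        (first_moment,),
    ).fetchall()
    last = first_moment
    for seq, op, row_id, name in rows:
        if op == "D":
            target.execute("DELETE FROM users WHERE id = ?", (row_id,))
        else:
            target.execute(
                "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)", (row_id, name)
            )
        last = seq
    target.commit()
    return last

def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    setup_source(source)
    target.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    source.execute("INSERT INTO users VALUES (1, 'a')")
    source.execute("INSERT INTO users VALUES (2, 'b')")
    # Record the first moment, then take the full copy.
    first_moment = current_position(source)
    target.executemany(
        "INSERT INTO users VALUES (?, ?)",
        source.execute("SELECT id, name FROM users").fetchall(),
    )
    # Writes that arrive after the first moment.
    source.execute("UPDATE users SET name = 'a2' WHERE id = 1")
    source.execute("DELETE FROM users WHERE id = 2")
    source.execute("INSERT INTO users VALUES (3, 'c')")
    apply_changes_since(source, target, first_moment)
    return target.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

A log-based variant would read the same ordered stream of operations from the database's own log instead of from a trigger-maintained table; the replay loop would be unchanged.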

20. A method of data migration, comprising:

21. The data migration method of claim 20, wherein when the migration task further includes a migration evaluation, prior to invoking the full-scale migration module, the method further comprises:

and calling a first migration evaluation unit, and outputting an evaluation report according to the first database and the second database, wherein the evaluation report comprises space occupation information, table, column, and object information of the first database, executed SQL (Structured Query Language) statements, and processing performance information, and the evaluation report further comprises incompatible feature information and transformation cost information for the second database.
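As an illustration only, the contents of such an evaluation report could be modeled as a small data structure. The sketch below is an assumption, not the claimed evaluation unit: the class `EvaluationReport`, the function `evaluate`, the set `SUPPORTED_TYPES`, and the idea of measuring transformation cost as a count of incompatible columns are all hypothetical, and processing performance information is omitted for brevity.

```python
from dataclasses import dataclass, field

# Hypothetical set of column types that the second database accepts.
SUPPORTED_TYPES = {"INTEGER", "TEXT", "REAL"}

@dataclass
class EvaluationReport:
    space_bytes: int      # space occupied in the first database
    tables: dict          # table name -> {column name: column type}
    executed_sql: list    # sampled SQL statements run against the first database
    incompatible: list = field(default_factory=list)  # (table, column, type)
    transformation_cost: int = 0  # here: number of columns that need changing

def evaluate(space_bytes, tables, executed_sql):
    report = EvaluationReport(space_bytes, tables, executed_sql)
    for table, columns in sorted(tables.items()):
        for column, col_type in sorted(columns.items()):
            # Flag every column whose type the second database does not support.
            if col_type.upper() not in SUPPORTED_TYPES:
                report.incompatible.append((table, column, col_type))
    report.transformation_cost = len(report.incompatible)
    return report
```

For example, `evaluate(1024, {"orders": {"id": "INTEGER", "payload": "XMLTYPE"}}, ["SELECT 1"])` flags the `payload` column as incompatible and reports a transformation cost of 1.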

22. The data migration method of claim 20, wherein when the migration task further includes structure migration, prior to invoking the full-scale migration module, the method further comprises:

23. The data migration method of claim 22, wherein when the migration task further includes a migration evaluation, prior to invoking the structure migration module, the method further comprises:

24. The data migration method according to claim 20, wherein invoking the full-scale migration module, when migrating the data to be migrated from the first database to the second database, further comprises:

and recording position information of the written data, so that an interrupted migration can be resumed from the breakpoint.
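Recording the write position for breakpoint resumption can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed module: SQLite stands in for both databases, the tables `items` and `checkpoint` and the function `migrate_with_checkpoint` are hypothetical, primary keys are assumed to be positive integers, and the `fail_after_batches` parameter exists only to simulate an interruption.

```python
import sqlite3

def migrate_with_checkpoint(source, target, batch_size=2, fail_after_batches=None):
    target.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, val TEXT)")
    target.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint (k TEXT PRIMARY KEY, last_id INTEGER)"
    )
    # Resume from the recorded position, or start from the beginning.
    row = target.execute("SELECT last_id FROM checkpoint WHERE k = 'items'").fetchone()
    last_id = row[0] if row else 0
    batches = 0
    while True:
        rows = source.execute(
            "SELECT id, val FROM items WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return last_id
        if fail_after_batches is not None and batches >= fail_after_batches:
            raise RuntimeError("simulated interruption")
        target.executemany("INSERT INTO items VALUES (?, ?)", rows)
        last_id = rows[-1][0]
        # The position is committed together with the batch it describes.
        target.execute("INSERT OR REPLACE INTO checkpoint VALUES ('items', ?)", (last_id,))
        target.commit()
        batches += 1

def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val TEXT)")
    source.executemany(
        "INSERT INTO items VALUES (?, ?)", [(i, f"v{i}") for i in range(1, 6)]
    )
    try:
        migrate_with_checkpoint(source, target, batch_size=2, fail_after_batches=1)
    except RuntimeError:
        pass  # the first run stops after one batch
    copied_before = target.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    migrate_with_checkpoint(source, target, batch_size=2)
    copied_after = target.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return copied_before, copied_after
```

Committing the position in the same transaction as the batch means that a restart neither skips rows nor writes a batch twice, which is one plausible reason to store the position alongside the written data.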

25. The data migration method of claim 20, wherein invoking said full-scale migration module writes said data to be migrated to said second database, further comprising,

26. The data migration method of claim 20, wherein invoking a full-scale migration module to migrate the data to be migrated from the first database to the second database, further comprises,

27. The data migration method of claim 20, wherein invoking a full-scale migration module to migrate the data to be migrated from the first database to the second database, further comprises,

28. A method of data migration, comprising:

29. The data migration method of claim 28, wherein when the migration task further includes a migration evaluation, prior to invoking the full-scale migration module, the method further comprises:

30. The data migration method of claim 28, wherein when the migration task further includes structure migration, prior to invoking the full-scale migration module, the method further comprises:

31. The data migration method of claim 30, wherein when the migration task further includes a migration evaluation, prior to invoking the structure migration module, the method further comprises:

32. The data migration method according to claim 28, wherein invoking the full-scale migration module or the incremental migration module, when migrating the data to be migrated from the first database to the second database, further comprises:

33. The data migration method of claim 28, wherein invoking said full-scale migration module writes said data to be migrated to said second database, further comprising,

34. The data migration method of claim 28, wherein invoking a full-scale migration module migrates said data to be migrated from said first database to said second database, further comprising,

35. The data migration method of claim 28, wherein invoking a full-scale migration module migrates said data to be migrated from said first database to said second database, further comprising,

36. The data migration method of claim 28, wherein invoking the incremental migration module specifically comprises,

37. The data migration method of claim 28, wherein the method further comprises:

when the write operations of the client are switched from the first database to the second database, taking as a reference a second moment at which the client first writes data to the second database, and migrating the data of the second database that changes after the second moment to the first database;

38. The data migration method of claim 37, wherein the method further comprises:

and when the access of the client to the second database is abnormal, switching the write operation of the client to the second database back to the first database.
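The reverse migration and switch-back described in claims 37 and 38 can be illustrated with a small in-memory sketch. It is an assumption-laden illustration rather than the claimed method: the class `WriteRouter`, its methods, and the `second_healthy` flag are hypothetical, plain dictionaries stand in for the two databases, and the reverse log is assumed to be kept outside the second database so that it remains readable when the second database is not.

```python
class WriteRouter:
    """Routes client writes and keeps the first database ready for switch-back."""

    def __init__(self):
        self.first = {}        # stand-in for the first database
        self.second = {}       # stand-in for the second database
        self.active = "first"
        self.reverse_log = []  # changes to the second database after the second moment

    def switch_to_second(self):
        self.active = "second"

    def sync_back(self):
        # Reverse incremental migration: replay the logged changes on the first database.
        for key, value in self.reverse_log:
            self.first[key] = value
        self.reverse_log.clear()

    def write(self, key, value, second_healthy=True):
        if self.active == "second":
            if second_healthy:
                # The first such write defines the second moment; log from then on.
                self.second[key] = value
                self.reverse_log.append((key, value))
                return "second"
            # Abnormal access: drain pending changes, then switch writes back.
            self.sync_back()
            self.active = "first"
        self.first[key] = value
        return "first"

def demo():
    router = WriteRouter()
    router.first["a"] = 1
    router.second["a"] = 1   # state after the forward migration
    router.switch_to_second()
    router.write("b", 2)     # first write to the second database
    router.sync_back()
    where = router.write("c", 3, second_healthy=False)
    return where, router.active, router.first
```

Because changes made after the second moment are continuously replayed on the first database, the first database holds current data at the moment writes are switched back to it.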