CN112181951B

CN112181951B - Heterogeneous database data migration method, device and equipment

Info

Publication number: CN112181951B
Application number: CN202011128698.7A
Authority: CN
Inventors: 董晨辉
Original assignee: New H3C Big Data Technologies Co Ltd
Current assignee: New H3C Big Data Technologies Co Ltd
Priority date: 2020-10-20
Filing date: 2020-10-20
Publication date: 2022-03-25
Anticipated expiration: 2040-10-20
Also published as: CN112181951A

Abstract

The application provides a data migration method, device and equipment for heterogeneous databases. When data migration fails, a characteristic value, determined from log information describing the errors generated while migrating data from a first database to a second database, is input into a decision tree model to obtain the highest-priority target solution for each error. A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to that table. The technical scheme provided by the embodiments of the application can therefore improve database data migration efficiency while saving human resources.

Description

Heterogeneous database data migration method, device and equipment

Technical Field

The present application relates to the field of database technologies, and in particular, to a method, an apparatus, and a device for data migration of a heterogeneous database.

Background

A database serves as a repository for data, and a corresponding database management system stores the data in it. In practice, however, a newer generation of database management system often offers clear advantages over the one a third-party application currently uses. To improve the application's performance, it is therefore often necessary to migrate data from the original database to a database whose database management system differs from, and performs better than, that of the original database. Because the original database and the destination database have different database management systems, the migration requires a relatively high investment of human resources. Yet database processing is only weakly related to the application's business logic, and it demands skill in several database management systems, which is a great challenge for developers.

For this reason, data is generally migrated automatically between databases of different database management systems through a third-party application program. Migration, however, often does not succeed on the first attempt. When data migration fails, the usual approach is to analyze manually the log information describing the errors caused by the failure, revise one by one, in the original syntax relation mapping table, each error recorded in the log information to form a new syntax relation mapping table, and then use the new table to migrate the data in the original database automatically again through the third-party application program. If migration fails again, the syntax relation mapping table is manually revised again, and so on until data migration succeeds. Thus, whenever data migration fails, a large amount of human resources and time is spent analyzing the cause of the failure and correcting the errors one by one. Moreover, manual negligence during correction can reintroduce the same errors and cause repeated migration failures, so the efficiency of database data migration is low.

Disclosure of Invention

In view of this, the present application provides a data migration method, apparatus and device for a heterogeneous database, so as to improve data migration efficiency on the basis of saving human resources.

Specifically, the method is realized through the following technical scheme:

in one aspect, an embodiment of the present application provides a data migration method for a heterogeneous database, where the method includes:

obtaining a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information which is generated when data in a first database is migrated to a second database and is used for describing errors;

inputting the obtained characteristic values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;

and generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.

On the other hand, based on the same concept, an embodiment of the present application further provides a heterogeneous database data migration apparatus, where the apparatus includes:

a characteristic value obtaining unit, configured to obtain a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information which is generated when data in a first database is migrated to a second database and is used for describing errors;

a solution output unit, configured to input the obtained feature values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;

and the data migration unit is used for generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.

In yet another aspect, an embodiment of the present application provides an electronic device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to implement the method steps for data migration of heterogeneous databases according to the above embodiments.

In yet another aspect, embodiments of the present application further provide a machine-readable storage medium storing machine-executable instructions, which when invoked and executed by a processor, cause the processor to implement the method steps of data migration of a heterogeneous database described in the foregoing embodiments.

As can be seen from the foregoing technical solutions, in the embodiments of the present application, when data migration fails, a characteristic value determined according to log information describing the errors generated when data in a first database is migrated to a second database is input into a decision tree model to obtain the highest-priority solution (the target solution) for each error. A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to the target syntax relation mapping table. By adopting the target solution output by the decision tree model when data migration fails, the success rate of data migration can be greatly improved. At the same time, the data migration is completed automatically, with as little manual intervention as possible, until it succeeds, so the data migration efficiency of the database can be improved while saving human resources.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.

Fig. 1 is a schematic flowchart of a data migration method for a heterogeneous database according to an embodiment of the present application;

Fig. 2 is a schematic flowchart of another heterogeneous database data migration method according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a data migration apparatus for heterogeneous databases according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in detail below with reference to the accompanying drawings and specific embodiments.

In this application, databases managed by the same database management system are regarded as the same kind of database, and databases managed by different database management systems are regarded as different (heterogeneous) databases. Database data migration here refers to migrating the data in one database to a database of a different kind.

In practical application, the database can be automatically migrated through a third-party application program, and data migration can also be performed by inserting a script for automatically migrating the database data into the database to be migrated. However, whether data migration is performed by using a third-party application or the script, there may be a case where data migration cannot be successful at one time. For clarity of description, an example is illustrated, specifically:

In practical application, MySQL is a relational DBMS (Database Management System). Because it is small, fast, open source, and has a low total cost of ownership, many websites choose MySQL as their database management system during development.

PostgreSQL, like MySQL, is an open-source DBMS. However, a MySQL database used as a product's built-in database must comply with the highest-level open-source license, under which the product itself must also be open-sourced, which is clearly detrimental to protecting a core product. By contrast, a PostgreSQL database does not require the product to be open-sourced. In addition, Greenplum, which is built on PostgreSQL with an MPP (Massively Parallel Processing) architecture, holds a mainstream position in the market, and its number of users grows year by year. PostgreSQL is therefore the best built-in database to replace MySQL, and migrating the data stored in MySQL databases to PostgreSQL databases is an urgent need.

In such an application scenario, third-party application programs and scripts for migrating data from the MySQL database to the PostgreSQL database have appeared, enabling one-click intelligent data migration from the MySQL database to the PostgreSQL database. This improves the data migration efficiency of the developers and implementers mentioned in the background, but the data migration still often cannot succeed on the first attempt.

At present, when data migration fails, the cause of the error is usually analyzed through manual intervention as described in the background, and the erroneous table structure definition statements are modified line by line, which requires a large investment of human resources and a correspondingly large amount of time. Because there are many mapping relations between different table structure definition statements, omissions occur easily, and only a professional familiar with both the MySQL database and the PostgreSQL database can correct them. A syntax relation mapping table for migrating data from the MySQL database to the PostgreSQL database is then created again and set in the third-party application program or script, and one-click intelligent data migration is executed again, until the migration succeeds. Therefore, when data migration fails, a large amount of human resources and time is spent analyzing the cause of the failure and then resolving the errors one by one.

To solve the problems in the prior art, the present application obtains a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information describing the errors generated when data in a first database is migrated to a second database; inputs the obtained characteristic values into the decision tree model to obtain a target solution for each error, wherein the decision tree model obtains, from the input characteristic values, at least one solution corresponding to each error and the priority of each solution and, for each error, selects the solution with the highest priority as the target solution and outputs it; and generates, according to the target solution of each error, a target syntax relation mapping table for migrating the data in the first database to the second database, and migrates the data in the first database to the second database according to the target syntax relation mapping table. Because the decision tree model outputs the highest-priority solution for each error as the target solution, the target solution is the optimal solution, and the success rate of automatically migrating the data in the first database to the second database according to the syntax relation mapping table generated from it is greatly improved. In addition, when data migration fails, the migration is completed automatically, with as little manual intervention as possible, until it succeeds, so data migration efficiency can be improved while saving human resources.

The data migration method for the heterogeneous database provided by the embodiment of the application can be applied to a third-party application program, and the third-party application program can be DBConvert for MySQL & PostgreSQL (a data conversion tool for migrating data in MySQL to PostgreSQL) or Navicat. The method and the device can also be applied to scripts inserted in a database management system, and the method and the device are not particularly limited in the application.

However, when the third-party application program is used for data migration, the details and the method for controlling data migration may not be flexibly realized because a developer cannot know the data migration principle of the bottom layer of the third-party application program in detail, and thus the developer cannot easily handle the situation of data migration failure when the data migration failure occurs. In addition, this also results in a complete reliance on third party applications, and if a problem occurs with a third party application, the data migration cannot be completed.

Compared with data migration through a third-party application program, data migration through a script does not need to deal with the mapping of field syntax in table structure definition statements: the field syntax can be matched by algorithms built into the database to be migrated and the database into which the data is migrated. Developers can also work directly on the commands in the script when data migration fails. In addition, because the script itself performs the data migration, it can run wherever a script runtime environment is available (current operating systems all provide one), without depending on a third-party application program. Accordingly, whether the method is applied in a third-party application program or in an inserted script can be chosen flexibly according to the actual situation.

For convenience of description, the following embodiments are all described as applied to a database management system into which a script has been inserted.

Referring to fig. 1, fig. 1 is a schematic flowchart of a data migration method for a heterogeneous database according to an embodiment of the present application, where the method is applied to a database management system, and the method may include the following steps:

step 101, obtaining a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information for describing errors generated when data in a first database is migrated to a second database.

The decision tree model is trained based on the configured sample error types and the priorities of the sample solutions corresponding to the sample error types. The decision tree model is described in detail later and is not elaborated here.

In this embodiment of the present application, the first database and the second database do not refer to a fixed database, but may refer to any two databases belonging to different database management systems, and the following description of this embodiment of the present application is not repeated.

The first database is the database to be migrated, such as the MySQL database referred to above, and the second database is the database into which the data is migrated, such as the PostgreSQL database referred to above.

A database application is composed of a plurality of table structure definition statements. Migrating the data in the first database to the second database therefore means modifying all table structure definition statements of the first database into table structure definition statements of the second database. If all of them are so modified, the first database is already managed by the database management system of the second database, that is, the data in the first database has been migrated successfully. Conversely, if some of the table structure definition statements of the first database are not modified into table structure definition statements of the second database, the data in the first database has not been successfully migrated to the second database, that is, the migration has failed. The table structure definition statements of the first database are converted into those of the second database through the syntax relation mapping table. After a data migration failure, log information describing the errors is generated; it records the errors that occurred in the syntax relation mapping table. Each error has a corresponding error type, and the keywords extracted from each error type to represent it are the characteristic values in this step.

For example, in practical applications, when data in the MySQL database is migrated to the PostgreSQL database, the error type that occurs most frequently is: ERROR: relation "example.person" does not exist. When the characteristic value is extracted from this error type, "relation does not exist" can be used as the characteristic value corresponding to the error type.

Further, each table structure definition statement in the first database follows the field syntax specified by the first database. Accordingly, if a table structure definition statement is not converted to the field syntax of the second database during data migration, the migration of that statement may fail. When migrating the data in the first database to the second database fails, the generated log information includes information indicating the field syntax error, such as: # ERROR Log-ERROR, psql:example_data.sql:33 ERROR: current transaction is aborted, commands ignored until end of transaction block.

As an example, a specific implementation manner of implementing step 101 may be: analyzing corresponding error field grammar from the log information, and determining the error type corresponding to the error field grammar; and determining the corresponding characteristic value according to the error type.

In this embodiment, the table structure definition statement that has an error is determined from the log information, the corresponding error field syntax is parsed from the table structure definition statement that has an error, and the error type corresponding to the error field syntax is determined for each error field syntax.

Illustratively, the first statement in which an error occurs is retrieved from the log information and split into words, and the error type corresponding to each field syntax is identified according to the field syntax to which the resulting words belong.

In addition, after the error type corresponding to the field grammar of each error is determined, for each error type, key fields capable of representing the representative features of the error type are extracted from the error type, and the key fields are the feature values of the step.
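The extraction described in step 101 can be illustrated with a minimal sketch. It assumes that the log contains psql-style lines of the form psql:<file>:<line>: ERROR: <message>; the pattern table ERROR_PATTERNS, the regular expression LOG_LINE and the function extract_feature_values are hypothetical names introduced only for illustration and are not part of the described method.

```python
import re

# Hypothetical table: error-message pattern -> characteristic value (keyword).
# The patterns are illustrative assumptions, not an exhaustive catalogue.
ERROR_PATTERNS = [
    (re.compile(r'relation "[^"]+" does not exist'), "relation does not exist"),
    (re.compile(r'type "[^"]+" does not exist'), "type does not exist"),
    (re.compile(r"syntax error at or near"), "syntax error"),
]

# Assumed log line shape: psql:<file>:<line>: ERROR:  <message>
# (the colon after the line number is treated as optional).
LOG_LINE = re.compile(
    r"^psql:(?P<file>[^:]+):(?P<line>\d+):?\s+ERROR:\s+(?P<msg>.*)$"
)


def extract_feature_values(log_text):
    """Return (file, line number, characteristic value) per recognised error."""
    features = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if not match:
            continue  # not an error line in the assumed format
        message = match.group("msg")
        for pattern, feature in ERROR_PATTERNS:
            if pattern.search(message):
                features.append(
                    (match.group("file"), int(match.group("line")), feature)
                )
                break  # the first matching error type is used
    return features


sample_log = '''psql:example_data.sql:12: ERROR:  relation "example.person" does not exist
psql:example_data.sql:33: ERROR:  current transaction is aborted, commands ignored until end of transaction block
'''
print(extract_feature_values(sample_log))
```

In this sketch the second log line matches no configured pattern and is skipped, which corresponds to an error type that is not yet among the known error types.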

Step 102, inputting the obtained characteristic values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution.

In the application, each error corresponds to an error type, and each error has at least one solution. Based on practical experience, priorities can be assigned according to the success rate with which each solution resolves the error type: the higher the success rate of a solution for the error type, the higher its priority; conversely, the lower the success rate, the lower its priority.

Based on the above description of the solution, the target solution is the solution with the highest priority for solving each error, that is, the solution with the highest success rate for solving each error.
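The selection of the target solution can be sketched as follows. In this sketch a plain lookup table (CANDIDATES) stands in for the output of the trained decision tree model, larger numbers are assumed to mean higher priority, and the solution descriptions are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solution:
    description: str
    priority: int  # assumed convention: larger number = higher priority


# Hypothetical candidate solutions per characteristic value. A lookup table is
# used here in place of the trained decision tree model.
CANDIDATES = {
    "relation does not exist": [
        Solution("create the missing schema before creating the table", 3),
        Solution("strip the schema prefix from the table name", 2),
    ],
}


def select_target_solutions(feature_values, candidates=CANDIDATES):
    """For each characteristic value, pick the highest-priority candidate."""
    targets = {}
    for feature in feature_values:
        options = candidates.get(feature)
        if not options:
            continue  # unknown error type: left for manual handling
        targets[feature] = max(options, key=lambda s: s.priority)
    return targets


chosen = select_target_solutions(["relation does not exist"])
print(chosen["relation does not exist"].description)
```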

As an embodiment, the decision tree model may be trained in the following manner, specifically including the following steps A and B:

step A, obtaining sample error types, and extracting sample characteristic values corresponding to respective error types from the determined sample error types; the sample error type is obtained from a sample error log generated when data in the first database is migrated to the second database.

The sample error type may be determined from log information generated during a history migration process of data in the first database to the second database, and as an embodiment, a corresponding sample error field syntax may be parsed from the log information to determine a sample error type corresponding to the sample error field syntax; and determining a corresponding sample characteristic value according to the sample error type.

It should be noted that, if an error type that does not exist among the sample error types is found during use, it may be taken as a new sample error type and a sample solution for it determined, so that the decision tree model is further updated according to the training manner of step A and step B. This widens the application range of the decision tree model and raises the success rate of predicting the solution corresponding to each error.

Step B, training the decision tree model by using a supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.

Correspondingly, the sample solutions corresponding to the sample error types respectively are used for solving the error field grammar corresponding to each sample error type so as to obtain the correct solution corresponding to the field grammar of the second database.

Based on a supervised learning decision tree algorithm, constructing an initial decision tree model according to the priorities of solutions respectively corresponding to the sample error types; and training a decision tree model according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.

Therefore, in the technical scheme provided by the embodiment of the application, the decision tree model is trained by using the supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type, so that the learning capability is strong, and the method has the characteristics of simple training and easy expansion.

For a sample error type, the sample error type has at least one sample solution, each sample solution is provided with a priority, and the sample solution with the highest priority has the highest success rate for solving the sample error type.
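The text does not specify which supervised decision tree algorithm is used. As a minimal sketch under stated assumptions, the following uses a small ID3-style tree: each training row pairs boolean indicators derived from a sample error type with a label, and the label is assumed to be the highest-priority sample solution for that error type. The feature names, labels and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def build_tree(rows, features):
    """Build an ID3-style tree from rows of (feature_dict, label)."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a solution label

    def split_entropy(feature):
        # Weighted entropy of the labels after splitting on `feature`.
        groups = {}
        for attrs, label in rows:
            groups.setdefault(attrs[feature], []).append(label)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)  # lowest entropy = highest gain
    remaining = [f for f in features if f != best]
    branches = {}
    for value in {attrs[best] for attrs, _ in rows}:
        subset = [(a, lab) for a, lab in rows if a[best] == value]
        branches[value] = build_tree(subset, remaining)
    return {"feature": best, "branches": branches, "default": majority}


def predict(tree, attrs):
    """Walk the tree until a leaf (a solution label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(attrs[tree["feature"]], tree["default"])
    return tree


# Hypothetical samples: indicators of the sample error type -> assumed
# highest-priority sample solution.
TRAINING_ROWS = [
    ({"mentions_relation": True, "mentions_type": False}, "create missing schema"),
    ({"mentions_relation": False, "mentions_type": True}, "map column type"),
    ({"mentions_relation": False, "mentions_type": False}, "quote identifier"),
]

model = build_tree(TRAINING_ROWS, ["mentions_relation", "mentions_type"])
print(predict(model, {"mentions_relation": True, "mentions_type": False}))
```

Adding a newly discovered sample error type then amounts to appending a row to TRAINING_ROWS and rebuilding the tree, which is one simple way to realise the update described above.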

Step 103, generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.

The target syntax relationship mapping table characterizes a mapping relationship between the syntax of a field in the table structure definition statement of the first database and the syntax of a field in the table structure definition statement of the second database. That is, the table structure definition statement in the first database is modified according to the target syntax mapping table, and can become the table structure definition statement in the second database, so that the data in the first database is successfully migrated to the second database.

As an embodiment, a specific implementation manner of "generating a target syntax relation mapping table when migrating data in the first database to the second database according to a target solution of each error" in the step 103 may include steps C and D:

Step C, modifying the syntax relation mapping table when the data in the first database fails to be migrated to the second database according to the target solution of each error.

The syntax relation mapping table with which migration of the data in the first database to the second database failed may be the syntax relation mapping table created in advance, with which errors occurred when the data in the first database was migrated to the second database, or it may be a previously generated target syntax relation mapping table with which the migration failed again.

In this step, modifying the syntax relation mapping table with which the migration failed may consist of modifying the erroneous field syntax in each table structure definition statement that was mapped incorrectly, so that it conforms to the field syntax of the second database, thereby obtaining table structure definition statements that conform to the field syntax of the second database.

Step D, determining the modified syntax relation mapping table as the target syntax relation mapping table.

In this step, the target syntax mapping table is the syntax mapping table obtained by modifying the wrong table structure definition statement in the wrong syntax mapping table. This indicates that, according to the target syntax relationship mapping table, the probability of successful migration of the data in the first database to the second database is higher.

For clarity of description, an example is given, specifically:

for example, the original syntax relationship mapping table includes: statement A1 maps statement B1, statement A2 maps statement B2, and statement A3 maps statement B4. In the data migration process, it can be known through log information analysis that the statement A3 maps the statement B4 with an error, and based on the above steps 201 to 206, the mapping error of the statement A3 is solved, and the original syntax relation mapping table is corrected to obtain the target syntax relation mapping table, which specifically includes: statement A1 maps statement B1, statement A2 maps statement B2, and statement A3 maps statement B3.
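The correction in this example can be sketched in a few lines. The mapping table is represented here as a dictionary from source statement to target statement, which is an assumption made only for illustration; apply_target_solutions is a hypothetical helper name.

```python
def apply_target_solutions(mapping_table, target_solutions):
    """Return a new mapping table with each erroneous entry replaced.

    mapping_table:    {source statement: target statement}
    target_solutions: {source statement: corrected target statement}
    """
    corrected = dict(mapping_table)  # leave the original table untouched
    for source, fixed_target in target_solutions.items():
        if source not in corrected:
            raise KeyError(f"no mapping entry for {source!r}")
        corrected[source] = fixed_target
    return corrected


original = {"A1": "B1", "A2": "B2", "A3": "B4"}
target = apply_target_solutions(original, {"A3": "B3"})
print(target)  # {'A1': 'B1', 'A2': 'B2', 'A3': 'B3'}
```

Returning a copy rather than modifying the table in place keeps the failed mapping table available for comparison, which may be useful if a later attempt also fails.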

As an embodiment, a specific implementation manner of implementing "migrating data in the first database to the second database according to the target syntax relationship mapping table" in step 103 may include step E and step F:

Step E, reading the table structure definition statements in the first database from the specified text, and modifying the read table structure definition statements according to the target syntax relation mapping table to obtain modified table structure definition statements.

As an embodiment, the table structure definition statements in the first database may be exported into the specified text by a statement export instruction for exporting table structure definition statements.

The export instruction may be a mysqldump instruction.

It should be noted that this step exports only the table structure definition statements from the first database and does not include data insertion statements.
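A structure-only export of this kind can be sketched with the mysqldump option --no-data, which omits row data so that no INSERT statements are written. The function names, the default host and the assumption that credentials are supplied through a MySQL option file are illustrative and not taken from the original text.

```python
import subprocess


def build_schema_export_command(database, user, host="localhost"):
    """Build a mysqldump invocation that exports table definitions only."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--no-data",  # omit row data, so no INSERT statements are written
        database,
    ]


def export_schema(database, user, output_path):
    """Run mysqldump and write the table structure definitions to a file.

    Assumes mysqldump is installed and that the password is provided through
    an option file rather than on the command line.
    """
    with open(output_path, "w", encoding="utf-8") as out:
        subprocess.run(
            build_schema_export_command(database, user), stdout=out, check=True
        )
```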

As an embodiment, one table structure definition statement may be taken as a unit, the table structure definition statement in the first database is read line by line from the specified text, and the table structure definition statement in the specified text is modified into the table structure definition statement in the second database according to the target syntax relation mapping table.
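The line-by-line modification can be sketched as follows. The rules in SYNTAX_MAPPING are a hypothetical excerpt of a target syntax relation mapping table from MySQL field syntax to PostgreSQL field syntax, chosen only for illustration; a real table would contain many more entries.

```python
import re

# Hypothetical excerpt of a target syntax relation mapping table:
# (pattern in MySQL field syntax, replacement in PostgreSQL field syntax).
SYNTAX_MAPPING = [
    (re.compile(r"`([^`]+)`"), r'"\1"'),  # backtick-quoted identifiers
    (re.compile(r"\bTINYINT\(\d+\)", re.IGNORECASE), "SMALLINT"),
    (re.compile(r"\bINT\(\d+\)", re.IGNORECASE), "INTEGER"),
    (re.compile(r"\bDATETIME\b", re.IGNORECASE), "TIMESTAMP"),
    (re.compile(r"\s*ENGINE=\w+", re.IGNORECASE), ""),  # MySQL-only table option
]


def convert_statement(statement, mapping=SYNTAX_MAPPING):
    """Apply every mapping rule to one table structure definition statement."""
    for pattern, replacement in mapping:
        statement = pattern.sub(replacement, statement)
    return statement


def convert_lines(lines, mapping=SYNTAX_MAPPING):
    """Convert the exported text line by line."""
    return [convert_statement(line, mapping) for line in lines]


mysql_ddl = (
    "CREATE TABLE `person` (`id` INT(11) NOT NULL, "
    "`active` TINYINT(1), `born` DATETIME) ENGINE=InnoDB;"
)
print(convert_statement(mysql_ddl))
```

Keeping the rules as data rather than code means that correcting a mapping error after a failed migration only requires replacing an entry in the table.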

And F, importing the modified table structure definition statement into a second database so as to successfully migrate the data in the first database to the second database.

In this step, the modified table structure definition statements are imported into the second database. If the import succeeds, the table structures of the first database have been established under the database management system of the second database, and the data stored in the first database can then be inserted directly into the second database. The erroneous syntax relation mapping table is updated with the target syntax relation mapping table and data migration of the first database is performed again; if the data migration fails again, steps 101 to 103 are executed again until the data migration succeeds completely.
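Step E can be sketched as follows: statements are assembled line by line from the specified text and each one is rewritten with the mapping table. The sketch makes several assumptions that the source does not state: a statement ends at a line terminated by `;`, the mapping table is a dictionary of literal fragments, and the entry `DATETIME` to `TIMESTAMP` is only an illustrative example. The function names are hypothetical.

```python
def iter_statements(lines):
    """Yield one table structure definition statement at a time.

    Lines are accumulated until one ends with ';', which is assumed to
    terminate a statement (comments and string literals are not handled).
    """
    buffer = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        buffer.append(stripped)
        if stripped.endswith(";"):
            yield " ".join(buffer)
            buffer = []


def rewrite_statement(statement, mapping):
    """Replace each source-syntax fragment with its mapped target fragment."""
    for source_fragment, target_fragment in mapping.items():
        statement = statement.replace(source_fragment, target_fragment)
    return statement


dump_lines = [
    "CREATE TABLE t (",
    "  created DATETIME",
    ");",
]
mapping = {"DATETIME": "TIMESTAMP"}  # illustrative entry, not from the source
rewritten = [rewrite_statement(s, mapping) for s in iter_statements(dump_lines)]
print(rewritten)
```

Plain substring replacement is used only to keep the sketch short; a real mapping table would likely need to match whole tokens or patterns rather than raw substrings.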

As another embodiment, if the number of data migration attempts for the first database reaches a preset number and data migration has still not succeeded, manual intervention may be performed to analyze the error field syntax in the log information. For each error field syntax, if it is determined that the corresponding error type does not belong to the sample error types, a solution corresponding to the error field syntax is obtained and a new syntax relation mapping table is generated. If the data in the first database is successfully migrated to the second database using the generated syntax relation mapping table, the sample error types and corresponding sample solutions in the decision tree model are updated with the error type to which the error field syntax belongs and its solution, and a specific priority is assigned to the sample solution corresponding to that error type. The decision tree model is thereby updated so that it can output solutions more accurately, which reduces the number of manual interventions and brings the method as close as possible to the goal of fully automatic data migration.
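The retry behaviour with a preset number of attempts can be sketched as a simple loop. All names below are hypothetical: `try_import` stands in for importing the modified statements into the second database and returning the errors found in the log, and `solve_errors` stands in for the model that turns errors into corrections of the mapping table.

```python
def migrate_with_retries(mapping, try_import, solve_errors, max_attempts):
    """Retry migration, correcting the mapping table after each failure.

    try_import(mapping) returns a list of errors (empty on success).
    solve_errors(errors) returns mapping corrections for those errors.
    Returns (succeeded, mapping, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        errors = try_import(mapping)
        if not errors:
            return True, mapping, attempt
        mapping = {**mapping, **solve_errors(errors)}
    # Preset number of attempts exhausted: hand over to manual intervention.
    return False, mapping, max_attempts


# Stand-ins for the real import and the real model, for illustration only.
def fake_import(mapping):
    return [] if mapping.get("A3") == "B3" else ["A3"]


def fake_solver(errors):
    return {"A3": "B3"}


result = migrate_with_retries({"A3": "B4"}, fake_import, fake_solver, max_attempts=3)
print(result)
```

When the loop returns with `succeeded` set to `False`, the caller would fall back to manual analysis of the log information, as described in this embodiment.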

Therefore, compared with the prior art, in the technical solution of the embodiment of the present application, when data migration fails, the feature value determined from the log information that describes the errors generated when data in the first database is migrated to the second database is input into the decision tree model to obtain, for each error, the solution with the highest priority (the target solution). A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to the target syntax relation mapping table. It can be seen that, by applying the technical solution provided by the embodiment of the present application, adopting the target solutions output by the decision tree model can greatly improve the success rate of data migration after a failure. Meanwhile, manual intervention is avoided as far as possible and data migration is completed automatically as far as possible until it succeeds, so that data migration efficiency can be improved while saving human resources.

Referring to fig. 2, fig. 2 is a schematic flowchart of another heterogeneous database data migration method provided in an embodiment of the present application, where the method may include the following steps:

Step 201, obtaining log information that is generated when data in a first database is migrated to a second database and that describes errors.

Step 202, parsing the corresponding error field syntax from the log information, and determining the error type corresponding to the error field syntax.

Step 203, determining the corresponding feature value according to the error type.

Step 204, inputting the obtained feature values into a decision tree model to obtain a target solution for solving each error.

Step 205, according to the target solution of each error, modifying the syntax relation mapping table that was in use when the migration of the data in the first database to the second database failed.

Step 206, determining the modified syntax relation mapping table as the target syntax relation mapping table.

Step 207, reading the table structure definition statements in the first database from the specified text, and modifying the read table structure definition statements according to the target syntax relation mapping table to obtain modified table structure definition statements.

Step 208, importing the modified table structure definition statements into the second database so that the data in the first database is successfully migrated to the second database.
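Steps 201 to 203 can be sketched as follows. The source does not define the log format, the set of error types, or how feature values are encoded, so everything concrete below is an assumption: the two log lines imitate the style of PostgreSQL error messages, the error type names are invented for illustration, and each error type is simply assigned an integer feature value.

```python
import re

# Hypothetical error types with the patterns used to recognise them.
ERROR_PATTERNS = [
    ("unknown_type", re.compile(r'type "(?P<field>[^"]+)" does not exist')),
    ("syntax_error", re.compile(r'syntax error at or near "(?P<field>[^"]+)"')),
]
# Hypothetical encoding of each error type as a feature value.
FEATURE_VALUES = {"unknown_type": 0, "syntax_error": 1}


def extract_features(log_text):
    """Return (error_field, error_type, feature_value) per matching log line."""
    features = []
    for line in log_text.splitlines():
        for error_type, pattern in ERROR_PATTERNS:
            match = pattern.search(line)
            if match:
                features.append(
                    (match.group("field"), error_type, FEATURE_VALUES[error_type])
                )
                break
    return features


# Illustrative log content, not taken from the source.
sample_log = '''ERROR:  type "datetime" does not exist
ERROR:  syntax error at or near "AUTO_INCREMENT"'''
print(extract_features(sample_log))
```

The resulting feature values are what step 204 would pass to the decision tree model.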

Therefore, in the technical solution provided in the embodiment of the present application, when data migration fails, determining the feature values according to the error type to which each error field syntax belongs allows the feature value corresponding to each error to be determined comprehensively and systematically. This further improves the success rate with which the target solutions output by the decision tree model resolve each error in the log information, and in turn makes it more likely that the table structure definition statements modified according to the target syntax relation mapping table determined from the target solutions are imported successfully. Meanwhile, this embodiment avoids manual intervention as far as possible and completes data migration automatically as far as possible until it succeeds, so that the data migration efficiency of the database can be improved while saving human resources.

Based on the above description of the heterogeneous database data migration method, the method provided in the embodiment of the present application is described in detail below with a specific example, in which the first database is MySQL, the second database is PostgreSQL, and the data in MySQL is migrated to PostgreSQL. The specific method is as follows:

In the first step, the table structure definition statements in MySQL are exported into a specified text by the mysqldump instruction, which serves as the statement export instruction for exporting table structure definition statements. This step exports only the table structure definition statements from the first database and does not include data insertion statements.

In the second step, it is judged whether a syntax relation mapping table for migrating the data in MySQL to PostgreSQL exists in the specified text; if so, the third step is executed, and if not, the fourth step is executed.

In this step, it is judged whether the syntax relation mapping table exists in the specified text. If it exists, the third step is executed directly without creating a syntax relation mapping table; if it does not exist, a syntax relation mapping table needs to be created, so the fourth step is executed.

In the third step, the table structure definition statements in the specified text are modified into table structure definition statements of PostgreSQL according to the syntax relation mapping table, and the modified table structure definition statements are imported into PostgreSQL; if the import fails, the fifth to seventh steps are executed.

As an embodiment, taking one table structure definition statement as a unit, the specified text may be read line by line, and the table structure definition statements in the specified text may be modified into table structure definition statements of PostgreSQL according to the syntax relation mapping table.

In this step, the modified table structure definition statements are imported into PostgreSQL. If the import succeeds, the table structures of MySQL have been established under the database management system of PostgreSQL, and the data in MySQL can then be inserted directly into PostgreSQL, that is, the data in MySQL is migrated to PostgreSQL. If the import fails, the fifth to seventh steps are executed.
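To make the third step more concrete, the sketch below rewrites one MySQL table definition for PostgreSQL. The mapping entries are assumptions chosen for illustration and are not taken from the source: backtick-quoted identifiers become double-quoted identifiers, `TINYINT` becomes `SMALLINT`, `DATETIME` becomes `TIMESTAMP`, and the `ENGINE=` clause is dropped. The name `to_postgresql` is hypothetical.

```python
import re

# Illustrative mapping entries (assumptions, not taken from the source):
# each pair is (pattern in MySQL DDL, replacement for PostgreSQL).
MYSQL_TO_POSTGRESQL = [
    (re.compile(r"`([^`]+)`"), r'"\1"'),  # identifier quoting
    (re.compile(r"\bTINYINT(\(\d+\))?", re.IGNORECASE), "SMALLINT"),
    (re.compile(r"\bDATETIME\b", re.IGNORECASE), "TIMESTAMP"),
    (re.compile(r"\s*ENGINE=\w+", re.IGNORECASE), ""),  # storage engine clause
]


def to_postgresql(statement):
    """Apply every mapping entry, in order, to one MySQL statement."""
    for pattern, replacement in MYSQL_TO_POSTGRESQL:
        statement = pattern.sub(replacement, statement)
    return statement


mysql_ddl = (
    "CREATE TABLE `orders` (`id` INT, `flag` TINYINT(1), "
    "`created` DATETIME) ENGINE=InnoDB;"
)
print(to_postgresql(mysql_ddl))
```

A production mapping table would need many more entries (for example for auto-increment columns, character sets and index syntax); the four shown here only illustrate the mechanism.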

In the fourth step, a syntax relation mapping table is created, the table structure definition statements in the specified text are modified into table structure definition statements of PostgreSQL according to the syntax relation mapping table, and the modified table structure definition statements are imported into PostgreSQL; if the import fails, the fifth to seventh steps are executed.

The syntax relation mapping table can be obtained by taking one table structure definition statement in the specified text as a unit and directly matching the corresponding syntax in MySQL and PostgreSQL using a CART (Classification and Regression Trees) algorithm.

In the fifth step, the log information that is generated when the data in MySQL is migrated to PostgreSQL and that describes errors is acquired, the error field syntax corresponding to each table structure definition statement that generated an error is parsed from the log information, the error type corresponding to the error field syntax is determined, and the corresponding feature value is determined according to the error type.

In the sixth step, the obtained feature values are input into the decision tree model. According to the input feature values, the decision tree model obtains at least one solution corresponding to each error type and the priority of each solution and, for each error type, selects the solution with the highest priority as the target solution and outputs it. Each target solution thus has a higher success rate in resolving its corresponding error.
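The selection performed at the end of the sixth step can be sketched as follows. This sketch does not implement the decision tree itself; it assumes the candidate solutions and their priorities are already available per error type and only shows how the highest-priority candidate becomes the target solution. The error type names, priorities and solution descriptions are invented for illustration.

```python
# Hypothetical candidate solutions per error type, as (priority, solution).
CANDIDATE_SOLUTIONS = {
    "unknown_type": [
        (0.9, "map DATETIME to TIMESTAMP"),
        (0.4, "map DATETIME to TEXT"),
    ],
    "syntax_error": [
        (0.8, "drop AUTO_INCREMENT and use an identity column"),
    ],
}


def select_target_solutions(error_types, candidates):
    """Pick, for each error type, the candidate with the highest priority."""
    targets = {}
    for error_type in error_types:
        _priority, solution = max(candidates[error_type], key=lambda item: item[0])
        targets[error_type] = solution
    return targets


targets = select_target_solutions(["unknown_type", "syntax_error"], CANDIDATE_SOLUTIONS)
print(targets)
```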

In the seventh step, according to the target solution of each error, the syntax relation mapping table that was in use when the migration of the data in MySQL to PostgreSQL failed is modified, and the modified syntax relation mapping table is determined as the target syntax relation mapping table, so that migrating the data in MySQL to PostgreSQL according to the target syntax relation mapping table has a higher success rate.

In the eighth step, once the table structure definition statements of MySQL have been successfully modified into table structure definition statements of PostgreSQL, the data in MySQL can be successfully migrated into PostgreSQL. On this basis, in this step the table structure definition statements in MySQL are read from the specified text and modified according to the target syntax relation mapping table to obtain modified table structure definition statements, and the modified table structure definition statements are imported into PostgreSQL so that the data in MySQL is successfully migrated into PostgreSQL. Therefore, in the process of migrating the data in MySQL to PostgreSQL, manual intervention is avoided as far as possible and data migration is completed automatically as far as possible until it succeeds, so that data migration efficiency can be improved while saving human resources.

If, using the processing method provided by the embodiment of the present application, the data still cannot be successfully migrated to PostgreSQL after the number of data migration attempts for MySQL reaches the preset number, this indicates that the log information may contain a recurring error whose error type does not belong to the sample error types. In this situation, manual intervention is required to analyze the error and modify the syntax relation mapping table until the import is retried and the migration succeeds. Even so, the heterogeneous database data migration method provided by the embodiment of the present application does not require manual intervention on every data migration failure; manual intervention is required only after the number of data migration attempts reaches the preset number. Applying the embodiment of the present application can therefore further reduce the burden on personnel and improve data migration efficiency.

Based on the same application concept as the method, an embodiment of the present application further provides a heterogeneous database data migration apparatus 300. Fig. 3 is a structural diagram of the apparatus, which may include:

a feature value obtaining unit 301, configured to obtain a feature value to be input to a trained decision tree model, where the feature value is determined according to log information for describing an error, which is generated when data in a first database is migrated to a second database;

a solution output unit 302, configured to input the obtained feature values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;

a data migration unit 303, configured to generate, according to the target solution of each error, a target syntax relation mapping table for migrating data in the first database to the second database, and to migrate the data in the first database to the second database according to the target syntax relation mapping table.

In an embodiment of the present application, the characteristic value obtaining unit 301 may include:

a log information obtaining subunit, configured to obtain log information, which is generated when data in the first database is migrated to the second database, and is used for describing an error;

the error type determining subunit is used for analyzing corresponding error field grammar from the log information and determining the error type corresponding to the error field grammar;

and the characteristic value determining subunit is used for determining the corresponding characteristic value according to the error type.

In an embodiment of the application, the data migration unit 303 includes:

a syntax relation mapping table modifying subunit, configured to modify the syntax relation mapping table when the migration of the data in the first database to the second database fails, according to the target solution of each error;

and the target syntax relation mapping table determining subunit is used for determining the modified syntax relation mapping table as the target syntax relation mapping table.

In an embodiment of the application, the data migration unit 303 may further include:

a table structure definition statement modification subunit, configured to read the table structure definition statements in the first database from the specified text, and modify the read table structure definition statements according to the target syntax relation mapping table to obtain modified table structure definition statements;

and the data migration subunit is used for importing the modified table structure definition statement into the second database so as to successfully migrate the data in the first database into the second database.

In an embodiment of the present application, the apparatus may further include a model prediction module, configured to train the decision tree model;

wherein, the model prediction module is specifically configured to:

obtaining sample error types, and extracting sample characteristic values corresponding to respective error types from the determined sample error types; the sample error type is obtained from a sample error log generated when the data in the first database is migrated to the second database;

and training the decision tree model by using a supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.
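The training described above can be illustrated with a very small supervised decision tree learner. This is a simplified sketch under stated assumptions, not the training procedure of the present application: each sample is a list of numeric feature values, the label of each sample is taken to be the highest-priority sample solution configured for its error type, and splits are chosen by Gini impurity in the manner of CART. The sample data and all function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for index in range(len(rows[0])):
        for threshold in sorted({row[index] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[index] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[index] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (index, threshold), score
    return best


def train(rows, labels):
    """Build a tree: a leaf is a label, a node is (index, threshold, left, right)."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    index, threshold = split
    left = [(r, lab) for r, lab in zip(rows, labels) if r[index] <= threshold]
    right = [(r, lab) for r, lab in zip(rows, labels) if r[index] > threshold]
    return (
        index,
        threshold,
        train([r for r, _ in left], [lab for _, lab in left]),
        train([r for r, _ in right], [lab for _, lab in right]),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        index, threshold, left, right = tree
        tree = left if row[index] <= threshold else right
    return tree


# Hypothetical samples: one feature value per sample error type, labelled
# with the highest-priority sample solution configured for that type.
sample_rows = [[0], [0], [1], [2]]
sample_labels = ["use TIMESTAMP", "use TIMESTAMP", "use identity column", "strip ENGINE clause"]
model = train(sample_rows, sample_labels)
print(predict(model, [1]))
```

In this simplification the priorities are used only to choose each sample's label before training; a model that outputs several solutions together with their priorities, as described for the decision tree model, would need a richer label representation.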

In summary, in the technical solution provided in the embodiment of the present application, when data migration fails, the feature value determined from the log information that describes the errors generated when data in the first database is migrated to the second database is input into the decision tree model to obtain, for each error, the solution with the highest priority (the target solution). A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to the target syntax relation mapping table. It can be seen that, by applying the technical solution provided by the embodiment of the present application, adopting the target solutions output by the decision tree model can greatly improve the success rate of data migration after a failure. Meanwhile, manual intervention is avoided as far as possible and data migration is completed automatically as far as possible until it succeeds, so that the data migration efficiency of the database can be improved while saving human resources.

The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.

For the electronic device provided in the embodiment of the present application, a schematic diagram of the hardware architecture is shown in fig. 4. The electronic device includes a machine-readable storage medium and a processor, wherein the machine-readable storage medium stores machine-executable instructions executable by the processor, and the processor is configured to execute the machine-executable instructions to perform the heterogeneous database data migration operations disclosed in the above examples.

An embodiment of the present application further provides a machine-readable storage medium that stores machine-executable instructions which, when invoked and executed by a processor, cause the processor to perform the heterogeneous database data migration operations disclosed in the above examples.

Here, a machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and so forth. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing the present application, the functionality of the units may be implemented in one or more pieces of software and/or hardware.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.

The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims

1. A heterogeneous database data migration method, the method comprising:

2. The method of claim 1, wherein obtaining feature values to be input to the trained decision tree model comprises:

obtaining log information which is generated when data in a first database is migrated to a second database and is used for describing errors;

analyzing corresponding error field grammar from the log information, and determining an error type corresponding to the error field grammar;

and determining the corresponding characteristic value according to the error type.

3. The method of claim 1, wherein generating a target syntax mapping table for migrating data in a first database to a second database according to a target solution for each error comprises:

modifying a syntax relation mapping table when the data in the first database fails to be migrated to the second database according to the target solution of each error;

and determining the modified grammar relation mapping table as a target grammar relation mapping table.

4. The method of claim 1, wherein migrating data in a first database to a second database according to the target syntax relationship mapping table comprises:

reading a table structure definition statement in a first database from a specified text, and modifying the read table structure definition statement according to the target syntax relation mapping table to obtain a modified table structure definition statement;

and importing the modified table structure definition statement into the second database so as to successfully migrate the data in the first database to the second database.

5. The method of claim 1, wherein the decision tree model is trained by:

obtaining sample error types, and extracting sample characteristic values corresponding to respective error types from the determined sample error types; the sample error type is obtained from a sample error log generated when data in a first database is migrated to a second database;

6. An apparatus for data migration of a heterogeneous database, the apparatus comprising:

7. The apparatus of claim 6, wherein the data migration unit comprises:

8. The apparatus of claim 6, wherein the data migration unit further comprises:

a table structure definition statement modification subunit, configured to read a table structure definition statement in the first database from the specified text, and modify the read table structure definition statement according to the target syntax relationship mapping table to obtain a modified table structure definition statement;

9. The apparatus of claim 6, further comprising: the model prediction module is used for training a decision tree model;

wherein the model prediction module is specifically configured to:

10. An electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to perform the method steps of any of claims 1-5.