CN112181951B - Heterogeneous database data migration method, device and equipment - Google Patents

Heterogeneous database data migration method, device and equipment

Info

Publication number
CN112181951B
Authority
CN
China
Prior art keywords
database
error
data
solution
mapping table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011128698.7A
Other languages
Chinese (zh)
Other versions
CN112181951A (en
Inventor
董晨辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN202011128698.7A priority Critical patent/CN112181951B/en
Publication of CN112181951A publication Critical patent/CN112181951A/en
Application granted granted Critical
Publication of CN112181951B publication Critical patent/CN112181951B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data migration method, apparatus and device for heterogeneous databases. When migrating data from a first database to a second database fails, a characteristic value, determined from the log information that describes the errors generated during the migration, is input into a decision tree model to obtain, for each error, the target solution with the highest priority for solving that error. A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solutions, and the data in the first database is migrated to the second database according to the target syntax relation mapping table. By applying the technical solution provided by the embodiments of the application, the data migration efficiency of the database can therefore be improved while saving human resources.

Description

Heterogeneous database data migration method, device and equipment
Technical Field
The present application relates to the field of database technologies, and in particular, to a method, an apparatus, and a device for data migration of a heterogeneous database.
Background
A database serves as a repository for data, and a corresponding database management system stores the data into the database. In practice, however, a new generation of database management system often has clear advantages over the one a third-party application currently uses. To improve the performance of the application, it is therefore often necessary to migrate the data in the original database to a database that belongs to a different, better-performing database management system. Because the original database and the database into which the data is migrated have different database management systems, the migration requires a relatively high investment of human resources, even though the database work is only weakly related to the business itself, and it also requires skills in several different database management systems, which is a great challenge for developers.
For this reason, migration between databases that belong to different database management systems is generally performed automatically by a third-party application program. The migration, however, often does not succeed on the first attempt. When a migration fails, the common practice is to manually analyze the log information that describes the errors caused by the failure, revise the original syntax relation mapping table one error at a time to form a new syntax relation mapping table, and then use the new table to migrate the data in the original database again through the third-party application. If the migration fails again, the syntax relation mapping table is manually revised again, and so on until the migration succeeds. Consequently, whenever a migration fails, a large amount of human resources and time is spent on analyzing the cause of the failure and correcting the errors one by one. Moreover, manual negligence during this process can reintroduce the same errors and cause repeated migration failures, so the data migration efficiency of the database is low.
Disclosure of Invention
In view of this, the present application provides a data migration method, apparatus and device for a heterogeneous database, so as to improve data migration efficiency on the basis of saving human resources.
Specifically, the method is realized through the following technical scheme:
in one aspect, an embodiment of the present application provides a data migration method for a heterogeneous database, where the method includes:
obtaining a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information which is generated when data in a first database is migrated to a second database and is used for describing errors;
inputting the obtained characteristic values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;
and generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.
On the other hand, based on the same concept, an embodiment of the present application further provides a heterogeneous database data migration apparatus, where the apparatus includes:
a characteristic value obtaining unit, configured to obtain a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information which is generated when data in a first database is migrated to a second database and is used for describing errors;
a solution output unit, configured to input the obtained feature values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;
and the data migration unit is used for generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.
In yet another aspect, an embodiment of the present application provides an electronic device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to implement the method steps for data migration of heterogeneous databases according to the above embodiments.
In yet another aspect, embodiments of the present application further provide a machine-readable storage medium storing machine-executable instructions, which when invoked and executed by a processor, cause the processor to implement the method steps of data migration of a heterogeneous database described in the foregoing embodiments.
As can be seen from the foregoing technical solutions, in the embodiments of the present application, when a data migration fails, a characteristic value determined from the log information that describes the errors generated when data in a first database is migrated to a second database is input into a decision tree model to obtain, for each error, the solution with the highest priority (the target solution). A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to that table. By adopting the target solutions output by the decision tree model when a migration fails, the success rate of data migration can be greatly improved; at the same time, the migration is completed automatically, with as little manual intervention as possible, until it succeeds. The data migration efficiency of the database can therefore be improved while saving human resources.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic flowchart of a data migration method for a heterogeneous database according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of another heterogeneous database data migration method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a data migration apparatus for heterogeneous databases according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in detail below with reference to the accompanying drawings and specific embodiments.
Herein, databases that have the same database management system are regarded as the same kind of database, and databases that have different database management systems are regarded as different kinds of database. In this application, database data migration refers to migrating the data in one database to a database of a different kind.
In practical application, the database can be automatically migrated through a third-party application program, and data migration can also be performed by inserting a script for automatically migrating the database data into the database to be migrated. However, whether data migration is performed by using a third-party application or the script, there may be a case where data migration cannot be successful at one time. For clarity of description, an example is illustrated, specifically:
in practical application, MySQL is a relational DBMS (Database Management System). Because it is small, fast, open source, and has a low total cost of ownership, many websites select MySQL as the database management system during development;
like MySQL, PostgreSQL is an open-source DBMS. However, a MySQL database used as the built-in database of a product must comply with the strictest open-source license, under which the product itself must also be open-sourced; this is clearly not conducive to protecting a core product. A PostgreSQL database, by contrast, does not require the product to be open-sourced. In addition, Greenplum, which has an MPP (Massively Parallel Processing) architecture and is built on PostgreSQL, holds a mainstream position in the market, and its number of users increases year by year. The PostgreSQL database is therefore the best built-in product database to replace the MySQL database, and migrating the data stored in MySQL databases to PostgreSQL databases has become an urgent need.
In such an application scenario, third-party application programs and scripts for migrating data in the MySQL database to the PostgreSQL database have appeared, enabling one-key intelligent data migration from the MySQL database to the PostgreSQL database. This can improve the data migration efficiency for the developers and implementers mentioned in the background, but the data migration still may not succeed on the first attempt.
At present, when a data migration fails, the cause of the migration error is usually analyzed through manual intervention as described in the background, and the table structure definition statements that contain errors are modified line by line, which requires a large investment of human resources and a correspondingly large amount of time. Because there are many mapping relations between the table structure definition statements of the two databases, omissions occur easily, and only a professional who is familiar with both the MySQL database and the PostgreSQL database can correct them. After the correction, a syntax relation mapping table for migrating the data in the MySQL database to the PostgreSQL database is created again and placed in the third-party application program or script, and the one-key intelligent data migration is executed again, repeatedly, until the migration succeeds. Therefore, whenever a data migration fails, a large amount of human resources and time is spent on analyzing the cause of the failure and then resolving the errors one by one.
In order to solve the problems in the prior art, the method and the device can obtain the characteristic value to be input into the trained decision tree model, wherein the characteristic value is determined according to log information for describing errors generated when data in a first database is migrated to a second database; inputting the obtained characteristic values into a decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as a target solution and outputting the solution; and generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table. As the decision tree model can take the solution with the highest priority for solving each error as the target solution and output the solution, the target solution is the optimal solution, so that the success rate of automatically migrating the data in the first database to the second database according to the grammar relation mapping table generated by the optimal solution is greatly improved. In addition, aiming at the condition of data migration failure, manual intervention is not needed to perform data migration as much as possible, but the data migration is automatically completed as much as possible until the data migration is successful, so that the data migration efficiency can be improved on the basis of saving human resources.
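The overall flow described above (migrate, and on failure derive characteristic values from the log, obtain target solutions, update the mapping table and migrate again) can be sketched as a retry loop. This is only an illustrative sketch: the four callables, their signatures and the max_rounds limit are assumptions introduced here and are not specified by the application.

```python
def migrate_with_retries(migrate, extract_features, predict_solution,
                         apply_solution, mapping_table, max_rounds=5):
    """Repeat the migration until it succeeds or max_rounds is reached.

    All four callables are hypothetical stand-ins:
      migrate(table)            -> (succeeded, log_text)
      extract_features(log)     -> list of characteristic values
      predict_solution(feature) -> target solution for that error
      apply_solution(table, s)  -> corrected copy of the mapping table
    """
    for _ in range(max_rounds):
        succeeded, log_text = migrate(mapping_table)
        if succeeded:
            return mapping_table
        # One characteristic value per error recorded in the log.
        for feature in extract_features(log_text):
            solution = predict_solution(feature)
            mapping_table = apply_solution(mapping_table, solution)
    raise RuntimeError("migration still failing after %d rounds" % max_rounds)
```

The round limit is a safety measure added for the sketch; the text itself only states that migration is repeated until it succeeds.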
The data migration method for the heterogeneous database provided by the embodiment of the application can be applied to a third-party application program, and the third-party application program can be DBConvert for MySQL & PostgreSQL (a data conversion tool for migrating data in MySQL to PostgreSQL) or Navicat. The method and the device can also be applied to scripts inserted in a database management system, and the method and the device are not particularly limited in the application.
However, when the third-party application program is used for data migration, the details and the method for controlling data migration may not be flexibly realized because a developer cannot know the data migration principle of the bottom layer of the third-party application program in detail, and thus the developer cannot easily handle the situation of data migration failure when the data migration failure occurs. In addition, this also results in a complete reliance on third party applications, and if a problem occurs with a third party application, the data migration cannot be completed.
Compared with data migration by a third-party application program, data migration by a script does not need to deal with the mapping of the field syntax in the table structure definition statements; the field syntaxes can be matched by algorithms built into the database to be migrated and the database into which the data is migrated. Meanwhile, when a data migration fails, the developer can handle the failure by working on the commands in the script. In addition, because the migration is implemented by a script, it can run wherever a script running environment is available (current operating systems all have a built-in script running environment) and does not depend on a third-party application program. Accordingly, whether the method is applied to a third-party application or to an inserted script can be chosen flexibly according to the actual situation.
For convenience of description, the following embodiments are all described as applied to a database management system into which a script is inserted.
Referring to fig. 1, fig. 1 is a schematic flowchart of a data migration method for a heterogeneous database according to an embodiment of the present application, where the method is applied to a database management system, and the method may include the following steps:
Step 101, obtaining a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information for describing errors generated when data in a first database is migrated to a second database.
The decision tree model is trained based on the configured sample error type and the priority of the sample solution corresponding to the sample error type, and the decision tree model will be described in detail later, which is not described herein again.
In this embodiment of the present application, the first database and the second database do not refer to a fixed database, but may refer to any two databases belonging to different database management systems, and the following description of this embodiment of the present application is not repeated.
The first database is the database to be migrated, such as the MySQL database referred to above, and the second database is the database into which the data is migrated, such as the PostgreSQL database referred to above.
A database application is made up of a plurality of table structure definition statements. Migrating the data in the first database to the second database therefore means that all table structure definition statements of the first database must be modified into table structure definition statements of the second database. If all of them have been so modified, the first database is already managed by the database management system of the second database, that is, the data in the first database has been migrated successfully. Conversely, if some of the table structure definition statements of the first database have not been modified into table structure definition statements of the second database, the data in the first database has not been migrated to the second database successfully, that is, the migration has failed. The table structure definition statements of the first database are converted into those of the second database through the syntax relation mapping table. After a migration failure, log information describing the errors is generated; it records the errors arising from the syntax relation mapping table. Each error has a corresponding error type, and the keywords that characterize the error type are extracted from each error type; these keywords are the characteristic values of this step.
For example, in practical applications, when data in the MySQL database is migrated to the PostgreSQL database, the error type that occurs most frequently is: ERROR: relation "example.person" does not exist. When the characteristic value is extracted from this error type, "relation does not exist" can be used as the characteristic value corresponding to the error type.
Further, each table structure definition statement in the first database follows the field syntax specified by the first database. Consequently, if a table structure definition statement is not converted according to the field syntax of the second database during the migration, the migration of the data for that statement may fail. When migrating the data in the first database to the second database fails, the generated log information includes information indicating that a field syntax is wrong, for example: # ERROR Log - ERROR, psql:example_data.sql:33: ERROR: current transaction is aborted, commands ignored until end of transaction block.
As an example, a specific implementation manner of implementing step 101 may be: analyzing corresponding error field grammar from the log information, and determining the error type corresponding to the error field grammar; and determining the corresponding characteristic value according to the error type.
In this embodiment, the table structure definition statement that has an error is determined from the log information, the corresponding error field syntax is parsed from the table structure definition statement that has an error, and the error type corresponding to the error field syntax is determined for each error field syntax.
Illustratively, the first statement in which an error is located is retrieved from the log information, the statement is split into tokens, and the error type corresponding to each field syntax is identified according to the field syntax to which each token belongs.
In addition, after the error type corresponding to the field grammar of each error is determined, for each error type, key fields capable of representing the representative features of the error type are extracted from the error type, and the key fields are the feature values of the step.
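The extraction of characteristic values from log information can be illustrated with a short sketch. It assumes psql-style log lines of the form "ERROR: <message>" and treats the message with quoted identifiers removed as the characteristic value; the function name and the regular expressions are illustrative assumptions, not the parsing rules of the application.

```python
import re

# Matches the message part of psql-style error lines such as
#   psql:example_data.sql:33: ERROR:  relation "example.person" does not exist
ERROR_LINE = re.compile(r'ERROR:\s*(?P<message>.+)$')


def extract_feature_values(log_text):
    """Return one characteristic value (keyword phrase) per error line."""
    features = []
    for line in log_text.splitlines():
        match = ERROR_LINE.search(line)
        if match is None:
            continue  # not an error line
        message = match.group("message").strip()
        # Drop quoted identifiers so only the error-type keywords remain.
        keywords = re.sub(r'"[^"]*"', "", message)
        # Collapse the whitespace left behind by the removal.
        features.append(" ".join(keywords.split()))
    return features
```

Applied to the log line for the missing relation "example.person", this yields the phrase "relation does not exist", matching the example given earlier.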
Step 102, inputting the obtained characteristic values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution.
In the application, each error corresponds to an error type, and each error has at least one solution. Based on practical experience, priorities can be assigned according to the success rate with which each solution solves the error type: the higher the success rate of a solution for the error type, the higher its priority; conversely, the lower the success rate, the lower its priority.
Based on the above description of the solution, the target solution is the solution with the highest priority for solving each error, that is, the solution with the highest success rate for solving each error.
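The selection of the target solution can be sketched as follows. The sketch assumes that the candidate solutions for each error are available as (solution, priority) pairs and that a larger number means a higher priority; the data layout and the solution names used with it are hypothetical.

```python
def select_target_solutions(candidates):
    """Pick the highest-priority solution for every error.

    candidates maps a characteristic value to a list of
    (solution, priority) pairs; a larger priority is assumed to win.
    """
    targets = {}
    for feature, solutions in candidates.items():
        if not solutions:
            continue  # no known solution for this error
        best_solution, _ = max(solutions, key=lambda item: item[1])
        targets[feature] = best_solution
    return targets
```

For instance, if "relation does not exist" has the candidates ("qualify_schema_name", 0.4) and ("create_table_first", 0.9), the function returns "create_table_first" for that error.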
As an embodiment, the decision tree model may be trained in the following manner, specifically including the following steps a and B:
Step A, obtaining sample error types, and extracting the sample characteristic value corresponding to each sample error type; the sample error types are obtained from sample error logs generated when data in the first database is migrated to the second database.
The sample error type may be determined from log information generated during a history migration process of data in the first database to the second database, and as an embodiment, a corresponding sample error field syntax may be parsed from the log information to determine a sample error type corresponding to the sample error field syntax; and determining a corresponding sample characteristic value according to the sample error type.
It should be noted that, in the using process, if an error type that does not exist in the sample error types is found, the error type may be used as a new sample error type, and a sample solution for solving the error type is determined, so that the decision tree model is further updated according to the training manners of step a and step B, so that the application range of the decision tree model is wider, and the success rate of predicting the solution corresponding to each error is higher.
Step B, training the decision tree model by using a supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.
Correspondingly, the sample solution corresponding to each sample error type is used to correct the erroneous field syntax corresponding to that sample error type, so as to obtain the correct field syntax of the second database.
Based on a supervised learning decision tree algorithm, constructing an initial decision tree model according to the priorities of solutions respectively corresponding to the sample error types; and training a decision tree model according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.
Therefore, in the technical scheme provided by the embodiment of the application, the decision tree model is trained by using the supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type, so that the learning capability is strong, and the method has the characteristics of simple training and easy expansion.
For a sample error type, the sample error type has at least one sample solution, each sample solution is provided with a priority, and the sample solution with the highest priority has the highest success rate for solving the sample error type.
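As a rough illustration of steps A and B, the sketch below trains a very small decision tree on sample rows of the form (error keywords, sample solution, priority) and returns the highest-priority solution at the leaf reached by an error. It is a simplified, pure-Python stand-in using a priority-weighted Gini criterion; the application does not state which decision tree algorithm is used, and all sample data, solution names and function names here are hypothetical. A real system would more likely rely on an established library implementation.

```python
from collections import defaultdict


def weight(rows):
    return sum(priority for _, _, priority in rows)


def gini(rows):
    """Priority-weighted Gini impurity of the solutions in rows."""
    totals = defaultdict(float)
    for _, solution, priority in rows:
        totals[solution] += priority
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


def build_tree(rows, keywords):
    """Split on keyword presence while the split lowers the impurity."""
    best = None
    for kw in keywords:
        yes = [r for r in rows if kw in r[0]]
        no = [r for r in rows if kw not in r[0]]
        if not yes or not no:
            continue
        score = (weight(yes) * gini(yes) + weight(no) * gini(no)) / weight(rows)
        if best is None or score < best[0]:
            best = (score, kw, yes, no)
    if best is None or best[0] >= gini(rows):
        # Leaf: keep every solution, ranked from highest to lowest priority.
        ranked = defaultdict(float)
        for _, solution, priority in rows:
            ranked[solution] = max(ranked[solution], priority)
        return {"leaf": sorted(ranked.items(), key=lambda kv: kv[1], reverse=True)}
    _, kw, yes, no = best
    rest = [k for k in keywords if k != kw]
    return {"keyword": kw, "yes": build_tree(yes, rest), "no": build_tree(no, rest)}


def predict(tree, error_keywords):
    """Walk the tree and return the highest-priority solution at the leaf."""
    while "leaf" not in tree:
        tree = tree["yes"] if tree["keyword"] in error_keywords else tree["no"]
    return tree["leaf"][0][0]


# Hypothetical samples: (keywords of the error type, sample solution, priority).
SAMPLES = [
    (frozenset({"relation", "exist"}), "create_table_first", 0.9),
    (frozenset({"relation", "exist"}), "qualify_schema_name", 0.4),
    (frozenset({"type", "exist"}), "map_column_type", 0.8),
    (frozenset({"syntax", "near"}), "rewrite_quoting", 0.7),
]
KEYWORDS = sorted(set().union(*(row[0] for row in SAMPLES)))
tree = build_tree(SAMPLES, KEYWORDS)
```

Adding a newly observed error type amounts to appending a row to the samples and rebuilding the tree, which mirrors the update procedure described for the model.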
Step 103, generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.
The target syntax relationship mapping table characterizes a mapping relationship between the syntax of a field in the table structure definition statement of the first database and the syntax of a field in the table structure definition statement of the second database. That is, the table structure definition statement in the first database is modified according to the target syntax mapping table, and can become the table structure definition statement in the second database, so that the data in the first database is successfully migrated to the second database.
As an embodiment, a specific implementation manner of "generating a target syntax relation mapping table when migrating data in the first database to the second database according to a target solution of each error" in the step 103 may include steps C and D:
Step C, modifying, according to the target solution of each error, the syntax relation mapping table that was in use when the migration of the data in the first database to the second database failed.
The syntax relation mapping table in use when the migration of the data in the first database to the second database failed may be a syntax relation mapping table that was created in advance and produced errors during the migration, or it may be a target syntax relation mapping table, generated in an earlier round, with which the migration failed again.
In this step, modifying the failed syntax relation mapping table may consist of correcting the erroneous field syntax in each table structure definition statement that was converted incorrectly, so that it conforms to the field syntax of the second database, thereby obtaining table structure definition statements that conform to the field syntax of the second database.
Step D, determining the modified syntax relation mapping table as the target syntax relation mapping table.
In this step, the target syntax relation mapping table is the syntax relation mapping table obtained by correcting the erroneous table structure definition statements in the failed syntax relation mapping table. Accordingly, the probability that the data in the first database is successfully migrated to the second database according to the target syntax relation mapping table is higher.
For clarity of description, an example is given:
Suppose the original syntax relation mapping table includes: statement A1 maps to statement B1, statement A2 maps to statement B2, and statement A3 maps to statement B4. During data migration, analysis of the log information shows that the mapping from statement A3 to statement B4 is erroneous. Based on steps 201 to 206, the mapping error of statement A3 is resolved and the original syntax relation mapping table is corrected to obtain the target syntax relation mapping table, which includes: statement A1 maps to statement B1, statement A2 maps to statement B2, and statement A3 maps to statement B3.
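The A1/A2/A3 example can be sketched in a few lines of Python. This is only an illustration: representing the mapping table as a dictionary and the function name `apply_target_solutions` are assumptions, not details given in the source.

```python
# Hypothetical in-memory form of the syntax relation mapping table:
# keys are first-database statements, values are second-database statements.
original_mapping = {"A1": "B1", "A2": "B2", "A3": "B4"}


def apply_target_solutions(mapping, target_solutions):
    """Return a corrected copy of the mapping table.

    target_solutions maps each failing source statement to the replacement
    chosen for it (the target solution of that error).
    """
    corrected = dict(mapping)           # leave the failing table untouched
    corrected.update(target_solutions)  # overwrite only the erroneous entries
    return corrected


# Log analysis showed that the A3 -> B4 mapping fails; the target solution is B3.
target_mapping = apply_target_solutions(original_mapping, {"A3": "B3"})
print(target_mapping)
```

Copying the table rather than editing it in place keeps the failing version available for comparison if the next migration attempt also fails.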
As an embodiment, a specific implementation manner of implementing "migrating data in the first database to the second database according to the target syntax relationship mapping table" in step 103 may include step E and step F:
Step E, reading the table structure definition statements of the first database from the specified text, and modifying the read table structure definition statements according to the target syntax relation mapping table to obtain modified table structure definition statements.
As an embodiment, the table structure definition statements in the first database may be exported to the specified text by a statement export instruction for exporting table structure definition statements.
The export instruction may be a mysqldump instruction.
It should be noted that this step exports only the table structure definition statements from the first database and does not include data insertion statements.
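A structure-only export of this kind might be assembled as follows. The sketch only builds the command line; the database name, user name, and output path are placeholders, and credentials handling is deliberately left out. `--no-data`, `--user`, and `--result-file` are standard mysqldump options.

```python
import shlex


def build_schema_export_command(database, user, output_path):
    """Build a mysqldump command line that exports table definitions only."""
    return [
        "mysqldump",
        "--no-data",                     # skip row data, keep CREATE TABLE statements
        f"--user={user}",
        f"--result-file={output_path}",  # write the dump to the specified text file
        database,
    ]


# Placeholder names for illustration only.
cmd = build_schema_export_command("shop", "migrator", "schema.sql")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where mysqldump is installed.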
As an embodiment, one table structure definition statement may be taken as a unit, the table structure definition statement in the first database is read line by line from the specified text, and the table structure definition statement in the specified text is modified into the table structure definition statement in the second database according to the target syntax relation mapping table.
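Reading the specified text one statement at a time and rewriting each statement could look like the sketch below. It assumes that every statement ends with a semicolon and that the mapping table is a dictionary of plain text fragments; both are simplifications made for illustration, and plain substring replacement would be too coarse for real schemas.

```python
def split_statements(dump_text):
    """Split dump text into statements, one table structure definition per unit."""
    statements, current = [], []
    for line in dump_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("--"):
            continue  # skip blank lines and SQL comments
        current.append(stripped)
        if stripped.endswith(";"):
            statements.append(" ".join(current))
            current = []
    return statements


def rewrite_statement(statement, mapping):
    """Replace every first-database fragment with its second-database form."""
    for source_syntax, target_syntax in mapping.items():
        statement = statement.replace(source_syntax, target_syntax)
    return statement


dump_text = """
-- exported table structure
CREATE TABLE t (
  created DATETIME
);
"""
mapping = {"DATETIME": "TIMESTAMP"}  # hypothetical mapping table entry
modified = [rewrite_statement(s, mapping) for s in split_statements(dump_text)]
print(modified)
```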
Step F, importing the modified table structure definition statements into the second database so as to successfully migrate the data in the first database to the second database.
In this step, the modified table structure definition statements are imported into the second database. If the import succeeds, the table structures of the first database are now managed by the database management system of the second database, and the data stored in the first database can then be inserted directly into the second database. The erroneous syntax relation mapping table is replaced with the target syntax relation mapping table and the data migration is performed again; if the migration fails again, steps 101 to 103 are repeated until the data migration succeeds completely.
As another embodiment, if the number of migration attempts for the data in the first database reaches a preset number and the migration has still not succeeded, manual intervention may be performed: the error field syntax in the log information is analyzed and, for each error field syntax whose error type is determined not to belong to the sample error types, a corresponding solution is obtained and a new syntax relation mapping table is generated. If the data in the first database is successfully migrated to the second database using the generated syntax relation mapping table, the error type to which the error field syntax belongs and its corresponding solution are used to update the sample error types and corresponding solutions in the neural network model, and a specific priority is assigned to the sample solution corresponding to that error type. The neural network model is thereby updated so that it can output solutions more accurately, which reduces the number of manual interventions and brings the method as close as possible to fully automatic data migration.
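The retry behaviour with a preset attempt limit can be sketched as a small loop. The callables `try_migration` and `repair_mapping` are hypothetical stand-ins for steps 101 to 103, and the returned dictionary is an illustrative convention rather than anything specified in the source.

```python
def migrate_with_retries(try_migration, repair_mapping, mapping, max_attempts):
    """Retry migration, repairing the mapping table after each failure.

    try_migration(mapping) returns a list of errors (empty on success);
    repair_mapping(mapping, errors) returns a corrected mapping table.
    """
    for attempt in range(1, max_attempts + 1):
        errors = try_migration(mapping)
        if not errors:
            return {"status": "migrated", "attempts": attempt, "mapping": mapping}
        mapping = repair_mapping(mapping, errors)
    # Preset number of attempts reached without success: hand over to a person.
    return {"status": "needs_manual_intervention",
            "attempts": max_attempts, "mapping": mapping}


# Toy stand-ins: migration succeeds once A3 maps to B3.
def try_migration(mapping):
    return [] if mapping.get("A3") == "B3" else ["A3"]


def repair_mapping(mapping, errors):
    return {**mapping, **{name: "B3" for name in errors}}


result = migrate_with_retries(try_migration, repair_mapping, {"A3": "B4"}, max_attempts=3)
print(result["status"], result["attempts"])
```

With `max_attempts=1` the same toy inputs end in `needs_manual_intervention`, which corresponds to the manual path described above.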
Therefore, compared with the prior art, in the technical solution of the embodiment of the present application, when a data migration fails, the feature value determined from the log information describing the errors generated when the data in the first database was migrated to the second database is input into the decision tree model to obtain, for each error, the solution with the highest priority (the target solution). A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to the target syntax relation mapping table. It can be seen that, by applying the technical solution provided by the embodiment of the present application, adopting the target solution output by the decision tree model can greatly improve the success rate of data migration after a failure. At the same time, the migration is completed automatically, with as little manual intervention as possible, until it succeeds, so that data migration efficiency is improved while human resources are saved.
Referring to fig. 2, fig. 2 is a schematic flowchart of another heterogeneous database data migration method provided in an embodiment of the present application, where the method may include the following steps:
Step 201, obtaining log information that is generated when data in a first database is migrated to a second database and is used for describing errors.
Step 202, parsing the corresponding error field syntax from the log information, and determining the error type corresponding to the error field syntax.
Step 203, determining the corresponding feature value according to the error type.
Step 204, inputting the obtained feature values into a decision tree model to obtain a target solution for solving each error.
Step 205, modifying, according to the target solution of each error, the syntax relation mapping table that was in use when the migration of the data in the first database to the second database failed.
Step 206, determining the modified syntax relation mapping table as the target syntax relation mapping table.
Step 207, reading the table structure definition statements of the first database from the specified text, and modifying the read table structure definition statements according to the target syntax relation mapping table to obtain modified table structure definition statements.
Step 208, importing the modified table structure definition statements into the second database so as to successfully migrate the data in the first database to the second database.
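Steps 201 to 203 can be pictured with a short Python sketch. The sample log lines are modeled on PostgreSQL error messages; the error type names and the integer feature encoding are assumptions made for illustration, since the source does not specify them.

```python
import re

# Hypothetical error types, each with a log pattern and an integer feature value.
ERROR_PATTERNS = [
    ("unknown_type", re.compile(r'type "(?P<field>[^"]+)" does not exist'), 0),
    ("syntax_error", re.compile(r'syntax error at or near "(?P<field>[^"]+)"'), 1),
]


def extract_features(log_text):
    """Return (error field syntax, error type, feature value) for each error line."""
    results = []
    for line in log_text.splitlines():
        for error_type, pattern, feature_value in ERROR_PATTERNS:
            match = pattern.search(line)
            if match:
                results.append((match.group("field"), error_type, feature_value))
                break  # one error type per log line in this sketch
    return results


sample_log = '''ERROR:  type "datetime" does not exist
ERROR:  syntax error at or near "AUTO_INCREMENT"'''
features = extract_features(sample_log)
print(features)
```

The feature values produced here are what step 204 would feed into the decision tree model.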
Therefore, in the technical solution provided in the embodiment of the present application, when a data migration fails, determining the feature value from the error type to which each error field syntax belongs allows the feature value corresponding to each error to be determined comprehensively and systematically. The target solution output by the decision tree model therefore has a higher success rate in resolving each error in the log information, and the table structure definition statements modified according to the target syntax relation mapping table determined from the target solutions are more likely to be imported successfully. Meanwhile, this embodiment completes the data migration automatically, with as little manual intervention as possible, until the migration succeeds, so that the data migration efficiency of the database can be improved while human resources are saved.
Based on the above description of the heterogeneous database data migration method, the method according to the embodiment of the present invention is described in detail below with a specific example, in which MySQL is taken as the first database and PostgreSQL as the second database, and the data in MySQL is migrated to PostgreSQL. The specific method is as follows:
First step: exporting the table structure definition statements in MySQL to a specified text using a statement export instruction for exporting table structure definition statements, namely the mysqldump instruction. This step exports only the table structure definition statements from the first database and does not include data insertion statements.
Second step: judging whether a syntax relation mapping table for migrating the data in MySQL to PostgreSQL exists in the specified text; if so, executing the third step; if not, executing the fourth step.
In this step, it is judged whether the syntax relation mapping table exists in the specified text. If it exists, the third step is executed directly and no further syntax relation mapping table needs to be created; if it does not exist, a syntax relation mapping table needs to be created, and the fourth step is executed.
Third step: modifying the table structure definition statements in the specified text into table structure definition statements of PostgreSQL according to the syntax relation mapping table, importing the modified table structure definition statements into PostgreSQL, and, if the import fails, executing the fifth step to the seventh step.
As an embodiment, the specified text may be read line by line with one table structure definition statement as a unit, and the table structure definition statements in the specified text may be modified into table structure definition statements of PostgreSQL according to the syntax relation mapping table.
In this step, the modified table structure definition statements are imported into PostgreSQL. If the import succeeds, MySQL becomes a database managed by the database management system of PostgreSQL, that is, the data in MySQL is migrated to PostgreSQL; after the import succeeds, the data in MySQL can be inserted directly into PostgreSQL. If the import fails, the fifth step to the seventh step are executed.
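To make the MySQL-to-PostgreSQL rewrite concrete, the sketch below applies a handful of illustrative rules to one statement. The rule list is an assumption and is not the patent's mapping table; it relies only on well-known differences (PostgreSQL quotes identifiers with double quotes, offers SERIAL for auto-incrementing integer columns, uses TIMESTAMP where MySQL has DATETIME, and has no ENGINE table option).

```python
import re

# Hypothetical mapping table: (MySQL pattern, PostgreSQL replacement) pairs.
MAPPING_RULES = [
    (re.compile(r"`([^`]+)`"), r'"\1"'),  # backtick identifiers -> double quotes
    (re.compile(r"\bINT\s+NOT\s+NULL\s+AUTO_INCREMENT\b", re.I), "SERIAL"),
    (re.compile(r"\bDATETIME\b", re.I), "TIMESTAMP"),
    (re.compile(r"\)\s*ENGINE\s*=\s*\w+[^;]*;", re.I), ");"),  # drop table options
]


def to_postgresql(statement):
    """Apply every mapping rule, in order, to one table structure definition statement."""
    for pattern, replacement in MAPPING_RULES:
        statement = pattern.sub(replacement, statement)
    return statement


mysql_statement = (
    "CREATE TABLE `orders` (`id` INT NOT NULL AUTO_INCREMENT, "
    "`created` DATETIME, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;"
)
print(to_postgresql(mysql_statement))
```

Regular expressions are used here only to keep the example short; a rule set for real schemas would need to cover many more constructs.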
Fourth step: creating a syntax relation mapping table, modifying the table structure definition statements in the specified text into table structure definition statements of PostgreSQL according to the syntax relation mapping table, importing the modified table structure definition statements into PostgreSQL, and, if the import fails, executing the fifth step to the seventh step.
The syntax relation mapping table may be obtained by taking one table structure definition statement in the specified text as a unit and directly matching the syntax of MySQL and PostgreSQL using a CART (Classification And Regression Trees) algorithm.
Fifth step: obtaining the log information that is generated when the data in MySQL is migrated to PostgreSQL and is used for describing errors, parsing from the log information the error field syntax corresponding to each table structure definition statement that produced an error, determining the error type corresponding to the error field syntax, and determining the corresponding feature value according to the error type.
Sixth step: inputting the obtained feature values into the decision tree model. According to the input feature values, the decision tree model obtains at least one solution corresponding to each error type together with the priority of each solution and, for each error type, selects the solution with the highest priority as the target solution and outputs it. Each target solution thus has a higher success rate in resolving its corresponding error.
Seventh step: modifying, according to the target solution of each error, the syntax relation mapping table with which the migration of the data in MySQL to PostgreSQL failed, and determining the modified syntax relation mapping table as the target syntax relation mapping table. Modifying the syntax relation mapping table in this way gives a higher success rate when the data in MySQL is migrated to PostgreSQL according to the target syntax relation mapping table.
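The selection rule in the sixth step (highest priority wins for each error type) is easy to show in isolation. In this sketch the candidate solutions and their numeric priorities are invented for illustration, and the dictionary stands in for whatever the decision tree model actually returns.

```python
# Hypothetical model output: candidate solutions with priorities per error type.
candidates = {
    "unknown_type": [("map DATETIME to TIMESTAMP", 0.9), ("map DATETIME to TEXT", 0.2)],
    "syntax_error": [("map AUTO_INCREMENT to SERIAL", 0.8)],
}


def select_target_solutions(candidates):
    """For each error type, keep the solution with the highest priority."""
    return {
        error_type: max(solutions, key=lambda item: item[1])[0]
        for error_type, solutions in candidates.items()
    }


target_solutions = select_target_solutions(candidates)
print(target_solutions)
```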
Eighth step: once the table structure definition statements of MySQL have been successfully modified into table structure definition statements of PostgreSQL, the data in MySQL can be successfully migrated into PostgreSQL. On this basis, in this step the table structure definition statements in MySQL are read from the specified text and modified according to the target syntax relation mapping table to obtain modified table structure definition statements, and the modified table structure definition statements are imported into PostgreSQL so that the data in MySQL is successfully migrated into PostgreSQL. Therefore, in the process of migrating the data in MySQL to PostgreSQL, the migration is completed automatically, with as little manual intervention as possible, until it succeeds, so that data migration efficiency can be improved while human resources are saved.
If, with the processing method provided by the embodiment of the present application, the data in MySQL still cannot be successfully migrated to PostgreSQL after the number of migration attempts reaches the preset number, this indicates that the log information may contain a recurring error whose error type is not one of the sample error types. In this case, manual intervention is required to analyze the error and modify the syntax relation mapping table, until the import is performed again and the migration succeeds. In any case, the heterogeneous database data migration method provided by the embodiment of the present application does not require manual intervention each time a data migration fails; manual intervention is needed only after the number of migration attempts reaches the preset number. Applying the embodiment of the present application therefore further reduces the burden on personnel and improves data migration efficiency.
Based on the same inventive concept as the method, an embodiment of the present application further provides a heterogeneous database data migration apparatus 300, a structural diagram of which is shown in fig. 3. The apparatus may include:
a feature value obtaining unit 301, configured to obtain a feature value to be input to a trained decision tree model, where the feature value is determined according to log information for describing an error, which is generated when data in a first database is migrated to a second database;
a solution output unit 302, configured to input the obtained feature values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;
a data migration unit 303, configured to generate, according to the target solution of each error, a target syntax relation mapping table for migrating the data in the first database to the second database, and to migrate the data in the first database to the second database according to the target syntax relation mapping table.
In an embodiment of the present application, the feature value obtaining unit 301 may include:
a log information obtaining subunit, configured to obtain log information, which is generated when data in the first database is migrated to the second database, and is used for describing an error;
an error type determining subunit, configured to parse the corresponding error field syntax from the log information and determine the error type corresponding to the error field syntax;
and a feature value determining subunit, configured to determine the corresponding feature value according to the error type.
In an embodiment of the application, the data migration unit 303 includes:
a syntax relation mapping table modifying subunit, configured to modify the syntax relation mapping table when the migration of the data in the first database to the second database fails, according to the target solution of each error;
and the target syntax relation mapping table determining subunit is used for determining the modified syntax relation mapping table as the target syntax relation mapping table.
In an embodiment of the application, the data migration unit 303 may further include:
a table structure definition statement modification subunit, configured to read the table structure definition statements of the first database from the specified text, and modify the read table structure definition statements according to the target syntax relation mapping table to obtain modified table structure definition statements;
and the data migration subunit is used for importing the modified table structure definition statement into the second database so as to successfully migrate the data in the first database into the second database.
In an embodiment of the present application, the apparatus may further include: the model prediction module is used for training a decision tree model;
wherein, the model prediction module is specifically configured to:
obtaining sample error types, and extracting sample characteristic values corresponding to respective error types from the determined sample error types; the sample error type is obtained from a sample error log generated when the data in the first database is migrated to the second database;
and training the decision tree model by using a supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.
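As a rough picture of supervised decision tree training, the sketch below builds a tiny classifier with Gini-impurity splits, as CART-style learners do. It is a hand-written illustration and not the source's training procedure: the source does not say which decision tree algorithm is used for training, and the integer feature values and solution labels here are invented. Each training label stands for the highest-priority sample solution of its sample error type.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Find the (score, feature index, threshold) with the lowest weighted Gini impurity."""
    best = None
    for index in range(len(rows[0])):
        for threshold in sorted({row[index] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[index] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[index] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, index, threshold)
    return best


def train(rows, labels):
    """Recursively build a tree; leaves hold a solution label."""
    if gini(labels) == 0.0:
        return labels[0]
    split = best_split(rows, labels)
    if split is None:  # identical features, mixed labels: fall back to the majority
        return Counter(labels).most_common(1)[0][0]
    _, index, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[index] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[index] > threshold]
    return (index, threshold,
            train([r for r, _ in left], [l for _, l in left]),
            train([r for r, _ in right], [l for _, l in right]))


def predict(tree, row):
    """Walk from the root to a leaf and return the stored solution label."""
    while isinstance(tree, tuple):
        index, threshold, left, right = tree
        tree = left if row[index] <= threshold else right
    return tree


# Invented samples: one integer feature per sample error type.
sample_features = [[0], [0], [1], [2]]
sample_solutions = ["use TIMESTAMP", "use TIMESTAMP", "use SERIAL", "drop ENGINE clause"]
model = train(sample_features, sample_solutions)
print(predict(model, [1]))
```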
In summary, in the technical solution provided in the embodiment of the present application, when a data migration fails, the feature value determined from the log information describing the errors generated when the data in the first database was migrated to the second database is input into the decision tree model to obtain, for each error, the solution with the highest priority (the target solution). A target syntax relation mapping table for migrating the data in the first database to the second database is then generated according to the target solution of each error, and the data in the first database is migrated to the second database according to the target syntax relation mapping table. It can be seen that, by applying the technical solution provided by the embodiment of the present application, adopting the target solution output by the decision tree model can greatly improve the success rate of data migration after a failure. At the same time, the migration is completed automatically, with as little manual intervention as possible, until it succeeds, so that the data migration efficiency of the database can be improved while human resources are saved.
The implementation process of the functions and actions of each unit in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
From a hardware level, a schematic diagram of the hardware architecture of the electronic device provided in the embodiment of the present application is shown in fig. 4. The electronic device includes a machine-readable storage medium and a processor, wherein: the machine-readable storage medium stores machine-executable instructions executable by the processor; and the processor is configured to execute the machine-executable instructions to perform the heterogeneous database data migration operations disclosed in the above examples.
An embodiment of the present application further provides a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to perform the heterogeneous database data migration operations disclosed in the above examples.
Here, a machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and so forth. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the application. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (10)

1. A heterogeneous database data migration method, the method comprising:
obtaining a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information which is generated when data in a first database is migrated to a second database and is used for describing errors;
inputting the obtained characteristic values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;
and generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.
2. The method of claim 1, wherein obtaining feature values to be input to the trained decision tree model comprises:
obtaining log information which is generated when data in a first database is migrated to a second database and is used for describing errors;
analyzing corresponding error field grammar from the log information, and determining an error type corresponding to the error field grammar;
and determining the corresponding characteristic value according to the error type.
3. The method of claim 1, wherein generating a target syntax mapping table for migrating data in a first database to a second database according to a target solution for each error comprises:
modifying a syntax relation mapping table when the data in the first database fails to be migrated to the second database according to the target solution of each error;
and determining the modified grammar relation mapping table as a target grammar relation mapping table.
4. The method of claim 1, wherein migrating data in a first database to a second database according to the target syntax relationship mapping table comprises:
reading a table structure definition statement in a first database from a specified text, and modifying the read table structure definition statement according to the target syntax relation mapping table to obtain a modified table structure definition statement;
and importing the modified table structure definition statement into the second database so as to successfully migrate the data in the first database to the second database.
5. The method of claim 1, wherein the decision tree model is trained by:
obtaining sample error types, and extracting sample characteristic values corresponding to respective error types from the determined sample error types; the sample error type is obtained from a sample error log generated when data in a first database is migrated to a second database;
and training the decision tree model by using a supervised learning decision tree algorithm according to the configured sample error type and the priority of the sample solution corresponding to the sample error type.
6. An apparatus for data migration of a heterogeneous database, the apparatus comprising:
a characteristic value obtaining unit, configured to obtain a characteristic value to be input into a trained decision tree model, wherein the characteristic value is determined according to log information which is generated when data in a first database is migrated to a second database and is used for describing errors;
a solution output unit, configured to input the obtained feature values into the decision tree model to obtain a target solution for solving each error; the decision tree model is used for obtaining at least one solution corresponding to each error and the priority of each solution according to the input characteristic value, and for each error, selecting the solution with the highest priority from the at least one solution corresponding to the error and the priority of each solution as the target solution and outputting the solution;
and the data migration unit is used for generating a target syntax relation mapping table when the data in the first database is migrated to the second database according to the target solution of each error, and migrating the data in the first database to the second database according to the target syntax relation mapping table.
7. The apparatus of claim 6, wherein the data migration unit comprises:
a syntax relation mapping table modifying subunit, configured to modify the syntax relation mapping table when the migration of the data in the first database to the second database fails, according to the target solution of each error;
and the target syntax relation mapping table determining subunit is used for determining the modified syntax relation mapping table as the target syntax relation mapping table.
8. The apparatus of claim 6, wherein the data migration unit further comprises:
a table structure definition statement modification subunit, configured to read a table structure definition statement of the first database from a specified text, and to modify the read table structure definition statement according to the target syntax relation mapping table to obtain a modified table structure definition statement;
and a data migration subunit, configured to import the modified table structure definition statement into the second database, so that the data in the first database is successfully migrated to the second database.
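The rewriting of a table structure definition statement in claim 8 could look roughly like the Python sketch below. It assumes the mapping table is a dictionary of whole-word replacements and uses a naive regular-expression substitution; a production tool would need a real SQL parser, and the function name and sample statement are hypothetical.

```python
import re
from typing import Dict


def rewrite_ddl(statement: str, mapping_table: Dict[str, str]) -> str:
    """Replace each whole-word source construct with its mapped replacement."""
    if not mapping_table:
        return statement
    # Try longer constructs first so they win over shorter alternatives.
    keys = sorted(mapping_table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b", re.IGNORECASE
    )
    lookup = {k.upper(): v for k, v in mapping_table.items()}
    rewritten = pattern.sub(lambda m: lookup[m.group(1).upper()], statement)
    # Naive cleanup of spaces left by constructs mapped to the empty string.
    # Simplification: this would also touch spaces inside string literals.
    rewritten = re.sub(r"[ \t]+", " ", rewritten)
    return rewritten.replace(" ,", ",").replace(" )", ")")


# Illustrative statement and mapping only.
ddl = "CREATE TABLE t (id INT AUTO_INCREMENT, created DATETIME)"
mapping = {"DATETIME": "TIMESTAMP", "AUTO_INCREMENT": ""}
converted = rewrite_ddl(ddl, mapping)
```

The converted statement would then be imported into the second database with whatever client that database provides.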
9. The apparatus of claim 6, further comprising: a model prediction module, configured to train the decision tree model;
wherein the model prediction module is specifically configured to:
obtain sample error types, and extract, from the determined sample error types, sample feature values corresponding to the respective error types; wherein the sample error types are obtained from a sample error log generated when data in a first database is migrated to a second database;
and train the decision tree model by using a supervised learning decision tree algorithm according to the configured sample error types and the priorities of the sample solutions corresponding to the sample error types.
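The patent does not name a specific decision tree algorithm. As a hedged illustration of supervised decision tree training, the Python sketch below implements a small ID3-style learner over categorical feature values. It assumes each sample is labelled with the solution that was configured with the highest priority for its error type; the feature names, labels, and the "manual review" fallback are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Dict, List, Tuple, Union

Sample = Tuple[Dict[str, str], str]  # (feature values, labelled solution)
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]  # leaf label or (feature, branches)


def entropy(labels: List[str]) -> float:
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())


def train(samples: List[Sample], features: List[str]) -> Tree:
    labels = [label for _, label in samples]
    if len(set(labels)) == 1 or not features:
        # Pure node, or nothing left to split on: return the majority label.
        return Counter(labels).most_common(1)[0][0]

    def remainder(feature: str) -> float:
        # Weighted entropy of the partitions induced by this feature.
        groups: Dict[str, List[str]] = {}
        for values, label in samples:
            groups.setdefault(values[feature], []).append(label)
        return sum(len(g) / len(samples) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)  # lowest remainder = highest information gain
    rest = [f for f in features if f != best]
    branches: Dict[str, Tree] = {}
    for value in {values[best] for values, _ in samples}:
        subset = [(v, l) for v, l in samples if v[best] == value]
        branches[value] = train(subset, rest)
    return (best, branches)


def predict(tree: Tree, values: Dict[str, str], default: str = "manual review") -> str:
    while not isinstance(tree, str):
        feature, branches = tree
        if values.get(feature) not in branches:
            return default  # feature value never seen during training
        tree = branches[values[feature]]
    return tree


# Illustrative training data only.
samples = [
    ({"error_class": "type", "token": "DATETIME"}, "map DATETIME to TIMESTAMP"),
    ({"error_class": "type", "token": "TINYINT"}, "map TINYINT to SMALLINT"),
    ({"error_class": "keyword", "token": "AUTO_INCREMENT"}, "replace with a sequence"),
]
model = train(samples, ["error_class", "token"])
```

A tree whose leaves hold a ranked list of solutions instead of a single label would be closer to the claimed model, which outputs several solutions with priorities; the single-label form is used here only to keep the sketch short.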
10. An electronic device comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to perform the method steps of any of claims 1-5.
CN202011128698.7A 2020-10-20 2020-10-20 Heterogeneous database data migration method, device and equipment Active CN112181951B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011128698.7A CN112181951B (en) 2020-10-20 2020-10-20 Heterogeneous database data migration method, device and equipment

Publications (2)

Publication Number Publication Date
CN112181951A CN112181951A (en) 2021-01-05
CN112181951B CN112181951B (en) 2022-03-25

Family

ID=73923044

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011128698.7A Active CN112181951B (en) 2020-10-20 2020-10-20 Heterogeneous database data migration method, device and equipment

Country Status (1)

Country Link
CN (1) CN112181951B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114328472B (en) * 2022-03-15 2022-05-27 北京数腾软件科技有限公司 AI-based data migration method and system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077401A (en) * 2014-07-04 2014-10-01 用友软件股份有限公司 Database data migration device and method
CN104598531A (en) * 2014-12-25 2015-05-06 广东电子工业研究院有限公司 Incremental data migration method among heterogeneous relational databases based on trigger
CN107958057A (en) * 2017-11-29 2018-04-24 苏宁云商集团股份有限公司 A kind of code generating method and device for being used for Data Migration in heterogeneous database
CN108491891A (en) * 2018-04-04 2018-09-04 桂林电子科技大学 A kind of online transfer learning method of multi-source based on decision tree local similarity
EP3495951A1 (en) * 2017-12-11 2019-06-12 Accenture Global Solutions Limited Hybrid cloud migration delay risk prediction engine
CN110705591A (en) * 2019-03-09 2020-01-17 华南理工大学 Heterogeneous transfer learning method based on optimal subspace learning
CN110704398A (en) * 2019-09-30 2020-01-17 深圳前海环融联易信息科技服务有限公司 Database migration method and device from MySQL to Oracle and computer equipment
CN111095233A (en) * 2017-09-28 2020-05-01 深圳清华大学研究院 Hybrid file system architecture, file storage, dynamic migration and applications thereof
CN111241056A (en) * 2019-12-31 2020-06-05 国网浙江省电力有限公司电力科学研究院 Power energy consumption data storage optimization method based on decision tree model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8280915B2 (en) * 2006-02-01 2012-10-02 Oracle International Corporation Binning predictors using per-predictor trees and MDL pruning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Big Data Analytics Using Hadoop Map Reduce Framework and Data Migration Process; Payal M. Bante et al.; 《2017 International Conference on Computing, Communication, Control and Automation (ICCUBEA)》; 20171231; pp. 1-5 *
Dynamic Data Migration Between Heterogeneous Databases (异构数据库间的动态数据迁移); 姜倩 et al.; 《应用科技》 (Applied Science and Technology); 20050930; Vol. 32, No. 9; pp. 43-45, 49 *

Also Published As

Publication number Publication date
CN112181951A (en) 2021-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant