CN116521652A - Method, system and medium for realizing migration of distributed heterogeneous database based on DataX

Info

Publication number: CN116521652A
Application number: CN202310801510.8A
Authority: CN
Inventors: 辛华; 詹卫许; 李成鸿; 张琦; 吴恺翔; 毛利朋; 汤磊
Original assignee: Southern Power Grid Digital Grid Research Institute Co Ltd
Current assignee: Southern Power Grid Digital Grid Research Institute Co Ltd
Priority date: 2023-07-03
Filing date: 2023-07-03
Publication date: 2023-08-01
Anticipated expiration: 2043-07-03
Also published as: CN116521652B

Abstract

The invention discloses a method, a system and a medium for migrating distributed heterogeneous databases based on DataX. The method comprises: acquiring information about the data source library to be migrated; obtaining migration table information from the data source library information; determining whether the preset target database contains a data table corresponding to the migration table and, if not, obtaining information that the data table does not exist in the preset target database and generating the corresponding data table according to a preset requirement, or, if so, obtaining information that the data table exists in the preset target database; obtaining a JSON file for the migration table from the migration table information; and migrating the data in the data source library to the preset target database according to the JSON file of the migration table and storing the data. Because the heterogeneous database migration is performed through a configuration tool, data migration is freed from fragmentation and excessive specialization and becomes easy for a person skilled in the art to carry out.

Description

Method, system and medium for realizing migration of distributed heterogeneous database based on DataX

Technical Field

The invention relates to the field of data processing and heterogeneous data transmission, in particular to a method, a system and a medium for realizing distributed heterogeneous database migration based on DataX.

Background

In group-level deployments, large digital application systems are numerous and their databases are scattered and complex; different services need to interact across heterogeneous databases, split service data on demand, and so on. To satisfy both group-level service control requirements and local, individual requirements, migration between heterogeneous databases has been designed separately for each project. The migration schemes produced for individual projects are overly complicated, cannot be reused, and do not add up to a unified migration standard, which makes operation and maintenance difficult and flexibility poor. For example: the data types of a project's source database and target database differ; the data type conversion rules are inconsistent across projects, so data structures must be adjusted manually; and the data mapping rules are inconsistent, so the mapping rules must be maintained manually.

Accordingly, there is a need for improvement in the art.

Disclosure of Invention

In view of the above problems, the present invention aims to provide a method, a system and a medium for implementing distributed heterogeneous database migration by using DataX, which can more conveniently migrate heterogeneous databases.

The first aspect of the present invention provides a method for implementing distributed heterogeneous database migration by using DataX, including:

Acquiring information of a migrated data source library;

obtaining migration table information according to the migrated data source library information;

judging whether a corresponding data table of the migration table exists in the preset target database, if not, obtaining the data table non-existence information of the preset target database, and generating the corresponding data table according to the preset requirement; if yes, obtaining the existing information of the data table of the preset target database;

obtaining a json file of the migration table according to the migration table information;

and migrating the data in the migrated data source database to a preset target database according to the json file of the migration table and storing the data.

In this solution, after migrating the data in the migrated data source database to the preset target database and storing the data, the method includes:

acquiring data name information of migration completion;

judging whether the data name of the migration completion is consistent with the data name in the data table, if so, marking the data name in the data table; if not, revising the data table according to the migrated data name, and triggering prompt information;

and sending the prompt information to a user side for prompting.

In this solution, after the data in the migrated data source library completes migration, the method further includes:

Judging whether an unlabeled data name exists in the data table, if so, sending the unlabeled data name to a user side for prompting; if not, comparing and analyzing the data table and the migration table;

judging whether the data names in the data table are consistent with those in the migration table, if so, obtaining data migration completion information; if not, inconsistent data name information is obtained;

and sending the data migration completion information or inconsistent data name information to a user side for display.

In this solution, after the inconsistent data name information is obtained, the method further includes:

obtaining migration log information;

according to the migration log information, obtaining the reason information of inconsistent data names in the migration table and the data table;

matching the reason information of inconsistent data names in the migration table and the data table with a preset adjustment scheme to obtain a matching value;

judging whether the matching value is larger than a preset matching threshold value, if so, extracting a preset adjustment scheme corresponding to the matching value;

and sending the preset adjustment scheme to a user side for display.

In this solution, the step of obtaining the json file of the migration table according to the migration table information specifically includes:

Mapping an id field of a migration table into a gid field based on the requirement of a preset user terminal;

obtaining a field type according to the requirement of a preset user terminal or gid field;

configuring the transmission concurrency of the migration table to obtain the concurrency control of the migration table during data transmission;

and obtaining the json file of the migration table according to the field type, concurrency control and the migration table.

In this solution, the method further includes:

classifying the migration table information according to different data source libraries to obtain migration table information of the same data source library;

classifying migration tables of the same data source base according to different field types to obtain migration table information of different types of the same data source;

and sending the migration table information to a preset task management end according to classification, and displaying migration information of each migration table at the preset task management end.

The second aspect of the present invention provides a system for implementing distributed heterogeneous database migration based on DataX, including a memory and a processor, where the memory stores a program for implementing distributed heterogeneous database migration based on DataX, and the program, when executed by the processor, performs the following steps:

Acquiring information of a migrated data source library;

acquiring data name information of migration completion;

and sending the prompt information to a user side for prompting.

obtaining migration log information;

and sending the preset adjustment scheme to a user side for display.

In this scheme, still include:

A third aspect of the present invention provides a computer medium storing a program for the DataX-based distributed heterogeneous database migration method, where the program, when executed by a processor, implements the steps of the DataX-based distributed heterogeneous database migration method described in any one of the above.

With the DataX-based distributed heterogeneous database migration method, system and medium of the invention, heterogeneous database migration is carried out through a configuration tool, which makes data migration transparent, configurable, individualized and standardized; data migration is freed from fragmentation and over-specialization, and heterogeneous database migration becomes easy for a person skilled in the art to carry out.

Drawings

FIG. 1 is a flow chart of a method for implementing distributed heterogeneous database migration based on DataX in accordance with the present invention;

FIG. 2 illustrates a block diagram of a distributed heterogeneous database migration system implemented based on DataX in accordance with the present invention.

Detailed Description

In order that the above-recited objects, features and advantages of the present invention will be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.

FIG. 1 shows a flow chart of a method for implementing distributed heterogeneous database migration based on DataX according to the present invention.

As shown in FIG. 1, the invention discloses a method for realizing distributed heterogeneous database migration based on DataX, which comprises the following steps:

S102, acquiring information of a migrated data source library;

S104, obtaining migration table information according to the migrated data source library information;

S106, judging whether a corresponding data table of the migration table exists in the preset target database, if not, obtaining the data table non-existence information of the preset target database, and generating the corresponding data table according to the preset requirement; if yes, obtaining the existing information of the data table of the preset target database;

S108, obtaining a json file of the migration table according to the migration table information;

S110, migrating the data in the migrated data source database to a preset target database according to the json file of the migration table and storing the data.

It should be noted that the data source address is determined from the data source library information. Data source addresses can also be added and modified in the data source management configuration, and the corresponding account credentials are stored as ciphertext to ensure data security. The data to be migrated is obtained from the migrated data source library. The service that synchronizes the data runs on multiple nodes and supports distributed deployment: the deployment comprises a management node and a plurality of data synchronization nodes, and the management node automatically distributes synchronization tasks to the data synchronization nodes. Each data synchronization node corresponds to only one database, and each database supports concurrent migration by a plurality of data synchronization nodes.

There may be a plurality of migration tables. Each migration table holds various data names and records the data names that need to be migrated. A migration table, or the data names in it, can be located by one or more methods such as fuzzy table-name matching and exact-match search. After a migration table is selected, the preset target database is automatically queried for a corresponding data table. If no corresponding data table exists, a message stating that the data table does not exist pops up, and DDL statements are pre-generated automatically according to the type of the preset target database; the user either revises them manually according to the actual requirements of the project or executes them directly in the target database to generate the corresponding data table.

The preset requirement is the user's actual requirement for the project, and DDL is the data definition language used to define the database schema. The preset data migration mode is either full transmission or incremental transmission, selected by the user according to actual requirements.
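The existence check and DDL pre-generation of step S106 can be illustrated with a minimal sketch. The type mapping, the function name `pregenerate_ddl` and the sample table are hypothetical and are not taken from the patent; a real tool would read column metadata from the source database and keep one mapping per pair of database types.

```python
# Hypothetical source-to-target type mapping (assumption, not from the patent).
TYPE_MAP = {
    "NUMBER": "BIGINT",
    "VARCHAR2": "VARCHAR",
    "DATE": "DATETIME",
}


def pregenerate_ddl(table, columns, existing_tables):
    """Return None if the target already has the table, else a draft DDL string.

    `columns` is a list of (name, source_type, length_or_None) tuples.
    """
    if table.lower() in {t.lower() for t in existing_tables}:
        return None  # "data table exists" branch of S106
    parts = []
    for name, src_type, length in columns:
        # Unknown types are passed through unchanged for manual revision.
        tgt_type = TYPE_MAP.get(src_type.upper(), src_type.upper())
        if length is not None and tgt_type == "VARCHAR":
            tgt_type = f"{tgt_type}({length})"
        parts.append(f"  {name} {tgt_type}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(parts) + "\n);"


ddl = pregenerate_ddl(
    "meter_reading",
    [("id", "NUMBER", None), ("station", "VARCHAR2", 64), ("read_at", "DATE", None)],
    existing_tables=["customer"],
)
print(ddl)
```

The returned string is only a draft: as described above, the user may revise it by hand or execute it directly in the target database.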

According to an embodiment of the present invention, after migrating and storing data in a migrated data source database to a preset target database, the method includes:

acquiring data name information of migration completion;

and sending the prompt information to a user side for prompting.

It should be noted that the data names in the migration table and the data table should be consistent. If a data name whose migration has completed does not exist in the data table, either the corresponding migrated data is erroneous or the data name is missing from the data table, so the prompt information is triggered. If the data name exists in the data table, it is marked there, for example with a check mark, to show that migration of that data name in the corresponding data table is complete.
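The marking step can be sketched as follows. The function name and the message text are hypothetical, and recording a newly added name as already marked is an assumption; the patent only states that the data table is revised and a prompt is triggered.

```python
def mark_migrated(data_table_names, migrated_names):
    """Mark migrated names in the data table and collect prompts for unknown ones."""
    marks = {name: False for name in data_table_names}
    prompts = []
    for name in migrated_names:
        if name not in marks:
            # The name is missing from the data table: revise the table and prompt.
            prompts.append(f"'{name}' was migrated but was missing from the data table")
        # Assumption: a name added by revision is recorded as marked as well.
        marks[name] = True
    return marks, prompts


marks, prompts = mark_migrated(["customer", "invoice"], ["customer", "audit_log"])
```

Here `marks` plays the role of the check marks in the data table, and `prompts` is what would be sent to the user side.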

According to the embodiment of the invention, after the data in the migrated data source library is migrated, the method further comprises the following steps:

After the data in the migrated data source library has been migrated, it is detected whether all data names in the data table are marked. If not, the unmarked data names are sent to the user side as a prompt. If so, the data table is checked against the migration table, whose data names should be identical; if they are not, either some data has not been migrated or some data is not recorded in the data table or the migration table.
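A minimal sketch of this two-stage check, assuming the marks are held in a dictionary from data name to a boolean; the function name and the shape of the returned result are hypothetical.

```python
def reconcile(marks, migration_table_names):
    """Report unmarked names first, then compare the data table with the migration table."""
    unmarked = sorted(name for name, done in marks.items() if not done)
    if unmarked:
        return {"status": "unmarked", "names": unmarked}
    # Names present in only one of the two tables are inconsistent.
    inconsistent = sorted(set(marks) ^ set(migration_table_names))
    if inconsistent:
        return {"status": "inconsistent", "names": inconsistent}
    return {"status": "complete", "names": []}


result = reconcile({"customer": True, "invoice": True}, ["customer", "invoice", "payment"])
```

The symmetric difference is used so that a name missing from either table is reported, matching the two possible causes named above.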

According to the embodiment of the invention, after inconsistent data name information is obtained, the method further comprises the following steps:

obtaining migration log information;

and sending the preset adjustment scheme to a user side for display.

When an abnormal condition such as a migration error or an interruption occurs while a migration table is being migrated, the condition is shown in the migration log. The preset adjustment scheme library stores handling methods for various migration errors. When matching yields several adjustment methods for a migration table, they are ranked by a preset measure of operational simplicity and sent to the user side for display, with simpler methods ranked first.
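The patent does not say how the matching value is computed. The sketch below assumes, purely for illustration, a string-similarity ratio from Python's `difflib` as the matching value and a step count as the simplicity measure; the scheme library and its field names are hypothetical.

```python
from difflib import SequenceMatcher


def match_adjustments(cause, schemes, threshold=0.6):
    """Return fixes whose matching value exceeds the threshold, simplest first."""
    hits = []
    for scheme in schemes:
        # Assumption: the matching value is a text similarity in [0, 1].
        value = SequenceMatcher(None, cause.lower(), scheme["cause"].lower()).ratio()
        if value > threshold:
            hits.append((scheme["steps"], -value, scheme["fix"]))
    # Fewer steps rank first; ties are broken by the higher matching value.
    return [fix for _, _, fix in sorted(hits)]


schemes = [
    {"cause": "column type mismatch", "fix": "edit the type mapping and re-run", "steps": 3},
    {"cause": "column type mismatch", "fix": "re-generate the DDL", "steps": 1},
    {"cause": "zzzz", "fix": "unrelated", "steps": 1},
]
ranked = match_adjustments("column type mismatch", schemes)
```

Any other scoring function could be substituted for the similarity ratio; only the threshold comparison and the simplicity ranking come from the text.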

According to the embodiment of the invention, the step of obtaining the json file of the migration table according to the migration table information specifically includes:

It should be noted that the requirement of the preset user side is a personalized user requirement, and field mapping is configured according to it. For example, if the id field of the database needs to be mapped to a gid field, the id field can be selected in the field mapping and manually maintained as gid; here gid denotes group and serves as a unique identifier for the field. Migration of a migration table is governed by two layers of concurrency control: data synchronization node concurrency, and source and target database concurrency. Data synchronization node concurrency means that several data synchronization nodes can be designated to initiate migration tasks, which avoids running short of resources on any one data synchronization node. Source and target database concurrency means that the number of concurrent reads and writes is set according to the performance and resource conditions of the source and target databases.
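As an illustration, a job file with the id-to-gid mapping and a channel count might be assembled as below. The layout follows the general shape of DataX job files for the MySQL reader and writer plugins as commonly documented; other plugins take different parameters, and the helper name, hosts and credentials are hypothetical placeholders. Only the database-level concurrency (the `channel` setting) is shown; node-level concurrency is handled outside the job file.

```python
import json


def build_job(table, columns, field_map, channels, src, dst):
    """Build a DataX-style job dict; reader and writer columns pair up by position."""
    writer_columns = [field_map.get(col, col) for col in columns]
    return {
        "job": {
            "setting": {"speed": {"channel": channels}},
            "content": [{
                "reader": {
                    "name": src["plugin"],
                    "parameter": {
                        "username": src["username"],
                        "password": src["password"],
                        "column": list(columns),
                        "connection": [{"table": [table], "jdbcUrl": [src["jdbc_url"]]}],
                    },
                },
                "writer": {
                    "name": dst["plugin"],
                    "parameter": {
                        "username": dst["username"],
                        "password": dst["password"],
                        "column": writer_columns,
                        "connection": [{"table": [table], "jdbcUrl": dst["jdbc_url"]}],
                    },
                },
            }],
        }
    }


# Placeholder connection details (assumptions for the example only).
src = {"plugin": "mysqlreader", "username": "reader", "password": "***",
       "jdbc_url": "jdbc:mysql://src-host:3306/crm"}
dst = {"plugin": "mysqlwriter", "username": "writer", "password": "***",
       "jdbc_url": "jdbc:mysql://dst-host:3306/crm"}
job = build_job("customer", ["id", "name"], {"id": "gid"}, channels=4, src=src, dst=dst)
print(json.dumps(job, indent=2))
```

Because the columns are matched by position, renaming `id` to `gid` only requires changing the writer-side column list.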

According to an embodiment of the present invention, further comprising:

It should be noted that the migration table information is classified by data source, for example Oracle, MySQL, and domestic databases such as SequoiaDB and Dameng, to obtain the migration tables that belong to the same database. The migration information of each migration table is sent to a list in task management for display, and the task management side can start, stop, edit, delete, execute and view the migration information of a migration table.
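The two-level classification, first by data source library and then by field type, can be sketched with nested grouping. The record fields `source`, `field_type` and `name` are hypothetical.

```python
from collections import defaultdict


def classify(tables):
    """Group migration tables by source library, then by field type."""
    grouped = defaultdict(lambda: defaultdict(list))
    for table in tables:
        grouped[table["source"]][table["field_type"]].append(table["name"])
    # Convert to plain dicts so the result is easy to serialize and display.
    return {source: dict(by_type) for source, by_type in grouped.items()}


tables = [
    {"name": "customer", "source": "oracle_crm", "field_type": "varchar"},
    {"name": "invoice", "source": "oracle_crm", "field_type": "number"},
    {"name": "meter", "source": "mysql_ops", "field_type": "number"},
    {"name": "contract", "source": "oracle_crm", "field_type": "varchar"},
]
classified = classify(tables)
```

The nested result maps directly onto a grouped list such as the one the task management side would display.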

According to an embodiment of the present invention, before the data is migrated according to json files in a migration table, the method includes:

acquiring an available memory value of a target database and a memory value of a migration table;

judging whether the memory value of the migration table is smaller than or equal to the available memory value of the target database, if so, indicating that the target database can accommodate the memory value of the migration table; otherwise, triggering the memory shortage information.

Before a migration table is migrated, the size of its data is calculated to determine the memory value of the migration table, and the available memory of the target database is calculated to determine its available memory value. When the memory value of the migration table is smaller than or equal to the available memory value of the target database, the migration table can be migrated; otherwise, the available memory of the target database is insufficient and the data migration operation is invalid.
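The capacity check reduces to a single comparison per migration table. In the sketch below the sizes are plain byte counts supplied by the caller; how they are measured is not specified in the patent, and the function name is hypothetical.

```python
def check_capacity(table_sizes, target_available):
    """Split migration tables into those that fit the target and those that do not."""
    fits, insufficient = [], []
    for name, size in table_sizes.items():
        # A table may be migrated only if its size does not exceed the available value.
        (fits if size <= target_available else insufficient).append(name)
    return fits, insufficient


fits, insufficient = check_capacity({"customer": 200, "invoice": 900}, target_available=500)
```

Tables in `insufficient` would trigger the insufficient-memory information rather than start a migration.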

According to an embodiment of the present invention, before the data is migrated according to the json file of the migration table, the method further includes:

obtaining migration plan information of a migration table;

extracting time or sequencing information set in a migration plan of a migration table;

and carrying out data migration on the migration table according to the set time or the sequence.

It should be noted that a plan for starting the migration needs to be drawn up before a migration table is migrated. The plan may set the start time, for example starting immediately, starting at a specific time, or migrating on a periodic schedule, and it may set the order in which migrations start, for example by selected time or by memory size.
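A minimal sketch of ordering a migration plan, assuming each entry carries an optional start time (`None` meaning "start immediately") and a size. Breaking ties by ascending size is an assumption, and periodic schedules are not covered.

```python
from datetime import datetime


def order_plan(plans, now):
    """Order migration tables by start time, then by size (smaller first)."""
    def sort_key(plan):
        start = plan["start_at"] if plan["start_at"] is not None else now
        return (start, plan["size"])
    return [plan["name"] for plan in sorted(plans, key=sort_key)]


now = datetime(2023, 7, 3, 9, 0)
plans = [
    {"name": "big", "start_at": None, "size": 500},
    {"name": "late", "start_at": datetime(2023, 7, 3, 22, 0), "size": 1},
    {"name": "small", "start_at": None, "size": 10},
]
order = order_plan(plans, now)
```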

According to an embodiment of the present invention, after the data is migrated according to the json file of the migration table, the method further includes:

acquiring table quantity information of a data source database and a target database;

judging whether the migration table numbers of the data source database and the target database are consistent, if so, indicating that the data migration is successful; if not, triggering a data migration failure prompt and displaying the data which is not migrated successfully and the data source library where the corresponding data is located.

It should be noted that after the migration task is completed, the table counts of the data source database and the target database are obtained. If the counts of migration tables match, the migration is complete; otherwise, the migration was interrupted or an error occurred.
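The count comparison, together with the report of which tables failed and which source library they came from, can be sketched as follows. The function name and result layout are hypothetical, and table names are assumed to be unique across source libraries.

```python
def verify_migration(expected, target_tables):
    """Compare migration table counts and list missing tables per source library.

    `expected` maps a source library name to the migration tables taken from it.
    """
    present = set(target_tables)
    missing = {}
    for source, names in expected.items():
        absent = sorted(set(names) - present)
        if absent:
            missing[source] = absent
    expected_count = sum(len(set(names)) for names in expected.values())
    migrated_count = expected_count - sum(len(names) for names in missing.values())
    return {
        "success": migrated_count == expected_count,
        "expected": expected_count,
        "migrated": migrated_count,
        "missing": missing,
    }


report = verify_migration(
    {"oracle_crm": ["customer", "invoice"], "mysql_ops": ["meter"]},
    target_tables=["customer", "meter"],
)
```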

According to an embodiment of the present invention, further comprising:

obtaining migration task information set by a user side;

and sending the migration task set by the user side to a preset system management side for display.

The user side is an ordinary user, who can only view, add and modify their own data source address information, synchronization tasks and similar items. The preset system management side is a system administrator, who has all rights and can view, add and modify global parameters. Separating account rights in this way further improves the accuracy of migration tasks.

According to an embodiment of the present invention, further comprising: and sending the json file of the migration table to a preset configuration information table for storage.

When a migration table has been migrated to the wrong address, or needs to be migrated back to its original address, a reverse JSON file is generated from the JSON file of that migration table, and the migration table is returned to the original database by means of the reverse JSON file.
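One way to derive a reverse job is to swap the reader and writer sections of the stored job. The sketch below assumes that plugin names follow a `<database>reader` / `<database>writer` pattern and that the reader takes `jdbcUrl` as a list while the writer takes a string, as in the relational plugins; plugin-specific parameters that exist on only one side would still need manual review. The helper names and sample job are hypothetical.

```python
import copy


def _swap_role(plugin, old, new):
    """Turn e.g. 'mysqlwriter' into 'mysqlreader' (naming pattern is an assumption)."""
    if not plugin.endswith(old):
        raise ValueError(f"unexpected plugin name: {plugin}")
    return plugin[: -len(old)] + new


def reverse_job(job):
    """Return a new job that moves the data back from target to source."""
    rev = copy.deepcopy(job)  # leave the stored job untouched
    for item in rev["job"]["content"]:
        old_reader, old_writer = item["reader"], item["writer"]
        # Adjust the jdbcUrl shape for the new role of each side.
        for conn in old_writer["parameter"]["connection"]:
            if isinstance(conn["jdbcUrl"], str):
                conn["jdbcUrl"] = [conn["jdbcUrl"]]
        for conn in old_reader["parameter"]["connection"]:
            if isinstance(conn["jdbcUrl"], list):
                conn["jdbcUrl"] = conn["jdbcUrl"][0]
        item["reader"] = {"name": _swap_role(old_writer["name"], "writer", "reader"),
                          "parameter": old_writer["parameter"]}
        item["writer"] = {"name": _swap_role(old_reader["name"], "reader", "writer"),
                          "parameter": old_reader["parameter"]}
    return rev


job = {"job": {"setting": {"speed": {"channel": 2}}, "content": [{
    "reader": {"name": "mysqlreader", "parameter": {
        "column": ["id", "name"],
        "connection": [{"table": ["customer"], "jdbcUrl": ["jdbc:mysql://src-host:3306/crm"]}]}},
    "writer": {"name": "oraclewriter", "parameter": {
        "column": ["gid", "name"],
        "connection": [{"table": ["customer"], "jdbcUrl": "jdbc:oracle:thin:@dst-host:1521:orcl"}]}},
}]}}
reversed_job = reverse_job(job)
```

Because the column lists travel with their sections, the gid-to-id mapping is reversed automatically.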

According to an embodiment of the present invention, further comprising:

acquiring a first quantity value of a migration table based on a preset time period;

acquiring a second quantity value of a migration table of which the one-time migration is successful;

Dividing the second quantity value by the first quantity value to obtain the one-time migration success rate of the migration table;

judging whether the one-time migration success rate of the migration table is smaller than a preset success rate threshold value, if so, extracting migration log information;

and adjusting the data migration step according to the migration log information.

It should be noted that the one-time migration success rate of the migration table is the second quantity value divided by the first quantity value for the user side within the preset time period. The migration log information includes, for example, the location where an error occurred during data migration and any format mismatches. The preset success rate threshold is set by a person skilled in the art.
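The success-rate check is a simple ratio followed by a threshold comparison. In this sketch the function names are hypothetical, and returning `None` when no tables were migrated in the period is an assumption made to avoid division by zero.

```python
def one_time_success_rate(first_quantity, second_quantity):
    """Second quantity (first-attempt successes) divided by first quantity (all tables)."""
    if first_quantity == 0:
        return None  # assumption: no rate is defined for an empty period
    return second_quantity / first_quantity


def needs_adjustment(rate, threshold):
    """True when the rate is below the preset threshold and the log should be reviewed."""
    return rate is not None and rate < threshold


rate = one_time_success_rate(first_quantity=20, second_quantity=17)
review_log = needs_adjustment(rate, threshold=0.9)
```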

As shown in FIG. 2, a second aspect of the present invention provides a DataX-based distributed heterogeneous database migration system 2, which includes a memory 21 and a processor 22, where the memory stores a program for the DataX-based distributed heterogeneous database migration method, and the program implements the following steps when executed by the processor:

Acquiring information of a migrated data source library;

acquiring data name information of migration completion;

and sending the prompt information to a user side for prompting.

obtaining migration log information;

and sending the preset adjustment scheme to a user side for display.

According to an embodiment of the present invention, further comprising:

obtaining migration plan information of a migration table;

According to an embodiment of the present invention, further comprising:

obtaining migration task information set by a user side;

According to an embodiment of the present invention, further comprising:

The invention discloses a method, a system and a medium for migrating distributed heterogeneous databases based on DataX. The method comprises: acquiring information about the data source library to be migrated; obtaining migration table information from the data source library information; determining whether the preset target database contains a data table corresponding to the migration table and, if not, triggering a prompt that the data table does not exist and generating the corresponding data table according to a preset requirement, or, if so, displaying that the corresponding data table exists; obtaining a JSON file for the migration table from the migration table information; and migrating the data according to the JSON file of the migration table, based on a preset data migration mode. Because the heterogeneous database migration is performed through a configuration tool, data migration becomes transparent, configurable, individualized and standardized; it is freed from fragmentation and over-specialization, and heterogeneous database migration becomes easy for a person skilled in the art to carry out.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division into units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be made through some interface, and the indirect coupling or communication connection between devices or units may be electrical, mechanical or of another form.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.

Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium that can store program code.

Alternatively, the above-described integrated units of the present invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.

Claims

1. The method for realizing the migration of the distributed heterogeneous database based on the DataX is characterized by comprising the following steps of:

acquiring information of a migrated data source library;

2. The method for implementing distributed heterogeneous database migration based on DataX according to claim 1, wherein after migrating and storing the data in the migrated data source database to a preset target database, the method comprises:

acquiring data name information of migration completion;

And sending the prompt information to a user side for prompting.

3. The method for implementing distributed heterogeneous database migration based on DataX according to claim 1, further comprising, after the migration of the data in the migrated data source library is completed:

4. The method for implementing distributed heterogeneous database migration based on DataX according to claim 3, further comprising, after obtaining the inconsistent data name information:

obtaining migration log information;

and sending the preset adjustment scheme to a user side for display.

5. The method for implementing distributed heterogeneous database migration based on DataX according to claim 1, wherein the step of obtaining the json file of the migration table according to the migration table information specifically includes:

6. The method for implementing distributed heterogeneous database migration based on DataX of claim 1, further comprising:

7. The distributed heterogeneous database migration system based on the DataX is characterized by comprising a memory and a processor, wherein the memory stores a distributed heterogeneous database migration method program based on the DataX, and the distributed heterogeneous database migration method program based on the DataX realizes the following steps when being executed by the processor:

acquiring information of a migrated data source library;

8. The system for implementing distributed heterogeneous database migration based on DataX according to claim 7, wherein after migrating and storing the data in the migrated data source database to the preset target database, the system comprises:

Acquiring data name information of migration completion;

and sending the prompt information to a user side for prompting.

9. The DataX-based distributed heterogeneous database migration system of claim 7, further comprising, after the data in the migrated data source library has completed migration:

10. A computer medium, wherein a DataX-based distributed heterogeneous database migration method program is stored in the computer medium, and the DataX-based distributed heterogeneous database migration method program is executed by a processor to implement the steps of the DataX-based distributed heterogeneous database migration method according to any one of claims 1 to 6.