Detailed Description
The present disclosure now will be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the disclosure are shown. The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort, shall fall within the scope of protection of the present disclosure. The technical solution of the present disclosure is described in various aspects below with reference to various figures and embodiments.
The terms "first" and "second" are used only for descriptive distinction and have no other special meaning.
Currently, data migration is usually performed by relying on a data migration tool carried by a database manufacturer. For example, when the source and target libraries are both mysql and the database table structures are completely consistent, data synchronization may be performed using mysql binlog, which is used to record all or potentially updated data; when the source library is the mysql library but the target library is other databases, a plug-in can be used for data synchronization, and the plug-in can be a canal plug-in ali; when the source library and the target library are both oracle, dblink can be used for data synchronization; when the source library and the target library are both sql server databases, data synchronization can be performed by using a database publishing and subscribing mode.
The existing data migration technical scheme depends on a data migration tool provided by a database manufacturer, a source library and a target library are necessarily the same database, the structures of the source library and the target library are necessarily identical, and the field type is also the same type. Most of the existing data migration technical schemes perform data migration under the condition that an application program stops service, and are not suitable for a core service system with uninterruptible service; most of existing data migration technical schemes adopt synchronous data verification, namely, one piece of synchronous data is verified, efficiency is low, and when migration failure occurs in a migration process, error exit is usually directly reported, and great repetitive work can be brought to data migration with large data volume.
Fig. 1 is a schematic flow chart diagram of an embodiment of a database data migration method according to the present disclosure, as shown in fig. 1:
step 101, configuring data mapping information between a source database table and a target database table.
In one embodiment, the source database is a data synchronization source database, the source database may be a variety of databases conforming to standard SQL, and the source database table is a data table in the source database. The target database is a target database for data synchronization, the target database can be various databases conforming to standard SQL, and the target database table is a data table in the target database.
The data mapping information includes information such as the correspondence between data tables and the mapping between data table fields. The database types of the source database and the target database may be different, for example, the source database is oracle and the target database is mysql. The source database table and the target database table may differ by one or more fields.
And 102, synchronizing the data in the source database table in the target database table based on the data mapping information under the condition that the source database and the target database provide services.
In one embodiment, the data synchronization between the source database and the target database is realized in a state that the source database and the target database still provide services such as query, data insertion and the like.
And 103, after data synchronization operation is performed between the source database table and the target database table, checking data in the source database table and the target database table, and judging whether the data synchronization operation is correct or not and performing corresponding processing based on a checking result.
And step 104, after the data synchronization operation is judged to be correct, synchronizing the newly added data in the target database table in the source database table based on the data mapping information.
In one embodiment, data link configuration information of the source database and the target database is configured, and the data link configuration information includes a database type, a link address, a connection string, a database username and password, and the like. And respectively configuring a source database link pool and a target database link pool based on the data link configuration information, and respectively storing the source database link and the target database link.
Fig. 2 is a schematic flowchart of configuring a database connection pool in an embodiment of a database data migration method according to the present disclosure, as shown in fig. 2:
step 201, a source database connection pool and a target database connection pool are created, and the maximum database connection number is determined.
And step 202, respectively generating a source database link and a target database link based on the data link information and the maximum database connection number, and respectively placing the source database link and the target database link into a source database connection pool and a target database connection pool.
And acquiring data mapping information, wherein the data mapping information comprises the corresponding relation between data tables and the mapping relation between data table fields. And synchronizing the data in the source database table in the target database table based on the data mapping information by using the source database link in the source database connection pool and the target database link in the target database link pool.
In one embodiment, a plurality of database data source connection modes can be configured through a management terminal program, and data migration is carried out seamlessly. And embedding the common database connection jar package into the management terminal program, and configuring a database connection mode corresponding to the jar package in the configuration file. For example, the source database is oracle, and the target database is mysql; the source database table is a t _ user table in the oracle database, and the structure of the source database table is shown in table 1 below:
name of field
|
Field attributes
|
id
|
Number(12)
|
user_name
|
Varchar2(64)
|
age
|
Number(2)
|
create_date
|
Date |
Table 1 t _ user structure table the target database table is a t _ person table in mysql library, and its structure is shown in table 2 below:
TABLE 2 structural Table of t _ person
The data link configuration information may be stored in a configuration file (profiles). The management terminal can provide a web terminal management interface for configuring information such as database link addresses, connection character strings, database user names and passwords of the source database and the target database. The source database link is an oracle database link, and the target database link is a mysql database link.
The types and versions of the source database and the target database are obtained, and the field type conversion information between the source database and the target database is configured. For example, the target database is mysql and the date field needs to convert the character type to a date type using the str _ to _ date function. The field type conversion information may be stored in a config.
And configuring data mapping information, wherein the data mapping information comprises mapping between data tables of a source database and a target database and mapping between data table fields. For example, the t _ user table in the source database corresponds to the t _ person table in the target database, and the user _ name field in the t _ user table corresponds to the person _ name field in the t _ person table. The data mapping information may be stored in a configuration file (mapping.
The thread configuration number is corresponding to the source database table, and can be flexibly configured according to the data volume of the source database table, so that all table data can be migrated as simultaneously as possible. The number of threads is stored in a config.
The frequency of timing the data to be captured (data synchronization frequency) is configured to preferably correspond to the execution time of a single thread, for example, the execution time of a single thread is about 2 seconds, and the data synchronization frequency is set once for 2.5 seconds. Data synchronization frequency is stored in config. The number of pieces of data of the source database table captured each time can be configured for paging of the source database query, and the configuration item is stored in the configg.
Fig. 3 is a schematic flowchart of data synchronization in an embodiment of a database data migration method according to the present disclosure, as shown in fig. 3:
step 301, configuring data synchronization frequency information corresponding to a source database table.
Step 302, generating a first task corresponding to the source database table, and setting the execution frequency of the first task based on the data synchronization frequency information; wherein the first task is executed using a thread corresponding to the source data table.
In one embodiment, a task refers to an activity performed by software, the task being performed by a process or thread. The thread corresponding to the source data table may be acquired or started from the thread pool, the execution frequency of the first task is set based on the data synchronization frequency information, and the first task is executed using the thread corresponding to the source data table (timed execution).
And 303, acquiring a target database link from the target database link pool based on the first task, and acquiring data information needing to be synchronized by using the acquired target database link.
In one embodiment, the retrieved target database link is used to query the maximum ID value corresponding to the data record in the target database table.
Step 304, obtaining the source database link from the source database connection pool based on the first task, and obtaining the data to be synchronized from the source database table based on the obtained source database link and the data information to be synchronized.
In one embodiment, the obtained source database link is used to obtain data with a data record ID value greater than the maximum ID value in the target database table from the source database table as the data to be synchronized.
And 305, inserting the data needing synchronization into a target database table by using the acquired target database link and based on the data mapping information.
And acquiring field type conversion information between the source database and the target database, and performing type conversion processing on the data to be synchronized based on the field type conversion information when the data to be synchronized is inserted into the target database table.
In one embodiment, database link pools of the source database and the target database are respectively created according to the data link configuration information. Quartz is an open source project in the Job scheduling field for OpenSymphony open source organization, can be combined with J2EE and J2SE application programs or used independently, and can be used for creating Jobs. And dynamically creating job (tasks) by using Quartz according to the number of database tables in the configuration file, wherein each job _ name corresponds to one database table, and setting the execution frequency of each job equal to the preset data synchronization frequency information.
And dynamically creating a first task corresponding to the source database table, wherein the name of the first task is jobuser, and the first task is a timing task. The first task needs to do the following work:
1. and acquiring a target database link from the target database link pool. For example, a database link is obtained from the oracle library link pool.
2. And querying the maximum value of the id in the target database table (the data before the maximum value is necessarily stored in the target database table, and only the data after the maximum value needs to be synchronized), and storing the maximum value of the id in the variable maxId. For example, the largest ID of the t _ person table in the mysql library needs to be queried, which is the ID of the first task starting next time, and the query statement is:
select max(id)maxId from t_person。
3. a source database link is obtained from a source database link pool. For example, a database link is obtained from the mysql library link pool.
4. For querying data needing synchronization from a source database table, for example, querying in an oracle database, the sql statement is as follows:
select a1.*(select t_user.*,rownum rn from t_user where id>maxId)t1 where rn>=limitNum;
the query result set is stored in a List object.
5. And acquiring field type conversion information between the source database and the target database, and checking whether a field needing translation exists. For example, the target database is the mysql database and the Create _ date field is the date type, the sql statement needs to be concatenated to str _ to _ date (Create _ date, 'yyyyy-MM-dd').
6. And (4) inserting data in batches by using a batch insertion method in the target database table. For example, the target database is a mysql library, and the batch-inserted sql statements are as follows:
insert _ int _ person (id, person _ name, age, create _ date) VALUES (1,' Zhang three)
',20, str _ to _ date ('2017-09-09',' yyyyy-MM-dd ')), (2,' Liquan
',21, str _ to _ date ('2017-09-09', ' yyyyy-MM-dd ')), (3, ' WangpuWo '), (see FIGS.)
',22,str_to_date('2017-09-09','yyyy-MM-dd’))。
And repeating the steps 1-6 until the data synchronization is finished. After the data synchronization is finished, before the data source is not switched to the target library, if the source database has newly added data, the data source and the data source are synchronized to the target library in real time.
Fig. 4 is a schematic flowchart of data updating in an embodiment of a database data migration method according to the present disclosure, as shown in fig. 4:
step 401, after inserting the data to be synchronized into the target database table, generating a second task corresponding to the source database table; wherein the second task is executed using a thread corresponding to the source data table.
Step 402, obtaining a source database link from the source database connection pool based on the second task, and obtaining the modified data of the source database table during the data synchronization operation by using the obtained source database link.
And step 403, acquiring a target database link from the target database link pool based on the second task, and updating data corresponding to the modified data in the target database table by using the acquired target database link.
In one embodiment, since the data source is not switched during the synchronization process, and the production data or the production data is written into the source database, it is highly likely that the data synchronized to the target database has been modified during the synchronization process, and therefore, an incremental data modification needs to be performed.
A source database link is obtained from a source database link pool, for example, a database link is obtained from a mysql database link pool. Querying a record of modified data in a source database table from a data synchronization time point, wherein the modified data after the time point is synchronized can be presented through modified, and as long as the modified time is described as modified after the current time point, corresponding sql statements are as follows:
select from t _ user modified > is the data synchronization time point.
The query result set is stored in a List object. And updating the corresponding record in the target database table, wherein the corresponding sql statement is as follows:
update t_person set person_name=#t_user.user_name#where id=#t_user.id#。
this second task runs until the data source is not switched to the target repository, ensuring that the data between the two databases remains consistent at all times.
Fig. 5 is a schematic flowchart of a verification process in an embodiment of a database data migration method according to the present disclosure, as shown in fig. 5:
step 501, generating a third task corresponding to a source database table; wherein the third task is executed using a thread corresponding to the source data table.
Step 502, obtaining a source database link from the source database connection pool based on the third task, querying data corresponding to the data synchronization operation and stored in the source database table by using the obtained source database link, converting all field values contained in the data into character types and performing splicing processing, and performing encryption processing on the spliced character string to obtain a first encrypted value.
Step 503, acquiring a target database link from the target database connection pool based on the third task, querying data corresponding to the data synchronization operation and stored in the target database table by using the acquired target database link, converting all field values contained in the data into character types, splicing, and encrypting the spliced character string to obtain a second encrypted value.
If the first cryptographic value is determined to be the same as the corresponding second cryptographic value, then the data synchronization is determined to be correct, step 504.
Step 504, if the first cryptographic value is determined to be different from the corresponding second cryptographic value, determining that the data synchronization is abnormal, and performing corresponding abnormal processing. The exception handling may be a variety of processes, such as notifying a user of an exception, resuming data synchronization, and the like.
Fig. 6 is a schematic flowchart of a verification process in another embodiment of the database data migration method according to the present disclosure, as shown in fig. 6:
step 601, obtaining a source database and a target database link.
Step 602, reading mapping.
Step 603, start multithreading.
In one embodiment, according to the number of source database tables, Quartz is used to dynamically create verification jobs, i.e., a third task, where each jobname corresponds to one source database table and one jobb is responsible for verifying one source database table and the corresponding target database table. And respectively acquiring database links from the source database link pool and the target database link pool. Reading the configuration information in mapping and properties files, and acquiring all field information of all data tables in the database.
Step 604-. Reading one record in the list each time, converting all fields of each record into character types to perform character splicing operation, and then performing md5 operation on the spliced character strings to obtain the md5 value. And comparing whether the two corresponding md5 values are consistent, if so, indicating that the synchronization of the data is correct, and if not, recording the inconsistent id.
Step 609, prompt the user for the data synchronization exception.
The user may resynchronize the table, step 610.
Step 611, prompt the user that the data migration is successful.
Fig. 7 is a schematic flowchart of performing data inverse synchronization in an embodiment of a database data migration method according to the present disclosure, as shown in fig. 7:
step 701, generating a fourth task corresponding to the target database table; wherein the fourth task is executed using the thread corresponding to the target data table.
Step 702, acquiring a target database link from the target database connection pool based on the fourth task, and querying new data of the target data table after the data synchronization operation by using the acquired target database link.
In one embodiment, after the data synchronization operation, the target database connection pool may be switched to the master database, the production data is written into the target database, and the acquired target database is used to link and query the new data of the target database table after the data synchronization operation.
Step 703, obtaining the source database link from the source database connection pool based on the fourth task, and synchronizing the newly added data in the source data table by using the obtained source database link and based on the data mapping information.
In one embodiment, the reverse data synchronization process needs to be started after the data has been synchronized and verified. The reverse data synchronization process is to add data reversely on the basis of forward data migration, namely, synchronize data from a target database to a source database, so that whether data is written to the source database or the target database, the data of the two databases can be guaranteed to be consistent all the time, a data source can be migrated smoothly, and the purpose of migrating the database of an online system without stopping the online system is achieved.
And a set of reverse jobs is newly added, namely a fourth task, the data synchronization process multiplexes the forward data synchronization processing process, and the data source of the reverse jobs program is opposite to the forward jobs data source, namely the reverse jobs program takes the target database as a master database and the source database as a slave database. And performing inverse synchronous operation from the target database to the source database, and after the data of the source database is completely consistent with that of the target database, transferring the online instance database to the target database in batches.
In one embodiment, as shown in fig. 8, the present disclosure provides a database data migration apparatus 80, comprising: a mapping configuration module 81, a data synchronization module 82, a data verification module 83, and a reverse synchronization module 84. The mapping configuration module 81 configures data mapping information between a source database table and a target database table; wherein the data mapping information includes: the correspondence between data tables and the mapping between data table fields. The data synchronization module 82 synchronizes the data in the source database table in the target database table based on the data mapping information in a state that the source database and the target database provide services.
The data checking module 83 performs data synchronization operation between the source database table and the target database table, checks data in the source database table and the target database table, and determines whether the data synchronization operation is correct and performs corresponding processing based on a check result. After the reverse synchronization module 84 determines that the data synchronization operation is correct, the newly added data in the target database table is synchronized in the source database table based on the data mapping information.
In one embodiment, as shown in FIG. 9, the database data migration apparatus 80 further includes a link configuration module 85 and a link generation module 86. Link configuration module 85 configures data link configuration information for the source database and the target database. The link generation module 86 configures the source database link pool and the target database link pool based on the data link configuration information, and stores the source database link and the target database link respectively.
The data link configuration information comprises information such as database type, link address, connection character string and database user name and password. The link generation module 86 creates a source database link pool and a target database link pool and determines the maximum database connection number; the link generation module 86 generates a source database link and a target database link based on the data link information and the maximum database connection number, and puts them into the source database connection pool and the target database connection pool, respectively.
In one embodiment, the data synchronization module 82 synchronizes data in the source database table in the target database table based on the data mapping information using the source database link in the source database connection pool and the target database link in the target database link pool.
As shown in fig. 10, the data synchronization module 82 includes: a synchronization task starting unit 821, an information obtaining unit 822, a data querying unit 823, and a data inserting unit 824. The sync task starting unit 821 configures data sync frequency information corresponding to the source database table; the sync task starting unit 821 generates a first task corresponding to the source database table, and sets an execution frequency of the first task based on the data sync frequency information; wherein the first task is executed using a thread corresponding to the source data table. The information obtaining unit 822 obtains a target database link from the target database link pool based on the first task, and obtains data information that needs to be synchronized using the obtained target database link.
The data querying unit 823 obtains the source database link from the source database connection pool based on the first task, and obtains the data to be synchronized from the source database table using the obtained source database link and based on the data information to be synchronized. The data insertion unit 824 inserts data to be synchronized into the target database table using the acquired target database link and based on the data mapping information.
In one embodiment, the information acquisition unit 822 queries the maximum ID value corresponding to the data record in the target database table using the acquired target database link. The data querying unit 823 acquires data having a data record ID value larger than the maximum ID value from the source database table using the acquired source database link, as data that needs to be synchronized.
The data inserting unit 824 acquires field type conversion information between the source database and the target database, and performs type conversion processing on data to be synchronized based on the field type conversion information when the data to be synchronized is inserted into the target database table.
In one embodiment, the data updating module 87 generates a second task corresponding to the source database table after inserting the data to be synchronized into the target database table; wherein the second task is executed using a thread corresponding to the source data table.
The data update module 87 obtains the source database link from the source database connection pool based on the second task, and obtains the data of the source database table modified during the data synchronization operation using the obtained source database link. The data update module 87 acquires the target database link from the target database link pool based on the second task, and updates the data corresponding to the modified data in the target database table using the acquired target database link.
As shown in fig. 11, the data verification module 83 includes: a verification task initiating unit 831, a data encryption processing unit 832 and an encryption result verifying unit 833. The verification task starting unit 831 generates a third task corresponding to the source database table; wherein the third task is executed using a thread corresponding to the source data table.
The data encryption processing unit 832 obtains the source database link from the source database connection pool based on the third task, queries data corresponding to the data synchronization operation stored in the source database table using the obtained source database link, converts all field values included in the data into character types and performs splicing processing, and encrypts the spliced character string to obtain the first encrypted value.
The data encryption processing unit 832 acquires a target database link from the target database connection pool based on the third task, queries data corresponding to data synchronization operation stored in the target database table using the acquired target database link, converts all field values included in the data into character types and performs splicing processing, and encrypts the spliced character string to obtain a second encrypted value.
The encryption result verifying unit 833 determines that the data synchronization is correct if it is determined that the first encrypted value is the same as the corresponding second encrypted value; and if the encryption result verification unit 833 determines that the first encryption value is different from the corresponding second encryption value, the data synchronization is determined to be abnormal, and corresponding abnormal processing is performed.
As shown in fig. 12, the reverse synchronization module 84 includes: a reverse task start unit 841 and a reverse sync data unit 842. The reverse task starting unit 841 generates a fourth task corresponding to the target database table; wherein the fourth task is executed using the thread corresponding to the target data table.
The reverse synchronization data unit 842 obtains the target database link from the target database connection pool based on the fourth task, and queries the new data of the target data table after the data synchronization operation by using the obtained target database link. The reverse synchronization data unit 842 obtains the source database link from the source database connection pool based on the fourth task, and synchronizes the new data in the source data table based on the obtained source database link and the data mapping information.
In one embodiment, FIG. 13 is a block diagram illustration of yet another embodiment of a database data migration apparatus according to the present disclosure. As shown in fig. 13, the apparatus may include a memory 1301, a processor 1302, a communication interface 1303, and a bus 1304. The memory 1301 is used for storing instructions, the processor 1302 is coupled to the memory 1301, and the processor 1302 is configured to execute the database data migration method based on the instructions stored in the memory 1301.
The memory 1301 may be a high-speed RAM memory, a non-volatile memory (non-volatile memory), or the like, and the memory 1301 may be a memory array. The memory 1301 may also be partitioned, and the blocks may be combined into virtual volumes according to certain rules. Processor 1302 may be a central processing unit CPU, or an application Specific Integrated circuit asic, or one or more Integrated circuits configured to implement the database data migration methods of the present disclosure.
In one embodiment, the present disclosure provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement a database data migration method as in any one of the above embodiments.
In the database data migration method, apparatus, and storage medium provided in the foregoing embodiments, in a state where a source database and a target database provide services, data in the source database table is synchronized in the target database table based on data mapping information; checking the data in the source database table and the target database table, and judging whether the data synchronization operation is correct and carrying out corresponding processing based on the checking result; synchronizing the newly added data in the target database table in the source database table based on the data mapping information; the data migration can be realized without stopping the machine, and the problem of the online system data migration stopping is solved; the bidirectional data appending strategy is adopted, so that data can be synchronously written into any database in real time, and the method is suitable for data migration of on-line uninterruptible service; by configuring the relationship between mapping table fields, the difference between the attributes of the database and the data table fields can be shielded, the data migration between the heterogeneous database and the heterogeneous table structure is realized, the method can be suitable for different version databases of various manufacturers, and the use flexibility is higher; data integrity check, data synchronization and data check can be asynchronously performed and independently operated, synchronous operation efficiency is improved, and user experience can be improved.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.