CN114860690A - Data migration method, device, equipment and storage medium - Google Patents
- Publication number
- CN114860690A (application number CN202210512455.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- migrated
- data set
- migration
- data migration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/214—Database migration support
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5021—Priority
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of data processing and discloses a data migration method comprising: acquiring a data set to be migrated in a data source table, together with a data migration target table; arranging the data set to be migrated to obtain an arranged data set to be migrated; segmenting the arranged data set to be migrated to obtain a segmented data set to be migrated; and creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks. The invention also relates to blockchain technology: the data set to be migrated may be stored in a blockchain node. The invention further provides a data migration apparatus, device and medium. The invention can improve data migration efficiency.
Description
Technical Field
The present invention relates to the field of data processing, and in particular, to a data migration method, apparatus, device, and storage medium.
Background
Data migration refers to the process of migrating data from a source database to a target database. Conventionally, data migration is carried out with a data migration tool. Because such a tool mainly performs simple data synchronization through an operation log, data integrity during migration cannot be guaranteed; moreover, when the tool encounters an abnormality, the migration process is interrupted and must be restarted, so data migration efficiency is low.
Disclosure of Invention
The invention provides a data migration method, a data migration device, data migration equipment and a storage medium, and mainly aims to improve data migration efficiency.
In order to achieve the above object, the present invention provides a data migration method, including:
acquiring a data set to be migrated in a data source table, together with a data migration target table;
arranging the data set to be migrated to obtain an arranged data set to be migrated;
segmenting the arranged data set to be migrated to obtain a segmented data set to be migrated;
creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
Optionally, arranging the data set to be migrated to obtain the arranged data set to be migrated includes:
acquiring the data type of the data set to be migrated;
identifying the data structure of the data set to be migrated according to the data type;
and arranging the data structure according to a preset arrangement rule to obtain the arranged data set to be migrated.
Optionally, segmenting the arranged data set to be migrated to obtain the segmented data set to be migrated includes:
performing a database partitioning operation on the data source database to obtain a plurality of data source sub-databases;
performing a table partitioning operation in the data source sub-databases to obtain a plurality of data source sub-tables;
and segmenting the arranged data set to be migrated according to the data source sub-tables to obtain the segmented data set to be migrated.
Optionally, creating the plurality of data migration tasks for the segmented data set to be migrated includes:
acquiring a segmentation serial number and a segmentation address of the segmented data set to be migrated;
and associating the segmented data set to be migrated with its corresponding segmentation serial number and segmentation address to obtain the plurality of data migration tasks for the segmented data set to be migrated.
Optionally, migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks includes:
executing the sorted data migration tasks, and reading an execution parameter of the sorted data migration tasks by using a preset plug-in;
comparing the execution parameter with a preset load coefficient threshold;
when the execution parameter does not exceed the load coefficient threshold, loading the segmented data set to be migrated into the data migration target table through a preset migration thread by using a preset data migration tool;
and when the execution parameter exceeds the load coefficient threshold, adding a plurality of new migration threads, and sequentially loading the segmented data set to be migrated into the data migration target table through the plurality of migration threads by using the data migration tool.
Optionally, after migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks, the method further includes:
checking, according to a preset full rule, the consistency between the data set to be migrated and the full migration data set in the data migration target table; when the two are inconsistent, recording a migration abnormal point of the data set to be migrated and re-executing data migration of the data set to be migrated according to the migration abnormal point;
and checking, according to a preset incremental rule, the consistency between the data set to be migrated and the incremental migration data set in the data migration target table; when the two are inconsistent, recording a migration abnormal point of the data set to be migrated and re-executing data migration of the data set to be migrated according to the migration abnormal point.
Optionally, after acquiring the data set to be migrated in the data source table and the data migration target table, the method further includes:
performing a deduplication operation on the data set to be migrated, and detecting whether a missing data value exists in the deduplicated data set to be migrated;
if no missing data value exists, taking the deduplicated data set to be migrated as the result;
and if a missing data value exists, filling the missing data value to obtain a filled data set to be migrated.
In order to solve the above problem, the present invention also provides a data migration apparatus, including:
a data acquisition module, configured to acquire a data set to be migrated in a data source table, together with a data migration target table;
a data arrangement module, configured to arrange the data set to be migrated to obtain an arranged data set to be migrated;
a data segmentation module, configured to segment the arranged data set to be migrated to obtain a segmented data set to be migrated;
and a data migration module, configured to create a plurality of data migration tasks for the segmented data set to be migrated, sort the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrate the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one computer program; and
a processor that executes the computer program stored in the memory to implement the data migration method described above.
In order to solve the above problem, the present invention also provides a computer-readable storage medium, in which at least one computer program is stored, and the at least one computer program is executed by a processor in an electronic device to implement the data migration method described above.
In the embodiment of the invention, first, the data set to be migrated is arranged to obtain an arranged data set to be migrated, which avoids abnormalities during subsequent migration caused by inconsistency between the data structure in the data source table and the data structure in the data migration target table. Second, the arranged data set to be migrated is segmented to obtain a segmented data set to be migrated, which avoids failure of the overall data migration caused by an abnormality occurring when a large amount of data is migrated in a single pass; when the data migration process is abnormal, it is only necessary to identify the abnormal portion of the segmented data set to be migrated and re-execute migration for that portion, which ensures data integrity in the data migration process. Finally, the segmented data set to be migrated is migrated to the data migration target table according to the sorted data migration tasks, which enables intelligent scheduling of the data migration tasks and improves data migration efficiency. Therefore, the data migration method, apparatus, device and storage medium provided by the embodiment of the invention can improve data migration efficiency.
Drawings
Fig. 1 is a schematic flowchart of a data migration method according to an embodiment of the present invention;
Fig. 2 is a detailed flowchart illustrating a step of the data migration method according to an embodiment of the present invention;
Fig. 3 is a detailed flowchart illustrating a step of the data migration method according to an embodiment of the present invention;
Fig. 4 is a block diagram of a data migration apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic internal structural diagram of an electronic device implementing the data migration method according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides a data migration method. The execution subject of the data migration method includes, but is not limited to, at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the data migration method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
Referring to Fig. 1, which is a schematic flowchart of a data migration method according to an embodiment of the present invention, the data migration method includes the following steps S1-S4:
S1, acquiring the data set to be migrated in the data source table, together with the data migration target table.
In the embodiment of the invention, the data set to be migrated refers to data that needs to be migrated from a data source table to a data migration target table, for example when data in an Oracle database is migrated to a TiDB database; the data source table is the data table that stores the data set to be migrated; the data migration target table is the table that stores the data set after it has been migrated.
In an optional embodiment of the present invention, because the data set to be migrated contains much useless data, data cleaning is performed on the data set to be migrated to obtain a standard data set, so that the useless data is screened out and data migration efficiency is improved.
In detail, after acquiring the data set to be migrated in the data source table and the data migration target table, the method further includes:
performing a deduplication operation on the data set to be migrated, and detecting whether a missing data value exists in the deduplicated data set to be migrated; if no missing data value exists, taking the deduplicated data set to be migrated as the result; and if a missing data value exists, filling the missing data value to obtain a filled data set to be migrated.
In an embodiment of the present invention, the data set to be migrated is deduplicated by using the following method:
[Formula image not reproduced in the source text]
where d denotes the distance value between any two data items in the data set to be migrated, and w_1j and w_2j denote those two data items. When the distance value is smaller than a preset distance value, either one of the two data items is deleted; when the distance value is not smaller than the preset distance value, both data items are kept. Preferably, the preset distance value may be 0.1.
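The deduplication and missing-value steps can be sketched as follows. This is a minimal illustration only: the distance formula is not reproduced in the text, so the Euclidean distance used here is an assumption suggested by the symbols w_1j and w_2j, and the function names, the choice to skip comparison of records that contain missing values, and the constant fill value are all hypothetical.

```python
import math

def distance(a, b):
    # ASSUMPTION: Euclidean distance; the source formula is not shown.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def has_missing(record):
    return any(value is None for value in record)

def deduplicate(records, threshold=0.1):
    """Keep a record only if no already-kept record lies closer than threshold."""
    kept = []
    for rec in records:
        is_duplicate = any(
            not has_missing(rec)
            and not has_missing(k)
            and distance(rec, k) < threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(list(rec))
    return kept

def fill_missing(records, fill_value=0.0):
    """Replace missing values (None) with a constant (hypothetical fill policy)."""
    return [[fill_value if v is None else v for v in rec] for rec in records]

raw = [[1.0, 2.0], [1.0, 2.05], [3.0, None], [5.0, 5.0]]
deduplicated = deduplicate(raw)      # the second record is within 0.1 of the first
cleaned = fill_missing(deduplicated)
```

The 0.1 default mirrors the preferred preset distance value given in the text; in practice the fill policy would depend on the column semantics.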
S2, arranging the data set to be migrated to obtain an arranged data set to be migrated.
In the embodiment of the present invention, data arrangement is needed because the data structure in the data source table is inconsistent with the data structure of the data migration target table; its main function is to adapt the data to the data structure of the data migration target table.
In the embodiment of the present invention, arranging the data set to be migrated to obtain the arranged data set to be migrated avoids abnormalities that would otherwise occur during subsequent migration because the data structure in the data source table is inconsistent with the data structure in the data migration target table, thereby improving subsequent data migration efficiency.
As an embodiment of the present invention, referring to Fig. 2, arranging the data set to be migrated to obtain the arranged data set to be migrated includes the following steps S21-S23:
S21, acquiring the data type of the data set to be migrated;
S22, identifying the data structure of the data set to be migrated according to the data type;
S23, arranging the data structure according to a preset arrangement rule to obtain the arranged data set to be migrated.
The data type is mainly used to characterize the information of the data set to be migrated; the data structure refers to a collection of data elements in the data set to be migrated that have one or more relationships with one another, such as the insurance application relationship between a client and different insurance types in client insurance application data.
In an embodiment of the present invention, the arrangement rule may be customized according to the actual scenario. For example, in the insurance field, where the data set to be migrated in the data source table represents the insurance applications of all clients, the arrangement rule may arrange data by client type: data identified as belonging to an individual client is arranged into an individual table, and data identified as belonging to a group client is arranged into a group table.
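As an illustration of the client-type example, an arrangement rule can be modeled as a function that maps each record to a destination table. This is a sketch under assumptions: the field name `customer_type` and the table names `individual_table` and `group_table` are hypothetical and do not come from the text.

```python
from collections import defaultdict

def customer_type_rule(record):
    """Hypothetical arrangement rule: route a record by its client type."""
    if record["customer_type"] == "individual":
        return "individual_table"
    if record["customer_type"] == "group":
        return "group_table"
    raise ValueError(f"unknown customer type: {record['customer_type']!r}")

def arrange(records, rule):
    """Group records by the destination table chosen by the rule."""
    tables = defaultdict(list)
    for record in records:
        tables[rule(record)].append(record)
    return dict(tables)

records = [
    {"name": "a", "customer_type": "individual"},
    {"name": "b", "customer_type": "group"},
    {"name": "c", "customer_type": "individual"},
]
tables = arrange(records, customer_type_rule)
```

Passing the rule as a parameter reflects the statement that the arrangement rule may be customized per scenario.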
S3, segmenting the arranged data set to be migrated to obtain a segmented data set to be migrated.
In the embodiment of the present invention, the segmented data set to be migrated refers to the data obtained by performing database partitioning and table partitioning on the arranged data set to be migrated.
Segmenting the arranged data set to be migrated avoids failure of the overall data migration caused by an abnormality occurring when a large amount of data is migrated in a single pass. When the data migration process is abnormal, it is only necessary to identify the abnormal portion of the segmented data set to be migrated and re-execute migration for that portion, which ensures data integrity in the data migration process.
As an embodiment of the present invention, referring to Fig. 3, segmenting the arranged data set to be migrated to obtain the segmented data set to be migrated includes the following steps S31-S33:
S31, performing a database partitioning operation on the data source database to obtain a plurality of data source sub-databases;
S32, performing a table partitioning operation in the data source sub-databases to obtain a plurality of data source sub-tables;
S33, segmenting the arranged data set to be migrated according to the data source sub-tables to obtain the segmented data set to be migrated.
Because the performance of the data source database is limited, its capacity cannot be too large, so a database partitioning operation is performed on it to obtain a plurality of data source sub-databases; for example, if the data source database is A, the data source sub-databases may be A1, A2, A3, A4, and so on. The table partitioning operation is performed in the data source sub-databases to prevent any single data table from storing too many records, to avoid abnormal interruption of data migration, and to ensure that an abnormality in one table does not affect the migration of data in other tables. For example, if there are six data source sub-databases and each stores ten tables, there are sixty data source sub-tables in total, and segmenting the arranged data set to be migrated according to the data source sub-tables yields sixty segments.
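The six-by-ten example can be made concrete with a small routing sketch. The text does not say how records are assigned to sub-databases and sub-tables, so the modulo routing on an integer `id` field below is purely an assumption for illustration.

```python
def shard_address(record_id, n_dbs=6, tables_per_db=10):
    """Map an integer key to (sub-database index, sub-table index).

    ASSUMPTION: modulo routing; the text does not specify a routing scheme.
    """
    slot = record_id % (n_dbs * tables_per_db)
    return slot // tables_per_db, slot % tables_per_db

def segment(records, n_dbs=6, tables_per_db=10):
    """Split records into one segment per data source sub-table."""
    segments = {}
    for record in records:
        address = shard_address(record["id"], n_dbs, tables_per_db)
        segments.setdefault(address, []).append(record)
    return segments

records = [{"id": i} for i in range(120)]
segments = segment(records)   # 6 sub-databases x 10 sub-tables
```

With 120 evenly distributed keys, each of the sixty sub-tables receives two records, so a failure in one segment leaves the other fifty-nine unaffected.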
S4, creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
In the embodiment of the invention, a data migration task refers to a task of migrating the segmented data set to be migrated to the data migration target table. For example, if the segmented data set to be migrated consists of all customer insurance information of an insurance company, the data migration tasks may cover ordinary customer insurance application information, ordinary customer heavy insurance application information, VIP customer heavy insurance application information, and VIP customer insurance application information.
In the embodiment of the present invention, the preset priority refers to a user-defined priority ordering of the plurality of data migration tasks according to requirements. For example, the first level may be VIP customer insurance application information, the second level VIP customer heavy insurance application information, the third level ordinary customer heavy insurance application information, and the fourth level ordinary customer insurance application information.
In the embodiment of the invention, creating a plurality of data migration tasks for the segmented data set to be migrated, sorting them according to the preset priority, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks enables intelligent scheduling of the data migration tasks and improves data migration efficiency.
In detail, creating the plurality of data migration tasks for the segmented data set to be migrated includes:
acquiring a segmentation serial number and a segmentation address of the segmented data set to be migrated; and associating the segmented data set to be migrated with its corresponding segmentation serial number and segmentation address to obtain the plurality of data migration tasks for the segmented data set to be migrated.
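Task creation and priority sorting can be sketched as below. The `MigrationTask` class, the category labels and the address strings are hypothetical names chosen for illustration; only the idea of pairing each segment with a serial number and an address, and the four priority levels, come from the text.

```python
from dataclasses import dataclass, field

# Priority levels from the example in the text (1 = migrated first);
# the category labels themselves are hypothetical.
PRIORITY = {
    "vip_insurance": 1,
    "vip_heavy_insurance": 2,
    "regular_heavy_insurance": 3,
    "regular_insurance": 4,
}

@dataclass(order=True)
class MigrationTask:
    priority: int
    segment_no: int                      # segmentation serial number
    address: str = field(compare=False)  # segmentation address
    category: str = field(compare=False)
    rows: list = field(compare=False, default_factory=list)

def create_tasks(segments):
    """Associate each (address, category, rows) segment with a serial number."""
    return [
        MigrationTask(PRIORITY[category], number, address, category, rows)
        for number, (address, category, rows) in enumerate(segments, start=1)
    ]

segments = [
    ("A1.t0", "regular_insurance", [1]),
    ("A1.t1", "vip_insurance", [2]),
    ("A2.t0", "vip_heavy_insurance", [3]),
]
sorted_tasks = sorted(create_tasks(segments))  # by priority, then serial number
```

Using the serial number as a tie-breaker keeps the order deterministic when several segments share a priority level.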
Further, migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks includes:
executing the sorted data migration tasks, and reading an execution parameter of the sorted data migration tasks by using a preset plug-in; comparing the execution parameter with a preset load coefficient threshold; when the execution parameter does not exceed the load coefficient threshold, loading the segmented data set to be migrated into the data migration target table through a preset migration thread by using a preset data migration tool; and when the execution parameter exceeds the load coefficient threshold, adding a plurality of new migration threads, and sequentially loading the segmented data set to be migrated into the data migration target table through the plurality of migration threads by using the data migration tool.
The execution parameter refers to a resource consumption parameter of the target database in which the data migration target table is located during execution of the sorted data migration tasks, for example operating load coefficients of the database such as the number of database connections, TPS, QPS, CPU usage and memory usage; the preset load coefficient threshold is 80%.
In an embodiment of the present invention, the sorted data migration tasks are executed by migration threads, and the tasks can be executed by different threads depending on the resource consumption of the target database, which enables intelligent scheduling of task execution and improves the efficiency of data migration.
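The threshold comparison can be sketched with a thread pool from the Python standard library. The sketch follows the text as written: one preset thread when the execution parameter does not exceed the 80% threshold, and additional threads when it does. The function names, the number of extra threads and the in-memory target list are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

LOAD_THRESHOLD = 0.80  # preset load coefficient threshold from the text

def choose_thread_count(load, base_threads=1, extra_threads=3,
                        threshold=LOAD_THRESHOLD):
    """Return the preset thread count, or add threads above the threshold."""
    if load <= threshold:
        return base_threads
    return base_threads + extra_threads

def run_migration(segments, load, load_segment):
    """Load every segment using a pool sized from the execution parameter."""
    workers = choose_thread_count(load)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input segments
        return list(pool.map(load_segment, segments))

target_table = []  # hypothetical stand-in for the data migration target table

def load_segment(rows):
    target_table.extend(rows)
    return len(rows)

loaded_counts = run_migration([[1, 2], [3]], load=0.9, load_segment=load_segment)
```

In a real deployment the execution parameter would be read from the target database by the plug-in mentioned in the text rather than passed in as a constant.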
Preferably, the data migration tool may be a key tool.
In another embodiment of the present invention, after migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks, the method further includes:
checking, according to a preset full rule, the consistency between the data set to be migrated and the full migration data set in the data migration target table; when the two are inconsistent, recording a migration abnormal point of the data set to be migrated and re-executing data migration of the data set to be migrated according to the migration abnormal point; and checking, according to a preset incremental rule, the consistency between the data set to be migrated and the incremental migration data set in the data migration target table; when the two are inconsistent, recording a migration abnormal point of the data set to be migrated and re-executing data migration of the data set to be migrated according to the migration abnormal point.
The full rule is mainly used to identify whether the segmented data set to be migrated stored in the data source table is synchronized with the full migration data set in the data migration target table, so that changes in the data source table are reflected in the data migration target table in real time and the segmented data set to be migrated in the data source table is ultimately consistent with the full migration data set; the incremental rule mainly serves to keep data migration synchronized without stopping production.
For example, migration of data in the data source table dated before February 2, 2022 is regarded as full data migration, and migration of data in the data source table dated after February 2, 2022 is regarded as incremental data migration.
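A consistency check that records migration abnormal points and re-migrates only the affected segments might look like the sketch below. The text does not describe how consistency is measured, so the per-segment SHA-256 checksum comparison, the dictionary representation of the tables and the function names are assumptions; the same check could be applied to either the full or the incremental migration data set.

```python
import hashlib

def checksum(rows):
    """Order-independent digest of a segment's rows (assumed check method)."""
    digest = hashlib.sha256()
    for text in sorted(repr(row) for row in rows):
        digest.update(text.encode("utf-8"))
    return digest.hexdigest()

def find_abnormal_points(source, target):
    """Return the segment numbers whose source and target contents differ."""
    return [
        segment_no
        for segment_no, rows in source.items()
        if checksum(rows) != checksum(target.get(segment_no, []))
    ]

def repair(source, target):
    """Re-migrate only the segments recorded as abnormal points."""
    abnormal_points = find_abnormal_points(source, target)
    for segment_no in abnormal_points:
        target[segment_no] = list(source[segment_no])
    return abnormal_points

source = {1: [("a", 1)], 2: [("b", 2)], 3: [("c", 3)]}
target = {1: [("a", 1)], 2: [], 3: [("c", 3)]}
abnormal = repair(source, target)
remaining = find_abnormal_points(source, target)
```

Because the check is done per segment, a single inconsistent segment does not force the whole migration to be repeated.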
In the embodiment of the invention, first, the data set to be migrated is arranged to obtain an arranged data set to be migrated, which avoids abnormalities during subsequent migration caused by inconsistency between the data structure in the data source table and the data structure in the data migration target table. Second, the arranged data set to be migrated is segmented to obtain a segmented data set to be migrated, which avoids failure of the overall data migration caused by an abnormality occurring when a large amount of data is migrated in a single pass; when the data migration process is abnormal, it is only necessary to identify the abnormal portion of the segmented data set to be migrated and re-execute migration for that portion, which ensures data integrity in the data migration process. Finally, the segmented data set to be migrated is migrated to the data migration target table according to the sorted data migration tasks, which enables intelligent scheduling of the data migration tasks and improves data migration efficiency. Therefore, the data migration method provided by the embodiment of the invention can improve data migration efficiency.
The data migration apparatus 100 of the present invention may be installed in an electronic device. According to the implemented functions, the data migration apparatus may include a data acquisition module 101, a data arrangement module 102, a data segmentation module 103, and a data migration module 104, where the modules may also be referred to as units, and refer to a series of computer program segments capable of being executed by a processor of an electronic device and performing fixed functions, and the computer program segments are stored in a memory of the electronic device.
In the present embodiment, the functions of the respective modules/units are as follows:
The data acquisition module 101 is configured to acquire the data set to be migrated in the data source table, together with the data migration target table.
In the embodiment of the invention, the data set to be migrated refers to data that needs to be migrated from a data source table to a data migration target table, for example when data in an Oracle database is migrated to a TiDB database; the data source table is the data table that stores the data set to be migrated; the data migration target table is the table that stores the data set after it has been migrated.
In an optional embodiment of the present invention, because the data set to be migrated contains much useless data, data cleaning is performed on the data set to be migrated to obtain a standard data set, so that the useless data is screened out and data migration efficiency is improved.
The data acquisition module 101 may be further configured to:
performing a deduplication operation on the data set to be migrated, and detecting whether any missing data value exists in the deduplicated data set to be migrated; if no missing data value exists, obtaining the deduplicated data set to be migrated; and if a missing data value exists, filling the missing data value to obtain a filled data set to be migrated.
In an embodiment of the present invention, the data set to be migrated is subjected to deduplication operations by using the following method:
wherein d represents the distance value between any two pieces of data in the data set to be migrated, and $w_{1j}$ and $w_{2j}$ represent those two pieces of data. When the distance value is smaller than a preset distance value, either one of the two pieces of data is deleted; when the distance value is not smaller than the preset distance value, both pieces of data are retained. Preferably, the preset distance value may be 0.1.
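The distance formula itself is not reproduced in the text above, so the following is only a minimal sketch of threshold-based deduplication. It assumes that each piece of data is a numeric vector and that d is the Euclidean distance $d = \sqrt{\sum_j (w_{1j} - w_{2j})^2}$; both are assumptions rather than something the description confirms. The function names `distance` and `deduplicate` are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length numeric records (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def deduplicate(records, threshold=0.1):
    """Keep a record only if it is at least `threshold` away from every record kept so far.

    When two records are closer than the threshold, the later one is dropped,
    which is one way of "deleting any one of the data".
    """
    kept = []
    for rec in records:
        if all(distance(rec, k) >= threshold for k in kept):
            kept.append(rec)
    return kept
```

With the preferred threshold of 0.1, two records that differ by 0.05 in a single field collapse into one, while clearly distinct records are both retained.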
The data arrangement module 102 is configured to perform data arrangement on the data set to be migrated to obtain an arranged data set to be migrated.
In the embodiment of the present invention, data arrangement is needed because the data structure in the data source table is inconsistent with the data structure of the data migration target table; its main function is to adapt the data to the data structure of the data migration target table.
In the embodiment of the invention, arranging the data set to be migrated to obtain the arranged data set to be migrated avoids the anomalies that would otherwise occur during subsequent migration because the data structure in the data source table is inconsistent with the data structure in the data migration target table, and thereby improves the efficiency of the subsequent data migration.
As an embodiment of the present invention, the data arrangement module 102 performs data arrangement on the data set to be migrated by executing the following operations, to obtain an arranged data set to be migrated, including:
acquiring the data type of the data set to be migrated;
identifying a data structure of the data set to be migrated according to the data type;
and arranging the data structure according to a preset arrangement rule to obtain the arranged data set to be migrated.
The data type characterizes the information carried by the data set to be migrated; the data structure refers to a collection of data elements in the data set to be migrated that have one or more relationships with one another, such as the insurance relationship between a client and different insurance types in client insurance application data.
In an embodiment of the present invention, the arrangement rule may be customized according to the actual scenario. For example, in the insurance field, where the data set to be migrated in the data source table represents the insurance applications of all clients, the arrangement rule may arrange the data by client type: when the client type is identified as an individual client, the data is arranged into an individual table, and when the client type is identified as a group client, the data is arranged into a group table.
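The client-type example can be sketched as a simple routing step. This is an illustration only: the rule table `ARRANGEMENT_RULE`, the field name `client_type`, and the table names are hypothetical and do not come from the description.

```python
from collections import defaultdict

# Hypothetical arrangement rule: maps a client type to a target table name.
ARRANGEMENT_RULE = {"individual": "individual_table", "group": "group_table"}

def arrange(records, rule=ARRANGEMENT_RULE, type_field="client_type"):
    """Group source records under the target table chosen by the arrangement rule."""
    arranged = defaultdict(list)
    for rec in records:
        client_type = rec[type_field]
        if client_type not in rule:
            # Fail loudly instead of silently dropping records with no rule.
            raise ValueError(f"no arrangement rule for client type {client_type!r}")
        arranged[rule[client_type]].append(rec)
    return dict(arranged)
```

Keeping the rule in a plain mapping reflects the statement that the rule can be customized per scenario: a different business domain would only swap the mapping and the field name.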
The data segmentation module 103 is configured to perform a segmentation operation on the arranged data set to be migrated to obtain a segmented data set to be migrated.
In the embodiment of the present invention, the segmented data set to be migrated refers to the data obtained by performing database partitioning and table division on the arranged data set to be migrated.
Segmenting the arranged data set to be migrated prevents an abnormal scenario during a one-time migration of a large amount of data from causing the entire migration to fail; when the migration process becomes abnormal, only the abnormal portion of the segmented data set to be migrated needs to be identified and re-migrated, which ensures data integrity during the data migration process.
As an embodiment of the present invention, the data segmentation module 103 performs a segmentation operation on the arranged to-be-migrated data set by executing the following operations, so as to obtain a segmented to-be-migrated data set, including:
performing a database partitioning operation on the data source database to obtain a plurality of data source sub-databases;
performing a table division operation in the data source sub-databases to obtain a plurality of data source sub-tables;
and segmenting the arranged data set to be migrated according to the data source sub-tables to obtain the segmented data set to be migrated.
Because of the performance limits of the data source database, its capacity cannot be too large, so a database partitioning operation is performed on it to obtain a plurality of data source sub-databases; for example, if the data source database is A, the data source sub-databases may be A1, A2, A3, A4, and so on. The table division operation within the data source sub-databases prevents any single data table from storing too many records and avoids abnormal interruption of the data migration, and when the data in one sub-table is abnormal, the migration of the other sub-tables is not affected. For example, if there are six data source sub-databases and each stores ten sub-tables, there are sixty data source sub-tables in total, and segmenting the arranged data set to be migrated according to the data source sub-tables yields sixty segmented data sets to be migrated.
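The six-by-ten example can be illustrated with a small routing sketch. The description does not say how a record is assigned to a sub-table, so the modulo routing on an integer key used here is an assumption, as are the names `shard_location` and `segment` and the `id` field.

```python
def shard_location(record_key, num_databases=6, tables_per_database=10):
    """Map an integer key to a (sub-database index, sub-table index) pair."""
    slot = record_key % (num_databases * tables_per_database)
    return slot // tables_per_database, slot % tables_per_database

def segment(records, key_field="id", num_databases=6, tables_per_database=10):
    """Split records into slices, one per (sub-database, sub-table) pair."""
    slices = {}
    for rec in records:
        loc = shard_location(rec[key_field], num_databases, tables_per_database)
        slices.setdefault(loc, []).append(rec)
    return slices
```

With the default parameters there are at most 6 × 10 = 60 slices, so a failure while migrating one slice leaves the other fifty-nine untouched.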
The data migration module 104 is configured to create a plurality of data migration tasks of the segmented to-be-migrated data set, sort the plurality of data migration tasks according to a preset priority to obtain a sorted data migration task, and migrate the segmented to-be-migrated data set to the data migration target table according to the sorted data migration task.
In the embodiment of the invention, a data migration task refers to a task of migrating the segmented data set to be migrated to the data migration target table. For example, if the segmented data set to be migrated comprises all the customer insurance information of an insurance company, the data migration tasks may be tasks for common customer insurance application information, common customer heavy insurance application information, VIP customer heavy insurance application information, and VIP customer insurance application information.
In the embodiment of the present invention, the preset priority refers to a user-defined priority ranking of the multiple data migration tasks according to requirements; for example, the first level may be VIP customer insurance application information, the second level may be VIP customer heavy insurance application information, the third level may be common customer heavy insurance application information, and the fourth level may be common customer insurance application information.
In the embodiment of the invention, a plurality of data migration tasks are created for the segmented data set to be migrated, the data migration tasks are sorted according to the preset priority to obtain sorted data migration tasks, and the segmented data set to be migrated is migrated to the data migration target table according to the sorted data migration tasks, which enables intelligent scheduling of the data migration tasks and improves data migration efficiency.
In detail, the data migration module 104 creates the plurality of data migration tasks for the segmented data set to be migrated by performing the following operations:
acquiring a segmentation serial number and a segmentation address of the segmented data set to be migrated; and associating each segmented data set to be migrated with its corresponding segmentation serial number and segmentation address to obtain the plurality of data migration tasks for the segmented data set to be migrated.
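Task creation and priority sorting can be sketched together as follows. The `MigrationTask` class, the address format, the category keys, and the `PRIORITY` table are all hypothetical; they only mirror the four-level example given in the description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MigrationTask:
    serial_number: int  # segmentation serial number of the slice
    address: str        # segmentation address, e.g. "db3.table7" (hypothetical format)
    category: str       # business category used to look up the priority

# Hypothetical priority table: a lower number means the task runs earlier.
PRIORITY = {
    "vip_insurance": 1,
    "vip_heavy_insurance": 2,
    "common_heavy_insurance": 3,
    "common_insurance": 4,
}

def create_tasks(slice_descriptors):
    """Associate each slice with its serial number and address to form a task."""
    return [MigrationTask(serial, address, category)
            for serial, address, category in slice_descriptors]

def sort_tasks(tasks, priority=PRIORITY):
    """Order tasks by preset priority; ties keep creation order because sorted() is stable."""
    return sorted(tasks, key=lambda t: priority[t.category])
```

Because the sort is stable, slices in the same category are still migrated in serial-number order if they were created in that order.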
Further, the migrating the segmented to-be-migrated data set to the data migration target table according to the sorted data migration task includes:
executing the sorted data migration tasks, and reading an execution parameter of the sorted data migration tasks by using a preset plug-in; comparing the execution parameter with a preset load coefficient threshold; when the execution parameter does not exceed the load coefficient threshold, loading the segmented data set to be migrated into the data migration target table through a preset migration thread by using a preset data migration tool; and when the execution parameter exceeds the load coefficient threshold, adding a plurality of migration threads, and sequentially loading the segmented data set to be migrated into the data migration target table through the plurality of migration threads by using the data migration tool.
The execution parameters refer to the resource consumption parameters of the target database where the data migration target table is located during execution of the sorted data migration tasks, for example, operating load coefficients of the database such as the number of database connections, TPS, QPS, CPU, and memory; the preset load coefficient threshold is 80%.
In an embodiment of the present invention, the sorted data migration tasks are executed by migration threads, and tasks can be assigned to different threads according to the resource consumption of the target database, which enables intelligent scheduling of task execution and improves the efficiency of data migration.
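The threshold comparison can be sketched as below. The sketch follows the rule exactly as the description states it (extra threads are added only when the execution parameter exceeds the 80% threshold). The plug-in that reads the parameter is not modeled: the load is passed in as a number, and `preset_threads`, `extra_threads`, and the `migrate` callback are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

LOAD_THRESHOLD = 0.80  # the 80% threshold given in the description

def choose_thread_count(load, preset_threads=1, extra_threads=3):
    """Apply the described rule: add threads only when the load exceeds the threshold."""
    if load > LOAD_THRESHOLD:
        return preset_threads + extra_threads
    return preset_threads

def run_tasks(tasks, load, migrate):
    """Run `migrate` on every task using the chosen number of worker threads."""
    workers = choose_thread_count(load)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in task order, regardless of completion order.
        return list(pool.map(migrate, tasks))
```

A real scheduler would re-read the load between tasks; this sketch samples it once to keep the decision rule visible.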
Preferably, the data migration tool may be a key tool.
In another embodiment of the present invention, the data migration module 104 may further be configured to:
checking the consistency of the data set to be migrated and the full migration data set in the data migration target table according to a preset full rule, recording a migration abnormal point of the data set to be migrated and executing data migration of the data set to be migrated again according to the migration abnormal point when the data set to be migrated and the full migration data set are inconsistent; and checking the consistency of the data set to be migrated and the incremental migration data set in the data migration target table according to a preset incremental rule, recording a migration abnormal point of the data set to be migrated when the data set to be migrated and the incremental migration data set are inconsistent, and re-executing data migration of the data set to be migrated according to the migration abnormal point.
The full rule is mainly used to identify whether the full migration data set stored in the data migration target table is synchronized with the segmented data set to be migrated in the data source table, so that changes in the data source table are reflected in the data migration target table in real time and the segmented data set to be migrated in the data source table is ultimately consistent with the full migration data set; the incremental rule is mainly used to synchronize migrated data without stopping production.
For example, migration of data in the data source table dated before February 2, 2022 is regarded as full data migration, and migration of data in the data source table dated after February 2, 2022 is regarded as incremental data migration.
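The full and incremental checks can be sketched as a key-by-key comparison followed by a targeted re-migration. The row layout (a dict keyed by primary key, with an `updated` date field) and the function names are hypothetical, and treating the cutoff date itself as incremental is an assumption, since the description only covers dates before and after it.

```python
from datetime import date

CUTOFF = date(2022, 2, 2)  # boundary between full and incremental data in the example

def split_by_cutoff(rows, cutoff=CUTOFF):
    """Split source rows into a full set (before cutoff) and an incremental set."""
    full = {k: r for k, r in rows.items() if r["updated"] < cutoff}
    incremental = {k: r for k, r in rows.items() if r["updated"] >= cutoff}
    return full, incremental

def find_abnormal_points(source_rows, target_rows):
    """Return the keys whose rows are missing from, or differ in, the target."""
    return sorted(k for k, row in source_rows.items() if target_rows.get(k) != row)

def check_and_repair(source_rows, target_rows):
    """Record the abnormal points, then re-migrate only those rows."""
    abnormal = find_abnormal_points(source_rows, target_rows)
    for key in abnormal:
        target_rows[key] = source_rows[key]
    return abnormal
```

Running `check_and_repair` once on the full set and once on the incremental set corresponds to the two checks described above, and the returned keys play the role of the recorded migration abnormal points.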
In the embodiment of the invention, the data set to be migrated is first arranged to obtain an arranged data set to be migrated, which avoids the anomalies that would otherwise occur during subsequent migration when the data structure in the data source table is inconsistent with the data structure in the data migration target table. Second, the arranged data set to be migrated is segmented to obtain a segmented data set to be migrated, which prevents an abnormal scenario during a one-time migration of a large amount of data from causing the entire migration to fail; when the migration process becomes abnormal, only the abnormal portion of the segmented data set to be migrated needs to be identified and re-migrated, which ensures data integrity during the data migration process. Finally, the segmented data set to be migrated is migrated to the data migration target table according to the sorted data migration tasks, which enables intelligent scheduling of the data migration tasks and improves data migration efficiency. Therefore, the data migration device provided by the embodiment of the invention can improve data migration efficiency.
Fig. 5 is a schematic structural diagram of an electronic device implementing the data migration method according to the present invention.
The electronic device may include a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further include a computer program, such as a data migration program, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of media, which includes flash memory, removable hard disk, multimedia card, card type memory (e.g., SD or DX memory, etc.), magnetic memory, local disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device, for example a removable hard disk of the electronic device. The memory 11 may also be an external storage device of the electronic device in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the electronic device. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 may be used not only to store application software installed in the electronic device and various types of data, such as codes of a data migration program, but also to temporarily store data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same function or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device by running or executing programs or modules (e.g., data migration programs, etc.) stored in the memory 11 and calling data stored in the memory 11.
The communication bus 12 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The bus may be divided into an address bus, a data bus, a control bus, etc. The communication bus 12 is arranged to enable connection and communication between the memory 11 and the at least one processor 10, among other components. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
Fig. 5 shows only an electronic device having certain components, and those skilled in the art will appreciate that the structure shown in Fig. 5 does not limit the electronic device, which may include fewer or more components than those shown, combine some components, or arrange the components differently.
For example, although not shown, the electronic device may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management and the like are realized through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Optionally, the communication interface 13 may include a wired interface and/or a wireless interface (e.g., a Wi-Fi interface, a Bluetooth interface, etc.), which is generally used to establish a communication connection between the electronic device and other electronic devices.
Optionally, the communication interface 13 may further include a user interface, which may be a Display (Display), an input unit (such as a Keyboard (Keyboard)), and optionally, a standard wired interface, or a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable, among other things, for displaying information processed in the electronic device and for displaying a visualized user interface.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The data migration program stored in the memory 11 of the electronic device is a combination of a plurality of computer programs, and when running in the processor 10, can realize:
acquiring a data set to be migrated and a data migration target table in a data source table;
performing data arrangement on the data set to be migrated to obtain an arranged data set to be migrated;
performing a segmentation operation on the arranged data set to be migrated to obtain a segmented data set to be migrated;
creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
Specifically, for the method by which the processor 10 implements the computer program, reference may be made to the description of the relevant steps in the embodiment corresponding to Fig. 1, which is not repeated here.
Further, if the integrated modules/units of the electronic device are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable medium. The computer-readable medium may be non-volatile or volatile. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
Embodiments of the present invention may also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor of an electronic device, the computer program may implement:
acquiring a data set to be migrated and a data migration target table in a data source table;
performing data arrangement on the data set to be migrated to obtain an arranged data set to be migrated;
performing a segmentation operation on the arranged data set to be migrated to obtain a segmented data set to be migrated;
creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
In the embodiments provided by the present invention, it should be understood that the disclosed media, devices, apparatuses and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (10)
1. A method of data migration, the method comprising:
acquiring a data set to be migrated and a data migration target table in a data source table;
performing data arrangement on the data set to be migrated to obtain an arranged data set to be migrated;
performing a segmentation operation on the arranged data set to be migrated to obtain a segmented data set to be migrated;
creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
2. The data migration method according to claim 1, wherein the performing data arrangement on the data set to be migrated to obtain an arranged data set to be migrated includes:
acquiring the data type of the data set to be migrated;
identifying a data structure of the data set to be migrated according to the data type;
and arranging the data structure according to a preset arrangement rule to obtain the arranged data set to be migrated.
3. The data migration method according to claim 1, wherein the performing a segmentation operation on the arranged to-be-migrated data set to obtain a segmented to-be-migrated data set includes:
performing a database partitioning operation on the data source database to obtain a plurality of data source sub-databases;
performing a table division operation in the data source sub-databases to obtain a plurality of data source sub-tables;
and segmenting the arranged data set to be migrated according to the data source sub-tables to obtain the segmented data set to be migrated.
4. The data migration method according to claim 1, wherein the creating a plurality of data migration tasks for the segmented data set to be migrated comprises:
acquiring a segmentation serial number and a segmentation address of the segmented data set to be migrated;
and associating each segmented data set to be migrated with its corresponding segmentation serial number and segmentation address to obtain the plurality of data migration tasks for the segmented data set to be migrated.
5. The data migration method according to claim 1, wherein the migrating the segmented to-be-migrated data set to the data migration target table according to the sorted data migration task includes:
executing the sorted data migration task, and reading an execution parameter of the sorted data migration task by using a preset plug-in;
comparing the execution parameter with a preset load coefficient threshold;
when the execution parameter does not exceed the load coefficient threshold, loading the segmented data set to be migrated into the data migration target table through a preset migration thread by using a preset data migration tool;
and when the execution parameter exceeds the load coefficient threshold, newly adding a plurality of migration threads, and sequentially loading the segmented data set to be migrated into the data migration target table through the plurality of migration threads by using the data migration tool.
6. The data migration method according to claim 1, wherein after the migrating the segmented to-be-migrated data set into the data migration target table according to the sorted data migration task, the method further comprises:
checking the consistency of the data set to be migrated and the full migration data set in the data migration target table according to a preset full rule, recording a migration abnormal point of the data set to be migrated and executing data migration of the data set to be migrated again according to the migration abnormal point when the data set to be migrated and the full migration data set are inconsistent;
and checking the consistency of the data set to be migrated and the incremental migration data set in the data migration target table according to a preset incremental rule, recording a migration abnormal point of the data set to be migrated when the data set to be migrated and the incremental migration data set are inconsistent, and re-executing data migration of the data set to be migrated according to the migration abnormal point.
7. The data migration method according to claim 1, wherein after the acquiring the data set to be migrated and the data migration target table in the data source table, the method further comprises:
carrying out duplication removal operation on the data set to be migrated, and detecting whether a data missing value exists in the data set to be migrated after duplication removal;
if the missing data value does not exist, the data set to be migrated after the duplication removal is obtained;
and if the data missing value exists, filling the data missing value to obtain a filled data set to be migrated.
8. An apparatus for data migration, the apparatus comprising:
the data acquisition module is used for acquiring a data set to be migrated and a data migration target table in the data source table;
the data arrangement module is used for carrying out data arrangement on the data set to be migrated to obtain an arranged data set to be migrated;
the data segmentation module is used for performing a segmentation operation on the arranged data set to be migrated to obtain a segmented data set to be migrated;
and the data migration module is used for creating a plurality of data migration tasks for the segmented data set to be migrated, sorting the data migration tasks according to a preset priority to obtain sorted data migration tasks, and migrating the segmented data set to be migrated to the data migration target table according to the sorted data migration tasks.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the data migration method of any one of claims 1 to 7.
10. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, implements the data migration method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210512455.6A CN114860690A (en) | 2022-05-12 | 2022-05-12 | Data migration method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210512455.6A CN114860690A (en) | 2022-05-12 | 2022-05-12 | Data migration method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114860690A true CN114860690A (en) | 2022-08-05 |
Family
ID=82637392
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210512455.6A Pending CN114860690A (en) | 2022-05-12 | 2022-05-12 | Data migration method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114860690A (en) |
- 2022-05-12 CN CN202210512455.6A patent/CN114860690A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117149915A (en) * | 2023-10-31 | 2023-12-01 | 湖南三湘银行股份有限公司 | Method for migrating cloud database to open source database |
CN117149915B (en) * | 2023-10-31 | 2024-03-29 | 湖南三湘银行股份有限公司 | Method for migrating cloud database to open source database |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110209650B (en) | Data normalization and migration method and device, computer equipment and storage medium | |
US10783163B2 (en) | Instance-based distributed data recovery method and apparatus | |
CN113590632B (en) | Database index creation method, device, equipment and medium | |
CN111694844A (en) | Enterprise operation data analysis method and device based on configuration algorithm and electronic equipment | |
CN112486957A (en) | Database migration detection method, device, equipment and storage medium | |
CN112464619A (en) | Big data processing method, device and equipment and computer readable storage medium | |
CN112256783A (en) | Data export method and device, electronic equipment and storage medium | |
CN115145870A (en) | Method and device for positioning reason of failed task, electronic equipment and storage medium | |
CN114840531A (en) | Data model reconstruction method, device, equipment and medium based on blood relationship | |
CN114860690A (en) | Data migration method, device, equipment and storage medium | |
CN111694843A (en) | Missing number detection method and device, electronic equipment and storage medium | |
CN114880368A (en) | Data query method and device, electronic equipment and readable storage medium | |
CN111339072A (en) | User behavior based change value analysis method and device, electronic device and medium | |
CN112685384A (en) | Data migration method and device, electronic equipment and storage medium | |
CN113590703A (en) | ES data importing method and device, electronic equipment and readable storage medium | |
CN117520351A (en) | Data lake entering method, device, equipment and medium based on object storage | |
CN114756564B (en) | Data processing method, device, equipment and medium for stream computing | |
CN113434397B (en) | Task system testing method and device, electronic equipment and storage medium | |
CN114911479A (en) | Interface generation method, device, equipment and storage medium based on configuration | |
CN114116673A (en) | Data migration method based on artificial intelligence and related equipment | |
CN114490137A (en) | Service data real-time statistical method and device, electronic equipment and readable storage medium | |
CN113918296A (en) | Model training task scheduling execution method and device, electronic equipment and storage medium | |
CN114547010A (en) | Data analysis method and device, electronic equipment and storage medium | |
CN113360767A (en) | Information pushing method and device, electronic equipment and storage medium | |
CN113434359B (en) | Data traceability system construction method and device, electronic device and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||