CN116166638A - Data migration method, device, electronic equipment and readable storage medium - Google Patents

Data migration method, device, electronic equipment and readable storage medium

Info

Publication number
CN116166638A
Authority
CN
China
Prior art keywords
data
checking
checked
rule
migrated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310214001.5A
Other languages
Chinese (zh)
Inventor
陈禹旭
姜唯
敖知琪
崔焱
代昊琦
康旖
梁子健
刘明伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southern Power Grid Digital Grid Research Institute Co Ltd
Original Assignee
Southern Power Grid Digital Grid Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southern Power Grid Digital Grid Research Institute Co Ltd filed Critical Southern Power Grid Digital Grid Research Institute Co Ltd
Priority to CN202310214001.5A priority Critical patent/CN116166638A/en
Publication of CN116166638A publication Critical patent/CN116166638A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a data migration method, a device, electronic equipment and a readable storage medium. The data migration method comprises: acquiring data to be checked from a source system, and classifying the data to be checked according to preset classification conditions to obtain pieces of sub-data to be checked; selecting, from preset checking rules, target checking rules for the current data check according to the category weight corresponding to each piece of sub-data to be checked; performing a data check on the data to be checked according to the target checking rules, and generating data to be migrated based on the data checking result and the data to be checked; and, when the data to be migrated conforms to a preset data migration standard, sending the data to be migrated to a target system so as to migrate the data in the source system to the target system. The method can improve the accuracy of the data checking result and yield higher-quality data to be migrated, thereby improving data migration efficiency.

Description

Data migration method, device, electronic equipment and readable storage medium
Technical Field
The present application belongs to the field of data processing, and in particular, relates to a data migration method, a data migration device, an electronic device, and a readable storage medium.
Background
With the development of information technology, enterprises face system upgrades and iterations, cloud deployment, and business integration between old and new systems, all of which require data in the old system to be migrated to the new system. During migration, the historical data undergoes intermediate processing so that the processed data meets the requirements of the new system.
At present, historical data is checked with an extract-transform-load (ETL) inspection tool or with preset rule scripts, so that problems in the historical data can be found and handled and the processed data can reach the data standard of the new system. Whether an ETL inspection tool or preset rule scripts are used, the available computing power within a limited time is finite, so not all checking rules can be executed. In the prior art, the checking rules to execute are therefore usually selected according to the personal experience of a tester.
However, because human experience is limited, the checking rules selected by the tester often do not match the historical data. As a result, quality problems remain in the checked and processed historical data, the checking quality is low, and multiple rounds of checking are needed, which in turn reduces data migration efficiency.
Disclosure of Invention
The present application provides a data migration method, a data migration device, electronic equipment and a readable storage medium, so as to solve the problems of low checking quality and reduced data migration efficiency caused by selecting checking rules according to human experience.
To solve the above technical problems, the present application is implemented as follows:
In a first aspect, the present application provides a data migration method, the method including:
acquiring data to be checked from a source system, and classifying the data to be checked according to preset classification conditions to obtain pieces of sub-data to be checked;
selecting, from preset checking rules, target checking rules for the current data check according to the category weight corresponding to each piece of sub-data to be checked;
performing a data check on the data to be checked according to the target checking rules, and generating data to be migrated based on the data checking result and the data to be checked;
and, when the data to be migrated conforms to a preset data migration standard, sending the data to be migrated to a target system so as to migrate the data in the source system to the target system.
Optionally, the selecting, from preset checking rules, of target checking rules for the current data check according to the category weight corresponding to each piece of sub-data to be checked includes:
determining a rule score for each of the preset checking rules according to the category weights corresponding to the pieces of sub-data to be checked;
and selecting, from the preset checking rules, the checking rules whose rule score is not smaller than a preset scoring threshold as the target checking rules for the current data check.
Optionally, before the sending of the data to be migrated to the target system when the data to be migrated conforms to the preset data migration standard, the method further includes:
inspecting the data to be migrated according to preset test cases to obtain an inspection result, the test cases being determined according to the data migration standard;
and the sending of the data to be migrated to the target system when the data to be migrated conforms to the preset data migration standard includes:
sending the data to be migrated to the target system when the inspection result indicates that the data to be migrated conforms to the preset data migration standard.
Optionally, after the inspecting of the data to be migrated according to the preset test cases to obtain the inspection result, the method further includes:
correcting the target checking rules according to the inspection result when the inspection result indicates that the data to be migrated does not conform to the preset data migration standard, so as to obtain corrected target checking rules;
and re-checking the data to be checked according to the corrected target checking rules, and generating the data to be migrated based on the data checking result and the data to be checked.
Optionally, the correcting of the target checking rules according to the inspection result includes:
selecting, from the preset checking rules and according to the data quality problems recorded in the inspection result, the checking rules corresponding to those quality problems as checking rules to be supplemented;
and supplementing the checking rules to be supplemented into the target checking rules so as to correct the target checking rules.
Optionally, the target checking rules include basic checking rules and conversion checking rules, and the performing of the data check on the data to be checked according to the target checking rules and the generating of the data to be migrated based on the data checking result and the data to be checked include:
performing a pre-conversion check on the data to be checked according to the basic checking rules to determine the data with original quality problems in the data to be checked;
performing data conversion on the data to be checked according to the original quality problems and a preset data conversion standard, so as to handle the original quality problems and obtain converted data to be checked;
and performing a post-conversion check on the converted data to be checked according to the conversion checking rules, and generating the data to be migrated based on the post-conversion checking result and the converted data to be checked.
Optionally, the acquiring the data to be checked from the source system includes:
unloading data in a source system according to a preset data unloading format to obtain source data;
determining supplementary data corresponding to the source data according to the source data and a preset data complement standard;
and generating data to be checked according to the source data and the supplementary data.
In a second aspect, the present application provides a data migration apparatus, the apparatus comprising:
an acquisition module, configured to acquire data to be checked from a source system, and classify the data to be checked according to preset classification conditions to obtain pieces of sub-data to be checked;
a selection module, configured to select, from preset checking rules, target checking rules for the current data check according to the category weight corresponding to each piece of sub-data to be checked;
a checking module, configured to perform a data check on the data to be checked according to the target checking rules, and generate data to be migrated based on the data checking result and the data to be checked;
and a sending module, configured to send the data to be migrated to a target system when the data to be migrated conforms to a preset data migration standard, so as to migrate the data in the source system to the target system.
In a third aspect, the present application provides an electronic device, comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the data migration method described above when executing the program.
In a fourth aspect, the present application provides a readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data migration method described above.
In the embodiments of the present application, data to be checked is acquired from a source system and classified according to preset classification conditions to obtain pieces of sub-data to be checked; target checking rules for the current data check are selected from preset checking rules according to the category weight corresponding to each piece of sub-data to be checked; a data check is performed on the data to be checked according to the target checking rules, and data to be migrated is generated based on the data checking result and the data to be checked; and, when the data to be migrated conforms to the preset data migration standard, the data to be migrated is sent to the target system, so that the data in the source system is migrated to the target system. Because the target checking rules are determined from the category weights corresponding to the pieces of sub-data to be checked, the target checking rules for the current data check match the data to be checked, which makes the choice of checking rules more reasonable. Checking the data to be checked with these target checking rules in turn yields a more accurate data checking result. Since the data to be migrated is generated from this more accurate result and the data to be checked, its quality improves and it is more likely to conform to the preset data migration standard, which reduces the number of data checks and thereby improves data migration efficiency.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of the steps of a data migration method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a data migration method according to an embodiment of the present application;
FIG. 3 is a block diagram of a data migration apparatus according to an embodiment of the present application;
FIG. 4 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort fall within the scope of protection of the present application.
FIG. 1 is a flowchart of the steps of a data migration method according to an embodiment of the present application. As shown in FIG. 1, the method may include:
Step 101: acquiring data to be checked from a source system, and classifying the data to be checked according to preset classification conditions to obtain pieces of sub-data to be checked.
In this embodiment of the present application, the source system is the system in which the data to be migrated currently resides. The source system contains source data, and the data to be checked can be obtained from the source data. The preset classification conditions may be classification conditions set according to the source, attributes, or purpose of the data. The data to be checked is classified according to the preset classification conditions to obtain the data to be checked for each category, and the data to be checked for any one category is taken as one piece of sub-data to be checked. It can be understood that, depending on the preset classification conditions, there may be one or more pieces of sub-data to be checked, which is not limited in the embodiments of the present application.
For example, the data to be checked may be classified into transaction class data, customer class data, log class data, management class data, and system class data. The preset classification conditions may be as shown in Table 1:
Table 1 Preset classification conditions
Category | Classification index
Transaction class | Amount, unit price, quantity, transaction order number, product number, transaction status, etc.
Customer class | Customer attribute information, customer classification information, customer index information, associated customers, etc.
Log class | Automatic business process logs, transaction detail stream records, transaction product lists, etc.
Management class | Sales task index, sales client distribution, sales activity record, sales assessment, etc.
System class | System log, system auto-generated ID, standard data dictionary, time stamp, etc.
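As an illustration of Step 101, the classification could be sketched as follows. This is a minimal sketch only: the field names, the `CLASSIFICATION_INDEXES` mapping, and the "best overlap" matching strategy are assumptions made for the example, loosely modeled on Table 1, and are not specified by the application.

```python
from collections import defaultdict

# Hypothetical mapping from category to field names that indicate it,
# loosely following the classification indexes of Table 1.
CLASSIFICATION_INDEXES = {
    "transaction": {"amount", "unit_price", "quantity", "order_no"},
    "customer": {"customer_attr", "customer_class", "associated_customer"},
    "log": {"process_log", "transaction_detail"},
    "management": {"sales_task", "sales_activity"},
    "system": {"system_log", "auto_id", "timestamp"},
}


def classify(records):
    """Group records into sub-data by the category whose indexes they match best."""
    sub_data = defaultdict(list)
    for record in records:
        fields = set(record)
        # Assumed strategy: pick the category sharing the most field names.
        best = max(
            CLASSIFICATION_INDEXES,
            key=lambda c: len(fields & CLASSIFICATION_INDEXES[c]),
        )
        if fields & CLASSIFICATION_INDEXES[best]:
            sub_data[best].append(record)
        else:
            # No index matched at all; keep the record visible instead of dropping it.
            sub_data["unclassified"].append(record)
    return dict(sub_data)


records = [
    {"order_no": "T001", "amount": 120.0, "quantity": 3},
    {"customer_attr": "enterprise", "customer_class": "A"},
    {"note": "free text"},
]
sub_data = classify(records)
```

Each key of `sub_data` then corresponds to one piece of sub-data to be checked, and its category is what the category weight in Step 102 is attached to.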
Step 102: selecting, from preset checking rules, target checking rules for the current data check according to the category weight corresponding to each piece of sub-data to be checked.
In the embodiment of the present application, a category weight may be set in advance for each category, and the category weight of the category to which a piece of sub-data to be checked belongs is used as the category weight corresponding to that piece of sub-data, so that the category weight of every piece of sub-data to be checked is determined. For any candidate rule among the preset checking rules, whether the rule can be cut (that is, omitted from the current data check) may be determined from the relation between the rule and the category weights corresponding to the pieces of sub-data to be checked. If the checking rule cannot be cut, it is used as a target checking rule; otherwise it is not. Specifically, the rule parameter of the checking rule for each category may be determined first, and whether the rule can be cut is then determined from all of its rule parameters. For example, if, for rule 1, transaction class parameter + customer class parameter + log class parameter + management class parameter + system class parameter is not smaller than the tailorable parameter, rule 1 cannot be cut. The checking rules that cannot be cut among the preset checking rules are then taken as the target checking rules for the current data check.
Step 103: performing a data check on the data to be checked according to the target checking rules, and generating data to be migrated based on the data checking result and the data to be checked.
In the embodiment of the application, the data to be checked is checked according to the target checking rules to find its quality problems, which gives the data checking result. The quality problems in the data to be checked are then handled based on the data checking result, and the processed data is taken as the data to be migrated.
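A minimal sketch of Step 103 follows, assuming that each target checking rule is paired with a fix that handles the problem it detects. The `CheckRule` structure, the two example rules, and the record fields are hypothetical; the application does not prescribe how quality problems are processed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckRule:
    name: str
    check: Callable[[dict], Optional[str]]  # problem description, or None if clean
    fix: Callable[[dict], dict]             # returns a corrected copy of the record


def check_and_generate(records, target_rules):
    """Return (data checking result, data to be migrated)."""
    checking_result = []
    to_migrate = []
    for index, record in enumerate(records):
        fixed = record
        for rule in target_rules:
            problem = rule.check(fixed)
            if problem is not None:
                # Record the quality problem, then handle it.
                checking_result.append((index, rule.name, problem))
                fixed = rule.fix(fixed)
        to_migrate.append(fixed)
    return checking_result, to_migrate


# Hypothetical target checking rules chosen for illustration only.
TARGET_RULES = [
    CheckRule(
        "name_trimmed",
        lambda r: "name has surrounding whitespace"
        if r["name"] != r["name"].strip() else None,
        lambda r: {**r, "name": r["name"].strip()},
    ),
    CheckRule(
        "currency_uppercase",
        lambda r: "currency code is not upper case"
        if r["currency"] != r["currency"].upper() else None,
        lambda r: {**r, "currency": r["currency"].upper()},
    ),
]

records = [{"name": " Alice ", "currency": "cny"}, {"name": "Bob", "currency": "CNY"}]
checking_result, to_migrate = check_and_generate(records, TARGET_RULES)
```

Keeping the checking result separate from the corrected records mirrors the text: the result documents what was found, while the corrected records become the data to be migrated.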
Step 104: when the data to be migrated conforms to a preset data migration standard, sending the data to be migrated to a target system so as to migrate the data in the source system to the target system.
In this embodiment of the present application, the preset data migration standard may be a data standard, determined in advance from the data structure standard and the data quality requirements of the target system, that data migrated to the target system must satisfy. The data to be migrated is inspected against the preset data migration standard to determine whether it conforms. Specifically, the preset data migration standard can be used to check whether the data to be migrated has data quality problems. If the data to be migrated has a data quality problem that needs to be handled, it is determined that the data to be migrated does not conform to the preset data migration standard. If the data to be migrated has no data quality problem, or has a data quality problem that cannot be corrected and does not affect other data, it is determined that the data to be migrated conforms to the preset data migration standard.
Optionally, for data whose quality problem cannot be corrected and does not affect other data, a "migration problem remark" field or a tag may be added to the data before it is migrated to the new system, to alert system users that the data has a data quality problem.
In the embodiment of the present application, when the data to be migrated conforms to the preset data migration standard, the data to be migrated is determined as the target data, that is, the data to be migrated into the target system. The target data both carries over the data of the source system and meets the structural standard and quality requirements of the target system. The target data may be sent to the target system so that the target system stores it in its own database, thereby migrating the data in the source system into the target system.
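The conformance decision of Step 104, including the optional "migration problem remark", might look like the sketch below. The helper names `find_problem` and `must_be_fixed`, the `migration_problem_remark` field name, and the example problems are assumptions for illustration.

```python
def evaluate_for_migration(records, find_problem, must_be_fixed):
    """Return (conforms, target_data).

    A record with no problem passes unchanged. A record whose problem cannot
    be corrected and does not affect other data passes with a remark. Any
    problem that must be fixed makes the whole batch non-conforming.
    """
    annotated = []
    for record in records:
        problem = find_problem(record)
        if problem is None:
            annotated.append(record)
        elif must_be_fixed(problem):
            return False, []
        else:
            annotated.append({**record, "migration_problem_remark": problem})
    return True, annotated


# Hypothetical problem detection for the example.
def find_problem(record):
    if record.get("id") is None:
        return "missing id"
    if record.get("legacy_phone") is None:
        return "legacy phone number was never recorded"
    return None


def must_be_fixed(problem):
    return problem == "missing id"


ok, target_data = evaluate_for_migration(
    [{"id": 1, "legacy_phone": "555-0100"}, {"id": 2, "legacy_phone": None}],
    find_problem,
    must_be_fixed,
)
blocked, _ = evaluate_for_migration([{"id": None}], find_problem, must_be_fixed)
```

Here the second record is migrated with a remark so that users of the target system can see the known problem, while a batch containing a record without an id is held back.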
In the embodiments of the present application, data to be checked is acquired from a source system and classified according to preset classification conditions to obtain pieces of sub-data to be checked; target checking rules for the current data check are selected from preset checking rules according to the category weight corresponding to each piece of sub-data to be checked; a data check is performed on the data to be checked according to the target checking rules, and data to be migrated is generated based on the data checking result and the data to be checked; and, when the data to be migrated conforms to the preset data migration standard, the data to be migrated is sent to the target system, so that the data in the source system is migrated to the target system. Because the target checking rules are determined from the category weights corresponding to the pieces of sub-data to be checked, the target checking rules for the current data check match the data to be checked, which makes the choice of checking rules more reasonable. Checking the data to be checked with these target checking rules in turn yields a more accurate data checking result. Since the data to be migrated is generated from this more accurate result and the data to be checked, its quality improves and it is more likely to conform to the preset data migration standard, which reduces the number of data checks and thereby improves data migration efficiency.
Optionally, Step 102 may include the following steps:
Step 1021: determining a rule score for each of the preset checking rules according to the category weights corresponding to the pieces of sub-data to be checked.
In this embodiment of the present application, for any candidate rule among the preset checking rules, the score of the rule in a given category may be determined from the rule and the category weight corresponding to the piece of sub-data to be checked in that category, and the rule score of the rule is then determined from its scores in all categories. For example, for rule 1, rule score = transaction class score + customer class score + log class score + management class score + system class score. The rule scores of all the preset checking rules are determined in the same way.
Step 1022: selecting, from the preset checking rules, the checking rules whose rule score is not smaller than a preset scoring threshold as the target checking rules for the current data check.
In this embodiment of the present application, for any candidate rule among the preset checking rules, its rule score may be compared with the preset scoring threshold to determine whether the rule can be cut. If the rule score is not smaller than the preset scoring threshold, the checking rule cannot be cut and can be used as a target checking rule. The checking rules whose rule score is not smaller than the preset scoring threshold are then taken as the target checking rules for the current data check. For example, Table 2 shows whether each of the preset checking rules can be cut; the rules that are not cut are the target checking rules for the current data check.
Table 2 Target checking rules
[Table 2 appears as images in the original publication: Figure BDA0004114261420000081, Figure BDA0004114261420000091]
In Table 2, the preset scoring threshold is 75. A rule score greater than or equal to 75 indicates that the checking rule cannot be cut and can be used as a target checking rule.
In the embodiment of the application, a rule score is determined for each of the preset checking rules according to the category weights corresponding to the pieces of sub-data to be checked, and the checking rules whose rule score is not smaller than the preset scoring threshold are selected from the preset checking rules as the target checking rules for the current data check. The target checking rules can thus be selected conveniently by rule score. Because the rule scores are determined from the category weights corresponding to the pieces of sub-data to be checked, the target checking rules match the data to be checked, which makes the choice of target checking rules for the current data check more reasonable.
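Steps 1021 and 1022 can be sketched as a weighted sum followed by a threshold filter. The threshold of 75 is taken from the text; the category weights, the per-category relevance values, the rule names, and the formula "category score = weight x relevance / 100" are all assumptions, since the application does not state how a rule's score in a category is computed.

```python
# Hypothetical category weights in percent (they sum to 100).
CATEGORY_WEIGHTS = {
    "transaction": 40, "customer": 30, "log": 10, "management": 10, "system": 10,
}

# Hypothetical relevance (0 to 100) of each preset checking rule to each category.
RULE_RELEVANCE = {
    "amount_not_negative": {
        "transaction": 100, "customer": 80, "log": 50, "management": 60, "system": 20,
    },
    "timestamp_format": {
        "transaction": 60, "customer": 40, "log": 90, "management": 30, "system": 100,
    },
}

SCORE_THRESHOLD = 75  # preset scoring threshold used with Table 2


def rule_score(relevance, categories_present):
    """Sum the per-category scores over the categories present in the data."""
    total = sum(CATEGORY_WEIGHTS[c] * relevance.get(c, 0) for c in categories_present)
    return total / 100


def select_target_rules(categories_present):
    scores = {
        name: rule_score(rel, categories_present)
        for name, rel in RULE_RELEVANCE.items()
    }
    # A rule whose score is not smaller than the threshold cannot be cut.
    targets = [name for name, score in scores.items() if score >= SCORE_THRESHOLD]
    return scores, targets


scores, targets = select_target_rules(list(CATEGORY_WEIGHTS))
```

With these illustrative numbers, `amount_not_negative` scores 77 and is kept as a target checking rule, while `timestamp_format` scores 58 and is cut.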
Optionally, before Step 104, the method further includes:
Step 201: inspecting the data to be migrated according to preset test cases to obtain an inspection result, the test cases being determined according to the data migration standard.
In the embodiment of the application, the test cases are used to inspect the data to be migrated for quality problems, so as to determine whether it conforms to the preset data migration standard. The test cases can be generated from the specific indexes in the data migration standard, which include the indexes corresponding to the data structure standard and the data quality requirements of the target system. Optionally, a test case may include a case title, preconditions, test steps, and an expected result. Inspection code can be written in advance for each test case, and the test steps are carried out by running that code, thereby inspecting the data to be migrated for quality problems.
Optionally, the data to be migrated may be loaded into a user acceptance testing (UAT) environment or a quasi-production environment. During loading, data may be transmitted according to security requirements, and sensitive data may be desensitized. Optionally, the full data set may be sampled and only the sampled data loaded into the UAT environment or the quasi-production environment. The data to be migrated is then inspected according to the preset test cases in the UAT environment or the quasi-production environment.
In the embodiment of the application, the data to be migrated is inspected according to the preset test cases to determine whether it meets the expected results, that is, the preset data migration standard, and an inspection result is generated. The inspection result may include a final conclusion on whether the data to be migrated conforms to the preset data migration standard and, when it does not, the quality problems of the data that has them.
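The test case structure described for Step 201 (case title, preconditions, test steps, expected result) could be represented as in the following sketch. The `MigrationTestCase` class, the duplicate-id check, and the shape of the inspection result are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MigrationTestCase:
    title: str
    preconditions: str                        # free-text description
    steps: Callable[[List[dict]], List[str]]  # inspection code; returns problems found
    expected: str                             # free-text expected result


def duplicate_ids(records):
    """Inspection code for one test case: report every repeated id."""
    seen, problems = set(), []
    for record in records:
        if record["id"] in seen:
            problems.append(f"duplicate id {record['id']}")
        seen.add(record["id"])
    return problems


def inspect(records, test_cases):
    """Run every test case and build the inspection result."""
    quality_problems = {}
    for case in test_cases:
        found = case.steps(records)
        if found:
            quality_problems[case.title] = found
    return {"conforms": not quality_problems, "quality_problems": quality_problems}


cases = [
    MigrationTestCase(
        title="Primary keys are unique",
        preconditions="Data to be migrated is loaded into the UAT environment",
        steps=duplicate_ids,
        expected="No duplicate id values",
    )
]
result = inspect([{"id": 1}, {"id": 2}, {"id": 1}], cases)
```

The returned dictionary carries both parts of the inspection result named in the text: the final conclusion (`conforms`) and, when it is negative, the quality problems grouped by test case.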
Optionally, Step 104 may include the following step:
Step 1041: sending the data to be migrated to the target system when the inspection result indicates that the data to be migrated conforms to the preset data migration standard.
In the embodiment of the application, when the inspection result indicates that the data to be migrated conforms to the preset data migration standard, the data to be migrated is determined as the target data, and the target data is then sent to the target system so that the target system stores it in its own database, thereby migrating the data in the source system into the target system.
In the embodiment of the application, the data to be migrated is inspected according to the preset test cases to obtain an inspection result, and the data to be migrated is sent to the target system when the inspection result indicates that it conforms to the preset data migration standard. Because the test cases are determined according to the data migration standard, they provide a convenient way to determine whether the data to be migrated conforms to that standard. Sending the data to the target system only when it conforms makes the migrated data better suited to the target system and gives a better migration result.
Optionally, after step 201, the method further includes:
step 301, correcting the target checking rule according to the checking result when the checking result indicates that the data to be migrated does not meet the preset data migration standard, so as to obtain a corrected target checking rule.
In the embodiment of the application, under the condition that the test result represents that the data to be migrated does not accord with the preset data migration standard, whether a missing check rule exists in the target check rule or not can be determined according to the quality problem of the data to be migrated, which is included in the test result, wherein the data to be migrated has the quality problem, if the missing check rule exists in the target check rule, the target check rule is supplemented according to the quality problem of the data to be migrated, which has the quality problem, so that a new target check rule is obtained, the target check rule is corrected, and the new check rule is used as the corrected target check rule.
And step 302, re-checking the data to be checked according to the corrected target checking rule, and generating the data to be migrated based on the data checking result and the data to be checked.
In the embodiment of the application, the data to be checked is checked again according to the corrected target checking rule, so that the quality problem of the data to be checked is found, particularly the quality problem which is not found by the last data checking is obtained, and a new data checking result is obtained. And processing quality problems in the data to be checked based on the new data checking result to solve the quality problems of the data to be checked, and taking the processed data as the data to be migrated.
In the embodiment of the application, under the condition that the test result indicates that the data to be migrated does not accord with the preset data migration standard, the target checking rule is corrected according to the test result, so that the corrected target checking rule is obtained. In this way, the accuracy of the target checking rule for re-checking the data can be improved. And re-checking the data to be checked according to the corrected target checking rule, and generating the data to be migrated based on the data checking result and the data to be checked. Therefore, the data to be migrated is subjected to data inspection through a more accurate target inspection rule, so that the quality of data inspection can be improved, and the data to be migrated with higher quality can be generated.
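The correct-and-re-check cycle of steps 301 and 302 can be sketched as a loop. This is an illustrative Python sketch under stated assumptions, not the applicant's implementation: `check`, `accept` and `correct` are hypothetical callables supplied by the caller, and the `max_rounds` limit is an added safeguard that the application does not mention.

```python
def recheck_until_accepted(data, rules, check, accept, correct, max_rounds=5):
    """Repeat checking and rule correction until the acceptance test passes.

    check(data, rules)       -> data to be migrated (hypothetical checking step)
    accept(migrated)         -> set of quality-problem labels; empty means accepted
    correct(rules, problems) -> corrected rule list (step 301)
    """
    rules = list(rules)
    for _ in range(max_rounds):
        migrated = check(data, rules)          # step 302: (re-)check the data
        problems = accept(migrated)            # test against the migration standard
        if not problems:
            return migrated, rules
        corrected = correct(rules, problems)   # step 301: correct the rules
        if corrected == rules:
            break                              # no rule can be added; stop looping
        rules = corrected
    raise RuntimeError("data still fails the preset data migration standard")
```

Note that each round re-checks the original data to be checked, not the previous round's output, which matches the wording of step 302.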
Optionally, step 301 may include the steps of:
step 3011, selecting a checking rule corresponding to the quality problem from preset checking rules according to the quality problem of the data in the checking result, and taking the checking rule as the checking rule to be supplemented.
In the embodiment of the application, the quality problems of the data in the test result can be obtained by testing the data to be migrated according to the test cases: the data in the data to be migrated that does not meet the data structure standard and/or the data quality requirement of the target system is found, and the difference between the structure and/or content of that data and the data structure standard and/or data quality requirement of the target system is taken as the quality problem of the data to be migrated. The target checking rule can then be supplemented according to the data with quality problems and the corresponding quality problems. Specifically, a checking rule capable of detecting the quality problem, that is, a checking rule corresponding to the quality problem, can be selected from the preset checking rules as the checking rule to be supplemented.
And step 3012, supplementing the checking rule to be supplemented into the target checking rule so as to correct the target checking rule.
In the embodiment of the application, the checking rule to be supplemented can be added to the target checking rule to obtain a new target checking rule. The target checking rule is thereby corrected, and the new target checking rule is used as the corrected target checking rule.
In the embodiment of the application, according to the quality problem of the data in the test result, selecting a check rule corresponding to the quality problem from preset check rules as a check rule to be supplemented; and supplementing the checking rule to be supplemented into the target checking rule so as to correct the target checking rule. Therefore, the target checking rule can be conveniently corrected, and the accuracy of the target checking rule is improved.
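Steps 3011 and 3012 can be sketched as a lookup followed by a merge. The following Python sketch is an assumption-laden illustration: the `CheckRule` type, its `detects` attribute, and the idea that each preset rule declares the quality-problem labels it can detect are all hypothetical, since the application does not specify how rules and quality problems are associated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckRule:
    """Hypothetical checking rule with the problem labels it can detect."""
    name: str
    detects: frozenset


def select_rules_to_supplement(problems, preset_rules, target_rules):
    """Step 3011: pick preset rules that detect a reported problem and are not yet used."""
    reported = set(problems)
    return [r for r in preset_rules
            if r not in target_rules and r.detects & reported]


def supplement(target_rules, to_add):
    """Step 3012: return the corrected target rule list without duplicates."""
    return list(target_rules) + [r for r in to_add if r not in target_rules]
```

Returning a new list in `supplement` instead of mutating `target_rules` keeps the earlier rule version available, which could be useful if rule versions are compared across rounds.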
Optionally, the target checking rule includes a basic checking rule and a conversion checking rule, and step 103 may include the following steps:
step 1031, performing pre-conversion checking on the data to be checked according to the basic checking rule to determine data with original quality problems in the data to be checked.
In the embodiment of the application, the basic checking rule is a checking rule for checking data quality problems in unprocessed original data. Examples include the total number of records in a table, the number of records in which the format of a field is inconsistent, and the number of records in which the string in a field is shorter than a certain length. This is by way of example only, and the embodiments of the present application are not limited thereto.
In the embodiment of the application, the data to be checked is checked before data conversion, that is, pre-conversion checking is performed, according to the basic checking rule through an ETL checking tool or a preset basic checking script, so as to find the data quality problems existing in the data to be checked, for example, null values in non-null fields, garbled characters in strings, inconsistent time formats and/or inconsistent number representations. This is by way of example only, and the embodiments of the present application are not limited thereto. The data with original quality problems in the data to be checked is then determined according to the data quality problems found.
And step 1032, performing data conversion on the data to be checked according to the original quality problem and a preset data conversion standard so as to process the original quality problem and obtain converted data to be checked.
In the embodiment of the present application, the preset data conversion standard may include a data structure standard and a data quality requirement of the target system. The data to be checked can be processed according to the original quality problems and the data quality requirement in the data conversion standard, and the data structure of the data to be checked can be adjusted according to the data conversion standard. In this way, data conversion is performed on the data to be checked, the original quality problems are handled, and the converted data to be checked is obtained.
And step 1033, performing post-conversion checking on the converted data to be checked according to the conversion checking rule, and generating data to be migrated based on a post-conversion checking result and the converted data to be checked.
In the embodiment of the application, the conversion checking rule is a checking rule for checking data quality problems in data that has undergone data conversion. Examples include whether table A has been split and mapped into table B and table C, whether field a has been mapped to field b, and whether a 15-digit old identity card number has been extended to the 18-digit format of the new identity card.
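The identity card example can be made concrete. The Python sketch below is an illustration, not part of the application: it assumes the usual upgrade convention of inserting the century "19" after the six-digit region code, and it uses the weighted modulo-11 check character defined for 18-digit numbers in GB 11643-1999. The function names are hypothetical.

```python
# Weights and check characters of the 18-digit number's modulo-11 check.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def check_char(first17: str) -> str:
    """Compute the 18th character from the first 17 digits."""
    total = sum(int(d) * w for d, w in zip(first17, WEIGHTS))
    return CHECK_CHARS[total % 11]


def upgrade_id_15_to_18(old_id: str) -> str:
    """Data conversion: extend a 15-digit number to 18 characters."""
    if len(old_id) != 15 or not old_id.isdigit():
        raise ValueError("expected a 15-digit identity card number")
    first17 = old_id[:6] + "19" + old_id[6:]  # assumes a birth year of 19xx
    return first17 + check_char(first17)


def is_valid_18(id18: str) -> bool:
    """Post-conversion check: length, digits and check character all agree."""
    return (len(id18) == 18 and id18[:17].isdigit()
            and check_char(id18[:17]) == id18[17].upper())
```

Here `upgrade_id_15_to_18` plays the role of the conversion and `is_valid_18` the role of a conversion checking rule applied afterwards.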
In the embodiment of the application, the data to be checked after conversion is checked according to the conversion checking rule, so that the quality problem of the data to be checked after conversion is found, and a post-conversion checking result is obtained. And processing quality problems in the converted data to be checked based on the converted checking result to solve the quality problems of the converted data to be checked, and taking the processed data as data to be migrated.
In the embodiment of the application, performing pre-conversion checking on the data to be checked according to the basic checking rule makes it convenient to determine the data with original quality problems in the data to be checked. Performing data conversion on the data to be checked according to the original quality problems and the preset data conversion standard handles the original quality problems, which improves the data quality of the converted data to be checked. Further, post-conversion checking is performed on the converted data to be checked according to the conversion checking rule, and the data to be migrated is generated based on the post-conversion checking result and the converted data to be checked. Thus, the quality of the data to be migrated obtained through data checking can be improved.
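Steps 1031 to 1033 form a three-stage pipeline, sketched below in Python under simplifying assumptions. The function names, the single "required field is non-empty" basic rule, the field-renaming conversion, and the choice to drop records with original problems are all hypothetical; the application leaves the concrete rules and the handling of problem records open.

```python
def pre_check(records, required=("name",)):
    """Step 1031: map record index -> list of original quality problems."""
    problems = {}
    for i, rec in enumerate(records):
        found = [f"null:{field}" for field in required if not rec.get(field)]
        if found:
            problems[i] = found
    return problems


def convert(records, problems, field_map):
    """Step 1032: handle original problems and restructure the records."""
    converted = []
    for i, rec in enumerate(records):
        if i in problems:
            continue  # assumed handling; repair or manual review are alternatives
        converted.append({field_map.get(k, k): v for k, v in rec.items()})
    return converted


def post_check(converted, field_map):
    """Step 1033: indexes of records where a field mapping was not applied."""
    return [i for i, rec in enumerate(converted)
            if any(src in rec or dst not in rec for src, dst in field_map.items())]
```

With `field_map = {"a": "b"}`, a converted record that still carries field `a` or lacks field `b` is reported by `post_check`, mirroring the "whether field a is mapped to field b" example.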
Optionally, step 101 may include the steps of:
and step 1011, unloading the data in the source system according to a preset data unloading format to obtain source data.
In this embodiment, the preset data unloading format may be determined according to the format of the data files of the database of the target system. For example, the preset data unloading format may be the csv format. The data in the source system can be unloaded and stored as source data files in the preset data unloading format so as to obtain the source data.
Step 1012, determining the supplementary data corresponding to the source data according to the source data and a preset data complement standard.
In this embodiment of the present application, the preset data complement standard may be determined according to a data standard of the target system. For example, the data standard of the target system specifies a field that must be filled in, which may be included in the data completion standard. The source data can be analyzed according to a preset data complement standard, the data lacking in the source data is determined, and the complementary content corresponding to the source data is determined according to the data complement standard. A supplemental data file corresponding to the data offload format may be generated from the supplemental content to obtain the supplemental data.
And step 1013, generating data to be checked according to the source data and the supplementary data.
In the embodiment of the application, the source data and the supplementary data may be generated into data files and stored, with security measures such as file encryption, in a preset storage area, such as a direct-attached storage (DAS) data disk. Data can then be obtained from the data files as the data to be checked. It should be noted that, in general, all the historical stock data in the data files may be obtained; this is also called full data. Where the volume of historical data is large and the subsequent checking and conversion would take too long, the data can be sliced by time, that is, the stock data and the incremental data are checked and converted separately, where stock data is data that will no longer be updated and data that is liable to change can be treated as incremental data.
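The time-based slicing into stock and incremental data can be sketched as a partition on a timestamp. This Python sketch is illustrative only: the `updated_at` field name and the use of a single cutoff time to decide that a record "will no longer be updated" are assumptions, not details from the application.

```python
from datetime import datetime


def split_stock_and_increment(records, cutoff, ts_field="updated_at"):
    """Partition records into stock data (last changed before the cutoff)
    and incremental data (changed at or after the cutoff)."""
    stock = [r for r in records if r[ts_field] < cutoff]
    increment = [r for r in records if r[ts_field] >= cutoff]
    return stock, increment
```

The two slices can then be checked and converted separately, for example the stock slice once in advance and the incremental slice shortly before go-live.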
In the embodiment of the application, the source data is obtained by unloading the data in the source system according to a preset data unloading format; determining supplementary data corresponding to the source data according to the source data and a preset data complement standard; and generating data to be checked according to the source data and the supplementary data. Therefore, the data to be checked which accords with the format of the preset requirement and is subjected to the completion can be obtained, and therefore, the data migration efficiency of the data migration method can be improved.
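Steps 1011 to 1013 can be sketched with Python's standard `csv` module. The sketch rests on assumptions that are not in the application: the unloading format is taken to be CSV text, the data completion standard is modeled as a mapping from required field to a default value, and records are matched by an `id` column. All function names are hypothetical.

```python
import csv
import io


def unload_to_csv(rows, fieldnames):
    """Step 1011: unload source rows into CSV text (the assumed unloading format)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # fields absent from a row are written as empty
    return buf.getvalue()


def find_supplements(csv_text, required_defaults, key="id"):
    """Step 1012: for each record, collect required fields that are missing or empty."""
    supplements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        missing = {f: d for f, d in required_defaults.items() if not row.get(f)}
        if missing:
            supplements.append({key: row[key], **missing})
    return supplements


def build_data_to_check(csv_text, supplements, key="id"):
    """Step 1013: merge source data and supplementary data into data to be checked."""
    patch = {s[key]: s for s in supplements}
    return [{**row, **patch.get(row[key], {})}
            for row in csv.DictReader(io.StringIO(csv_text))]
```

Keeping the supplementary data as a separate list until step 1013 mirrors the description, where the supplementary data is generated as its own file before being combined with the source data.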
Fig. 2 is a flow chart of a data migration method provided in the embodiment of the present application. As shown in Fig. 2, the data migration architecture includes a data area, a checking area and a rule area. The data migration process is as follows: source data is unloaded from the source system, supplementary data corresponding to the source data is obtained, data files are generated, and the data to be checked is obtained from the data files. The data to be checked can be obtained separately as stock data or incremental data. A checking program performs pre-conversion checking on the data to be checked according to the basic checking rules, and data conversion, also called data cleaning, is then performed to obtain the converted data to be checked. A checking program then performs post-conversion checking on the converted data to be checked according to the conversion checking rules to obtain the data to be migrated.
The data to be migrated is provided for business acceptance, that is, test data provisioning, and business personnel test the data to be migrated through test cases according to the business acceptance standard. If the data to be migrated passes acceptance, it is used as the target data, the data version is finalized, and the target system is initialized with the data according to the new go-live plan. If the data to be migrated does not pass acceptance, it is returned and data checking and processing are carried out again until the data quality of the resulting data to be migrated reaches the data standard of the target system. Business personnel can check whether the data is correct through the functional pages of the target system or through statistical reports. When the page functions are not intuitive enough or are insufficient, a developer can write an acceptance program or script and confirm its running result with the business personnel to determine whether acceptance is passed. If acceptance is not passed, business personnel and technical personnel can analyze the problem together to trace back whether it arose in the cleaning or conversion step or originates from a quality problem in the source data, so that the data can be handled in the data checking and processing stage.
In the rule area, the basic checking rules and the conversion checking rules in the target checking rules can be supplemented according to the business acceptance results, and checking and processing can be carried out again according to the supplemented checking rules, so that the quality of data checking is improved. Optionally, the versions of the checking rules used across multiple rounds of data checking can be analyzed to identify patterns in how rules were supplemented or pruned, and the method for selecting the target checking rule can be modified accordingly, so that a more accurate target checking rule is obtained. In this way, technical personnel and business personnel divide the work and cooperate to continuously supplement and iterate the checking rules, which standardizes the data migration process, improves data migration efficiency, saves computing power, and ensures the data quality of the target data that goes live in the target system.
Fig. 3 is a block diagram of a data migration apparatus according to an embodiment of the present application, where the apparatus 40 may include:
an obtaining module 401, configured to obtain data to be checked from a source system, and classify the data to be checked according to a preset classification condition, so as to obtain sub data to be checked;
a selection module 402, configured to select a target checking rule for checking the data from preset checking rules according to the category weights corresponding to the data to be checked;
The checking module 403 is configured to perform data checking on the data to be checked according to the target checking rule, and generate data to be migrated based on a data checking result and the data to be checked;
and the sending module 404 is configured to send the data to be migrated to a target system to migrate the data in the source system to the target system if the data to be migrated meets a preset data migration standard.
Optionally, the selecting module 402 is specifically configured to:
determining rule scores of all the checking rules in the preset checking rules according to the category weights corresponding to all the sub data to be checked;
and selecting a checking rule with rule score not smaller than a preset scoring threshold from the preset checking rules as a target checking rule for data checking at this time.
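The score-and-threshold selection performed by the selecting module can be sketched as follows. The scoring formula is an assumption made for illustration only: this passage does not state how a rule score is derived from the category weights, so the sketch scores each rule as the sum of the weights of the data categories it applies to. The names `score_rules`, `select_target_rules` and the `rule_categories` mapping are hypothetical.

```python
def score_rules(category_weights, rule_categories):
    """Score each preset rule as the summed weight of the categories it covers
    (assumed formula; categories without sub data contribute 0)."""
    return {rule: sum(category_weights.get(c, 0.0) for c in cats)
            for rule, cats in rule_categories.items()}


def select_target_rules(category_weights, rule_categories, threshold):
    """Keep the rules whose score is not smaller than the preset scoring threshold."""
    scores = score_rules(category_weights, rule_categories)
    return [rule for rule, score in scores.items() if score >= threshold]
```

The comparison uses `>=` because the text selects rules whose score is "not smaller than" the threshold.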
Optionally, the apparatus 40 further includes:
a testing module, configured to test the data to be migrated according to a preset test case to obtain a test result before the sending module 404 sends the data to be migrated to a target system if the data to be migrated meets a preset data migration standard; the test cases are determined according to the data migration standard;
The sending module 404 is specifically configured to:
and under the condition that the test result indicates that the data to be migrated accords with a preset data migration standard, the data to be migrated is sent to a target system.
Optionally, the apparatus 40 further includes:
a correction module, configured to, after the data to be migrated is tested according to a preset test case to obtain a test result, correct the target checking rule according to the test result when the test result indicates that the data to be migrated does not meet the preset data migration standard, so as to obtain a corrected target checking rule;
and the generation module is used for carrying out data checking on the data to be checked again according to the corrected target checking rule, and generating data to be migrated based on a data checking result and the data to be checked.
Optionally, the correction module is specifically configured to:
according to the data with quality problems in the inspection result, selecting a check rule corresponding to the quality problems from preset check rules as a check rule to be supplemented;
and supplementing the to-be-supplemented checking rule into the target checking rule so as to correct the target checking rule.
Optionally, the target checking rule includes a basic checking rule and a conversion checking rule, and the checking module 403 is specifically configured to:
performing pre-conversion checking on the data to be checked according to the basic checking rule to determine data with original quality problems in the data to be checked;
performing data conversion on the data to be checked according to the original quality problem and a preset data conversion standard to process the original quality problem and obtain converted data to be checked;
and performing post-conversion checking on the converted data to be checked according to the conversion checking rule, and generating data to be migrated based on a post-conversion checking result and the converted data to be checked.
Optionally, the obtaining module 401 is specifically configured to:
unloading data in a source system according to a preset data unloading format to obtain source data;
determining supplementary data corresponding to the source data according to the source data and a preset data complement standard;
and generating data to be checked according to the source data and the supplementary data.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
The data migration device has the same advantages as those of the data migration method described above relative to the prior art, and will not be described in detail here.
The present application also provides an electronic device, see fig. 4, comprising: a processor 501, a memory 502, and a computer program 5021 stored on the memory and executable on the processor, wherein the processor implements the data migration method of the foregoing embodiments when executing the program.
The present application also provides a readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data migration method of the foregoing embodiments.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. The required structure for a construction of such a system is apparent from the description above. In addition, the present application is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present application as described herein, and the above description of specific languages is provided for disclosure of preferred embodiments of the present application.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the present application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the application and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in a data migration device according to the present application. The present application may also be embodied as an apparatus or device program for performing part or all of the methods described herein. Such a program embodying the present application may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
The foregoing description of the preferred embodiments of the present application is not intended to limit the invention to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It should be noted that, in the embodiments of the present application, all data-related processing is performed in compliance with the applicable data protection laws, regulations and policies of the country where it takes place and with the authorization of the owner of the corresponding device.

Claims (10)

1. A method of data migration, the method comprising:
Acquiring data to be checked from a source system, and classifying the data to be checked according to preset classification conditions to acquire sub data to be checked;
selecting a target checking rule of the data checking from preset checking rules according to the category weight corresponding to each piece of data to be checked;
performing data checking on the data to be checked according to the target checking rule, and generating data to be migrated based on a data checking result and the data to be checked;
and under the condition that the data to be migrated accords with a preset data migration standard, sending the data to be migrated to a target system so as to migrate the data in the source system to the target system.
2. The method of claim 1, wherein the selecting the target checking rule of the data checking from the preset checking rules according to the category weight corresponding to each piece of data to be checked comprises:
determining rule scores of all the checking rules in the preset checking rules according to the category weights corresponding to all the sub data to be checked;
and selecting a checking rule with rule score not smaller than a preset scoring threshold from the preset checking rules as a target checking rule for data checking at this time.
3. The method according to claim 1, wherein before the sending the data to be migrated to a target system under the condition that the data to be migrated accords with a preset data migration standard, the method further comprises:
checking the data to be migrated according to a preset test case to obtain a checking result; the test cases are determined according to the data migration standard;
and the sending the data to be migrated to a target system under the condition that the data to be migrated accords with a preset data migration standard comprises:
sending the data to be migrated to the target system under the condition that the checking result indicates that the data to be migrated accords with the preset data migration standard.
4. The method of claim 3, wherein after the checking the data to be migrated according to a preset test case to obtain a checking result, the method further comprises:
correcting the target checking rule according to the checking result under the condition that the checking result represents that the data to be migrated does not accord with the preset data migration standard, so as to obtain a corrected target checking rule;
and re-checking the data to be checked according to the corrected target checking rule, and generating the data to be migrated based on a data checking result and the data to be checked.
5. The method of claim 4, wherein the correcting the target checking rule according to the checking result comprises:
according to the quality problems of the data in the checking result, selecting a checking rule corresponding to the quality problems from preset checking rules as a checking rule to be supplemented;
and supplementing the to-be-supplemented checking rule into the target checking rule so as to correct the target checking rule.
6. The method of claim 1, wherein the target checking rule includes a basic checking rule and a conversion checking rule, and wherein the performing data checking on the data to be checked according to the target checking rule, and generating the data to be migrated based on a data checking result and the data to be checked, comprises:
performing pre-conversion checking on the data to be checked according to the basic checking rule to determine data with original quality problems in the data to be checked;
Performing data conversion on the data to be checked according to the original quality problem and a preset data conversion standard to process the original quality problem and obtain converted data to be checked;
and performing post-conversion checking on the converted data to be checked according to the conversion checking rule, and generating data to be migrated based on a post-conversion checking result and the converted data to be checked.
7. The method of claim 1, wherein the obtaining the data to be checked from the source system comprises:
unloading data in a source system according to a preset data unloading format to obtain source data;
determining supplementary data corresponding to the source data according to the source data and a preset data complement standard;
and generating data to be checked according to the source data and the supplementary data.
8. A data migration apparatus, the apparatus comprising:
the acquisition module is used for acquiring the data to be checked from the source system, and classifying the data to be checked according to preset classification conditions so as to acquire sub data to be checked;
the selection module is used for selecting a target checking rule of the data checking from preset checking rules according to the category weight corresponding to each piece of data to be checked;
The checking module is used for checking the data of the data to be checked according to the target checking rule and generating the data to be migrated based on the data checking result and the data to be checked;
and the sending module is used for sending the data to be migrated to a target system under the condition that the data to be migrated accords with a preset data migration standard so as to migrate the data in the source system to the target system.
9. An electronic device, comprising: a processor, a memory and a computer program stored on the memory and executable on the processor, the processor implementing the data migration method according to any one of claims 1-7 when the program is executed.
10. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data migration method of any one of claims 1-7.
CN202310214001.5A 2023-03-07 2023-03-07 Data migration method, device, electronic equipment and readable storage medium Pending CN116166638A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310214001.5A CN116166638A (en) 2023-03-07 2023-03-07 Data migration method, device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN116166638A true CN116166638A (en) 2023-05-26

Family

ID=86413248

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310214001.5A Pending CN116166638A (en) 2023-03-07 2023-03-07 Data migration method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN116166638A (en)

Similar Documents

Publication Publication Date Title
US8707268B2 (en) Testing operations of software
CN112116184A (en) Factory risk estimation using historical inspection data
CN111340584A (en) Method, device, equipment and storage medium for determining fund side
CN111553137B (en) Report generation method and device, storage medium and computer equipment
US20100162029A1 (en) Systems and methods for process improvement in production environments
CN113051291A (en) Work order information processing method, device, equipment and storage medium
CN110688536A (en) Label prediction method, device, equipment and storage medium
CN112116185A (en) Test risk estimation using historical test data
CN110414806B (en) Employee risk early warning method and related device
CN113722370A (en) Data management method, device, equipment and medium based on index analysis
CN110532612A (en) The operation data processing method and processing device of ship power system
US11341547B1 (en) Real-time detection of duplicate data records
CN116166638A (en) Data migration method, device, electronic equipment and readable storage medium
CN116385189A (en) Method and system for checking matching degree of account listed subjects of financial account-reporting document
US11106643B1 (en) System and method for integrating systems to implement data quality processing
CN111209214B (en) Code test processing method and device, electronic equipment and medium
CN112749079B (en) Defect classification method and device for software test and computing equipment
CN113849618A (en) Strategy determination method and device based on knowledge graph, electronic equipment and medium
CN111916165A (en) Similarity evaluation method and device for evaluation scale
US20130311207A1 (en) Medical Record Processing
CN115576957B (en) Evaluation report automatic generation method, device, equipment and storage medium
US20240160696A1 (en) Method for Automatic Detection of Pair-Wise Interaction Effects Among Large Number of Variables
CN111582754B (en) Risk investigation method, apparatus, device and computer readable storage medium
US20240119107A1 (en) Evaluation apparatus, evaluation method, and non-transitory computer-readable medium
CN117474662A (en) Money laundering risk assessment method, money laundering risk assessment device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination