CN112286910B - Data verification method and device - Google Patents

Publication number
CN112286910B
Authority
CN
China
Prior art keywords
data
data table
database
feature matrix
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011324306.4A
Other languages
Chinese (zh)
Other versions
CN112286910A (en
Inventor
焦洋
俱青
林海杰
王昭
靖飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202011324306.4A
Publication of CN112286910A
Application granted
Publication of CN112286910B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data verification method and a data verification device. The method comprises: obtaining a first data table of a first database and a second data table of a second database; processing the first data table and the second data table respectively to obtain a first feature matrix, a first MD5 feature value and a first number of data records, and a second feature matrix, a second MD5 feature value and a second number of data records; checking the first number of data records, the first feature matrix and the first MD5 feature value against the second number of data records, the second feature matrix and the second MD5 feature value; and, if the verification is passed, determining that the data of the first database is consistent with the data of the second database. In this scheme, no manual screening is needed: checking these three quantities of the first data table against those of the second data table reduces the time needed to compare the data before and after migration, thereby ensuring the consistency of the data before and after migration.

Description

Data verification method and device
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data verification method and apparatus.
Background
As enterprises place ever higher requirements on the performance, security and cost of their databases, an existing database often needs to be replaced with one that better meets those requirements.
When a database is replaced, the data in the old database needs to be migrated to the new database. Because different databases store data in different ways, it is particularly important to ensure that the data is consistent before and after migration. At present, this consistency is checked by manual screening: the data in the old database before migration is compared manually with the data in the new database after migration. This comparison takes a great deal of time and effort, and manual screening cannot guarantee that the data is consistent before and after migration.
Disclosure of Invention
In view of the above, the embodiments of the present invention provide a data verification method and apparatus, so as to solve the problem of consistency of data before and after database migration in the prior art.
In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:
A first aspect of an embodiment of the present invention provides a data verification method, where the method includes:
acquiring a first data table of a first database before migration and a second data table corresponding to the first data table in a second database after migration;
processing the first data table to obtain a first feature matrix related to the data query language (DQL) and a first cryptographic hash function (MD5) feature value, and determining a first number of data records in the first data table;
processing the second data table to obtain a second feature matrix related to the DQL and a second MD5 feature value, and determining a second number of data records in the second data table;
checking the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table;
and if the verification is passed, determining that the data in the first database is consistent with the data in the second database.
Optionally, the processing of the first data table to determine the first number of data records in the first data table includes:
querying the number of data records of the first data table based on a database query statement, and determining the first number of data records corresponding to the first data table;
correspondingly, the processing of the second data table to determine the second number of data records in the second data table includes:
querying the number of data records of the second data table based on the database query statement, and determining the second number of data records corresponding to the second data table.
Optionally, the processing of the first data table to obtain a first feature matrix related to DQL includes:
acquiring all special fields in the first data table, wherein the special fields are preset;
classifying and counting all the special fields to determine the number of each special field;
constructing the first feature matrix based on the number of each special field;
correspondingly, the processing of the second data table to obtain a second feature matrix related to DQL includes:
acquiring all special fields in the second data table, wherein the special fields are preset;
classifying and counting all the special fields to determine the number of each special field;
and constructing the second feature matrix based on the number of each special field.
Optionally, the processing of the first data table to obtain a first cryptographic hash function (MD5) feature value related to DQL includes:
exporting the first data table into a first data file according to a preset separator;
generating a first MD5 feature value corresponding to the first data table based on the first data file;
correspondingly, the processing of the second data table to obtain a second MD5 feature value related to DQL includes:
exporting the second data table into a second data file according to the preset separator;
and generating a second MD5 feature value corresponding to the second data table based on the second data file.
Optionally, the checking of the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table includes:
judging whether the first number of data records of the first data table is consistent with the second number of data records of the corresponding second data table, and judging whether the first feature matrix of the first data table is consistent with the second feature matrix of the corresponding second data table;
if either of the two is inconsistent, determining that the data in the first database is inconsistent with the data in the second database;
if both are consistent, judging whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the corresponding second data table;
if so, executing the step of determining that the data in the first database is consistent with the data in the second database;
if not, determining that the data in the first database is inconsistent with the data in the second database.
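The two-stage check described above (record count and feature matrix first, MD5 only if both agree) can be sketched as follows. This is a minimal illustration, not the patented implementation; the `TableFingerprint` class and `verify_table` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TableFingerprint:
    """Hypothetical container for the three quantities compared per table."""
    record_count: int
    feature_matrix: List[List[int]]
    md5_hex: str


def verify_table(first: TableFingerprint, second: TableFingerprint) -> bool:
    """Return True only if all three quantities of the two tables agree."""
    # Stage 1: the number of data records and the feature matrix must both match.
    if first.record_count != second.record_count:
        return False
    if first.feature_matrix != second.feature_matrix:
        return False
    # Stage 2: the MD5 feature values are compared only after stage 1 passes.
    return first.md5_hex == second.md5_hex
```

Comparing the cheaper quantities first means a mismatch in the record count or feature matrix is reported without reaching the MD5 comparison.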
A second aspect of an embodiment of the present invention provides a data verification apparatus, including:
a first acquisition unit, configured to acquire a first data table of a first database before migration and a second data table corresponding to the first data table in a second database after migration;
a first determining unit, configured to process the first data table to obtain a first feature matrix related to DQL and a first cryptographic hash function (MD5) feature value, and determine a first number of data records in the first data table;
a second determining unit, configured to process the second data table to obtain a second feature matrix related to DQL and a second MD5 feature value, and determine a second number of data records in the second data table;
a verification unit, configured to check the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table, and to execute the third determining unit if the verification is passed;
and a third determining unit, configured to determine that the data in the first database is consistent with the data in the second database.
Optionally, the first determining unit, when processing the first data table to determine the first number of data records in the first data table, is specifically configured to: query the number of data records of the first data table based on a database query statement, and determine the first number of data records corresponding to the first data table;
correspondingly, the second determining unit, when processing the second data table to determine the second number of data records in the second data table, is specifically configured to: query the number of data records of the second data table based on the database query statement, and determine the second number of data records corresponding to the second data table.
Optionally, the first determining unit, when processing the first data table to obtain a first feature matrix related to DQL, is specifically configured to: acquire all special fields in the first data table, wherein the special fields are preset; classify and count all the special fields to determine the number of each special field; and construct the first feature matrix based on the number of each special field;
correspondingly, the second determining unit, when processing the second data table to obtain a second feature matrix related to DQL, is specifically configured to: acquire all special fields in the second data table, wherein the special fields are preset; classify and count all the special fields to determine the number of each special field; and construct the second feature matrix based on the number of each special field.
Optionally, the first determining unit, when processing the first data table to obtain a first cryptographic hash function (MD5) feature value related to DQL, is specifically configured to: export the first data table into a first data file according to a preset separator; and generate a first MD5 feature value corresponding to the first data table based on the first data file;
correspondingly, the second determining unit, when processing the second data table to obtain a second MD5 feature value related to DQL, is specifically configured to: export the second data table into a second data file according to the preset separator; and generate a second MD5 feature value corresponding to the second data table based on the second data file.
Optionally, the verification unit includes a DQL data verification module and an MD5 module;
the DQL data verification module is used for judging whether the first number of data records of the first data table is consistent with the second number of data records of the corresponding second data table, and whether the first feature matrix of the first data table is consistent with the second feature matrix of the corresponding second data table; if either of the two is inconsistent, determining that the data in the first database is inconsistent with the data in the second database; if both are consistent, executing the MD5 module;
the MD5 module is used for judging whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the corresponding second data table; if not, determining that the data in the first database is inconsistent with the data in the second database; and if so, executing the third determining unit.
Based on the data verification method and device provided by the embodiments of the present invention, the method includes: acquiring a first data table of a first database before migration and a second data table corresponding to the first data table in a second database after migration; processing the first data table to obtain a first feature matrix related to DQL and a first cryptographic hash function (MD5) feature value, and determining a first number of data records in the first data table; processing the second data table to obtain a second feature matrix related to DQL and a second MD5 feature value, and determining a second number of data records in the second data table; checking the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table; and if the verification is passed, determining that the data in the first database is consistent with the data in the second database. In the embodiments of the present invention, no manual screening is needed. The three quantities of the first data table are checked against those of the second data table to determine whether the data in the first database is consistent with the data in the second database; when the check passes, the data in the two databases is determined to be consistent. This reduces the time needed to compare the data before and after migration and ensures the consistency of the data before and after migration.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an application structure of a first database, a data verification device, and a second database according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a data verification method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of another data verification method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a comparison of the numbers of data records of the first data table and the second data table according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a first data file and a second data file according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data verification device according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another data verification device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In this application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the embodiments of the present invention, no manual screening is needed. The first number of data records, the first feature matrix and the first MD5 feature value of the first data table are checked against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table to determine whether the data in the first database is consistent with the data in the second database; when the check passes, the data in the two databases is determined to be consistent. This reduces the time needed to compare the data before and after migration and ensures the consistency of the data before and after migration.
Referring to fig. 1, a schematic diagram of an application structure of a first database 10, a data verification device 20, and a second database 30 according to an embodiment of the present invention is shown.
The first database 10 and the second database 30 are respectively connected to the data verification device 20.
The first database 10 is the database before data migration.
The second database 30 is the database after data migration.
The first database 10 includes a plurality of first data tables 100; specifically, the plurality of first data tables 100 includes a first data table 101, a first data table 102, ..., and a first data table 10N.
The second database 30 includes a plurality of second data tables 300; specifically, the plurality of second data tables 300 includes a second data table 301, a second data table 302, ..., and a second data table 30M.
It should be noted that the first data tables in the first database correspond one-to-one to the second data tables in the second database, that is, the first data table 101 corresponds to the second data table 301, the first data table 102 corresponds to the second data table 302, and so on, up to the first data table 10N, which corresponds to the second data table 30M.
N and M are positive integers greater than or equal to 1, and N is equal to M.
The process for realizing data verification based on the application structure comprises the following steps:
The data verification device 20 obtains the first data tables of the first database 10, where there is at least one first data table.
The data verification device 20 obtains the second data tables of the second database 30, the number of which is the same as the number of the first data tables.
The data verification device 20 processes each first data table to obtain a first feature matrix related to DQL and a first cryptographic hash function (MD5) feature value, and determines a first number of data records in each first data table. Similarly, the data verification device 20 processes each second data table to obtain a second feature matrix related to DQL and a second MD5 feature value, and determines a second number of data records in each second data table.
It should be noted that each first data table has its own corresponding first feature matrix and its own corresponding first MD5 feature value. Likewise, each second data table has its own corresponding second feature matrix and its own corresponding second MD5 feature value.
The data verification device 20 checks the first number of data records, the first feature matrix and the first MD5 feature value of each first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the corresponding second data table, and determines that the data in the first database 10 is consistent with the data in the second database 30 when the verification is passed.
Based on the processing architecture disclosed in the above embodiment of the present invention, referring to fig. 2, a flow chart of a data verification method is shown in the embodiment of the present invention, where the method is applicable to a data verification device, and the method includes:
step S201: and acquiring a first data table of the first database before migration and a second data table corresponding to the first data table in the second database after migration.
In the specific implementation of step S201, a first data table in the first database before migration (that is, the original database) and a second data table in the second database after data migration (that is, the current database) are obtained.
The first data table is used for storing data before migration, and the second data table is used for storing data after migration.
The first database and the second database both store a plurality of data tables, and the first data tables in the first database correspond one-to-one to the second data tables in the second database; that is, data tables with the same table name in the first database and the second database correspond to each other.
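Pairing tables by identical table name can be sketched as below. The sketch assumes both databases are SQLite databases reachable through Python's standard `sqlite3` module, which the source does not specify; `list_tables` and `pair_tables` are hypothetical helper names.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all user tables in a SQLite database (assumed engine)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def pair_tables(first_conn, second_conn):
    """Pair tables that share a name; also report names present on one side only."""
    first_names = set(list_tables(first_conn))
    second_names = set(list_tables(second_conn))
    paired = sorted(first_names & second_names)
    unmatched = sorted(first_names ^ second_names)
    return paired, unmatched


# Two in-memory databases standing in for the pre- and post-migration databases.
first_db = sqlite3.connect(":memory:")
second_db = sqlite3.connect(":memory:")
for db in (first_db, second_db):
    db.execute("CREATE TABLE t (a INTEGER)")
    db.execute("CREATE TABLE u (a INTEGER)")

paired, unmatched = pair_tables(first_db, second_db)
```

A non-empty `unmatched` list would indicate that the one-to-one correspondence between the two databases does not hold.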
Step S202: processing the first data table to obtain a first feature matrix related to DQL and a first cryptographic hash function (MD5) feature value, and determining a first number of data records in the first data table.
In the specific implementation of step S202, the first number of data records of each first data table in the first database is queried first; then each first data table in the first database is processed to obtain the first feature matrix corresponding to that data table and the first cryptographic hash function (MD5) feature value corresponding to that data table.
It should be noted that the first cryptographic hash function MD5 feature value is a 128-bit hash value generated from data in the first data table.
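As a quick illustration of the 128-bit size, the following sketch uses Python's standard `hashlib` module. The sample bytes are made up for the example and do not come from the source.

```python
import hashlib

# Hypothetical exported table content; any byte string works.
sample = b"1,alice\n2,bob\n"

digest = hashlib.md5(sample).digest()      # raw bytes of the hash
hex_form = hashlib.md5(sample).hexdigest() # the same hash as hexadecimal text

digest_bits = len(digest) * 8   # 16 bytes * 8 = 128 bits
hex_length = len(hex_form)      # 128 bits / 4 bits per hex digit = 32 characters
```

Note that MD5 is used here only as a change-detection fingerprint; it is not collision resistant against deliberate tampering.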
Step S203: processing the second data table to obtain a second feature matrix related to DQL and a second MD5 feature value, and determining a second number of data records in the second data table.
It should be noted that the specific implementation of step S203 is the same as that of step S202, and the two descriptions may be referred to each other; details are not repeated here.
Further, it should be noted that step S202 and step S203 may be executed in the order described above, may be executed simultaneously, or step S203 may be executed first and step S202 afterwards; the embodiments of the present invention do not limit the order.
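Because the two steps are independent of each other, they can be run concurrently. The sketch below shows one way to do that with Python's standard `concurrent.futures`; `summarize` is a hypothetical stand-in for the per-table processing and only counts rows here.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(rows):
    """Hypothetical stand-in for steps S202/S203; here it only counts the rows."""
    return len(rows)


def summarize_both(first_rows, second_rows):
    """Process the first and second table at the same time and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(summarize, first_rows)
        second_future = pool.submit(summarize, second_rows)
        # result() blocks until the corresponding task has finished.
        return first_future.result(), second_future.result()
```

The order in which the two tasks finish does not matter, since the comparison in step S204 only starts once both results are available.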
Step S204: checking the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table; if the check is passed, executing step S205, and if the check is not passed, executing step S206.
In the specific implementation of step S204, the first number of data records of each first data table is checked against the second number of data records of the corresponding second data table, the first feature matrix of each first data table is checked against the second feature matrix of the corresponding second data table, and the first MD5 feature value of each first data table is checked against the second MD5 feature value of the corresponding second data table. If all checks pass, step S205 is executed; if the number of data records, the feature matrix or the MD5 feature value of any data table fails the check, step S206 is executed.
Step S205: it is determined that the data in the first database is consistent with the data in the second database.
In the specific implementation of step S205, a passed verification indicates that the data migrated from the first database to the second database is complete.
Step S206: it is determined that the data in the first database is inconsistent with the data in the second database.
In the specific implementation of step S206, a failed verification indicates that errors or omissions occurred when the data was migrated from the first database to the second database.
Optionally, based on the data verification method shown above, after determining that the data in the first database is inconsistent with the data in the second database, the method further includes:
generating corresponding prompt information based on the table names of the first data table and the second data table that failed the check, so that a technician can quickly and accurately locate the problem in the database migration and take corresponding measures.
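Generating the prompt information from per-table results could look like the following sketch. The function name `build_prompt` and the message wording are illustrative assumptions; the source does not specify a message format.

```python
from typing import Dict


def build_prompt(results: Dict[str, bool]) -> str:
    """Build a message from per-table check results (True means the table passed)."""
    failed = sorted(name for name, passed in results.items() if not passed)
    if not failed:
        return "All tables passed verification."
    return "Verification failed for table(s): " + ", ".join(failed)


# Hypothetical results for three tables; only 'orders' failed its check.
message = build_prompt({"accounts": True, "orders": False, "users": True})
```

Listing the failing table names directly gives the technician a starting point without re-running the comparison.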
In the embodiments of the present invention, no manual screening is needed: whether the data in the first database is consistent with the data in the second database is determined by checking the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table. If the verification passes, the data in the first database is determined to be consistent with the data in the second database; if the verification fails, errors or omissions occurred when the data was migrated from the first database to the second database. This reduces the time needed to compare the data before and after migration, thereby ensuring the consistency of the data before and after migration.
Optionally, based on the data verification method shown above, the process of executing step S202, that is, processing the first data table to obtain the first feature matrix related to DQL and the first cryptographic hash function (MD5) feature value and determining the first number of data records in the first data table, includes the following steps:
Step S11: querying the number of data records of the first data table based on a database query statement, and determining the first number of data records corresponding to the first data table.
In the specific implementation of step S11, the number of data records of each first data table in the first database is queried by using the database query statement, so as to obtain the first number of data records corresponding to each first data table.
For example: if the table name of the first data table is t, the first number of data records corresponding to the first data table t is obtained by querying with the DQL database query statement select count(*) from t.
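A row-count query of this kind can be sketched with Python's standard `sqlite3` module. SQLite is an assumption made only so the example runs on its own; `count_records` is a hypothetical helper name.

```python
import sqlite3


def count_records(conn, table):
    """Return the number of data records in `table` using a COUNT(*) query."""
    # Table names cannot be bound as SQL parameters, so the name is validated
    # before being placed in the statement.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    return count


# In-memory table t with three records, standing in for the first data table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

first_record_count = count_records(conn, "t")
```

Running the same function against the second database gives the second number of data records for the comparison.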
Step S12: obtaining all special fields in the first data table, and classifying and counting all the special fields to determine the number of each special field, so as to construct a first feature matrix based on the number of each special field.
In step S12, the special fields are preset.
In the specific implementation of step S12, for each first data table, all the special fields in the first data table are searched for according to the preset special fields; the numbers of the special fields found are summed by using a database query statement to obtain the number of each special field; and finally, the first feature matrix corresponding to the first data table is constructed from the number of each special field.
It should be noted that the special fields are set in advance by a technician according to experience, and the embodiments of the present invention do not limit them.
The size of the first feature matrix is related to the number of preset special fields; for example, when there are 6 preset special fields, the first feature matrix may be a 2*3 matrix or a 3*2 matrix.
For example: assume that the preset special fields are a, b, c, f, x and y, and that the first feature matrix is a 2*3 matrix. The special fields a, b, c, f, x and y are searched for in the first data table t in the first database, and all the special fields found are a, a, a, a, b, b, c, c, c, c, f, f, x, y, y and y. The numbers of the special fields found are then summed by using the DQL statement select sum(a), sum(b) from t to obtain the number of each special field (for example, the number of a is 4). Finally, the first feature matrix corresponding to the first data table is constructed as a 2*3 matrix from the numbers of the six special fields. [Matrix not reproduced in this text.]
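One possible reading of the counting step is sketched below in Python. It simply counts the special-field occurrences enumerated in the example and arranges the counts row by row; the row-major fill order and the function name `build_feature_matrix` are assumptions, since the source does not state how the counts are placed in the matrix.

```python
from collections import Counter


def build_feature_matrix(found, special_fields, rows, cols):
    """Count each preset special field and arrange the counts as rows x cols."""
    if rows * cols != len(special_fields):
        raise ValueError("matrix size must equal the number of special fields")
    counts = Counter(value for value in found if value in special_fields)
    # One count per special field, in the preset order (0 if never found).
    flat = [counts[field] for field in special_fields]
    # Row-major layout (an assumption): the first `cols` counts form row 0, etc.
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


# The occurrences listed in the example text above.
found = ["a", "a", "a", "a", "b", "b", "c", "c", "c", "c",
         "f", "f", "x", "y", "y", "y"]
matrix = build_feature_matrix(found, ["a", "b", "c", "f", "x", "y"], rows=2, cols=3)
```

For the enumerated list this yields the counts 4, 2, 4 in the first row and 2, 1, 3 in the second; these are just the tallies of that list and are not claimed to match the matrix in the original figure. The same function with `rows=3, cols=2` gives the 3*2 layout.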
Step S13: and exporting the first data table into a first data file according to a preset separator, and generating a first MD5 characteristic value corresponding to the first data table based on the first data file.
In the specific implementation of step S13, a first data file of each first data table is exported based on the preset separator, and a 128-bit MD5 hash value corresponding to each first data table is generated from each first data file.
It should be noted that the preset separator is set in advance by a technician and may, for example, be set to ","; the embodiment of the present invention is not limited thereto.
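The export and hashing of step S13 can be sketched as follows. Several points are assumptions rather than part of the described method: the table is exported to an in-memory byte string instead of a file on disk, rows are sorted by the first column so that both databases produce the same line order (this presumes the first column is a unique key), and the names export_table and md5_of are hypothetical.

```python
import csv
import hashlib
import io
import sqlite3


def export_table(conn: sqlite3.Connection, table: str, separator: str = ",") -> bytes:
    """Export every row of `table` as separator-delimited text."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=separator, lineterminator="\n")
    # A fixed row order is required, otherwise identical data could
    # produce different files and therefore different MD5 values.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY 1"):
        writer.writerow(row)
    return buf.getvalue().encode("utf-8")


def md5_of(data: bytes) -> str:
    """Return the 128-bit MD5 digest as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def make_db(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


# Same data inserted in a different order before and after migration.
src = make_db([(1, "x"), (2, "y")])
dst = make_db([(2, "y"), (1, "x")])
first_md5 = md5_of(export_table(src, "t"))
second_md5 = md5_of(export_table(dst, "t"))
print(first_md5 == second_md5)
```

Any difference in a single field value changes the exported bytes and therefore the MD5 value, which is why this check is used as the final, most detailed comparison.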
The execution sequence of steps S11 to S13 may be as described above, or may be executed simultaneously, or step S12 may be executed first, then step S11 may be executed, and then step S13 may be executed. This application is not limited thereto.
In the embodiment of the invention, the first data number of each first data table is queried through a database query statement; constructing a first feature matrix corresponding to each first data table according to the special field of each first data table; and generating a first MD5 value corresponding to the first data table according to the first data file corresponding to each first data table. The method comprises the steps of checking the first data bar number, the first characteristic matrix and the first MD5 characteristic value of a first data table and the second data bar number, the second characteristic matrix and the second MD5 characteristic value of a second data table in sequence to determine whether data in a first database are consistent with data in a second database. And if the verification passes, determining that the data in the first database is consistent with the data in the second database. The time for comparing the data before and after migration can be reduced, so that the consistency of the data before and after migration is ensured.
Optionally, based on the data verification method shown above, in executing step S203, the process of processing the second data table to obtain the second feature matrix and the second MD5 feature value related to the DQL, and determining the second number of data records in the second data table, includes the following steps:
Step S21: querying the number of data records of the second data table based on the database query statement, and determining the second number of data records corresponding to the second data table.
Step S22: acquiring all special fields in a second data table, classifying and calculating all the special fields, and determining the number of each characteristic field; a second feature matrix is constructed based on the number of each particular field.
In step S22, the special field is set in advance.
Step S23: and exporting the second data table into a second data file according to the preset separator, and generating a second MD5 characteristic value corresponding to the data table based on the second data file.
It should be noted that the specific implementation contents of step S21 to step S23 are the same as the specific implementation procedures of step S11 to step S13, and reference may be made to each other, which is not a limitation of the present invention.
Similarly, the execution sequence of steps S21 to S23 may be performed as described above, or simultaneously, or step S22 may be performed first, then step S21 may be performed, and then step S23 may be performed. This application is not limited thereto.
In the embodiment of the invention, the second number of data records of each second data table is queried through a database query statement; a second feature matrix corresponding to each second data table is constructed according to the special fields of that second data table; and a second MD5 value corresponding to each second data table is generated according to the second data file corresponding to that second data table. The first number of data records, the first feature matrix and the first MD5 characteristic value of the first data table are then checked in sequence against the second number of data records, the second feature matrix and the second MD5 characteristic value of the second data table to determine whether the data in the first database is consistent with the data in the second database. If the verification passes, it is determined that the data in the first database is consistent with the data in the second database. This reduces the time needed to compare the data before and after migration while ensuring the consistency of the data before and after migration.
Based on the data verification method shown in the above embodiment of the present invention, and referring to fig. 3 in conjunction with fig. 2, the present invention further shows a specific process for verifying the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table. The process includes the following steps:
Step S301: and acquiring a first data table of the first database before migration and a second data table corresponding to the first data table in the second database after migration.
Step S302: processing the first data table to obtain a first feature matrix and a first cryptographic hash function MD5 feature value related to the DQL, and determining a first number of data stripes in the first data table.
Step S303: and processing the second data table to obtain a second feature matrix and a second MD5 feature value related to the DQL, and determining the second data strip number in the second data table.
Note that the specific implementation procedures of step S301 to step S303 are the same as those of step S201 to step S203 shown in the above steps, and can be referred to each other.
Step S304: judging whether the first number of data records of the first data table is consistent with the second number of data records of the corresponding second data table, and judging whether the first feature matrix of the first data table is consistent with the second feature matrix of the corresponding second data table; if either of the two is inconsistent, executing step S307, and if both are consistent, executing step S305.
In the specific implementation of step S304, the first number of data records of each first data table is compared with the second number of data records of the corresponding second data table. If the first number of data records of any first data table is inconsistent with the second number of data records of the corresponding second data table, step S307 is executed directly. If the first number of data records of each first data table is consistent with the second number of data records of the corresponding second data table, the first feature matrix of each first data table is compared with the second feature matrix of the corresponding second data table. If the first feature matrix of any first data table is inconsistent with the second feature matrix of the corresponding second data table, step S307 is executed directly; if the first feature matrix of each first data table is consistent with the second feature matrix of the corresponding second data table, the DQL data check is passed and step S305 is executed.
Step S305: judging whether the first MD5 characteristic value of the first data table is consistent with the second MD5 characteristic value of the corresponding second data table, if so, executing step S306, and if not, executing step S307.
In the specific implementation of step S305, the first MD5 characteristic value of each first data table is compared with the second MD5 characteristic value of the corresponding second data table; if the first MD5 characteristic value of any first data table is inconsistent with the second MD5 characteristic value of the corresponding second data table, step S307 is executed, and if the first MD5 characteristic value of each first data table is consistent with the second MD5 characteristic value of the corresponding second data table, step S306 is executed.
Step S306: it is determined that the data in the first database is consistent with the data in the second database.
Step S307: it is determined that the data in the first database is inconsistent with the data in the second database.
It should be noted that the specific implementation procedures of step S306 and step S307 are the same as the specific implementation procedures of step S204 to step S205 shown in the above-described embodiment of the present invention, and can be referred to each other.
In the embodiment of the invention, manual screening is not needed, whether the first data bar number of the first data table is consistent with the second data bar number of the second data table is checked, whether the first feature matrix of the first data table is consistent with the second feature matrix of the second data table is checked, and if so, whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the second data table is checked, so that whether the data in the first database is consistent with the data in the second database is determined. The time for comparing the data before and after migration can be reduced, so that the consistency of the data before and after migration is ensured.
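The decision flow of steps S304 to S307 can be sketched as follows. The sketch assumes that the three values of each table have already been computed and that corresponding tables appear at the same position in both lists; the class name TableFingerprint and the function name verify are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TableFingerprint:
    row_count: int                           # number of data records
    feature_matrix: Tuple[Tuple[int, ...], ...]  # counts of the special fields
    md5: str                                 # MD5 value of the exported data file


def verify(first: Sequence[TableFingerprint],
           second: Sequence[TableFingerprint]) -> bool:
    """Return True (S306) if all checks pass, otherwise False (S307)."""
    if len(first) != len(second):
        return False
    pairs = list(zip(first, second))
    # S304, part 1: record counts of every pair of tables.
    if any(a.row_count != b.row_count for a, b in pairs):
        return False
    # S304, part 2: feature matrices of every pair of tables.
    if any(a.feature_matrix != b.feature_matrix for a, b in pairs):
        return False
    # S305: MD5 values, reached only when the DQL data check has passed.
    if any(a.md5 != b.md5 for a, b in pairs):
        return False
    return True


before = [TableFingerprint(3, ((2, 1), (0, 4)), "abc")]
after_ok = [TableFingerprint(3, ((2, 1), (0, 4)), "abc")]
after_bad = [TableFingerprint(3, ((2, 1), (0, 4)), "abd")]
print(verify(before, after_ok), verify(before, after_bad))
```

Ordering the checks from cheapest to most expensive lets an inconsistent migration be rejected as early as possible.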
For a better understanding of the data verification process shown above, the following description is given in detail with reference to the application architecture shown in fig. 1.
The data verification device 20 performs query processing on the first data table 101 and the first data table 102 in the first database 10 to obtain the first number of data records corresponding to each first data table, specifically the total number of records in the first data table 101, that is, the first number of data records of the first data table 101, and the total number of records in the first data table 102. Similarly, the data verification device 20 performs query processing on the second data table 301 and the second data table 302 in the second database 30 to obtain the second number of data records corresponding to each second data table, specifically the total number of records in the second data table 301, that is, the second number of data records of the second data table 301, and the total number of records in the second data table 302.
The data verification device 20 compares the total number of the first data table 101 with the total number of the second data table 301, and compares the total number of the first data table 102 with the total number of the second data table 302 until the total number of the first data table 10N and the total number of the second data table 30M are compared.
If the total number of the first data table 101 is consistent with the total number of the second data table 301, the total number of the first data table 102 is consistent with the total number of the second data table 302, and so on, when the total number of the first data table 10N is consistent with the total number of the second data table 30M, respectively searching all special fields of the first data table 101 and the first data table 102 in the first database 10 according to preset special fields, and summing the number of the special fields searched in each first data table 100 by utilizing a database search statement to obtain the number of each characteristic field; and finally, constructing a first feature matrix corresponding to the first data table 100 by using the number of each feature field, and specifically, obtaining a first feature matrix of the first data table 101, and the first feature matrix of the first data table 102. And similarly, the second data table is processed to obtain a second feature matrix of the second data table 301, and the second feature matrix of the second data table 302.
The data checking device 20 performs matrix comparison between the first feature matrix of the first data table 101 and the second feature matrix of the second data table 301, and performs matrix comparison between the first feature matrix of the first data table 102 and the second feature matrix of the second data table 302 until the first feature matrix of the first data table 10N and the second feature matrix of the second data table 30M are subjected to matrix comparison.
If the first feature matrix of the first data table 101 is consistent with the second feature matrix of the second data table 301, the first feature matrix of the first data table 102 is consistent with the second feature matrix of the second data table 302, and so on up to the first feature matrix of the first data table 10N being consistent with the second feature matrix of the second data table 30M, the first data file of each first data table is exported based on the preset separator ",", specifically the first data file of the first data table 101 and the first data file of the first data table 102, and the first MD5 characteristic value corresponding to each first data file is generated. Similarly, each second data table is processed to obtain the second MD5 characteristic value corresponding to the second data file of the second data table 301 and the second MD5 characteristic value corresponding to the second data file of the second data table 302.
The data verification device 20 compares the first MD5 characteristic value corresponding to the first data file of the first data table 101 with the second MD5 characteristic value corresponding to the second data file of the second data table 301, and compares the first MD5 characteristic value corresponding to the first data file of the first data table 102 with the second MD5 characteristic value corresponding to the second data file of the second data table 302 until the first MD5 characteristic value corresponding to the first data file of the first data table 10N and the second MD5 characteristic value corresponding to the second data file of the second data table 30N are compared.
If the first MD5 characteristic value corresponding to the first data file of the first data table 101 is consistent with the second MD5 characteristic value corresponding to the second data file of the second data table 301, the first MD5 characteristic value corresponding to the first data file of the first data table 102 is consistent with the second MD5 characteristic value corresponding to the second data file of the second data table 302, and so on, when the first MD5 characteristic value corresponding to the first data file of the first data table 10N is consistent with the second MD5 characteristic value corresponding to the second data file of the second data table 30N, determining that the data in the first database is consistent with the data in the second database.
In the embodiment of the invention, manual screening is not needed, whether the first data bar number of the first data table is consistent with the second data bar number of the second data table is checked, whether the first feature matrix of the first data table is consistent with the second feature matrix of the second data table is checked, and if so, whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the second data table is checked, so that whether the data in the first database is consistent with the data in the second database is determined. The time for comparing the data before and after migration can be reduced, so that the consistency of the data before and after migration is ensured.
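The staged walk-through above, in which each kind of value is computed and compared for all table pairs before the next, more expensive stage starts, can be sketched end to end as follows. The sketch rests on several assumptions: sqlite3 stands in for both databases, the feature values are taken as SUM over hypothetical integer columns a and b (one reading of the select sum(a), sum(b) statement), rows are exported in order of the first column, and all function names are illustrative.

```python
import hashlib
import sqlite3


def row_count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def feature_values(conn, table, special_columns):
    sums = ", ".join(f"SUM({c})" for c in special_columns)
    return conn.execute(f"SELECT {sums} FROM {table}").fetchone()


def export_md5(conn, table, sep=","):
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
    text = "\n".join(sep.join(map(str, row)) for row in rows)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def databases_consistent(first, second, table_pairs, special_columns):
    """Run the three checks in order; stop at the first inconsistency."""
    stages = (
        row_count,
        lambda conn, table: feature_values(conn, table, special_columns),
        export_md5,
    )
    for stage in stages:
        # Each stage is evaluated for every table pair before the next begins.
        for t1, t2 in table_pairs:
            if stage(first, t1) != stage(second, t2):
                return False
    return True


def make_db(rows_t1, rows_t2):
    # Table and column names come from trusted configuration in this sketch.
    conn = sqlite3.connect(":memory:")
    for name, rows in (("t1", rows_t1), ("t2", rows_t2)):
        conn.execute(f"CREATE TABLE {name} (a INTEGER, b INTEGER)")
        conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)
    return conn


first = make_db([(1, 2), (3, 4)], [(5, 6)])
same = make_db([(3, 4), (1, 2)], [(5, 6)])
swapped = make_db([(3, 2), (1, 4)], [(5, 6)])
pairs = [("t1", "t1"), ("t2", "t2")]
print(databases_consistent(first, same, pairs, ["a", "b"]))
print(databases_consistent(first, swapped, pairs, ["a", "b"]))
```

In the second call the record counts and column sums agree, so only the MD5 stage detects that two values were swapped between rows.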
Corresponding to the data verification method disclosed in the above embodiment of the present invention, the embodiment of the present invention also correspondingly discloses a structure schematic diagram of a data verification device, as shown in fig. 6, where the device includes:
the first obtaining unit 601 is configured to obtain a first data table of a first database before migration and a second data table corresponding to the first data table in a second database after migration.
A first determining unit 602, configured to process the first data table, obtain a first feature matrix related to the DQL and a first cryptographic hash function MD5 feature value, and determine a first number of data stripes in the first data table.
A second determining unit 603, configured to process the second data table, obtain a second feature matrix and a second MD5 feature value related to the DQL, and determine a second number of data stripes in the second data table.
And a checking unit 604, configured to check the first number of data stripes of the first data table, the first feature matrix, the first MD5 feature value, and the second number of data stripes of the second data table, the second feature matrix, and the second MD5 feature value, and if the check is passed, execute a third determining unit 605.
And a third determining unit 605 for determining that the data in the first database is consistent with the data in the second database.
It should be noted that, the specific principle and the execution process of each unit in the data verification device disclosed in the embodiment of the present application are the same as those of the data verification method shown in the embodiment of the present application, and reference may be made to corresponding parts in the data verification method disclosed in the embodiment of the present application, which are not repeated herein.
In the embodiment of the invention, manual screening is not needed, and whether the data in the first database is consistent with the data in the second database is determined by checking the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table. If the verification passes, it is determined that the data in the first database is consistent with the data in the second database; if the verification fails, this indicates that an error or omission occurred when the data was migrated from the first database to the second database. This reduces the time needed to compare the data before and after migration while ensuring the consistency of the data before and after migration.
Optionally, based on the data verification device shown above, the first determining unit 602, which processes the first data table and determines the first number of data records in the first data table, is specifically configured to: query the number of data records of the first data table based on the database query statement, and determine the first number of data records corresponding to the first data table;
correspondingly, the second determining unit 603, which processes the second data table and determines the second number of data records in the second data table, is specifically configured to: query the number of data records of the second data table based on the database query statement, and determine the second number of data records corresponding to the second data table.
In the embodiment of the invention, the first data number of each first data table is queried through a database query statement; and similarly, inquiring the second data number of each second data table according to the database inquiry statement so as to facilitate the follow-up verification of the first data number, the first feature matrix and the first MD5 feature value of the first data table and the second data number, the second feature matrix and the second MD5 feature value of the second data table, so as to determine whether the data in the first database is consistent with the data in the second database. And if the verification passes, determining that the data in the first database is consistent with the data in the second database. The time for comparing the data before and after migration can be reduced, so that the consistency of the data before and after migration is ensured.
Optionally, based on the data verification device shown above, the first determining unit 602, which processes the first data table to obtain the first feature matrix related to the DQL, is specifically configured to: acquire all special fields in the first data table, wherein the special fields are preset; classify and count all the special fields to determine the number of each special field; and construct a first feature matrix based on the number of each special field.
Correspondingly, the second determining unit 603, which processes the second data table to obtain the second feature matrix related to the DQL, is specifically configured to: acquire all special fields in the second data table, wherein the special fields are preset; classify and count all the special fields to determine the number of each special field; and construct a second feature matrix based on the number of each special field.
In the embodiment of the invention, a first feature matrix corresponding to the first data table is constructed according to the special field of each first data table, and a second feature matrix corresponding to the second data table is constructed according to the special field of each second data table. The method comprises the steps of checking the first data bar number, the first characteristic matrix and the first MD5 characteristic value of a first data table and the second data bar number, the second characteristic matrix and the second MD5 characteristic value of a second data table in sequence to determine whether data in a first database are consistent with data in a second database. And if the verification passes, determining that the data in the first database is consistent with the data in the second database. The time for comparing the data before and after migration can be reduced, so that the consistency of the data before and after migration is ensured.
Optionally, based on the data verification device shown above, the first determining unit 602, which processes the first data table to obtain the first cryptographic hash function MD5 feature value related to the DQL, is specifically configured to: export the first data table into a first data file according to a preset separator; and generate, based on the first data file, a first MD5 characteristic value corresponding to the first data table.
Correspondingly, the second determining unit 603, which processes the second data table to obtain the second MD5 characteristic value related to the DQL, is specifically configured to: export the second data table into a second data file according to the preset separator; and generate, based on the second data file, a second MD5 characteristic value corresponding to the second data table.
In the embodiment of the invention, a first MD5 value corresponding to each first data table is generated according to the first data file corresponding to that first data table, and similarly, a second MD5 value corresponding to each second data table is generated according to the second data file corresponding to that second data table. The first number of data records, the first feature matrix and the first MD5 characteristic value of the first data table are then checked in sequence against the second number of data records, the second feature matrix and the second MD5 characteristic value of the second data table to determine whether the data in the first database is consistent with the data in the second database. If the verification passes, it is determined that the data in the first database is consistent with the data in the second database. This reduces the time needed to compare the data before and after migration while ensuring the consistency of the data before and after migration.
Optionally, based on the data verification device shown above, and as shown in fig. 7 in conjunction with fig. 6, the verification unit 604 includes a DQL data verification module 6041 and an MD5 module 6042.
The DQL data verification module 6041 is configured to determine whether the first number of data records of the first data table is consistent with the second number of data records of the corresponding second data table, and whether the first feature matrix of the first data table is consistent with the second feature matrix of the corresponding second data table; if either of the two is inconsistent, it is determined that the data in the first database is inconsistent with the data in the second database; if both are consistent, the MD5 module 6042 is executed.
The MD5 module 6042 is configured to determine whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the corresponding second data table; if they are inconsistent, it is determined that the data in the first database is inconsistent with the data in the second database; if they are consistent, the third determining unit 605 is executed.
In the embodiment of the invention, whether the first data bar number of the first data table is consistent with the second data bar number of the second data table or not is checked, and if so, whether the first feature matrix of the first data table is consistent with the second feature matrix of the second data table or not is checked, and if so, whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the second data table or not is checked, so as to determine whether the data in the first database is consistent with the data in the second database or not. The time for comparing the data before and after migration can be reduced, so that the consistency of the data before and after migration is ensured.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment mainly describes its differences from the other embodiments. In particular, a device or system embodiment is described relatively briefly because it is substantially similar to the method embodiment, and reference may be made to the corresponding parts of the description of the method embodiment. The device and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. A method of data verification, the method comprising:
acquiring a first data table of a first database before migration and a second data table corresponding to the first data table in a second database after migration;
processing the first data table to obtain a first feature matrix and a first cryptographic hash function MD5 feature value related to DQL, and determining a first number of data strips in the first data table, including: inquiring the number of data bars of the first data table based on a database inquiry statement, and determining the number of the first data bars corresponding to the first data table; acquiring all special fields in the first data table, wherein the special fields are preset; classifying and calculating all the special fields, and determining the number of each special field; constructing a first feature matrix based on the number of each special field; exporting the first data table into a first data file according to a preset separator; generating a first MD5 characteristic value corresponding to the first data table based on the first data file;
Processing the second data table to obtain a second feature matrix and a second MD5 feature value related to the DQL, and determining a second data bar number in the second data table, including: inquiring the number of data bars of the second data table based on a database inquiry statement, and determining the number of second data bars corresponding to the second data table; acquiring all special fields in the second data table, wherein the special fields are preset; classifying and calculating all the special fields, and determining the number of each special field; constructing a second feature matrix based on the number of each special field; exporting the second data table into a second data file according to a preset separator; generating a second MD5 characteristic value corresponding to the data table based on the second data file;
checking the first data number, the first feature matrix and the first MD5 feature value of the first data table and the second data number, the second feature matrix and the second MD5 feature value of the second data table;
and if the verification is passed, determining that the data in the first database is consistent with the data in the second database.
2. The method of claim 1, wherein verifying the first number of data stripes, the first eigen matrix, and the first MD5 eigenvalue of the first data table with the second number of data stripes, the second eigen matrix, and the second MD5 eigenvalue of the second data table comprises:
judging whether the first data number of the first data table is consistent with the second data number of the corresponding second data table, and judging whether the first feature matrix of the first data table is consistent with the second feature matrix of the corresponding second data table;
if any one of the two is inconsistent, determining that the data in the first database is inconsistent with the data in the second database;
if both are consistent, judging whether the first MD5 characteristic value of the first data table is consistent with the second MD5 characteristic value of the corresponding second data table;
if so, executing the step of determining that the data in the first database is consistent with the data in the second database;
if not, determining that the data in the first database is inconsistent with the data in the second database.
3. A data verification device, the device comprising:
a first acquisition unit, configured to acquire a first data table of a first database before migration and a second data table corresponding to the first data table in a second database after migration;
a first determining unit, configured to process the first data table to obtain a first feature matrix related to DQL and a first cryptographic hash function MD5 feature value, and determine a first number of data records in the first data table, including: querying the number of data records in the first data table based on a database query statement, and determining the first number of data records corresponding to the first data table; acquiring all special fields in the first data table, wherein the special fields are preset; classifying and counting all the special fields, and determining the count of each special field; constructing the first feature matrix based on the count of each special field; exporting the first data table into a first data file according to a preset separator; generating the first MD5 feature value corresponding to the first data table based on the first data file;
a second determining unit, configured to process the second data table to obtain a second feature matrix related to DQL and a second MD5 feature value, and determine a second number of data records in the second data table, including: querying the number of data records in the second data table based on a database query statement, and determining the second number of data records corresponding to the second data table; acquiring all special fields in the second data table, wherein the special fields are preset; classifying and counting all the special fields, and determining the count of each special field; constructing the second feature matrix based on the count of each special field; exporting the second data table into a second data file according to a preset separator; generating the second MD5 feature value corresponding to the second data table based on the second data file;
a verification unit, configured to verify the first number of data records, the first feature matrix and the first MD5 feature value of the first data table against the second number of data records, the second feature matrix and the second MD5 feature value of the second data table, and, if the verification is passed, to trigger the third determining unit;
and a third determining unit, configured to determine that the data in the first database is consistent with the data in the second database.
4. The device of claim 3, wherein the verification unit comprises a DQL data verification module and an MD5 module;
the DQL data verification module is used for judging whether the first number of data records of the first data table is consistent with the second number of data records of the corresponding second data table, and whether the first feature matrix of the first data table is consistent with the second feature matrix of the corresponding second data table; if either of the two is inconsistent, determining that the data in the first database is inconsistent with the data in the second database; if both are consistent, triggering the MD5 module;
the MD5 module is used for judging whether the first MD5 feature value of the first data table is consistent with the second MD5 feature value of the corresponding second data table; if not, determining that the data in the first database is inconsistent with the data in the second database; and if so, triggering the third determining unit.
CN202011324306.4A 2020-11-23 2020-11-23 Data verification method and device Active CN112286910B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011324306.4A CN112286910B (en) 2020-11-23 2020-11-23 Data verification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011324306.4A CN112286910B (en) 2020-11-23 2020-11-23 Data verification method and device

Publications (2)

Publication Number Publication Date
CN112286910A CN112286910A (en) 2021-01-29
CN112286910B CN112286910B (en) 2024-04-12

Family

ID=74425218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011324306.4A Active CN112286910B (en) 2020-11-23 2020-11-23 Data verification method and device

Country Status (1)

Country Link
CN (1) CN112286910B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860669A (en) * 2021-02-24 2021-05-28 中国联合网络通信集团有限公司 Data migration verification method and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105354314A (en) * 2015-11-10 2016-02-24 中国建设银行股份有限公司 Data migration method and device
CN107122368A (en) * 2016-02-25 2017-09-01 阿里巴巴集团控股有限公司 A kind of data verification method, device and electronic equipment
CN107807982A (en) * 2017-10-27 2018-03-16 中国农业银行股份有限公司 A kind of consistency desired result method and device of heterogeneous database
CN110019135A (en) * 2017-12-27 2019-07-16 航天信息股份有限公司 It is a kind of to migrate relational data to the method and device of HBase database
WO2020015150A1 (en) * 2018-07-18 2020-01-23 平安科技(深圳)有限公司 Method and device for dynamically exporting data table, computer apparatus, and storage medium
CN111104392A (en) * 2019-12-12 2020-05-05 京东数字科技控股有限公司 Database migration method and device, electronic equipment and storage medium
CN111290998A (en) * 2020-02-12 2020-06-16 平安科技(深圳)有限公司 Method, device and equipment for calibrating migration data and storage medium

Also Published As

Publication number Publication date
CN112286910A (en) 2021-01-29

Similar Documents

Publication Publication Date Title
WO2021036810A1 (en) Evidence verification method, system, apparatus and device, and readable storage medium
CN102236672B (en) A kind of data lead-in method and device
CN110162516B (en) Data management method and system based on mass data processing
US9367580B2 (en) Method, apparatus and computer program for detecting deviations in data sources
CN105589706B (en) A kind of upgrade package generation method and device
CN106844730B (en) Method and device for displaying file content
CN107016019B (en) Database index creation method and device
CN107133233B (en) Processing method and device for configuration data query
CN112286910B (en) Data verification method and device
CN111290998A (en) Method, device and equipment for calibrating migration data and storage medium
CN109815697A (en) Wrong report behavior processing method and processing device
EP2862101A1 (en) Method and a consistency checker for finding data inconsistencies in a data repository
CN111475402B (en) Program function testing method and related device
WO2018202174A1 (en) Version comparison testing method and system
US8498963B2 (en) Method and system for data synchronization
US8463799B2 (en) System and method for consolidating search engine results
CN116401229A (en) Database data verification method, device and equipment
CN106776264B (en) Application program code testing method and device
CN114255134A (en) Account number disassembling method and device and storage medium
CN113688147B (en) Data processing method and system
CN110909211B (en) Metadata verification method and device based on metadata B + tree
CN105653525B (en) Method and system for importing data between account sets
CN114595159B (en) Test data generation method, device, equipment and storage medium
CN115454354B (en) Data processing method, system, electronic device and storage medium
US11010387B2 (en) Join operation and interface for wildcards

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant