CN113495928A - Data consistency checking method and device, electronic equipment and readable storage medium - Google Patents

Publication number
CN113495928A
Authority
CN
China
Prior art keywords
record list
check
list
data
record
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111046391.7A
Other languages
Chinese (zh)
Other versions
CN113495928B (en)
Inventor
朱雨朦
邹永强
杨晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Accumulus Technologies Tianjin Co Ltd
Original Assignee
Accumulus Technologies Tianjin Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Accumulus Technologies Tianjin Co Ltd
Priority to CN202111046391.7A
Publication of CN113495928A
Application granted
Publication of CN113495928B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273 Asynchronous replication or reconciliation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning

Abstract

The invention provides a data consistency checking method and device, electronic equipment and a readable storage medium, and relates to the technical field of computers. The method comprises the following steps: after two record lists requiring a data consistency check have been sorted in the same way according to the same unique identification column, the first elements of the two record lists are repeatedly taken out and their unique identifiers compared, until either of the two record lists has no remaining elements; the first element with the larger unique identifier is put back at the head of its original queue, and the first element with the smaller unique identifier is stored, classified by its record list, in a check result table, so that a data consistency check is performed on each element of the two record lists. The check result table comprises: a first result of elements only in the first record list, and a second result of elements only in the second record list. This solves the prior-art problem that data is difficult to check when the target database is not identical to the source database.

Description

Data consistency checking method and device, electronic equipment and readable storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a data consistency checking method and device, electronic equipment and a readable storage medium.
Background
At present, data consistency checks are mostly used for database synchronization or migration: after synchronization or migration is completed, the data of the source database and the target database are checked for consistency. Existing verification methods include data verification and log file verification. Data verification compares whether the checksums (such as MD5 values) of each record are the same; log file verification verifies the consistency of the synchronized data by comparing transaction logs. Both methods check whether the data are completely consistent.
Because existing data consistency checking methods test for complete consistency, they cannot handle comparisons of data that are not completely consistent. They cannot be used when the target database and the source database are not identical but can still be verified against each other (for example, one record of the source database corresponds to a plurality of records of the target database, or one record of the target database corresponds to a plurality of records of the source database), or when the structure of the target database has changed so much that the structures of the source and target databases differ, even though the data sources are the same or the data of the target database originates from the source database.
Disclosure of Invention
The embodiment of the invention provides a data consistency checking method, a data consistency checking device, electronic equipment and a readable storage medium, and aims to solve the problem that data checking is difficult when data of a target database and data of a source database are not identical in the prior art.
In order to solve the technical problem, the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a data consistency checking method, including:
step s1, obtaining a first record list and a second record list; the elements in the first record list and the second record list have been sorted in the same way according to the same unique identification column; for each element, the unique identification column stores the corresponding unique identifier;
step s2, if both the first record list and the second record list have remaining elements, taking out the first element in the queue of the first record list and the first element in the queue of the second record list, and comparing the unique identifiers of the two first elements;
step s3, putting the first element with the larger unique identifier back as the first element in the queue of its record list;
step s4, storing the first element with the smaller unique identifier into a check result table, classified according to its record list;
step s5, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, performing a consistency check on the two first elements according to a preset check condition, and storing the check result in the check result table;
step s6, repeating steps s2 to s5 until at least one of the first record list or the second record list has no remaining elements;
when the data of the first record list and the second record list are not completely consistent, the check result in the check result table is at least one of: a first result of elements only in the first record list, a second result of elements only in the second record list, and a third result of elements in both the first record list and the second record list.
Optionally, the first record list or the second record list is obtained by screening from the same source table according to different screening conditions.
Optionally, the first record list and the second record list have the same unique identification column and at least one check column with the same data type.
Optionally, before the obtaining of the first record list and the second record list, the method further includes checking parameters of the first record list and the second record list.
Optionally, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to perform the consistency check on at least one check column that has the same data type in the first record list and the second record list.
Optionally, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to perform the consistency check according to a composite function formed from a plurality of check columns of the same data type in the first record list and the second record list.
Optionally, if the parameters for checking the first record list and the second record list include a record consistency check function, and the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to customize the check range of the record consistency check function, so as to implement a consistency check on specified data.
In a second aspect, an embodiment of the present invention further provides a data consistency checking apparatus, including:
the acquisition module is used for acquiring a first record list and a second record list; the elements in the first record list and the second record list have been sorted in the same way according to the same unique identification column; for each element, the unique identification column stores the corresponding unique identifier;
a comparing module, configured to, if both the first record list and the second record list have remaining elements, take out the first element in the queue of the first record list and the first element in the queue of the second record list, and compare the unique identifiers of the two first elements;
an execution module, configured to perform at least one of the following steps until at least one of the first record list or the second record list has no remaining elements:
putting the first element with the larger unique identifier back as the first element in the queue of its record list;
storing the first element with the smaller unique identifier into a check result table, classified according to its record list;
when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, performing a consistency check on the two first elements according to a preset check condition, and storing the check result in the check result table;
the output module is used for outputting the check result table when the data of the first record list and the data of the second record list are not completely consistent; the check result in the check result table is at least one of: a first result of elements only in the first record list, a second result of elements only in the second record list, and a third result of elements in both the first record list and the second record list.
In a third aspect, an embodiment of the present invention further provides an electronic device, including: a processor, a memory and a program stored on the memory and executable on the processor, which program, when executed by the processor, performs the steps of the data consistency checking method according to any one of the first aspect.
In a fourth aspect, the embodiment of the present invention further provides a readable storage medium, where a program is stored, and when the program is executed by a processor, the method implements the steps of the data consistency checking method according to any one of the first aspect.
In the embodiment of the invention, after two record lists requiring a data consistency check have been sorted in the same way according to the same unique identification column, the first elements of the two record lists are repeatedly taken out and their unique identifiers compared until either record list has no remaining elements; the first element with the larger unique identifier is put back at the head of its original queue, and the first element with the smaller unique identifier is stored, classified by its record list, in the check result table, so that every element of the two record lists is checked for data consistency. The checking method is flexible and general: it applies both to completely consistent data and to incompletely consistent data, and can meet data consistency checking requirements in data synchronization, migration, database reconstruction, code reconstruction and similar scenarios.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic flow chart of a data consistency checking method according to an embodiment of the present invention;
fig. 2 is a second schematic flow chart of a data consistency verification method according to an embodiment of the present invention;
fig. 3 is a third schematic flow chart of a data consistency checking method according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a data consistency checking apparatus according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a data consistency verification method according to an embodiment of the present invention; the embodiment of the invention provides a data consistency checking method, which comprises the following steps:
step s1, obtaining a first record list and a second record list; the elements in the first record list and the second record list have been sorted in the same way according to the same unique identification column; for each element, the unique identification column stores the corresponding unique identifier;
step s2, if both the first record list and the second record list have remaining elements, taking out the first element in the queue of the first record list and the first element in the queue of the second record list, and comparing the unique identifiers of the two first elements;
step s3, putting the first element with the larger unique identifier back as the first element in the queue of its record list;
step s4, storing the first element with the smaller unique identifier into a check result table, classified according to its record list;
step s5, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, performing a consistency check on the two first elements according to a preset check condition, and storing the check result in the check result table;
step s6, repeating steps s2 to s5 until at least one of the first record list or the second record list has no remaining elements;
when the data of the first record list and the second record list are not completely consistent, the check result in the check result table is at least one of: a first result of elements only in the first record list, a second result of elements only in the second record list, and a third result of elements in both the first record list and the second record list.
In the embodiment of the invention, after two record lists requiring a data consistency check have been sorted in the same way according to the same unique identification column, the first elements of the two record lists are repeatedly taken out and their unique identifiers compared until either record list has no remaining elements; the first element with the larger unique identifier is put back at the head of its original queue, and the first element with the smaller unique identifier is stored, classified by its record list, in the check result table, so that every element of the two record lists is checked for data consistency. The checking method is flexible and general: it applies both to completely consistent data and to incompletely consistent data, and can meet data consistency checking requirements in data synchronization, migration, database reconstruction, code reconstruction and similar scenarios.
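Steps s1 to s6 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `check_consistency`, the result keys and the sample records are hypothetical, the lists are assumed to be sorted in ascending order, and only inconsistent pairs are kept as the third result. The handling of elements left over after the loop ends is also an assumption, since steps s1 to s6 do not spell it out.

```python
from collections import deque
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def check_consistency(
    first: List[Record],
    second: List[Record],
    key: Callable[[Record], Any],
    is_consistent: Callable[[Record, Record], bool] = lambda a, b: a == b,
) -> Dict[str, list]:
    """Compare two record lists already sorted ascending by the same unique key."""
    q1, q2 = deque(first), deque(second)
    result: Dict[str, list] = {"only_first": [], "only_second": [], "inconsistent": []}

    # Steps s2 to s6: loop while both queues still hold elements.
    while q1 and q2:
        a, b = q1.popleft(), q2.popleft()
        ka, kb = key(a), key(b)
        if ka < kb:
            result["only_first"].append(a)   # s4: the smaller identifier is recorded
            q2.appendleft(b)                 # s3: the larger one returns to the queue head
        elif ka > kb:
            result["only_second"].append(b)
            q1.appendleft(a)
        elif not is_consistent(a, b):        # s5: equal identifiers, compare contents
            result["inconsistent"].append((a, b))

    # Assumption (not stated in steps s1 to s6): whatever remains in the
    # non-empty queue exists only in that list.
    result["only_first"].extend(q1)
    result["only_second"].extend(q2)
    return result


# Hypothetical sample data, sorted ascending by "id".
first = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 4, "name": "D"}]
second = [{"id": 2, "name": "B"}, {"id": 3, "name": "C"}, {"id": 4, "name": "X"}]
report = check_consistency(first, second, key=lambda r: r["id"])
```

Because each element is taken out at most once after being put back, the comparison runs in time linear in the combined length of the two lists, which is the usual benefit of a merge-style scan over sorted inputs.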
In some embodiments of the present invention, optionally, the first record list or the second record list is obtained by filtering from the same source table according to different filtering conditions.
In this embodiment of the present invention, the first record list and the second record list may be different record lists obtained from the source database at the current time according to different filtering conditions; this is typically used to compare data within the same source database.
In some embodiments of the present invention, optionally, one of the first record list and the second record list is a source table, and the other is a new record table obtained by filtering from the source table according to a filtering condition.
In some embodiments of the present invention, optionally, the first record list and the second record list have the same unique identification column, and at least one check column with the same data type.
In the embodiment of the invention, as long as the first record list and the second record list have the same unique identification column and at least one check column with the same data type, a data consistency check can be performed even if the two lists come from different data sources. The unique identification column uniquely identifies the data of both tables, and using the same unique identification column guarantees that the elements of both tables are arranged in the same order. The check columns are the fields to be checked, and query conditions can be set as required; having at least one comparable check column guarantees that there is content to check between the two lists. The method therefore suits checking scenarios with incompletely consistent data, and allows specified data to be checked when one source table corresponds to one target table or to a plurality of target tables.
In this embodiment of the present invention, optionally, the first record list and the second record list may also be obtained by screening from different source databases, where the first record list and the second record list have the same unique identifier column and at least one check column with the same data type.
In some embodiments of the present invention, optionally, the names (field names) of the check columns in the first record list and the second record list may differ as long as their data types are the same, that is, as long as the data in a check column of the first record list is comparable with the data in a check column of the second record list. For example, both columns may be of a time type; or one may be of a time type and the other of a time character string type, in which case both hold time data and can still be compared.
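As a small illustration of comparing a time-type column with a time-string column, the sketch below normalizes both values to `datetime` before comparing. The helper names and the string format `%Y-%m-%d %H:%M:%S` are assumptions made for the example; the source does not specify how time strings are formatted.

```python
from datetime import datetime
from typing import Union

TimeValue = Union[datetime, str]


def as_datetime(value: TimeValue, fmt: str = "%Y-%m-%d %H:%M:%S") -> datetime:
    """Return a datetime for either a datetime value or a time string (assumed format)."""
    if isinstance(value, datetime):
        return value
    return datetime.strptime(value, fmt)


def time_columns_match(a: TimeValue, b: TimeValue) -> bool:
    """Compare two time values that may be stored as different column types."""
    return as_datetime(a) == as_datetime(b)
```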
In some embodiments of the present invention, optionally, before the obtaining the first record list and the second record list, the method further includes checking parameters of the first record list and the second record list.
In the embodiment of the invention, the first record list and the second record list obtained by screening are subjected to parameter verification respectively, and then data consistency verification of the first record list and the second record list is carried out, so that the efficiency and the accuracy of data verification are further improved.
Referring to fig. 3, fig. 3 is a third schematic flow chart of a data consistency verification method according to an embodiment of the present invention;
before the data consistency check of the two record lists, the method further includes checking the parameters of at least one of the two record lists, which specifically includes:
step 31: judging whether a data table to be verified still exists, if so, turning to a step 32, otherwise, finishing verification;
step 32: performing parameter verification; if the parameter verification is passed, the step 33 is carried out, otherwise, the verification is finished;
step 33: pre-checking parameters of a data table to be checked; if the pre-check is passed, the step 34 is carried out, and if the pre-check is not passed, the verification is ended;
step 34: the passed data table is saved as the first record list or the second record list, and the process goes to step 31.
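Steps 31 to 34 can be sketched as a simple loop. Everything named here is hypothetical: `prepare_record_lists`, the dictionary keys `connection`, `table` and `check_columns`, and the `pre_check` and `load_sorted` callables stand in for whatever the real parameter check, pre-check and query would be.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

TableParams = Dict[str, Any]


def basic_check(params: TableParams) -> bool:
    """Step 32 (assumed form): required parameters must be present and non-empty."""
    required = ("connection", "table", "check_columns")
    return all(params.get(name) for name in required)


def prepare_record_lists(
    tables: Iterable[TableParams],
    pre_check: Callable[[TableParams], bool],
    load_sorted: Callable[[TableParams], list],
) -> Optional[List[list]]:
    """Return one sorted record list per table, or None if any check fails."""
    record_lists = []
    for params in tables:                  # step 31: is there another table to check?
        if not basic_check(params):        # step 32: parameter check
            return None
        if not pre_check(params):          # step 33: pre-check
            return None
        record_lists.append(load_sorted(params))  # step 34: save the record list
    return record_lists
```

Returning `None` on the first failure mirrors the flow in which a failed check ends the verification rather than skipping the table.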
In some embodiments of the present invention, optionally, before performing the data consistency check on the two record lists, the method further includes checking the parameters of both record lists.
In the embodiment of the invention, before the data consistency check of the two record lists is executed, the parameters of the two record lists are checked separately; the parameter check comprises a check according to the parameter data list and a check according to the pre-check table. As shown in fig. 3, the input parameters in the data table are checked one by one. The basic parameter check is performed first; it includes checking whether the table connection, the table name, the check columns and the like are empty, and more complex checks may also be performed. After the basic check passes, the pre-check is performed according to the pre-check table, which is used to check whether the data are consistent and whether the unique identification column and the check columns are correct (for aliases, functions and similar cases, the check may be skipped). After the pre-check passes, the data meeting the conditions are queried according to the query conditions, the unique identification column and the check columns, sorted according to the unique identification column, and the resulting data table is saved as the first record list or the second record list. When both the first record list and the second record list have passed the pre-check, the unique identifiers and check columns of the two lists can be fetched one by one to execute the data consistency checking method, and the check result is output after the check is finished.
In some embodiments of the present invention, optionally, the parameters to be checked (input parameters) form an array, each element of which includes the information of the first record list and of the second record list together with a record consistency check function (by default the function compares whether every field of the two records is the same; it may also be set as needed, so that data can be checked when the field values are not completely consistent).
Specifically, the table information includes a database connection, a database name, a table name, a unique identification column (not limited to the primary key, as long as it uniquely identifies an element record), check columns (not limited to column names of the table; they may be database functions, aliases and the like, and only need to correspond to columns of the table query result), a table query condition (for checking only specified records), and the like.
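One possible way to hold this table information is a small data class that can also render the query used to fetch a sorted record list. The class name `TableInfo`, its field names and the `to_query` method are hypothetical, and the SQL is assembled by plain string formatting purely for illustration; real code would need to guard against injection and dialect differences.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableInfo:
    connection: str                # database connection (e.g. a DSN); hypothetical field
    database: str                  # database name
    table: str                     # table name
    id_columns: List[str]          # unique identification column(s)
    check_columns: List[str]       # column names, functions or aliases to compare
    where: Optional[str] = None    # optional query condition

    def to_query(self) -> str:
        """Build a query returning the id and check columns, sorted by the id columns."""
        columns = ", ".join(self.id_columns + self.check_columns)
        sql = f"SELECT {columns} FROM {self.database}.{self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        return sql + " ORDER BY " + ", ".join(self.id_columns)


# Hypothetical example values.
info = TableInfo("conn", "school", "student", ["id"], ["grade", "name"], "grade = 'grade one'")
query = info.to_query()
```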
In some embodiments of the present invention, optionally, the output parameter is the result of the data check and shows the check result for each table; the result includes the data records present in the first record list but not in the second, the data records present in the second record list but not in the first, and the columns and column values that are inconsistent between the two lists. Other check result outputs may also be implemented as needed.
In some embodiments of the present invention, optionally, the parameters used for the check include, but are not limited to, the input parameters and the output parameters described above, and may also be defined as needed.
In the embodiment of the present invention, optionally, the unique identifier may consist of one or more fields that uniquely distinguish each element of a record list in a given check scenario, such as the primary key (id) or another unique index (e.g., ref, uuid). The check columns may be all fields and the query condition may be empty, so that all data are checked. The data check of the invention can thus be applied where the data are completely consistent, such as data migration and synchronization; by choosing the check columns, it can also be applied where the data are not completely consistent, such as data reconstruction, provided that the data after reconstruction are basically consistent with the data before reconstruction (that is, the data sources are the same).
In some embodiments of the present invention, optionally, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to perform the consistency check on at least one check column that has the same data type in the first record list and the second record list.
In the embodiment of the invention, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the consistency check is carried out by comparing at least one other check column with the same data type in the two list elements, and the check result is stored in the check result table.
Specifically, consider the case where the unique identifiers of the two tables are the same and the unique identification column and the check columns are plain fields of the tables:
Table 1-1 Source table (table image not reproduced in this text)
Table 1-2 Target table (table image not reproduced in this text)
Table 1-1 is the source table and table 1-2 is the target table, where id is the student id, grade is the grade, name is the student name, and hk is the household registration location; the target table holds only the grade and name information of the students in grade one.
In this case, the unique identification column is id, the check columns are grade and name, the source table query condition is grade = grade one, and the target table has no query condition.
Note: the source table may also generate multiple target tables, such as multiple grades, one target table for each grade.
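The first example can be sketched as follows. The table images are not available in this text, so the rows below are invented for illustration, as are the helper names `project` and `differing_columns`. The sketch shows how the query condition and check columns produce two comparable, sorted record lists, and how a pair of records with equal ids could be compared on the check columns only.

```python
CHECK_COLUMNS = ("grade", "name")

# Hypothetical rows standing in for tables 1-1 and 1-2.
source_table = [
    {"id": 1, "grade": "grade one", "name": "Ann", "hk": "City A"},
    {"id": 2, "grade": "grade two", "name": "Bob", "hk": "City B"},
    {"id": 3, "grade": "grade one", "name": "Cid", "hk": "City C"},
]
target_table = [
    {"id": 1, "grade": "grade one", "name": "Ann"},
    {"id": 3, "grade": "grade one", "name": "Sid"},
]


def project(rows, condition=lambda r: True):
    """Apply the query condition, keep id plus check columns, sort by id."""
    kept = [
        {"id": r["id"], **{c: r[c] for c in CHECK_COLUMNS}}
        for r in rows
        if condition(r)
    ]
    return sorted(kept, key=lambda r: r["id"])


def differing_columns(a, b):
    """List the check columns whose values differ between two records."""
    return [c for c in CHECK_COLUMNS if a[c] != b[c]]


first = project(source_table, lambda r: r["grade"] == "grade one")  # source, filtered
second = project(target_table)                                       # target, no condition
```

Here the hk column never reaches the comparison, so the two tables can be checked even though their structures differ.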
In some embodiments of the present invention, optionally, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to perform the consistency check according to a composite function formed from a plurality of check columns of the same data type in the first record list and the second record list.
In the embodiment of the invention, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, a plurality of check columns with the same data type in the two record lists can be selected as required to construct a composite function for the consistency check, and the check result is stored in the check result table.
Specifically, consider the case where the unique identifiers of the two tables are the same and a check column is a composite function of fields in the tables. The unique identification column uniquely identifies the data of the two tables, each check column is either a field to be checked or a composite function of fields to be checked, and the query condition can be set as required, so that specified data can be checked.
Table 2-1 Source table (table image not reproduced in this text)
Table 2-2 Target table (table image not reproduced in this text)
Table 2-1 is the source table and table 2-2 is the target table, where id is the student id, score is the score, name is the student name, sid is the student number, and semester is the term; the target table holds only the best score of each student.
In this case, the unique identifiers are sid and semester; the check columns are max(score), name, sid and semester; the grouping condition for the source table and the target table is grouping by sid.
Note: constructing a composite function from a plurality of check columns with the same data type in the two record lists is not limited to composite functions in the narrow sense; it also covers screening conditions or query conditions built by combining a plurality of check columns with the same data type.
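The max(score) example can be approximated in plain Python. The rows are invented because the table images are unavailable, and `best_scores` is a hypothetical helper. As an assumption of this sketch, the semester and name are taken from the row that holds the highest score for each sid; the source only states that records are grouped by sid.

```python
def best_scores(rows):
    """Group rows by sid and keep the row with the highest score in each group."""
    best = {}
    for row in rows:
        current = best.get(row["sid"])
        if current is None or row["score"] > current["score"]:
            best[row["sid"]] = row
    # Keep only the identifier and check columns, sorted by sid.
    return [
        {k: best[sid][k] for k in ("sid", "semester", "name", "score")}
        for sid in sorted(best)
    ]


# Hypothetical rows standing in for tables 2-1 and 2-2.
source_rows = [
    {"id": 1, "sid": "S01", "semester": 1, "name": "Ann", "score": 80},
    {"id": 2, "sid": "S01", "semester": 2, "name": "Ann", "score": 92},
    {"id": 3, "sid": "S02", "semester": 1, "name": "Bob", "score": 75},
]
target_rows = [
    {"sid": "S01", "semester": 2, "name": "Ann", "score": 92},
    {"sid": "S02", "semester": 1, "name": "Bob", "score": 75},
]
derived = best_scores(source_rows)
```

The derived list has the same shape as the target list, so the two can then be compared record by record on (sid, semester).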
In some embodiments of the present invention, optionally, if the parameters for checking the first record list and the second record list include a record consistency check function, and the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to customize the check range of the record consistency check function, so as to implement a consistency check on specified data.
In the embodiment of the invention, when the parameters of the two record lists include the record consistency check function and the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the consistency check of specified data can be implemented by customizing the check range of the record consistency check function.
Specifically, when the check columns of the two tables are not the same (or differ in number) and cannot be made comparable through a composite function, the unique identification column is set so that it uniquely identifies the data of the two tables, the check columns are the fields to be checked, the query condition is set as needed, and a consistency check function is defined, so that the specified data can be checked.
Table 3-1 Source table
[Table image not reproduced]
Table 3-2 Target table
[Table image not reproduced]
Table 3-1 is the source table and Table 3-2 is the target table, where id is the student id, name is the student name, and hk is the registered household location; in the target table, the district information has been removed from the household location.
In this case, the unique identifier is id, the check columns are name and hk, and there is no query condition. A record consistency check function needs to be set, as follows: only the province and city information of the two household locations is checked, and the district information is ignored.
Note: the numbers of check columns can differ; for example, multiple columns of the source table may correspond to one column of the target table, or one column of the source table may correspond to multiple columns of the target table. In such cases a user-defined consistency check function can be set so that the records can still be checked.
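A user-defined record consistency check function of the kind described above might look like the following Python sketch. It assumes, purely for illustration, that hk is a comma-separated string of province, city, and (optionally) district; the actual storage format is not given in the text, and the names `province_city` and `records_consistent` are hypothetical.

```python
def province_city(hk: str) -> tuple:
    """Return only the province and city parts of a household location.

    Assumes a comma-separated "province, city[, district]" string.
    """
    parts = [part.strip() for part in hk.split(",")]
    return tuple(parts[:2])

def records_consistent(src: dict, dst: dict) -> bool:
    """Compare name exactly and hk only at province and city level."""
    return (src["name"] == dst["name"]
            and province_city(src["hk"]) == province_city(dst["hk"]))

# Hypothetical records: the target row has no district information.
src_row = {"id": 1, "name": "Ann", "hk": "Guangdong, Shenzhen, Nanshan"}
dst_row = {"id": 1, "name": "Ann", "hk": "Guangdong, Shenzhen"}
moved_row = {"id": 1, "name": "Ann", "hk": "Guangdong, Guangzhou"}
```

Here `records_consistent(src_row, dst_row)` treats the two rows as consistent because the district is ignored, while `moved_row` differs at city level and is reported as inconsistent.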
Referring to Fig. 2, which is a second schematic flowchart of a data consistency checking method according to an embodiment of the present invention, the two record lists are obtained by filtering a source database and a target database respectively, where slist is the record list obtained by querying the source table and dlist is the record list obtained by querying the target table; slist and dlist have been sorted in the same way (both ascending or both descending) on the same unique identifier column.
sExtra: extra records of the source table (records present only in the source table); dExtra: extra records of the target table (records present only in the target table); notCons: records that are inconsistent between the two tables (both tables contain the record, but its contents differ).
The data consistency checking method comprises the following steps:
step 21: determining whether the current data reading has finished; if so, going to step 22; otherwise, going to step 23;
step 22: determining whether the remaining data of slist or dlist is 0; if so, going to step 25; otherwise, going to step 23;
step 23: taking out the first element s of slist and the first element d of dlist;
step 24: comparing the unique identifiers of element s and element d; if s < d, going to step 2411; if s = d, going to step 2421; if s > d, going to step 2431;
step 2411: adding the unique identifier of s to sExtra; going to step 2412;
step 2412: putting d back at the first element position of dlist; going to step 21;
step 2421: comparing whether the other fields of s and d are consistent; going to step 2422;
step 2422: storing the unique identifier, the field name, and the two tables' field values for each inconsistent field in notCons; going to step 21;
step 2431: adding the unique identifier of d to dExtra; going to step 2432;
step 2432: putting s back at the first element position of slist; going to step 21;
step 25: when the remaining data of dlist is not 0, adding the remaining data to dExtra;
step 26: when the remaining data of slist is not 0, adding the remaining data to sExtra.
In the embodiment of the invention, when the termination condition is that data reading has finished and the remaining data of slist or dlist is 0, the check is an asynchronous data consistency check, in which data reading is an asynchronous operation; when the termination condition is only that the remaining data of slist or dlist is 0, the check is a synchronous data consistency check.
When the termination condition is not met, the first elements s and d of slist and dlist are taken out and their unique identifier columns are compared. Because the elements of slist and dlist are ordered, when the unique identifier of s is smaller than that of d, s is an extra record of the source table and is put into sExtra, and d continues to be compared with the next element of slist; when the unique identifier of s is larger than that of d, d is an extra record of the target table and is put into dExtra, and s continues to be compared with the next element of dlist; when the unique identifiers of s and d are equal, the record consistency check function in the input parameters is used to check whether the other fields are consistent, and if not, the inconsistent record is put into notCons.
The data remaining after the termination condition is reached are extra records of the corresponding table: if the remaining data of slist is not 0, the remaining data is written to sExtra; if the remaining data of dlist is not 0, the remaining data is written to dExtra.
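The synchronous form of the flow described above can be sketched in Python as a merge-style comparison of two sorted lists. This is a minimal illustration under stated assumptions, not the patented implementation: both lists are assumed to be sorted in ascending order on the unique identifier, the function and parameter names (`check_consistency`, `key`, `is_consistent`) are hypothetical, and notCons here stores the identifier together with both whole records rather than field-level details.

```python
from collections import deque

def check_consistency(slist, dlist, key, is_consistent):
    """Compare two lists sorted ascending by key(record).

    Returns (s_extra, d_extra, not_cons):
      s_extra  - identifiers found only in slist
      d_extra  - identifiers found only in dlist
      not_cons - (identifier, s, d) for records present in both but inconsistent
    """
    s_queue, d_queue = deque(slist), deque(dlist)
    s_extra, d_extra, not_cons = [], [], []

    # Stop as soon as either queue has no remaining elements.
    while s_queue and d_queue:
        s, d = s_queue.popleft(), d_queue.popleft()
        if key(s) < key(d):
            s_extra.append(key(s))   # s exists only in the source
            d_queue.appendleft(d)    # put d back at the head of its queue
        elif key(s) > key(d):
            d_extra.append(key(d))   # d exists only in the target
            s_queue.appendleft(s)    # put s back at the head of its queue
        elif not is_consistent(s, d):
            not_cons.append((key(s), s, d))

    # Whatever remains in one queue has no counterpart in the other.
    s_extra.extend(key(r) for r in s_queue)
    d_extra.extend(key(r) for r in d_queue)
    return s_extra, d_extra, not_cons

# Hypothetical sample data.
slist = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 4, "v": "d"}, {"id": 6, "v": "f"}]
dlist = [{"id": 2, "v": "b"}, {"id": 3, "v": "c"}, {"id": 4, "v": "x"}]
result = check_consistency(slist, dlist,
                           key=lambda r: r["id"],
                           is_consistent=lambda s, d: s["v"] == d["v"])
```

Each record is taken out at most once per comparison round, so the pass is linear in the combined length of the two lists once they are sorted. For lists sorted in descending order, the two inequality branches would need to be swapped.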
Referring to Fig. 4, which is a schematic structural diagram of a data consistency checking apparatus according to an embodiment of the present invention,
the embodiment of the present invention further provides a data consistency checking apparatus 40, which includes:
an obtaining module 41, configured to obtain a first record list and a second record list; the elements in the first record list and the second record list have been sorted in the same way on the same unique identifier column; the unique identifier column of each element stores the corresponding unique identifier;
a comparing module 42, configured to, if there are remaining elements in the first record list and the second record list, take out the first element in the queue of the first record list and the first element in the queue of the second record list, and compare the unique identifiers of the current first element of the first record list and the current first element of the second record list;
an executing module 43, configured to execute at least one of the following steps until at least one of the first record list or the second record list has no remaining elements:
putting the first element with the larger unique identifier back as the first element in the queue of its record list;
classifying the first element with the smaller unique identifier according to its record list and storing it in a check result table;
when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, performing a consistency check on the two first elements according to a preset check condition, and storing the check result in the check result table;
an output module 44, configured to output the check result table when the data in the first record list and the data in the second record list are not completely consistent; the check result in the check result table is at least one of a first result for elements only in the first record list, a second result for elements only in the second record list, and a third result for elements in both the first record list and the second record list.
In the embodiment of the invention, after the two record lists requiring a data consistency check have been sorted in the same way on the same unique identifier column, the first elements of the two record lists are repeatedly taken out and their unique identifiers compared until either list has no remaining elements; the first element with the larger unique identifier is put back into its original queue, and the first element with the smaller unique identifier is classified and stored in the check result table, so that every element in the two record lists is checked for consistency. The check result in the check result table is at least one of a first result for elements only in the first record list, a second result for elements only in the second record list, and a third result for elements in both the first record list and the second record list. The apparatus thus checks data flexibly and generally, can be applied to checking both completely consistent and incompletely consistent data, and can meet the need to verify data consistency in scenarios such as data synchronization, migration, database reconstruction, and code reconstruction.
In some embodiments of the present invention, optionally, the first record list and the second record list are obtained by filtering the same source table according to different filter conditions.
In this embodiment of the present invention, the first record list and the second record list may be different record lists obtained from the source database at the current time according to different filter conditions; this is typically used to compare data within the same source database.
In some embodiments of the present invention, optionally, one of the first record list and the second record list is a source table, and the other is a new record table obtained by filtering from the source table according to a filtering condition.
In some embodiments of the present invention, optionally, the first record list and the second record list have the same unique identifier column and at least one check column with the same data type. In the embodiment of the invention, when the first record list and the second record list have the same unique identifier column and at least one check column with the same data type, a data consistency check can be performed even if their data sources differ. The unique identifier column uniquely identifies the data of the two tables, and using the same unique identifier column guarantees the ordering of the elements of the two tables. The check columns are the fields to be checked, the query condition can be set as needed, and at least one comparable check column guarantees that there is content to compare when the two lists are checked. This suits data checking scenarios involving incompletely consistent data and allows specified data to be checked when one source table corresponds to one or more target tables.
In this embodiment of the present invention, the first record list and the second record list may also be obtained by screening from different source databases, and at this time, the first record list and the second record list have the same unique identifier column and at least one check column with the same data type.
In some embodiments of the present invention, optionally, check columns with the same data type means that the data in a check column of the first record list and the data in a check column of the second record list are comparable. For example, both may be of a time type, or one may be of a time type and the other of a time string type; both are time-type data and can be compared.
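One way to compare a time-typed value with a time string is to normalize both to the same type before comparing, as in the Python sketch below. It assumes the strings are in an ISO 8601 style format such as "2021-09-08 10:30:00" and that all values are timezone-naive; the helper names `as_datetime` and `times_equal` are hypothetical.

```python
from datetime import datetime

def as_datetime(value):
    """Return value as a datetime, parsing ISO 8601 style strings."""
    if isinstance(value, datetime):
        return value
    return datetime.fromisoformat(value)

def times_equal(a, b) -> bool:
    """Compare two time values that may be datetime objects or time strings."""
    return as_datetime(a) == as_datetime(b)

# Hypothetical check-column values from the two record lists.
source_value = datetime(2021, 9, 8, 10, 30)
target_value = "2021-09-08 10:30:00"
```

With these values, `times_equal(source_value, target_value)` compares the two as the same instant even though one side is stored as a string.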
In some embodiments of the present invention, optionally, the obtaining module 41 is further configured to check parameters of the first record list and the second record list before obtaining the first record list and the second record list.
In the embodiment of the invention, the first record list and the second record list obtained by screening are subjected to parameter verification respectively, and then data consistency verification of the first record list and the second record list is carried out, so that the efficiency and the accuracy of data verification are further improved.
In some embodiments of the present invention, optionally, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal in size, the preset checking condition is to perform consistency checking according to at least one checking column of the first record list and the second record list that has the same data type.
In the embodiment of the invention, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, consistency check is carried out by comparing at least one other check column with the same data type of the two list elements, and the check result is stored in the check result table.
In some embodiments of the present invention, optionally, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to perform the consistency check according to a composite function formed from a plurality of check columns of the same data type in the first record list and the second record list.
In the embodiment of the invention, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, a plurality of check columns with the same data type in the two record lists can be selected according to self requirements to construct a composite function for consistency check, and the check result is stored in the check result list.
In some embodiments of the present invention, optionally, if the parameters checked for the first record list and the second record list include a record consistency check function, then when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is a user-defined check range of the record consistency check function, so that consistency checking of the specified data is achieved.
In the embodiment of the invention, when the parameters of the two record lists include a record consistency check function and the unique identifiers of the first element of the first record list and the first element of the second record list are equal, consistency checking of the specified data can be achieved by customizing the check range of the record consistency check function.
In addition, it should be noted that all relevant content of each step in the above embodiments of the data consistency checking method applies to the functional description of the corresponding functional module and is not repeated here.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
The electronic device 50 includes a processor 51, a memory 52, and a program stored in the memory 52 and executable on the processor 51. When executed by the processor 51, the program implements each process of any one of the above embodiments of the data consistency checking method and achieves the same technical effect; to avoid repetition, details are not repeated here.
The embodiment of the present invention further provides a readable storage medium, where a computer program is stored on the readable storage medium, and when the computer program is executed by a processor, the computer program implements each process of any one of the embodiments of the data consistency verification method described above, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here.
The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A data consistency checking method is characterized by comprising the following steps:
step s1, obtaining a first record list and a second record list; the elements in the first record list and the second record list have been sorted in the same way on the same unique identifier column; the unique identifier column of each element stores the corresponding unique identifier;
step s2, if there are residual elements in the first record list and the second record list, respectively taking out the first element in the queue of the first record list and the first element in the queue of the second record list, and comparing the unique identification sizes of the first element of the current first record list and the first element of the current second record list;
step s3, putting the first element with the larger unique identifier back as the first element in the queue of its record list;
step s4, classifying the first element with the smaller unique identifier according to its record list and storing it in a check result table;
step s5, when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, carrying out consistency check on the two first elements according to a preset check condition, and storing a check result into a check result table;
step s6, performing steps s2 to s5 until at least one of the first list of records or the second list of records has no remaining elements;
when the data of the first record list and the second record list are not completely consistent, the check result in the check result table is at least one of a first result of an element only in the first record list, a second result of an element only in the second record list, and a third result of an element in both the first record list and the second record list.
2. The data consistency checking method according to claim 1, wherein the first record list or the second record list is filtered from the same source table according to different filtering conditions.
3. The data consistency checking method according to claim 1, wherein the first record list and the second record list have the same unique identification column and at least one check column with the same data type.
4. The data consistency checking method according to claim 1, wherein before the obtaining the first record list and the second record list, the method further comprises checking parameters of the first record list and the second record list.
5. The method according to claim 1, wherein when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the predetermined checking condition is to perform a consistency check according to at least one check column of the first record list and the second record list that has a same data type.
6. The method according to claim 1, wherein when the unique identifiers of the first element of the first record list and the first element of the second record list are equal, the preset check condition is to perform the consistency check according to a composite function formed from a plurality of check columns of the first record list and the second record list that have the same data type.
7. The method according to claim 4, wherein if the parameter for verifying the first record list and the second record list includes a record consistency verification function, and the unique identifier of the first element of the first record list and the unique identifier of the first element of the second record list are equal in size, the preset verification condition is to customize a verification range of the record consistency verification function, so as to implement consistency verification on the specified data.
8. A data consistency verification apparatus, comprising:
an obtaining module, configured to obtain a first record list and a second record list; the elements in the first record list and the second record list have been sorted in the same way on the same unique identifier column; the unique identifier column of each element stores the corresponding unique identifier;
a comparing module, configured to, if there are remaining elements in the first record list and the second record list, respectively take out a first element in a queue where the first record list is located and a first element in a queue where the second record list is located, and compare unique identifier sizes of the first element of the current first record list and the first element of the current second record list;
an execution module to perform at least one of the following steps until at least one of the first record list or the second record list has no remaining elements:
putting the first element with the larger unique identifier back as the first element in the queue of its record list;
classifying the first element with the smaller unique identifier according to its record list and storing it in a check result table;
when the unique identifiers of the first element of the first record list and the first element of the second record list are equal in size, consistency check is carried out on the two first elements according to a preset check condition, and a check result is stored in a check result table;
the output module is used for outputting a verification result table when the data of the first record list and the data of the second record list are not completely consistent; the check result in the check result table is at least one of a first result of the elements in only the first record list, a second result of the elements in only the second record list, and a third result of the elements in both the first record list and the second record list.
9. An electronic device, comprising: processor, memory and program stored on the memory and executable on the processor, which when executed by the processor implements the steps of the data consistency checking method according to any of the claims 1 to 7.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a program which, when executed by a processor, implements the steps of the data consistency checking method according to any one of claims 1 to 7.
CN202111046391.7A 2021-09-08 2021-09-08 Data consistency checking method and device, electronic equipment and readable storage medium Active CN113495928B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111046391.7A CN113495928B (en) 2021-09-08 2021-09-08 Data consistency checking method and device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN113495928A true CN113495928A (en) 2021-10-12
CN113495928B CN113495928B (en) 2021-11-09

Family

ID=77996006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111046391.7A Active CN113495928B (en) 2021-09-08 2021-09-08 Data consistency checking method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN113495928B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021132A (en) * 2013-12-08 2014-09-03 郑州正信科技发展股份有限公司 Method and system for verification of consistency of backup data of host database and backup database
CN104346377A (en) * 2013-07-31 2015-02-11 克拉玛依红有软件有限责任公司 Method for integrating and exchanging data on basis of unique identification
CN108256076A (en) * 2018-01-18 2018-07-06 广州大学 Distributed mass data processing method and processing device
CN112835972A (en) * 2019-11-22 2021-05-25 北京中电普华信息技术有限公司 Method and system for synchronizing unstructured data

Also Published As

Publication number Publication date
CN113495928B (en) 2021-11-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant