CN107798007A - Method, apparatus and related device for data verification in a distributed database - Google Patents

Info

Publication number
CN107798007A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610794307.2A
Other languages
Chinese (zh)
Other versions
CN107798007B (en
Inventor
郭龙波
丁岩
徐宜良
张宗禹
林周凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinzhuan Xinke Co Ltd
Original Assignee
Nanjing ZTE New Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing ZTE New Software Co Ltd
Priority to CN201610794307.2A
Publication of CN107798007A
Application granted
Publication of CN107798007B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273 Asynchronous replication or reconciliation
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity

Abstract

The invention discloses a method, an apparatus and a related device for online data verification in a distributed database. By comparing whether the check values of specified rows of the data to be changed are consistent before and after import, the invention determines whether the data to be changed is consistent before and after the change. This solves the prior-art problem that a distributed database cannot determine whether data is consistent before and after a change.

Description

Method, apparatus and related device for data verification in a distributed database
Technical Field
The present invention relates to the field of communication technology, and in particular to a method, an apparatus and a related device for data verification in a distributed database.
Background
With the wide application of database technology and the continued accumulation of online business data, and in particular the rapid growth of Internet services, data volumes grow day by day, and the performance of a single-machine database has increasingly become the bottleneck of online services. Because a distributed database can provide high-performance, high-capacity, high-concurrency database services, distributed databases have quickly been adopted in a wide range of online service scenarios.
However, during data migration and data initialization, existing distributed databases cannot determine whether the data is consistent before and after the change, which limits the application of distributed databases.
Summary of the Invention
The present invention provides a method, an apparatus and a related device for data verification in a distributed database, to solve the prior-art problem that a distributed database cannot determine whether data is consistent before and after a change.
One aspect of the present invention provides a method for data verification in a distributed database, comprising:
exporting the data to be changed into a data description text, and calculating, from the exported data description text, the check value of the specified rows of the data to be changed;
splitting the data to be changed by row, and importing the split data to be changed into the corresponding database nodes;
after the data import is complete, calculating the check value of the specified rows of the imported data to be changed, and comparing whether the check values of the specified rows are consistent before and after the import; if they are consistent, determining that the data to be changed is consistent before and after the change.
Further, calculating the check value of the specified rows of the data to be changed specifically comprises:
calculating the check value of a specified single row of the data to be changed, or calculating the sum of the check values of one or more specified groups of N consecutive rows of the data to be changed.
Further, when the specified row is a single row, calculating the check value of the specified rows of the imported data to be changed specifically comprises: calculating the check value of the specified single row of the imported data to be changed; and comparing whether the check values of the specified rows are consistent specifically comprises: comparing the check values of the specified single row of the data to be changed before and after the import;
when the specified rows are one or more groups of N consecutive rows, calculating the check value of the specified rows of the imported data to be changed specifically comprises: calculating the sum of the check values of the one or more specified groups of N consecutive rows of the imported data to be changed; and comparing whether the check values of the specified rows are consistent specifically comprises: comparing the sums of the check values of the one or more specified groups of N consecutive rows of the data to be changed before and after the import.
Further, after splitting the data to be changed by row and before importing the split data to be changed into the corresponding database nodes, the method further comprises:
obtaining, according to the distributed distribution rules, the database node on which each piece of the split data to be changed should be stored.
Further, importing the split data to be changed into the corresponding database nodes specifically comprises:
writing the split data to be changed into the file cache of the corresponding database node, notifying the database cluster management of the number and names of the completed files, and having the database cluster management trigger the database agent to download the data to be changed stored in the file cache into the database node;
wherein the database agents correspond one-to-one with the database nodes.
Further, the data to be changed includes data to be initialized, data to be migrated and data to be redistributed.
Another aspect of the present invention provides an apparatus for data verification in a distributed database, comprising:
a first computing unit, configured to export the data to be changed into a data description text and to calculate, from the exported data description text, the check value of the specified rows of the data to be changed;
an import unit, configured to split the data to be changed by row and to import the split data to be changed into the corresponding database nodes;
a second computing unit, configured to calculate, after the data import is complete, the check value of the specified rows of the imported data to be changed;
a comparing unit, configured to compare whether the check values of the specified rows of the data to be changed are consistent before and after the import and, if they are consistent, to determine that the data to be changed is consistent before and after the change.
Further, the first computing unit is also configured to calculate the check value of a specified single row of the data to be changed, or to calculate the sum of the check values of one or more specified groups of N consecutive rows of the data to be changed.
Further, the second computing unit is also configured to: when the specified row is a single row, calculate the check value of the specified single row of the imported data to be changed; and when the specified rows are one or more groups of N consecutive rows, calculate the sum of the check values of the one or more specified groups of N consecutive rows of the imported data to be changed;
the comparing unit is also configured to: when the specified row is a single row, compare the check values of the specified single row of the data to be changed before and after the import; and when the specified rows are one or more groups of N consecutive rows, compare the sums of the check values of the one or more specified groups of N consecutive rows of the data to be changed before and after the import.
Further, the import unit comprises:
a splitting module, configured to split the data to be changed by row;
an acquisition module, configured to obtain, according to the distributed distribution rules, the database node on which each piece of the split data to be changed should be stored;
an import module, configured to import the split data to be changed into the corresponding database nodes.
Further, the import unit comprises:
a splitting module, configured to split the data to be changed by row;
an import module, configured to write the split data to be changed into the file cache of the corresponding database node, notify the database cluster management of the number and names of the completed files, and have the database cluster management trigger the database agent to download the data to be changed stored in the file cache into the database node, wherein the database agents correspond one-to-one with the database nodes.
Further, the data to be changed includes data to be initialized, data to be migrated and data to be redistributed.
A further aspect of the present invention provides a database cluster server provided with any of the above apparatuses for data verification in a distributed database.
The beneficial effects of the present invention are as follows:
By comparing whether the check values of the specified rows of the data to be changed are consistent before and after import, the present invention determines whether the data to be changed is consistent before and after the change, and thereby effectively solves the prior-art problem that a distributed database cannot determine whether data is consistent before and after a change.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a method for data verification in a distributed database according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another method for data verification in a distributed database according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for data verification in a distributed database according to an embodiment of the present invention;
Fig. 4 is an architecture diagram of a system for online data migration according to an embodiment of the present invention.
Detailed Description
To solve the prior-art problem that a distributed database cannot determine whether data is consistent before and after a change, the present invention provides a method, an apparatus and a related device for data verification in a distributed database. The invention compares whether the check values of the rows to be verified are consistent before and after import, or compares whether the number of rows to be verified and the sum of the check values of those rows are consistent before and after import. In this way it accurately determines whether the data to be changed is consistent before and after the change, and effectively solves the prior-art problem described above. The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.
Method Embodiment
An embodiment of the present invention provides a method for data verification in a distributed database. The method is executed by a database cluster server. Referring to Fig. 1, the method comprises:
S101: exporting the data to be changed into a data description text, and calculating, from the exported data description text, the check value of the specified rows of the data to be changed;
S102: splitting the data to be changed by row, and importing the split data to be changed into the corresponding database nodes;
S103: after the data import is complete, calculating the check value of the specified rows of the imported data to be changed;
S104: comparing whether the check values of the specified rows of the data to be changed are consistent before and after the import; if they are consistent, determining that the data to be changed is consistent before and after the change.
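The four steps S101 to S104 can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch under stated assumptions: the check value is taken to be the sum of the byte values of the row text, the placement rule (line index modulo the number of nodes) is a stand-in for the unspecified distribution rules, and all function names are hypothetical rather than taken from the patent.

```python
from collections import defaultdict


def row_check_value(row: str) -> int:
    # Assumed check value: sum of the byte values of the row text.
    return sum(row.encode("utf-8"))


def check_before_import(rows, selected):
    # S101: check value of the specified rows, computed from the exported text.
    return sum(row_check_value(rows[i]) for i in selected)


def import_rows(rows, node_count):
    # S102: split by row and place each row on a node.
    # Hypothetical placement rule: line index modulo node count.
    nodes = defaultdict(dict)
    for i, row in enumerate(rows):
        nodes[i % node_count][i] = row
    return nodes


def check_after_import(nodes, selected):
    # S103: check value of the same specified rows, read back from the nodes.
    return sum(
        row_check_value(row)
        for stored in nodes.values()
        for i, row in stored.items()
        if i in selected
    )


def is_consistent(rows, nodes, selected):
    # S104: the change is considered consistent when both values match.
    return check_before_import(rows, selected) == check_after_import(nodes, selected)
```

In this sketch `selected` is a set of zero-based line indices; a real deployment would read the rows back through the database agents rather than from a dictionary.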
That is, by comparing whether the check values of the specified rows of the data to be changed are consistent before and after import, the present invention determines whether the data to be changed is consistent before and after the change, and effectively solves the prior-art problem that a distributed database cannot determine whether data is consistent before and after a change.
In a specific implementation, step S101 of the embodiment specifically comprises: calculating the check value of a specified single row of the data to be changed, or calculating the sum of the check values of one or more specified groups of N consecutive rows of the data to be changed.
That is, the present invention can use simple sampling, calculating the check value of a specified single row of the data to be changed, or sampling over a wider range, calculating the sum of the check values of one or more specified groups of N consecutive rows, to compare whether the data to be changed is consistent before and after import.
It should be noted that the scheme of calculating the sum of the check values of one or more specified groups of N consecutive rows includes the case in which all rows of the data to be changed are verified.
Specifically, when the specified row is a single row, calculating the check value of the specified rows of the imported data to be changed specifically comprises: calculating the check value of the specified single row of the imported data to be changed; and comparing whether the check values of the specified rows are consistent specifically comprises: comparing the check values of the specified single row of the data to be changed before and after the import;
For example, if all even-numbered rows are specified for verification, the present invention calculates the check values of all even-numbered rows of the data to be changed before and after import, and compares them to determine whether the data to be changed is consistent before and after import.
When the specified rows are one or more groups of N consecutive rows, calculating the check value of the specified rows of the imported data to be changed specifically comprises: calculating the sum of the check values of the one or more specified groups of N consecutive rows of the imported data to be changed; and comparing whether the check values of the specified rows are consistent specifically comprises: comparing the sums of the check values of the one or more specified groups of N consecutive rows of the data to be changed before and after the import.
For example, if all rows are specified for verification, the present invention calculates the check values of all rows of the data to be changed before and after import, and compares them to determine whether the data to be changed is consistent before and after import.
In a specific implementation, the embodiment stores the rows of the data to be changed from step S101, together with the check value of each such row, in a preset check table. It may of course also store the number of rows of the data to be changed and the sum of the check values of those rows in the check table, for use in consistency verification after the import.
That is, without affecting the operation of existing services, the present invention imports the data into the corresponding nodes according to preset data distribution rules, and uses two means, full-row verification and sampled-row verification, to ensure that the data imported into the database is strongly consistent with the original data. The number of sampled rows is configurable.
It should be noted that when the data to be changed is exported into the data description text, each row is delimited, and each row is distributed to a database node according to where that row is located. After the import is complete, the check value of the imported data is obtained for the rows recorded in the preset check table as needing verification (for example the even-numbered rows, i.e. sampled verification) or for a preset number of rows (the number can be set arbitrarily, or can cover all rows, i.e. full verification). This check value is compared with the check value recorded in the check table; if the two are identical, the data is considered consistent before and after import.
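A check table supporting both full and sampled verification, as described above, might look like the following sketch. The table layout (a mapping from 1-based line number to check value), the `mode` and `step` parameters, and the byte-sum check value are assumptions made for illustration; the patent does not specify them.

```python
def build_check_table(rows, mode="full", step=2):
    """Return {line_number: check_value} for the rows selected for verification.

    mode="full" records every row; any other mode records every `step`-th row
    (step=2 selects the even-numbered rows, as in the example in the text).
    """
    table = {}
    for line_no, row in enumerate(rows, start=1):
        if mode == "full" or line_no % step == 0:
            # Assumed check value: sum of the byte values of the row text.
            table[line_no] = sum(row.encode("utf-8"))
    return table


def verify_against_table(table, imported_rows):
    """Compare imported rows ({line_number: row}) with the recorded check values."""
    for line_no, expected in table.items():
        if line_no not in imported_rows:
            return False  # a row that should be verified is missing
        if sum(imported_rows[line_no].encode("utf-8")) != expected:
            return False
    return True
```

With sampled verification, rows that are not in the check table are never compared, so the sampling step trades verification coverage against verification cost.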
In the embodiment of the present invention, the data to be changed includes data to be initialized, data to be migrated and data to be redistributed. That is, the present invention can verify data consistency before and after changes such as data initialization, data migration and data redistribution. The whole verification process does not need to lock the current database, and locating the database node for each row, distributing the data and checking the data can all be carried out independently. Only a small amount of database server I/O is consumed when data is imported into a single node, so the impact on online services is small.
In the embodiment of the present invention, exporting the data to be changed into the data description text specifically comprises:
Before data migration or data initialization, the data to be migrated must be exported into a data description text. Both cross-database sources and the current distributed database are supported. In the cross-database case, the database tables to be migrated must be exported to a text description file using the syntax of the source database; the current distributed database can export its distributed data to a text file through LoadServer.
The embodiment of the present invention calculates the check values of the rows of the data to be changed that need to be verified, or calculates the number of rows to be verified and the sum of the check values of those rows. The rows and their check values, or the row count and the sum of the check values, are stored in the preset check table for subsequent consistency verification.
In a specific implementation, the present invention reads the text into memory row by row according to the text description rules, calculates the ASCII value of the current row (i.e. the data check value mentioned above) and keeps it in memory. When multiple consecutive rows need to be verified, adding up the ASCII values of the individual rows gives the sum of the ASCII values of those rows.
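The text does not say exactly how the "ASCII value" of a row is obtained. One plausible reading, used in the sketch below purely as an assumption, is the sum of the character codes of the row; the function names are hypothetical.

```python
def ascii_value(row: str) -> int:
    # Assumed interpretation: add up the character codes of the row text.
    return sum(ord(ch) for ch in row)


def ascii_sum(rows, start, n):
    # Sum of the ASCII values of n consecutive rows beginning at index `start`.
    return sum(ascii_value(row) for row in rows[start:start + n])
```

For instance, `ascii_value("AB")` is 65 + 66 = 131. Under this reading a plain sum does not change when characters within a row are reordered, so an implementation that needs to detect such changes would have to use a stronger check value.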
In the embodiment of the present invention, after the data to be changed is split and before the split data to be changed is imported into the corresponding database nodes, the method further comprises:
obtaining, according to the distributed distribution rules, the database node on which each piece of the split data to be changed should be stored.
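The distribution rules themselves are not spelled out in the text. The sketch below assumes one common choice, hashing a key column and taking the result modulo the number of nodes; the key position, the separator and the use of CRC-32 are all illustrative assumptions.

```python
import zlib


def node_for_row(row: str, node_count: int, key_index: int = 0, sep: str = ",") -> int:
    # Hypothetical distribution rule: hash the key column, modulo the node count.
    key = row.split(sep)[key_index]
    # zlib.crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % node_count
```

Rows sharing the same key always land on the same node, which is what allows the import and the later per-node verification to proceed independently.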
In the embodiment of the present invention, importing the split data to be changed into the corresponding database nodes specifically comprises:
writing the split data to be changed into the file cache of the corresponding database node, notifying the database cluster management of the number and names of the completed files, and having the database cluster management trigger the database agent to download the data to be changed stored in the file cache into the database node;
wherein the database agents correspond one-to-one with the database nodes.
In a specific implementation, the present invention writes the current row into the file cache of the corresponding database node. Once the cache is full, or once the number of rows per file that the configuration file requires for the current node has been reached, the data is written to a file and a new file to be written is generated.
After a certain number of files have been generated, the database cluster, based on the number and names of the completed files, notifies the database agent to download the corresponding files to the database server and import them into the corresponding database.
After the data import is complete, the database cluster initiates verification. A verification request is sent to the database agent through the database cluster management, and the check values of the rows stored on the current database node, as counted by the database agent, are obtained; here the database agent is the one corresponding to the database node that stores the rows of the data to be changed. Alternatively, according to the rows of the data to be changed, a verification request is sent to the database agent through the database cluster management, and the number of rows on the current database node and the sum of the corresponding check values, as counted by the database agent, are obtained; here too the database agent is the one corresponding to the database node that stores the rows of the data to be changed, and the database agents correspond one-to-one with the database nodes.
Specifically, after the data import is complete, the database cluster initiates the data check flow and distributes the data check request to the database agents (DBAgent) of all database nodes of the current check table. Each DBAgent counts the number of rows of the current table and the ASCII value of the data of the current check table. After the results fed back by each node are received, the row counts and the check values are compared. If the row counts are identical and the sampled check values are identical, the data consistency verification passes and the data migration is reported as successful.
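The final comparison step can be sketched as an aggregation over the results reported by each node's agent. The `AgentReport` structure and its field names are hypothetical; the text only states that each DBAgent reports a row count and an ASCII value.

```python
from dataclasses import dataclass


@dataclass
class AgentReport:
    # Hypothetical shape of what one DBAgent feeds back for the current table.
    node_id: int
    row_count: int
    check_sum: int


def migration_consistent(expected_rows: int, expected_sum: int, reports) -> bool:
    # Aggregate the per-node results and compare them with the values
    # recorded before the import.
    total_rows = sum(r.row_count for r in reports)
    total_sum = sum(r.check_sum for r in reports)
    return total_rows == expected_rows and total_sum == expected_sum
```

Because both quantities are plain sums, each node can compute its share independently and the cluster only has to add up a handful of integers.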
Fig. 2 is a schematic flowchart of another method for data verification in a distributed database according to an embodiment of the present invention. The method of the present invention is explained in detail below with reference to Fig. 2:
S201: start;
S202: data export;
The data to be changed is exported into a data description text, and the ASCII value of the current row, or the sum of the ASCII values of the current rows, is calculated;
S203: data import and generation of verification data;
Specifically, this step comprises: writing the data to be changed into the file cache of the corresponding database node, notifying the database cluster management of the number and names of the completed files, and having the database cluster management trigger the database agent to download the data to be changed stored in the file cache into the database node; after the data import is complete, sending a verification request to the database agent through the database cluster management, and obtaining the check values of the rows stored on the current database node as counted by the database agent;
S204: data check;
The data check values (or the sums of the data check values) of the rows to be verified are compared before and after import. If they are consistent, the data to be changed is determined to be consistent before and after the change.
S205: end.
The method of the present invention is explained in further detail below through a specific example. The method comprises:
Stage one, data file generation:
Before data migration or data initialization, the data to be migrated must be exported into a data description text. Both cross-database sources and the current distributed database are supported. In the cross-database case, the database tables to be migrated must be exported to a text description file using the syntax of the source database; the current distributed database can export its distributed data to a text file through the database cluster.
Stage two, data migration:
Read the text into memory row by row according to the text description rules, calculate the ASCII value of the current row and keep it in memory;
Obtain the database node on which the current row should be stored according to the distributed distribution rules;
Write the current row into the file cache of the corresponding database node. Once the cache is full, or once the number of rows per file that the configuration file requires for the current node has been reached, write the data to a file and generate a new file to be written;
After a certain number of files have been generated, the database cluster notifies DBAgent to download the corresponding files to the database server and import them into the corresponding database;
Repeat the above steps until all data has been imported into the distributed database;
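The file-cache step of stage two (buffer rows per node, roll over to a new file once the configured row count is reached, and keep a list of completed files for the later notification) can be sketched as follows. The class name, the file naming scheme and the `max_rows` parameter are assumptions for illustration only.

```python
from pathlib import Path


class NodeFileCache:
    """Buffers rows for one database node and rolls over to a new file when full."""

    def __init__(self, out_dir, node_id: int, max_rows: int):
        self.out_dir = Path(out_dir)
        self.node_id = node_id
        self.max_rows = max_rows   # assumed to come from the configuration file
        self.buffer = []
        self.completed = []        # names of finished files, reported to the cluster

    def write(self, row: str) -> None:
        self.buffer.append(row)
        if len(self.buffer) >= self.max_rows:
            self.flush()

    def flush(self) -> None:
        # Write the buffered rows to a new file and start an empty buffer.
        if not self.buffer:
            return
        name = f"node{self.node_id}_{len(self.completed):04d}.txt"
        (self.out_dir / name).write_text("\n".join(self.buffer) + "\n", encoding="utf-8")
        self.completed.append(name)
        self.buffer = []
```

After the last row has been written, `flush()` should be called once more so that a partially filled buffer also ends up in a file; `len(cache.completed)` and `cache.completed` then give the number and names of the completed files.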
Stage three, data consistency verification:
After the database cluster receives the requests indicating that all data imports are complete, it initiates the data check flow.
The database cluster distributes the data check request to the DBAgent of every database node of the current table, and each DBAgent counts the number of rows of the current table and the ASCII value of the current table's data.
After the results fed back by each node are received, the row counts and the check values are compared. If the row counts are identical and the sampled check values are identical, the data consistency verification passes and the data migration is reported as successful.
The present invention is described in detail below through a specific example in which a DB2 database is migrated to a MariaDB distributed cluster database:
Export data: use the method provided by DB2 to export the data to an external file;
Generate the check table: generate the check table from the configured number of rows to verify and the number of rows in the file (full verification and sampled verification are both supported);
File splitting: read the data file row by row and calculate the home node of the current row according to the distribution rules. Determine whether the current row needs to be verified; if so, calculate the ASCII value of the current row, accumulate it into the check result, generate the SQL statement that locates the current row in the database, and write that statement into the verification SQL file of the current group. Repeat until the whole file has been read, and count the number of rows in the current file;
Data import: import the split data files into the corresponding node databases through the database agent DBAgent;
Data check: after all data has been imported, the database cluster initiates the data check flow and compares whether the row count and the total check value of the current file are consistent with the total number of imported rows and the total check value in the database. If they are consistent, the data is consistent before and after the migration and the data migration is complete; if they are inconsistent, the migration must be performed again;
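The file-splitting step of this example can be sketched as a single pass over the exported lines. Everything specific here is an assumption made for illustration: the rows are comma-separated with an integer key in the first field, the home node is the key modulo the node count, every `sample_every`-th line is verified, and the table and column names `t` and `id` are placeholders.

```python
def split_rows(lines, node_count, sample_every, table="t", key_col="id"):
    per_node_rows = {n: [] for n in range(node_count)}
    per_node_sql = {n: [] for n in range(node_count)}   # verification SQL per group
    check_total = 0
    line_count = 0
    for line_no, line in enumerate(lines, start=1):
        line_count = line_no
        key = int(line.split(",", 1)[0])        # assumed integer key in field 1
        node = key % node_count                 # hypothetical distribution rule
        per_node_rows[node].append(line)
        if line_no % sample_every == 0:         # does this row need verification?
            # Accumulate the assumed check value (sum of byte values).
            check_total += sum(line.encode("utf-8"))
            # Statement that locates this row on its node after the import.
            per_node_sql[node].append(
                f"SELECT * FROM {table} WHERE {key_col} = {key};"
            )
    return per_node_rows, per_node_sql, check_total, line_count
```

The returned `check_total` and `line_count` correspond to the file-side values that the data check step compares with the totals gathered from the nodes; with `sample_every=1` the pass performs full verification.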
The present invention is described in detail below through a specific example based on backup and restore of MariaDB distributed cluster data:
Obtain the full data: use the distributed database's utility to export the distributed database data to a text file;
Generate the list of rows to verify: generate the check table from the configured number of rows to verify and the number of rows in the file (full verification and sampled verification are both supported);
Split the original file: read the data file row by row and calculate the home node of the current row according to the distribution rules. Determine whether the current row needs to be verified; if so, calculate the ASCII value of the current row, accumulate it into the check result, generate the SQL statement that locates the current row in the database, and write that statement into the verification SQL file of the current group. Repeat until the whole file has been read, and count the number of rows in the current file;
Data restore: import the split data files into the corresponding node databases through the database agent DBAgent;
Data check: after all data has been imported, the database cluster initiates the data check flow and compares whether the row count and the total check value of the current file are consistent with the total row count and the total check value on the new nodes. If they are consistent, the data is consistent before and after the backup and restore, and the full data restore is complete. If they are inconsistent, the data restore process must be performed again.
Compared with existing distributed database techniques in the industry, the present invention has the following beneficial effects:
1. Good performance. The basic data needed for verification is prepared during the data migration itself, so no separate step for preparing verification data is needed, which greatly shortens the duration of the data migration;
2. No disturbance to online services. The present invention does not need to add virtual rows to the original table being verified and does not need to lock the table, so the impact on online services is minimal;
3. Flexible verification. The present invention supports both sampled verification and full verification, so the completion time of a data migration task can be shortened by assigning appropriate verification levels to different tables;
4. Support for verification in cross-database data migration. The entry point of data migration in the present invention is the data description text. Every database supports exporting its data to a text description file, and a distributed database can export its text through the database cluster server.
Device embodiment
An embodiment of the present invention provides a device for data verification in a distributed database. Referring to Fig. 3, the device includes: a first computing unit, configured to export the data to be changed as a data description text and to calculate, from the exported data description text, the check value of specified row data in the data to be changed; an import unit, configured to split the data to be changed by row and to import the split data into the corresponding database nodes; a second computing unit, configured to calculate, after the data import is complete, the check value of the specified row data in the imported data; and a comparing unit, configured to compare whether the check values of the specified row data are consistent before and after the import and, if they are, to determine that the data to be changed is consistent before and after the change.
That is, by comparing whether the check values of the specified row data in the data to be changed are consistent before and after the import, the present invention determines whether the data is consistent before and after the change. This effectively solves the problem that prior-art distributed databases cannot determine data consistency before and after a data change.
Further, in this embodiment the first computing unit is also configured to calculate the check value of a specified row of data in the data to be changed, or to calculate the sum of the check values of one or more specified groups of N consecutive rows of data in the data to be changed.
That is, the present invention can compare the data before and after the import either by a simple sample, calculating the check value of one specified row, or by a wider sample, calculating the sum of the check values of one or more specified groups of N consecutive rows.
Further, in this embodiment the second computing unit is also configured to: when the specified rows are a single row of data, calculate the check value of that row in the imported data; and when the specified rows are one or more groups of N consecutive rows, calculate the sum of the check values of those groups of N consecutive rows in the imported data;
The comparing unit is also configured to: when the specified rows are a single row of data, compare the check values of that row in the data to be changed before and after the import; and when the specified rows are one or more groups of N consecutive rows, compare the sums of the check values of those groups of N consecutive rows in the data to be changed before and after the import.
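The two sampling granularities described above, a single specified row or one or more groups of N consecutive rows, can be illustrated with a short sketch. The function names and the choice of the UTF-8 byte sum as the check value are assumptions made for illustration; the source does not fix the check function.

```python
def row_check_value(row):
    """Assumed check value of one row: sum of its UTF-8 byte values."""
    return sum(row.encode("utf-8"))


def block_check_sum(rows, starts, n):
    """Sum of check values over one or more groups of n consecutive rows.

    rows   -- list of row strings
    starts -- 0-based index of the first row of each group
    n      -- number of consecutive rows in each group
    """
    total = 0
    for start in starts:
        for row in rows[start:start + n]:
            total += row_check_value(row)
    return total


def consistent(before_rows, after_rows, starts, n):
    """Compare the same groups of rows before and after the import."""
    return block_check_sum(before_rows, starts, n) == block_check_sum(after_rows, starts, n)
```

Calling `block_check_sum(rows, [i], 1)` reduces to the single-row case, so one routine can serve both modes.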
It should be noted that in this embodiment the data to be changed includes data to be initialized, data to be migrated and data to be redistributed. That is, the present invention can verify data consistency before and after data changes such as data initialization, data migration and data redistribution. Because the whole verification process does not lock the current database, and locating the database node for each row, distributing the data and verifying the data can all be carried out independently, database server I/O is consumed only when data is imported into an individual node, so the impact on online services is small.
Further, the import unit includes: a splitting module, which splits the data to be changed by row; an acquisition module, which obtains, according to the distributed distribution rule, the database node on which each part of the split data should be stored; and an import module, which imports the split data into the corresponding database nodes.
That is, without affecting running services, the present invention imports the data into the corresponding nodes according to the preset data distribution rule, and ensures strong consistency between the imported data and the original data by two means, verification of the total row count and sampled verification of row data; the number of sampled rows to verify is configurable.
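As an illustration of the splitting and acquisition modules, the sketch below routes each row to a home node. The distribution rule shown (a CRC32 hash of the first comma-separated field, modulo the node count) is only an assumed example; the source says a preset distribution rule is used but does not specify it, and the function names are hypothetical.

```python
import zlib
from collections import defaultdict


def home_node(row, node_count, key_index=0, sep=","):
    """Return the node index for a row under the assumed hash rule.

    zlib.crc32 is used instead of the built-in hash() because it gives
    the same result in every process, which a distribution rule needs.
    """
    key = row.split(sep)[key_index]
    return zlib.crc32(key.encode("utf-8")) % node_count


def split_by_node(rows, node_count):
    """Group rows by home node, preserving their original order per node."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[home_node(row, node_count)].append(row)
    return dict(buckets)
```

Because every row lands on exactly one node, the per-node row counts always add up to the row count of the exported file, which is the property the row count verification relies on.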
Further, the import unit alternatively includes: a splitting module, which splits the data to be changed by row; and an import module, which writes the split data into the file cache of the corresponding database node, notifies the database cluster management of the number and names of the completed files, and, through the database cluster management, triggers the database agents to download the data stored in the file cache to the database nodes, the database agents corresponding one-to-one with the database nodes.
In a specific implementation, the present invention writes the current data row into the file cache of the corresponding database node. When the cache is full, or when the number of rows stored for the current node reaches the limit set in the configuration file, the data is written to a file and a new file to be written is created;
After a certain number of files have been generated, the database cluster server, using the completed file count and file name list, has the database agents download the corresponding files to the database servers and import them into the corresponding databases.
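The per-node file cache can be sketched as a small buffer that is flushed to a new file whenever a configured row limit is reached. The class name, the file naming scheme and the `max_rows` parameter are assumptions for illustration; the notification to the database cluster management is represented only by the `completed` list of file names.

```python
import tempfile
from pathlib import Path


class NodeFileCache:
    """Buffers rows for one database node and writes them out in files."""

    def __init__(self, out_dir, node_id, max_rows):
        self.out_dir = Path(out_dir)
        self.node_id = node_id
        self.max_rows = max_rows  # assumed configured rows-per-file limit
        self.buffer = []
        self.completed = []  # names of files that are ready to download

    def add(self, row):
        """Cache one row; flush when the configured limit is reached."""
        self.buffer.append(row)
        if len(self.buffer) >= self.max_rows:
            self.flush()

    def flush(self):
        """Write buffered rows to a new file and record its name."""
        if not self.buffer:
            return
        name = f"node{self.node_id}_{len(self.completed):04d}.txt"
        text = "\n".join(self.buffer) + "\n"
        (self.out_dir / name).write_text(text, encoding="utf-8")
        self.completed.append(name)
        self.buffer = []


def demo():
    """Write five rows with a two-row limit and return the file names."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = NodeFileCache(tmp, node_id=0, max_rows=2)
        for row in ["1,a", "2,b", "3,c", "4,d", "5,e"]:
            cache.add(row)
        cache.flush()  # write the final, partially filled buffer
        return list(cache.completed)
```

In this sketch the length of `completed` and its contents stand in for the completed file count and file name list that the import module reports.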
In a specific implementation, the second computing unit of this embodiment sends a verification request to the database agents through the database cluster management and obtains the data check value, as counted by each database agent, of the data rows stored on its database node, where the database agent is the one corresponding to the database node that stores the rows of the data to be changed. Alternatively, according to the rows of the data to be changed, the second computing unit sends a verification request to the database agents through the database cluster management and obtains the row count and the sum of the corresponding data check values counted by each database agent for its database node, where the database agent is the one corresponding to the database node that stores the rows of the data to be changed, and the database agents correspond one-to-one with the database nodes.
Fig. 4 is a schematic diagram of the architecture of the online data migration system of an embodiment of the present invention. As shown in Fig. 4, after the data import is complete, the comparing unit initiates the data verification flow and distributes the data verification request to the database agents (DBAgent) of all database nodes of the table being verified. Each DBAgent counts the rows of the current table and computes the ASCII value of the data being verified. After receiving the results fed back by every node, the comparing unit compares the row counts and the data check values. If the row counts are identical and the sampled check values are identical, the data consistency verification passes and a successful data migration is reported.
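The final comparison can be sketched as an aggregation over the results fed back by the database agents. The `AgentReport` structure and the function name are hypothetical; how each DBAgent actually computes its figures (for example by running the row-locating SQL statements generated during the split) is outside this sketch.

```python
from dataclasses import dataclass


@dataclass
class AgentReport:
    """Figures fed back by one database agent for the table being verified."""
    node_id: int
    row_count: int  # rows of the table stored on this node
    check_sum: int  # sum of check values of the verified rows on this node


def verify_import(expected_rows, expected_check_sum, reports):
    """Return True when the totals over all nodes match the pre-import totals."""
    total_rows = sum(r.row_count for r in reports)
    total_check = sum(r.check_sum for r in reports)
    return total_rows == expected_rows and total_check == expected_check_sum
```

A `False` result corresponds to the case in which the import has to be repeated.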
For related content of the device of the present invention, refer to the method embodiment; details are not repeated here.
Server embodiment
An embodiment of the present invention provides a database cluster server, which includes any one of the devices for distributed database data verification described in the device embodiment.
For related content of this embodiment, refer to the device embodiment and the method embodiment; details are not repeated here.
The present invention can achieve at least the following beneficial effects:
By comparing whether the data check values of the rows to be verified are consistent before and after the import, or by comparing whether the number of rows to be verified and the sum of the corresponding data check values are consistent before and after the import, the present invention accurately determines whether the data to be changed is consistent before and after the change. This effectively solves the problem that prior-art distributed databases cannot determine data consistency before and after a data change.
Although preferred embodiments of the present invention have been disclosed for illustrative purposes, those skilled in the art will recognize that various improvements, additions and substitutions are possible; therefore, the scope of the present invention should not be limited to the above embodiments.

Claims (13)

  1. A method for data verification in a distributed database, characterized by comprising:
    exporting data to be changed as a data description text, and calculating, according to the exported data description text, the check value of specified row data in the data to be changed;
    splitting the data to be changed by row, and importing the split data to be changed into the corresponding database nodes;
    after the data import is complete, calculating the check value of the specified row data in the imported data to be changed, and comparing whether the check values of the specified row data in the data to be changed are consistent before and after the import; if they are consistent, determining that the data to be changed is consistent before and after the change.
  2. The method according to claim 1, characterized in that calculating the check value of the specified row data in the data to be changed specifically comprises:
    calculating the check value of a specified row of data in the data to be changed, or calculating the sum of the check values of one or more specified groups of N consecutive rows of data in the data to be changed.
  3. The method according to claim 2, characterized in that:
    when the specified rows are a single row of data, calculating the check value of the specified row data in the imported data to be changed specifically comprises: calculating the check value of the specified row of data in the imported data to be changed; and comparing whether the check values of the specified row data in the data to be changed are consistent specifically comprises: comparing the check values of the specified row of data in the data to be changed before and after the import;
    when the specified rows are one or more groups of N consecutive rows of data, calculating the check value of the specified row data in the imported data to be changed specifically comprises: calculating the sum of the check values of the one or more specified groups of N consecutive rows of data in the imported data to be changed; and comparing whether the check values of the specified row data in the data to be changed are consistent specifically comprises: comparing the sums of the check values of the one or more specified groups of N consecutive rows of data in the data to be changed before and after the import.
  4. The method according to any one of claims 1-3, characterized in that, after the data to be changed is split by row and before the split data to be changed is imported into the corresponding database nodes, the method further comprises:
    obtaining, according to the distributed distribution rule, the database node on which each part of the split data to be changed should be stored.
  5. The method according to any one of claims 1-3, characterized in that importing the split data to be changed into the corresponding database nodes specifically comprises:
    writing the split data to be changed into the file cache of the corresponding database node, notifying the database cluster management of the number and names of the completed files, and triggering, through the database cluster management, the database agents to download the data to be changed stored in the file cache to the database nodes;
    wherein the database agents correspond one-to-one with the database nodes.
  6. The method according to any one of claims 1-3, characterized in that:
    the data to be changed includes data to be initialized, data to be migrated and data to be redistributed.
  7. A device for data verification in a distributed database, characterized by comprising:
    a first computing unit, configured to export data to be changed as a data description text, and to calculate, according to the exported data description text, the check value of specified row data in the data to be changed;
    an import unit, configured to split the data to be changed by row, and to import the split data to be changed into the corresponding database nodes;
    a second computing unit, configured to calculate, after the data import is complete, the check value of the specified row data in the imported data to be changed;
    a comparing unit, configured to compare whether the check values of the specified row data in the data to be changed are consistent before and after the import and, if they are consistent, to determine that the data to be changed is consistent before and after the change.
  8. The device according to claim 7, characterized in that:
    the first computing unit is further configured to calculate the check value of a specified row of data in the data to be changed, or to calculate the sum of the check values of one or more specified groups of N consecutive rows of data in the data to be changed.
  9. The device according to claim 8, characterized in that:
    the second computing unit is further configured to: when the specified rows are a single row of data, calculate the check value of the specified row of data in the imported data to be changed; and when the specified rows are one or more groups of N consecutive rows of data, calculate the sum of the check values of the one or more specified groups of N consecutive rows of data in the imported data to be changed;
    the comparing unit is further configured to: when the specified rows are a single row of data, compare the check values of the specified row of data in the data to be changed before and after the import; and when the specified rows are one or more groups of N consecutive rows of data, compare the sums of the check values of the one or more specified groups of N consecutive rows of data in the data to be changed before and after the import.
  10. The device according to any one of claims 7-9, characterized in that the import unit further comprises:
    a splitting module, configured to split the data to be changed by row;
    an acquisition module, configured to obtain, according to the distributed distribution rule, the database node on which each part of the split data to be changed should be stored;
    an import module, configured to import the split data to be changed into the corresponding database nodes.
  11. The device according to any one of claims 7-9, characterized in that the import unit further comprises:
    a splitting module, configured to split the data to be changed by row;
    an import module, configured to write the split data to be changed into the file cache of the corresponding database node, to notify the database cluster management of the number and names of the completed files, and to trigger, through the database cluster management, the database agents to download the data to be changed stored in the file cache to the database nodes, the database agents corresponding one-to-one with the database nodes.
  12. The device according to any one of claims 7-9, characterized in that:
    the data to be changed includes data to be initialized, data to be migrated and data to be redistributed.
  13. A database cluster server, characterized by comprising the device for data verification in a distributed database according to any one of claims 7-12.
CN201610794307.2A 2016-08-31 2016-08-31 Distributed database data verification method, device and related device Active CN107798007B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610794307.2A CN107798007B (en) 2016-08-31 2016-08-31 Distributed database data verification method, device and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610794307.2A CN107798007B (en) 2016-08-31 2016-08-31 Distributed database data verification method, device and related device

Publications (2)

Publication Number Publication Date
CN107798007A true CN107798007A (en) 2018-03-13
CN107798007B CN107798007B (en) 2024-03-19

Family

ID=61530069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610794307.2A Active CN107798007B (en) 2016-08-31 2016-08-31 Distributed database data verification method, device and related device

Country Status (1)

Country Link
CN (1) CN107798007B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669989A (en) * 2018-12-29 2019-04-23 江苏满运软件科技有限公司 Data verification method, system, equipment and medium
CN110209521A (en) * 2019-02-22 2019-09-06 腾讯科技(深圳)有限公司 Data verification method, device, computer readable storage medium and computer equipment
CN112231403A (en) * 2020-10-15 2021-01-15 北京人大金仓信息技术股份有限公司 Consistency checking method, device, equipment and storage medium for data synchronization
CN116150175A (en) * 2023-04-18 2023-05-23 云账户技术(天津)有限公司 Heterogeneous data source-oriented data consistency verification method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102354292A (en) * 2011-09-21 2012-02-15 国家计算机网络与信息安全管理中心 Method and system for checking consistency of records in master and backup databases
CN103793424A (en) * 2012-10-31 2014-05-14 阿里巴巴集团控股有限公司 Database data migration method and database data migration system
CN104361119A (en) * 2014-12-02 2015-02-18 中国农业银行股份有限公司 Data cleaning method and system
CN104731792A (en) * 2013-12-19 2015-06-24 中国银联股份有限公司 Method and system for verifying database consistency and method and system for positioning data difference

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109669989A (en) * 2018-12-29 2019-04-23 江苏满运软件科技有限公司 Data verification method, system, equipment and medium
CN110209521A (en) * 2019-02-22 2019-09-06 腾讯科技(深圳)有限公司 Data verification method, device, computer readable storage medium and computer equipment
CN112231403A (en) * 2020-10-15 2021-01-15 北京人大金仓信息技术股份有限公司 Consistency checking method, device, equipment and storage medium for data synchronization
CN112231403B (en) * 2020-10-15 2024-01-30 北京人大金仓信息技术股份有限公司 Consistency verification method, device, equipment and storage medium for data synchronization
CN116150175A (en) * 2023-04-18 2023-05-23 云账户技术(天津)有限公司 Heterogeneous data source-oriented data consistency verification method and device

Also Published As

Publication number Publication date
CN107798007B (en) 2024-03-19

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20180426

Address after: 518057 five floor, block A, ZTE communication tower, Nanshan District science and Technology Park, Shenzhen, Guangdong.

Applicant after: ZTE Corp.

Address before: 210000 68 Bauhinia Road, Yuhuatai District, Nanjing, Jiangsu

Applicant before: Nanjing Zhongxing New Software Co.,Ltd.

SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220104

Address after: 100176 602, floor 6, building 6, courtyard 10, KEGU 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing (Yizhuang group, high-end industrial area of Beijing Pilot Free Trade Zone)

Applicant after: Jinzhuan Xinke Co.,Ltd.

Address before: 518057 five floor, block A, ZTE communication tower, Nanshan District science and Technology Park, Shenzhen, Guangdong.

Applicant before: ZTE Corp.

GR01 Patent grant