CN106293977B - A kind of data verification method and equipment - Google Patents

A data verification method and device

Info

Publication number
CN106293977B
Authority
CN
China
Prior art keywords
data
verified
equipment
source
referring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510249497.5A
Other languages
Chinese (zh)
Other versions
CN106293977A (en)
Inventor
王育雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510249497.5A
Publication of CN106293977A
Application granted
Publication of CN106293977B
Legal status: Active

Landscapes

  • Detection And Prevention Of Errors In Transmission (AREA)

Abstract

The application provides a data verification method and device. The method comprises: obtaining first data sent from a source device to a destination device; performing format conversion on the first data to obtain corresponding first data to be verified; verifying the first data to be verified; when the verification result for the first data to be verified is a pass, performing format conversion on the first data to be verified to obtain second data in a second data format corresponding to the destination device; and sending the second data to the destination device. Compared with the prior art, verification of the data read from each data source into the data verification device can be completed on first data to be verified that has a simple set of types, which avoids the complex verification procedures caused by the diversity of data types across destination data sources. In addition, as data sources are added, the actual verification workload grows slowly, which optimizes overall resource overhead.

Description

A data verification method and device
Technical field
The present invention relates to the computer field, and in particular to a data verification technique.
Background
Data transmission between data sources of various kinds, such as synchronous data transfer between heterogeneous data sources, requires transmission quality verification so that the correctness of the transmission can be assessed. In the prior art, expected data is generally determined from the source data to be synchronized to a target data source, and the result data obtained after the source data has been transferred to the target data source is then checked against that expected data. Because data sources are of many different kinds, the overall verification workload is large when transmission quality verification is set up between every pair of data sources in a data synchronization system. Moreover, if a new data source is added to an existing data synchronization system, the data transmission channels from the new data source to each existing data source, and from each existing data source to the new data source, all need to be verified. The added verification workload therefore grows geometrically, which greatly increases the cost of data verification.
Summary of the invention
The purpose of the application is to provide a data verification method and device.
According to one aspect of the application, a data verification method is provided, comprising:
obtaining first data sent from a source device to a destination device;
performing format conversion on the first data to obtain corresponding first data to be verified;
verifying the first data to be verified;
when the verification result for the first data to be verified is a pass, performing format conversion on the first data to be verified to obtain second data in a second data format corresponding to the destination device; and
sending the second data to the destination device.
According to another aspect of the application, a data verification device is also provided, comprising:
a first unit for obtaining first data sent from a source device to a destination device;
a second unit for performing format conversion on the first data to obtain corresponding first data to be verified;
a third unit for verifying the first data to be verified;
a fourth unit for performing, when the verification result for the first data to be verified is a pass, format conversion on the first data to be verified to obtain second data in a second data format corresponding to the destination device; and
a fifth unit for sending the second data to the destination device.
Compared with the prior art, the application performs format conversion on the first data obtained from the source device to determine the first data to be verified, and verifies that data. This checks the transmission quality of the stage in which synchronized data is read from the source device into the data verification device. Verification of the data read from each data source into the data verification device can be completed on first data to be verified that has a simple set of types, which avoids the complex verification procedures caused by the diversity of data types across destination data sources. Moreover, each time a data source is added to the data synchronization, checking the transmission quality of the read stage only requires one additional verification of the first data to be verified, which greatly reduces the overall verification workload. Further, the application determines second data to be verified from the second data received by the destination device, and verifies it. This checks the transmission quality of the stage in which synchronized data is written from the data verification device to the destination device. Each time a data source is added, checking the write stage likewise only requires one additional verification of the second data to be verified, which further reduces the overall verification workload. Further, by combining the check of the read stage with the check of the write stage, the application replaces direct quality verification between the source data source and the target data source with verification of the synchronized data in the source device and in the destination device, each against the corresponding reference data in the data verification device. As data sources participating in synchronization are added, the actual verification workload therefore grows slowly, which lowers the verification cost of connecting a new data source and optimizes overall resource overhead.
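The difference in growth can be illustrated with a small counting sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the prior-art count treats every directed source-to-destination channel (including same-type pairs such as MySQL to MySQL) as one check, and the count for the application assumes one read-stage check and one write-stage check per data source.

```python
def pairwise_checks(n: int) -> int:
    """Assumed prior-art count: one check per directed channel among n data sources."""
    return n * n


def hub_checks(n: int) -> int:
    """Assumed count for the application: one read check and one write check per data source."""
    return 2 * n


def added_checks(n_existing: int) -> tuple[int, int]:
    """Extra checks needed when one data source joins n_existing sources: (pairwise, hub)."""
    pairwise = pairwise_checks(n_existing + 1) - pairwise_checks(n_existing)
    hub = hub_checks(n_existing + 1) - hub_checks(n_existing)
    return pairwise, hub
```

Under these assumptions, the number of extra checks for a new data source grows with the number of existing sources in the pairwise case, while it stays constant when every check goes through the data verification device.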
Brief description of the drawings
Other features, objects, and advantages of the invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 shows a schematic diagram of a data verification device according to one aspect of the application;
Fig. 2 shows a schematic diagram of a data verification device according to a preferred embodiment of the application;
Fig. 3 shows a flowchart of a data verification method according to another aspect of the application;
Fig. 4 shows a flowchart of a data verification method according to a preferred embodiment of the application.
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed description
The invention is described in further detail below with reference to the accompanying drawings.
In a typical configuration of this application, the terminal, the device of the service network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Fig. 1 shows a schematic diagram of a data verification device according to one aspect of the application.
The data verification device 1 includes a first unit 101, a second unit 102, a third unit 103, a fourth unit 104, and a fifth unit 105.
The first unit 101 obtains first data sent from a source device to a destination device. The second unit 102 performs format conversion on the first data to obtain corresponding first data to be verified. The third unit 103 verifies the first data to be verified. When the verification result for the first data to be verified is a pass, the fourth unit 104 performs format conversion on the first data to be verified to obtain second data in a second data format corresponding to the destination device. The fifth unit 105 sends the second data to the destination device.
Here, those skilled in the art will understand that the first data and the second data refer to the specific data at each stage of the verification. For example, the first data includes data that is sent from the source device toward the destination device and obtained by the data verification device 1; the second data includes the data determined by performing format conversion on the first data to be verified. Those skilled in the art will also understand that the second data format refers to the data format of the second data obtained after that format conversion, and that the first data to be verified refers to the data awaiting verification during the verification process, namely the data determined by performing the corresponding format conversion on the first data.
Specifically, the first unit 101 obtains the first data sent from the source device to the destination device. The data verification device 1 described here includes a server device used to verify data transmission quality during data transmission; it may be a single device, or it may be set up as one or more device clusters as needed. The data transmission is the transmission of data between data sources of various kinds, preferably a data synchronization operation between heterogeneous data sources. A data source may hold structured, unstructured, or semi-structured data of any kind, such as data in various database management systems, XML documents, HTML documents, e-mail, and ordinary files. In data synchronization between heterogeneous data sources, the source device is the device corresponding to the export data source, that is, the source data source of the synchronized data; correspondingly, the destination device is the device corresponding to the heterogeneous data source into which the synchronized data will be imported, that is, the target data source. Here, data synchronization is the process in which the read plug-in of the export data source reads the synchronized data into the memory of a synchronization server, and the write plug-in of the import data source is then notified to write the synchronized data from that memory into the import database; the read and write sides exchange the synchronized data in that memory. The data synchronization verification described in this scheme works together with this data synchronization. The synchronization server and the data verification device 1 may be separate devices or the same device. Preferably, they are configured as the same device, which removes the constraint of network transmission and makes data verification faster. When data synchronization is carried out between heterogeneous data sources, the synchronization quality between all or some of the heterogeneous data sources can be checked periodically according to actual verification needs. Here, the first data includes the target synchronized data that is sent from the source device, arrives at the data verification device 1, and corresponds to the export data source. The first unit 101 obtains the first data as follows: when the data verification device 1 and the synchronization server are the same device, the first data can be read directly from the memory of the data verification device 1; if they are separate devices, the first data can be obtained from the corresponding memory exchange area of the synchronization server. The first data originates from the source synchronized data in the source device; that is, once the source synchronized data has reached the data verification device 1 or the synchronization server through transmission, it is determined to be the first data.
Then, the second unit 102 performs format conversion on the first data to obtain the corresponding first data to be verified. The first data to be verified includes the data determined by performing the corresponding format conversion on the first data; its purpose is to allow checking the transmission quality of the stage in which synchronized data is read from the source device into the data verification device 1. Take heterogeneous data sources that are data in different database management systems as an example: the data sources may be data in systems such as MySQL, Oracle, and Hadoop, and different data sources correspond to different data formats. For example, the data in table a of a MySQL data source is to be imported into an Oracle database, where table a has a certain table structure and each field in the table has a corresponding data type. The first data therefore has the data format of its own data source; for example, MySQL and Oracle each correspond to a certain binary format. The second unit 102 performs format conversion on the first data; preferably, the format conversion converts the original data format of the first data into txt text format, so that the subsequent verification can operate on a general, simple format. The first data can be converted, using Java's native data types, into first data to be verified stored in txt text format. Compared with the diversity of data types across the data sources, Java's native data types are relatively few and simple, which greatly simplifies the overall workload of the subsequent comparison. For example, suppose the data to be synchronized is the data in Table A in MySQL, and the fields of Table A are id, product_name, context, and create_time. If the data types of these fields in the original Table A are Long, String, String, and Date respectively, then on conversion to txt format the data types can be kept unchanged as Long, String, String, and Date, or converted to other data types as needed, for example Long, String, Long, and Date. The specific data type conversion can be determined from the user's expectations together with the specific data types of the data sources participating in the synchronization.
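The conversion to a simple text format can be sketched as follows. This is a minimal illustration rather than the conversion used in the application: it is written in Python instead of Java, the tab-separated layout, the escaping rule, the timestamp format, and the names `SCHEMA`, `to_text_line`, and `to_text` are all assumptions, and only the field names come from the Table A example above.

```python
from datetime import datetime

# Hypothetical schema: field name -> converter to a simple, uniform representation.
SCHEMA = {
    "id": int,
    "product_name": str,
    "context": str,
    "create_time": lambda v: v.strftime("%Y-%m-%d %H:%M:%S"),
}


def _escape(value: str) -> str:
    """Escape separators so that one row always maps to exactly one text line."""
    return value.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")


def to_text_line(row: dict) -> str:
    """Render one source row as a tab-separated line, with fields in schema order."""
    return "\t".join(_escape(str(convert(row[name]))) for name, convert in SCHEMA.items())


def to_text(rows) -> str:
    """Render all rows as newline-terminated text suitable for line-based comparison."""
    return "".join(to_text_line(row) + "\n" for row in rows)
```

Keeping one row per line is a design choice made here so that line-oriented tools can compare the result directly.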
Those skilled in the art will understand that converting the original data format of the first data into txt text format is only an example; other format conversions, existing or arising in the future, that are applicable to the invention also fall within its scope of protection and are incorporated herein by reference.
Then, the third unit 103 verifies the first data to be verified. This verification checks the transmission quality of the stage in which synchronized data is read from the source device into the data verification device 1. One way to verify the first data to be verified is text comparison: the first data to be verified and the corresponding reference data are each sorted as text and then compared, and the differences in the comparison result are analyzed. For example, the sort and diff commands of a Linux system can be run in turn to perform the text comparison, and the result is used to judge whether the data corresponding to the first data to be verified was synchronized correctly. Another way is hash comparison: hash values are computed separately for the first data to be verified and for the corresponding reference data, for example row by row, and the hash values are compared to judge whether the first data to be verified is consistent with the reference data, and thus whether the corresponding data was synchronized correctly.
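A row-by-row hash comparison of this kind might look like the sketch below. The application does not name a hash function; SHA-256 is an assumption here, as are the function names and the choice to report mismatching row indices.

```python
import hashlib


def row_hashes(text: str) -> list[str]:
    """Compute a SHA-256 digest for each line (row) of the text."""
    return [hashlib.sha256(line.encode("utf-8")).hexdigest() for line in text.splitlines()]


def mismatched_rows(candidate: str, reference: str) -> list[int]:
    """Return 0-based indices of rows whose hashes differ.

    Rows present in only one of the two texts are also reported as mismatches.
    """
    a, b = row_hashes(candidate), row_hashes(reference)
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    diffs.extend(range(min(len(a), len(b)), max(len(a), len(b))))
    return diffs
```

An empty result means the data to be verified and the reference data agree on every row, in the same order.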
Those skilled in the art will understand that verification by text comparison and verification by hash comparison are only examples; other ways of verifying the data to be verified, existing or arising in the future, that are applicable to the invention also fall within its scope of protection and are incorporated herein by reference.
Preferably, the data verification device 1 further includes a tenth unit (not shown), which obtains the first reference data corresponding to the first data to be verified; the third unit 103 then verifies the first data to be verified based on the first reference data.
Those skilled in the art will understand that the first reference data refers to the reference data against which the first data to be verified is verified.
Specifically, when data in a data source of the source device is designated as source synchronized data, it is transferred by data synchronization to the data verification device 1 and determined to be the first data. Because of possible loss during transmission, the first data, or the first data to be verified obtained by converting it, may deviate somewhat from the source synchronized data in the source device. Based on factors such as the expected loss of the source synchronized data in actual transmission, or the accuracy that the application consuming the synchronized data actually requires, a reasonable expectation of the transmission result for the source synchronized data can be formed, and first reference data corresponding to the first data to be verified can be determined. The first reference data reflects a preset requirement on the data quality that the synchronized data should retain after being transferred from the source device to the data verification device 1. The first reference data can be set manually, or derived automatically from preset machine conditions for the first data to be verified. The tenth unit can obtain the first reference data from several sources. For example, the first reference data may come from the source device, where it is determined from the source synchronized data; it may be generated directly in the data verification device; or it may be generated by any other suitable third-party device. In addition, the first reference data has the same data format as the first data to be verified; preferably, the first reference data is in txt text format. After the tenth unit obtains the first reference data, it can store it accordingly.
Then, the third unit 103 verifies the first data to be verified based on the first reference data. Verification methods such as full-text comparison or hash comparison can be applied to the first reference data and the first data to be verified to judge whether the synchronized data was transmitted correctly. For example, the first data to be verified in txt text format and the first reference data in the same format are each sorted as text, a full-text comparison is then run with the corresponding system command to obtain a difference value, and the verification result is determined from that difference value. Preferably, the expected loss of the source synchronized data in actual transmission, or the accuracy that the application consuming the synchronized data requires, is taken into account in advance when the first reference data is set; the condition for passing verification can then be set to a difference value of 0. Alternatively, an optimal first reference data can be determined from those same factors, and a matching threshold range can be set for the actual verification: when the similarity between the first data to be verified and the first reference data falls within the threshold range, the verification result is deemed a pass. For example, the diff command can be adapted so that the difference value it reports corresponds to the threshold range.
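The threshold variant can be sketched as below. The application does not define how similarity is measured; the line-based measure used here (shared lines divided by the larger line count, ignoring order) is an assumption chosen for illustration, and the names `line_similarity` and `passes` are hypothetical.

```python
from collections import Counter


def line_similarity(candidate: str, reference: str) -> float:
    """Share of lines common to both texts, relative to the longer text (order ignored)."""
    cand = Counter(candidate.splitlines())
    ref = Counter(reference.splitlines())
    total = max(sum(cand.values()), sum(ref.values()))
    if total == 0:
        return 1.0  # two empty texts are treated as identical
    matched = sum((cand & ref).values())  # multiset intersection
    return matched / total


def passes(candidate: str, reference: str, threshold: float = 1.0) -> bool:
    """Verification passes when similarity reaches the threshold; 1.0 demands an exact match."""
    return line_similarity(candidate, reference) >= threshold
```

With the default threshold of 1.0 this behaves like the strict case in which the difference value must be 0; a lower threshold tolerates a bounded amount of expected loss.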
More preferably, the tenth unit of the data verification device 1 obtains the first reference data from the source device, where the first reference data corresponds to the first data.
Specifically, the first reference data to be compared with the first data to be verified can be obtained directly from the source device. The source device can determine in advance the first reference data for the source synchronized data, based on factors such as the expected loss of that data in actual transmission or the accuracy that the consuming application requires. Meanwhile, for the same source synchronized data, the data verification device 1 obtains the corresponding first data and determines the corresponding first data to be verified through the format conversion; the first reference data thus corresponds to the first data. The data verification device 1 may obtain the corresponding first reference data in real time each time it obtains first data, or it may obtain the first reference data in advance, before obtaining the first data. When the source device has multiple pieces of source synchronized data to synchronize, first reference data can be configured at the source device for each of them, and the data verification device 1 can then request the first reference data for all of the corresponding first data in a single request, which reduces the number of requests sent to the source device and saves resources. Further, if periodic verification is set up for the synchronization quality between the source device and the corresponding destination device, the data verification device 1 can also automatically obtain, at each period, the first reference data for the source synchronized data to be synchronized from the source device.
More preferably, the third unit 103 performs a full-text comparison of the first reference data and the first data to be verified; if the full-text comparison matches, the verification result for the first data to be verified is determined to be a pass.
Specifically, a full-text comparison is preferably performed on the first reference data and the first data to be verified. For example, under a Linux system, the sort command of the Linux shell (command interpreter) is first used to sort the first reference data and the first data to be verified as text, and the diff command of the shell is then used to compare the two texts line by line. The final comparison result is determined from the difference value that the command returns: a value of 0 means the first data to be verified is textually identical to the first reference data; a value of 1 means the two texts differ; a value greater than 1 means an error occurred. When the returned difference value is 0, the full-text comparison is determined to match; the first data to be verified is then deemed consistent with the corresponding first reference data, and the verification result is a pass.
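The sort-then-compare procedure can be approximated in a few lines. This sketch does not invoke the shell commands themselves; it emulates them in Python under the assumption that sorting lines by code point is an acceptable stand-in for `sort`, and it returns 0 or 1 in the same sense as the difference value described above. The function names are hypothetical.

```python
import difflib


def sorted_lines(text: str) -> list[str]:
    """Sort the lines of a text by code point, as a stand-in for the shell sort step."""
    return sorted(text.splitlines())


def full_text_compare(candidate: str, reference: str) -> tuple[int, list[str]]:
    """Return (difference_value, diff_lines).

    difference_value is 0 when the sorted texts are identical and 1 when they differ.
    diff_lines holds a unified diff that can be inspected for difference analysis.
    """
    ref_lines = sorted_lines(reference)
    cand_lines = sorted_lines(candidate)
    diff = list(
        difflib.unified_diff(
            ref_lines, cand_lines, fromfile="reference", tofile="candidate", lineterm=""
        )
    )
    return (0 if not diff else 1), diff
```

Because both texts are sorted first, rows that arrive in a different order do not count as differences; only missing, extra, or altered rows do.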
Here, when the verification result for the first data to be verified is a pass, the transmission quality of the stage in which data is read from the source device into the data verification device 1 can be determined to be acceptable, meaning the quality of the transmission meets the preset requirement. For example, data in table D of a MySQL data source on a device is transferred to the data verification device 1. If verification of the first data to be verified corresponding to the data in table D yields a pass, the transmission quality of synchronized data from the MySQL data source to the data verification device 1 can be considered to meet the expected correctness requirement. It can further be inferred that, if conditions do not change, the transmission quality of other synchronized data subsequently sent from the MySQL data source to the data verification device 1 is likewise assured.
Then, when the verification result for the first data to be verified is a pass, the fourth unit 104 performs format conversion on the first data to be verified to obtain second data in the second data format corresponding to the destination device.
Specifically, when the verification result for the first data to be verified is a pass, for example when the difference value of the full-text comparison between the first data to be verified and the corresponding first reference data is 0, the transmission quality of the stage in which data is read from the source device into the data verification device 1 can be determined to be acceptable, meaning the quality of the transmission meets the preset requirement. The first data corresponding to the first data to be verified travels from the source device and ultimately reaches the destination device. To transfer the data corresponding to the source synchronized data of the source device to the destination device, the first data to be verified therefore needs to undergo format conversion to obtain second data in the second data format, so that the data format of the converted second data is consistent with the data format of the data source of the destination device. This ensures that data originally sent from the data source of the source device can ultimately be received by the data source of the destination device. The format conversion includes converting the first data to be verified from txt format into the data format of the data source of the destination device. For example, if the data source in the destination device is Oracle, the data format of the second data is the data format used for data in the Oracle data source.
Then, the fifth device 105 sends the second data to the purpose equipment. Here, the data format of the second data determined by formatting the first data to be verified is consistent with the data format of the data source in the purpose equipment that receives this data. The fifth device 105 thus synchronizes the second data into the destination data source of the purpose equipment.
Here, the present application formats the first data obtained from the source device to determine the first data to be verified, and verifies the first data to be verified, so as to inspect the data transmission quality of the process of reading synchronized data from the source device into the data check equipment. The verification of data read from each data source into the data check equipment can be completed on the basis of first data to be verified whose types are simple, which avoids the complicated verification procedure caused by the diversity of data types in the destination data sources. Moreover, each time one more data source joins data synchronization, the data verification method of the present application only needs one verification of the first data to be verified to inspect the data transmission quality of reading synchronized data from the source device into the data check equipment, which greatly reduces the overall verification workload. For example, suppose there are two data sources, Oracle and Hadoop, and a data source MySQL is now added. Under the prior art, to inspect the data synchronization quality from the MySQL data source to Oracle, Hadoop and MySQL respectively, the synchronization quality of MySQL to Oracle, MySQL to Hadoop and MySQL to MySQL must each be inspected. In the present application, the data verification from the MySQL data source of the source device to the destination data source may consist of the data verification from the source device to data check equipment 1 and the subsequent data verification from data check equipment 1 to the data source of the purpose equipment. Here, the data verification from the MySQL data source of the source device to data check equipment 1 only requires verifying, once, the first data to be verified corresponding to the MySQL data source. Therefore, each time a new data source is added, the number of data verifications from the data source of the source device to data check equipment 1 increases by only one. Compared with the prior art, in which the workload of inspecting data quality during data synchronization grows geometrically as data sources are added, the verification workload in the present application is significantly reduced and resource utilization is improved.
Preferably, the first data is based on a first data format that differs from the second data format.
Here, those skilled in the art will understand that the first data format refers to the data format of the first data, and that the first data format corresponds to the data format of the source data source in the source device.
Specifically, the first data format of the first data is consistent with the format of the data contained in the source synchronized data in the source device, while the second data format of the second data obtained by data check equipment 1 is consistent with the format of the data contained in the destination data source in the purpose equipment. Take as an example heterogeneous data sources that are data of different database management systems: the data sources may be data of database management systems such as MySQL, Oracle and Hadoop, and different data sources correspond to different data formats. For example, when data of a MySQL database is imported into an Oracle database, the first data format of the first data is consistent with the data format of the exporting MySQL database, and the second data format of the second data is consistent with the data format of the importing Oracle database.
Fig. 2 shows a schematic diagram of a data check equipment according to a preferred embodiment of the present application.
The data check equipment 1 includes a first device 201, a second device 202, a third device 203, a fourth device 204, a fifth device 205, a sixth device 206 and a seventh device 207.
The first device 201 obtains the first data sent from the source device to the purpose equipment. The second device 202 formats the first data to obtain the corresponding first data to be verified. The third device 203 verifies the first data to be verified. When the check result corresponding to the first data to be verified is qualified, the fourth device 204 formats the first data to be verified to obtain second data based on the second data format corresponding to the purpose equipment. The fifth device 205 sends the second data to the purpose equipment. The sixth device 206 obtains the second data received by the purpose equipment and formats it to obtain the corresponding second data to be verified. The seventh device 207 verifies the second data to be verified. Here, the first device 201, second device 202, third device 203, fourth device 204 and fifth device 205 in Fig. 2 are respectively identical or substantially identical to the first device 101, second device 102, third device 103, fourth device 104 and fifth device 105 in Fig. 1, so they are not described again here and are incorporated herein by reference.
Here, those skilled in the art will understand that the second data to be verified refers to the data awaiting verification in data verification, and that the second data to be verified includes the data determined after the second data received by the purpose equipment has undergone the format conversion described above.
Specifically, the sixth device 206 obtains the second data received by the purpose equipment and performs format conversion on it to obtain the corresponding second data to be verified.
Here, the second data obtained by the sixth device 206 is the synchronized data that the fifth device 205 sent to the purpose equipment, and the second data received by the purpose equipment corresponds to the data format of the data source in the purpose equipment. Take as an example heterogeneous data sources that are data of different database management systems: the data sources may be data of database management systems such as MySQL, Oracle and Hadoop, and different data sources correspond to different data formats. Here, data check equipment 1 formats the second data and determines the corresponding second data to be verified. Preferably, the original data format of the second data is converted into txt text format, so that subsequent verification operations are made easier by such a general, simple format. Here, the second data can be converted, on the basis of Java's native data types, into second data to be verified that is stored in txt text format. Compared with the diversity of data types across data sources, the native Java data types are relatively simple, which can greatly simplify the overall verification workload in the subsequent comparison.
Then, the seventh device 207 verifies the second data to be verified. Here, the second data to be verified is verified in order to inspect the data transmission quality of the process of writing synchronized data from data check equipment 1 into the data source of the purpose equipment. The verification of the second data to be verified may include verification by text comparison: the second data to be verified and the corresponding reference data are each sorted as text and then compared, and the comparison result is analyzed for differences. For example, the sort and diff commands in a Linux system can be applied in turn to carry out the text comparison, so as to judge whether the data corresponding to the second data to be verified was synchronized correctly. The verification of the second data to be verified may also include verification by hash comparison: hash values are calculated separately for the second data to be verified and for the corresponding reference data, for example row by row, and the hash values are compared to judge whether the second data to be verified is consistent with the reference data, and thereby whether the data corresponding to the second data to be verified was synchronized correctly.
Here, those skilled in the art should understand that verification by text comparison and verification by hash comparison are only examples. Other existing or future ways of verifying the data to be verified, if applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.
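The row-by-row hash comparison described above can be sketched as follows. This is only an illustrative sketch: the choice of SHA-256, the tab-separated sample rows, and the function names `row_hashes` and `hash_check` are assumptions made for the example and are not specified by the application.

```python
import hashlib
from collections import Counter


def row_hashes(lines):
    """Return a multiset of SHA-256 digests, one per non-empty row.

    SHA-256 is an assumed choice; the text only says "hash value".
    """
    return Counter(
        hashlib.sha256(line.rstrip("\n").encode("utf-8")).hexdigest()
        for line in lines
        if line.strip()
    )


def hash_check(to_verify, reference):
    """Qualified only when both sides hold the same rows, regardless of row order."""
    return row_hashes(to_verify) == row_hashes(reference)


# Hypothetical txt-format rows: reference data and data read back for verification.
reference = ["1\tpen\tblue ink\t2015-05-15", "2\tbook\tnotebook\t2015-05-16"]
received = ["2\tbook\tnotebook\t2015-05-16", "1\tpen\tblue ink\t2015-05-15"]
print(hash_check(received, reference))
```

Comparing multisets of digests rather than ordered lists means the two sides do not have to be sorted first, while duplicate rows are still counted.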
Here, the present application determines the second data to be verified from the second data received by the purpose equipment, and verifies the second data to be verified, so as to further inspect the data transmission quality of the process of writing synchronized data from the data check equipment into the purpose equipment. Each time one more data source joins data synchronization, the data verification method of the present application likewise only needs one verification of the second data to be verified to inspect the data transmission quality of writing synchronized data from the data check equipment into the purpose equipment, which greatly reduces the overall verification workload.
Further, combined with the inspection described above of the data transmission quality of reading synchronized data from the source device into the data check equipment, the present application replaces direct quality inspection between existing data sources with verifying the synchronized data in the source device and in the purpose equipment each against the corresponding reference data in data check equipment 1. The overall verification workload is reduced, and as more data sources join data synchronization the actual verification workload grows slowly, which lowers the verification cost of connecting a data source and optimizes resource overhead overall. For example, if there are N data sources in the synchronization system and data verification is required between every pair of them, the prior art requires N² verification procedures in total, whereas the present application requires only 2N. Further, each additional data source adds 2N+1 verification procedures under the prior art, but only 2 under the present application. Compared with the prior art, in which the workload of inspecting data quality during data synchronization grows geometrically as data sources are added, the verification workload in the present application is significantly reduced and resource utilization is improved.
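The counts quoted above can be checked with a few lines of arithmetic. The sketch below simply restates the N² and 2N figures from the text; the function names are chosen for this example only.

```python
def pairwise_checks(n):
    """Verification procedures when every source is checked against every source (N^2)."""
    return n * n


def hub_checks(n):
    """Verification procedures when each source is checked once on read and once on write (2N)."""
    return 2 * n


for n in (3, 10, 100):
    print(n, pairwise_checks(n), hub_checks(n))

# Cost of adding one data source to N existing ones.
n = 10
added_pairwise = pairwise_checks(n + 1) - pairwise_checks(n)  # (N+1)^2 - N^2 = 2N + 1
added_hub = hub_checks(n + 1) - hub_checks(n)                 # 2(N+1) - 2N = 2
print(added_pairwise, added_hub)
```

The increment 2N+1 follows from (N+1)² − N², which is why the prior-art cost of adding a source depends on how many sources already exist, while the cost under the present application stays constant at 2.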
Preferably, the data check equipment 1 further includes an eighth device (not shown), which obtains the second reference data corresponding to the second data to be verified; the seventh device 207 then verifies the second data to be verified on the basis of the second reference data.
Here, those skilled in the art will understand that the second reference data refers to the reference data used to verify the second data to be verified.
Specifically, as with the first data to be verified in the transmission from the source device to data check equipment 1, the second data to be verified may deviate somewhat in quality from the first data to be verified in data check equipment 1, because it has passed through the transmission from data check equipment 1 to the purpose equipment. For reasons such as the expected possible loss in actual transmission, or the accuracy required by the subsequent use of the second data delivered to the purpose equipment, certain limits are placed on the transmission quality of the second data to be verified. To make the quality standard for the second data to be verified explicit, the corresponding second reference data is determined; this second reference data reflects the data quality that the synchronized data is expected to retain after being transmitted from data check equipment 1 to the purpose equipment. Here, the second reference data has the same data format as the second data to be verified, preferably the txt format. The second reference data can be determined from the first data to be verified that corresponds to the second data to be verified, or directly from the source synchronized data in the source device that corresponds to the first data to be verified. In addition, the second reference data can be set manually, or derived automatically from machine preset conditions for the second data to be verified.
Then, the seventh device 207 verifies the second data to be verified on the basis of the second reference data. Here, verification methods such as full-text comparison or hash comparison can be applied to the second reference data and the second data to be verified to judge whether the synchronized data was transmitted correctly. For example, the second data to be verified in txt text format and the second reference data in the same format are sorted as text, and a full-text comparison is then performed with the corresponding system command to obtain the difference value of the comparison, from which the check result is determined.
More preferably, when the check result corresponding to the first data to be verified is qualified, the eighth device determines the second reference data on the basis of the first data to be verified.
Specifically, the second data to be verified corresponds to the second data, and therefore also to the first data to be verified. In the process in which the second data determined by the fourth device 204 of data check equipment 1 is transmitted from data check equipment 1 to the purpose equipment, the ideal outcome is that no data quality is lost, that is, the second data to be verified is consistent with the corresponding first data to be verified. In that case the first data to be verified can serve as the second reference data. If, on the other hand, the second data that data check equipment 1 obtained from the first data to be verified is subject in actual transmission to factors that affect data accuracy, such as possible loss, these factors can be taken into account when setting the expected second reference data. That is, the final second reference data can be determined from the first data to be verified combined with factors such as the quality loss to be considered in transmission and the specific data types of the destination data source in the purpose equipment. Here, the second data to be verified is preferably verified against the second reference data by full-text comparison, and the condition for a qualified result is that the difference value of the comparison is 0.
Preferably, the data check equipment 1 further includes a ninth device (not shown). When the check result corresponding to the second data to be verified is qualified, the ninth device configures the data transmission channel from the source device to the purpose equipment.
Specifically, the first data to be verified is verified in order to judge the data quality of the process of reading synchronized data from the source device into the data check equipment, for example whether the data synchronization quality meets the preset requirement. Likewise, the second data to be verified is verified in order to judge whether the data quality of the process of writing synchronized data from the data check equipment into the purpose equipment meets the preset requirement. When the check results corresponding to both the first data to be verified and the second data to be verified are qualified, it is determined that the transmission quality of the data transmission channel from the source device to the purpose equipment reaches the preset requirement, and the data transmission from the data source of the source device to the corresponding data source of the purpose equipment can then be configured. Here, the verification of the first data to be verified and the verification of the second data to be verified can each be decided from the result of a single group of data. Alternatively, multiple groups of data can be verified for each, and whether the data transmission channel reaches the preset requirement is decided from the proportion of qualified results. For example, 10 groups of data synchronized from data check equipment 1 to the purpose equipment are verified, and when the check results of 9 or more of the 10 groups are qualified, the data transmission channel from data check equipment 1 to the purpose equipment is judged to be unobstructed and to meet the preset data quality requirement. When the check result corresponding to the first data to be verified or the second data to be verified is unqualified, the reference data corresponding to the first data to be verified or the second data to be verified can be adjusted according to the actual data transmission, or other adaptive adjustments can be made to the data transmission process to repair the transmission channel. Preferably, data verification can be carried out periodically on the configured data transmission channel to keep data transmission accurate.
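The multi-group decision described above (for example, at least 9 qualified results out of 10 groups) can be sketched as a simple count. The function name `channel_qualified` and the `min_passed` parameter are hypothetical names used only for this illustration.

```python
def channel_qualified(results, min_passed):
    """Decide whether a transmission channel meets the preset requirement.

    results    -- one boolean per verified group (True means qualified)
    min_passed -- minimum number of qualified groups, e.g. 9 out of 10
    """
    if not results:
        raise ValueError("at least one verification group is required")
    return sum(1 for ok in results if ok) >= min_passed


# Ten groups synchronized from the data check equipment to the purpose equipment.
groups = [True] * 9 + [False]
print(channel_qualified(groups, min_passed=9))
```

Counting qualified groups as integers avoids comparing floating-point ratios; a ratio-based threshold would work equally well if the number of groups varies.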
Fig. 3 shows a flow chart of a data verification method according to another aspect of the present application.
In step S301, the data check equipment 1 obtains the first data sent from the source device to the purpose equipment. In step S302, the data check equipment 1 formats the first data to obtain the corresponding first data to be verified. In step S303, the data check equipment 1 verifies the first data to be verified. In step S304, when the check result corresponding to the first data to be verified is qualified, the data check equipment 1 formats the first data to be verified to obtain second data based on the second data format corresponding to the purpose equipment. In step S305, the data check equipment 1 sends the second data to the purpose equipment.
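The flow of steps S301 to S305 can be outlined in code. The sketch below is only an illustration under stated assumptions: rows are plain tuples, the txt format is tab-separated lines, the destination format is a stand-in (tuples of strings), and all function names are hypothetical rather than taken from the application.

```python
def to_txt(rows):
    """S302: render rows as tab-separated text lines (the simple common format)."""
    return ["\t".join(str(value) for value in row) for row in rows]


def verify(to_verify, reference):
    """S303: full-text comparison after sorting; qualified when nothing differs."""
    return sorted(to_verify) == sorted(reference)


def to_destination_format(lines):
    """S304: stand-in conversion; a real one would emit the destination's own format."""
    return [tuple(line.split("\t")) for line in lines]


def run(first_data, first_reference, send):
    """Process first data already obtained from the source device (S301)."""
    to_verify = to_txt(first_data)                   # S302
    if not verify(to_verify, first_reference):       # S303
        return False                                 # unqualified: nothing is sent
    second_data = to_destination_format(to_verify)   # S304
    send(second_data)                                # S305
    return True


sent = []
ok = run([(1, "pen"), (2, "book")], ["2\tbook", "1\tpen"], sent.extend)
print(ok, sent)
```

Passing `send` as a callable keeps the sketch independent of any particular destination; here it simply appends to a list.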
Specifically, in step S301, the data check equipment 1 obtains the first data sent from the source device to the purpose equipment. The data check equipment 1 here includes a server device that verifies data transmission quality during data transmission; it can be a single device, or it can be set up as one or more device clusters as needed. The data transmission may include the transmission of data between data sources of all kinds, preferably, for example, data synchronization between heterogeneous data sources. The data sources may include data in database management systems of all kinds, XML documents, HTML documents, e-mail, ordinary files, and any other structured, unstructured or semi-structured data. In data synchronization between heterogeneous data sources, the source device is the device corresponding to the source of the synchronized data, that is, to the source data source, which is the export data source. Correspondingly, the purpose equipment is the device corresponding to the heterogeneous data source into which the synchronized data will be imported, that is, the target data source. Here, data synchronization is the process in which the read plug-in of the export data source reads the synchronized data into the memory of a sync server, and the write plug-in of the import data source is then notified to write the synchronized data in the sync server's memory into the import database; the exchange of read and written synchronized data takes place in that memory. The data synchronization verification described in this solution is matched to this data synchronization. The sync server and the data check equipment 1 can be different devices or the same device. Preferably, the sync server and the data check equipment 1 are configured as the same device, which avoids the limits of network transmission and lets data verification run correspondingly faster. When the data of heterogeneous data sources is synchronized, the data synchronization quality between all or some of the heterogeneous data sources can be inspected periodically according to actual verification needs. Here, the first data includes the target synchronized data of the export data source that is sent from the source device and arrives at the data check equipment 1, and the data check equipment 1 obtains the first data. For example, when the data check equipment 1 and the sync server are the same device, the first data can be obtained directly from the memory of the data check equipment 1; if the data check equipment 1 and the sync server are different devices, the first data can be obtained from the memory exchange area of the sync server. Here, the first data originates from the source device and corresponds to the source synchronized data in the source device; that is, once the source synchronized data has arrived at the data check equipment 1 or the sync server through data transmission, it is determined to be the first data.
Then, in step S302, the data check equipment 1 formats the first data to obtain the corresponding first data to be verified. Take as an example heterogeneous data sources that are data of different database management systems: the data sources may be data of database management systems such as MySQL, Oracle and Hadoop, and different data sources correspond to different data formats. For example, the data in table a of a MySQL database is imported into an Oracle database, where table a has a certain table structure and each field in the table has a corresponding data type. The first data therefore belongs to the data format of its data source; for example, MySQL and Oracle each correspond to a certain binary format. Here, the data check equipment 1 formats the first data. Preferably, the format conversion includes converting the original data format of the first data into txt text format, so that subsequent verification operations are made easier by such a general, simple format. Here, the first data can be converted, on the basis of Java's native data types, into first data to be verified that is stored in txt text format. Compared with the diversity of data types across data sources, the native Java data types are relatively simple, which can greatly simplify the overall verification workload in the subsequent comparison. For example, suppose the data to be synchronized is the data in Table A in MySQL, whose fields include id, product_name, context and create_time, and the data types of these fields in the original Table A are Long, String, String and Date respectively. When converting to txt format, the data types of these fields can be left unchanged as Long, String, String and Date, or converted to other data types as needed, for example Long, String, Long and Date. Here, the specific data type conversion can be determined from the user's expectations combined with the data types of the data sources that actually participate in data synchronization.
Here, those skilled in the art should understand that converting the original data format of the first data into txt text format is only an example of the format conversion. Other existing or future format conversions, if applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.
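The conversion of a Table A row into a txt line can be sketched as below. This is a hedged illustration only: Python's `int`, `str` and `date` stand in for the Java types Long, String and Date named in the text, and the tab separator, the ISO date rendering and the function name `to_txt_line` are assumptions made for the example.

```python
from datetime import date

# Field order of Table A as listed in the text.
FIELDS = ("id", "product_name", "context", "create_time")


def to_txt_line(row):
    """Render one row as a tab-separated line with one fixed text form per type."""
    parts = []
    for name in FIELDS:
        value = row[name]
        if isinstance(value, date):
            parts.append(value.isoformat())  # dates always rendered as YYYY-MM-DD
        else:
            parts.append(str(value))         # integers and strings rendered as-is
    return "\t".join(parts)


# Hypothetical sample row; the values are invented for illustration.
row = {
    "id": 1,
    "product_name": "pen",
    "context": "blue ink",
    "create_time": date(2015, 5, 15),
}
print(to_txt_line(row))
```

Fixing one textual form per type is what makes a later line-by-line comparison meaningful: the same value read from two different database systems must produce the same text. A production conversion would also need to escape separators that occur inside field values, which this sketch does not do.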
Then, in step S303, the data check equipment 1 verifies the first data to be verified. Here, the first data to be verified is verified in order to inspect the data transmission quality of the process of reading synchronized data from the source device into data check equipment 1. The verification of the first data to be verified may include verification by text comparison: the first data to be verified and the corresponding reference data are each sorted as text and then compared, and the comparison result is analyzed for differences. For example, the sort and diff commands in a Linux system (diff performing a strict consistency check on plain text) can be applied in turn to carry out the text comparison, so as to judge whether the data corresponding to the first data to be verified was synchronized correctly. The verification of the first data to be verified may also include verification by hash comparison: hash values are calculated separately for the first data to be verified and for the corresponding reference data, for example row by row, and the hash values are compared to judge whether the first data to be verified is consistent with the reference data, and thereby whether the data corresponding to the first data to be verified was synchronized correctly.
Here, those skilled in the art should understand that verification by text comparison and verification by hash comparison are only examples. Other existing or future ways of verifying the data to be verified, if applicable to the present invention, should also fall within the scope of protection of the present invention and are incorporated herein by reference.
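The sort-then-diff text comparison described above can be approximated without shell commands. The sketch below uses Python's standard `difflib` module in place of the Linux `sort` and `diff` commands named in the text; that substitution, the sample rows and the function name `text_diff` are assumptions made for illustration.

```python
import difflib


def text_diff(to_verify, reference):
    """Sort both sides, then return the lines that differ between them.

    Lines prefixed "- " appear only in the reference data; lines prefixed "+ "
    appear only in the data to be verified. An empty list means a difference of 0.
    """
    diff = difflib.ndiff(sorted(reference), sorted(to_verify))
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical txt-format rows.
reference = ["2\tbook", "1\tpen"]
same_rows = ["1\tpen", "2\tbook"]
changed_rows = ["1\tpen", "3\tlamp"]

print(text_diff(same_rows, reference))
print(text_diff(changed_rows, reference))
```

Sorting first removes differences that come only from row order, which is why the text applies `sort` before `diff`.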
Preferably, the data verification method further includes step S310 (not shown). In step S310, the data check equipment 1 obtains the first reference data corresponding to the first data to be verified; in step S303, the data check equipment 1 then verifies the first data to be verified on the basis of the first reference data.
Specifically, when data in a data source of the source device is designated as source synchronized data, that source synchronized data is transmitted to data check equipment 1 through data synchronization and is determined to be the first data. Because of possible loss during data transmission, the first data, or the first data to be verified obtained by converting it, may deviate somewhat from the source synchronized data of the source device. On the basis of the expected possible loss of the source synchronized data in actual transmission, or the accuracy the synchronized data requires in practical use, a reasonable expectation is formed for the transmission result of the source synchronized data, and the first reference data corresponding to the first data to be verified is determined. This first reference data reflects the preset data quality that the synchronized data is expected to retain after being transmitted from the source device to data check equipment 1. Here, the first reference data can be set manually, or derived automatically from machine preset conditions for the first data to be verified. In step S310, the data check equipment 1 can obtain the first reference data from several kinds of sources. For example, the first reference data can come from the source device, being determined from the source synchronized data in the source device; it can also be generated directly in the data check equipment; or it can be generated by any other suitable third-party device. In addition, the first reference data has the same data format as the first data to be verified; preferably, the first reference data is in txt text format. After the data check equipment 1 obtains the first reference data, it can store it accordingly.
Then, in step S303, the data check equipment 1 verifies the first data to be verified on the basis of the first reference data. Here, verification methods such as full-text comparison or hash comparison can be applied to the first reference data and the first data to be verified to judge whether the synchronized data was transmitted correctly. For example, the first data to be verified in txt text format and the first reference data in the same format are sorted as text, and a full-text comparison is then performed with the corresponding system command to obtain the difference value of the comparison, from which the check result is determined. Preferably, the expected possible loss of the source synchronized data corresponding to the first data in actual transmission, or the effect on synchronization quality of the accuracy the synchronized data requires in practical use, is taken into account in advance when setting the expected first reference data, and the condition for a qualified result is then that the difference value of the comparison is 0. Alternatively, an optimal first reference data can be determined from the expected possible loss of the source synchronized data in actual transmission or from the accuracy the synchronized data requires in practical use, and a matching threshold range is set for the actual verification: whenever the similarity between the first data to be verified and the first reference data falls within the threshold range, the check result can be regarded as qualified. For example, the diff command can be adapted so that the difference value is evaluated against the threshold range.
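The threshold-based variant described above can be sketched as follows. The text does not define how similarity is measured, so this example assumes the row-level ratio computed by Python's `difflib.SequenceMatcher` (twice the number of matching rows divided by the total number of rows on both sides); the function names and the sample threshold are likewise hypothetical.

```python
import difflib


def similarity(to_verify, reference):
    """Row-level similarity in [0, 1] between the two sorted line lists."""
    matcher = difflib.SequenceMatcher(
        None, sorted(reference), sorted(to_verify), autojunk=False
    )
    return matcher.ratio()


def threshold_check(to_verify, reference, min_similarity=1.0):
    """Qualified when the similarity reaches the configured threshold.

    The default of 1.0 corresponds to requiring a difference value of 0.
    """
    return similarity(to_verify, reference) >= min_similarity


# Hypothetical data: ten reference rows, one of which arrives altered.
reference = ["row%d" % i for i in range(10)]
received = reference[:9] + ["altered"]

print(similarity(received, reference))
print(threshold_check(received, reference, min_similarity=0.85))
print(threshold_check(received, reference))
```

With the default threshold the check behaves like the strict comparison, so the same routine covers both the "difference value is 0" condition and the tolerant variant.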
More preferably, in step S310, the data check equipment 1 obtains the first reference data from the source device, wherein the first reference data corresponds to the first data.
Specifically, the first reference data to be compared with the first data to be verified can be obtained directly from the source device. The source device can match the source synchronization data with corresponding first reference data in advance, based on factors such as the expected loss of the source synchronization data corresponding to the first data during actual transmission or the accuracy the practical application requires of the synchronized data. Meanwhile, based on the same source synchronization data, the data check equipment 1 obtains the corresponding first data and determines the corresponding first data to be verified through the format conversion processing; the first reference data thus corresponds to the first data. The data check equipment 1 can obtain the first reference data corresponding to the first data in real time each time it obtains the first data, or it can obtain the corresponding first reference data in advance, before obtaining the first data. When the source device has multiple pieces of source synchronization data to synchronize, corresponding first reference data can first be configured for each piece at the source device; the data check equipment 1 can then request the first reference data for multiple first data in a single request, which reduces the number of requests the data check equipment makes to the source device and saves resources. Further, if a check period is set for the data synchronization quality between the source device and the corresponding purpose equipment, the data check equipment 1 can also obtain the first reference data for the source synchronization data to be synchronized from the source device automatically at regular intervals.
More preferably, in step S303, the data check equipment 1 performs a full-text comparison of the first reference data and the first data to be verified; if the full-text comparison matches, the check result corresponding to the first data to be verified is determined to be qualified.
Specifically, a full-text comparison is preferably carried out on the first reference data and the first data to be verified. For example, under a Linux system, the sort command of the Linux shell is first used to sort the text of the first reference data and of the first data to be verified respectively; the diff command of the Linux shell is then used to compare the text of the first reference data and the first data to be verified, for example line by line, and the final comparison result is determined from the difference value that the command returns. If the difference value is 0, the first data to be verified and the first reference data have identical text; if the difference value is 1, the two compared texts differ; if the difference value is greater than 1, a system error has occurred. When the difference value returned by the command is 0, the full-text comparison is determined to match, the first data to be verified is deemed consistent with the corresponding first reference data, and the check result of the corresponding data check is qualified.
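The sort-then-compare procedure above can be sketched without calling the shell. The sketch below is an assumption-laden illustration: it uses Python's built-in `sorted` in place of `sort` (whose ordering depends on the locale) and the standard `difflib.unified_diff` in place of `diff`, and it mirrors only the 0 (identical) and 1 (different) return values; the function name `full_text_compare` is hypothetical.

```python
import difflib
from typing import List, Tuple


def full_text_compare(to_verify: str, reference: str) -> Tuple[int, List[str]]:
    """Sort both texts, compare them line by line, and return
    (difference value, diff lines): 0 if identical, 1 if they differ."""
    ref_lines = sorted(reference.splitlines())
    new_lines = sorted(to_verify.splitlines())
    # unified_diff yields nothing at all when the two sequences are equal.
    delta = list(
        difflib.unified_diff(ref_lines, new_lines, "reference", "to_verify", lineterm="")
    )
    return (1 if delta else 0), delta


difference_value, delta = full_text_compare("2\tbob\n1\talice", "1\talice\n2\tbob")
qualified = difference_value == 0  # the qualification condition described above
```

Returning the diff lines alongside the difference value makes it easier to see which rows were lost or altered when a check is unqualified.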
Here, when the check result corresponding to the first data to be verified is qualified, it can be determined that the data transmission quality of the process of reading from the source device into the data check equipment 1 is qualified and that the quality of the data transmission meets the preset requirement. For example, the data in table D of a MySQL database serving as the data source in the source device is transferred to the data check equipment 1. If checking the first data to be verified that corresponds to the data in table D yields a qualified check result, the transmission quality of synchronized data from the data source MySQL to the data check equipment 1 can be considered to meet the expected correctness requirement. Further, it can be estimated that, provided conditions do not change, the transmission quality of other synchronized data subsequently sent from the data source MySQL to the data check equipment 1 is assured to a corresponding degree.
Then, in step S304, when the check result corresponding to the first data to be verified is qualified, the data check equipment 1 performs format conversion processing on the first data to be verified to obtain second data based on the second data format corresponding to the purpose equipment.
Specifically, when the check result corresponding to the first data to be verified is qualified, for example when the full-text comparison difference value between the first data to be verified and the corresponding first reference data is 0, it can be determined that the data transmission quality of the process of reading from the source device into the data check equipment 1 is qualified and that the quality of the data transmission meets the preset requirement. The transmission path of the first data corresponding to the first data to be verified runs from the source device and finally arrives at the purpose equipment. Therefore, to transfer the data corresponding to the source synchronization data of the source device to the corresponding purpose equipment, a certain format conversion must be applied to the first data to be verified to obtain second data based on the second data format. This ensures that the data format of the converted second data is consistent with the data format of the data source in the corresponding purpose equipment, so that the data originally sent from the data source of the source device can finally be received by the corresponding data source of the purpose equipment. The format conversion processing of the first data to be verified includes converting the first data to be verified in txt format into the data format of the corresponding data source in the purpose equipment. For example, if the data source in the purpose equipment is Oracle, the data format of the second data is a data format that data in the data source Oracle may take.
Then, in step S305, the data check equipment 1 sends the second data to the purpose equipment. Here, the data format of the second data determined by performing format conversion processing on the first data to be verified is consistent with the data format of the data source in the purpose equipment that receives the data. The second data is thus synchronized, through the data check equipment 1, to the corresponding purpose data source of the purpose equipment.
Here, the application performs format conversion on the first data obtained from the source device to determine the first data to be verified, and verifies the first data to be verified, thereby checking the data transmission quality of the process in which synchronized data is read from the source device into the data check equipment. The check of data read from each data source into the data check equipment can be completed on the first data to be verified, whose type is simple, which avoids the complex checking process caused by the diversity of data types of the purpose data sources. Moreover, each time one data source is added to the data synchronization, the data verification method of the application needs only one check of the first data to be verified to inspect the transmission quality of reading synchronized data from the source device into the data check equipment, which greatly reduces the overall checking workload. For example, suppose there are two existing data sources, Oracle and Hadoop, and a data source MySQL is added. With the prior art, checking the data synchronization quality from the data source MySQL to Oracle, Hadoop and MySQL requires separate checks of MySQL to Oracle, MySQL to Hadoop and MySQL to MySQL. In the application, the data check from the data source MySQL of the source device to a purpose data source may comprise the data check from the source device to the data check equipment 1 and the subsequent data check from the data check equipment 1 to the data source of the purpose equipment. Based on the application, the data check from the data source MySQL of the source device to the data check equipment 1 requires only one check of the first data to be verified corresponding to the data source MySQL. Therefore, each time a new data source is added, the number of data checks from the corresponding data source of the source device to the data check equipment 1 increases by only one. Compared with the prior art, in which the workload of data quality inspection in data synchronization grows geometrically as data sources are added, the checking workload in the application is significantly reduced and resource utilization is improved.
Preferably, the first data is based on a first data format different from the second data format. Specifically, the first data format of the first data is consistent with the format of the data contained in the source synchronization data of the source device, while the second data format of the second data obtained by the data check equipment 1 is consistent with the format of the data contained in the purpose data source of the purpose equipment. Take data exchanged between heterogeneous data sources of different database management systems as an example: the data sources can be data of database management systems such as MySQL, Oracle and Hadoop, and different data sources correspond to different data formats. For example, when data of a MySQL database is imported into an Oracle database, the first data format of the first data is consistent with the data format of the exporting data source, the MySQL database, and the second data format of the second data is consistent with the data format of the importing data source, the Oracle database.
Fig. 4 shows a flow chart of a data verification method according to a preferred embodiment of the application.
In step S401, the data check equipment 1 obtains the first data sent from the source device to the purpose equipment; in step S402, the data check equipment 1 performs format conversion processing on the first data to obtain the corresponding first data to be verified; in step S403, the data check equipment 1 verifies the first data to be verified; in step S404, when the check result corresponding to the first data to be verified is qualified, the data check equipment 1 performs format conversion processing on the first data to be verified to obtain second data based on the second data format corresponding to the purpose equipment; in step S405, the data check equipment 1 sends the second data to the purpose equipment; in step S406, the data check equipment 1 obtains the second data received by the purpose equipment and performs format conversion processing on it to obtain the corresponding second data to be verified; in step S407, the data check equipment 1 verifies the second data to be verified. Here, steps S401 to S405 in Fig. 4 are identical or substantially identical to steps S301 to S305 in Fig. 3, so they are not described again here and are incorporated herein by reference.
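The sequence of steps S401 to S407 can be sketched end to end as follows. This is a minimal in-memory illustration under stated assumptions, not the implementation of the application: the tab-separated txt form, the callables `txt_to_second`, `send`, `read_back` and `second_to_txt`, and the use of the first data to be verified as the second reference data are all hypothetical choices made for the sketch.

```python
from typing import Callable, Dict, Iterable, List, Sequence

Row = Sequence[object]
Record = Dict[str, str]


def rows_to_txt(rows: Iterable[Row]) -> List[str]:
    """Format conversion to a simple txt form: one tab-separated line per row."""
    return ["\t".join(str(value) for value in row) for row in rows]


def full_text_match(a: List[str], b: List[str]) -> bool:
    """Sort both texts and compare them line by line."""
    return sorted(a) == sorted(b)


def sync_with_checks(
    first_data: Iterable[Row],                          # S401: data read from the source
    first_reference: List[str],
    txt_to_second: Callable[[List[str]], List[Record]],
    send: Callable[[List[Record]], object],
    read_back: Callable[[], List[Record]],
    second_to_txt: Callable[[List[Record]], List[str]],
) -> str:
    first_to_verify = rows_to_txt(first_data)                   # S402
    if not full_text_match(first_to_verify, first_reference):   # S403
        return "first check unqualified"
    second_data = txt_to_second(first_to_verify)                # S404
    send(second_data)                                           # S405
    second_to_verify = second_to_txt(read_back())               # S406
    # S407: assumption: the first data to be verified serves as second reference data.
    if not full_text_match(second_to_verify, first_to_verify):
        return "second check unqualified"
    return "qualified"


# Usage with a list standing in for the purpose data source (hypothetical).
destination: List[Record] = []
result = sync_with_checks(
    first_data=[(1, "alice"), (2, "bob")],
    first_reference=["2\tbob", "1\talice"],
    txt_to_second=lambda lines: [dict(zip(("id", "name"), line.split("\t"))) for line in lines],
    send=destination.extend,
    read_back=lambda: list(destination),
    second_to_txt=lambda records: [f"{r['id']}\t{r['name']}" for r in records],
)
```

Both checks operate on the same simple txt representation, which is why neither needs to know the native data types of the source or the purpose data source.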
Specifically, in step S406, the data check equipment 1 obtains the second data received by the purpose equipment and performs format conversion processing on it to obtain the corresponding second data to be verified.
Here, the second data obtained by the data check equipment 1 is the synchronized data that was sent to the purpose equipment via the data check equipment 1, and the second data received by the purpose equipment corresponds to the data format of the data source in the purpose equipment. Take data of heterogeneous data sources of different database management systems as an example: the data sources can be data of database management systems such as MySQL, Oracle and Hadoop, and different data sources correspond to different data formats. The data check equipment 1 performs format conversion processing on the second data and determines the corresponding second data to be verified. Preferably, the original data format of the second data is converted into the txt text format, so that this general and simple format facilitates the subsequent check operations. Here, the second data can be converted, based on Java's primitive data types, into second data to be verified stored in txt text format. Compared with the diversity of data types of the various data sources, Java's primitive data types are relatively simple, which greatly simplifies the overall checking workload in the subsequent comparison.
Then, in step S407, the data check equipment 1 verifies the second data to be verified. The check of the second data to be verified inspects the data transmission quality of the process in which synchronized data is written from the data check equipment 1 to the corresponding data source of the purpose equipment. The second data to be verified can be checked by text comparison: the second data to be verified and the corresponding reference data are sorted as text and compared, and the differences in the comparison result are analyzed. For example, in a Linux system the sort and diff commands are applied in turn to carry out the text comparison, so that the correctness of the data synchronization corresponding to the second data to be verified can be judged. The second data to be verified can also be checked by hash comparison: hash values are calculated separately for the second data to be verified and the corresponding reference data, for example line by line, and the hash values are compared to judge whether the second data to be verified is consistent with the reference data, and hence whether the data synchronization corresponding to the second data to be verified is correct.
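The line-by-line hash comparison mentioned above can be sketched as follows. The application does not name a hash function; SHA-256 from Python's standard `hashlib` is an assumption, as are the function names, and both inputs are assumed to be already sorted so that corresponding lines line up.

```python
import hashlib
from itertools import zip_longest
from typing import List


def line_hashes(text: str) -> List[str]:
    """One SHA-256 hex digest per line of the txt payload."""
    return [hashlib.sha256(line.encode("utf-8")).hexdigest() for line in text.splitlines()]


def mismatched_lines(to_verify: str, reference: str) -> List[int]:
    """1-based numbers of the lines whose hash values differ.

    A line present on only one side counts as a mismatch.
    """
    pairs = zip_longest(line_hashes(to_verify), line_hashes(reference))
    return [number for number, (a, b) in enumerate(pairs, start=1) if a != b]


def hash_check_qualified(to_verify: str, reference: str) -> bool:
    """Qualified when every line hash matches."""
    return not mismatched_lines(to_verify, reference)
```

Comparing fixed-length digests instead of full lines can be convenient when the reference side only needs to hand over hash values rather than the data itself.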
Here, those skilled in the art should understand that the check based on text comparison and the check based on hash comparison are only examples. Other existing or future ways of checking the data to be verified, if applicable to the invention, should also fall within the scope of protection of the invention and are incorporated herein by reference.
Here, the application determines the second data to be verified from the second data that the purpose equipment received, and verifies the second data to be verified, thereby further checking the data transmission quality of the process in which synchronized data is written from the data check equipment to the purpose equipment. Each time one data source is added to the data synchronization, the data verification method of the application likewise needs only one check of the second data to be verified to inspect the transmission quality of writing synchronized data from the data check equipment to the purpose equipment, which greatly reduces the overall checking workload.
Further, combined with the above inspection of the data transmission quality of the process in which synchronized data is read from the source device into the data check equipment, the application replaces the direct quality check between data sources of the prior art with checks of the synchronized data of the source device and of the purpose equipment, each against the corresponding reference data in the data check equipment 1. The overall checking workload is reduced, and as more data sources join the data synchronization the actual checking workload grows more slowly, which lowers the checking cost of connecting a data source and optimizes the overall resource overhead. For example, if a synchronization system contains N data sources and data checks are required between every pair of them, the prior art needs N² checking processes in total, whereas the application needs only 2N. Further, each additional data source adds 2N+1 checking processes under the prior art, but only 2 data checking processes under the application. Compared with the prior art, in which the workload of data quality inspection in data synchronization grows on the order of N² as data sources are added, the checking workload in the application is significantly reduced and resource utilization is improved.
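The counts given above can be checked with a short derivation. The symbols C_prior and C_app are notation introduced here for illustration only; they denote the number of checking processes under the prior art and under the application respectively, counting every ordered pair of data sources (including a source paired with itself, as in the MySQL to MySQL case).

```latex
\begin{align*}
C_{\text{prior}}(N) &= N^2, &
C_{\text{prior}}(N+1) - C_{\text{prior}}(N) &= (N+1)^2 - N^2 = 2N + 1, \\
C_{\text{app}}(N) &= 2N, &
C_{\text{app}}(N+1) - C_{\text{app}}(N) &= 2(N+1) - 2N = 2.
\end{align*}
```

For instance, with N = 3 data sources the prior art needs 9 checking processes and the application needs 6; adding a fourth data source adds 7 processes under the prior art and 2 under the application.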
Preferably, the data verification method further includes step S408 (not shown). In step S408, the data check equipment 1 obtains the second reference data corresponding to the second data to be verified; then, in step S407, the data check equipment 1 verifies the second data to be verified based on the second reference data.
Specifically, similarly to the first data to be verified in the data transmission from the source device to the data check equipment 1, the second data to be verified may show a certain quality deviation from the first data to be verified in the data check equipment 1, because it has passed through the data transmission from the data check equipment 1 to the purpose equipment. For reasons such as the expected loss in actual transmission or the accuracy needed by subsequent applications of the second data transmitted to the purpose equipment, certain limits may be placed on the transmission quality of the second data to be verified. To make the quality standard of the second data to be verified explicit, the corresponding second reference data is determined; this second reference data reflects the data quality that the synchronized data is expected to retain after being transmitted from the data check equipment 1 to the purpose equipment. The second reference data keeps the same data format as the second data to be verified, preferably the txt format. The second reference data can be determined based on the first data to be verified that corresponds to the second data to be verified, or directly based on the source synchronization data in the source device that corresponds to the first data to be verified. In addition, the second reference data can be set manually, or the second reference data corresponding to the second data to be verified can be derived automatically based on preset machine conditions.
Then, in step S407, the data check equipment 1 verifies the second data to be verified based on the second reference data. Here, the second reference data and the second data to be verified can be checked by methods such as full-text comparison or hash comparison, in order to judge whether the synchronized data was transmitted correctly. For example, the second data to be verified in txt text format and the second reference data in the corresponding format are each sorted as text, and a full-text comparison is then carried out with the corresponding system command to obtain a difference value for the comparison; the check result is determined from that difference value.
More preferably, obtaining the second reference data corresponding to the second data to be verified includes: when the check result corresponding to the first data to be verified is qualified, determining the second reference data based on the first data to be verified.
Specifically, the second data to be verified corresponds to the second data and therefore also to the first data to be verified. In the process by which the second data determined by the data check equipment 1 is transmitted from the data check equipment 1 to the purpose equipment, the ideal state is that no data quality is lost, that is, the second data to be verified is consistent with the corresponding first data to be verified; in this case the first data to be verified can serve as the second reference data. In addition, if the second data that corresponds to the first data to be verified and is obtained by the data check equipment 1 is subject to possible loss in actual transmission or to other factors affecting data accuracy, these factors can be taken into account when the expected second reference data is set. That is, the final second reference data can be determined based on the first data to be verified, combined with factors such as the quality loss in transmission that needs to be considered and the specific data types of the purpose data source in the purpose equipment. Preferably, the second data to be verified and the second reference data are checked by full-text comparison, and the condition for a qualified check is that the difference value of the comparison is 0.
Preferably, the data verification method further includes step S409 (not shown). In step S409, when the check result corresponding to the second data to be verified is qualified, the data check equipment 1 configures the data transmission channel from the source device to the purpose equipment.
Specifically, the check of the first data to be verified judges the data quality of the process in which synchronized data is read from the source device into the data check equipment, for example whether the data synchronization quality meets a preset requirement; likewise, the check of the second data to be verified judges whether the data quality of the process in which the data check equipment writes synchronized data to the purpose equipment meets a preset requirement. When the check results corresponding to both the first data to be verified and the second data to be verified are qualified, the transmission quality of the data transmission channel from the source device to the purpose equipment is determined to meet the preset requirement, and data transmission from the data source of the source device to the corresponding data source of the purpose equipment can then be configured. The check of the first data to be verified and the check of the second data to be verified can each be decided from the result of a single group of data. Alternatively, multiple groups of data checks can be set up for each, and whether the corresponding data transmission channel meets the preset requirement is decided from the proportion of qualified check results. For example, 10 groups of synchronized data sent from the data check equipment 1 to the purpose equipment are determined, and it is specified that if the check results of 9 or more of the 10 groups are qualified, the data transmission channel from the data check equipment 1 to the purpose equipment is judged to be clear and to meet the preset data quality requirement. When the data check result corresponding to the first data to be verified or the second data to be verified is unqualified, the reference data corresponding to the first or second data to be verified can be adjusted according to the actual data transmission conditions, or other adaptive adjustments can be made to the data transmission process to repair the transmission channel. Preferably, corresponding data checks can be carried out regularly on the configured data transmission channel to ensure that data transmission remains accurate.
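The multi-group decision described above can be sketched as a simple ratio test. The function name and the `min_ratio` parameter are hypothetical; the default of 9/10 is taken from the "9 or more of 10 groups" example, and exact fractions are used so that the boundary case is not affected by floating-point rounding.

```python
from fractions import Fraction
from typing import Iterable


def channel_meets_requirement(
    check_results: Iterable[bool],
    min_ratio: Fraction = Fraction(9, 10),
) -> bool:
    """True when the share of qualified group checks reaches min_ratio."""
    results = list(check_results)
    if not results:
        raise ValueError("at least one group check result is required")
    qualified = sum(1 for ok in results if ok)
    return Fraction(qualified, len(results)) >= min_ratio


# Ten group checks, nine of them qualified: the channel is judged acceptable.
ten_groups = [True] * 9 + [False]
channel_ok = channel_meets_requirement(ten_groups)
```

A single-group decision is the special case of one result with `min_ratio` set to 1.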
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential attributes. Therefore, the embodiments are to be considered in all respects as illustrative and not restrictive, and the scope of the invention is defined by the appended claims rather than by the above description; all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced in the invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" obviously does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices stated in a device claim can also be implemented by a single unit or device through software or hardware. Words such as first and second are used to denote names and do not indicate any particular order.

Claims (14)

1. A data verification method, comprising:
obtaining first data sent from a source device to purpose equipment;
performing format conversion processing on the first data to obtain corresponding first data to be verified;
verifying the first data to be verified;
when the check result corresponding to the first data to be verified is qualified, performing format conversion processing on the first data to be verified to obtain second data based on a second data format corresponding to the purpose equipment;
sending the second data to the purpose equipment;
obtaining the second data received by the purpose equipment, and performing format conversion processing on it to obtain corresponding second data to be verified;
verifying the second data to be verified;
when the check result corresponding to the second data to be verified is qualified, configuring a data transmission channel from the source device to the purpose equipment.
2. The method according to claim 1, wherein the method further comprises:
obtaining second reference data corresponding to the second data to be verified;
wherein verifying the second data to be verified comprises:
verifying the second data to be verified based on the second reference data.
3. The method according to claim 2, wherein obtaining the second reference data corresponding to the second data to be verified comprises:
when the check result corresponding to the first data to be verified is qualified, determining the second reference data based on the first data to be verified.
4. The method according to claim 1, wherein the method further comprises:
obtaining first reference data corresponding to the first data to be verified;
wherein verifying the first data to be verified comprises:
verifying the first data to be verified based on the first reference data.
5. The method according to claim 4, wherein obtaining the first reference data corresponding to the first data to be verified comprises:
obtaining the first reference data from the source device, wherein the first reference data corresponds to the first data.
6. The method according to claim 4 or 5, wherein verifying the first data to be verified comprises:
performing a full-text comparison of the first reference data and the first data to be verified;
if the full-text comparison matches, determining that the check result corresponding to the first data to be verified is qualified.
7. The method according to claim 1, wherein the first data is based on a first data format different from the second data format.
8. a kind of data check equipment, comprising:
First device, for obtaining the first data for being sent to purpose equipment from source device;
Second device, for formatting processing to first data, to obtain corresponding first data to be verified;
3rd device, for being verified to the described first data to be verified;
4th device, it is to be verified to described first for when the corresponding check results of the described first data to be verified are qualified Data format processing, to obtain the second data based on the second data format corresponding to the purpose equipment;
5th device, for second data to be sent to the purpose equipment;
6th device for obtaining received second data of the purpose equipment institute, and formats processing to it, To obtain corresponding second data to be verified;
7th device, for being verified to the described second data to be verified;
9th device, for configuring the source device extremely when the corresponding check results of the described second data to be verified are qualified The data transmission channel of the purpose equipment.
9. equipment according to claim 8, wherein the equipment further include:
8th device, for obtaining the second reference data corresponding to the described second data to be verified;
Wherein, the 7th device is used for:
Based on described second referring to data, the described second data to be verified are verified.
10. equipment according to claim 9, wherein the 8th device is used for:
When the corresponding check results of the described first data to be verified are qualified, second is determined based on the described first data to be verified Referring to data.
11. The equipment according to claim 8, wherein the equipment further comprises:
A tenth device, configured to obtain first reference data corresponding to the first data to be verified;
Wherein the third device is configured to:
Verify the first data to be verified based on the first reference data.
12. The equipment according to claim 11, wherein the tenth device is configured to:
Obtain the first reference data from the source device, wherein the first reference data corresponds to the first data.
13. The equipment according to claim 11 or 12, wherein the third device is configured to:
Perform a full-text comparison between the first reference data and the first data to be verified;
If the full-text comparison matches, determine that the verification result for the first data to be verified is a pass.
14. The equipment according to claim 8, wherein the first data is based on a first data format that is different from the second data format.
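
As a non-limiting illustration of the flow recited in claims 8 to 14, the following Python sketch formats source data into a simple intermediate text, verifies it by full-text comparison against reference data, reformats it for the destination, and then verifies what the destination actually received. The CSV source format, the JSON destination format, the pipe-delimited intermediate form, and all function names (`csv_to_canonical`, `canonical_to_json`, `json_to_canonical`, `full_text_match`, `verify_and_forward`) are assumptions made for this example and do not appear in the application. The first reference data is assumed to be a second read from the source device in the source format.

```python
import csv
import io
import json
from typing import Callable


def csv_to_canonical(text: str) -> str:
    """Format source-side CSV into the simple intermediate text that gets verified.

    Assumption: cells never contain '|' or newlines, so the joined text is unambiguous.
    """
    rows = csv.reader(io.StringIO(text))
    return "\n".join("|".join(cell.strip() for cell in row) for row in rows if row)


def canonical_to_json(canonical: str) -> str:
    """Format the intermediate text into the (assumed) destination format, JSON."""
    return json.dumps([line.split("|") for line in canonical.splitlines()])


def json_to_canonical(payload: str) -> str:
    """Format destination-side JSON back into the intermediate text."""
    return "\n".join("|".join(row) for row in json.loads(payload))


def full_text_match(reference: str, candidate: str) -> bool:
    """Full-text comparison: a pass only if the two texts are identical."""
    return reference == candidate


def verify_and_forward(
    first_data: str,
    first_reference: str,
    send: Callable[[str], None],
    read_back: Callable[[], str],
) -> bool:
    """Return True when both verifications pass, i.e. the channel may be configured."""
    # Format the first data into the first data to be verified.
    first_to_verify = csv_to_canonical(first_data)
    # Verify against the first reference data obtained from the source.
    if not full_text_match(csv_to_canonical(first_reference), first_to_verify):
        return False  # nothing is sent when the first verification fails
    # Format into the destination's data format and send it.
    send(canonical_to_json(first_to_verify))
    # Read what the destination received and format it for verification.
    second_to_verify = json_to_canonical(read_back())
    # The second reference data is derived from the already verified first data.
    second_reference = first_to_verify
    return full_text_match(second_reference, second_to_verify)


if __name__ == "__main__":
    destination = []  # hypothetical stand-in for the destination device
    passed = verify_and_forward(
        "id,name\n1, Alice \n",
        "id,name\n1,Alice\n",
        destination.append,
        lambda: destination[-1],
    )
    print("channel may be configured:", passed)
```

A `True` result corresponds to the condition under which the ninth device would configure the data transmission channel; how the channel itself is configured is outside this sketch. Because both verifications operate on the same intermediate text, adding another destination format in this sketch only requires another pair of formatting functions rather than a new comparison routine.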
CN201510249497.5A 2015-05-15 2015-05-15 A kind of data verification method and equipment Active CN106293977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510249497.5A CN106293977B (en) 2015-05-15 2015-05-15 A kind of data verification method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510249497.5A CN106293977B (en) 2015-05-15 2015-05-15 A kind of data verification method and equipment

Publications (2)

Publication Number Publication Date
CN106293977A CN106293977A (en) 2017-01-04
CN106293977B CN106293977B (en) 2019-04-05

Family

ID=57631984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510249497.5A Active CN106293977B (en) 2015-05-15 2015-05-15 A kind of data verification method and equipment

Country Status (1)

Country Link
CN (1) CN106293977B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019351A (en) * 2017-09-05 2019-07-16 中国移动通信有限公司研究院 A kind of data detection method, device and computer readable storage medium
CN109634846B (en) * 2018-11-16 2021-10-19 武汉达梦数据库股份有限公司 ETL software testing method and device
CN110032513B (en) * 2019-04-02 2022-09-09 中汇信息技术(上海)有限公司 Data verification method and device and electronic equipment
CN110457153A (en) * 2019-07-18 2019-11-15 北京顺丰同城科技有限公司 Data check processing method and processing device
CN110704325B (en) * 2019-10-09 2021-07-30 京东数字科技控股有限公司 Data processing method and device, computer storage medium and electronic equipment
CN110831010B (en) * 2019-10-21 2024-04-16 上海鹄恩信息科技有限公司 Multichannel data sending and receiving method and device and data transmission system
CN110781647B (en) * 2019-10-29 2023-07-04 浪潮云信息技术股份公司 Method for realizing data format verification based on Flink
CN111311014B (en) * 2020-02-27 2024-04-12 广州酷旅旅行社有限公司 Service data processing method, device, computer equipment and storage medium
CN113568966A (en) * 2021-07-29 2021-10-29 上海哔哩哔哩科技有限公司 Data processing method and system used between ODS layer and DW layer

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819537A (en) * 2011-09-29 2012-12-12 金蝶软件(中国)有限公司 Method and system for data exchange in heterogeneous system
CN103577611A (en) * 2013-11-25 2014-02-12 方正国际软件有限公司 Data unifying device and data unifying method
CN104462604A (en) * 2014-12-31 2015-03-25 成都市卓睿科技有限公司 Data processing method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130126827A (en) * 2012-04-30 2013-11-21 (주)유니디아 Method for transforming database, apparatus therefor and recording medium therefor

Also Published As

Publication number Publication date
CN106293977A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
CN106293977B (en) A kind of data verification method and equipment
US10608827B1 (en) Systems and methods for computer digital certificate management and analysis
CN106528418B (en) A kind of test method and device
CN109960653A (en) Regression testing method, device, equipment and storage medium
CN109561106B (en) Ship communication message real-time analysis and filtering method
CN108133007A (en) A kind of method of data synchronization and system
US20150140956A1 (en) Methods, systems, and computer readable media for call flow analysis using comparison level indicators
CN109067617A (en) A kind of V2X protocol conformance test method, apparatus and system
CN110633198A (en) Block chain-based software test data storage method and system
CN109784818A (en) Product data processing method, device, equipment and storage medium based on BOM
CN108241576A (en) A kind of interface test method and system
CN109933535A (en) Generation method, device and the server of test case
CN107360233A (en) Method, apparatus, equipment and the readable storage medium storing program for executing that file uploads
CN109934712A (en) Account checking method, account checking apparatus and electronic equipment applied to distributed system
CN111611622A (en) Block chain-based file storage method and electronic equipment
CN111629063A (en) Block chain based distributed file downloading method and electronic equipment
CN104809250A (en) Loose type data consistency checking method
CN102420724B (en) Method and device for testing north-orientation performance index
CN113094272B (en) Application testing method, device, electronic equipment and computer readable medium
CN109472012A (en) A kind of management method and device of electronization test report
CN109977006A (en) Order matching process, device, equipment and storage medium
CN106294721A (en) A kind of company-data statistics and deriving method and device
CN110515910A (en) Data processing method, device and computer readable storage medium between heterogeneous system
CN109889285B (en) Multi-user test method and device
CN114006678B (en) Method for quickly acquiring source of received frame by FC-AE equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant