A large-scale data verification method for embedded systems
Technical Field
The present invention relates to a data verification method, and in particular to a large-scale data verification method for embedded systems.
Background
Large-scale data is a product of the rapid development of information technology and the steady advance of high-capacity storage. Its emergence, however, raises another problem: how to determine quickly whether the generated data is correct. Because large-scale data typically amounts to hundreds of megabytes or more, verifying it is very difficult. The problem is especially pronounced in embedded systems, where storage is scarce and processor speed is low.
Owing to their particular nature, embedded systems fall far behind PCs in both processing speed and storage space. The widespread use of large-scale data in embedded systems therefore exposes the following shortcomings of traditional data verification methods when applied to large-scale data:
(1) Embedded multimedia development frequently requires verifying large volumes of data, decoded data for example, yet the memory on the development board is often insufficient. In that situation, verifying the correctness of the decoded data is troublesome and time-consuming;
(2) The volume of data before decoding generally differs greatly from the volume after decoding. For example, a compressed bitstream of 1 MB may yield tens or even hundreds of megabytes of decompressed data after decoding. The decoded data is therefore very large, and storing it in memory inevitably increases cost.
Because the computing speed of an embedded system is far below that of a PC, the time-consuming work is usually left to the PC while the embedded system performs only the key, simple tasks; only in this way can the efficiency of data verification be improved. The current methods of data verification in embedded systems, and their shortcomings, are roughly as follows:
Method one: To check whether the data in two files is identical, the usual practice is either to read all of the data to be compared into memory buffers at once (when the files are small or memory is large enough), or to read the file data into the buffers segment by segment in a loop, and then to compare byte by byte or bit by bit. With this method, as the original data grows, a second data block of the same size must also be prepared before any comparison can take place, so storage consumption multiplies. With small amounts of data this may have little effect on the embedded system, but when each comparison involves tens of megabytes or more, an embedded system (hardware and software included) that uses this traditional method becomes inefficient.
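For illustration only, the segment-by-segment variant of Method one might be sketched as follows. The function name `streams_equal` and the 4096-byte chunk size are assumptions made for this sketch and do not come from the original text. The sketch makes the drawback visible: the complete reference data must be available on the device alongside the data under test.

```python
import io


def streams_equal(a, b, chunk_size=4096):
    """Compare two binary streams chunk by chunk (the traditional approach).

    Hypothetical helper: both complete data sets must be readable on the
    device, which is what makes this method expensive in storage.
    """
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            return False
        if not chunk_a:  # both streams ended at the same point
            return True


# Stand-in data; on a real system these would be two large files.
same = streams_equal(io.BytesIO(b"abc" * 5000), io.BytesIO(b"abc" * 5000))
different = streams_equal(io.BytesIO(b"abc" * 5000), io.BytesIO(b"abd" * 5000))
```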
Method two: Because the data is too large, it is usually uploaded from the embedded system to an external device (such as a PC) and verified there. Uploads from an embedded terminal device to a PC or other device generally go over a serial port or USB (Universal Serial Bus). USB transfers from an embedded system are constrained by the file system, the flash memory, the processing power of the embedded system and other factors, so transfer performance drops considerably. Tests show that the maximum upload speed of an embedded system over USB 2.0 is currently within 5 MB/s. Consider comparing 1,000,000 frames of YUV data decoded from H.263 (a video coding standard) in CIF format (width × height = 352 × 288): the data volume is 1,000,000 × (352 × 288 × 2) = 202,752,000,000 bytes, and transferring it would take roughly 640 minutes, that is, more than 10 hours. This is very inefficient. Although the method eases the shortage of storage space to some extent, its transfer capacity falls far short of what large-scale data requires.
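The transfer-time estimate for Method two can be reproduced with a few lines of arithmetic. The sketch below assumes that "5 MB/s" means 5 × 1024 × 1024 bytes per second, which is the reading that gives a figure close to the 640 minutes quoted; the variable names are illustrative only.

```python
# Data volume of 1,000,000 CIF frames at 2 bytes per pixel, as in the text.
frames = 1_000_000
bytes_per_frame = 352 * 288 * 2
total_bytes = frames * bytes_per_frame

# Assumption: 5 MB/s is taken as 5 * 1024 * 1024 bytes per second.
rate_bytes_per_second = 5 * 1024 * 1024

seconds = total_bytes / rate_bytes_per_second
minutes = seconds / 60
hours = minutes / 60

print(f"{total_bytes:,} bytes, about {minutes:.0f} minutes ({hours:.1f} hours)")
```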
Summary of the Invention
The object of the present invention is to provide a large-scale data verification method for embedded systems that is simple, efficient, low in cost and accurate in its verification.
This object is achieved by the following technical scheme: a large-scale data verification method for embedded systems, whose processing comprises the following steps:
Step 1: on the PC, divide the raw data to be verified into segments as required, then obtain the target key check code of each indexed data segment using a key algorithm processor;
Step 2: store the target key check codes obtained in Step 1 in a corresponding target key check code file;
Step 3: transfer the target key check code file obtained in Step 2 from the PC to the embedded system through a transmission device (USB or serial port), and save it as a file;
Step 4: in the embedded system, generate a check code for each corresponding data segment using the key algorithm, compare it with the target key check code, and process the result.
The detailed processing in Step 1 is as follows: first read one segment of data, compute the target key check code of that segment with the key algorithm processor, and store the target key check code in the corresponding key check code file. Repeat this process until all the raw data to be verified has been processed. This yields the target key check codes and the key check code file that contains them.
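A minimal sketch of this PC-side loop is given below, assuming MD5 as the key algorithm (one of the hash algorithms the description names) and a check code file that simply concatenates the 16-byte digests in segment order. The function name `build_check_codes`, the segment size and the in-memory stand-in data are hypothetical choices made for the example.

```python
import hashlib
import io

DIGEST_SIZE = 16  # length of an MD5 digest in bytes


def build_check_codes(source, segment_size):
    """Return the concatenated MD5 digests of consecutive segments of source.

    Hypothetical helper: the returned bytes stand in for the contents of
    the target key check code file.
    """
    codes = bytearray()
    while True:
        segment = source.read(segment_size)
        if not segment:  # all raw data has been processed
            break
        codes += hashlib.md5(segment).digest()
    return bytes(codes)


# Stand-in raw data; on the PC this would be the large file to be verified.
raw = bytes(range(256)) * 40  # 10,240 bytes
check_file = build_check_codes(io.BytesIO(raw), segment_size=1024)
```

Ten segments of 1,024 bytes produce ten 16-byte check codes, so the data that has to be sent to the embedded system shrinks from 10,240 bytes to 160 bytes.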
In Step 3, the key check code file is transferred from the PC to the embedded system in a single pass through the transmission device (USB) and loaded into system memory, forming a one-dimensional table of target key check codes.
The detailed processing in Step 4 is as follows:
From the raw data on the embedded system, read a segment whose index position matches that of the data segment on the PC;
The embedded system generates a check code for this segment using the key algorithm, reads the first target key check code either sequentially or by index, and compares it with the key check code just generated on the embedded system;
If the two match, processing continues with the next segment to be compared: its check code is generated and compared with the next target key check code. If they do not match, the data verification routine exits and either starts the next round of data checking or ends the verification work.
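The Step 4 procedure can be sketched as follows, again assuming MD5 and a table of concatenated 16-byte digests. The function name `verify_segments` and its return convention (index of the first mismatching segment, or `None`) are assumptions made for this illustration. A real embedded implementation would more likely be written in C, but the control flow would be the same.

```python
import hashlib
import io

DIGEST_SIZE = 16  # length of an MD5 digest in bytes


def verify_segments(data, table, segment_size):
    """Return the index of the first mismatching segment, or None if all match.

    Hypothetical helper: `table` holds the target check codes back to back,
    so entry i occupies bytes [i * 16, (i + 1) * 16).
    """
    index = 0
    while True:
        segment = data.read(segment_size)
        if not segment:
            break
        target = table[index * DIGEST_SIZE:(index + 1) * DIGEST_SIZE]
        if hashlib.md5(segment).digest() != target:
            return index  # mismatch: stop verifying
        index += 1
    # Fewer segments than table entries also counts as a mismatch.
    return None if index * DIGEST_SIZE == len(table) else index


# Stand-in data: the reference is what the PC hashed, `corrupted` is what
# the embedded system produced with one damaged byte in segment 2.
reference = bytes(range(256)) * 8  # 2,048 bytes, four 512-byte segments
table = b"".join(
    hashlib.md5(reference[i:i + 512]).digest() for i in range(0, 2048, 512)
)
corrupted = bytearray(reference)
corrupted[1500] ^= 0xFF

good = verify_segments(io.BytesIO(reference), table, 512)
bad = verify_segments(io.BytesIO(bytes(corrupted)), table, 512)
```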
Compared with the prior art, the method of the present invention has the following advantages:
1. It greatly reduces the storage space needed for data, and therefore the cost of memory;
2. It speeds up data comparison and improves the efficiency of data checking;
3. Compared with cumbersome traditional data verification methods, it greatly reduces operational complexity;
4. Because mature key algorithms hash strongly, the correctness of the check result and the uniqueness of the verification result are assured.
Brief Description of the Drawings
Fig. 1 is an overall flow diagram of the present invention;
Fig. 2 is a flow diagram of the PC-side preprocessing of the present invention;
Fig. 3 is a flow diagram of the embedded-system-side verification processing of the present invention.
Detailed Description
As shown in Fig. 1, Fig. 2 and Fig. 3, the large-scale data verification method for embedded systems of the present invention comprises the following steps:
(1) On the PC, read the raw data file to be verified segment by segment in order, and feed the data read into a key algorithm processor (for example, a hash algorithm such as MD5 or SHA1). Use the key algorithm processor to compute the target key check code of the current segment, then store it in the corresponding key check code file. Repeat this process until all the raw data to be verified has been processed, yielding the key check codes and the key check code file that contains them.
(2) Transfer the resulting target key check code file from the PC to the embedded system in a single pass through a transmission device (USB or serial port), and save it as a file.
The above steps prepare for the later data check. Preprocessing on the PC side is adopted to reduce the embedded system's workload and storage requirements as far as possible: since the data check itself is carried out inside the embedded system, most of the time-consuming work must first be completed on the PC. In addition, because the PC does not affect the normal operation of the target system while doing this work, combining PC-side preprocessing of the bulk data with checking on the embedded system greatly improves efficiency.
The generated target key check code file is not large. For 100,000 pieces of data, for example, the target key check codes from the PC occupy 100,000 × 16 bytes = 1,600,000 bytes ≈ 1.5 MB. A typical embedded system can therefore load the key check code file into memory in one pass, forming a virtual one-dimensional key table that is read during later comparisons.
(3) Before the comparison, load the check code file previously downloaded to the embedded system into system memory in a single pass, forming a one-dimensional table.
(4) Then, in the embedded system, read a segment of raw data whose index position matches that of the data segment on the PC, generate the key check code of this segment in the embedded system using the key algorithm (for example, a hash algorithm such as MD5 or SHA1), and finally compare it with the target key check code at the same index position in the one-dimensional key table. After the comparison, the result can be processed (for example, by collecting statistics).
Once the embedded system has generated the key check code for the data segment at a given index, it reads the first target key check code from the one-dimensional key table, either sequentially or by index, and compares it with the key check code it has just generated. If the two are not equal, the data verification routine exits; otherwise it goes on to the next group of data to be compared, generates the corresponding key check code, and compares it with the entry at the next index position in the one-dimensional key table.
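The one-dimensional key table of steps (3) and (4) can be pictured as a flat list of 16-byte entries indexed by segment number. The sketch below assumes MD5 and illustrates the "collect statistics" style of result processing, recording every mismatching index instead of exiting at the first one. The names `load_key_table` and `find_mismatches`, and the tiny stand-in segments, are hypothetical.

```python
import hashlib

DIGEST_SIZE = 16  # length of an MD5 digest in bytes


def load_key_table(check_file_bytes):
    """Split the check code file contents into a list of 16-byte entries."""
    if len(check_file_bytes) % DIGEST_SIZE:
        raise ValueError("check code file is truncated")
    return [
        check_file_bytes[i:i + DIGEST_SIZE]
        for i in range(0, len(check_file_bytes), DIGEST_SIZE)
    ]


def find_mismatches(segments, key_table):
    """Compare each segment with the table entry at the same index."""
    mismatched = []
    for index, segment in enumerate(segments):
        if hashlib.md5(segment).digest() != key_table[index]:
            mismatched.append(index)
    return mismatched


# Stand-in data: what the PC hashed versus what the device decoded.
expected = [b"frame-0", b"frame-1", b"frame-2"]
check_file = b"".join(hashlib.md5(s).digest() for s in expected)

key_table = load_key_table(check_file)
decoded = [b"frame-0", b"frame-X", b"frame-2"]
mismatched = find_mismatches(decoded, key_table)
```

Because every entry has the same length, lookup by index is a constant-time operation, which is what allows the table to be read either sequentially or by index.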
By combining early preprocessing on the PC side with later result comparison in the embedded system, the saving is clear. If 1,000,000 pieces of data must be compared, the target key check codes from the PC occupy 1,000,000 × 16 bytes = 16,000,000 bytes ≈ 15 MB. With the most primitive comparison method, taking YUV data decoded from H.263 in CIF format as the example, 15 MB holds fewer than 80 frames of YUV data. In other words, with the present invention 15 MB of space covers 1,000,000 pieces of data, whereas with the traditional verification method the same 15 MB holds only about 80. The technical scheme of the present invention therefore achieves higher efficiency on identical hardware. The invention is particularly suited to checking and comparing the decoded output of compressed multimedia data.
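The storage comparison above can be checked with the following arithmetic, which assumes 16-byte check codes (the MD5 digest length) and the 352 × 288 × 2 bytes per CIF frame used earlier in the description. The variable names are illustrative.

```python
DIGEST_SIZE = 16             # bytes per check code (MD5)
FRAME_BYTES = 352 * 288 * 2  # bytes per CIF frame, as used in the text

segments = 1_000_000
table_bytes = segments * DIGEST_SIZE
table_mib = table_bytes / (1024 * 1024)

# Number of whole raw frames that fit in the same amount of space.
frames_in_same_space = table_bytes // FRAME_BYTES

print(f"table: {table_bytes:,} bytes ({table_mib:.2f} MiB)")
print(f"raw frames in the same space: {frames_in_same_space}")
```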
The present invention thus saves a large amount of storage space and greatly improves efficiency, making it possible to verify large-scale data even when storage space is very limited.