CN103106068B - Internet of things big data fast calibration method - Google Patents


Info

Publication number: CN103106068B
Authority: CN (China)
Prior art keywords: file, block, block file, files, client
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201310066481.1A
Other languages: Chinese (zh)
Other versions: CN103106068A (en)
Inventors: 王勃, 陈曙东, 陈岚
Current Assignee: Wuxi Yuantong Electronics Co.,Ltd. (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Jiangsu Cas Internet-Of-Thing Technology Venture Capital Co Ltd
Priority date: 2013-02-28 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2013-02-28
Publication date: 2015-03-18
Application filed by: Jiangsu Cas Internet-Of-Thing Technology Venture Capital Co Ltd
Priority to: CN201310066481.1A
Publication of CN103106068A: 2013-05-15
Publication of CN103106068B (application granted): 2015-03-18

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a fast calibration method for Internet of Things big data, comprising a large file upload verification method and a large file download calibration method. The upload verification method comprises steps such as segmenting and preprocessing the large file, verifying the block files and joint nodes with multiple threads, uploading with multiple threads, forming a verification file, and uploading the verification file. The download calibration method comprises steps such as downloading the verification file, downloading the block files concurrently with multiple threads, verifying the correctness of the block files concurrently with multiple threads, combining the files, and checking the overall correctness of the combined file. The method is used to quickly verify big data generated in the Internet of Things and effectively resolves the speed bottleneck of large file verification in Internet of Things big data processing.

Description

Internet of Things big data fast calibration method
Technical field
The present invention relates to a data processing method, and in particular to a fast calibration method for Internet of Things big data.
Background art
In the era of rapid Internet of Things development, as video surveillance equipment and Internet access become widespread, the characteristics of big data are increasingly evident. The industry currently devotes considerable research to distributed storage of all types of data, but comparatively little to data verification, where new breakthroughs are rare. In practice, the industry mostly verifies files directly with traditional algorithms such as MD5, CRC32 and SHA1. The ideas behind these algorithms are as follows:
(1) MD5: Briefly, MD5 processes the input in 512-bit blocks, each of which is divided into sixteen 32-bit sub-blocks. After a series of processing steps, the algorithm outputs four 32-bit words, which are concatenated to produce a 128-bit hash value.
(2) CRC: The basic idea of a CRC check code is to use linear coding theory. At the sending end, an r-bit check code (the CRC code) is generated by a fixed rule from the k-bit binary sequence to be transmitted and appended to the information, forming a new binary sequence of k+r bits, which is then sent. At the receiving end, the rule that holds between the information code and the CRC code is checked to determine whether an error occurred in transmission.
(3) SHA: SHA stands for Secure Hash Algorithm. The SHA family includes SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512 (the latter four are usually called SHA-2), and its principle is similar to that of MD4 and MD5. SHA-1 can convert a message of up to 2^64 bits (2305843009213693952 bytes) into a 160-bit (20-byte) hash value (message digest), and is currently the most widely used hash algorithm.
As these descriptions show, the mainstream file verification methods all turn the raw data into a continuous bit stream, process that stream iteratively, and finally compute the result. When a large single file is handled in this way, the whole file must be downloaded completely and then processed as one continuous bit stream; it cannot be verified while it is being downloaded. The time and space needed to verify the file are therefore high and system overhead increases, which seriously slows the system's handling of large file uploads and downloads.
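For illustration only, the traditional whole-file mode described above can be sketched in Python with the standard `hashlib` and `zlib` modules. The function name `whole_file_checksums` and the 1 MiB read size are assumptions made for this example and do not come from the patent.

```python
import hashlib
import zlib


def whole_file_checksums(path, chunk_size=1 << 20):
    """Stream a complete file once and return (MD5, SHA-1, CRC32) as hex strings.

    The file is treated as one continuous byte stream, so it must be fully
    available locally before the final digests can be compared.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Each digest is updated iteratively with the next piece of the stream.
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)
    return md5.hexdigest(), sha1.hexdigest(), format(crc & 0xFFFFFFFF, "08x")
```

The returned strings have 32, 40 and 8 hexadecimal digits, matching the 128-bit, 160-bit and 32-bit outputs of the three algorithms. The loop is strictly sequential, which is the limitation the invention addresses.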
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by proposing a fast calibration method for Internet of Things big data, comprising a large file upload verification method and a large file download calibration method. The large file upload verification method segments and preprocesses the large file to form multiple block files and joint nodes, then verifies them concurrently using multiple threads and uploads them concurrently using multiple threads. The large file download calibration method downloads the block files to the client concurrently using multiple threads, verifies the correctness of the block files concurrently using multiple threads, combines the files, and finally checks the overall correctness of the combined file. The technical solution adopted by the invention is:
A fast calibration method for Internet of Things big data, comprising a large file upload verification method and a large file download calibration method;
The large file upload verification method specifically comprises the following steps:
Step 101: the client calls the large file storage API;
Step 102: the large file is preprocessed for segmentation, which includes calculating the file size, the number of block files to be split, the data offset of each block file, and the number of joint nodes at the junctions between block files together with their node offsets; the large file is divided into multiple block files and each block file is assigned a corresponding sequence number, thereby forming the segmentation information;
Step 103: a block file verification thread pool is initialized, and a corresponding number of threads is started according to the number of block files and the number of joint nodes to perform concurrent verification; each block file and each joint node is verified separately, yielding the check information of each block file and each joint node. In this step, MD5 or CRC32 verification may be used for the concurrent verification of block files and joint nodes.
Step 104: the client establishes a connection with the distributed file system of the storage server cluster, uploads the block files formed above to the distributed file system concurrently using multiple threads, and records the storage location of each block file in the distributed file system;
Step 105: the segmentation information of the large file, the check information of each block file and each joint node, and the storage location of each block file in the distributed file system are saved in a structured data file, forming the verification file;
Step 106: the verification file formed in step 105 is uploaded to the server;
The large file download calibration method specifically comprises the following steps:
Step 201: the client calls the large file download API;
Step 202: a connection is established with the storage server cluster, the corresponding verification file is looked up by the name and file ID of the large file to be downloaded, and this verification file is downloaded;
Step 203: storage space matching the size of the file to be downloaded is allocated on the client disk to hold the large file;
Step 204: according to the storage location of each block file in the distributed file system recorded in the verification file, the client downloads the block files concurrently using multiple threads, and verifies the correctness of the block files concurrently using multiple threads according to the check information of each block file in the verification file;
Step 205: the block files are combined into a file according to their sequence numbers and data offsets, and the file is saved on the client;
Step 206: finally, the joint node verification mechanism is used to check the overall correctness of the combined file; the client verifies the correctness of the joint nodes concurrently using multiple threads according to the check information of each joint node recorded in the verification file.
Advantages of the invention: the invention uses multithreading to verify large data files block by block in parallel, saves the verification information in a structured data file, and uploads it to the server for storage together with the block files. On download, the block files are downloaded concurrently using multiple threads and their correctness is verified concurrently using multiple threads. This new file verification mode greatly increases file verification speed and effectively improves large file verification speed in the Internet of Things. The method is clearly layered, general, widely applicable, and verifies files quickly and reliably. It effectively resolves the large file verification speed bottleneck in Internet of Things big data processing and improves the overall performance of the distributed file system.
Brief description of the drawings
Fig. 1 is a flow chart of large file upload verification according to the invention.
Fig. 2 is a flow chart of large file download verification according to the invention.
Fig. 3 is a schematic diagram of a joint node according to the invention.
Embodiments
The invention is further described below with reference to the drawings and embodiments.
According to the technical solution provided by the invention, the new fast calibration method for Internet of Things big data comprises two parts: a large file upload verification method and a large file download calibration method. The large file upload verification method comprises steps such as segmenting and preprocessing the large file, verifying the block files and joint nodes with multiple threads, uploading with multiple threads, forming the verification file, and uploading the verification file. The large file download calibration method comprises steps such as downloading the verification file, downloading the block files concurrently with multiple threads, verifying the correctness of the block files concurrently with multiple threads, combining the files, and checking the overall correctness of the combined file.
The large file upload verification method is shown in Fig. 1 and specifically comprises the following steps:
Step 101: the client calls the large file storage API;
Step 102: the large file is preprocessed for segmentation, which includes calculating the file size, the number of block files to be split, the data offset of each block file, and the number of joint nodes at the junctions between block files together with their node offsets; the large file is divided into multiple block files and each block file is assigned a corresponding sequence number, thereby forming the segmentation information;
Step 103: a block file verification thread pool is initialized, and a corresponding number of threads is started according to the number of block files and the number of joint nodes to perform concurrent verification; each block file and each joint node is verified separately, yielding the check information of each block file and each joint node. In this step, MD5 or CRC32 verification may be used for the concurrent verification of block files and joint nodes.
Step 104: the client establishes a connection with the distributed file system of the storage server cluster, uploads the block files formed above to the distributed file system concurrently using multiple threads, and records the storage location of each block file in the distributed file system;
Step 105: the segmentation information of the large file, the check information of each block file and each joint node, and the storage location of each block file in the distributed file system are saved in a structured data file, forming the verification file;
Step 106: the verification file formed in step 105 is uploaded to the server.
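As a rough sketch of steps 102, 103 and 105, the following Python code plans the split, checksums every block and every joint node in a thread pool, and writes the result as JSON. Everything specific here is an assumption for illustration: the function names, the record layout, the choice of JSON as the structured data format, and the placeholder `location` field. The actual upload of steps 104 and 106 depends on the distributed file system and is left out.

```python
import hashlib
import json
import os
from concurrent.futures import ThreadPoolExecutor


def md5_of_range(path, offset, length):
    """Return the MD5 hex digest of `length` bytes of `path` starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.md5(f.read(length)).hexdigest()


def build_verification_record(path, block_size, k, workers=4):
    """Plan the split (step 102), checksum blocks and joint nodes concurrently
    (step 103) and return the record that step 105 would save."""
    if not 0 < 2 * k < block_size:
        raise ValueError("k must be positive and less than half the block size")
    size = os.path.getsize(path)
    n_blocks = max(1, -(-size // block_size))  # ceiling division, at least one block
    blocks = [
        {"seq": i, "offset": i * block_size,
         "length": min(block_size, size - i * block_size),
         "location": None}  # hypothetical: filled in after the upload of step 104
        for i in range(n_blocks)
    ]
    # One joint node per internal boundary P, covering the bytes [P - k, P + k).
    joints = []
    for i in range(1, n_blocks):
        p = i * block_size
        joints.append({"boundary": p, "offset": p - k,
                       "length": min(p + k, size) - (p - k)})
    # Each block and each joint node is checksummed independently in the pool.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        items = blocks + joints
        sums = pool.map(lambda it: md5_of_range(path, it["offset"], it["length"]), items)
        for item, digest in zip(items, sums):
            item["md5"] = digest
    return {"name": os.path.basename(path), "size": size,
            "block_size": block_size, "k": k,
            "blocks": blocks, "joints": joints}


def save_verification_record(record, out_path):
    """Step 105: write the record as a structured (JSON) file."""
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
```

A call such as `build_verification_record("video.bin", block_size=64 << 20, k=4096)` (hypothetical file name and sizes) would yield one entry per block file and one entry per boundary between adjacent block files. MD5 is used because step 103 names it as one option; CRC32 could be substituted in `md5_of_range`.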
The large file download calibration method is shown in Fig. 2 and specifically comprises the following steps:
Step 201: the client calls the large file download API;
Step 202: a connection is established with the storage server cluster, the corresponding verification file is looked up by the name and file ID of the large file to be downloaded, and this verification file is downloaded;
Step 203: storage space matching the size of the file to be downloaded is allocated on the client disk to hold the large file;
Step 204: according to the storage location of each block file in the distributed file system recorded in the verification file, the client downloads the block files concurrently using multiple threads, and verifies the correctness of the block files concurrently using multiple threads according to the check information of each block file in the verification file;
Step 205: the block files are combined into a file according to their sequence numbers and data offsets, and the file is saved on the client;
Step 206: finally, the joint node verification mechanism is used to check the overall correctness of the combined file and thus ensure that the block files are combined correctly. The client verifies the correctness of the joint nodes concurrently using multiple threads according to the check information of each joint node recorded in the verification file.
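Steps 203 to 206 might look roughly like the following Python sketch. It assumes a verification record shaped as a dict with `size`, `blocks` (each with `offset`, `location`, `md5`) and `joints` (each with `offset`, `length`, `md5`); that layout, the function name `download_and_verify`, and the caller-supplied `fetch_block` callable standing in for the distributed file system client are all hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def download_and_verify(record, fetch_block, dest_path, workers=4):
    """Download, verify and assemble a large file; return True if every block
    file and every joint node matches its recorded checksum."""
    # Step 203: reserve space for the complete file on the client disk.
    with open(dest_path, "wb") as f:
        f.truncate(record["size"])

    def handle_block(block):
        # Step 204: fetch one block and check it against its recorded digest.
        data = fetch_block(block["location"])
        if hashlib.md5(data).hexdigest() != block["md5"]:
            return False
        # Step 205: place the block at its data offset in the target file.
        with open(dest_path, "r+b") as f:
            f.seek(block["offset"])
            f.write(data)
        return True

    def check_joint(joint):
        # Step 206: re-read the bytes around a boundary from the assembled file.
        with open(dest_path, "rb") as f:
            f.seek(joint["offset"])
            window = f.read(joint["length"])
        return hashlib.md5(window).hexdigest() == joint["md5"]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        if not all(pool.map(handle_block, record["blocks"])):
            return False
        return all(pool.map(check_joint, record["joints"]))
```

In this sketch each worker writes its block directly at the recorded offset, which is one possible reading of step 205; a client could equally keep the blocks as separate files and concatenate them by sequence number. Because fetching and hashing run in the same pool, blocks that are still downloading overlap with blocks that are being verified.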
As shown in Fig. 3, a joint node is located at a position where the file is split and contains part of the data of the preceding block file and part of the data of the adjacent following block file. If the offset of the boundary between two adjacent block files is P1, then the data block of size 2k between P1-k and P1+k constitutes a joint node. The value of k is determined by the verification speed and accuracy requirements: the higher the speed requirement, the smaller k; the higher the accuracy requirement, the larger k (k must be less than half the block file size, otherwise computation is wasted). Because the files are combined sequentially, some errors (for example stalled I/O buffering or blocks combined in the wrong order) can cause part of the large file's data to be "shifted". Verifying the joint nodes ensures that each data block is in the correct position and that the blocks are in the correct order, without verifying the large file as a whole again, so that a compromise between performance and verification speed can be obtained.
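A small Python sketch can make the joint node window concrete. The helper names `joint_ranges` and `joint_digests` are assumptions for this example, and MD5 is used only because the description lists it as an option.

```python
import hashlib


def joint_ranges(file_size, block_size, k):
    """Return the (start, end) byte range of each joint node.

    For every internal boundary P1 the joint node covers [P1 - k, P1 + k),
    clipped to the end of the file.
    """
    if not 0 < 2 * k < block_size:
        raise ValueError("k must be positive and less than half the block size")
    return [(p - k, min(p + k, file_size))
            for p in range(block_size, file_size, block_size)]


def joint_digests(data, block_size, k):
    """MD5 digest of every joint node of an in-memory byte string."""
    return [hashlib.md5(data[start:end]).hexdigest()
            for start, end in joint_ranges(len(data), block_size, k)]


# Three 4-byte blocks in the right order, and the same blocks with two swapped.
original = b"AAAA" + b"BBBB" + b"CCCC"
misordered = b"AAAA" + b"CCCC" + b"BBBB"

# Every individual block still has a valid per-block checksum after the swap,
# but the bytes straddling each boundary change, so the joint digests differ.
assert joint_digests(original, 4, 1) != joint_digests(misordered, 4, 1)
```

With `block_size=4` and `k=1` the two joint nodes cover bytes 3 to 4 and 7 to 8, so only four bytes are re-read to confirm the block order. A larger `k` examines more data around each boundary at higher cost, which matches the trade-off described above.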
In the fast calibration method for Internet of Things big data described above, the large file is split and verified, and after download it is combined and verified again. Suppose that in the traditional mode downloading the large file alone takes time T1 and directly verifying it takes time T2; the total time required is then T1+T2, with no parallel operation in between. After the file is split into n blocks, each verification only has to process a file 1/n of the original size, so each verification takes less time. With multi-core parallel processing, if m concurrent verification threads are started (m≤n), the verification time can approach T2/m. During download verification, block files that have not yet been downloaded can be downloaded while already downloaded block files are being verified. While guaranteeing file integrity and correctness, the new verification mode significantly increases large file processing speed, improves the overall performance of systems in the field of Internet of Things big data upload and download applications, and effectively resolves the large file verification speed bottleneck in Internet of Things big data processing.
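The timing argument can be written out as follows. This is an idealized estimate under assumptions not stated in the patent: verification time is proportional to data size, the n blocks are equal in size, the m threads run fully in parallel, and the small extra cost of the joint nodes is ignored.

```latex
\begin{align*}
T_{\text{traditional}} &= T_1 + T_2 \\
t_{\text{block}} &= \frac{T_2}{n} \\
T_{\text{verify}}(m) &= \left\lceil \frac{n}{m} \right\rceil \cdot \frac{T_2}{n} \;\ge\; \frac{T_2}{m}, \qquad m \le n
\end{align*}
```

Equality holds when m divides n. For example, with n = 8 blocks and m = 4 threads, verification takes two rounds of T2/8 each, that is T2/4.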

Claims (1)

1. A fast calibration method for Internet of Things big data, characterized in that it comprises a large file upload verification method and a large file download calibration method;
The large file upload verification method specifically comprises the following steps:
Step 101: the client calls the large file storage API;
Step 102: the large file is preprocessed for segmentation, which includes calculating the file size, the number of block files to be split, the data offset of each block file, and the number of joint nodes at the junctions between block files together with their node offsets; the large file is divided into multiple block files and each block file is assigned a corresponding sequence number, thereby forming the segmentation information;
Step 103: a block file verification thread pool is initialized, and a corresponding number of threads is started according to the number of block files and the number of joint nodes to perform concurrent verification; each block file and each joint node is verified separately, yielding the check information of each block file and each joint node;
Step 104: the client establishes a connection with the distributed file system of the storage server cluster, uploads the block files thus formed to the distributed file system concurrently using multiple threads, and records the storage location of each block file in the distributed file system;
Step 105: the segmentation information of the large file, the check information of each block file and each joint node, and the storage location of each block file in the distributed file system are saved in a structured data file, forming the verification file;
Step 106: the verification file formed in step 105 is uploaded to the storage server cluster;
The large file download calibration method specifically comprises the following steps:
Step 201: the client calls the large file download API;
Step 202: a connection is established with the storage server cluster, the corresponding verification file is looked up by the name and file ID of the large file to be downloaded, and this verification file is downloaded;
Step 203: storage space matching the size of the file to be downloaded is allocated on the client disk to hold the large file;
Step 204: according to the storage location of each block file in the distributed file system recorded in the verification file, the client downloads the block files concurrently using multiple threads, and verifies the correctness of the block files concurrently using multiple threads according to the check information of each block file in the verification file;
Step 205: the block files are combined into a file according to their sequence numbers and data offsets, and the file is saved on the client;
Step 206: finally, the joint node verification mechanism is used to check the overall correctness of the combined file, and the client verifies the correctness of the joint nodes concurrently using multiple threads according to the check information of each joint node recorded in the verification file.
CN201310066481.1A 2013-02-28 2013-02-28 Internet of things big data fast calibration method Active CN103106068B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310066481.1A CN103106068B (en) 2013-02-28 2013-02-28 Internet of things big data fast calibration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310066481.1A CN103106068B (en) 2013-02-28 2013-02-28 Internet of things big data fast calibration method

Publications (2)

Publication Number Publication Date
CN103106068A CN103106068A (en) 2013-05-15
CN103106068B true CN103106068B (en) 2015-03-18

Family

ID=48313959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310066481.1A Active CN103106068B (en) 2013-02-28 2013-02-28 Internet of things big data fast calibration method

Country Status (1)

Country Link
CN (1) CN103106068B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103442037A (en) * 2013-08-09 2013-12-11 华南理工大学 Method for achieving multithreading breakpoint upload of oversized file based on FTP
CN104361107A (en) * 2014-11-28 2015-02-18 浪潮电子信息产业股份有限公司 Script tool implementation method for quickly detecting file consistency at multiple nodes
CN104408147A (en) * 2014-12-02 2015-03-11 浪潮(北京)电子信息产业有限公司 Multithreading data uploading method
CN104462324A (en) * 2014-12-03 2015-03-25 浪潮电子信息产业股份有限公司 HDFS multithreaded parallel downloading method
CN106294585B (en) * 2016-07-28 2019-10-18 上海倍增智能科技有限公司 A kind of storage method under cloud computing platform
CN106341634A (en) * 2016-08-31 2017-01-18 武汉烽火众智数字技术有限责任公司 Video acquisition system based on hard disk video recorder and method thereof
CN106331146A (en) * 2016-09-08 2017-01-11 四川大学 Large-capacity data downloading method based on mobile cloud computing and system thereof
US10185550B2 (en) 2016-09-28 2019-01-22 Mcafee, Inc. Device-driven auto-recovery using multiple recovery sources
CN107977341A (en) * 2016-10-21 2018-05-01 北京航天爱威电子技术有限公司 Big data text immediate processing method
CN108156188B (en) * 2016-12-02 2021-06-01 中科星图股份有限公司 Data validity checking system
US11196623B2 (en) 2016-12-30 2021-12-07 Intel Corporation Data packaging protocols for communications between IoT devices
CN106790653B (en) * 2017-01-17 2020-04-24 上海泓智信息科技有限公司 File transmission processing method and device
CN109088907B (en) * 2017-06-14 2021-10-01 北京京东尚科信息技术有限公司 File transfer method and device
CN110661829B (en) * 2018-06-28 2021-09-21 杭州海康威视系统技术有限公司 File downloading method and device, client and computer readable storage medium
CN109324897A (en) * 2018-08-24 2019-02-12 平安科技(深圳)有限公司 Data uploading method and system, terminal and computer readable storage medium
CN110708363A (en) * 2019-09-20 2020-01-17 济南浪潮数据技术有限公司 File transmission method, system, electronic equipment and storage medium
CN110740345A (en) * 2019-09-20 2020-01-31 北京旷视科技有限公司 Offline video file speed doubling analysis method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102281121A (en) * 2010-06-13 2011-12-14 中兴通讯股份有限公司 Method, equipment and system for transmitting and verifying data file
CN102497597A (en) * 2011-12-05 2012-06-13 中国华录集团有限公司 Method for carrying out integrity checkout on HD (high-definition) video files

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007257386A (en) * 2006-03-24 2007-10-04 Mitsubishi Motors Corp Method and system for verifying data for vehicle electronic controller
US9063778B2 (en) * 2008-01-09 2015-06-23 Microsoft Technology Licensing, Llc Fair stateless model checking

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
大数据量高复杂度CRC校验的串并优化设计;霍旭东等;《铁道通信信号》;20120417;第48卷(第4期);第51-53页 *

Also Published As

Publication number Publication date
CN103106068A (en) 2013-05-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: JIANGSU CAS INTERNET-OF-THING TECHNOLOGY VENTURE C

Free format text: FORMER OWNER: JIANGSU INTERNET OF THINGS RESEARCH + DEVELOMENT CO., LTD.

Effective date: 20140612

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20140612

Address after: 214135 Jiangsu New District of Wuxi City Linghu Road No. 200 China Sensor Network International Innovation Park building C

Applicant after: JIANGSU CAS INTERNET-OF-THINGS TECHNOLOGY VENTURE CAPITAL CO.,LTD.

Address before: 214135 Jiangsu New District of Wuxi City Linghu Road No. 200 China Sensor Network International Innovation Park C building 4 floor

Applicant before: JIANGSU R & D CENTER FOR INTERNET OF THINGS

C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: WUXI ZHONGKE HENGYUAN INFORMATION TECHNOLOGY CO.,

Free format text: FORMER OWNER: JIANGSU CAS INTERNET-OF-THING TECHNOLOGY VENTURE CAPITAL CO., LTD.

Effective date: 20150331

COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 214135 WUXI, JIANGSU PROVINCE TO: 214016 WUXI, JIANGSU PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20150331

Address after: 214016 Jiangsu city of Wuxi province Tongjiang Chong'an District Road No. 898 7 floor

Patentee after: WUXI CAS FOREVERSOURCE INFORMATION TECHNOLOGY CO.,LTD.

Address before: 214135 Jiangsu New District of Wuxi City Linghu Road No. 200 China Sensor Network International Innovation Park building C

Patentee before: JIANGSU CAS INTERNET-OF-THINGS TECHNOLOGY VENTURE CAPITAL CO.,LTD.

TR01 Transfer of patent right

Effective date of registration: 20230424

Address after: 214001 Yongding Lane 1-1905, Liangxi District, Wuxi City, Jiangsu Province

Patentee after: Wuxi Yuantong Electronics Co.,Ltd.

Address before: 7 / F, 898 Tongjiang Avenue, Chong'an District, Wuxi City, Jiangsu Province

Patentee before: WUXI CAS FOREVERSOURCE INFORMATION TECHNOLOGY CO.,LTD.

TR01 Transfer of patent right