CN106611135A - Storage data integrity verification and recovery method - Google Patents


Info

Publication number
CN106611135A
Authority
CN
China
Prior art keywords
data
node
fingerprint
label
storage
Prior art date
Legal status
Pending
Application number
CN201610453804.6A
Other languages
Chinese (zh)
Inventor
范勇
胡成华
Current Assignee
Sichuan Yonglian Information Technology Co Ltd
Original Assignee
Sichuan Yonglian Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Sichuan Yonglian Information Technology Co Ltd
Priority to CN201610453804.6A
Publication of CN106611135A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Storage Device Security (AREA)

Abstract

The invention provides a storage data integrity verification and recovery method. The method comprises the following steps: first, the information data is partitioned and data evidence is generated; the data evidence is then mapped onto server nodes and stored by secondary pseudo-random placement; after a verification request from the user passes the permission check, the stored data block and the corresponding evidence label are returned to the user, and the information is computed and compared for consistency using a private key and an evidence key, which completes the verification. If the data has been attacked or tampered with, then when a node failure is detected in the system, the system returns the position of the faulty node, applies linear processing to data blocks smaller than the source file according to the secondary pseudo-random placement and a regenerating code, performs iterative computation over the valid nodes, and exactly recovers the data. With this method, a small amount of resources suffices to check whether data is complete or has been attacked or tampered with; because the data is stored by secondary pseudo-random placement, it can be recovered from the undamaged data blocks; and the method has relatively low storage and communication overhead and strong resistance to attack.

Description

A storage data integrity verification and recovery method
Technical field
The present invention relates to data integrity verification in computer storage and cloud storage, and to the recovery of tampered or damaged data.
Background
With the development of cloud computing, enterprises and individuals increasingly store their data in the cloud, both to save local storage space and to access the data anywhere at any time. Data in the cloud can also be shared so that others can download it. This raises data security problems: data stored in the cloud is outside the control of its owner, and data in cloud storage can be lost or damaged because the cloud service provider's system is unstable or because of a malicious attack. Users may then suffer losses from incomplete data without being aware of it. Two questions are therefore worth investigating: how to determine the integrity of cloud data, and, once stored data has been damaged and is no longer complete, how to exactly recover the changed part from the data that remains.
With the development of technology and of distributed storage systems, researchers have made many improvements and extensions to proof-of-retrievability and provable-data-possession schemes so that they support dynamic updates, reduce communication overhead, and support an unlimited number of verifications. It is not enough for users to detect whether data is in error; more importantly, the erroneous data must be recoverable. Current cloud storage systems store user data redundantly using two methods, replication and erasure codes, to ensure system reliability. Repair requires transmitting the whole file, which consumes a large amount of network resources, puts great pressure on distributed data centers, and causes network congestion that seriously degrades users' data read performance.
Regenerating codes have attracted researchers' attention because of their minimal repair bandwidth and are widely applied in research on distributed systems. Designing storage schemes and data integrity verification schemes around the characteristics of regenerating codes is highly significant for verifying data security in distributed systems. Data integrity verification using regenerating codes has therefore become a current research focus.
Summary of the invention
To address the above deficiencies of the prior art, the present invention proposes a storage data integrity verification and recovery method.
The technical solution of the invention is a storage data integrity verification and recovery method whose algorithm comprises the following steps:
Step 1: Partition the file using a regenerating code (RC)
Step 2: Encrypt the data and generate data fingerprint labels
Step 3: Store the fingerprint labels using two pseudo-random mappings
Step 4: Data verification
(1) Obtain the two fingerprint labels stored for each data block and generate the positions of the challenged blocks.
(2) Recalculate the fingerprint label values.
(3) Compare the fingerprint labels from the two calculations.
Step 5: Data recovery
The beneficial effects of the invention are:
1. Only a small amount of resources is needed to check whether data is complete and whether it has been attacked or tampered with.
2. The secondary pseudo-random placement of the data creates associations between data blocks, so all data can be exactly recovered from the undamaged data blocks.
3. Storage overhead and communication overhead are low, and resistance to attack is strong.
Detailed description of embodiments
To make the objects, technical solutions and advantages of the invention clearer, a detailed description is given below with reference to the algorithm flow chart.
I. Basic idea of the algorithm
The file data is first divided into data blocks according to the minimum-storage regenerating code scheme. Each data block is then hashed to generate data evidence (the hash is MD5; unless otherwise stated, every hash mentioned below uses MD5). The data evidence is mapped onto storage server nodes by a pseudo-random storage algorithm, and through secondary pseudo-random placement each piece of evidence is stored on two ciphertext data blocks. The user keeps the private key and the evidence key. When a verification request is received, the user's permission is verified first; the stored data block and the corresponding evidence label are then returned to the user, who uses the private key and the evidence key to compute and compare the information for consistency, which completes the verification. If the data has been attacked or tampered with, then when a node failure or error is detected in the system, the system returns the position of the faulty node. Based on the secondary pseudo-random placement and the regenerating code, linear processing is applied to data blocks smaller than the source file, and the original data is exactly recovered by iterative computation over the nodes that have not failed.
II. Specific implementation steps
Step 1: Partition the file using a regenerating code (RC)
Let the size of file F be B. Its partition parameters are written {[n, k, d], (α, β, B)}: a file of size B is encoded and stored on n nodes, and each node stores α units of data. d is the number of intact helper nodes contacted during a repair, k is the number of nodes needed to reconstruct the file, and β is the amount downloaded from each helper node. Let β = t and d = 2k - 2, where t is a constant; then B = k × (k - 1) × t and α = (k - 1) × t. File F is partitioned according to these parameters into the blocks b = {b_i} = [b_1, b_2, ..., b_m].
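The parameter relations in Step 1 can be checked with a few lines of Python. This is a minimal sketch, not the patent's implementation: the names `RCParams`, `rc_params` and `split_blocks` are hypothetical, the constraint n > d is the usual regenerating-code requirement rather than something the text states, and zero-padding of the last block is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RCParams:
    n: int      # number of storage nodes
    k: int      # code parameter k
    d: int      # helper nodes contacted during a repair
    alpha: int  # units of data stored on each node
    beta: int   # units downloaded from each helper node
    B: int      # file size in units


def rc_params(n: int, k: int, t: int = 1) -> RCParams:
    """Derive the parameters from Step 1: beta = t, d = 2k - 2."""
    if k < 2 or t < 1:
        raise ValueError("need k >= 2 and t >= 1")
    d = 2 * k - 2
    if n <= d:
        # Assumption: a repair needs d helpers besides the failed node.
        raise ValueError("need n > d")
    return RCParams(n=n, k=k, d=d, alpha=(k - 1) * t, beta=t,
                    B=k * (k - 1) * t)


def split_blocks(data: bytes, m: int) -> List[bytes]:
    """Split data into m equal-size blocks, zero-padding the tail (assumed)."""
    size = -(-len(data) // m)  # ceiling division
    padded = data.ljust(size * m, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(m)]
```

For example, `rc_params(6, 3)` gives d = 4, α = 2 and B = 6, which agrees with B = k × (k - 1) × t for t = 1.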
Step 2: Encrypt the data and generate data fingerprint labels
Each data block b_i is symmetrically encrypted, which produces the symmetric key k, and is then stored on a server node. At the same time, b_i is hashed to produce a hash value, and the hash value is encrypted to produce the fingerprint key g_m. The fingerprint label is then computed:
[label formula not present in the source text]
where M_i is the fingerprint label corresponding to data block b_i,
G(f) = G(fileName || N), N is the number of blocks of the file, and Sig(·) denotes a function {0,1}* → M. On completion, M = [M_1, M_2, ..., M_N] is output.
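Because the label formula itself is missing from the text, the following Python sketch only illustrates the shape of Step 2 under stated assumptions: each block is hashed with MD5 (as the description specifies), G(f) is taken to be MD5 over `fileName || N`, and Sig(·) is stood in for by HMAC-SHA256 keyed with the fingerprint key g_m. The HMAC construction and the function names `file_id` and `fingerprint_labels` are hypothetical, not the patent's definition.

```python
import hashlib
import hmac
from typing import List


def file_id(file_name: str, n_blocks: int) -> bytes:
    """G(f) = G(fileName || N), realised here as MD5 (an assumption)."""
    return hashlib.md5(f"{file_name}||{n_blocks}".encode()).digest()


def fingerprint_labels(file_name: str, blocks: List[bytes],
                       g_m: bytes) -> List[str]:
    """Return one label M_i per block b_i.

    Assumed stand-in for Sig(): HMAC-SHA256 over G(f) and the MD5 hash
    of the block, keyed with the fingerprint key g_m.
    """
    gf = file_id(file_name, len(blocks))
    labels = []
    for block in blocks:
        h_i = hashlib.md5(block).digest()  # hash of data block b_i
        labels.append(hmac.new(g_m, gf + h_i, hashlib.sha256).hexdigest())
    return labels
```

Changing a single byte of one block changes only that block's label, which is the property the verification in Step 4 relies on.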
Step 3: Store the fingerprint labels using two pseudo-random mappings
The invention uses a pseudo-random function to map the label data onto server nodes, where it is stored alongside the file's data blocks. After the label generation algorithm produces the labels M_i, each M_i is pseudo-randomly mapped to two nodes that store encrypted file blocks b_i. This protects the label data against destruction and also resists the risk of servers deceiving one another. Before a label is stored, it is processed as follows:
T(M_i) = T(Index || M_i)
Once the data fingerprints have been generated and stored, the storage metadata for all data fragments is sent to the metadata server. The user only needs to keep the encryption parameters, such as k and g_m, for later data verification.
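The double placement in Step 3 can be sketched as a keyed pseudo-random function that picks two distinct nodes for each label. The text does not specify the pseudo-random function, so HMAC-SHA256 with a secret `seed` is an assumption here, as are the name `place_label` and the string form of the record Index || M_i.

```python
import hashlib
import hmac
from typing import Tuple


def place_label(index: int, label: str, n_nodes: int,
                seed: bytes) -> Tuple[str, Tuple[int, int]]:
    """Build the record T(M_i) = T(Index || M_i) and choose two nodes.

    The nodes are derived from an HMAC-SHA256 PRF (an assumption); a
    counter is advanced until two distinct node numbers are found.
    """
    if n_nodes < 2:
        raise ValueError("two distinct nodes are required")
    record = f"{index}||{label}"
    nodes = []
    counter = 0
    while len(nodes) < 2:
        msg = f"{index}|{counter}".encode()
        digest = hmac.new(seed, msg, hashlib.sha256).digest()
        node = int.from_bytes(digest[:8], "big") % n_nodes
        if node not in nodes:
            nodes.append(node)
        counter += 1
    return record, (nodes[0], nodes[1])
```

Because the mapping depends only on the block index and the seed, a verifier holding the seed can recompute where both copies of a label were stored without any lookup table.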
Step 4: Data verification
When the user wants to verify data integrity, or to check whether the data has been accessed without authorization or tampered with, the user can verify the integrity of the data stored in the cloud using the encryption parameters the user has kept. The client sends the server a challenge s(f, Ref) identifying the file to be verified and the data block index. On receiving s, the cloud server first verifies the user's permission. Once the permission check passes, the server returns the stored fingerprint label value together with the fingerprint label value recalculated for the data block. The user uses the saved k and g_m to compute and compare the fingerprint label values for consistency, which verifies the file data.
(1) Obtain the two fingerprint labels stored for each data block and generate the positions of the challenged blocks
s(f) = s(fileName || ref)
(2) Recalculate the fingerprint label value
T(M_i) = T(Index || M_i)
(3) Compare the fingerprint labels from the two calculations
When the challenge block parameter ref entered by the user is 0, integrity verification is performed on all blocks. The data blocks are visited in a loop: the fingerprint label of each data block is computed, the other evidence labels stored in data blocks are fetched and written into the comparison array at the position given by the corresponding block index, and the values are compared. If they are consistent, the data has not been tampered with and is complete and secure. If they are inconsistent, the data has been accessed without authorization and tampered with, and the node position of the inconsistent data is returned.
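The comparison loop of Step 4 might look like the sketch below. It assumes 1-based block indexes, treats ref = 0 as "challenge every block" as the text describes, and recomputes labels with a simplified keyed hash (`label_of`, HMAC-SHA256 over the block's MD5 hash) that is only a stand-in for the patent's label function. All names are hypothetical.

```python
import hashlib
import hmac
from typing import List, Sequence, Tuple


def label_of(block: bytes, key: bytes) -> str:
    """Assumed label function: HMAC-SHA256 over the block's MD5 hash."""
    return hmac.new(key, hashlib.md5(block).digest(),
                    hashlib.sha256).hexdigest()


def verify(blocks: Sequence[bytes], stored: Sequence[Tuple[str, str]],
           key: bytes, ref: int = 0) -> List[int]:
    """Return the 1-based indexes of blocks whose labels do not match.

    stored[i - 1] holds the two label copies saved for block i.
    ref == 0 challenges all blocks; otherwise only block ref is checked.
    """
    indexes = range(1, len(blocks) + 1) if ref == 0 else [ref]
    inconsistent = []
    for i in indexes:
        fresh = label_of(blocks[i - 1], key)
        copies = stored[i - 1]
        if not all(hmac.compare_digest(fresh, c) for c in copies):
            inconsistent.append(i)
    return inconsistent
```

An empty result corresponds to the "consistent" branch of the text; a non-empty result gives the positions that would be handed to the recovery step.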
Step 5: Data recovery
When a user detects that data stored in the cloud has been corrupted, the main concerns are whether the data can be recovered, how efficient the recovery is, and whether the repair is exact. When a node failure or error is detected in the system, the system returns the position of the faulty node. Using the properties of the regenerating code, the damaged data block is exactly repaired while saving bandwidth. Repairing a failed data node requires connecting to d helper nodes, where d = 2α. The coding vector stored on each node is associated with a vector in the data matrix (the symbols are not present in the source text). When the failed node f is repaired, data block f is exactly repaired using the rank property rank(R) = 2α. The process is:
(1) Choose any d nodes {h_1, h_2, ..., h_d} from the remaining intact nodes; each node has its own coding vector.
(2) Each helper node combines its coding vector with the data matrix corresponding to node f, performing the computation locally, and uploads the result to the data node Q that needs repair.
(3) The new node f_new receives the repair information uploaded by the d nodes and assembles it into a (d × 1) matrix W.
Since D(b) = [X Y]^T, W can be written in terms of R and D(b) (the intermediate equations are not present in the source text).
Because rank(R) = 2α, any 2α rows of the matrix R are linearly independent, that is, the submatrix R_{2α×2α} is invertible, so the system can be solved.
X and Y are symmetric matrices, so the content of node f follows from the solution.
At this point the new node f_new has completed the repair of the damaged node f.
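The equations of this derivation did not survive in the text. The conditions that remain (d = 2k - 2 = 2α, a data matrix D(b) built from two symmetric blocks X and Y, and an encoding matrix R in which any 2α rows are linearly independent) match the well-known product-matrix minimum-storage regenerating construction. The sketch below therefore restates that standard repair argument for β = t = 1; it is an assumption that the patent follows it, and the symbols φ_i, λ_i, ψ_i and R_rep are introduced here for illustration only.

```latex
\begin{align*}
D(b) &= \begin{bmatrix} X \\ Y \end{bmatrix},
  \qquad X = X^{T},\; Y = Y^{T} \in \mathbb{F}_q^{\alpha \times \alpha}, \\
\psi_i^{T} &= \begin{bmatrix} \varphi_i^{T} & \lambda_i \varphi_i^{T} \end{bmatrix},
  \qquad \text{node } i \text{ stores } \psi_i^{T} D(b)
  = \varphi_i^{T} X + \lambda_i \varphi_i^{T} Y, \\
W &= \begin{bmatrix} \psi_{h_1}^{T} \\ \vdots \\ \psi_{h_d}^{T} \end{bmatrix}
  D(b)\, \varphi_f \;=\; R_{\mathrm{rep}}\, D(b)\, \varphi_f
  \qquad (d \times 1), \\
D(b)\, \varphi_f &= R_{\mathrm{rep}}^{-1} W
  = \begin{bmatrix} X \varphi_f \\ Y \varphi_f \end{bmatrix}
  \qquad (R_{\mathrm{rep}} \text{ is } 2\alpha \times 2\alpha \text{ and invertible}), \\
\psi_f^{T} D(b) &= (X \varphi_f)^{T} + \lambda_f\, (Y \varphi_f)^{T}
  \qquad (\text{using } X = X^{T},\; Y = Y^{T}).
\end{align*}
```

Each helper node h_j uploads the single symbol ψ_{h_j}^T D(b) φ_f, so the new node downloads d symbols in total instead of the whole file, which is where the bandwidth saving described above comes from.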

Claims (5)

1. A storage data integrity verification and recovery method, characterized in that it comprises the following steps:
Step 1: Partition the file using a regenerating code (RC)
Step 2: Encrypt the data and generate data fingerprint labels
Step 3: Store the fingerprint labels using two pseudo-random mappings
Step 4: Data verification
(1) Obtain the two fingerprint labels stored for each data block and generate the positions of the challenged blocks
(2) Recalculate the fingerprint label values
(3) Compare the fingerprint labels from the two calculations
Step 5: Data recovery.
2. The storage data integrity verification and recovery method according to claim 1, characterized in that step 2 is specifically as follows:
Step 2: Encrypt the data and generate data fingerprint labels
Each data block b_i is symmetrically encrypted, producing the symmetric key k, and is then stored on a server node; at the same time, data block b_i is hashed to produce a hash value, the hash value is encrypted to produce the fingerprint key g_m, and the fingerprint label M_i is then computed (the label formula is not present in the source text).
The output is M = [M_1, M_2, ..., M_N].
3. The storage data integrity verification and recovery method according to claim 1, characterized in that step 3 is specifically as follows:
Step 3: Store the fingerprint labels using two pseudo-random mappings
The label data is mapped onto server nodes with a pseudo-random function and stored alongside the file's data blocks. After the label generation algorithm produces the labels M_i, each M_i is pseudo-randomly mapped to two nodes that store encrypted file blocks b_i. This protects the label data against destruction and also resists the risk of servers deceiving one another. Before a label is stored, it is processed as follows:
T(M_i) = T(Index || M_i)
Once the data fingerprints have been generated and stored, the storage metadata for all data fragments is sent to the metadata server, and the user only needs to keep the encryption parameters, such as k and g_m, for later data verification.
4. The storage data integrity verification and recovery method according to claim 1, characterized in that the calculation in step 4 is specifically as follows:
Step 4: Data verification
The client sends the server a challenge s(f, Ref) identifying the file to be verified and the data block index. On receiving s, the cloud server first verifies the user's permission; once the permission check passes:
(1) Obtain the two fingerprint labels stored for each data block and generate the positions of the challenged blocks
(2) Recalculate the fingerprint label values
(3) Compare the fingerprint labels from the two calculations.
5. The storage data integrity verification and recovery method according to claim 1, characterized in that the calculation in step 5 is specifically as follows:
Step 5: Data recovery
When a node failure or error is detected in the system, the system returns the position of the faulty node, and d helper nodes must be connected. The coding vector stored on each node is associated with a vector in the data matrix R. When the failed node f is repaired, data block f is exactly repaired using the rank property of the data matrix. The process is:
(1) Choose any d nodes from the remaining intact nodes; each node has its own coding vector.
(2) Each helper node combines its coding vector with the data matrix corresponding to node f, performing the computation locally, and uploads the result to the data node Q that needs repair, where Q is a defined data node.
(3) The new node receives the repair information uploaded by the d nodes and assembles it into the matrix W (the intermediate equations are not present in the source text).
Any 2α rows of the matrix R are linearly independent, that is, the corresponding submatrix is invertible, so the system can be solved.
X and Y are symmetric matrices, so the content of node f follows from the solution.
At this point the new node has completed the repair of the damaged node f.
CN201610453804.6A 2016-06-21 2016-06-21 Storage data integrity verification and recovery method Pending CN106611135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610453804.6A CN106611135A (en) 2016-06-21 2016-06-21 Storage data integrity verification and recovery method

Publications (1)

Publication Number Publication Date
CN106611135A true CN106611135A (en) 2017-05-03

Family

ID=58614767

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610453804.6A Pending CN106611135A (en) 2016-06-21 2016-06-21 Storage data integrity verification and recovery method

Country Status (1)

Country Link
CN (1) CN106611135A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102647433A (en) * 2012-05-21 2012-08-22 北京航空航天大学 Efficient cloud storage data possession verification method
CN103605784A (en) * 2013-11-29 2014-02-26 北京航空航天大学 Data integrity verifying method under multi-cloud environment
CN104811450A (en) * 2015-04-22 2015-07-29 电子科技大学 Data storage method based on identity in cloud computing and integrity verification method based on identity in cloud computing
CN104993937A (en) * 2015-07-07 2015-10-21 电子科技大学 Method for testing integrity of cloud storage data

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107026872A (en) * 2017-05-17 2017-08-08 成都麟成科技有限公司 A kind of method for preventing userspersonal information from decoding
CN107026872B (en) * 2017-05-17 2021-02-12 宁波潮涌道投资合伙企业(有限合伙) Method for preventing user personal information from being decoded
US11177940B2 (en) 2017-06-20 2021-11-16 707 Limited Method of evidencing existence of digital documents and a system therefor
CN107395652A (en) * 2017-09-08 2017-11-24 郑州云海信息技术有限公司 A kind of integrity of data stored inspection method, apparatus and system
CN108174136A (en) * 2018-03-14 2018-06-15 成都创信特电子技术有限公司 Cloud disk video coding and storage method
CN108491212A (en) * 2018-03-19 2018-09-04 广东美的暖通设备有限公司 Burning file method, equipment and computer readable storage medium
CN112580062A (en) * 2019-09-27 2021-03-30 厦门网宿有限公司 Data consistency checking method and data uploading and downloading device
WO2021056865A1 (en) * 2019-09-27 2021-04-01 厦门网宿有限公司 Data consistency checking method and data uploading/downloading apparatus
CN111291046A (en) * 2020-01-16 2020-06-16 湖南城市学院 Computer big data storage control system and method
CN111291046B (en) * 2020-01-16 2023-07-14 湖南城市学院 Computer big data storage control system and method
CN111897845A (en) * 2020-07-29 2020-11-06 徐州金蝶软件有限公司 Method and system for processing mass credit information based on process
CN111897845B (en) * 2020-07-29 2023-10-31 江苏新蝶数字科技有限公司 Method and system for processing massive credit information based on flow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170503