WO2010011342A1 - Apparatus, methods, and computer program products providing dynamic provable data possession - Google Patents

Apparatus, methods, and computer program products providing dynamic provable data possession

Info

Publication number
WO2010011342A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
skip list
block
proof
file
Prior art date
Application number
PCT/US2009/004322
Other languages
English (en)
French (fr)
Inventor
Roberto Tamassia
Charalampos Papamanthou
Charles Christopher Erway
Alptekin Kupcu
Original Assignee
Brown University
Application filed by Brown University filed Critical Brown University
Priority to US12/737,583 priority Critical patent/US8978155B2/en
Priority to CA2731954A priority patent/CA2731954C/en
Publication of WO2010011342A1 publication Critical patent/WO2010011342A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3218Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using proof of knowledge, e.g. Fiat-Shamir, GQ, Schnorr, or non-interactive zero-knowledge proofs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/30Compression, e.g. Merkle-Damgard construction

Definitions

  • the exemplary embodiments of this invention relate generally to data storage and access and, more specifically, relate to access, security and updates for data stored by an untrusted agent (e.g., an untrusted remote server).
  • users are provided with the opportunity to store data at untrusted servers (e.g., third party, untrusted remote servers).
  • users may be able to access the remote storage via the internet in order to upload files for subsequent access or downloading (e.g., at a different location).
  • P2P networks provide third party storage where the data is stored by a different agent or an entity other than the user (e.g., the user who uploaded or provided the data).
  • P2P: peer-to-peer
  • such an arrangement may be beneficial in order to provide other users with access to the data (e.g., based on considerations such as bandwidth usage and hosting capabilities).
  • users may desire to check if their data has been tampered with or deleted by the storage server.
  • the user may be required to download the data. If the outsourced data includes very large files or entire file systems, requiring the user to download the data will likely hinder validation and increase the expense (e.g., in terms of bandwidth and time), particularly if the client wishes to check the data frequently.
  • Examples of such settings include online storage-outsourcing services (e.g., Amazon S3);
  • outsourced database services [16]
  • peer-to-peer storage [13, 19]
  • network file systems [12, 15].
  • the common concern in all these systems is the fact that the server (or peer) who stores the client's data is not necessarily trusted. Therefore, users would like to check if their data has been tampered with or deleted.
  • PDP: provable data possession
  • data (often represented as a file F ) is preprocessed by the client, producing metadata that is used for verification purposes.
  • the file is then sent to an untrusted server for storage, and the client may delete the local copy of the file.
  • the client keeps some (possibly secret) information to check the server's responses later.
  • the server proves the data has not been tampered with by responding to challenges sent by the client.
  • the authors present several variations of their scheme under different cryptographic assumptions. These schemes provide probabilistic guarantees of possession, where the client checks a random subset of stored blocks with each challenge.
  • the client preprocesses the data and then sends it to an untrusted server for storage, while keeping a small amount of meta-data.
  • the client later asks the server to prove that the stored data has not been tampered with or deleted (without downloading the actual data).
  • the original PDP scheme applies only to static (or append-only) files.
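  • As a concrete illustration of the workflow just described, the following is a minimal Python sketch of spot-checking outsourced storage with client-kept verification metadata. It is a toy static scheme in the spirit of PDP, not the construction of [2]; all function names and parameters are illustrative assumptions.
```python
# Toy static spot-checking sketch (NOT the PDP construction of [2]).
# The client keeps one secret key; the server keeps blocks plus per-block MACs;
# the client audits a random subset of blocks with each challenge.
import hmac, hashlib, os, random

def preprocess(key: bytes, blocks: list[bytes]):
    """Client: compute a MAC over (index, block) for every block."""
    return [hmac.new(key, i.to_bytes(8, "big") + b, hashlib.sha256).digest()
            for i, b in enumerate(blocks)]      # stored at the server with the blocks

def challenge(num_blocks: int, c: int):
    """Client: pick C random block indices to audit."""
    return random.sample(range(num_blocks), c)

def prove(blocks, tags, challenged):
    """Server: return the challenged blocks and their tags."""
    return [(i, blocks[i], tags[i]) for i in challenged]

def verify(key: bytes, proof) -> bool:
    """Client: recompute each MAC; any mismatch indicates tampering."""
    return all(hmac.compare_digest(
        hmac.new(key, i.to_bytes(8, "big") + b, hashlib.sha256).digest(), t)
        for i, b, t in proof)

key = os.urandom(32)
blocks = [os.urandom(1024) for _ in range(100)]
tags = preprocess(key, blocks)
print(verify(key, prove(blocks, tags, challenge(len(blocks), 5))))  # True
```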
  • an apparatus comprising: at least one memory configured to store data; and at least one processor configured to perform operations on the stored data, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to maintain a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from a level of the node within the skip list, the rank value of the node within the skip list
  • a program storage device readable by a processor of an apparatus, tangibly embodying a program of instructions executable by the processor for performing operations, the operations comprising: storing data, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file; and maintaining a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from a level of the node within the skip list, the rank value of the no
  • a method comprising: storing data on at least one memory of an apparatus, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file; and maintaining, by the apparatus, a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from a level of the node within the skip list, the rank value of the node within the skip list and an interval between the node and another linked node that
  • FIG. 1 shows a table illustrating a comparison of PDP schemes
  • FIG. 2 shows an exemplary skip list used to store a file of 12 blocks using ranks in accordance with the exemplary embodiments of the invention
  • FIG. 3 shows the proof for the 5-th block of the file F stored in the skip list of FIG. 2;
  • FIG. 4 depicts the proof Π(5) as produced by Algorithm 3.4 for the update "insert a new block with data T after block 5 at level 1";
  • FIG. 5 illustrates authenticated CVS server characteristics
  • FIG. 6 shows expected size of proofs of possession under the instant scheme on a 1GB file, for 99% probability of detecting misbehavior
  • FIG. 7 depicts computation time required by the server in response to a challenge for a 1 GB file, with 99% probability of detecting misbehavior
  • FIG. 8 shows an exemplary skip list used to store an ordered set
  • FIG. 9 shows an exemplary file system skip list with blocks as leaves, directories and files as roots of nested skip lists
  • FIG. 10 illustrates an exemplary version control file system
  • FIG. 11 illustrates a simplified block diagram of various electronic devices that are suitable for use in practicing the exemplary embodiments of this invention
  • FIG. 12 depicts a flowchart illustrating one non-limiting example of a method for practicing the exemplary embodiments of this invention.
  • FIG. 13 depicts a flowchart illustrating another non-limiting example of a method for practicing the exemplary embodiments of this invention.
  • DPDP: dynamic provable data possession
  • the DPDP solution is based on a new variant of authenticated dictionaries where rank information is used to organize dictionary entries, rather than search keys.
  • the solution is able to support efficient authenticated operations on files at the block level, enabling file operations such as authenticated insert and delete .
  • the security of these DPDP constructions is proven using collision-resistant hash functions, the factoring assumption and the strong RSA assumption.
  • FIG. 1 shows a table illustrating a comparison of PDP schemes: original PDP scheme [2] ; Scalable PDP [3]; a DPDP scheme built on rank-based authenticated skip lists (described in further detail below); and a DPDP scheme built on rank-based RSA trees (also further described below).
  • a star (*) indicates that in Scalable PDP a certain operation can be performed only a limited (pre-determined) number of times.
  • the DPDP schemes are fully-dynamic, n denotes the number of blocks of the file, f is the fraction of the corrupted blocks, and C is the number of challenges used in [2, 3] and DPDP I.
  • the storage space is O(1) at the client and O(n) at the server.
  • the efficiency of the DPDP schemes is summarized as follows, where n denotes the number of the blocks.
  • the server computation, i.e., the time taken by the server to process an update or to compute a proof for a block, is O(log n) for DPDP I and O(n^ε) or O(1), respectively, for DPDP II.
  • the client computation, i.e., the time taken by the client to verify a proof returned by the server, is O(log n).
  • the communication complexity i.e., the size of the proof returned by the untrusted server to the client
  • the client storage i.e., the size of the meta-data stored locally by the client
  • the probability of detection i.e., the probability of detecting a server misbehavior without downloading all the data
  • 1 - (1 - f)^C for DPDP I;
  • 1 - (1 - f)^(C log n) for DPDP II, for fixed logarithmic communication complexity, where f is the ratio of corrupted blocks.
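  • As a numeric illustration of the detection probabilities above, the following sketch computes 1 - (1 - f)^C and the number of challenges needed to reach a target confidence; the base-2 logarithm used for the DPDP II exponent is an assumption, since the text does not fix the base.
```python
# Detection probability helpers for the formulas quoted above:
# P = 1 - (1 - f)^C for DPDP I, P = 1 - (1 - f)^(C log n) for DPDP II,
# where f is the fraction of corrupted blocks and C the number of challenges.
import math

def detect_prob_dpdp1(f: float, c: int) -> float:
    return 1.0 - (1.0 - f) ** c

def detect_prob_dpdp2(f: float, c: int, n: int) -> float:
    return 1.0 - (1.0 - f) ** (c * math.log2(n))   # log base 2 assumed

def challenges_needed(f: float, target: float) -> int:
    """Smallest C with 1 - (1 - f)^C >= target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - f))

# Detecting 1% corruption with 99% confidence needs ~459 challenges,
# matching the constant of roughly 460 blocks used in the evaluation below.
print(challenges_needed(0.01, 0.99))           # 459
print(round(detect_prob_dpdp1(0.01, 460), 4))  # ~0.9902
```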
  • Juels and Kaliski [11] present proofs of retrievability (PORs) and, like the PDP model, focus on static archival storage of large files. Their scheme's effectiveness rests largely on preprocessing steps the client conducts before sending a file F to the server: "sentinel" blocks are randomly inserted to detect corruption, F is encrypted to hide these sentinels, and error-correcting codes are used to recover from corruption. As expected, the error-correcting codes improve the error-resiliency of their system. Unfortunately, these operations prevent any efficient extension to support updates, beyond simply replacing F with a new file F'. Furthermore, the number of queries a client can perform is limited, and fixed a priori. Shacham and Waters have an improved version of this protocol called Compact POR [30], but their solution is also static (see [6] for a summary of POR schemes and related trade-offs).
  • error-correcting codes or encryption are regarded as external to the system.
  • If the user wants to have more error-resiliency, she can provide a file that has error-correcting codes integrated (or an encrypted file if secrecy is desired).
  • Such modifications to the file are regarded as external to the system. Since the construction does not modify the file and assumes no property on it, the system will work in perfect compliance.
  • Scalable PDP [3]
  • Their idea is to come up with all future challenges during setup and store pre-computed answers as metadata (at the client, or at the server in an authenticated and encrypted manner). Because of this, the number of updates and challenges a client can perform is limited and fixed a priori. In addition, their scheme is not fully dynamic: the number of updates is limited (otherwise the lower bound of [7] would be violated) and then the setup phase has to be executed again. Also, one cannot perform block insertions anywhere (only append-type insertions are possible). Specifically, each update requires re-creating all the remaining challenges. This can be problematic when a large file system is outsourced.
  • This model builds on the PDP definitions from [2] . It starts by defining a general DPDP scheme, and then shows how the original PDP model is consistent with this definition.
  • DPDP Scheme In a DPDP scheme, there are two parties. The client wants to off-load her files to the untrusted server. A complete definition of a DPDP scheme should describe the following (possibly randomized) efficient procedures:
  • KeyGen(1^k) → {sk, pk} is a probabilistic algorithm run by the client. It takes as input a security parameter, and outputs a secret key sk and a public key pk. The client stores the secret and public keys, and sends the public key to the server.
  • the client sends e(F),e(info),e(M) to the server.
  • the server sends M_c' and P_{M_c'} to the client.
  • VerifyUpdate(sk, pk, F, info, M_c, M_c', P_{M_c'}) → {ACCEPT, REJECT} is run by the client to verify the server's behavior during the update. It takes all the inputs the PrepareUpdate algorithm did, plus the M_c' and P_{M_c'} sent by the server. It outputs acceptance (F can be deleted in that case) or rejection signals.
  • Challenge(sk, pk, M_c) → {c} is a probabilistic procedure run by the client to create a challenge for the server. It takes the secret and public keys, along with the latest client metadata M_c as input, and outputs a challenge c that is then sent to the server.
  • Prove(pk, F_i, M_i, c) → {P} is the procedure run by the server upon receipt of a challenge from the client. It takes as input the public key, the latest version of the file and the metadata, and the challenge c. It outputs a proof P that is sent to the client.
  • Verify(sk, pk, M_c, c, P) → {ACCEPT, REJECT} is the procedure run by the client upon receipt of the proof P from the server. It takes as input the secret and public keys, the client metadata M_c, the challenge c, and the proof P sent by the server. An output of ACCEPT ideally means that the server still has the file intact.
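  • The seven procedures above can be organized as an abstract interface. The following Python skeleton is one possible arrangement; the method names and signatures paraphrase the definitions above and are not a concrete scheme.
```python
# Skeleton of the DPDP procedures defined above, as an abstract interface.
# Bodies are left to a concrete scheme (e.g., the rank-based skip list
# construction described later in this document).
from abc import ABC, abstractmethod
from typing import Any, Tuple

ACCEPT, REJECT = "ACCEPT", "REJECT"

class DPDPScheme(ABC):
    @abstractmethod
    def key_gen(self, security_parameter: int) -> Tuple[Any, Any]:
        """KeyGen(1^k) -> (sk, pk), run by the client."""

    @abstractmethod
    def prepare_update(self, sk, pk, F, info, Mc) -> Tuple[Any, Any, Any]:
        """Client: produce e(F), e(info), e(M) to send to the server."""

    @abstractmethod
    def perform_update(self, pk, F_prev, M_prev, eF, einfo, eM):
        """Server: apply the update; return (F_new, M_new, Mc_new, P_Mc)."""

    @abstractmethod
    def verify_update(self, sk, pk, F, info, Mc, Mc_new, P_Mc) -> str:
        """Client: check the server's update proof; ACCEPT or REJECT."""

    @abstractmethod
    def challenge(self, sk, pk, Mc):
        """Client: create a challenge c for the server."""

    @abstractmethod
    def prove(self, pk, F_latest, M_latest, c):
        """Server: produce a proof P for challenge c."""

    @abstractmethod
    def verify(self, sk, pk, Mc, c, P) -> str:
        """Client: check the proof; ACCEPT ideally means the file is intact."""
```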
  • the security requirement of a DPDP scheme is defined below.
  • PDP Scheme A PDP scheme is consistent with the DPDP scheme definition, with algorithms PrepareUpdate , PerformUpdate and VerifyUpdate specifying an update that is a full re-write (or append).
  • PDP is a restricted case of DPDP. It will now be shown how the DPDP definition (when restricted in this way) fits some previous schemes.
  • the PDP scheme of [2] has the same algorithm definition for key generation, defines a restricted version of PrepareUpdate that can create the metadata for only one block at a time, and defines Prove and Verify algorithms similar to this model's definitions. It lacks an explicit definition of Challenge (but it is very easy to figure out).
  • PerformUpdate is simply performing a full re-write or an append (so that replay attacks can be avoided), and VerifyUpdate is used accordingly, i.e., it always accepts in case of a full re-write or it is run as in DPDP in case of an append. It is clear that this model's definitions allow a broad range of DPDP (and PDP) schemes.
  • Data Possession Game Played between the challenger who plays the role of the client and the adversary who acts as a server.
  • ACF Queries The adversary is very powerful.
  • the adversary can mount adaptive chosen file (ACF) queries as follows.
  • the adversary specifies a message F and the related information info specifying what kind of update to perform (see Definition 1) and sends these to the challenger.
  • the challenger runs PrepareUpdate on these inputs and sends the resulting e(F), e(info), e(M) to the adversary.
  • the adversary replies with M_c' and P_{M_c'}, which are verified by the challenger using the algorithm VerifyUpdate.
  • the result of the verification is told to the adversary.
  • the adversary can further request challenges, return proofs, and be told about the verification results.
  • the adversary can repeat the interaction defined above polynomially many times.
  • Challenge: Call F the final version of the file, which is created according to the verifying updates the adversary requested in the previous step.
  • the challenger holds the latest metadata M c sent by the adversary and verified as accepting.
  • the challenger creates a challenge using the algorithm Challenge(sk, pk, M_c) → {c} and sends it to the adversary.
  • the adversary returns a proof P. If Verify(sk, pk, M_c, c, P) accepts, then the adversary wins.
  • the challenger has the ability to reset the adversary to the beginning of the challenge phase and repeat this step polynomially-many times for the purpose of extraction. Overall, the goal is to extract (at least) the challenged parts of F from the adversary's responses which are accepting.
  • This new data structure, which is called a rank-based authenticated skip list, is based on authenticated skip lists but indexes data in a different way. Note that one could have based the construction on any authenticated search data structure (e.g., a Merkle tree [17]) instead. This would work perfectly for the static case, but in the dynamic case one would need an authenticated red-black tree, and unfortunately no algorithms have been previously presented for rebalancing a Merkle tree while efficiently maintaining and updating authentication information (except for the three-party model, e.g., [14]). Yet, such algorithms have been extensively studied for the case of the authenticated skip list data structure [24]. Before presenting the new data structure, authenticated skip lists are briefly introduced.
  • FIG. 2 shows an exemplary skip list used to store a file of 12 blocks using ranks in accordance with the exemplary embodiments of the invention.
  • the authenticated skip list is a skip list [26] (see FIG. 2) with the difference that every internal node v of the skip list (which has two pointers, namely rgt(v) and dwn(v)) also stores a label f(v) that is a cryptographic hash, computed using some collision-resistant hash function h (e.g., SHA-1 in practice) as a function of f(rgt(v)) and f(dwn(v)).
  • Rank-based queries: As noted before, one uses the authenticated skip list data structure [10] to check the integrity of the file blocks. However, the updates to be supported in the DPDP scenario are insertions of a new block after the i-th block and deletion or modification of the i-th block (there is no search key in this case, in contrast to [10], which basically implements an authenticated dictionary). If one were to use indices of blocks as search keys in an authenticated dictionary, the following problem arises. Suppose there is a file consisting of 100 blocks m_1, m_2, ..., m_100 and one wants to insert a block after the 40-th block. The indices used as search keys for all subsequent blocks m_41, ..., m_100 would then have to be updated, which is inefficient.
  • Let F be a file consisting of n blocks m_1, m_2, ..., m_n.
  • the leaves of the skip list will contain some representation of the blocks, namely leaf i will store T(m_i).
  • T(m_i) can be thought of as m_i for now (T(m_i) will be defined below).
  • the actual block m_i will be stored somewhere on the hard drive of the untrusted server.
  • Every internal node v of the skip list stores the size of the subtree rooted at this node, namely how many leaves of the skip list can be reached from this node, instead of storing a search key. Call this number a rank or rank value of an internal node v and denote it by r(v).
  • The n blocks m_1, m_2, ..., m_n are stored in the rank-based skip list.
  • the rank of the top leftmost node v of the skip list is n (all blocks can be reached from that node).
  • The labels are computed with a hashing scheme of the form f(v) = h(A(v) || f(dwn(v)) || f(rgt(v))) for nodes that have a down pointer, and f(v) = h(A(v) || T(data(v)) || f(rgt(v))) for nodes at the bottom level, where A(v) = l(v) || r(v) denotes the concatenation of the level and the rank of the node v.
  • the basis, i.e., the label f(v) of the top leftmost node v of the skip list
  • the file consists of only two "fictitious" blocks - block 0 and block +∞.
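  • To illustrate the rank idea, the following simplified sketch uses a plain binary tree with ranks and hashed labels rather than the actual rank-based authenticated skip list; the way labels combine the rank and the children's labels is an assumption made for illustration. It shows how ranks alone locate the i-th block without any search keys.
```python
# Minimal rank-based lookup sketch (assumed structure, not the patent's
# skip list): every leaf has rank 1, every internal node's rank is the
# number of leaves reachable below it, and block i is found by comparing
# i against subtree ranks while descending.
import hashlib

class Node:
    def __init__(self, left=None, right=None, block=None):
        self.left, self.right, self.block = left, right, block
        self.rank = 1 if block is not None else left.rank + right.rank
        self.label = self._hash()

    def _hash(self):
        h = hashlib.sha1()  # SHA-1 is the example hash named in the text
        if self.block is not None:
            h.update(b"leaf" + str(self.rank).encode() + self.block)
        else:
            h.update(b"node" + str(self.rank).encode()
                     + self.left.label + self.right.label)
        return h.digest()

def find(node: Node, i: int) -> bytes:
    """Return the i-th block (1-indexed) by descending on ranks."""
    if node.block is not None:
        return node.block
    if i <= node.left.rank:
        return find(node.left, i)
    return find(node.right, i - node.left.rank)

def build(blocks):
    nodes = [Node(block=b) for b in blocks]
    while len(nodes) > 1:                       # pair nodes up, level by level
        nodes = [Node(nodes[k], nodes[k + 1]) if k + 1 < len(nodes)
                 else nodes[k] for k in range(0, len(nodes), 2)]
    return nodes[0]

root = build([b"block%d" % i for i in range(1, 13)])   # 12 blocks, as in FIG. 2
print(root.rank, find(root, 5))                        # 12 b'block5'
```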
  • Queries: Suppose now that the file F and a skip list on the file have been stored at the untrusted server.
  • the client wants to verify the integrity of block i, and therefore queries for block i (we call this query rankAt(i)).
  • the server constructs the proof Π(i) for block i.
  • Let v_1, v_2, ..., v_m be the search path in the skip list for block i (note that node v_1 corresponds to block i+1 and node v_2 corresponds to block i, and therefore this is concerned with the reverse path).
  • For every node v_j, 1 ≤ j ≤ m, a 4-tuple A(v_j) is added to the proof.
  • the 4-tuple A(v_j) contains the level l(v_j), the rank r(v_j), an interval I(v_j) and a hash value (label) f(v_j).
  • For a node at the bottom level, the interval value is I(v_j) and the hash value is T(data(v_j)).
  • Otherwise, I(v_j) equals I(v') and f(v_j) equals f(v'), where v' is either rgt(v_j) or dwn(v_j), according to where v_j gets its hash value from.
  • the proof for the 5-th block of the skip list of FIG. 2 is depicted in FIG. 3.
  • Since any search path in the skip list is expected to be of logarithmic length (in the number of blocks) with high probability, the expected size of the proof is logarithmic with high probability too.
  • 1: Let v_1, v_2, ..., v_m be the search path for block i;
  • 2: return Π(i) = {A(v_1), A(v_2), ..., A(v_m)};
  • Verification: After receiving the 4-tuples A(v_j) that form the proof for a block m_i, the client has to process them and compute a value f'. If f' is equal to the locally stored metadata M_c, then the verification algorithm outputs ACCEPT, else it outputs REJECT (see Algorithm 4). If it outputs ACCEPT, then with high probability the server indeed stores T(m_i) intact [24] (recall that T(m_i) is a representation of the data of the actual block m_i, which can be viewed as m_i itself for the sake of presentation, and this is what is stored at the leaves of the skip list).
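  • A simplified, Merkle-style analogue of this verification (an assumed structure, not Algorithm 4 itself): the client folds the proof entries bottom-up into a candidate basis f' and compares it with its locally stored metadata M_c. The labeling convention matches the rank-based sketch above and is an illustrative assumption.
```python
# Simplified path verification: recompute the basis f' from the proof
# entries and accept only if it equals the client metadata M_c.
import hashlib
from typing import List, Tuple

def leaf_label(rank: int, block: bytes) -> bytes:
    return hashlib.sha1(b"leaf" + str(rank).encode() + block).digest()

def node_label(rank: int, left: bytes, right: bytes) -> bytes:
    return hashlib.sha1(b"node" + str(rank).encode() + left + right).digest()

# Each proof entry: (sibling_is_left, sibling_label, sibling_rank)
Proof = List[Tuple[bool, bytes, int]]

def verify_block(block: bytes, proof: Proof, metadata: bytes) -> str:
    label, rank = leaf_label(1, block), 1
    for sibling_is_left, sib_label, sib_rank in proof:
        rank += sib_rank
        label = (node_label(rank, sib_label, label) if sibling_is_left
                 else node_label(rank, label, sib_label))
    return "ACCEPT" if label == metadata else "REJECT"

# Two-block example: the metadata is the root label kept by the client.
b1, b2 = b"data-1", b"data-2"
root = node_label(2, leaf_label(1, b1), leaf_label(1, b2))
print(verify_block(b1, [(False, leaf_label(1, b2), 1)], root))         # ACCEPT
print(verify_block(b"tampered", [(False, leaf_label(1, b2), 1)], root))  # REJECT
```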
  • The operation S is associative: for every A(v_i), A(v_j), A(v_k) such that v_i, v_j and v_k form an upward path in the skip list, S(S(A(v_i), A(v_j)), A(v_k)) = S(A(v_i), S(A(v_j), A(v_k))). This yields the following result:
  • Let v_1, v_2, ..., v_m be a reverse search path for a leaf node x in a skip list where the hashing scheme with ranks is used.
  • Let L be the maximum level.
  • Let n be the number of stored items.
  • The possible update types in this DPDP scheme are insertions of a new block after the i-th block, deletions of the i-th block, and modifications of the i-th block, for 1 ≤ i ≤ n.
  • Suppose the client wants to insert a new block after the i-th block. He sends an "insert" query to the server. While performing the insertion (see Algorithm 3.3), the server also computes the proof Π(i) (using Algorithm 3.1). The server then sends the proof Π(i) along with the new metadata M_c' to the client (M_c' is the new basis).
  • When the server performs an insertion or deletion, it must also update (and include in the hashing scheme) the ranks and the intervals (see line 5 of Algorithm 3.3). This can easily be done in O(log n) time: it is easy to see that only ranks and intervals along the search path of the skip list are affected.
  • F_{i-1}, M_{i-1} are the previously stored file and metadata on the server (empty if this is the first run).
  • e(F), e(info), e(M), which are output by PrepareUpdate, are sent by the client (e(M) being empty).
  • the file is stored as is, and the metadata stored at the server is a skip list (where for block b , T(b) is the block itself).
  • the procedure updates the file according to e(info), outputting F_i, runs the skip list update procedure on the previous skip list M_{i-1} (or builds the skip list from scratch if this is the first run), outputs the resulting skip list as M_i, the new skip list root as M_c', and the proof returned by the skip list update as P_{M_c'}.
  • This corresponds to calling Algorithm 3.3 on inputs the new data T
  • M_c is the previous skip list root the client has (empty the first time), whereas M_c' is the new root sent by the server.
  • Blockless verification using tags In the construction above, the skip list leaves were used as the data blocks themselves. This requires the client to download all the challenged blocks for verification purposes, since the skip list proof includes leaves. For more efficiency (i.e., blockless verification), one can employ homomorphic tags as in [2]. However, the tags described herein are simpler and more efficient to compute. It is briefly noted that homomorphic tags are tags that can be combined and verified at once.
  • each leaf T(m_i) of the skip list is the tag of block m_i.
  • the Prove procedure now sends the skip list proof for the challenged blocks m_{i_1}, ..., m_{i_C}, together with a combined block derived from the challenged blocks.
  • the size of this combined block is roughly the size of a single block, and thus imposes much smaller overhead than sending C blocks. This achieves blockless verification.
  • the Challenge procedure can also be made more efficient by using the ideas in [2].
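  • The following toy sketch illustrates blockless verification with multiplicatively homomorphic tags of the form g^m mod N; the modulus N and base g mirror the security proofs below, but the parameters, coefficients and block encoding are illustrative assumptions rather than the scheme's exact tag construction.
```python
# Blockless verification sketch with homomorphic tags: tag(m) = g^m mod N.
# The server combines C challenged blocks into one block-sized value M and
# one combined tag T, so the client never downloads the challenged blocks.
import random

def make_tag(g: int, N: int, block: bytes) -> int:
    m = int.from_bytes(block, "big")      # interpret the block as an integer
    return pow(g, m, N)

def prove(g, N, blocks, tags, challenge):
    """Server: M = sum_j a_j * m_{i_j} and T = prod_j tag_{i_j}^{a_j} mod N."""
    M = sum(a * int.from_bytes(blocks[i], "big") for i, a in challenge)
    T = 1
    for i, a in challenge:
        T = (T * pow(tags[i], a, N)) % N
    return M, T

def verify(g, N, M, T) -> bool:
    """Client: one exponentiation checks all challenged blocks at once,
    because prod_j tag_{i_j}^{a_j} = g^(sum_j a_j*m_{i_j}) = g^M (mod N)."""
    return pow(g, M, N) == T

# Toy parameters for illustration only; a real deployment would use a large
# RSA modulus and sample a high-order base g.
N, g = 1009 * 1013, 2
blocks = [bytes([b]) * 4 for b in range(1, 6)]
tags = [make_tag(g, N, b) for b in blocks]
challenge = [(i, random.randint(1, 97)) for i in random.sample(range(5), 3)]
M, T = prove(g, N, blocks, tags, challenge)
print(verify(g, N, M, T))   # True while the blocks are intact
```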
  • the expected update time is O(log n) at both the server and the client whp;
  • the expected query time at the server, the expected verification time at the client and the expected communication complexity for challenging C random blocks is O(C log n) whp;
  • the client uses O(1) space;
  • the server uses O(n) expected space whp.
  • the challenger can either extract the challenged blocks, or break the collision-resistance of the hash function used.
  • the challenger will have two sub- entities: An extractor who extracts the challenged blocks from the adversary's proof, and a reductor who breaks the collision-resistance if the extractor fails to extract the original blocks.
  • the challenger is given a hash function, which he also passes on to the reductor.
  • the challenger plays the data possession game with the adversary using this hash function, honestly answering every query of the adversary.
  • the challenger provides the reductor the blocks (together with their ids) whose update proofs have verified, so that the reductor can keep them in its storage.
  • the extractor does not know the original blocks, only the reductor does.
  • the reductor keeps updating the blocks in its storage when the adversary performs updates. Therefore, the reductor always keeps the latest version of each block. This difference is invisible to the adversary, and so he will behave in the same way as he would to an honest challenger.
  • the adversary replies to the challenge sent by the challenger.
  • the extractor just outputs the blocks contained in the proof sent by the adversary. If this proof verifies, and hence the adversary wins, it must be the case that either all the blocks are intact (and so the extractor outputs the original blocks) or the reductor breaks collision-resistance as follows.
  • the challenger passes all the blocks (together with their ids) in the proof to the reductor.
  • the reductor can output the original block (the latest verifying version of the block he stored that has the same block id) and the block sent in the proof as a collision. Therefore, if the adversary has a non-negligible probability of winning the data possession game, the challenger can either extract (using the extractor) or break the collision-resistance of the hash function (using the reductor) with non-negligible probability.
  • Theorem 3 (Security of tagged DPDP protocol).
  • the DPDP protocol with tags is secure in the standard model according to Definition 3 and assuming the existence of a collision-resistant hash function and that the factoring assumption holds.
  • the challenger then samples a high-order element g (a random integer between 1 and N-1 will have non-negligible probability of being of high order in Z*_N, which suffices for the sake of the reduction argument; a tighter analysis can also be performed). He interacts with the adversary in the data possession game honestly, using the given hash function, and creates the tags while using N as the modulus and g as the base.
  • the challenger will have two sub-entities: An extractor who extracts the challenged blocks from the adversary's proof, and a reductor who breaks the collision-resistance of the hash function or factors N , if the extractor fails to extract the original blocks.
  • the challenger acts as in the previous proof. First, consider the case where only one block is challenged. If the adversary wins, and thus the proof verifies, then the challenger can either extract the block correctly (using the extractor), or break the factoring assumption or the collision-resistance of the hash function (using the reductor), as follows.
  • the extractor just outputs x. If the extractor succeeds in extracting the correct block, then one is done. Now suppose the extractor fails, which means x ≠ b.
  • the challenger provides the reductor with the block x in the proof, its block id, the hash function, and g,N .
  • If the extractor fails to extract the original blocks, one can employ the reductor as follows. With each rewind, if the proof given by the adversary verifies, the challenger passes on the M value and the tags in the proof to the reductor, along with the challenge. Call the original blocks b_j. The reductor first checks to see if there is any tag mismatch:
  • i.e., whether T(m_{i_j}) = g^{b_j} mod N holds for all 1 ≤ j ≤ C.
  • T = g^B mod N.
  • the challenger can either extract the original blocks (using the extractor), or break the collision-resistance of the hash function used or the factoring assumption (using the reductor) with non-negligible probability. This concludes the proof of Theorem 3.
  • some DPDP systems can tolerate some errors, e.g., movie files.
  • this probability will be high enough to deter any malicious behavior, especially considering the fact that one also has a public verifiability protocol that can be used for official arbitration purposes.
  • Theorem 1.1 Assume the strong RSA assumption and the factoring assumption hold.
  • the dynamic provable data possession scheme presented in this section (DPDP II) for a file consisting of n blocks has the following properties, where f is the ratio of the number of tampered blocks over the total number of blocks of the file:
  • the amortized update time is O(n^ε) at the server for some 0 < ε < 1 and O(1) at the client;
  • the query time at the server, the verification time at the client and the communication complexity for challenging C random blocks is O(C) .
  • the client uses O(1) space
  • the server uses O(n) space.
  • Variable-sized blocks: Although the scheme enables updates that insert, modify and delete whole blocks of data without affecting neighboring blocks, some applications or filesystems may more naturally wish to perform updates that do not cleanly map to fixed-size block boundaries. For example, an update which added or removed a line in a text file would require modifying each of the blocks in the file after the change, so that data in later blocks could still be accessed easily by byte offset (by calculating the corresponding block index). Under such a naive scheme, whole-block updates are inefficient, since new tags and proofs must be generated for every block following the updated one. A more complicated solution based solely on existing constructions could store block-to-byte tables in a "special" lookup block.
  • the ranking scheme assigns each internal skip list node u a rank r(u) equivalent to the number of leaf nodes (data blocks) reachable from the subtree rooted at u ; leaves (blocks) are implicitly assigned a rank value of 1.
  • Variable-sized blocks are supported by defining a leaf node's rank to be equal to the size of its associated block (e.g., in bytes).
  • Each internal node is assigned a rank equivalent to the amount of bytes reachable below it.
  • Queries and proofs proceed the same as before, except that ranks and intervals associated with the search path refer to byte offsets, not block indices, with updates phrased as, e.g., "insert m bytes at byte offset i ". Such an update would require changing only the block containing the data at byte index i . Similarly, modifications and deletions affect only those blocks spanned by the range of bytes specified in the update.
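  • A simplified sketch of the variable-size idea, using a plain binary tree as an assumed stand-in for the skip list: leaf ranks equal block sizes in bytes, internal ranks sum their children, and a byte offset is resolved to a block by descending on ranks, with no block renumbering needed.
```python
# Byte-offset lookup when leaf ranks are block sizes (assumed, simplified).
class Node:
    def __init__(self, left=None, right=None, block=None):
        self.left, self.right, self.block = left, right, block
        self.rank = len(block) if block is not None else left.rank + right.rank

def block_at_offset(node, offset):
    """Return (block, offset_within_block) for a 0-based byte offset."""
    if node.block is not None:
        return node.block, offset
    if offset < node.left.rank:
        return block_at_offset(node.left, offset)
    return block_at_offset(node.right, offset - node.left.rank)

def build(blocks):
    nodes = [Node(block=b) for b in blocks]
    while len(nodes) > 1:
        nodes = [Node(nodes[k], nodes[k + 1]) if k + 1 < len(nodes)
                 else nodes[k] for k in range(0, len(nodes), 2)]
    return nodes[0]

root = build([b"a" * 100, b"b" * 80, b"c" * 10])   # blocks of 100, 80, 10 bytes
print(root.rank)                                   # 190 total bytes
print(block_at_offset(root, 120))                  # lands in the second block, offset 20
```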
  • Directory hierarchies: One can also extend the DPDP scheme for use in authenticated storage systems consisting of multiple files within a directory hierarchy. The key idea is to place the root of each file's rank-based skip list (from the single-file scheme) as the leaves of a parent dictionary which is used to map file names to files.
  • key-based authenticated dictionaries [24] allows one to chain the proofs and update operations through the entire directory hierarchy; each directory is represented as a key-based skip list with leaves for each file or subdirectory it contains.
  • These dictionaries are applied in a nested manner, with the basis of the topmost dictionary as the root of the file system, and, at the bottom, leaves for the tags associated with blocks of data.
  • This extension provides added flexibility for multi-user environments.
  • a system administrator who employs an untrusted storage provider.
  • the administrator can keep the skip list basis corresponding to the topmost directory, and use it to periodically check the integrity of the whole file system.
  • Each user can keep the skip list basis corresponding to her home directory, and use it to independently check the integrity of the directory hierarchy rooted at that basis, at any time and without need for cooperation from the administrator.
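  • The following is a rough, assumed illustration of chaining labels through a directory hierarchy; it uses flat hashing in place of the key-based authenticated skip lists of [24]. Each directory label commits to its children's labels, and either the file-system root or a home-directory label can serve as a basis for independent integrity checks, as described above.
```python
# Nested label chaining through a directory tree (illustrative assumption).
import hashlib

def file_label(block_labels):
    """Stand-in for the basis of a file's rank-based skip list."""
    h = hashlib.sha1()
    for label in block_labels:
        h.update(label)
    return h.digest()

def dir_label(children):
    """children: dict mapping entry name -> label of a file or subdirectory."""
    h = hashlib.sha1()
    for name in sorted(children):          # a fixed key order, as in a dictionary
        h.update(name.encode() + children[name])
    return h.digest()

blocks = [hashlib.sha1(b"block%d" % i).digest() for i in range(4)]
home_alice = dir_label({"report.txt": file_label(blocks[:2]),
                        "notes.txt": file_label(blocks[2:])})
fs_root = dir_label({"alice": home_alice,
                     "bob": dir_label({})})
print(fs_root.hex())     # basis kept by the administrator for the whole file system
print(home_alice.hex())  # basis kept by the user for her home directory
```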
  • Version control: One can build on the extensions further to efficiently support versioning systems (e.g., a CVS repository, or a versioning filesystem).
  • Such a system can be supported by adding another additional layer of key-based authenticated dictionaries [24], keyed by revision number (e.g., an indication of the revision), between the dictionaries for each file's directory and its data, chaining proofs as in previous extensions. (See FIG. 10 for an illustration.)
  • the client need only store the topmost basis; thus one can support a versioning system for a single file with only O(1) storage at the client and O(log n + log v) proof complexity, where v is the number of file versions.
  • the proof complexity for the versioning file system will be O(d(log n + log v)).
  • the server may implement its method of block storage independently from the dictionary structures used to authenticate data; it need not physically duplicate each block of data that appears in each new version.
  • this extension requires the addition of a new rank-based dictionary representing file data for each new revision added (since this dictionary is placed at the leaf of each file's version dictionary).
  • To avoid this redundancy, one can use persistent authenticated skip lists [1] along with the rank mechanism.
  • These persistent data structures handle skip list updates by adding new nodes for those affected by an update (nodes appearing along the search path), while preserving old internal nodes and roots corresponding to previous versions of the structure before each update.
  • the server stores only the nodes corresponding to blocks affected by it.
  • the performance of the DPDP I scheme (Section 4) is evaluated in terms of communication and computational overhead, in order to determine the price of dynamism over static PDP. For ease of comparison, this evaluation uses the same scenario as in PDP [2], where a server wishes to prove possession of a 1GB file. As observed in [2], detecting a 1% fraction of incorrect data with 99% confidence requires challenging a constant number of 460 blocks; the same number of challenges is used for comparison.
  • FIG. 6 shows the expected size of proofs of possession under the instant scheme on a 1GB file, for 99% probability of detecting misbehavior.
  • FIG. 7 depicts the computation time required by the server in response to a challenge for a 1GB file, with 99% probability of detecting misbehavior.
  • the size of this block M is the same as that used by the PDP scheme in [2], and is thus represented by the line labeled PDP.
  • the distance between this line and those for the DPDP I scheme represents communication overhead — the price of dynamism — which comes from the skip list query responses (illustrated in FIG. 3).
  • Each response contains on average 1.5 log n rows, so the total size decreases exponentially (but slowly) with increasing block size, providing near-constant overhead except at very small block sizes.
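  • A back-of-the-envelope calculation of this skip-list overhead for the 1GB/460-challenge scenario; the per-row byte size and the base-2 logarithm are assumptions made for illustration only.
```python
# Rough proof-overhead estimate: ~1.5 * log2(n) proof rows per challenged
# block, 460 challenged blocks, assumed constant row size.
import math

FILE_SIZE = 1 << 30          # 1GB file, as in the evaluation
CHALLENGES = 460             # 99% detection of 1% corruption
ROW_BYTES = 40               # assumed size of one proof row (level, rank, hash)

for block_size in (2048, 16384, 131072, 1048576):
    n = FILE_SIZE // block_size
    rows = 1.5 * math.log2(n)
    overhead = CHALLENGES * rows * ROW_BYTES
    print(f"block {block_size:>8} B: n={n:>7}, ~{rows:4.1f} rows/response, "
          f"~{overhead / 1024:6.1f} KB skip-list overhead")
```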
  • FIG. 7 presents the results of these experiments (averaged from 5 trials), which were performed on an AMD Athlon X2 3800+ system with 2GHz CPU and 2GB of RAM.
  • one computes the time required by the scheme for a 1 GB file under varying block sizes, providing 99% confidence.
  • performance is dominated by computing M and increases linearly with the block size; note that static PDP [2] must also compute this M in response to the challenge.
  • FIG. 5 presents performance characteristics of three public CVS repositories under the scheme; while an authenticated CVS system has not been implemented, the server overhead required for proofs of possession for each repository are reported.
  • “commits” refer to individual CVS checkins, each of which establish a new version, adding a new leaf to the version dictionary for that file; “updates” describe the number of inserts or deletes required for each commit.
  • Total statistics sum the number of lines (blocks) and kilobytes required to store all inserted lines across all versions, even after they have been removed from the file by later deletions (since the server continues to store them).
  • Proof size and time per commit refer to a proof sent by the server to prove that a single commit (made up of, on average, about a dozen updates) was performed successfully, representing the typical use case. These commit proofs are very small (15KB to 21KB) and fast to compute, rendering them practical even though they are required for each commit. Experiments show that the DPDP scheme is efficient and practical for use in distributed applications.
  • the skip list data structure (see FIG. 8) is an efficient means for storing a set S of elements from an ordered universe. It supports the operations find(x) (determine whether element x is in S), insert(x) (insert element x in S) and delete(x) (remove element x from S). It stores a set S of elements in a series of linked lists S_0, S_1, S_2, ..., S_t.
  • the base list, S_0, stores all the elements of S in order, as well as sentinels associated with the special elements -∞ and +∞.
  • Each successive list S_i, for i ≥ 1, stores a sample of the elements from S_{i-1}. To define the sample at one level from the level below it, each element of S_{i-1} is chosen independently with probability 1/2 to be in S_i.
  • FIG. 8 shows an exemplary skip list used to store the ordered set {25, 31, 38, 39, 44, 55, 58, 67, 80, 81}.
  • the proof for the existence of element 39 (and for the absence of element 40) as proposed in [10] is the set {44, 39, 38, 31, f(v_1), f(v_6), f(v_7), f(v_8), f(v_9)}.
  • the recomputation of f(w_7) is performed by sequentially applying h(·) to this set.
  • FIG. 9 shows an exemplary file system skip list with blocks as leaves, directories and files as roots of nested skip lists.
  • FIG. 10 illustrates an exemplary version control file system. Notice the additional level of skip lists for holding versions of a file. To eliminate redundancy at the version level, persistent authenticated skip lists could be used [1]: the complexity of these proofs will then be O(log n + log v + d log f).
  • a skip list is a data structure for storing a sorted list of items using a hierarchy of linked lists that connect subsequences of the items. These auxiliary lists enable item lookup with greater efficiency as compared with a balanced binary search tree (i.e., with a number of probes proportional to log n instead of n).
  • a skip list is built in layers, also referred to herein as levels.
  • a search for a target element begins at the head element (i.e., root node) in the top list and proceeds horizontally until the current element is greater than or equal to the target. If the current element is equal to the target, it has been found. If the current element is greater than the target, the procedure is repeated after returning to the previous element and dropping down vertically to the next lower list (the next level down).
  • nodes of a skip list generally correspond to an interval of values and, thus, nodes of a skip list may be seen to have an interval value associated with the respective node.
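  • The following sketch implements a plain, unauthenticated skip-list search along the lines just described; the sampling probability of 1/2 and the node layout with right/down pointers are conventional assumptions, not the patent's authenticated structure.
```python
# Plain skip-list search: move right while the next element is still <= the
# target, otherwise drop down one level, as described above.
import random

class SLNode:
    def __init__(self, key, right=None, down=None):
        self.key, self.right, self.down = key, right, down

def build_skip_list(keys, p=0.5):
    """Build levels S_0, S_1, ... where each level samples the one below."""
    NEG = float("-inf")
    level = [SLNode(NEG)] + [SLNode(k) for k in sorted(keys)]
    for a, b in zip(level, level[1:]):
        a.right = b
    while len(level) > 1:
        upper = [SLNode(level[0].key, down=level[0])]        # keep the sentinel
        upper += [SLNode(n.key, down=n) for n in level[1:] if random.random() < p]
        for a, b in zip(upper, upper[1:]):
            a.right = b
        level = upper
    return level[0]           # head node of the top list

def find(head, target):
    node = head
    while node is not None:
        while node.right is not None and node.right.key <= target:
            node = node.right
        if node.key == target:
            return True
        node = node.down      # drop to the next lower list
    return False

sl = build_skip_list([25, 31, 38, 39, 44, 55, 58, 67, 80, 81])  # set from FIG. 8
print(find(sl, 39), find(sl, 40))    # True False
```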
  • RSA is an algorithm for public-key cryptography [25].
  • Hash trees or Merkle trees are a type of data structure which contains a tree of summary information about a larger piece of data (e.g., a file) used to verify its contents.
  • a hash tree is a tree of hashes in which the leaves are hashes of data blocks in, for instance, a file or set of files. Nodes further up in the tree are the hashes of their respective children.
  • a cryptographic hash function such as SHA-1, Whirlpool, or Tiger is used for the hashing. If the hash tree only needs to protect against unintentional damage, much less secure checksums such as cyclic redundancy checks (CRCs) can be used.
  • the top of a hash tree has a top hash (or root hash or master hash).
  • the top hash is acquired from a trusted source, for instance a friend or a web site that is known to have good recommendations of files to download.
  • the hash tree can be received from any non-trusted source, such as any peer in the p2p network. Then, the received hash tree is checked against the trusted top hash, and if the hash tree is damaged or fake, another hash tree from another source will be tried until the program finds one that matches the top hash.
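  • The following sketch builds a small hash tree and verifies one block against a trusted top hash, as in the peer-to-peer download scenario above; the use of SHA-256 and the duplicate-last-node padding for odd levels are illustrative assumptions.
```python
# Hash-tree (Merkle-tree) sketch: only the top hash must come from a trusted
# source; branches received from untrusted peers are checked against it.
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Return all tree levels, leaves first; the last level is [top_hash]."""
    level = [H(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last hash if odd
            level = level + [level[-1]]
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def audit_path(levels, index):
    """Sibling hashes needed to recompute the top hash for one leaf."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1                # the other node of the pair
        path.append((index % 2 == 1, level[sibling]))
        index //= 2
    return path

def verify_block(block, path, top_hash):
    node = H(block)
    for sibling_is_left, sibling in path:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node == top_hash

blocks = [b"chunk-%d" % i for i in range(5)]
levels = build_levels(blocks)
top = levels[-1][0]                        # obtained from a trusted source
print(verify_block(blocks[3], audit_path(levels, 3), top))   # True
print(verify_block(b"bogus", audit_path(levels, 3), top))    # False
```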
  • a hash function is a well-defined procedure or mathematical function that converts a large amount of data into a small datum (e.g., a single integer) that may be used as an index (e.g., in an array or other data structure). Hash functions are often used to speed up table lookup or data comparison tasks.
  • exemplary cryptographic hashes include elf64, HAVAL, MD2, MD4, MD5, RadioGatún, RIPEMD-64, RIPEMD-160, RIPEMD-320, SHA-1, SHA-256, SHA-384, SHA-512, Skein, Tiger and Whirlpool.
  • any suitable hash function may be used with the exemplary embodiments of the invention.
  • the selection of a particular hash function may depend on the intended use and/or desired attributes of the system (e.g., in view of the attributes of the hash function, such as length and cost, for example).
  • Both a skip list and a hash tree are considered herein to be organizational structures having a generally tree-like structure comprised of nodes.
  • Each such structure has a root node (e.g., located at the top or root of the hash tree, or at the top left or root of the skip list).
  • the internal nodes lead to zero or more other internal nodes and/or one or more leaf nodes.
  • the leaf nodes are located at the very bottom of the list/tree (e.g., at the bottommost level/layer).
  • Data (e.g., one or more files, collections of files, directories, file systems) or portions of data are stored in accordance with the leaf nodes, as noted above.
  • the root node, internal nodes and/or leaf nodes may lead to another node on the same level/layer.
  • the nodes of the list/tree each have a hash value associated with the node.
  • the nodes of the list/tree may be referred to using a label (e.g., v_i or w_j). Two nodes are considered linked within the list/tree if there is a connection pointing from one node to the other node.
  • links between nodes are either pointing from one node to another node at the same level or pointing from one node to another node at a lower level.
  • FIG. 11 illustrates a simplified block diagram of various exemplary electronic devices that are suitable for use in practicing the exemplary embodiments of this invention.
  • FIG. 11 shows a system 100 having a client 102 and a server 112.
  • the client 102 has at least one data processor (DP) 104 and at least one memory (MEM) 106 coupled to the DP 104.
  • the client 102 is configured for bidirectional communication with the server 112, for example, using one or more communication components, such as a transceiver or modem (not shown).
  • the MEM 106 stores information (INFO) 110 in accordance with exemplary embodiments of the invention, as further described herein.
  • the INFO 110 may comprise one or more files, one or more dictionaries (e.g., authenticated dictionaries), one or more data files (e.g., skip lists, skip list information, hash values) used for security purposes (e.g., authentication, verification), one or more file systems or file collections and/or other information, data or files, as non-limiting examples.
  • the client 102 may comprise any suitable electronic device, including stationary and portable computers, as non-limiting examples.
  • the client 102 may comprise additional components and/or functions.
  • the client 102 may include one or more user interface (UI) elements, such as a display, a keyboard, a mouse or any other such UI components, as non-limiting examples.
  • the client 102 may comprise a communication component (e.g., a transceiver, a modem) that enables communication with one or more other devices, such as the server 112, for example.
  • the server 112 has at least one data processor (DP) 114 and at least one memory (MEM) 116 coupled to the DP 114.
  • the server 112 is configured for bidirectional communication with the client 102, for example, using one or more communication components, such as a transceiver or modem (not shown).
  • the MEM 116 stores a file system (FS) 120 and an authentication service (AS) 122 in accordance with exemplary embodiments of the invention, as further described herein.
  • the functionality of the FS 120 and AS 122 may be stored in or provided by a single component, such as a memory, a circuit, an integrated circuit or a processor, as non-limiting examples.
  • the functionality of the FS 120 and AS 122 may be stored in or provided by separate components (e.g., two or more memories, two or more circuits, two or more integrated circuits, two or more processors).
  • the MEM 116 of the server 112 may store additional information or data, such as one or more files, one or more dictionaries (e.g., authenticated dictionaries), one or more data files (e.g., skip lists, skip list information, hash values) used for security purposes (e.g., authentication, verification), one or more file systems or file collections and/or other information, data or files, as non-limiting examples.
  • the server 112 may comprise any suitable electronic device, including stationary and portable computers, as non-limiting examples.
  • the server 112 may comprise additional components and/or functions.
  • the server 112 may include one or more user interface (UI) elements, such as a display, a keyboard, a mouse or any other such UI components, as non-limiting examples.
  • the server 112 may comprise a communication component (e.g., a transceiver, a modem) that enables communication with one or more other devices, such as the client 102, for example.
  • the server 112 may be considered an untrusted remote server storing data on behalf of and for access by the client 102.
  • the server 112 may store data (e.g., one or more file systems) using one or more skip lists and/or hashing schemes (e.g., hash trees), as non-limiting examples.
  • the client 102 may be configured to access data stored by the server 112, such as data stored in one or more skip lists, for example.
  • the exemplary embodiments of this invention may be carried out by computer software implemented by the one or more of the DPs 104, 114 or by hardware, or by a combination of hardware and software.
  • the exemplary embodiments of this invention may be implemented by one or more integrated circuits.
  • the MEMs 106, 116 may be of any type appropriate to the technical environment and may be implemented using any appropriate data storage technology, such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory and removable memory, as non-limiting examples.
  • the DPs 104, 114 may be of any type appropriate to the technical environment, and may encompass one or more of microprocessors, general purpose computers, special purpose computers and processors based on a multi-core architecture, as non-limiting examples.
  • Exemplary embodiments of the invention or various aspects thereof, such as the authentication service, as a non-limiting example, may be implemented as a computer program stored by the respective MEM 106, 116 and executable by the respective DP 104, 114.
  • an apparatus comprising: at least one memory configured to store data; and at least one processor configured to perform operations on the stored data, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to maintain a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from a level of the node within the skip list, the rank value of the node within the skip list and an interval between
  • f(v) = h(A(v) || f(dwn(v)) || f(rgt(v))), where A(v) = l(v) || r(v)
  • a second operation performed by the at least one processor comprises verifying the proof Π(i) returned by the first operation as follows: if
  • T(m_i) is a representation of a modification for the block i
  • mu is a new block to be inserted in the skip list
  • A'(v) is an updated 4-tuple for the node v
  • f'(v) is an updated hash value for the node v
  • f'(s) is an updated hash value of a top-leftmost node s of the skip list.
  • Λ(Π'(i)) = S(A'(v_1), A'(v_2), ..., A'(v_m)) and denotes a last element of
  • each leaf node of the skip list has an associated homomorphic tag that is a function of the associated block, wherein a tag size of the associated homomorphic tag is smaller than a block size of the associated block and the homomorphic tags enable blockless verification.
  • An apparatus as in any above, where usage of the homomorphic tags enables a client to check the integrity of the at least one file (the data, the portions of the data associated with the blocks) by an operation performed on the homomorphic tags (e.g., the server performing an operation on the tags and sending a result to the client) and without the client downloading an entirety of the at least one file (without the client downloading the data or at least all of the data).
  • An apparatus as in any above, where the at least one memory is further configured to store the skip list.
  • An apparatus as in any above further comprising an input (e.g., means for receiving, such as a receiver or modem, as non-limiting examples) configured to receive an update instruction from a client.
  • the update instruction comprising an instruction to perform at least one of: modifying at least one block, deleting at least one block and inserting at least one new block.
  • the at least one processor is further configured to perform the update instruction on the skip list and obtain an updated skip list, an updated hash value for the root node and an update proof corresponding to the updated skip list.
  • An apparatus as in any above further comprising an output (e.g., means for sending, such as a transmitter or modem, as non-limiting examples) configured to send at least the update proof and the updated hash value of the root node to the client.
  • the update proof and the updated hash value of the root node enable the client to authenticate the performance of the update instruction by the apparatus.
  • An apparatus as in any above further comprising an input (e.g., means for receiving, such as a receiver or modem, as non-limiting examples) configured to receive a challenge from a client.
  • An apparatus as in any above where the at least one processor is further configured to generate a challenge proof based on the received challenge.
  • An apparatus as in any above further comprising an output (e.g., means for sending, such as a transmitter or modem, as non-limiting examples) configured to send the challenge proof to the client.
  • the challenge proof enables the client to verify that at least a portion of the data stored by the apparatus is intact.
  • the apparatus comprises a remote untrusted server.
  • An apparatus as in any above where the at least one file comprises a file system and the apparatus supports versioning file systems by use of at least one key-based authenticated dictionary, keyed by revision number, between one or more dictionaries for each file's directory and each file's data.
  • An apparatus as in any above where a block size of at least one block of the plurality of blocks is variable.
  • An apparatus as in any above, where a respective block size for each block of the plurality of blocks is variable.
  • a program storage device readable by a processor of an apparatus, tangibly embodying a program of instructions executable by the processor for performing operations, the operations comprising: storing data, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file; and maintaining a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from
  • a program storage device as above, further comprising one or more aspects of the exemplary embodiments of the invention as described in further detail herein.
  • a method comprising: storing data (e.g., on at least one memory of an apparatus), where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file (301); and maintaining (e.g., by the apparatus) a skip list corresponding to the stored data (302), where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is
  • an apparatus comprising: means for storing data (e.g., at least one memory), where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file; and means for maintaining (e.g., at least one processor) a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from a level of the node within the
  • An apparatus as above, where the means for storing comprises a storage device or at least one memory and the means for maintaining comprises at least one circuit or at least one processor.
  • An apparatus as in any above, where the means for performing comprises at least one circuit or at least one processor.
  • an apparatus comprising: storage circuitry configured to store data, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file; and processing circuitry configured to maintain a skip list corresponding to the stored data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of the root node and the at least one internal node is computed from a level of the node within the skip list, the rank value of the node within the skip list
  • an apparatus comprising: at least one memory configured to store information; and at least one processor configured to perform operations with (e.g., on or using) the stored information, where the information relates to data comprising at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to perform operations with respect to (e.g., on or using) a skip list corresponding to the data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash
  • a program storage device readable by a processor of an apparatus, tangibly embodying a program of instructions executable by the processor for performing operations, the operations comprising: storing information; and performing further operations with (e.g., on or using) the stored information, where the information relates to data comprising at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to perform operations with respect to (e.g., on or using) a skip list corresponding to the data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs
  • a program storage device as above, further comprising one or more aspects of the exemplary embodiments of the invention as described in further detail herein.
  • a method comprising: storing information (e.g., on at least one memory of an apparatus) (401 ); and performing operations (e.g., using at least one processor of the apparatus) with (e.g., on or using) the stored information (402), where the information relates to data comprising at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to perform operations with respect to (e.g., on or using) a skip list corresponding to the data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA
  • an apparatus comprising: means for storing information (e.g., at least one memory, at least one storage device, storage circuitry); and means for performing operations (e.g., at least one processor, at least one processing component, processing circuitry) with (e.g., on or using) the stored information, where the information relates to data comprising at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to perform operations with respect to (e.g., on or using) a skip list corresponding to the data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a
  • an apparatus comprising: storage circuitry configured to store information; and processing circuitry configured to perform operations with (e.g., on or using) the stored information, where the information relates to data comprising at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to perform operations with respect to (e.g., on or using) a skip list corresponding to the data, where the skip list comprises an ordered tree structure having a root node, at least one internal node and at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the skip list has an associated rank value corresponding to a size of a subtree of the skip list rooted at the node, where the skip list comprises a skip list or a RSA tree, where the skip list employs a hashing scheme to assign a hash value to each node of the skip list, where the hash value of
  • an apparatus comprising: at least one memory configured to store (e.g., means for storing) data; and at least one processor configured to perform (e.g., means for performing) operations on the stored data, where the data comprises at least one file organized as a plurality of blocks with each block comprising at least a portion of the at least one file, where the apparatus is configured to maintain (e.g., means for maintaining, such as at least one processor) a RSA tree corresponding to the stored data, where the RSA tree comprises an ordered tree structure having a plurality of nodes including at least one leaf node, where each of the at least one leaf nodes corresponds to a block of the plurality of blocks, where each node of the RSA tree has an associated rank value corresponding to a size of a subtree of the RSA tree rooted at the node, where an ε is chosen between 0 and 1 such that the tree structure has O(1/ε) levels with each node having degree O(n^ε) (a small sketch of this level/degree trade-off appears after this list),
  • each leaf node has an associated homomorphic tag that is a function of the associated block, wherein a tag size of the associated homomorphic tag is smaller than a block size of the associated block and the homomorphic tags enable blockless verification (an executable illustration of this tag-combining check appears after this list), where the RSA tree is configured to secure the homomorphic tags.
  • exemplary embodiments of the invention may be implemented as a computer program product comprising program instructions embodied on a tangible computer-readable medium. Execution of the program instructions results in operations comprising steps of utilizing the exemplary embodiments or steps of the exemplary method.
  • exemplary embodiments of the invention as discussed above and as particularly described with respect to exemplary methods, may be implemented in conjunction with a program storage device (e.g., a computer-readable medium, a memory) readable by a machine (e.g., a computer, a portable computer, a device), tangibly embodying a program of instructions (e.g., a program, a computer program) executable by the machine (or by a processor of the machine) for performing operations.
  • the operations comprise steps of utilizing the exemplary embodiments or steps of the exemplary method.
  • FIGS. 12 and 13 further may be considered to correspond to one or more functions and/or operations that are performed by one or more components, circuits, chips, apparatus, processors, computer programs and/or function blocks. Any and/or all of the above may be implemented in any practicable solution or arrangement that enables operation in accordance with the exemplary embodiments of the invention as described herein.
  • FIGS. 12 and 13 should be considered merely exemplary and non-limiting. It should be appreciated that the blocks shown in FIGS. 12 and 13 may correspond to one or more functions and/or operations that may be performed in any order (e.g., any suitable, practicable and/or feasible order) and/or concurrently (e.g., as suitable, practicable and/or feasible) so as to implement one or more of the exemplary embodiments of the invention. In addition, one or more additional functions, operations and/or steps may be utilized in conjunction with those shown in FIGS. 12 and 13 so as to implement one or more further exemplary embodiments of the invention.
  • FIGS. 12 and 13 may be utilized, implemented or practiced in conjunction with one or more further aspects in any combination (e.g., any combination that is suitable, practicable and/or feasible) and are not limited only to the steps, blocks, operations and/or functions shown in FIGS. 12 and 13.
  • connection or coupling between the identified elements.
  • one or more intermediate elements may be present between the “coupled” elements.
  • the connection or coupling between the identified elements may be, as non-limiting examples, physical, electrical, magnetic, logical or any suitable combination thereof in accordance with the described exemplary embodiments.
  • the connection or coupling may comprise one or more printed electrical connections, wires, cables, mediums or any suitable combination thereof.
  • various exemplary embodiments of the invention can be implemented in different mediums, such as software, hardware, logic, special purpose circuits or any combination thereof.
  • some aspects may be implemented in software which may be run on a computing device, while other aspects may be implemented in hardware.
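  • As a concrete, executable illustration of the level-and-rank hashing and path verification summarized in the bullets above, the following Python sketch may be helpful. It is not the patent's exact skip-list construction: it uses a simplified binary hierarchy, and the names H, Node, build, prove and verify are illustrative assumptions. Each node is hashed over its level, its rank (the number of blocks beneath it) and the hashes of its successors, and a client can recheck one block against the root hash using only the sibling data collected along a single search path.

    import hashlib

    def H(*parts):
        # Collision-resistant hash over a mixed tuple of ints and byte strings.
        h = hashlib.sha256()
        for p in parts:
            h.update(p if isinstance(p, bytes) else str(p).encode())
            h.update(b"|")
        return h.digest()

    class Node:
        def __init__(self, level, rank, digest, left=None, right=None):
            self.level, self.rank, self.digest = level, rank, digest
            self.left, self.right = left, right

    def build(blocks):
        # Leaves are hashed from (level 0, rank 1, block data); internal nodes
        # are hashed from (level, rank, child digests).
        nodes = [Node(0, 1, H(0, 1, b)) for b in blocks]
        level = 0
        while len(nodes) > 1:
            level += 1
            paired = []
            for i in range(0, len(nodes), 2):
                left = nodes[i]
                right = nodes[i + 1] if i + 1 < len(nodes) else None
                rank = left.rank + (right.rank if right else 0)
                digest = H(level, rank, left.digest, right.digest if right else b"")
                paired.append(Node(level, rank, digest, left, right))
            nodes = paired
        return nodes[0]

    def prove(root, index):
        # Walk from the root to the leaf holding block `index`, recording
        # (level, rank, sibling digest, sibling-is-right?) at every step.
        path, node = [], root
        while node.level > 0:
            if index < node.left.rank:
                sibling = node.right.digest if node.right else b""
                path.append((node.level, node.rank, sibling, True))
                node = node.left
            else:
                path.append((node.level, node.rank, node.left.digest, False))
                index -= node.left.rank
                node = node.right
        return path

    def verify(root_digest, block, path):
        # Recompute the root digest from the block and the path data alone.
        digest = H(0, 1, block)
        for level, rank, sibling, sibling_is_right in reversed(path):
            digest = (H(level, rank, digest, sibling) if sibling_is_right
                      else H(level, rank, sibling, digest))
        return digest == root_digest

    blocks = [b"block %d" % i for i in range(8)]
    root = build(blocks)
    proof = prove(root, 5)
    assert verify(root.digest, blocks[5], proof)

  The same path recomputation is what lets a client authenticate an update: the server returns the search path for the affected block together with the updated root hash, and the client replays the recomputation over the claimed new values before accepting the new root.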
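  • The blockless verification enabled by the homomorphic tags (discussed in the bullets above) can likewise be illustrated with a short Python sketch. The toy modulus, generator and challenge format below are assumptions chosen for readability, not the patent's parameters; a deployment would use a full-size RSA modulus and the tag construction described herein. The algebra, however, is the standard one: every block carries a small tag g^m mod N, the server combines the challenged tags and blocks into two short values, and the client verifies the combination without receiving any block.

    import hashlib
    import secrets

    # Toy public parameters -- illustrative only; a real scheme uses a large
    # RSA modulus whose factorisation is known only to the client.
    P, Q = 1000003, 1000033
    N = P * Q
    G = 2

    def block_to_int(block: bytes) -> int:
        return int.from_bytes(hashlib.sha256(block).digest(), "big")

    def tag(block: bytes) -> int:
        # Homomorphic tag: a short group element standing in for a large block.
        return pow(G, block_to_int(block), N)

    # Setup: the client tags every block and ships blocks plus tags to the server.
    blocks = [b"file block %d" % i for i in range(16)]
    tags = [tag(b) for b in blocks]

    # Challenge: the client names a few random blocks and random coefficients.
    rng = secrets.SystemRandom()
    challenge = [(i, 1 + secrets.randbelow(2**32))
                 for i in rng.sample(range(len(blocks)), 4)]

    # Response: the server combines tags and blocks -- no block is sent back.
    combined_tag, combined_message = 1, 0
    for index, coefficient in challenge:
        combined_tag = (combined_tag * pow(tags[index], coefficient, N)) % N
        combined_message += coefficient * block_to_int(blocks[index])

    # Verification: a single exponentiation convinces the client that the
    # challenged blocks are intact, since G^(sum a_i*m_i) must equal the
    # product of the challenged tags raised to the coefficients.
    assert pow(G, combined_message, N) == combined_tag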
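  • Finally, the RSA tree variant summarized above trades depth against fan-out: choosing ε between 0 and 1 yields O(1/ε) levels with node degree O(n^ε). The small helper below (illustrative only, not part of the patent) makes that trade-off explicit for a few sample values of ε.

    import math

    def rsa_tree_shape(n_blocks: int, epsilon: float):
        # Smaller epsilon gives more levels with smaller node degree;
        # larger epsilon gives fewer levels with larger node degree.
        levels = math.ceil(1 / epsilon)
        degree = math.ceil(n_blocks ** epsilon)
        return levels, degree

    for eps in (0.1, 0.25, 0.5):
        print(eps, rsa_tree_shape(10**6, eps))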

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/US2009/004322 2008-07-25 2009-07-24 Apparatus, methods, and computer program products providing dynamic provable data possession WO2010011342A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/737,583 US8978155B2 (en) 2008-07-25 2009-07-24 Apparatus, methods, and computer program products providing dynamic provable data possession
CA2731954A CA2731954C (en) 2008-07-25 2009-07-24 Apparatus, methods, and computer program products providing dynamic provable data possession

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13706608P 2008-07-25 2008-07-25
US61/137,066 2008-07-25

Publications (1)

Publication Number Publication Date
WO2010011342A1 true WO2010011342A1 (en) 2010-01-28

Family

ID=41570546

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/004322 WO2010011342A1 (en) 2008-07-25 2009-07-24 Apparatus, methods, and computer program products providing dynamic provable data possession

Country Status (3)

Country Link
US (1) US8978155B2 (en)
CA (1) CA2731954C (en)
WO (1) WO2010011342A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8726034B2 (en) 2008-08-29 2014-05-13 Brown University Cryptographic accumulators for authenticated hash tables
WO2014127904A1 (en) * 2013-02-22 2014-08-28 Guardtime Ip Holdings Limited Verification system and method with extra security for lower-entropy input records
WO2020151330A1 (zh) * 2019-01-23 2020-07-30 Ping An Technology (Shenzhen) Co., Ltd. Data possession verification method and terminal device

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001006374A2 (en) * 1999-07-16 2001-01-25 Intertrust Technologies Corp. System and method for securing an untrusted storage
US9497028B1 (en) * 2007-05-03 2016-11-15 Google Inc. System and method for remote storage auditing
WO2010020968A1 (en) * 2008-08-22 2010-02-25 Nxp B.V. Verification of process integrity
CN103890709B (zh) * 2011-11-07 2016-08-17 Empire Technology Development LLC Cache-based key-value database mapping and replication
US9678979B1 (en) * 2013-07-31 2017-06-13 EMC IP Holding Company LLC Common backup format and log based virtual full construction
US9268969B2 (en) * 2013-08-14 2016-02-23 Guardtime Ip Holdings Limited System and method for field-verifiable record authentication
US9418131B1 (en) * 2013-09-24 2016-08-16 Emc Corporation Synchronization of volumes
CN104794137B (zh) * 2014-01-22 2019-01-22 Tencent Technology (Shenzhen) Co., Ltd. Method for processing sorted records, sorted query method, and related apparatus and system
US9606870B1 (en) 2014-03-31 2017-03-28 EMC IP Holding Company LLC Data reduction techniques in a flash-based key/value cluster storage
CN105446964B (zh) * 2014-05-30 2019-04-26 International Business Machines Corporation Method and apparatus for deduplication of files
US10025843B1 (en) 2014-09-24 2018-07-17 EMC IP Holding Company LLC Adjusting consistency groups during asynchronous replication
US9608810B1 (en) 2015-02-05 2017-03-28 Ionic Security Inc. Systems and methods for encryption and provision of information security using platform services
WO2016155804A1 (en) 2015-03-31 2016-10-06 Nec Europe Ltd. Method for verifying information
WO2016180495A1 (en) 2015-05-13 2016-11-17 Nec Europe Ltd. A method for storing data in a cloud and a network for carrying out the method
US10193696B2 (en) * 2015-06-02 2019-01-29 ALTR Solutions, Inc. Using a tree structure to segment and distribute records across one or more decentralized, acylic graphs of cryptographic hash pointers
US10200198B2 (en) 2015-06-11 2019-02-05 PeerNova, Inc. Making cryptographic claims about stored data using an anchoring system
US20170017567A1 (en) 2015-07-15 2017-01-19 Innovium, Inc. System And Method For Implementing Distributed-Linked Lists For Network Devices
US20170017414A1 (en) 2015-07-15 2017-01-19 Innovium, Inc. System And Method For Implementing Hierarchical Distributed-Linked Lists For Network Devices
US20170017419A1 (en) * 2015-07-15 2017-01-19 Innovium, Inc. System And Method For Enabling High Read Rates To Data Element Lists
US20170017420A1 (en) 2015-07-15 2017-01-19 Innovium, Inc. System And Method For Enabling High Read Rates To Data Element Lists
US10740474B1 (en) 2015-12-28 2020-08-11 Ionic Security Inc. Systems and methods for generation of secure indexes for cryptographically-secure queries
US10503730B1 (en) 2015-12-28 2019-12-10 Ionic Security Inc. Systems and methods for cryptographically-secure queries using filters generated by multiple parties
US10152527B1 (en) 2015-12-28 2018-12-11 EMC IP Holding Company LLC Increment resynchronization in hash-based replication
US20170230186A1 (en) * 2016-02-05 2017-08-10 Samsung Electronics Co., Ltd. File management apparatus and method for verifying integrity
US10324635B1 (en) 2016-03-22 2019-06-18 EMC IP Holding Company LLC Adaptive compression for data replication in a storage system
US10310951B1 (en) 2016-03-22 2019-06-04 EMC IP Holding Company LLC Storage system asynchronous data replication cycle trigger with empty cycle detection
US9959063B1 (en) 2016-03-30 2018-05-01 EMC IP Holding Company LLC Parallel migration of multiple consistency groups in a storage system
US10095428B1 (en) 2016-03-30 2018-10-09 EMC IP Holding Company LLC Live migration of a tree of replicas in a storage system
US9959073B1 (en) 2016-03-30 2018-05-01 EMC IP Holding Company LLC Detection of host connectivity for data migration in a storage system
US10565058B1 (en) 2016-03-30 2020-02-18 EMC IP Holding Company LLC Adaptive hash-based data replication in a storage system
US10440033B2 (en) 2017-03-16 2019-10-08 Sap Se Data storage system file integrity check
CA3230160A1 (en) * 2017-08-11 2019-02-14 ALTR Solutions, Inc. Immutable datastore for low-latency reading and writing of large data sets
US11316696B2 (en) * 2017-09-29 2022-04-26 R3 Ltd. Hash subtrees for grouping components by component type
US11546166B2 (en) * 2018-07-02 2023-01-03 Koninklijke Philips N.V. Hash tree computation device
CN112445521B (zh) * 2019-09-02 2024-03-26 Cambricon Technologies Corporation Limited Data processing method, related device, and computer-readable medium
US11269595B2 (en) * 2019-11-01 2022-03-08 EMC IP Holding Company LLC Encoding and evaluating multisets using prime numbers
CN114265634A (zh) * 2021-12-22 2022-04-01 Agricultural Bank of China Limited File submission method and apparatus based on a centralized version control system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040107346A1 (en) * 2001-11-08 2004-06-03 Goodrich Michael T Efficient authenticated dictionaries with skip lists and commutative hashing
US7181585B2 (en) * 2003-04-25 2007-02-20 International Business Machines Corporation Defensive heap memory management
US20070276843A1 (en) * 2006-04-28 2007-11-29 Lillibridge Mark D Method and system for data retention
US7340054B2 (en) * 2004-09-14 2008-03-04 Sony Corporation Information processing method, decrypting method, information processing apparatus, and computer program

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7613796B2 (en) * 2002-09-11 2009-11-03 Microsoft Corporation System and method for creating improved overlay network with an efficient distributed data structure
US7308448B1 (en) * 2003-03-21 2007-12-11 Sun Microsystems, Inc Method and apparatus for implementing a lock-free skip list that supports concurrent accesses
ITBO20060648A1 (it) * 2006-09-20 2008-03-21 Univ Degli Studi Roma Tre Method for the dynamic and secure management of an authenticated relational table in a database
US9384175B2 (en) * 2008-02-19 2016-07-05 Adobe Systems Incorporated Determination of differences between electronic documents

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040107346A1 (en) * 2001-11-08 2004-06-03 Goodrich Michael T Efficient authenticated dictionaries with skip lists and commutative hashing
US7181585B2 (en) * 2003-04-25 2007-02-20 International Business Machines Corporation Defensive heap memory management
US7340054B2 (en) * 2004-09-14 2008-03-04 Sony Corporation Information processing method, decrypting method, information processing apparatus, and computer program
US20070276843A1 (en) * 2006-04-28 2007-11-29 Lillibridge Mark D Method and system for data retention

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8726034B2 (en) 2008-08-29 2014-05-13 Brown University Cryptographic accumulators for authenticated hash tables
US9098725B2 (en) 2008-08-29 2015-08-04 Brown University Cryptographic accumulators for authenticated hash tables
WO2014127904A1 (en) * 2013-02-22 2014-08-28 Guardtime Ip Holdings Limited Verification system and method with extra security for lower-entropy input records
CN105164971A (zh) * 2013-02-22 2015-12-16 Guardtime IP Holdings Limited Verification system and method with extra security for lower-entropy input records
WO2020151330A1 (zh) * 2019-01-23 2020-07-30 Ping An Technology (Shenzhen) Co., Ltd. Data possession verification method and terminal device

Also Published As

Publication number Publication date
CA2731954C (en) 2014-10-21
US20130198854A1 (en) 2013-08-01
US8978155B2 (en) 2015-03-10
CA2731954A1 (en) 2010-01-28

Similar Documents

Publication Publication Date Title
US8978155B2 (en) Apparatus, methods, and computer program products providing dynamic provable data possession
Erway et al. Dynamic provable data possession
US9977918B2 (en) Method and system for verifiable searchable symmetric encryption
Kamara et al. Cs2: A searchable cryptographic cloud storage system
US9098725B2 (en) Cryptographic accumulators for authenticated hash tables
Pasupuleti et al. An efficient and secure privacy-preserving approach for outsourced data of resource constrained mobile devices in cloud computing
Wang et al. Enabling public auditability and data dynamics for storage security in cloud computing
Kamara et al. Dynamic searchable symmetric encryption
Ateniese et al. Scalable and efficient provable data possession
Zhang et al. Provable multiple replication data possession with full dynamics for secure cloud storage
Bellare et al. Message-locked encryption and secure deduplication
Ateniese et al. Remote data checking using provable data possession
Rady et al. Integrity and confidentiality in cloud outsourced data
Esiner et al. Flexdpdp: Flexlist-based optimized dynamic provable data possession
Li et al. Integrity-verifiable conjunctive keyword searchable encryption in cloud storage
Chen et al. Data dynamics for remote data possession checking in cloud storage
Goodrich et al. Athos: Efficient authentication of outsourced file systems
Xu et al. Leakage resilient proofs of ownership in cloud storage, revisited
Heitzmann et al. Efficient integrity checking of untrusted network storage
Lu et al. Secure dynamic big graph data: Scalable, low-cost remote data integrity checking
Burns et al. Verifiable audit trails for a versioning file system
Junxiang et al. Dynamic provable data possession with batch-update verifiability
Zhao et al. Privacy-preserving TPA Auditing Scheme Based on Skip List for Cloud Storage.
Chen et al. Verifiable cloud data access: Design, analysis, and implementation
Abraham et al. Proving possession and retrievability within a cloud environment: A comparative survey

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09800695

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2731954

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09800695

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 12737583

Country of ref document: US