WO2022189865A1 - Methods and devices for verifying data integrity - Google Patents

Methods and devices for verifying data integrity

Info

Publication number
WO2022189865A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
blockchain
encrypted
clip
random number
Prior art date
Application number
PCT/IB2022/050377
Other languages
French (fr)
Inventor
Yuan Yuan
Original Assignee
Alipay Labs (singapore) Pte. Ltd.
Priority date
Filing date
Publication date
Application filed by Alipay Labs (singapore) Pte. Ltd.
Priority to CN202280003220.2A (CN115299010A)
Publication of WO2022189865A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3218 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using proof of knowledge, e.g. Fiat-Shamir, GQ, Schnorr, or non-interactive zero-knowledge proofs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231 Biological data, e.g. fingerprint, voice or retina
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3247 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving digital signatures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees

Definitions

  • the specification relates generally to computer technologies, and more particularly, to methods and devices for verifying data integrity.
  • Blockchain systems, also known as distributed ledger systems (DLSs) or consensus systems, may enable participating parties to store data securely and immutably.
  • Blockchain systems may include any DLSs, without referencing any particular use case, and may be used for public, private, and consortium blockchain networks.
  • a public blockchain network is open for all entities to use the system and participate in the consensus process.
  • a private blockchain network is provided for a particular entity, which centrally controls read and write permissions.
  • a consortium blockchain network is provided for a select group of entities, which control the consensus process, and includes an access control layer.
  • a blockchain system is implemented using a peer-to-peer (P2P) network, in which the nodes communicate directly with each other, e.g., without the need of a fixed, central server. Each node in the P2P network may initiate communication with another node in the P2P network.
  • a blockchain system maintains one or more blockchains.
  • a blockchain is a data structure for storing data, such as transactions, that may prevent tampering and manipulation of the data by malicious parties.
  • While blockchain systems can record data securely and immutably, they may lack the ability to check integrity or trustworthiness of the data in the first place. For example, if a user sends data to a blockchain system for recordation, but a malicious party intercepts and manipulates the data before the data is received by the blockchain system, absent a mechanism to check the integrity of the data received, the blockchain system may proceed to record the manipulated data, compromising the integrity of the data recorded on the blockchain system.
  • a computer-implemented method for verifying data integrity includes: obtaining a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypting the encrypted data clip to generate a decrypted data clip; parsing the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculating a hash based on the data content, the encrypted data clip, and the signature; encrypting the hash using the recovered random number sequence; and determining an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
  • a device for verifying data integrity includes: one or more processors; and one or more computer-readable memories coupled to the one or more processors and having instructions stored thereon that are executable by the one or more processors to: obtain a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypt the encrypted data clip to generate a decrypted data clip; parse the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculate a hash based on the data content, the encrypted data clip, and the signature; encrypt the hash using the recovered random number sequence; and determine an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
  • a non-transitory computer-readable medium has stored therein instructions that, when executed by a processor of a device, cause the device to perform a method for verifying data integrity.
  • the method includes: obtaining a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypting the encrypted data clip to generate a decrypted data clip; parsing the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculating a hash based on the data content, the encrypted data clip, and the signature; encrypting the hash using the recovered random number sequence; and determining an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
  • FIG. 1 is a schematic diagram of a blockchain system, according to an embodiment.
  • FIG. 2 is a schematic diagram of a computing device for implementing a node in a blockchain system, according to an embodiment.
  • FIGS. 3A-3C are a flow chart of a method for verifying data integrity, according to an embodiment.
  • FIG. 4 is an illustration depicting a data clip created based on a data content and a random number sequence, according to an embodiment.
  • FIG. 5 is a flow chart of a method for verifying data integrity, according to an embodiment.
  • FIG. 6 is a block diagram of an apparatus for verifying data integrity, according to an embodiment.
  • Embodiments of the specification provide methods and devices for verifying data integrity.
  • the methods and devices may allow users to sign off on data contents using signatures in manners so that the signatures can be used to verify the integrity of the data contents.
  • the methods and devices may also verify the authenticity of the signature.
  • the methods and devices may utilize blockchain systems to record information to facilitate the verifications of the integrity of the data content and the authenticity of the signature, and in some embodiments, the methods and devices may utilize one or more smart contracts executing on blockchain systems to perform the verifications.
  • the methods and devices may process user signatures in manners so that the signatures can be used to verify data integrity. This provides the methods and devices the abilities to determine whether a data content has been modified after a user signs off on the data content, thereby improving data integrity. In some embodiments, the methods and devices may process a signature so that its authenticity can be verified. This allows the methods and devices to determine whether the signature is provided by the purported signer, further improving data integrity. In some embodiments, the methods and devices may process the signature in a manner so that the signature may not be forged.
  • the methods and devices may improve data security because even if a malicious party obtains the signer’s secret information (e.g., a personal identification number) used to generate the signature, the malicious party still may not forge the signer’s signature.
  • the methods and devices may implement a voice signature. This makes the methods and devices more user friendly because the users are not required to keep track of cryptographic keys needed to generate their signatures.
  • the methods and devices may utilize blockchain systems to record information to facilitate the verifications of the integrity of the data contents and the authenticity of the signatures. This allows the methods and devices to record the information in a data structure that can prevent tampering and manipulation by malicious parties.
  • a blockchain is a data structure that stores data, e.g., transactions, in a way that may prevent tampering and manipulation of the data by malicious parties. The transactions stored in this manner may be immutable and subsequently verified.
  • a blockchain includes one or more blocks. Each block is linked to a previous block immediately before it in the blockchain by including a cryptographic hash of the previous block. Each block also may include a timestamp, its own cryptographic hash, and one or more transactions.
  • the transactions, which generally have already been verified by the nodes of the blockchain system, may be hashed and encoded into a data structure, such as a Merkle tree.
  • In a Merkle tree, data at leaf nodes of the tree is hashed, and all hashes in each branch of the tree may be concatenated at a root of the branch. This process continues up the tree to the root of the entire tree, which stores a hash that is representative of all data in the tree. A hash purporting to be of a transaction stored in the tree can be quickly verified by determining whether it is consistent with the structure of the tree.
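  • The Merkle tree construction described above can be sketched as follows. This is a minimal illustration, not the construction used by any particular blockchain: the choice of SHA-256, the binary tree shape, and the rule of duplicating the last hash on odd-sized levels are all assumptions, and `merkle_root` is a hypothetical helper name.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Return the root hash of a binary Merkle tree over the transactions."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]  # hash the data at the leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        # concatenate each pair of child hashes and hash the result
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"tx-a", b"tx-b", b"tx-c"])
tampered_root = merkle_root([b"tx-a", b"tx-X", b"tx-c"])
print(root.hex())
print(root != tampered_root)  # True: changing any leaf changes the root
```

  • Because the root depends on every leaf, recording only the root is enough to detect a change to any transaction in the tree.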
  • a blockchain system includes a network of computing nodes that manage, update, and maintain one or more blockchains.
  • the network may be a public blockchain network, a private blockchain network, or a consortium blockchain network.
  • numerous entities such as hundreds, thousands, or even millions of entities, can operate in a public blockchain network, and each of the entities operates at least one node in the public blockchain network.
  • the public blockchain network can be considered a public network with respect to the participating entities.
  • a majority of entities (nodes) must sign every block for the block to be valid and added to the blockchain of the blockchain network.
  • Examples of public blockchain networks include particular peer-to-peer payment networks that leverage a distributed ledger, referred to as blockchain.
  • a public blockchain network may support public transactions.
  • a public transaction is shared with all of the nodes in the public blockchain network, and is stored in a global blockchain.
  • a global blockchain is a blockchain replicated across all nodes, and all nodes are in perfect state consensus with respect to the global blockchain.
  • consensus protocols include proof-of-work (POW) (e.g., implemented in some crypto-currency networks), proof-of-stake (POS), and proof-of-authority (POA).
  • a private blockchain network may be provided for a particular entity, which centrally controls read and write permissions.
  • the entity controls which nodes are able to participate in the blockchain network.
  • private blockchain networks are generally referred to as permissioned networks that place restrictions on who is allowed to participate in the network, and on their level of participation (e.g., only in certain transactions).
  • Various types of access control mechanisms can be used (e.g., existing participants vote on adding new entities, a regulatory authority can control admission).
  • a consortium blockchain network may be private among the participating entities.
  • the consensus process is controlled by an authorized set of nodes, one or more nodes being operated by a respective entity (e.g., a financial institution, insurance company).
  • a consortium of ten (10) entities e.g., financial institutions, insurance companies
  • the consortium blockchain network can be considered a private network with respect to the participating entities.
  • each entity (node) must sign every block in order for the block to be valid, and added to the blockchain.
  • at least a sub-set of entities (nodes) e.g., at least 7 entities
  • FIG. 1 illustrates a schematic diagram of a blockchain system 100, according to an embodiment.
  • the blockchain system 100 may include a plurality of nodes, e.g., nodes 102-110, configured to operate on a blockchain 120.
  • the nodes 102-110 may form a network 112, such as a peer-to-peer (P2P) network.
  • Each of the nodes 102-110 may be a computing device, such as a computer or a computer system, configured to store a copy of the blockchain 120, or may be software running on the computing device, such as a process or an application.
  • Each of the nodes 102-110 may have a unique identifier.
  • the blockchain 120 may include a growing list of records in the form of data blocks, such as blocks B1-B5 in FIG. 1.
  • Each of the blocks B1-B5 may include a timestamp, a cryptographic hash of a previous block, and data of the present block, which may be transactions such as monetary transactions.
  • block B5 may include a timestamp, a cryptographic hash of block B4, and transaction data of block B5.
  • a hashing operation may be performed on the previous block to generate the cryptographic hash of the previous block.
  • the hashing operation may convert inputs of various lengths into cryptographic outputs of a fixed length through a hash algorithm, such as SHA-256.
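  • The linking of blocks B1-B5 by previous-block hashes can be sketched as follows. The block layout (a dictionary with `timestamp`, `prev_hash`, and `transactions` fields), the JSON encoding, and the all-zero hash for the first block are assumptions made for illustration; `append_block` and `chain_is_valid` are hypothetical helper names.

```python
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (encoding is an assumption)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    # The first block has no predecessor; an all-zero hash is used as a placeholder.
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"timestamp": time.time(), "prev_hash": prev_hash, "transactions": transactions})


def chain_is_valid(chain: list) -> bool:
    """Check that every block stores the hash of the block immediately before it."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))


chain = []
for i in range(1, 6):
    append_block(chain, [f"transaction in B{i}"])
print(chain_is_valid(chain))   # True

chain[2]["transactions"] = ["manipulated"]   # tamper with B3
print(chain_is_valid(chain))   # False: B4 no longer matches the hash of B3
```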
  • the nodes 102-110 may be configured to perform an operation on the blockchain 120. For example, when a node, e.g., the node 102, wants to store new data onto the blockchain 120, that node may generate a new block to be added to the blockchain 120 and broadcast the new block to other nodes, e.g., the nodes 104-110, in the network 112. Based on legitimacy of the new block, e.g., validity of its signature and transactions, the other nodes may determine to accept the new block, such that the node 102 and the other nodes may add the new block to their respective copies of the blockchain 120. As this process repeats, more and more blocks of data may be added to the blockchain 120.
  • FIG. 2 illustrates a schematic diagram of a computing device 200 for implementing a node, e.g., the node 102 (FIG. 1), in a blockchain system, according to an embodiment.
  • the computing device 200 may include a communication interface 202, a processor 204, and a memory 206.
  • the communication interface 202 may facilitate communications between the computing device 200 and devices implementing other nodes, e.g., nodes 104-110 (FIG. 1), in the network.
  • the communication interface 202 is configured to support one or more communication standards, such as an Internet standard or protocol, an Integrated Services Digital Network (ISDN) standard, etc.
  • the communication interface 202 may include one or more of a Local Area Network (LAN) card, a cable modem, a satellite modem, a data bus, a cable, a wireless communication channel, a radio-based communication channel, a cellular communication channel, an Internet Protocol (IP) based communication device, or other communication devices for wired and/or wireless communications.
  • the communication interface 202 may be based on public cloud infrastructure, private cloud infrastructure, or hybrid public/private cloud infrastructure.
  • the processor 204 may include one or more dedicated processing units, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or various other types of processors or processing units.
  • the processor 204 is coupled with the memory 206 and is configured to execute instructions stored in the memory 206.
  • the memory 206 may store processor-executable instructions and data, such as a copy of the blockchain 120 (FIG. 1).
  • the memory 206 may include any type of volatile or non-volatile memory devices, or a combination thereof, such as a static random-access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, or a magnetic or optical disk.
  • Users of a blockchain system may utilize the blockchain system 100 to record various types of information.
  • a user may utilize the blockchain system 100 to record data contents such as documents and files.
  • the user may also utilize the blockchain system 100 to record information that can be utilized to verify integrity of the data contents recorded.
  • FIGS. 3A-3C jointly illustrate a flow chart of a method 300 for verifying data integrity, according to an embodiment.
  • a Blockchain e.g., the blockchain 120 (FIG. 1)
  • the Blockchain may be implemented to support various types of users, or parties, including, e.g., individuals, businesses, banks, financial institutions, as well as other types of companies, organizations, and the like.
  • one user referred to as the User
  • the User may utilize an application, referred to as the User Application in FIGS. 3A-3C, running on a computing device to interact with the Blockchain.
  • the User Application may generate a random number sequence s. As will be described in detail below, this random number sequence s may be utilized to help verify data integrity.
  • the User Application may generate a commitment of the random number sequence s.
  • the User Application may calculate the commitment based on a commitment scheme.
  • the commitment scheme may include a cryptographic primitive that allows the User Application to commit to s while keeping s itself hidden from other users.
  • the commitment scheme used may be a Pedersen commitment scheme, e.g., as disclosed in Torben Pryds Pedersen, “Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing,” Advances in Cryptology— CRYPTO 91 Procs., Lecture Notes in Computer Science, vol. 576, pp. 129-140, 1991, which is herein incorporated by reference in its entirety.
  • the commitment of s may be denoted as comm(s, r), where r is a random number used to generate the commitment.
  • the User Application may submit the commitment comm(s, r) to the Blockchain for recordation.
  • the Blockchain may record the commitment comm(s, r). In this manner, if a malicious party or the User attempts to change the random number sequence s, that change can be detected.
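  • A Pedersen-style commitment comm(s, r) and its later opening can be sketched as follows. The group parameters are toy values chosen only so the example runs quickly and are far too small for real use; a secure instantiation would also need a second generator whose discrete logarithm relative to the first is unknown. The encoding of the sequence s as an exponent (`encode_sequence`) is a hypothetical choice, since the source does not specify one.

```python
import secrets

# Toy parameters (assumption, illustration only): P = 2*Q + 1 with P and Q prime;
# G and H are squares modulo P, so each generates the subgroup of order Q.
P, Q = 1019, 509
G, H = 4, 9


def encode_sequence(s: list[int]) -> int:
    """Hypothetical encoding of s (values 0..255) as an exponent modulo Q."""
    return int.from_bytes(bytes(s), "big") % Q


def commit(s: list[int], r: int) -> int:
    """comm(s, r) = G^m * H^r mod P, where m encodes the sequence s."""
    return (pow(G, encode_sequence(s), P) * pow(H, r % Q, P)) % P


def opens_to(commitment: int, s: list[int], r: int) -> bool:
    """Check a claimed opening (s, r) against a recorded commitment."""
    return commitment == commit(s, r)


s = [9, 3, 6, 3, 7]
r = secrets.randbelow(Q)        # random number used to generate the commitment
c = commit(s, r)                # value that would be recorded on the Blockchain
print(opens_to(c, s, r))                 # True
print(opens_to(c, [9, 3, 6, 3, 8], r))   # False: a changed sequence is detected
```

  • The commitment c reveals nothing useful about s until r is released, at which point anyone holding a candidate sequence can recompute the commitment and compare.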
  • the User may provide or specify a data content that the User wants to record on the Blockchain.
  • the data content may be denoted as content, which may be provided or specified in various manners.
  • the User may identify an existing data content (e.g., a file that already exists on a computing device) and request the User Application to obtain the identified data content.
  • the User may create the data content (e.g., using the User Application or other applications), and the User Application may receive the data content as it is being created or shortly thereafter.
  • the data content may include one or more documents.
  • the data content may include other file types, including, e.g., voice recordings, chatbot conversations, and other types of audio/video files.
  • the User Application may create an original data clip, denoted as clip, based on the data content content and the random number sequence s.
  • the User Application may create the data clip clip by extracting pieces of data from the data content content according to the random number sequence s. For example, as illustrated in FIG. 4, if the data content content includes a voice recording and the random number sequence is [9, 3, 6, 3, 7], the User Application may create the data clip clip by extracting a first piece of data from the content at the 9th position (e.g., 9 seconds into the voice recording), where 9 is the first number in s.
  • the User Application may then advance the extraction position by 3 (3 being the next number in s) and extract a second piece of data from the content at the 12th position.
  • the User Application may continue to advance the extraction position based on the next number in s and extract additional pieces of data from the content as shown in FIG. 4.
  • the User Application may concatenate the extracted pieces of data together to form the data clip clip.
  • the User Application may create the data clip clip to include randomly selected pieces of the data content content, effectively binding the data clip clip to the data content content. It is contemplated that binding the data clip clip to the data content content may be beneficial because a malicious party may not create or forge the data clip clip without knowing the data content content. Binding the data clip clip to the data content content may also prevent the malicious party from bypassing the step 312 by reusing another data clip created for another data content.
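  • The clip creation described above, using the example sequence [9, 3, 6, 3, 7], can be sketched as follows. Treating positions as zero-based byte offsets and using a fixed piece length are assumptions; for a voice recording the positions would be time offsets instead. `make_clip` is a hypothetical helper name.

```python
from itertools import accumulate


def make_clip(content: bytes, s: list[int], piece_len: int = 1) -> bytes:
    """Concatenate pieces of `content` taken at the running sum of `s`."""
    positions = list(accumulate(s))  # [9, 3, 6, 3, 7] -> [9, 12, 18, 21, 28]
    if positions and positions[-1] + piece_len > len(content):
        raise ValueError("sequence runs past the end of the content")
    return b"".join(content[p:p + piece_len] for p in positions)


content = bytes(range(40))   # stand-in for a recording; the byte at offset i has value i
clip = make_clip(content, [9, 3, 6, 3, 7])
print(list(clip))            # [9, 12, 18, 21, 28]
```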
  • the User Application may obtain a public key of a Verifier, e.g., a smart contract executing on the Blockchain or another user.
  • the User Application may obtain the public key from the Blockchain so that the User Application does not need to store any keys locally.
  • the Verifier may be another user who may verify the integrity of the data content content.
  • the Verifier may include one or more smart contracts executing on the Blockchain. Smart contracts are computer protocols implemented in the form of computer code that are incorporated into the Blockchain, to facilitate, verify, or enforce the negotiation or performance of agreed terms or conditions.
  • users of the Blockchain may program agreed terms into a smart contract using a programming language, such as C++, Java, Solidity, Python, etc., and when the terms are met, the smart contract may be automatically executed on the Blockchain, e.g., to perform a transaction.
  • the smart contract may include a plurality of subroutines or functions, each of which may be a sequence of program instructions that performs a specific task.
  • the smart contract may be operational code that is fully or partially executed without human interaction.
  • the User Application may encrypt the data clip clip using the public key of the Verifier.
  • the encrypted data clip may be denoted as En(clip, PK_Verifier).
  • the User Application may receive a signature sig from the User.
  • receiving the signature sig from the User may indicate that the User has signed off on the data content content.
  • the signature sig may include a digital signature generated using a cryptographic key.
  • the signature sig may include a voice signature, which may allow the User to sign the data content content by stating a personal identification number (PIN), as disclosed in U.S. Patent No. 7,606,768, “Voice Signature With Strong Binding,” which is herein incorporated by reference in its entirety. It is contemplated that using voice signatures may be desirable in certain situations, including, e.g., data collection via chatbot conversations. Certain users may also prefer voice signatures because the users do not need to generate and safeguard their private keys typically required for digital signatures.
  • the User Application may calculate a hash of the appended data D, denoted as H = hash(D).
  • the User Application may encrypt the hash H using the random number sequence s to obtain the encrypted hash EH.
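  • The hashing and encryption steps can be sketched as follows. The source does not name the hash function or the cipher keyed by s, so SHA-256 and an XOR with an HMAC-SHA256 pad are stand-ins chosen for illustration only, as is the key derivation in `key_from_sequence`. The appended data D is taken to be the data content, the encrypted data clip, and the signature joined together, and the clip and signature values are placeholders.

```python
import hashlib
import hmac


def key_from_sequence(s: list[int]) -> bytes:
    """Hypothetical key derivation: hash a textual encoding of the sequence s."""
    return hashlib.sha256(",".join(map(str, s)).encode()).digest()


def encrypt_hash(h: bytes, s: list[int]) -> bytes:
    """Stand-in for encrypting a hash under s: XOR with an HMAC-SHA256 pad keyed by s.
    XOR is its own inverse, so applying the function twice recovers the input."""
    pad = hmac.new(key_from_sequence(s), b"EH-pad", hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(h, pad))


content = b"example data content"
encrypted_clip = b"<encrypted data clip>"   # placeholder bytes
sig = b"<signature>"                        # placeholder bytes
s = [9, 3, 6, 3, 7]

D = content + encrypted_clip + sig          # appended data
H = hashlib.sha256(D).digest()              # H = hash(D)
EH = encrypt_hash(H, s)                     # encrypted hash
print(EH.hex())
```

  • Only a party that knows s, or can recover it from the clip, can reproduce EH from the recorded data.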
  • the User Application may submit the data content content, the encrypted data clip En(clip, PK_Verifier), the signature sig, and the encrypted hash EH to the Blockchain for recordation.
  • the User Application may submit the plaintext values of the data content content and the signature sig to the Blockchain.
  • the User Application may encrypt the data content content and/or the signature sig before submitting them to the Blockchain.
  • the Blockchain may record the data submitted by the User Application, including the data content content, the encrypted data clip En(clip, PK_Verifier), the signature sig, and the encrypted hash EH.
  • the Verifier e.g., a smart contract executing on the Blockchain or another user
  • the data obtained may include the data content content, the encrypted data clip En(clip, PK_Verifier), the signature sig, and the encrypted hash EH.
  • the Verifier may decrypt the encrypted data clip En(clip, PK_Verifier) using its private key to obtain the data clip clip.
  • the Verifier may parse the data clip clip against the data content content to recover the random number sequence s.
  • the recovered random number sequence may be denoted as s', which should equal s if the recovery is successful.
  • the Verifier may parse the data clip clip against the data content content and determine the positions where the various pieces of data were extracted. The Verifier may then recover the random number sequence s' based on the determined positions. However, if the data clip clip was forged, the Verifier may fail to recover the random number sequence.
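  • Recovering s' by parsing the clip against the content can be sketched as follows. The sketch assumes fixed-length pieces, zero-based byte offsets, strictly increasing positions, and that each piece is matched at its earliest possible location; when a piece occurs more than once in the content, the earliest match may not be the original position, which is one reason a check against the recorded commitment is useful. `recover_sequence` is a hypothetical helper name.

```python
from typing import Optional


def recover_sequence(content: bytes, clip: bytes, piece_len: int = 1) -> Optional[list[int]]:
    """Return the recovered sequence s', or None if a piece cannot be located."""
    recovered, prev, search_from = [], 0, 0
    for i in range(0, len(clip), piece_len):
        piece = clip[i:i + piece_len]
        pos = content.find(piece, search_from)   # earliest match at or after search_from
        if pos == -1:
            return None                          # clip does not parse against the content
        recovered.append(pos - prev)             # advance relative to the previous position
        prev, search_from = pos, pos + 1
    return recovered


content = bytes(range(40))   # stand-in content in which every byte value is unique
print(recover_sequence(content, bytes([9, 12, 18, 21, 28])))   # [9, 3, 6, 3, 7]
print(recover_sequence(content, bytes([9, 99])))               # None: forged clip
```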
  • the Verifier may submit the recovered random number sequence s' to the Blockchain for recordation at step 336.
  • the Verifier may append the recovered random number sequence s' to the data obtained from the Blockchain at step 330 and submit the appended data to the Blockchain for recordation. Submitting the recovered random number sequence s' to the Blockchain for recordation allows the recovered random number sequence s' to be published, and in some embodiments, the method 300 may prohibit reuse of any published random number sequences to improve security.
  • the Verifier may verify the correctness of the recovered random number sequence s' based on the commitment comm(s, r) recorded on the Blockchain (at step 308).
  • the User Application may release the random number r used to generate the commitment comm(s, r), allowing the Verifier to calculate a commitment of the recovered random number sequence s' using the random number r and determine whether it matches the commitment comm(s, r) recorded on the Blockchain (at step 308).
  • If the commitments do not match, the Verifier may refuse to proceed further because the mismatch suggests that the Verifier did not correctly recover the random number sequence s' (e.g., because the data clip clip was forged) or that the random number sequence has been modified (e.g., inadvertently or purposefully by the User or a malicious party) after the commitment comm(s, r) was recorded on the Blockchain (at step 308).
  • If the commitment of the recovered random number sequence s' matches the commitment comm(s, r) recorded on the Blockchain, the verification process may continue.
  • the Verifier may encrypt the hash H' using the recovered random number sequence s'.
  • the Verifier may perform one or more additional verifications at step 342. For example, if the data content content includes a voice recording and the signature sig includes a voice signature, the Verifier may utilize a voice analyzer, including, e.g., an artificial intelligence-based voice analyzer, to compare the voice contained in the data content content against the voice contained in signature sig to determine the authenticity of the signature sig (e.g., determine whether the content and the sig were uttered by the same person). In another example, if the signature sig is a digital signature, the Verifier may utilize appropriate verification algorithms to verify the authenticity of the digital signature. In some embodiments, the Verifier may accept the data content content and the signature sig if they pass verification steps 340 and 342. In this manner, the method 300 can provide a mechanism for the Blockchain to determine the integrity of the data content content received. In some embodiments, the Blockchain may refuse to record the data content content if its integrity is not verified.
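  • The hash comparison performed by the Verifier can be sketched as follows: recompute a hash over the recorded data, encrypt it under the recovered sequence s', and compare the result against the recorded encrypted hash EH. As before, SHA-256 and the XOR-with-HMAC pad are stand-ins for the unspecified hash and cipher, the clip and signature values are placeholders, and `integrity_ok` is a hypothetical helper name.

```python
import hashlib
import hmac


def encrypt_hash(h: bytes, s: list[int]) -> bytes:
    """Stand-in for encrypting a hash under s (XOR with an HMAC-SHA256 pad)."""
    key = hashlib.sha256(",".join(map(str, s)).encode()).digest()
    pad = hmac.new(key, b"EH-pad", hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(h, pad))


def integrity_ok(content: bytes, encrypted_clip: bytes, sig: bytes,
                 recorded_eh: bytes, s_recovered: list[int]) -> bool:
    """Recompute the encrypted hash from the recorded data and compare it to EH."""
    h_prime = hashlib.sha256(content + encrypted_clip + sig).digest()   # H'
    eh_prime = encrypt_hash(h_prime, s_recovered)                       # H' encrypted under s'
    return hmac.compare_digest(eh_prime, recorded_eh)                   # constant-time compare


# What the User Application would have recorded (placeholder values).
s = [9, 3, 6, 3, 7]
content, enc_clip, sig = b"example data content", b"<encrypted data clip>", b"<signature>"
recorded_eh = encrypt_hash(hashlib.sha256(content + enc_clip + sig).digest(), s)

print(integrity_ok(content, enc_clip, sig, recorded_eh, s))               # True
print(integrity_ok(b"tampered content", enc_clip, sig, recorded_eh, s))   # False
```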
  • FIG. 5 illustrates a flow chart of a method 500 for verifying data integrity, according to an embodiment.
  • the method 500 may be performed by one or more nodes in a blockchain system, e.g., the nodes 102-110 in the blockchain system 100 (FIG. 1).
  • the nodes 102-110 in the blockchain system 100 may perform operations on a blockchain, e.g., the blockchain 120 (FIG. 1).
  • the blockchain 120 may be implemented as the Blockchain in the examples described above.
  • the data content, the encrypted data clip, the signature, and the encrypted hash may be recorded on the blockchain by a user, e.g., the User (FIG. 3B, step 326), and the node 102 may obtain the recorded data to verify the integrity of the data content and the authenticity of the signature.
  • the node 102 may decrypt the encrypted data clip to generate a decrypted data clip.
  • the user who recorded the encrypted data clip on the blockchain may have encrypted the data clip using a public key of the node 102, allowing the node 102 to decrypt the encrypted data clip using its corresponding private key.
  • the node 102 may parse the decrypted data clip against the data content to recover a random number sequence that was used to create the data clip. As described above with reference to FIG. 4, in some embodiments, the node 102 may parse the decrypted data clip against the data content to determine the positions where the various pieces of data were extracted to create the data clip. The node 102 may then recover the random number sequence based on the determined positions. In some embodiments, the blockchain may record a commitment of the random number sequence used to create the data clip. In such embodiments, the node 102 may determine whether the recovered random number sequence is correct based on the commitment recorded on the blockchain, as described above.
  • the node 102 may calculate a hash based on the data content, the encrypted data clip, and the signature.
  • the node 102 may encrypt the hash using the recovered random number sequence, and in some embodiments, the node 102 may record the recovered random number sequence on the blockchain to prevent the recovered random number sequence from being used to encrypt the hash again. As described above, the hash encrypted using the recovered random number sequence should match the encrypted hash recorded on the blockchain. Accordingly, at step 512, the node 102 may determine the integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
  • the node 102 may accept the integrity of the data content. Otherwise, if the hash encrypted using the recovered random number sequence does not match the encrypted hash recorded on the blockchain, the node 102 may refuse to accept the integrity of the data content.
  • the node 102 may further determine the authenticity of the signature. For example, if the data content includes a voice recording and the signature includes a voice signature, the node 102 may compare the voice contained in the data content against the voice contained in the signature to determine the authenticity of the voice signature. In another example, if the signature is a digital signature, the node 102 may utilize an appropriate verification algorithm to verify the authenticity of the digital signature. In some embodiments, if the signature is authentic, the node 102 may accept the integrity of the data content. Otherwise, the node 102 may refuse to accept the integrity of the data content.
  • FIG. 6 is a block diagram of an apparatus 600 for verifying data integrity, according to an embodiment.
  • the apparatus 600 may be an implementation of a software process and may correspond to the method 500 (FIG. 5).
  • the apparatus 600 may include a processing module 602, an encryption/decryption module 604, and a determination module 606.
  • the processing module 602 may obtain a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain and provide the obtained data to the encryption/decryption module 604.
  • the encryption/decryption module 604 may decrypt the encrypted data clip to generate a decrypted data clip.
  • the processing module 602 may then parse the decrypted data clip against the data content to recover a random number sequence that was used to create the data clip.
  • the processing module 602 may also calculate a hash based on the data content, the encrypted data clip, and the signature, and provide the hash to the encryption/decryption module 604.
  • the encryption/decryption module 604 may encrypt the hash using the recovered random number sequence and provide the hash encrypted using the recovered random number sequence to the determination module 606.
  • the determination module 606 may determine the integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain, as described above.
  • Each of the above-described modules may be implemented as software, or hardware, or a combination of software and hardware.
  • each of the above-described modules may be implemented using a processor executing instructions stored in a memory.
  • each of the above-described modules may be implemented with one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components, for performing the described methods.
  • each of the above-described modules may be implemented by using a computer chip or an entity, or implemented by using a product
  • the apparatus 600 may be a computer, and the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email receiving and sending device, a game console, a tablet computer, a wearable device, or any combination of these devices.
  • a computer program product may include a non-transitory computer-readable storage medium having computer-readable program instructions thereon for causing a processor to carry out the above-described methods.
  • the computer-readable storage medium may be a tangible device that can store instructions for use by an instruction execution device.
  • the computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer-readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • the computer-readable program instructions for carrying out the above-described methods may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language, and conventional procedural programming languages.
  • the computer-readable program instructions may execute entirely on a computing device as a stand-alone software package, or partly on a first computing device and partly on a second computing device remote from the first computing device. In the latter scenario, the second, remote computing device may be connected to the first computing device through any type of network, including a local area network (LAN) or a wide area network (WAN).
  • the computer-readable program instructions may be provided to a processor of a general-purpose or special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the above-described methods.
  • a block in the flow charts or diagrams may represent a software program, segment, or portion of code, which comprises one or more executable instructions for implementing specific functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the diagrams and/or flow charts, and combinations of blocks in the diagrams and flow charts may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
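The parsing step attributed to the node 102 above (recovering the random number sequence from the decrypted data clip) might be sketched as follows. This is only an illustration under stated assumptions: the function name `recover_sequence`, the fixed `piece_len`, zero-based byte offsets, strictly positive advances, and the rule "take the first match after the previous position" are all hypothetical choices, not details given in the specification.

```python
def recover_sequence(content: bytes, clip: bytes, piece_len: int = 1) -> list[int]:
    """Recover the advances between extraction positions (hypothetical helper).

    Assumes every piece of the clip has a fixed length and that each piece
    can be located unambiguously after the previously matched position.
    """
    if len(clip) % piece_len != 0:
        raise ValueError("clip length is not a multiple of the piece length")

    sequence = []
    previous_pos = 0   # position of the previously matched piece
    search_from = 0    # pieces are assumed to appear in increasing order
    for i in range(0, len(clip), piece_len):
        piece = clip[i:i + piece_len]
        pos = content.find(piece, search_from)
        if pos < 0:
            # A piece that cannot be found suggests the clip was not
            # derived from this content (e.g., it was forged).
            raise ValueError("clip piece not found in content")
        sequence.append(pos - previous_pos)
        previous_pos = pos
        search_from = pos + 1
    return sequence


if __name__ == "__main__":
    content = bytes(range(64))               # stand-in for the data content
    clip = bytes([9, 12, 18, 21, 28])        # pieces taken at those offsets
    print(recover_sequence(content, clip))
```

With real content, short pieces can occur at more than one position, so a verifier would likely need longer pieces or the recorded commitment to confirm that the recovered sequence is the intended one.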

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioethics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Storage Device Security (AREA)

Abstract

Disclosed herein are methods, devices, and apparatuses, including computer programs stored on computer-readable media, for verifying data integrity. One of the methods includes: obtaining a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypting the encrypted data clip to generate a decrypted data clip; parsing the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculating a hash based on the data content, the encrypted data clip, and the signature; encrypting the hash using the recovered random number sequence; and determining an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.

Description

METHODS AND DEVICES FOR VERIFYING DATA INTEGRITY
TECHNICAL FIELD
[01] The specification relates generally to computer technologies, and more particularly, to methods and devices for verifying data integrity.
BACKGROUND
[02] Blockchain systems, also known as distributed ledger systems (DLSs) or consensus systems, may enable participating parties to store data securely and immutably. Blockchain systems may include any DLSs, without referencing any particular use case, and may be used for public, private, and consortium blockchain networks. A public blockchain network is open for all entities to use the system and participate in the consensus process. A private blockchain network is provided for a particular entity, which centrally controls read and write permissions. A consortium blockchain network is provided for a select group of entities, which control the consensus process, and includes an access control layer.
[03] A blockchain system is implemented using a peer-to-peer (P2P) network, in which the nodes communicate directly with each other, e.g., without the need of a fixed, central server. Each node in the P2P network may initiate communication with another node in the P2P network. A blockchain system maintains one or more blockchains. A blockchain is a data structure for storing data, such as transactions, that may prevent tampering and manipulation of the data by malicious parties.
[04] While blockchain systems can record data securely and immutably, they may lack the ability to check integrity or trustworthiness of the data in the first place. For example, if a user sends data to a blockchain system for recordation, but a malicious party intercepts and manipulates the data before the data is received by the blockchain system, absent a mechanism to check the integrity of the data received, the blockchain system may proceed to record the manipulated data, compromising the integrity of the data recorded on the blockchain system.
SUMMARY
[05] In one aspect, a computer-implemented method for verifying data integrity includes: obtaining a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypting the encrypted data clip to generate a decrypted data clip; parsing the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculating a hash based on the data content, the encrypted data clip, and the signature; encrypting the hash using the recovered random number sequence; and determining an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
[06] In another aspect, a device for verifying data integrity includes: one or more processors; and one or more computer-readable memories coupled to the one or more processors and having instructions stored thereon that are executable by the one or more processors to: obtain a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypt the encrypted data clip to generate a decrypted data clip; parse the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculate a hash based on the data content, the encrypted data clip, and the signature; encrypt the hash using the recovered random number sequence; and determine an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
[07] In still another aspect, a non-transitory computer-readable medium has stored therein instructions that, when executed by a processor of a device, cause the device to perform a method for verifying data integrity. The method includes: obtaining a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypting the encrypted data clip to generate a decrypted data clip; parsing the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculating a hash based on the data content, the encrypted data clip, and the signature; encrypting the hash using the recovered random number sequence; and determining an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
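The last three operations recited in the aspects above (calculating the hash, encrypting it with the recovered random number sequence, and comparing it with the recorded encrypted hash) can be illustrated with a minimal Python sketch. The specification does not say which hash function or cipher is used, so everything concrete here is an assumption: SHA-256 as the hash, length-prefixed concatenation of the inputs, and a deterministic XOR with a keystream derived from the sequence as a stand-in for "encrypting the hash." The function names are hypothetical, and the XOR construction is for illustration only, not a recommended cipher.

```python
import hashlib
import hmac


def compute_hash(content: bytes, encrypted_clip: bytes, signature: bytes) -> bytes:
    """Hash the three inputs together (assumed: SHA-256, length-prefixed)."""
    h = hashlib.sha256()
    for part in (content, encrypted_clip, signature):
        h.update(len(part).to_bytes(8, "big"))  # prefix avoids ambiguous joins
        h.update(part)
    return h.digest()


def encrypt_hash(digest: bytes, sequence: list[int]) -> bytes:
    """Stand-in for encrypting the hash with the random number sequence.

    Assumption: a 32-byte keystream is derived from the sequence and XORed
    with the digest, so the same sequence always yields the same result.
    """
    key_material = ",".join(str(n) for n in sequence).encode()
    keystream = hashlib.sha256(b"hash-key|" + key_material).digest()
    return bytes(a ^ b for a, b in zip(digest, keystream))


def verify_integrity(content: bytes, encrypted_clip: bytes, signature: bytes,
                     recorded_encrypted_hash: bytes,
                     recovered_sequence: list[int]) -> bool:
    """Recompute the encrypted hash and compare it with the recorded one."""
    candidate = encrypt_hash(
        compute_hash(content, encrypted_clip, signature), recovered_sequence)
    return hmac.compare_digest(candidate, recorded_encrypted_hash)


if __name__ == "__main__":
    s = [9, 3, 6, 3, 7]
    content, enc_clip, sig = b"example content", b"opaque ciphertext", b"sig"
    recorded = encrypt_hash(compute_hash(content, enc_clip, sig), s)
    print(verify_integrity(content, enc_clip, sig, recorded, s))
    print(verify_integrity(b"tampered content", enc_clip, sig, recorded, s))
```

The comparison only succeeds when both the recomputed hash and the recovered sequence agree with what the signer used, which is why a modified data content or a wrongly recovered sequence would be rejected.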
BRIEF DESCRIPTION OF THE DRAWINGS
[08] The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments. In the following description, which refers to the drawings, the same numbers in different drawings represent the same or similar elements unless otherwise represented.
[09] FIG. 1 is a schematic diagram of a blockchain system, according to an embodiment.
[10] FIG. 2 is a schematic diagram of a computing device for implementing a node in a blockchain system, according to an embodiment.
[11] FIGS. 3A-3C are a flow chart of a method for verifying data integrity, according to an embodiment.
[12] FIG. 4 is an illustration depicting a data clip created based on a data content and a random number sequence, according to an embodiment.
[13] FIG. 5 is a flow chart of a method for verifying data integrity, according to an embodiment.
[14] FIG. 6 is a block diagram of an apparatus for verifying data integrity, according to an embodiment.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[15] Embodiments of the specification provide methods and devices for verifying data integrity. The methods and devices may allow users to sign off on data contents using signatures in manners so that the signatures can be used to verify the integrity of the data contents. The methods and devices may also verify the authenticity of the signature. Furthermore, the methods and devices may utilize blockchain systems to record information to facilitate the verifications of the integrity of the data content and the authenticity of the signature, and in some embodiments, the methods and devices may utilize one or more smart contracts executing on blockchain systems to perform the verifications.
[16] Embodiments disclosed in the specification have one or more technical effects. In some embodiments, the methods and devices may process user signatures in manners so that the signatures can be used to verify data integrity. This provides the methods and devices the abilities to determine whether a data content has been modified after a user signs off on the data content, thereby improving data integrity. In some embodiments, the methods and devices may process a signature so that its authenticity can be verified. This allows the methods and devices to determine whether the signature is provided by the purported signer, further improving data integrity. In some embodiments, the methods and devices may process the signature in a manner so that the signature may not be forged. This allows the methods and devices to improve data security because even if a malicious party obtains the signer’s secret information (e.g., a personal identification number) used to generate the signature, the malicious party still may not forge the signer’s signature. Furthermore, in some embodiments, the methods and devices may implement a voice signature. This makes the methods and devices more user friendly because the users are not required to keep track of cryptographic keys needed to generate their signatures. Also, in some embodiments, the methods and devices may utilize blockchain systems to record information to facilitate the verifications of the integrity of the data contents and the authenticity of the signatures. This allows the methods and devices to record the information in a data structure that can prevent tampering and manipulation by malicious parties.
[17] A blockchain is a data structure that stores data, e.g., transactions, in a way that may prevent tampering and manipulation of the data by malicious parties. The transactions stored in this manner may be immutable and subsequently verified. A blockchain includes one or more blocks. Each block is linked to a previous block immediately before it in the blockchain by including a cryptographic hash of the previous block. Each block also may include a timestamp, its own cryptographic hash, and one or more transactions. The transactions, which generally have already been verified by the nodes of the blockchain system, may be hashed and encoded into a data structure, such as a Merkle tree. In a Merkle tree, data at leaf nodes of the tree is hashed, and all hashes in each branch of the tree may be concatenated at a root of the branch. This process continues up the tree to the root of the entire tree, which stores a hash that is representative of all data in the tree. A hash purporting to be of a transaction stored in the tree can be quickly verified by determining whether it is consistent with the structure of the tree.
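As an illustration of the Merkle tree described above, the following Python sketch computes a root hash from a list of transactions. It assumes SHA-256 and a pairwise (binary) tree in which the last hash is duplicated when a level has an odd number of entries; the specification does not fix these details, and the function name `merkle_root` is hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Return the root hash of a binary Merkle tree over the transactions."""
    if not transactions:
        raise ValueError("at least one transaction is required")

    level = [sha256(tx) for tx in transactions]   # hashes at the leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])               # assumed odd-level padding rule
        # Concatenate each pair of child hashes and hash the result.
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    txs = [b"tx-1", b"tx-2", b"tx-3"]
    print(merkle_root(txs).hex())
```

Because every parent hash depends on its children, changing any single transaction changes the root, which is what makes the root representative of all data in the tree.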
[18] A blockchain system includes a network of computing nodes that manage, update, and maintain one or more blockchains. The network may be a public blockchain network, a private blockchain network, or a consortium blockchain network. For example, numerous entities, such as hundreds, thousands, or even millions of entities, can operate in a public blockchain network, and each of the entities operates at least one node in the public blockchain network. Accordingly, the public blockchain network can be considered a public network with respect to the participating entities. Sometimes, a majority of entities (nodes) must sign every block for the block to be valid and added to the blockchain of the blockchain network. Examples of public blockchain networks include particular peer-to-peer payment networks that leverage a distributed ledger, referred to as blockchain.
[19] In general, a public blockchain network may support public transactions. A public transaction is shared with all of the nodes in the public blockchain network, and is stored in a global blockchain. A global blockchain is a blockchain replicated across all nodes, and all nodes are in perfect state consensus with respect to the global blockchain. To achieve consensus (e.g., agreement to the addition of a block to a blockchain), a consensus protocol is implemented in the public blockchain network. Examples of consensus protocols include proof-of-work (POW) (e.g., implemented in some crypto-currency networks), proof-of-stake (POS), and proof-of-authority (POA).
[20] In general, a private blockchain network may be provided for a particular entity, which centrally controls read and write permissions. The entity controls which nodes are able to participate in the blockchain network. Consequently, private blockchain networks are generally referred to as permissioned networks that place restrictions on who is allowed to participate in the network, and on their level of participation (e.g., only in certain transactions). Various types of access control mechanisms can be used (e.g., existing participants vote on adding new entities, a regulatory authority can control admission).
[21] In general, a consortium blockchain network may be private among the participating entities. In a consortium blockchain network, the consensus process is controlled by an authorized set of nodes, one or more nodes being operated by a respective entity (e.g., a financial institution, insurance company). For example, a consortium of ten (10) entities (e.g., financial institutions, insurance companies) can operate a consortium blockchain network, each of which operates at least one node in the consortium blockchain network. Accordingly, the consortium blockchain network can be considered a private network with respect to the participating entities. In some examples, each entity (node) must sign every block in order for the block to be valid, and added to the blockchain. In some examples, at least a sub-set of entities (nodes) (e.g., at least 7 entities) must sign every block in order for the block to be valid, and added to the blockchain.
[22] FIG. 1 illustrates a schematic diagram of a blockchain system 100, according to an embodiment. Referring to FIG. 1, the blockchain system 100 may include a plurality of nodes, e.g., nodes 102-110, configured to operate on a blockchain 120. The nodes 102-110 may form a network 112, such as a peer-to-peer (P2P) network. Each of the nodes 102-110 may be a computing device, such as a computer or a computer system, configured to store a copy of the blockchain 120, or may be software running on the computing device, such as a process or an application. Each of the nodes 102-110 may have a unique identifier.
[23] The blockchain 120 may include a growing list of records in the form of data blocks, such as blocks B1-B5 in FIG. 1. Each of the blocks B1-B5 may include a timestamp, a cryptographic hash of a previous block, and data of the present block, which may be transactions such as monetary transactions. For example, as illustrated in FIG. 1, block B5 may include a timestamp, a cryptographic hash of block B4, and transaction data of block B5. Also, for example, a hashing operation may be performed on the previous block to generate the cryptographic hash of the previous block. The hashing operation may convert inputs of various lengths into cryptographic outputs of a fixed length through a hash algorithm, such as SHA-256.
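The linkage between blocks described above can be sketched in a few lines of Python. The block layout (a dictionary with `timestamp`, `prev_hash`, and `data` fields), the JSON serialization, and the helper names are assumptions chosen for illustration; only the use of a SHA-256 hash of the previous block comes from the description.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (assumed format)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_block, data, timestamp: int) -> dict:
    """Create a block that stores the hash of the block immediately before it."""
    prev_hash = block_hash(prev_block) if prev_block is not None else "0" * 64
    return {"timestamp": timestamp, "prev_hash": prev_hash, "data": data}


def chain_is_valid(chain: list[dict]) -> bool:
    """Check that every block records the hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


if __name__ == "__main__":
    b1 = make_block(None, ["genesis"], timestamp=1)
    b2 = make_block(b1, ["tx-a", "tx-b"], timestamp=2)
    b3 = make_block(b2, ["tx-c"], timestamp=3)
    chain = [b1, b2, b3]
    print(chain_is_valid(chain))
    b1["data"] = ["tampered"]      # modify an earlier block
    print(chain_is_valid(chain))
```

Modifying an earlier block changes its hash, so the `prev_hash` stored in the following block no longer matches and the tampering becomes detectable.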
[24] The nodes 102-110 may be configured to perform an operation on the blockchain 120. For example, when a node, e.g., the node 102, wants to store new data onto the blockchain 120, that node may generate a new block to be added to the blockchain 120 and broadcast the new block to other nodes, e.g., the nodes 104-110, in the network 112. Based on legitimacy of the new block, e.g., validity of its signature and transactions, the other nodes may determine to accept the new block, such that the node 102 and the other nodes may add the new block to their respective copies of the blockchain 120. As this process repeats, more and more blocks of data may be added to the blockchain 120.
[25] FIG. 2 illustrates a schematic diagram of a computing device 200 for implementing a node, e.g., the node 102 (FIG. 1), in a blockchain system, according to an embodiment. Referring to FIG. 2, the computing device 200 may include a communication interface 202, a processor 204, and a memory 206.
[26] The communication interface 202 may facilitate communications between the computing device 200 and devices implementing other nodes, e.g., nodes 104-110 (FIG. 1), in the network. In some embodiments, the communication interface 202 is configured to support one or more communication standards, such as an Internet standard or protocol, an Integrated Services Digital Network (ISDN) standard, etc. In some embodiments, the communication interface 202 may include one or more of a Local Area Network (LAN) card, a cable modem, a satellite modem, a data bus, a cable, a wireless communication channel, a radio-based communication channel, a cellular communication channel, an Internet Protocol (IP) based communication device, or other communication devices for wired and/or wireless communications. In some embodiments, the communication interface 202 may be based on public cloud infrastructure, private cloud infrastructure, or hybrid public/private cloud infrastructure.
[27] The processor 204 may include one or more dedicated processing units, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or various other types of processors or processing units. The processor 204 is coupled with the memory 206 and is configured to execute instructions stored in the memory 206.
[28] The memory 206 may store processor-executable instructions and data, such as a copy of the blockchain 120 (FIG. 1). The memory 206 may include any type of volatile or non-volatile memory devices, or a combination thereof, such as a static random-access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, or a magnetic or optical disk. When the instructions in the memory 206 are executed by the processor 204, the computing device 200 may perform an operation on the blockchain 120.
[29] Users of a blockchain system, e.g., the blockchain system 100, may utilize the blockchain system 100 to record various types of information. For example, in some embodiments, a user may utilize the blockchain system 100 to record data contents such as documents and files. The user may also utilize the blockchain system 100 to record information that can be utilized to verify integrity of the data contents recorded.
[30] FIGS. 3A-3C jointly illustrate a flow chart of a method 300 for verifying data integrity, according to an embodiment. For illustrative purposes, a Blockchain, e.g., the blockchain 120 (FIG. 1), is depicted in FIGS. 3A-3C. The Blockchain may be implemented to support various types of users, or parties, including, e.g., individuals, businesses, banks, financial institutions, as well as other types of companies, organizations, and the like. For illustrative purposes, one user, referred to as the User, is depicted in FIGS. 3A-3C. In some embodiments, the User may utilize an application, referred to as the User Application in FIGS. 3A-3C, running on a computing device to interact with the Blockchain.
[31] At step 302, the User Application may generate a random number sequence s. As will be described in detail below, this random number sequence s may be utilized to help verify data integrity. At step 304, the User Application may generate a commitment of the random number sequence s. In some embodiments, the User Application may calculate the commitment based on a commitment scheme. In some embodiments, the commitment scheme may include a cryptographic primitive that allows the User Application to commit to s while keeping s itself hidden from other users. In some embodiments, the commitment scheme used may be a Pedersen commitment scheme, e.g., as disclosed in Torben Pryds Pedersen, “Non-Interactive and Information-Theoretic Secure Verifiable Secret Sharing,” Advances in Cryptology— CRYPTO 91 Procs., Lecture Notes in Computer Science, vol. 576, pp. 129-140, 1991, which is herein incorporated by reference in its entirety. For illustrative purposes, the commitment of s may be denoted as comm(s, r), where r is a random number used to generate the commitment.
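To make the commitment comm(s, r) more concrete, the following toy Python sketch follows the general shape of a Pedersen commitment, g^m · h^r mod p, where m is an integer encoding of s. All concrete values are assumptions: the tiny group parameters (p = 2039, q = 1019, g = 4, h = 9) are far too small for real use, the mapping from the sequence s to an integer via SHA-256 is a hypothetical choice, and in a real deployment h must be generated so that nobody knows its discrete logarithm with respect to g.

```python
import hashlib
import secrets

# Toy parameters (assumption): P = 2*Q + 1 with P and Q prime.
P, Q = 2039, 1019
# 4 and 9 are squares mod P, so each generates the subgroup of order Q.
G, H = 4, 9


def commit_int(m: int, r: int) -> int:
    """Pedersen-style commitment to an integer m with blinding value r."""
    return (pow(G, m % Q, P) * pow(H, r % Q, P)) % P


def encode_sequence(s: list[int]) -> int:
    """Map the random number sequence to an integer mod Q (assumed encoding)."""
    data = ",".join(str(n) for n in s).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def commit(s: list[int], r: int) -> int:
    return commit_int(encode_sequence(s), r)


def open_commitment(commitment: int, s: list[int], r: int) -> bool:
    """Check a revealed (s, r) pair against a recorded commitment."""
    return commitment == commit(s, r)


if __name__ == "__main__":
    s = [9, 3, 6, 3, 7]
    r = secrets.randbelow(Q)          # random number kept by the committer
    recorded = commit(s, r)           # value that would be recorded on-chain
    print(open_commitment(recorded, s, r))
```

The recorded value alone does not reveal s, and once r is released, anyone holding a candidate sequence can recompute the commitment and compare it with the recorded one.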
[32] At step 306, the User Application may submit the commitment comm(s, r) to the Blockchain for recordation. At step 308, the Blockchain may record the commitment comm(s, r). In this manner, if a malicious party or the User attempts to change the random number sequence s, that change can be detected.
[33] At step 310, the User may provide or specify a data content that the User wants to record on the Blockchain. For illustrative purposes, the data content may be denoted as content, which may be provided or specified in various manners. For example, the User may identify an existing data content (e.g., a file that already exists on a computing device) and request the User Application to obtain the identified data content. In another example, the User may create the data content (e.g., using the User Application or other applications), and the User Application may receive the data content as it is being created or shortly thereafter. In some embodiments, the data content may include one or more documents. Alternatively, or additionally, the data content may include other file types, including, e.g., voice recordings, chatbot conversations, and other types of audio/video files.
[34] At step 312, the User Application may create an original data clip, denoted as clip, based on the data content content and the random number sequence s. In some embodiments, the User Application may create the data clip clip by extracting pieces of data from the data content content according to the random number sequence s. For example, as illustrated in FIG. 4, if the data content content includes a voice recording and the random number sequence s is [9, 3, 6, 3, 7], the User Application may create the data clip clip by extracting a first piece of data from the content at the 9th position (e.g., 9 seconds into the voice recording), where 9 is the first number in s. The User Application may then advance the extraction position by 3 (3 being the next number in s) and extract a second piece of data from the content at the 12th position. The User Application may continue to advance the extraction position based on the next number in s and extract additional pieces of data from the content as shown in FIG. 4. The User Application may concatenate the extracted pieces of data together to form the data clip clip.
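The extraction in this example can be sketched in Python as follows. The sketch treats the data content as a byte string, uses zero-based byte offsets as "positions," and extracts pieces of a fixed length `piece_len`; these choices and the function names are assumptions for illustration, since the description leaves the unit of position and the piece size open.

```python
from itertools import accumulate


def extraction_positions(s: list[int]) -> list[int]:
    """The first number is the starting position; each later number
    advances the extraction position by that amount."""
    return list(accumulate(s))


def create_clip(content: bytes, s: list[int], piece_len: int = 1) -> bytes:
    """Concatenate the pieces extracted at the positions derived from s."""
    pieces = []
    for pos in extraction_positions(s):
        if pos + piece_len > len(content):
            raise ValueError("sequence runs past the end of the content")
        pieces.append(content[pos:pos + piece_len])
    return b"".join(pieces)


if __name__ == "__main__":
    s = [9, 3, 6, 3, 7]
    content = bytes(range(64))        # stand-in for the data content
    print(extraction_positions(s))    # [9, 12, 18, 21, 28]
    print(list(create_clip(content, s)))
```

For the sequence [9, 3, 6, 3, 7], the positions are the running sums 9, 12, 18, 21, and 28, matching the first two positions given in the example above.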
[35] It is to be understood that the extraction process described above is merely presented as an example and is not meant to be limiting. It is also to be understood that similar techniques may be applied to other types of data contents, including documents and the like. In this manner, the User Application may create the data clip clip to include randomly selected pieces of the data content content, effectively binding the data clip clip to the data content content. It is contemplated that binding the data clip clip to the data content content may be beneficial because a malicious party may not be able to create or forge the data clip clip without knowing the data content content. Binding the data clip clip to the data content content may also prevent the malicious party from bypassing step 312 by reusing another data clip created for another data content.
[36] Referring now to FIG. 3B, at step 314, the User Application may obtain a public key of a Verifier, e.g., a smart contract executing on the Blockchain or another user. In some embodiments, the User Application may obtain the public key from the Blockchain so that the User Application does not need to store any keys locally. In some embodiments, the Verifier may be another user who may verify the integrity of the data content content. Alternatively, in some embodiments, the Verifier may include one or more smart contracts executing on the Blockchain. Smart contracts are computer protocols implemented in the form of computer code that are incorporated into the Blockchain, to facilitate, verify, or enforce the negotiation or performance of agreed terms or conditions. For example, users of the Blockchain may program agreed terms into a smart contract using a programming language, such as C++, Java, Solidity, Python, etc., and when the terms are met, the smart contract may be automatically executed on the Blockchain, e.g., to perform a transaction. Also for example, the smart contract may include a plurality of subroutines or functions, each of which may be a sequence of program instructions that performs a specific task. The smart contract may be operational code that is fully or partially executed without human interaction. At step 316, the User Application may encrypt the data clip clip using the public key of the Verifier. For illustrative purposes, the encrypted data clip may be denoted as En(clip, PKVerifier).
[37] At step 318, the User Application may receive a signature sig from the User. In some embodiments, receiving the signature sig from the User may indicate that the User has signed off on the data content content. In some embodiments, the signature sig may include a digital signature generated using a cryptographic key. In some embodiments, the signature sig may include a voice signature, which may allow the User to sign the data content content by stating a personal identification number (PIN), as disclosed in U.S. Patent No. 7,606,768, “Voice Signature With Strong Binding,” which is herein incorporated by reference in its entirety. It is contemplated that using voice signatures may be desirable in certain situations, including, e.g., data collection via chatbot conversations. Certain users may also prefer voice signatures because the users do not need to generate and safeguard their private keys typically required for digital signatures.
[38] At step 320, the User Application may append the encrypted data clip En(clip, PKVerifier) and the signature sig to the data content content to create an appended data D = content + En(clip, PKVerifier) + sig. At step 322, the User Application may calculate a hash of the appended data H = hash(D). At step 324, the User Application may encrypt the hash using the random number sequence s. For illustrative purposes, the encrypted hash may be denoted as EH = En(H, s).
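For illustrative purposes, steps 320 to 324 may be sketched as follows. The specification does not name a hash function or say how the random number sequence s serves as an encryption key, so the choices here are assumptions: SHA-256 stands in for hash, and En(H, s) is modeled as an XOR of H with a pad derived from s. The function names encrypt_hash and build_encrypted_hash are hypothetical. The XOR construction is shown only to make the data flow concrete and is not offered as a secure cipher.

```python
import hashlib


def encrypt_hash(h: bytes, s: list[int]) -> bytes:
    """Hypothetical En(H, s): XOR H with SHA-256 of the encoded sequence s."""
    pad = hashlib.sha256(",".join(map(str, s)).encode()).digest()
    return bytes(a ^ b for a, b in zip(h, pad))


def build_encrypted_hash(content: bytes, enc_clip: bytes, sig: bytes,
                         s: list[int]) -> bytes:
    d = content + enc_clip + sig       # D = content + En(clip, PKVerifier) + sig
    h = hashlib.sha256(d).digest()     # H = hash(D)
    return encrypt_hash(h, s)          # EH = En(H, s)


eh = build_encrypted_hash(b"recording", b"<encrypted clip>", b"<signature>",
                          [9, 3, 6, 3, 7])
print(len(eh))  # 32
```

Plain concatenation follows the notation D = content + En(clip, PKVerifier) + sig used above; an implementation might additionally length-prefix each field so that field boundaries are unambiguous.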
[39] At step 326, the User Application may submit the data content content, the encrypted data clip En(clip, PKVerifier), the signature sig, and the encrypted hash EH to the Blockchain for recordation. In some embodiments, the User Application may submit the plaintext values of the data content content and the signature sig to the Blockchain. In some embodiments, the User Application may encrypt the data content content and/or the signature sig before submitting them to the Blockchain. At step 328, the Blockchain may record the data submitted by the User Application, including the data content content, the encrypted data clip En(clip, PKVerifier), the signature sig, and the encrypted hash EH.
[40] Referring now to FIG. 3C, at step 330, the Verifier (e.g., a smart contract executing on the Blockchain or another user) may obtain the data submitted by the User Application from the Blockchain. In some embodiments, the data obtained may include the data content content, the encrypted data clip En(clip, PKVerifier), the signature sig, and the encrypted hash EH.
[41] At step 332, the Verifier may decrypt the encrypted data clip En(clip, PKVerifier) using its private key to obtain the data clip clip. At step 334, the Verifier may parse the data clip clip against the data content content to recover the random number sequence s. For illustrative purposes, the recovered random number sequence may be denoted as s', which should equal s if the recovery is successful. For example, as depicted in FIG. 4, the Verifier may parse the data clip clip against the data content content and determine the positions where the various pieces of data were extracted. The Verifier may then recover the random number sequence s' based on the determined positions. However, if the data clip clip was forged, the Verifier may fail to recover the random number sequence.
[42] In some embodiments, the Verifier may submit the recovered random number sequence s' to the Blockchain for recordation at step 336. In some embodiments, the Verifier may append the recovered random number sequence s' to the data obtained from the Blockchain at step 330 and submit the appended data to the Blockchain for recordation. Submitting the recovered random number sequence s' to the Blockchain for recordation allows the recovered random number sequence s' to be published, and in some embodiments, the method 300 may prohibit reuse of any published random number sequences to improve security.
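For illustrative purposes, the parsing at step 334 may be sketched as follows. This is a minimal sketch under stated assumptions: the data content and the decrypted data clip are byte strings, the pieces have a fixed length piece_len, every number in s is at least 1, and each piece can be located unambiguously by a forward search in the content. The name recover_sequence is hypothetical. Content with repeated passages would need a more careful matching strategy than the greedy search shown here.

```python
from typing import Optional


def recover_sequence(content: bytes, clip: bytes,
                     piece_len: int = 1) -> Optional[list[int]]:
    """Return s' whose running sums locate each piece of `clip`, else None."""
    s_prime: list[int] = []
    prev = 0          # position of the previous piece (0 before the first)
    search_from = 0   # pieces must appear in increasing position order
    for i in range(0, len(clip), piece_len):
        piece = clip[i:i + piece_len]
        pos = content.find(piece, search_from)
        if pos < 0:
            return None  # piece not found: the clip does not match the content
        s_prime.append(pos - prev)
        prev, search_from = pos, pos + 1
    return s_prime


content = bytes(range(40))
print(recover_sequence(content, bytes([9, 12, 18, 21, 28])))  # [9, 3, 6, 3, 7]
print(recover_sequence(content, bytes([9, 200])))             # None
```

The second call models a forged data clip: the byte 200 does not occur in the content, so no sequence can be recovered.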
[43] Also, in some embodiments, the Verifier may verify the correctness of the recovered random number sequence s' based on the commitment comm(s, r) recorded on the Blockchain (at step 308). For example, in some embodiments, the User Application may release the random number r used to generate the commitment comm(s, r) , allowing the Verifier to calculate a commitment of the recovered random number sequence s' using the random number r and determine whether it matches the commitment comm(s, r) recorded on the Blockchain (at step 308). If the commitment values do not match, the Verifier may refuse to proceed further because the mismatch suggests that the Verifier did not correctly recover the random number sequence s' (e.g., because the data clip clip was forged) or that the random number sequence has been modified (e.g., inadvertently or purposefully by the User or a malicious party) after the commitment comm(s, r) was recorded on the Blockchain (at step 308). On the other hand, if the commitment of the recovered random number sequence s' matches the commitment comm(s, r) recorded on the Blockchain, the verification process may continue.
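For illustrative purposes, the commitment check described above may be sketched as follows. The specification does not fix a commitment scheme in this passage, so a simple hash commitment is assumed: comm(s, r) is modeled as SHA-256 over a fixed-length random value r followed by an encoding of s. The names commit and check_recovered are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(s: list[int], r: bytes) -> bytes:
    """Hypothetical comm(s, r): SHA-256 of r followed by the encoded sequence."""
    encoded = ",".join(map(str, s)).encode()
    return hashlib.sha256(r + encoded).digest()


def check_recovered(s_prime: list[int], r: bytes, recorded: bytes) -> bool:
    """Recompute the commitment for s' and compare it with the recorded value."""
    return hmac.compare_digest(commit(s_prime, r), recorded)


r = secrets.token_bytes(32)            # released later by the User Application
recorded = commit([9, 3, 6, 3, 7], r)  # recorded on the Blockchain at step 308
print(check_recovered([9, 3, 6, 3, 7], r, recorded))  # True
print(check_recovered([9, 3, 6, 3, 8], r, recorded))  # False
```

A mismatch, as in the second call, corresponds to the case where the Verifier may refuse to proceed further.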
[44] At step 338, the Verifier may calculate a hash H' based on the data content content, the encrypted data clip En(clip, PKVerifier), and the signature sig obtained from the Blockchain. For example, the Verifier may calculate the hash H' = hash(content + En(clip, PKVerifier) + sig). At step 340, the Verifier may encrypt the hash H' using the recovered random number sequence s'. For illustrative purposes, the encrypted hash H' may be denoted as EH' = En(H', s'), which should equal EH obtained from the Blockchain. If EH' ≠ EH, the Verifier may determine that the integrity of the data content content and/or the authenticity of the signature sig have been compromised. In other words, if the data content content and/or the signature sig recorded on the Blockchain differ from those used by the User to calculate EH (e.g., if a malicious party changed the data content content or forged the signature sig recorded on the Blockchain), the differences can be recognized because EH' ≠ EH. In such cases, the Verifier may refuse to accept the data content content and the signature sig. Otherwise, if EH' = EH, the Verifier may accept the data content content and the signature sig.
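For illustrative purposes, the comparison of EH' against EH may be sketched as follows. The same assumptions apply as for any concrete rendering of En(H, s): SHA-256 stands in for hash, and the encryption with the sequence is modeled as an XOR with a pad derived from the sequence, which is an illustrative placeholder and not a cipher named in the specification. The names encrypt_hash and verify are hypothetical.

```python
import hashlib
import hmac


def encrypt_hash(h: bytes, s: list[int]) -> bytes:
    """Hypothetical En(H, s): XOR H with SHA-256 of the encoded sequence s."""
    pad = hashlib.sha256(",".join(map(str, s)).encode()).digest()
    return bytes(a ^ b for a, b in zip(h, pad))


def verify(content: bytes, enc_clip: bytes, sig: bytes,
           s_prime: list[int], eh_recorded: bytes) -> bool:
    h_prime = hashlib.sha256(content + enc_clip + sig).digest()  # H'
    eh_prime = encrypt_hash(h_prime, s_prime)                    # EH' = En(H', s')
    return hmac.compare_digest(eh_prime, eh_recorded)            # EH' = EH ?


s = [9, 3, 6, 3, 7]
content, enc_clip, sig = b"recording", b"<encrypted clip>", b"<signature>"
eh = encrypt_hash(hashlib.sha256(content + enc_clip + sig).digest(), s)

print(verify(content, enc_clip, sig, s, eh))         # True
print(verify(content + b"!", enc_clip, sig, s, eh))  # False
```

The second call models a data content that was changed after EH was calculated, in which case EH' ≠ EH and the Verifier may refuse to accept the data content and the signature.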
[45] In some embodiments, the Verifier may perform one or more additional verifications at step 342. For example, if the data content content includes a voice recording and the signature sig includes a voice signature, the Verifier may utilize a voice analyzer, including, e.g., an artificial intelligence-based voice analyzer, to compare the voice contained in the data content content against the voice contained in the signature sig to determine the authenticity of the signature sig (e.g., determine whether the content and the sig were uttered by the same person). In another example, if the signature sig is a digital signature, the Verifier may utilize appropriate verification algorithms to verify the authenticity of the digital signature. In some embodiments, the Verifier may accept the data content content and the signature sig if they pass verification steps 340 and 342. In this manner, the method 300 can provide a mechanism for the Blockchain to determine the integrity of the data content content received. In some embodiments, the Blockchain may refuse to record the data content content if its integrity is not verified.
[46] FIG. 5 illustrates a flow chart of a method 500 for verifying data integrity, according to an embodiment. The method 500 may be performed by one or more nodes in a blockchain system, e.g., the nodes 102-110 in the blockchain system 100 (FIG. 1). The nodes 102-110 in the blockchain system 100 may perform operations on a blockchain, e.g., the blockchain 120 (FIG. 1). The blockchain 120 may be implemented as the Blockchain in the examples described above.
[47] At step 502, a node, e.g., the node 102, may obtain a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain. The data content, the encrypted data clip, the signature, and the encrypted hash may be recorded on the blockchain by a user, e.g., the User (FIG. 3B, step 326), and the node 102 may obtain the recorded data to verify the integrity of the data content and the authenticity of the signature.
[48] At step 504, the node 102 may decrypt the encrypted data clip to generate a decrypted data clip. As described above, the user who recorded the encrypted data clip on the blockchain may have encrypted the data clip using a public key of the node 102, allowing the node 102 to decrypt the encrypted data clip using its corresponding private key.
[49] At step 506, the node 102 may parse the decrypted data clip against the data content to recover a random number sequence that was used to create the data clip. As described above with reference to FIG. 4, in some embodiments, the node 102 may parse the decrypted data clip against the data content to determine the positions where the various pieces of data were extracted to create the data clip. The node 102 may then recover the random number sequence based on the determined positions. In some embodiments, the blockchain may record a commitment of the random number sequence used to create the data clip. In such embodiments, the node 102 may determine whether the recovered random number sequence is correct based on the commitment recorded on the blockchain, as described above.
[50] At step 508, the node 102 may calculate a hash based on the data content, the encrypted data clip, and the signature. At step 510, the node 102 may encrypt the hash using the recovered random number sequence, and in some embodiments, the node 102 may record the recovered random number sequence on the blockchain to prevent the recovered random number sequence from being used to encrypt the hash again. As described above, the hash encrypted using the recovered random number sequence should match the encrypted hash recorded on the blockchain. Accordingly, at step 512, the node 102 may determine the integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain. In some embodiments, if the hash encrypted using the recovered random number sequence matches the encrypted hash recorded on the blockchain, the node 102 may accept the integrity of the data content. Otherwise, if the hash encrypted using the recovered random number sequence does not match the encrypted hash recorded on the blockchain, the node 102 may refuse to accept the integrity of the data content.
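For illustrative purposes, the recording of recovered random number sequences to prevent reuse may be sketched as follows. This is a minimal sketch under stated assumptions: an in-memory set stands in for the record kept on the blockchain, and the class name SequenceRegistry and its method publish are hypothetical.

```python
class SequenceRegistry:
    """Stand-in for the on-chain record of published random number sequences."""

    def __init__(self) -> None:
        self._published: set[tuple[int, ...]] = set()

    def publish(self, s_prime: list[int]) -> bool:
        """Record s'; return False if it was already published (a reuse attempt)."""
        key = tuple(s_prime)
        if key in self._published:
            return False
        self._published.add(key)
        return True


registry = SequenceRegistry()
print(registry.publish([9, 3, 6, 3, 7]))  # True: first use is recorded
print(registry.publish([9, 3, 6, 3, 7]))  # False: the sequence was already used
```

A node might consult such a record before accepting an encrypted hash, refusing any submission whose recovered sequence has already been published.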
[51] In some embodiments, the node 102 may further determine the authenticity of the signature. For example, if the data content includes a voice recording and the signature includes a voice signature, the node 102 may compare the voice contained in the data content against the voice contained in the signature to determine the authenticity of the voice signature. In another example, if the signature is a digital signature, the node 102 may utilize an appropriate verification algorithm to verify the authenticity of the digital signature. In some embodiments, if the signature is authentic, the node 102 may accept the integrity of the data content. Otherwise, the node 102 may refuse to accept the integrity of the data content.
[52] FIG. 6 is a block diagram of an apparatus 600 for verifying data integrity, according to an embodiment. The apparatus 600 may be an implementation of a software process and may correspond to the method 500 (FIG. 5). Referring to FIG. 6, the apparatus 600 may include a processing module 602, an encryption/decryption module 604, and a determination module 606.
[53] The processing module 602 may obtain a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain and provide the obtained data to the encryption/decryption module 604. The encryption/decryption module 604 may decrypt the encrypted data clip to generate a decrypted data clip. The processing module 602 may then parse the decrypted data clip against the data content to recover a random number sequence that was used to create the data clip. The processing module 602 may also calculate a hash based on the data content, the encrypted data clip, and the signature, and provide the hash to the encryption/decryption module 604. The encryption/decryption module 604 may encrypt the hash using the recovered random number sequence and provide the hash encrypted using the recovered random number sequence to the determination module 606. The determination module 606 may determine the integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain, as described above.
[54] Each of the above-described modules may be implemented as software, or hardware, or a combination of software and hardware. For example, each of the above-described modules may be implemented using a processor executing instructions stored in a memory. Also, for example, each of the above-described modules may be implemented with one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components, for performing the described methods. Further for example, each of the above-described modules may be implemented by using a computer chip or an entity, or implemented by using a product having a certain function. In one embodiment, the apparatus 600 may be a computer, and the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email receiving and sending device, a game console, a tablet computer, a wearable device, or any combination of these devices.
[55] For an implementation process of functions and roles of each module in the apparatus 600, reference can be made to the corresponding steps in the above-described methods. Details are omitted here for simplicity.
[56] In some embodiments, a computer program product may include a non-transitory computer-readable storage medium having computer-readable program instructions thereon for causing a processor to carry out the above-described methods.
[57] The computer-readable storage medium may be a tangible device that can store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
[58] The computer-readable program instructions for carrying out the above-described methods may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language, and conventional procedural programming languages. The computer-readable program instructions may execute entirely on a computing device as a stand-alone software package, or partly on a first computing device and partly on a second computing device remote from the first computing device. In the latter scenario, the second, remote computing device may be connected to the first computing device through any type of network, including a local area network (LAN) or a wide area network (WAN).
[59] The computer-readable program instructions may be provided to a processor of a general-purpose or special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the above-described methods.
[60] The flow charts and diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of devices, methods, and computer program products according to various embodiments of the specification. In this regard, a block in the flow charts or diagrams may represent a software program, segment, or portion of code, which comprises one or more executable instructions for implementing specific functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the diagrams and/or flow charts, and combinations of blocks in the diagrams and flow charts, may be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
[61] It is appreciated that certain features of the specification, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the specification, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the specification. Certain features described in the context of various embodiments are not essential features of those embodiments, unless noted as such.
[62] Although the specification has been described in conjunction with specific embodiments, many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the following claims embrace all such alternatives, modifications and variations that fall within the terms of the claims.

Claims

1. A computer-implemented method for verifying data integrity, the method comprising: obtaining a data content, an encrypted data clip, a signature, and an encrypted hash recorded on a blockchain; decrypting the encrypted data clip to generate a decrypted data clip; parsing the decrypted data clip against the data content to recover a random number sequence that was used to create an original data clip corresponding to the encrypted data clip; calculating a hash based on the data content, the encrypted data clip, and the signature; encrypting the hash using the recovered random number sequence; and determining an integrity of the data content by comparing the hash encrypted using the recovered random number sequence against the encrypted hash recorded on the blockchain.
2. The method of claim 1, further comprising: in response to a determination that the hash encrypted using the recovered random number sequence matches the encrypted hash recorded on the blockchain, accepting the integrity of the data content.
3. The method of any one of preceding claims, further comprising: in response to a determination that the hash encrypted using the recovered random number sequence does not match the encrypted hash recorded on the blockchain, refusing to accept the integrity of the data content.
4. The method of any one of preceding claims, further comprising: determining an authenticity of the signature; and in response to a determination that the signature is authentic, accepting the integrity of the data content.
5. The method of claim 4, further comprising: in response to a determination that the signature is not authentic, refusing to accept the integrity of the data content.
6. The method of any one of preceding claims, further comprising: recording the recovered random number sequence on the blockchain to prevent the recovered random number sequence from being used to encrypt the hash again.
7. The method of any one of preceding claims, wherein the original data clip is created by extracting pieces of data from the data content according to the random number sequence, a position of each of the extracted pieces of data in the data content being determined by a corresponding number in the random number sequence.
8. The method of any one of preceding claims, further comprising: recording a commitment of the random number sequence used to create the data clip on the blockchain; and determining whether the recovered random number sequence is correct based on the commitment recorded on the blockchain.
9. The method of any one of preceding claims, wherein the data content comprises a voice recording.
10. The method of any one of preceding claims, wherein the signature comprises a voice signature.
11. A device for verifying data integrity, comprising: one or more processors; and one or more computer-readable memories coupled to the one or more processors and having instructions stored thereon that are executable by the one or more processors to perform the method of any one of claims 1 to 10.
12. An apparatus for verifying data integrity, the apparatus comprising a plurality of modules for performing the method of any one of claims 1 to 10.
13. A non-transitory computer-readable medium having stored therein instructions that, when executed by a processor of a device, cause the device to perform the method of any one of claims 1 to 10.
PCT/IB2022/050377 2021-03-08 2022-01-18 Methods and devices for verifying data integrity WO2022189865A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202280003220.2A CN115299010A (en) 2021-03-08 2022-01-18 Method and apparatus for verifying data integrity

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10202102327U 2021-03-08
SG10202102327UA SG10202102327UA (en) 2021-03-08 2021-03-08 Methods and devices for verifying data integrity

Publications (1)

Publication Number Publication Date
WO2022189865A1 (en) 2022-09-15

Family

ID=78397435

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2022/050377 WO2022189865A1 (en) 2021-03-08 2022-01-18 Methods and devices for verifying data integrity

Country Status (3)

Country Link
CN (1) CN115299010A (en)
SG (1) SG10202102327UA (en)
WO (1) WO2022189865A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116233120A (en) * 2023-05-10 2023-06-06 深圳普菲特信息科技股份有限公司 Large file fragment transmission method, system and medium based on data processing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180308098A1 (en) * 2015-05-05 2018-10-25 ShoCard, Inc. Identity Management Service Using A Block Chain Providing Identity Transactions Between Devices
KR20180130623A (en) * 2017-05-29 2018-12-10 주식회사 익스트러스 Blockchain formation method for application integrity verification and application integrity verification method
CN109522698A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 User authen method and terminal device based on block chain
CN110351089A (en) * 2019-05-23 2019-10-18 西安电子科技大学 A kind of data signature authentication method and device
US20190372776A1 (en) * 2018-06-04 2019-12-05 Sap Se Secure data exchange
US20210058230A1 (en) * 2018-09-30 2021-02-25 Advanced New Technologies Co., Ltd. Blockchain-based transaction method and apparatus, and remitter device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116233120A (en) * 2023-05-10 2023-06-06 深圳普菲特信息科技股份有限公司 Large file fragment transmission method, system and medium based on data processing
CN116233120B (en) * 2023-05-10 2023-07-14 深圳普菲特信息科技股份有限公司 Large file fragment transmission method, system and medium based on data processing

Also Published As

Publication number Publication date
SG10202102327UA (en) 2021-09-29
CN115299010A (en) 2022-11-04

Similar Documents

Publication Publication Date Title
US11165590B2 (en) Decentralized biometric signing of digital contracts
US20200344070A1 (en) Methods and devices for validating transaction in blockchain system
CN111066046B (en) Replay attack resistant authentication protocol
CN107403303B (en) Signing method of electronic contract system based on block chain deposit certificate
CN110661610B (en) Input acquisition method and device of secure multi-party computing protocol
US20190081800A1 (en) System for issuing certificate based on blockchain network, and method for issuing certificate based on blockchain network by using same
CN110458560B (en) Method and apparatus for transaction verification
US20210065169A1 (en) Methods and devices for providing transaction data to blockchain system for processing
KR20190075772A (en) AuthenticationSystem Using Block Chain Through Combination of Data after Separating Personal Information
US7606768B2 (en) Voice signature with strong binding
US11240041B2 (en) Blockchain-based transaction verification
US11436599B2 (en) Blockchain-based identity verification method and related hardware
US10880383B2 (en) Methods and devices for establishing communication between nodes in blockchain system
US11368309B2 (en) Methods and devices for generating and verifying passwords
WO2023035477A1 (en) Blockchain-based method for document validation
WO2022189865A1 (en) Methods and devices for verifying data integrity
CN116069856A (en) Data integrity verification method and system based on blockchain
WO2021139605A1 (en) Methods and devices for providing decentralized identity verification
WO2021088451A1 (en) Methods and devices for preventing denial-of-service attack on blockchain system
US20230237200A1 (en) Digital witness systems and methods for authenticating and confirming the integrity of a digital artifact
Yeh et al. Integrating Cellphone-based Hardware Wallet with Visional Certificate Verification System
WO2023167636A1 (en) Methods and devices for providing privacy-preserving auditable ledger for managing tokens

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22766447

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/01/2024)