WO2022170370A1 - Method of protecting file contents with high information entropy using a combination of swap codes, aes encryption standard and blockchain technology and system for implementing the same - Google Patents


Info

Publication number
WO2022170370A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
block
event
ciphertext
protecting
Prior art date
Application number
PCT/VN2022/000001
Other languages
French (fr)
Inventor
Khuong Tuan NGUYEN
Original Assignee
Nguyen Khuong Tuan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nguyen Khuong Tuan
Publication of WO2022170370A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631 Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209 Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643 Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0863 Generation of secret information including derivation or calculation of cryptographic keys or passwords involving passwords or one-time passwords
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0894 Escrow, recovery or storing of secret information, e.g. secret key escrow or cryptographic key storage
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/14 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using a plurality of keys or algorithms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/50 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using hash chains, e.g. blockchains or hash trees

Abstract

The present invention provides a method of protecting file contents with high information entropy using a combination of swap codes, the AES encryption standard and blockchain technology, wherein the swap code can be performed by swapping each subsegment in a segment or by swapping each codeword in the codeword space. A symmetric encryption standard, such as AES, Twofish, Serpent, Blowfish, CAST5, RC4, Triple DES or IDEA (International Data Encryption Algorithm), is used to protect the user information and the ciphertext; this is combined with blockchain technology to comprehensively protect file contents. The invention also provides a system for implementing the method.

Description

METHOD OF PROTECTING FILE CONTENTS WITH HIGH INFORMATION ENTROPY USING A COMBINATION OF SWAP CODES, AES ENCRYPTION STANDARD AND BLOCKCHAIN TECHNOLOGY AND SYSTEM FOR IMPLEMENTING THE SAME
Technical Field
The present invention relates to a method of protecting a file with high information entropy by permuting the data elements of the file (encryption); the information about the permutation (the cypher key), together with other information related to the user, is then encrypted using the AES encryption standard. At the same time, the file is conventionally divided into blocks, and these blocks are linked together by a one-way SHA code to control the file's integrity. In addition, all information about events related to the file, from the initiation event (creation of the file) to other events (such as access, copying, user authorization, changes to user permissions, etc.), is stored on the file itself and, when there is an internet connection, is simultaneously stored on a server according to blockchain technology.
In this way, the file owner can fully protect the file contents, including controlling access to read the file's data; controlling copying and distribution of the file; preventing modification of the file content; and deleting the file, or granting or blocking access to it for one or more people, while connected to the Internet.
Background Art
Current files, especially civil and specialized audio and video files, are often large and their content is unsecured. These files are simply created and stored on memory devices in an open format, which means anyone can easily read and copy them. Cutting and editing the content of these files is also very easy.
In some cases where security is required, these data are protected by storing them in accordance with data protection procedures, such as in a safe, but a security hole remains if the stored data is not encrypted in time. Because of this drawback, private, sensitive or important content can still be disclosed when the storage device is lost, which can have unexpected consequences.
Another major drawback of audio and video files stored in an open, unsecured format is that they can easily be cropped, edited or falsified, and so lose their originality. Therefore, audio and video files must always be examined and verified for integrity before use. For added security, video distribution on CDs or removable storage devices (such as portable hard drives and USB sticks) can be replaced by a centralized database. However, this removes the convenience of distributing information on removable storage devices, and the stored data still needs another strong data encryption system to protect it.
Strong encryption algorithms can be applied to secure audio and video files. However, because audio and video files are large, both encryption and decryption require considerable computing power, which makes the encryption/decryption system costly and complex. If the computing power is insufficient, decrypting (opening) the file takes a long time, which inconveniences the user.
Swap codes have long been known and used in information encryption. They can be implemented in many different ways, but all rest on one general principle: the positions of the elements of the original data are changed in a controlled manner, according to a rule, so that the rearranged data can no longer be read in the normal way.
The disadvantage of permutation ciphers is that, for ciphertexts with low information entropy, the swap cipher is easily broken by exploiting the unequal distribution of the data elements. The attacker counts the frequency of occurrence of the elements in the ciphertext and from this deduces the permutation rule (cipher key). For example, statistics show that in English text the letter "e" appears most often. An encrypted English text file can therefore be attacked by finding the element with the highest frequency in the encrypted data and assigning it to "e", then proceeding in the same way with the other elements. With current computing power, ciphertexts encrypted with conventional permutation ciphers are easily broken according to this principle.
Current audio and video files are stored in popular formats such as MP3, MP4, JPEG, etc., which have already undergone entropy compression to reduce their size. Files of other types, such as text files and image files, can also be entropy-compressed by file compression software such as Winrar and Winzip with good performance and high processing speed. Entropy compression makes the distribution of the elements of the processed files even (high information entropy). Please refer to Figure 1a, which illustrates the probability distribution of codewords and the calculated information entropy of a text file before and after entropy compression, and Figure 1b, which illustrates the probability distribution of codewords and the calculated information entropy of a video file compressed according to the H.264 and H.265 video compression standards. Protecting files with high information entropy by permutation is therefore efficient and fast, especially for large files, and it removes the unequal distribution of data elements that could otherwise be exploited to break the code. Only small items of information, such as user names and passwords, authorization information, new cryptographic keys, etc., are secured according to the AES (Advanced Encryption Standard) encryption standard.
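The effect of entropy compression on the codeword distribution can be illustrated with a short sketch. It assumes 8-bit codewords, uses zlib as a stand-in for the compression software named above, and builds a sample text from a hypothetical word list; none of these choices comes from the invention itself.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per 8-bit codeword (the maximum is 8.0)."""
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical sample: words drawn from a small vocabulary with a fixed seed.
words = ["file", "block", "event", "ciphertext", "segment", "codeword",
         "permutation", "entropy", "server", "password", "key", "hash",
         "storage", "video", "audio", "user"]
rng = random.Random(1)
text = " ".join(rng.choice(words) for _ in range(20000)).encode()

# zlib stands in for the entropy compression step (assumption).
packed = zlib.compress(text, 9)

h_text = byte_entropy(text)      # uneven letter frequencies, low entropy
h_packed = byte_entropy(packed)  # much flatter distribution, higher entropy
print(f"text:   {h_text:.2f} bits/byte")
print(f"packed: {h_packed:.2f} bits/byte")
```

The compressed stream has a noticeably higher per-byte entropy than the text, which is the property the permutation step relies on.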
Protecting file contents, especially large audio and video files, by combining the permutation method with the AES encryption standard increases the encryption/decryption speed many times in comparison with applying a single encryption method (asymmetric, such as RSA, or symmetric, such as AES) to the whole file with the same computing power. In addition, blockchain technology and one-way SHA hashing are applied to managing the file, assigning permissions, using the file and preventing its modification, so that files are better protected.
For that reason, the present invention provides a method to protect file contents with high information entropy using a combination of permutation cipher, AES encryption standard and blockchain technology. The present invention also provides a system for implementing the method.
Summary of Invention
A purpose of the present invention is to secure data and to overcome the technical shortcomings of data encryption methods known in the prior art.
In order to achieve the foregoing, the present invention provides a method for encrypting data, a method for storing data history and copying data, and a method for assigning permissions to users with authentication.
A method of protecting file contents with high information entropy using a combination of swap codes, AES encryption standard and blockchain technology, including:
step 1: encrypt the file with a swap code by randomly generating a cryptographic key (CypherKey), then permuting the file according to the cryptographic key (CypherKey) to generate a ciphertext according to the following formula:
DataFile_ciphertext = Swap(CypherKey, DataFile_content)
Where DataFile_ciphertext is the ciphertext after the file is encrypted; DataFile_content is the unencrypted contents of the file; CypherKey is the cryptographic key; and Swap is the permutation function.
According to an embodiment of the present invention, after being encrypted in step 1, depending on its size, the ciphertext may be conventionally divided into one or more blocks; each block carries a digital signature of the block calculated by the formula:
Hash[j] = SHA(Hash[j-1], DataFile_ciphertext, DataBlock[j])
step 2: set a password to protect the file; the system calculates a digital signature (hash) of the user's password according to the formula:
Password_Hash = SHA(secretkey, username, password, permission)
Wherein secretkey is a secret, private key of the system and is hidden; username is the user name; password is the password set by the user; and permission is the user's permission to the file. The password can be set by the user or by the administrator, depending on who is authorized for this task. The digital signature of the user's password is stored publicly and is used to check the user's password.
step 3: encrypt the cryptographic key and user account information using the symmetric encryption standard.
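The password digest of step 2 can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the SHA variant and a length-prefixed concatenation of the four inputs; the description does not specify either, and the function names are hypothetical.

```python
import hashlib
import hmac


def _field(value: bytes) -> bytes:
    # Length-prefix each field so that ("ab", "c") and ("a", "bc")
    # do not produce the same input to the hash (assumed encoding).
    return len(value).to_bytes(4, "big") + value


def password_hash(secret_key: bytes, username: str, password: str,
                  permission: str) -> str:
    """Password_Hash = SHA(secretkey, username, password, permission)."""
    h = hashlib.sha256()  # SHA-256 is an assumption
    for part in (secret_key, username.encode(), password.encode(),
                 permission.encode()):
        h.update(_field(part))
    return h.hexdigest()


def check_password(stored: str, secret_key: bytes, username: str,
                   password: str, permission: str) -> bool:
    """Recompute the digest and compare it with the publicly stored one."""
    candidate = password_hash(secret_key, username, password, permission)
    return hmac.compare_digest(stored, candidate)


stored = password_hash(b"system-secret", "alice", "correct horse", "read")
print(check_password(stored, b"system-secret", "alice", "correct horse", "read"))
print(check_password(stored, b"system-secret", "alice", "correct horse", "write"))
```

Because the permission is part of the hashed input, the same password presented with a different permission does not match the stored digest.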
According to one aspect of the present invention, the symmetric encryption standard may be any symmetric encryption standard, including Twofish, Serpent, AES (also known as Rijndael), Blowfish, CAST5, RC4, Triple DES, and IDEA (International Data Encryption Algorithm), etc.
According to an embodiment of the present invention, the method of protecting file contents with high information entropy uses AES as the symmetric encryption standard, in which:
- User account information is encrypted according to the following formula:
ProtectedUserInfo = AES.Encrypt(secretkey, username, password, permission)
- Each user has a unique cryptographic key (CypherKey) encrypted according to the following formula:
ProtectedCypherKey = AES.Encrypt(secretkey, password, CypherKey)
According to an embodiment of the present invention, the method of protecting file contents with high information entropy also includes step 4 and step 5 after step 3:
Step 4: Generating a "container file" to contain all the information of the encrypted file. The container file includes: the ciphertext (DataFile); the digital signatures of the passwords of all authorized users/groups (Password_Hash); the encrypted information about all users/user groups (ProtectedUserInfo); and all cryptographic keys (ProtectedCypherKey).
Step 5: Generating an event data table, which stores, according to blockchain technology, the entire history of the container file from the time it was generated (initiated), including the events that happen to the container file after initialization, and storing it as a container file extension or as a separate file;
The initiation block and subsequent blocks in the blockchain are sealed according to the formula:
Block_Seal = SHA[Block_Seal of the previous block; Information & Events]
If a block is the initiation block, the parameter "Block_Seal of the previous block" is set to null;
If the user copies a file, a new block is created on the original file to record this event; meanwhile, on the newly copied file, the copy event is treated as the initiation event of that file.
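The sealing rule above can be sketched as a small hash chain. The sketch assumes SHA-256, a JSON encoding of the event information and an empty string for the null previous seal; the field names and events are hypothetical.

```python
import hashlib
import json


def block_seal(previous_seal, info: dict) -> str:
    """Block_Seal = SHA[Block_Seal of the previous block; Information & Events]."""
    h = hashlib.sha256()  # SHA-256 is an assumption
    h.update((previous_seal or "").encode())  # null for the initiation block
    h.update(json.dumps(info, sort_keys=True).encode())
    return h.hexdigest()


def append_event(table: list, info: dict) -> dict:
    """Seal a new block against the last block of the event data table."""
    previous = table[-1]["seal"] if table else None
    block = {"info": info, "previous_seal": previous,
             "seal": block_seal(previous, info)}
    table.append(block)
    return block


def verify(table: list) -> bool:
    """Recompute every seal; any edited block breaks the chain."""
    previous = None
    for block in table:
        if block["previous_seal"] != previous:
            return False
        if block["seal"] != block_seal(previous, block["info"]):
            return False
        previous = block["seal"]
    return True


original = []
append_event(original, {"event": "initiation", "author": "alice"})
append_event(original, {"event": "read", "user": "bob"})

# Copying: a new block on the original, and the initiation block of the copy.
append_event(original, {"event": "copy", "user": "bob"})
copy_table = []
append_event(copy_table, {"event": "copy", "user": "bob"})

print(verify(original), verify(copy_table))
```

Changing the information in any earlier block makes verify return False, since each seal depends on the one before it.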
In one embodiment, the method of protecting file contents with high information entropy further comprises step 6, which synchronizes the event data table on the file storage medium from step 5 with the event data table on the internet-connected server system. The server holds the Global Event Data Table, which contains all information related to the file: the event information for the original file and for all files copied from it, stored on all File Storage Media, is updated on the Global Event Data Table via the internet. For each event, a new block is generated, along with a global block seal. The global block seal is calculated by the following formula:
Block_Seal_Global = SHA[Block_Seal of the previous block; event information]
The Event Data Table on the File Storage Medium has an additional field recording the global block seal (Block_Seal_Global): when the event information is updated on the global system, a global block seal is generated and sent to the File Storage Medium for archiving.
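The synchronization of step 6 might look like the following sketch. It assumes that the previous seal used by the server is the previous global seal, that SHA-256 is the SHA variant, and that unsynchronized local blocks are marked with an empty global-seal field; the class, function and field names are hypothetical.

```python
import hashlib


def seal(previous_seal: str, event_information: str) -> str:
    # SHA-256 over "previous seal; event information" (assumed encoding).
    data = previous_seal + ";" + event_information
    return hashlib.sha256(data.encode()).hexdigest()


class GlobalEventTable:
    """Server-side table that collects events from every storage medium."""

    def __init__(self):
        self.blocks = []

    def record(self, medium_id: str, event: str) -> str:
        previous = self.blocks[-1]["seal_global"] if self.blocks else ""
        global_seal = seal(previous, f"{medium_id}:{event}")
        self.blocks.append({"medium": medium_id, "event": event,
                            "seal_global": global_seal})
        return global_seal


def sync(local_table: list, medium_id: str, server: GlobalEventTable) -> int:
    """Upload blocks that have no global seal yet and archive the seal locally."""
    uploaded = 0
    for block in local_table:
        if block.get("seal_global") is None:
            block["seal_global"] = server.record(medium_id, block["event"])
            uploaded += 1
    return uploaded


server = GlobalEventTable()
usb_1 = [{"event": "initiation", "seal_global": None},
         {"event": "read", "seal_global": None}]
usb_2 = [{"event": "copy", "seal_global": None}]

first = sync(usb_1, "usb-1", server)   # both local blocks are uploaded
second = sync(usb_2, "usb-2", server)  # events from a copy on another medium
again = sync(usb_1, "usb-1", server)   # nothing new, so nothing is uploaded
print(first, second, again, len(server.blocks))
```

Events recorded while the storage medium is offline simply wait in the local table until the next call to sync.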
According to a preferred embodiment of the invention, in addition to the file-related event information, the original file is also stored on a server system, allowing the user to exercise his/her rights to the file directly on the server, including searching, viewing, downloading to storage media, deleting, and granting user permissions and other permissions.
In one embodiment of the present invention, encryption of a file using a permutation cipher (step 1) is performed by permuting the subsegments in each segment of the file according to the cryptographic key, by the following procedure:
(i) Split the file into segments and subsegments: the file is divided into several equal segments, each of 2^n x 256 bytes, where n is an integer greater than or equal to zero.
If splitting the file into equal segments of 2^n x 256 bytes leaves a remainder, the remainder is treated as a separate, independent segment which is not encrypted.
Each segment is split into 256 equal subsegments, each of 2^n bytes, where n is a non-negative integer.
(ii) A cryptographic key (Cypherkey) of 256 elements, describing a sequence of the subsegments, is randomly generated;
(iii) The sequence of the subsegments in every segment is swapped from the natural sequence (0 to 255) to the sequence specified by the cryptographic key; the swap is performed uniformly in all segments of the file.
According to a preferred embodiment of the invention, the parameter n has a value between 2 and 4.
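Steps (i) to (iii) can be sketched as follows. The sketch assumes one particular direction for the swap (position i of each encrypted segment receives subsegment key[i] of the plain segment) and uses the operating system's random source for key generation; the function names are hypothetical.

```python
import secrets

SUBSEGMENTS = 256  # subsegments per segment, as in the description


def generate_cypherkey() -> list:
    """Random ordering of the numbers 0..255."""
    key = list(range(SUBSEGMENTS))
    secrets.SystemRandom().shuffle(key)
    return key


def swap(key: list, data: bytes, n: int = 3, inverse: bool = False) -> bytes:
    """Permute the 2^n-byte subsegments of every full segment of `data`."""
    sub = 2 ** n                 # subsegment size in bytes
    seg = sub * SUBSEGMENTS      # segment size in bytes
    out = bytearray(data)        # the remainder is copied unchanged
    full = len(data) - len(data) % seg
    for start in range(0, full, seg):
        for i, k in enumerate(key):
            # Encrypt: position i takes subsegment k; decrypt: the reverse.
            src, dst = (i, k) if inverse else (k, i)
            out[start + dst * sub:start + (dst + 1) * sub] = \
                data[start + src * sub:start + (src + 1) * sub]
    return bytes(out)


key = generate_cypherkey()
plain = bytes(range(256)) * 16 + b"tail"   # two full segments plus a remainder
cipher = swap(key, plain, n=3)
restored = swap(key, cipher, n=3, inverse=True)
print(restored == plain, cipher[-4:])
```

The same key is applied to every segment, and the trailing bytes that do not fill a whole segment pass through unmodified, as the description states.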
In an embodiment of the invention, the encryption of the file using the swap codes (step 1) is performed by swapping the codewords, by the following procedure:
(i) a cryptographic key describing a sequence of the codewords, with as many elements as there are codewords, is generated randomly;
(ii) each codeword in the file is swapped for the corresponding codeword specified by the Cypherkey.
According to a preferred embodiment of the invention, the number of codewords is 256.
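For the preferred case of 256 codewords (one byte each), the codeword swap can be sketched with a byte translation table. The function names are hypothetical and the key is drawn from the operating system's random source as an assumption.

```python
import secrets


def generate_codeword_key(size: int = 256) -> list:
    """Random ordering of the codewords 0..size-1."""
    key = list(range(size))
    secrets.SystemRandom().shuffle(key)
    return key


def swap_codewords(key: list, data: bytes) -> bytes:
    """Replace every codeword b in the file with key[b]."""
    return data.translate(bytes(key))


def unswap_codewords(key: list, data: bytes) -> bytes:
    """Invert the key and apply it to recover the original codewords."""
    inverse = [0] * len(key)
    for plain, cipher in enumerate(key):
        inverse[cipher] = plain
    return data.translate(bytes(inverse))


key = generate_codeword_key()
plain = bytes(range(256)) + b"sample payload"
cipher = swap_codewords(key, plain)
print(unswap_codewords(key, cipher) == plain)
```

Unlike the subsegment variant, every byte stays at its position and only its value changes.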
In an embodiment, the method of protecting file contents with high information entropy according to the invention has an additional step 1b right after step 1. Accordingly, after being encrypted in step 1, depending on its size, the ciphertext can be conventionally divided into one or more blocks; each block carries a digital signature of the block, calculated by the following formula:
Hash[j] = SHA(Hash[j-1], DataFile, DataBlock[j])
Wherein SHA is a one-way encryption (hash) function; Hash[j-1] is the digital signature of the preceding block; if the ciphertext has only one block, or the block is the first, Hash[j-1] is null; DataFile is the encrypted file (ciphertext); DataBlock[j] is the j-th block of the data. If the ciphertext has only one block, or the block is the first, the DataBlock parameter is equal to the ciphertext value (DataFile).
In an embodiment of the invention, each block is conventionally sized between 1 MB and 16 MB.
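The block signatures of step 1b can be sketched as below. The sketch follows the formula literally, feeding the previous hash, the whole ciphertext and the current block into the hash; SHA-256 and the empty byte string for the null value are assumptions, and the function name is hypothetical.

```python
import hashlib


def block_hashes(ciphertext: bytes, block_size: int = 1024 * 1024) -> list:
    """Hash[j] = SHA(Hash[j-1], DataFile, DataBlock[j]) for every block."""
    hashes = []
    previous = b""  # null for the first block
    for offset in range(0, len(ciphertext), block_size):
        block = ciphertext[offset:offset + block_size]
        h = hashlib.sha256()  # SHA-256 is an assumption
        h.update(previous)
        h.update(ciphertext)
        h.update(block)
        previous = h.digest()
        hashes.append(previous)
    return hashes


# A tiny block size keeps the example short; the description uses 1 MB to 16 MB.
data = b"abcdefghij"
chain = block_hashes(data, block_size=4)
tampered = block_hashes(b"abcdefghiX", block_size=4)
print(len(chain), chain[-1] != tampered[-1])
```

Hashing the whole ciphertext for every block is costly for large files; an implementation could hash the file once and reuse that digest, which would be a departure from the literal formula.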
The present invention also provides a system for implementing the method of protecting the file contents, including: a storage device for storing and entropy-compressing files; a device for encrypting/decrypting the compressed files in the storage device and calculating parameters according to the method of the invention; and an internet-connected server capable of storing the information about the file that is stored on the storage device, from the initiation event to the other file-related events.
In an embodiment of the invention, the system for implementing the method of protecting the file contents is characterized in that the internet-connected server is additionally capable of storing the original files.
Brief Description of the Drawings
Figure 1a illustrates the probability distribution of codewords and the calculated information entropy of a text file before and after entropy compression.
Figure 1b illustrates the probability distribution of codewords and the calculated information entropy of a video file compressed according to the H.264 and H.265 video compression standards.
Figure 2 is a flowchart illustrating the basic operation of the method of protecting the file contents of high information entropy.
Detailed Description of Embodiments
In the following, the invention is described in detail via embodiments. The following embodiments of the invention are given as examples for the purpose of disclosing the entire invention to persons having ordinary skill in the respective technical field. However, the examples are not intended to limit the invention. The invention includes all variations, equivalents and alternatives within the spirit and scope of the invention.
It should be understood that, unless otherwise indicated, the terms used in the description should be construed as they are commonly understood and widely used by a person with average knowledge of the technical field. The terms used in the patent description are intended to describe specific embodiments and are not intended to be limiting. Terms such as Cypherkey, Onx, Ony, Onz, etc. are used to distinguish objects, parameters, functions and algorithms, and are not intended to limit the invention. Descriptions of matters known to the person skilled in the art that would obscure important points of the invention are omitted.
In the present invention, the terms are construed as follows:
1. Codeword: Codewords are symbols carrying information. For example, the Chinese character 日 (a vertical rectangle with a horizontal stroke in the middle) carries the concept of the Sun; the letter "a" carries the information of the vowel "a".
In a computer, a codeword is a number consisting of n bits, for example n = 8 bits or 1 byte; n is also known as the length of the codeword. For example, the 8-bit number 3 is represented as 00000011|bin, 0x03|hex or 3|dec, and occupies 8 bits. Under normal (unencrypted) conditions, the information carried by a number is the number itself; for example, the codeword 00000011|bin carries the information of the number 3.
In the codeword permutation, the codeword 00000011|bin is swapped to carry other information, making it impossible to read the file in the usual way;
Codewords can be of different lengths, but for the purposes of this invention only codewords of equal length are considered.
2. Codeword space: The set of codewords contained in the file forms the Codeword Space. The n-bit codeword space is the set of all possible n-bit codewords, consisting of 2^n different codewords, denoted by Vn.
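The representations of a codeword and the size of the codeword space can be checked with a few lines; the variable names are illustrative only.

```python
n = 8          # codeword length in bits
codeword = 3

as_bin = format(codeword, f"0{n}b")   # binary, padded to n bits
as_hex = f"0x{codeword:02x}"          # hexadecimal
as_dec = str(codeword)                # decimal

space_size = 2 ** n                   # number of codewords in Vn
codeword_space = range(space_size)    # every possible n-bit codeword

print(as_bin, as_hex, as_dec, space_size)
```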
3. Segments: A file is conventionally divided into several equal parts of 2^n x 256 bytes each, called segments. However, the division is only a convention; the file remains unchanged and is not physically divided.
If the file size is not divisible by the segment size, the remainder is an independent segment that is preserved and not rearranged according to the Cypherkey sequence.
4. Block: A block is a conventional part of a file, but the concept of "block" is not identical with the concept of "segment" above because their functions are different.
The file is conventionally divided into several equal blocks of a predetermined size but remains physically intact. The blocks are linked together by a one-way SHA code, which controls the file's integrity.
5. Cryptographic key (so-called Cypherkey): is a data scrambling sequence, applied uniformly to the segments or to the codewords of a file. It is a sequence of the N numbers from 0 to (N-1), arranged not in ascending order but in random order.
6. Ciphertext: is the file after the codewords/subsegments have been swapped (encrypted). The ciphertext is publicly disclosed and exposed to attacks, but cannot be understood without the Cypherkey. The ciphertext of this invention preserves the high information entropy of the original file, i.e. the distribution of the codewords in the file is so homogeneous that it is almost impossible to decipher it by analysing irregularities in the distribution of the elements of the file.
7. Initiation event: is the event that generated the file, including information about the time, place, and author of the file's creation, and about who can use the file and their permissions.
8. Event data table: is a table containing file-related information, including but not limited to the digital signature of the user, the encrypted password, the encrypted user information, etc. The table is updated whenever an event occurs to the file.
The event data table can exist as an extension file inside the container file, or as an attachment file outside of the container file.
The event data table is stored on the File Storage Device. In a preferred embodiment, the Event Data Table is stored simultaneously on the File Storage Device and on the server; the Event Data Table on the File Storage Device contains event information related to the files on that device, while the event data table on the server, called the Global Event Data Table, contains all event information related to the original file (initiation file) and the files copied from it.
9. High entropy file: The high-entropy files encrypted according to the invention include audio, video and other data files that have been entropy-compressed. Entropy compression of audio and video files is usually performed by popular software in common formats such as MP3, MP4, JPEG, etc. to reduce the size; other data files such as text files, images, etc. can also be entropy-compressed by popular data compression software such as Winrar and Winzip. The unequal distribution of the components that make up the data is thereby almost eliminated.
A person having ordinary skill in the art will know that if data with low information entropy is protected with a permutation method, the permutation code is easily broken by exploiting the unequal distribution of the elements that make up the data: counting the occurrences of the elements in the ciphertext is enough to deduce the permutation rule (cipher key). However, the inventor found that when files such as video, image, audio and text files have undergone entropy compression to reduce their size (MP3, MP4, JPEG, Winrar, Winzip, etc.), the compression makes the distribution of the elements of the files more uniform (so-called high information entropy). Protecting these high-entropy files with a permutation code is therefore very effective and fast, especially for large files, while removing the disadvantage of the unequal distribution of the components of the data. Furthermore, if the permutation cipher is combined with the AES encryption standard and blockchain technology in securing these files, they will be at a very high level of security, and almost impossible to attack with current technology.
Based on the above teachings, the invention is implemented with the following main devices:
(i) A file storage device, capable of entropy compression of the file.
A file storage device can be a device with an audio recording function (for audio files), a video recording function (for image/video files), or a computer capable of creating different file types. The generated files must be processed with data compression software, which makes the information entropy of the file high. Popular data compression formats today include MP3, MP4, JPEG, Winrar, Winzip, etc.
File storage devices can be voice recorders, cameras, video recorders, smartphones with the above common functions, or ordinary computers.
(ii) A device able to encrypt/decrypt files according to the method of the present invention. This device can be a separate mixer, or a module or software integrated in the File Storage Device.
In one embodiment, the system has (iii) an internet-connected server capable of storing information about the file, from the initiation event to other file-related events. Accordingly, when the file is generated, the information about the initiation event and other event information (read, copy, share, grant user permissions, change user information, etc.) is sent to the server; each data block is sealed by the server and stored on the server. In this way, the system can track events related to the file as soon as the device on which the file is stored has an internet connection. However, the system does not know what the file contents are.
In a preferred embodiment, the system includes an internet-connected server capable of storing information about the file, from the initiation event to other file-related events. This embodiment is characterized in that, when the file is generated, the entire file, along with information about the initiation event and other event information (read, copy, share, grant user permissions, change the user information, etc.), is simultaneously sent to and stored on the server; each data block is sealed and stored on the server by the server, in addition to being sealed and stored on the File Storage Device. The data about the file on the server is then complete, and the server has the highest rights to that file.
In this preferred embodiment, the server performs data storage and encryption/decryption; runs a content management system (CMS) (including looking up information, viewing information in a web browser, downloading files, registering users and assigning permissions to users); and stores all events related to the file, from the initiation event to other events such as read, copy, share, etc.
In this way, the system can have unified and comprehensive control over all aspects of the storage, use, and distribution of the file.
According to the method in the invention, files with high entropy are protected according to the following steps:
Step 1. Encrypt a file with swap code
In an embodiment of the invention, a cryptographic key (CypherKey) is first randomly generated by the system; the file is then encrypted by swapping the subsegments in each segment of the file based on the cryptographic key (CypherKey) to generate a ciphertext according to the following formula:
DataFileciphertext = Swap(CypherKey, DataFilecontent)
where DataFileciphertext is the ciphertext after the file is encrypted, DataFilecontent is the unencrypted contents of the file, CypherKey is the cryptographic key, and Swap is the permutation function.
According to the embodiment, the file encryption step will be performed as follows:
(i) Split the file into segments and subsegments: the file is conventionally divided into several equal segments, each of size 2^n × 256 bytes, where n is an integer greater than or equal to zero. If the file size is not a multiple of 2^n × 256 bytes, the remainder is treated as a separate, independent segment which is not encrypted.
According to a preferred embodiment, the parameter “n” has a value between 2 and 4. Each segment is split into 256 equal subsegments, each of size 2^n bytes.
It should be noted that splitting the file into segments is only a convention to serve the permutation of subsegments, not a physical division. That is, the file substantially remains as it is, without being split into segments. A swap code is any permutation scheme that uses a cryptographic key CypherKey to swap the subsegments; the cryptographic key CypherKey describes the rule or sequence of the permutation of subsegments and is also used to recover the original file.
(ii) A cryptographic key (CypherKey) describing a sequence of subsegments, with a length of 256 elements, is randomly generated.
(iii) Permute the sequence of subsegments in each segment, from the regular sequence (0 to 255, respectively) to the sequence specified by the key. The permutation of the subsegment sequence is performed uniformly in all the segments of the file.
Since a segment with 256 subsegments is too large to illustrate, for simplicity consider an example segment with a hypothetical number of 4 subsegments. Accordingly, the subsegments of the segments in the usual sequence are N=4:[1,2,3,4].
In this case, the cryptographic key (CypherKey) with length 4 is a new random sequence; assuming the assignments 4 => 2, 1 => 4, 2 => 3, 3 => 1, the CypherKey has the form N=4:[4,3,1,2]. The cryptographic key is applied as follows: each position of the old subsegment is replaced by a new subsegment specified by the key; for example, data whose segments have the subsegment sequence [1,2,3,4] will be encoded as data whose segments have the subsegment sequence [4,3,1,2].
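The subsegment swap described in steps (i) to (iii) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 0-based key indices, and the use of Python's `secrets` module for key generation are assumptions. The key is read as "position `pos` receives the subsegment previously at index `key[pos]`", which matches the [1,2,3,4] to [4,3,1,2] example when written 0-based as [3,2,0,1].

```python
import secrets

SUBSEGMENTS = 256  # subsegments per segment, as in the description


def generate_cypher_key(length=SUBSEGMENTS):
    """Return a random permutation of range(length) (Fisher-Yates shuffle)."""
    key = list(range(length))
    for i in range(length - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        key[i], key[j] = key[j], key[i]
    return key


def swap(cypher_key, data, n=2):
    """Permute the subsegments of every full segment; a trailing remainder is copied unchanged."""
    count = len(cypher_key)
    sub = 2 ** n          # subsegment size in bytes
    seg = sub * count     # segment size in bytes
    out = bytearray(data)
    full = len(data) - len(data) % seg
    for base in range(0, full, seg):
        for pos, src in enumerate(cypher_key):
            # Position `pos` receives the subsegment that was at index `src`.
            out[base + pos * sub: base + (pos + 1) * sub] = \
                data[base + src * sub: base + (src + 1) * sub]
    return bytes(out)


def unswap(cypher_key, data, n=2):
    """Recover the original data by applying the inverse permutation."""
    inverse = [0] * len(cypher_key)
    for pos, src in enumerate(cypher_key):
        inverse[src] = pos
    return swap(inverse, data, n)


# The 4-subsegment example, with 1-byte subsegments (n=0) and a 2-byte remainder.
example = swap([3, 2, 0, 1], b"ABCDEFGHxy", n=0)
```

Here `example` holds the two permuted 4-byte segments followed by the untouched remainder `xy`, and `unswap` with the same key restores the input.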
It should be noted that the generation of the CypherKey is random and does not follow any rule. If the length of the cryptographic key is 4, then breaking the code is quite simple, because only the factorial of 4 (4!) candidate keys need to be tested, i.e. 4 × 3 × 2 × 1 = 24 candidates. However, when the length of the cryptographic key is N=256, the number of candidates to be tested is equal to the factorial of 256 (256!), which is a number so large that current computing power cannot handle it.
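To give a sense of scale for the two key lengths mentioned above, the number of candidate permutations can be computed directly. The helper names are hypothetical, and the figures only measure the size of the key space for exhaustive search.

```python
import math


def key_space(length):
    """Number of distinct permutations (candidate keys) of `length` elements."""
    return math.factorial(length)


def key_space_bits(length):
    """Size of the key space expressed in bits (log base 2 of the count)."""
    return math.log2(math.factorial(length))


small = key_space(4)                   # 4! candidate keys
digits_256 = len(str(key_space(256)))  # number of decimal digits of 256!
bits_256 = key_space_bits(256)         # roughly 1684 bits
```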
In another embodiment of the present invention, the file encryption is to be performed by permutation of codewords. The information of each file is represented by many codewords. In audio and video files that are compressed according to today's popular standards, the number of codewords used usually is 256. However, the number of codewords used to represent information in the file is not limited to 256.
According to this embodiment, the file encryption step is performed as follows:
(i) A cryptographic key (CypherKey), describing a sequence of codewords and having the same number of elements as the number of codewords, is randomly generated.
(ii) Codewords are swapped one by one: whenever the system encounters a codeword in the original file, it immediately replaces it with the corresponding codeword specified by the CypherKey.
For simplicity, an example can be considered with a file having the number of codewords of 4, including the codewords a, b, c, d.
Assume that the cryptographic key with length 4 is a new random sequence with the assignments a => b, b => c, c => d, d => a; then the CypherKey has the form N=4:[b,c,d,a]. This cryptographic key is applied by replacing each old codeword with the new codeword specified by the key; for example, the data [a,a,b,b,c,d,d] will be encoded as [b,b,c,c,d,a,a].
The generation of the CypherKey is completely random and does not follow any rule. If the length of the cryptographic key is 4, then breaking the code is quite simple, because only the factorial of 4 (4!) candidate keys need to be tested, i.e. 4 × 3 × 2 × 1 = 24 candidates. However, when the length of the cryptographic key is N=256, i.e. the number of codewords commonly used today in audio, image, and video files, the number of candidates that must be tested is equal to the factorial of 256 (256!), which is a number so huge that current computing power is not able to handle it.
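The codeword swap can be sketched in a similar way. The sketch assumes that each codeword is one byte value (256 codewords), which is an illustrative simplification; the function names are hypothetical. The key is a 256-entry table in which entry `c` holds the codeword that replaces `c`.

```python
import secrets


def generate_codeword_key(alphabet_size=256):
    """Return a random substitution table: index = old codeword, value = new codeword."""
    table = list(range(alphabet_size))
    for i in range(alphabet_size - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        table[i], table[j] = table[j], table[i]
    return bytes(table)


def substitute(key, data):
    """Replace every byte (codeword) by the one the key assigns to it."""
    return data.translate(key)


def invert_key(key):
    """Build the table that undoes `key`."""
    inverse = bytearray(256)
    for original, replacement in enumerate(key):
        inverse[replacement] = original
    return bytes(inverse)


# The 4-codeword example: a => b, b => c, c => d, d => a (all other bytes unchanged).
demo_table = bytearray(range(256))
for old, new in zip(b"abcd", b"bcda"):
    demo_table[old] = new
demo_key = bytes(demo_table)
encoded = substitute(demo_key, b"aabbcdd")
decoded = substitute(invert_key(demo_key), encoded)
```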
In a preferred embodiment, the file content protection process has an additional step 1b. Accordingly, the encrypted file (ciphertext) is conventionally divided into several blocks, and each block has a digital signature to ensure data integrity. In particular:
Step 1b. Once encrypted, the ciphertext is signed with a digital signature generated by the SHA one-way code to ensure integrity. Depending on its size, the ciphertext can be conventionally divided into one or more blocks; each block has an optimal size between 1MB and 16MB, because if the block is too large, the benefit of dividing into blocks is small, while if the block is too small, the number of digital signatures (hashes) increases, resulting in increased storage.
The digital signature (Hash) of each block of the ciphertext will be calculated according to the following formula:
Hash[j] = SHA(Hash[j-1], DataFile, DataBlock[j])
Wherein,
SHA is a one-way encryption function, which can be SHA1, SHA2, SHA3, Keccak, SHAKE, or the like.
Hash[j-1] is the hash of the previous block.
DataFile is the file after being encrypted (ciphertext).
DataBlock[j] is the jth block in the data.
In case the ciphertext has only one block or is the first block, Hash[j-1] will be null; The DataBlock will be equal to the ciphertext value (DataFile).
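A minimal sketch of the chained block hash is given below. The choice of SHA-256 and the plain concatenation of the three inputs are assumptions, since the text does not specify how the arguments of SHA are combined; `block_hashes` is a hypothetical name.

```python
import hashlib


def block_hashes(ciphertext, block_size=1024 * 1024):
    """Return Hash[j] for each block, chaining each hash to the previous one."""
    hashes = []
    previous = b""  # Hash[j-1] is null (empty) for the first block
    for start in range(0, len(ciphertext), block_size):
        block = ciphertext[start:start + block_size]
        h = hashlib.sha256()
        h.update(previous)    # Hash[j-1]
        h.update(ciphertext)  # DataFile (the whole ciphertext)
        h.update(block)       # DataBlock[j]
        previous = h.digest()
        hashes.append(previous)
    return hashes


# Tiny demonstration with 4-byte blocks instead of 1MB to 16MB blocks.
demo_hashes = block_hashes(b"abcdefghij", block_size=4)
```

Because every hash depends on the previous one, altering a block changes its own hash and all hashes that follow it.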
Step 2: set user password
When the user sets a password to protect the file, the system calculates the digital signature (hash) of the user's password according to the following formula:
Password_Hash= SHA(secretkey, username, password, permission)
Wherein,
- secretkey is the secret, private key of the system and is hidden;
- username is the user name;
- password is the password set by the user;
- permission is the user's right to the file; by convention, each permission is a natural number from 0 upward. For example, where the user can have 4 rights to the file, read permission is conventionally 1, copy is 2, share is 3, and the right to delete files is 4, etc.
The digital signature of the user's password is publicly stored, used to match the user password instead of storing the user password.
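The password signature can be sketched as below. SHA-256 and the length-prefixed encoding of the four fields are assumptions made for the illustration (the text only states which values enter the SHA function), and `password_hash` is a hypothetical helper name.

```python
import hashlib


def password_hash(secret_key: bytes, username: str, password: str, permission: int) -> str:
    """Compute Password_Hash = SHA(secretkey, username, password, permission)."""
    h = hashlib.sha256()
    fields = (secret_key, username.encode(), password.encode(), str(permission).encode())
    for field in fields:
        # A 4-byte length prefix keeps the field boundaries unambiguous.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.hexdigest()


# Placeholder values for illustration only.
stored = password_hash(b"system-secret", "alice", "correct horse", 1)
```

The same user name and password with a different permission value give a different signature, which is what later allows the permission to be recovered at login.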
Step 3: Encrypt the cryptographic key and user account information using the symmetric two-way encryption standard AES. In this invention, all encryption keys and user account information are encrypted using the symmetric two-way encryption standard AES. Accordingly:
- User account information is encrypted according to the following formula:
ProtectedUserInfo = AES.Encrypt(secretkey, username, password, permission)
- Each user has a unique encrypted CypherKey according to the following formula:
ProtectedCypherKey = AES.Encrypt(secretkey, password, CypherKey)
It should be noted that, although the present invention discloses the encryption of encryption keys and user account information using the AES symmetric two-way encryption standard, AES is given only as an example to illustrate the invention. A person of ordinary skill in the art can readily carry out the invention using other encryption standards. Therefore, the present invention is not limited to AES; the encryption can be carried out with other symmetric encryption standards, such as Twofish, Serpent, Blowfish, CAST5, RC4, Triple DES, and IDEA (International Data Encryption Algorithm).
Step 4: A "container file" containing all the information of the encrypted file is generated. The container file includes the following information:
(i) Ciphertext (datafile);
(ii) The digital signatures of the passwords of all permitted users/groups (password_hash);
(iii) The encrypted information about all users/user groups (ProtectedUserInfo) and all cryptographic keys (ProtectedCypherKey).
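One possible in-memory layout for the container file is sketched below. The text lists what the container holds but not how it is serialized, so the class names, field names, and the JSON/base64 encoding are all assumptions chosen for illustration.

```python
import base64
import json
from dataclasses import asdict, dataclass, field


@dataclass
class UserEntry:
    password_hash: str          # item (ii): digital signature of the password
    protected_user_info: str    # item (iii): encrypted user info, base64 text
    protected_cypher_key: str   # item (iii): encrypted CypherKey, base64 text


@dataclass
class ContainerFile:
    ciphertext: bytes                           # item (i): the swapped file contents
    users: list = field(default_factory=list)   # one UserEntry per user/group

    def to_json(self) -> str:
        return json.dumps({
            "datafile": base64.b64encode(self.ciphertext).decode("ascii"),
            "users": [asdict(u) for u in self.users],
        })

    @classmethod
    def from_json(cls, text: str) -> "ContainerFile":
        raw = json.loads(text)
        return cls(
            ciphertext=base64.b64decode(raw["datafile"]),
            users=[UserEntry(**u) for u in raw["users"]],
        )


# Placeholder values for illustration only.
container = ContainerFile(b"\x01\x02\x03", [UserEntry("abc123", "dXNlcg==", "a2V5")])
restored = ContainerFile.from_json(container.to_json())
```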
Step 5: An event data table, storing the entire history of the file from the time of initialization (initiated) and the events that happened to the file after that time according to blockchain technology (block chain), is generated and stored as an extension of the container file or as a separate file.
Accordingly, the first block of the blockchain is information about the initiation event, including information about the time, place, and author of the file creation; who can use the file and their permissions. When a new event occurs with the file, including access to read, copy; change user; change user permissions; delete file, etc., then a new block is created on the Event Data Table to record the event.
The initiation block and the subsequent blocks in the block chain will be sealed according to the following formula:
Block_Seal = SHA[Block_Seal previous block; event information]
In case a block is the initiation block, the parameter “Block_Seal previous block” is set to null.
In case the user copies a file, a new block on the original file will be created to record this event; also, on the newly created file, the copy event will be treated as the initiation event of that newly created file.
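The sealing of the event data table can be sketched as a simple hash chain. SHA-256, the JSON encoding of the event information, and the function names are assumptions; only the rule "seal = SHA[previous seal; event information], with a null previous seal for the initiation block" comes from the text.

```python
import hashlib
import json


def seal(previous_seal, event):
    """Block_Seal = SHA[Block_Seal of previous block; event information]."""
    payload = json.dumps(event, sort_keys=True).encode()
    prefix = (previous_seal or "").encode()  # null for the initiation block
    return hashlib.sha256(prefix + payload).hexdigest()


def append_event(table, event):
    """Add a new sealed block recording `event` to the event data table."""
    previous = table[-1]["seal"] if table else None
    table.append({"event": event, "seal": seal(previous, event)})


def verify(table):
    """Recompute every seal in order; any edited block breaks the chain."""
    previous = None
    for block in table:
        if block["seal"] != seal(previous, block["event"]):
            return False
        previous = block["seal"]
    return True


# Placeholder events for illustration only.
events = []
append_event(events, {"type": "initiation", "author": "alice"})
append_event(events, {"type": "read", "user": "bob"})
append_event(events, {"type": "copy", "user": "bob"})
```

Changing the recorded event of an earlier block makes `verify` fail, because the stored seal no longer matches the recomputed one.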
At the end of the process, the entire file contents with high information entropy has been encrypted by swapping codewords or swapping subsegments within segments of the file to form a ciphertext (DataFile), which is ready to be exposed to attack; the digital signature of the user's password is generated by a one-way SHA code; user information and encryption keys are protected by the AES encryption standard; and the blocks of the ciphertext are sealed with a one-way SHA code to ensure data integrity.
The ciphertext, the digital signature of the user's password, the seals of the blocks, and the cryptographic key and user account information encrypted with the two-way symmetric encryption standard AES are stored in a “container” file, while the initiation event and the events occurring with the file are stored in the Event Data Table as blocks of the block chain, sealed with a one-way SHA code.
In a preferred embodiment, the Event Data Table at the File Storage Device is synchronized with the Event Data Table on the Internet server system in Step 6.
Step 6: The event data table at the file storage medium is synchronized with the event data table on the Internet-connected server system, whereby:
- On the event data table at the file storage medium, there is an additional information field recording the global block seal (Block_Seal Global). When an event occurring with a file stored at the file storage medium is updated to the server system connected to the Internet, the server calculates the global block seal (Block_Seal Global) and sends it back for storage on the event data table at the file storage medium.
- Every time file event information is received from any File Storage Medium, the server system calculates the global block seal and stores it, together with the event, on the Event Data Table on the Internet-connected server system. The global block seal is calculated by the following formula:
Block_Seal Global = SHA[Block_Seal previous block; event information]
Clearly, the formula for calculating a global block seal is the same as the formula for calculating the local block seal (Block_Seal Local), that is, the seal of the block belonging to the file stored on the File Storage Medium. However, these two seals are different. The reason is that the global sequence of events consists of all events occurring with all files stemming from the original file stored at the server, while the local sequence of events consists only of the events that occur to the file stored on the File Storage Device.
For example, when there is an event (x) for a file stored in device A, which is the a-th event from the initiation event for that file, it will be stored with the local block seal according to the formula: Block_Seal Local (A) = SHA[Block_Seal (a-1); event information (x)]
However, when the same event is synchronized to the server system, it becomes the b-th event in the chain of events for all files. Thus, the global block seal is determined by the formula: Block_Seal Global (A) = SHA[Block_Seal (b-1); event information (x)]
By doing so, the system is aware of every event that happens to each file on all File Storage Medium, enabling unified control of all files on all storage media, as long as there is an internet connection.
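The difference between the local and the global seal of the same event can be illustrated with two chains that use the same formula but different histories. The `EventTable` class, the `record` function, the `seal_global` field name, and the use of SHA-256 over a simple string join are assumptions for the sketch.

```python
import hashlib


def seal(previous_seal, event_info):
    """SHA[seal of previous block; event information], with an empty previous seal at the start."""
    return hashlib.sha256((previous_seal + "|" + event_info).encode()).hexdigest()


class EventTable:
    """A chain of sealed blocks, used both on a device and on the server."""

    def __init__(self):
        self.blocks = []

    def last_seal(self):
        return self.blocks[-1]["seal"] if self.blocks else ""

    def append(self, event_info):
        block = {"event": event_info, "seal": seal(self.last_seal(), event_info)}
        self.blocks.append(block)
        return block


def record(local, server, event_info):
    """Seal the event locally, synchronize it, and store the returned global seal."""
    local_block = local.append(event_info)    # Block_Seal Local
    global_block = server.append(event_info)  # Block_Seal Global, computed by the server
    local_block["seal_global"] = global_block["seal"]
    return local_block


# Placeholder events for illustration only.
server = EventTable()
device_a, device_b = EventTable(), EventTable()
first = record(device_a, server, "read by alice on device A")
second = record(device_b, server, "read by bob on device B")
```

In this run `second` is the first event on device B but the second event on the server, so its local and global seals differ even though both come from the same formula.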
The high entropy file, after being protected by the method proposed in this invention, is decrypted for use by the following steps:
Step 1. The account information of the user, including password, name, will be loaded to open the container file.
On that basis, the LoginHash parameter will be calculated according to the following formula:
LoginHash = SHA(SecretKey, UserName, Password, P)
where P is the index of the attempt to check user rights, running from 1 to p; in the P-th attempt, the P-th user permission is tested.
In an embodiment of the present invention, the parameter p is equal to 5, which corresponds to the user having 5 permissions, including (1) permission to read the file; (2) permission to view the file's event history; (3) file copy rights; (4) permission to stop copying files and (5) permission to delete files. However, the number of user permissions is not limited by the above mentioned number of permissions.
The LoginHash parameters corresponding to the P values from 1 to p are calculated in turn. If the LoginHash calculated for P = a (a being any natural number between 1 and p) does not coincide with the digital signature of the password (password_hash) of any authorized user/group stored in the container file, the system recalculates the LoginHash with P = a+1, provided that a+1 is less than or equal to p.
If the LoginHash parameters calculated for all values of P from 1 to p do not match the digital signature of the password (password_hash) of any authorized user/group stored in the container file, the system considers the login failed. The system creates a new block on the event data table of the file to record the failed login event and seals that event (Block_Seal). According to a preferred embodiment, the event is also updated to an internet-connected server, which seals that event globally (Block_Seal Global).
If the LoginHash calculated for P = a (a being any natural number between 1 and p) matches the digital signature of the password (password_hash) of the k-th user among the authorized users/groups stored in the container file, the system determines that this is the k-th user, with user permission a.
By doing so, the user does not need to remember and load information about his or her permissions into the system when opening the file.
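The permission search in Step 1 can be sketched as a loop over candidate permission values. As before, SHA-256, the length-prefixed field encoding, and the names `password_hash` and `find_user` are assumptions; the sketch returns the 0-based index k of the matching stored signature together with the permission that produced the match.

```python
import hashlib
import hmac


def password_hash(secret_key: bytes, username: str, password: str, permission: int) -> str:
    """SHA(secretkey, username, password, permission) with length-prefixed fields."""
    h = hashlib.sha256()
    for item in (secret_key, username.encode(), password.encode(), str(permission).encode()):
        h.update(len(item).to_bytes(4, "big"))
        h.update(item)
    return h.hexdigest()


def find_user(secret_key, username, password, stored_hashes, max_permission=5):
    """Try P = 1..p; return (k, P) for the first stored hash that matches, else None."""
    for permission in range(1, max_permission + 1):
        login_hash = password_hash(secret_key, username, password, permission)
        for k, stored in enumerate(stored_hashes):
            if hmac.compare_digest(login_hash, stored):
                return k, permission
    return None  # login failed for every permission value


# Placeholder users for illustration only.
SECRET = b"system-secret"
container_hashes = [
    password_hash(SECRET, "alice", "pw-alice", 3),
    password_hash(SECRET, "bob", "pw-bob", 1),
]
alice = find_user(SECRET, "alice", "pw-alice", container_hashes)
intruder = find_user(SECRET, "alice", "wrong", container_hashes)
```

A `None` result corresponds to the failed-login case, at which point a sealed block recording the failure would be appended to the event data table.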
Step 2. Decrypt the k-th user's information with the password entered by the user, according to the following formula: UserInfo = AES.Decrypt(secretkey, password, ProtectedUserInfo)
If the username in the decrypted user information matches the username entered, the password is correct.
If not, the match of the digital signature of the user's password in step 1 is just a coincidence (with a very small probability) and the login fails. The system creates a new block on the Event Data Table of the file to record the failed login event and seals that event (Block_Seal). In a preferred embodiment, the event is also updated to an internet-connected server, which seals the event globally (Block_Seal Global).
Step 3. On the basis of the user password, the protected cryptographic key (ProtectedCypherKey) is decrypted to obtain the cryptographic key (CypherKey) according to the following formula:
CypherKey = AES.Decrypt(secretkey, password, ProtectedCypherKey)
Step 4. Use CypherKey to decrypt data protected by swapping the elements of the ciphertext.
The present invention also provides a system for implementing the method of protecting files with high information entropy using a combination of swap codes, AES encryption standards and blockchain technology. The system according to the invention comprises three components: (i) a file storage device; (ii) a device/software capable of encrypting/decrypting files; and (iii) an internet-connected server capable of storing file-related event information, or an internet-connected server capable of storing the file, all file-related event information, and all copies of the file stored on single File Storage Devices. The details of these components and how the system works have been described above.

Claims

WHAT IS CLAIMED IS
1. A method of protecting file contents with high information entropy, comprising: step 1: encrypt files with a swap code by generating a cryptographic key (CypherKey) randomly, then encrypt the file with a permutation of the cryptographic key (CypherKey) to generate a ciphertext according to the following formula:
DataFileciphertext = Swap(CypherKey, DataFilecontent) wherein DataFileciphertext is the ciphertext after the file is encrypted; DataFilecontent is the contents of the file unencrypted, CypherKey is the cryptographic key, and Swap is the permutation function; step 2: set up a password to protect the file, the system calculates a digital signature (hash) of the user's password according to the following formula: password_Hash= SHA(secretkey, username, password, permission) wherein, secret-key is the secret, private key of the system and is hidden,
- username is the user name,
- password is the password set by the user,
- permission is the user's permission to the file, and step 3 : encrypt the cryptographic key and user account information using a symmetric encryption standard.
2. The method of protecting the file contents with high information entropy according to claim 1, wherein: the symmetric encryption standard is chosen among symmetric encryption standards including Twofish, Serpent, AES, Blowfish, CAST5, RC4, Triple DES, and IDEA (International Data Encryption Algorithm).
3. The method of protecting the file contents with high information entropy according to claim 1, wherein the symmetric encryption standard is AES, in which:
- user account information is encrypted according to the following formula:
ProtectedUserInfo = AES.Encrypt(secretkey, username, password, permission)
- each user has a unique cryptographic key (CypherKey) encrypted according to the following formula: ProtectedCypherKey = AES.Encrypt(secretkey, password, CypherKey).
4. The method of protecting the file contents with high information entropy according to claim 1, further comprising the following steps: step 4: generate a "container file" to contain all the information of the encrypted file, including:
(i) a ciphertext (datafileciphertext)
(ii) a digital signature of the passwords of all authorized users/groups (password_hash),
(iii) encrypted information about all users/user groups (ProtectedUserInfo) and all cryptographic keys (ProtectedCypherKey), step 5: generate an event data table, which stores the entire history of the container file from the time it was generated (initiated) and the events that happened to the container file after the time of initialization according to blockchain technology, and archive it as an extension of the container file or as a separate file; wherein the initiation block and the subsequent blocks in the block chain are sealed according to the following formula: Block_Seal = SHA[Block_Seal previous block; event information]; wherein, if a block is the initiation block, then the parameter “Block_Seal previous block” is set to null; if the user copies a file, a new block on the original file will be created to record this event; meanwhile, on the newly copied file, the copy event will be treated as the initiation event of that newly copied file.
5. The method of protecting the file contents with high information entropy according to claim 4, further comprising step 6: synchronize the event data table at the file storage medium at step 5 with the event data table on the Internet-connected server system, in which: on the server containing the Global Event Data Table, containing all information related to the file, whereby the event information for the original file and all files copied from the original file stored on all File Storage Medium will be updated on the Global Event Data Table via the internet system; for each event, a new block is generated, along with a global block seal; the global block seal is calculated by the following formula:
Block_Seal Global = SHA[Block_Seal previous block; event information]
- on the event data table at the file storage medium, there is an additional information field recording a global block seal (block_seal_global); accordingly, when the event information is updated with the global system, a global block seal is generated and sent to the File Storage Medium for archiving.
6. The method of protecting the file contents with high information entropy according to claim 5, characterized in that, in addition to the file-related event information, the original file is also stored on a server system, allowing the user to exercise his/her rights to the file directly on the server, including search, access to view, download the storage media, delete or grant user permissions and other permissions.
7. The method of protecting the file contents with high information entropy according to claim 1 , wherein, the encryption of the file using the cryptographic key at step 1 is performed by permuting the subsegments in each segment of the file based on the cryptographic key respectively by the following procedure:
(i) split the file into several equal segments, each of size 2^n × 256 bytes, where n is a non-negative integer; in case the splitting has a remainder, the remainder is treated as a separate independent segment and is not encrypted; split each segment into 256 equal subsegments, each of size 2^n bytes;
(ii) describe a sequence of subsegments with length of 256 elements by a cryptographic key (Cypherkey) which is randomly generated;
(iii) swap the subsegments sequence in the segments, from the sequence (from 0 to 255, respectively) to a sequence specified by the cryptographic key.
8. The method of protecting the file contents with high information entropy according to claim 7, wherein, n is between 2 and 4.
9. The method of protecting the file contents with high information entropy according to claim 1, wherein, the encryption of the file using the cryptographic key at step 1 is performed by swapping codewords by the following procedure:
(i) describe the sequence of codewords, which has the same number of elements as the number of codewords, by a randomly generated cryptographic key; and (ii) swap the codewords in the file into the corresponding codewords specified by the CypherKey according to the swap principle of each codeword.
10. The method of protecting the file contents with high information entropy according to claim 9, wherein, the number of codewords is 256.
11. The method of protecting the file contents with high information entropy according to claim 1, wherein, the method further comprises an additional step 1b after step 1: step 1b: split the ciphertext into one or more blocks; and assign a digital signature to each block of the ciphertext; wherein the digital signature of each block of the ciphertext is calculated by the following formula:
Hash[j] = SHA(Hash[j-1], DataFile, DataBlock[j]) wherein,
- SHA is a one-way encryption function,
- Hash[j-1] is the digital signature of the previous block; if the ciphertext has only one block or is the first block, then Hash[j-1] will be null,
- DataFile is the file after being encrypted (ciphertext),
- DataBlock[j] is the jth block in the data; if the ciphertext has only one block or is the first block, the parameter DataBlock will be equal to the ciphertext value (DataFile).
12. The method of protecting the file contents with high information entropy according to claim 11, wherein, each block is conventionally sized between 1MB and 16MB.
13. A system for implementing the method of protecting the file contents according to claim 1, comprising: a storage device for storing and entropy compressing files; an encrypting/decrypting device for encrypting/decrypting the compressed files in the storage device and calculating parameters according to the method of claim 1; and an internet-connected server for storing file information, from the initiation event to the other file-related events in the storage device.
14. The system according to claim 13, characterized in that, an internet-connected server is capable of storing additional original files.
PCT/VN2022/000001 2021-02-08 2022-01-27 Method of protecting file contents with high information entropy using a combination of swap codes, aes encryption standard and blockchain technology and system for implementing the same WO2022170370A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
VN1-2021-00705 2021-02-08
VN1202100705 2021-02-08

Publications (1)

Publication Number Publication Date
WO2022170370A1 true WO2022170370A1 (en) 2022-08-11

Family

ID=82742545

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/VN2022/000001 WO2022170370A1 (en) 2021-02-08 2022-01-27 Method of protecting file contents with high information entropy using a combination of swap codes, aes encryption standard and blockchain technology and system for implementing the same

Country Status (1)

Country Link
WO (1) WO2022170370A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150310219A1 (en) * 2014-04-28 2015-10-29 Topia Technology, Inc. Systems and methods for security hardening of data in transit and at rest via segmentation, shuffling and multi-key encryption
WO2020170225A2 (en) * 2019-02-24 2020-08-27 Nili Philipp System and method for securing data

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22708027

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22708027

Country of ref document: EP

Kind code of ref document: A1