WO2016045641A2 - Data block storage method, data query method and data modification method - Google Patents

Data block storage method, data query method and data modification method

Info

Publication number
WO2016045641A2
Authority
WO
WIPO (PCT)
Prior art keywords
split
file
data block
data
content
Prior art date
Application number
PCT/CN2015/090993
Other languages
English (en)
French (fr)
Other versions
WO2016045641A3 (zh)
Inventor
周海燕
Original Assignee
北京古盘创世科技发展有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京古盘创世科技发展有限公司 filed Critical 北京古盘创世科技发展有限公司
Priority to EP15845133.6A priority Critical patent/EP3200094A4/en
Priority to US15/515,125 priority patent/US10521144B2/en
Publication of WO2016045641A2 publication Critical patent/WO2016045641A2/zh
Publication of WO2016045641A3 publication Critical patent/WO2016045641A3/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G06F3/0623 Securing storage systems in relation to content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices

Definitions

  • The present invention relates to the field of data control, and in particular to a data block storage method, a data query method, and a data modification method.
  • The carrier of cloud storage technology is a cloud system.
  • Cloud systems include public clouds and private clouds.
  • A private cloud is secure but is not convenient for large-scale calls.
  • The security of a public cloud feels weaker, both physically and psychologically, but its computing, storage, and bandwidth resources are relatively good.
  • Many users therefore face a dilemma: they want to store their own data on the public network, but doing so raises security concerns.
  • Data such as video, audio, text, mail, and pictures are the private data of individuals or public institutions. If such data is placed with a single service provider, then no matter how the provider encrypts it, the user feels that it is outside his or her own control, which causes psychological insecurity.
  • The main technology adopted worldwide is to encrypt the user's data with the provider's own encryption means and then store it.
  • The desired plaintext information can still be obtained from the data stored in the cloud system or another storage system and read directly.
  • The data block is the main carrier of data. If the file content of a data block is obtained maliciously, leaks can easily occur. There is therefore a need for a way to secure data blocks in a cloud storage environment.
  • An object of the present invention is to provide a data block storage method, a data query method, and a data modification method to solve the above problems.
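  • An embodiment of the present invention provides a data block storage method, including:
  • acquiring a data block to be split, where the data block to be split includes file content indicating its actual meaning;
  • splitting the file content of the data block to be split according to a preset first splitting rule to generate the file contents of at least two split data blocks;
  • storing the file contents of different split data blocks in at least two independent storage systems according to a preset first storage location.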
  • The data block storage method in the embodiment of the present invention further includes:
  • dividing the data block to be split into a plurality of sub-data blocks to be split, so that the file content of each sub-data block to be split is identical to a continuous portion of the file content of the data block to be split;
  • Splitting the file content of the data block to be split according to the preset first splitting rule includes:
  • splitting the file content of each sub-data block to be split separately to generate a plurality of split data blocks.
  • The data block to be split further includes a file header indicating its file structure;
  • the method further includes: splitting the file header of the data block to be split according to a preset second splitting rule to generate the file headers of at least two split data blocks. The file header of the data block to be split includes at least one character group; the character group is the smallest unit in which the file header of the data block to be split represents a meaning, and it includes multiple characters. The file header of each split data block includes some of the characters of the same character group, or includes some of the character groups of the file header of the data block to be split;
  • the file headers of different split data blocks are stored in at least two independent storage systems according to a preset initial storage location.
  • Splitting the file header of the data block to be split according to the preset second splitting rule to generate the file headers of at least two split data blocks includes:
  • splitting the structure code and the index table in the file header into a plurality of area file structure codes and index tables of a plurality of area files, where each area file structure code and the index table of each area file correspond to one sub-data block to be split, each area file structure code and its corresponding sub-data block to be split can be combined into a whole file that can be displayed directly, and the index table of each area file carries the file content index information of one specified sub-data block to be split;
  • splitting each area file structure code and the index table of each area file separately to form a plurality of sub-area file structure codes and index tables of a plurality of sub-area files, so that no sub-area file structure code can display the content of the corresponding area file structure code and no index table of a sub-area file can display the content of the index table of the corresponding area file; the sub-area file structure codes and the index tables of the sub-area files are carried in the file headers of at least two split data blocks.
  • When the character group of the file content of the data block to be split includes only fixed-length strings, and the total number of characters in the character group is divisible by Y, where Y is a fixed value, the fixed-length strings of the file content of the data block to be split are split sequentially to generate the file contents of X split data blocks.
  • The file content of each split data block includes one or more strings of length Y.
  • Splitting the character group of the file content into fixed-length strings according to the number of groups X and the split length Y carried in the first splitting rule includes:
  • splitting all characters of the file content of the data block to be split, in their order of arrangement, into a plurality of first character groups of length Y; and assigning the first character groups, in order, to the file contents of the X split data blocks.
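The fixed-length rule above can be sketched as follows. This is a minimal illustration, assuming that "assigned in order" means round-robin assignment; the function name `split_fixed` and the use of Python strings to stand for the characters are assumptions made for the example, not part of the patent text.

```python
def split_fixed(content: str, x: int, y: int) -> list[list[str]]:
    """Cut content into groups of length y and deal them in order to x blocks."""
    if x < 2 or y < 1:
        raise ValueError("need at least two split blocks and a positive length")
    if len(content) % y != 0:
        raise ValueError("total number of characters must be divisible by y")
    # First character groups, each of length y, in their original order.
    groups = [content[i:i + y] for i in range(0, len(content), y)]
    # Assumed assignment: group i goes to split block i mod x.
    blocks: list[list[str]] = [[] for _ in range(x)]
    for i, group in enumerate(groups):
        blocks[i % x].append(group)
    return blocks


if __name__ == "__main__":
    # The 8-character group used later in the text as the encoding of "I".
    print(split_fixed("11001110", x=2, y=2))
```

With X = 2 and Y = 2, each split block holds only fragments of the 8-character group, so neither block alone carries the original code.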
  • When the character group of the file content of the data block to be split includes only fixed-length strings, and the total number of characters in the character group is not divisible by Y, where Y is a fixed value, the character group of the file content of the data block to be split is filled with padding so that the total number of characters after padding is divisible by Y;
  • the padded character group of the file content of the data block to be split is then split into fixed-length strings to generate the file contents of X split data blocks.
  • The file content of each split data block includes one or more strings of length Y.
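A minimal sketch of the padding case follows. The padding character "0" and the idea of returning the pad length so that it can be removed again on reassembly are assumptions for illustration; the text only says that padding is added until the length is divisible by Y.

```python
def pad_to_multiple(content: str, y: int, pad_char: str = "0") -> tuple[str, int]:
    """Append pad_char until len(content) is divisible by y; return (padded, pad_len)."""
    pad_len = (-len(content)) % y
    return content + pad_char * pad_len, pad_len


def split_with_padding(content: str, x: int, y: int) -> tuple[list[list[str]], int]:
    """Pad, cut into groups of length y, and deal the groups in order to x blocks."""
    padded, pad_len = pad_to_multiple(content, y)
    groups = [padded[i:i + y] for i in range(0, len(padded), y)]
    # groups[i::x] collects every x-th group starting at i (round-robin, assumed).
    blocks = [groups[i::x] for i in range(x)]
    return blocks, pad_len


if __name__ == "__main__":
    print(split_with_padding("1100111011", x=2, y=4))
```

Whoever reassembles the content needs the pad length (or an equivalent marker) to strip the filler; where that value would be kept is not specified in the source.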
  • When the character group of the file content of the data block to be split includes only fixed-length strings, and the total number of characters in the character group is not divisible by Y, the split length Y may instead be a variable.
  • All characters of the file content of the data block to be split are split, in their order of arrangement, into multiple second character groups whose lengths are given by Y;
  • the second character groups are assigned, in order, to the file contents of the X split data blocks.
  • Splitting the characters, in order, into multiple second character groups whose lengths are given by Y includes:
  • randomly acquiring a plurality of function values;
  • rounding the acquired values to determine the variable Y.
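The variable-length case can be sketched as below. The choice of a uniform random value in a hypothetical range [1, 4] as the "function value", the clamping of the last group to the remaining characters, and the round-robin assignment are all assumptions; the source only states that random function values are rounded to obtain Y.

```python
import random


def variable_lengths(total: int, rng: random.Random,
                     low: float = 1.0, high: float = 4.0) -> list[int]:
    """Draw random values, round them, and use them as successive split lengths Y."""
    lengths = []
    remaining = total
    while remaining > 0:
        y = max(1, round(rng.uniform(low, high)))  # rounded function value
        y = min(y, remaining)                      # last group takes what is left
        lengths.append(y)
        remaining -= y
    return lengths


def split_variable(content: str, x: int, lengths: list[int]) -> list[list[str]]:
    """Cut content into groups with the given lengths and deal them to x blocks."""
    blocks: list[list[str]] = [[] for _ in range(x)]
    pos = 0
    for i, y in enumerate(lengths):
        blocks[i % x].append(content[pos:pos + y])
        pos += y
    return blocks


if __name__ == "__main__":
    rng = random.Random(2015)  # seeded only to make the demo repeatable
    text = "11001110001101"
    ys = variable_lengths(len(text), rng)
    print(ys, split_variable(text, 3, ys))
```

Because the group boundaries are no longer predictable, the sequence of Y values becomes part of what is needed to reassemble the content.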
  • When the character group of the file content of the data block to be split includes variable-length strings, the variable-length strings and the fixed-length strings in the character group are divided into two groups of strings;
  • the fixed-length strings of the file content of the data block to be split are then split according to the number of groups X and the split length Y carried in the first splitting rule to generate the file contents of X split data blocks, where the file content of each split data block includes one or more strings of length Y;
  • that each area file structure code and its corresponding sub-data block to be split can be combined into a whole file that can be displayed directly includes: they can be combined into a continuous passage of a specified number of words;
  • or they can be combined into a continuous video image of a specified length of time;
  • or they can be combined into continuous audio data of a specified length of time;
  • or they can be combined into a specified continuous image.
  • Storing the file contents of different split data blocks in at least two independent storage systems according to the preset first storage location includes:
  • storing the file contents of the plurality of split data blocks, in a predetermined storage ratio, in different public storage systems and/or private storage systems.
  • Storing the file headers of different split data blocks in at least two independent storage systems according to the preset first storage location includes:
  • storing the file headers of the plurality of split data blocks, in a predetermined storage ratio, in different public storage systems and/or private storage systems, where the storage locations of the file headers of the split data blocks differ from those of the file contents of the split data blocks.
  • An embodiment of the invention further provides a data query method, which includes the data block storage method and further includes:
  • combining the file contents of the plurality of split data blocks into the file content of the data block to be split, or into the file content of part of the data block to be split.
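As a rough illustration of the query-side recombination, the sketch below reverses an in-order (round-robin) assignment of character groups. It assumes the split blocks were produced that way and are supplied in their original block order; `merge_blocks` is a hypothetical name.

```python
from itertools import zip_longest


def merge_blocks(blocks: list[list[str]]) -> str:
    """Interleave the groups of each split block back into their original order."""
    pieces: list[str] = []
    # Round k of the interleaving takes the k-th group from every block in turn.
    for round_groups in zip_longest(*blocks):
        pieces.extend(g for g in round_groups if g is not None)
    return "".join(pieces)


if __name__ == "__main__":
    print(merge_blocks([["11", "11"], ["00", "10"]]))
```

`zip_longest` is used because the number of groups need not be a multiple of the number of blocks, in which case the later blocks are one group shorter.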
  • The data query method provided by the embodiment of the present invention further includes: acquiring a file header keyword corresponding to a file content keyword of the split data block;
  • combining the file content of the data block to be split, or the file content of part of the data block to be split, with the file header of the data block to be split to form the data block to be split, or part of the data block to be split.
  • The file header represents a structure code of the file structure and an index table of the file content; the structure code is used to form a frame for the file content, so that the file content can be filled into the frame to form a file that can be displayed directly;
  • Combining the file headers of the plurality of split data blocks into the file header of the data block to be split according to the preset second splitting rule includes:
  • combining the plurality of area file structure codes and the index tables of the plurality of area files according to a preset third splitting rule to form the file header of the data block to be split.
  • The data query method provided by the embodiment of the present invention further includes: acquiring a hidden data code, where the hidden data code is used to identify a hidden file in the data block;
  • combining, according to the preset first splitting rule, the file contents of the split data blocks corresponding to the plurality of split hidden data codes into the file content of the data block to be split, or into the file content of part of the data block to be split.
  • An embodiment of the invention further provides a data modification method, which includes the data block storage method and further includes:
  • The write mode includes deletion, addition, and replacement.
  • If the write mode is deletion, the character corresponding to the write position in the data block to be modified is deleted according to the write position and the first splitting rule, to generate the file content of the modified data block;
  • the character to be written is split according to the first splitting rule to generate a character to be written
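One way to picture how a write position and the first splitting rule together identify the stored character is the index arithmetic below. It assumes a fixed split length Y with groups dealt in order to X blocks; the function `locate` and its return convention are hypothetical, and how the remaining characters are realigned after a deletion is not described in the text, so it is left out.

```python
def locate(position: int, x: int, y: int) -> tuple[int, int, int]:
    """Map a character position in the original content to
    (split block index, group index inside that block, offset inside the group)."""
    group = position // y          # which length-y group holds the character
    return group % x, group // x, position % y


if __name__ == "__main__":
    content = "11001110"
    blocks = [["11", "11"], ["00", "10"]]  # content split with x=2, y=2
    b, g, o = locate(5, x=2, y=2)
    print(b, g, o, blocks[b][g][o] == content[5])
```

Only the split block that `locate` points to has to be touched to reach the character at the write position.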
  • In the prior art, the data to be stored is encrypted, so others can use reverse cracking to obtain the decryption password and thereby recover the original data from the stored data, which leaves the data in an unsafe state.
  • By contrast, in the data block storage method provided by the embodiment of the present invention, the file content of the data block to be stored is split to generate the file contents of at least two split data blocks.
  • The file content of the data block to be split includes at least one character group; the character group is the smallest unit in which the file content of the data block to be split represents a meaning, and it includes multiple characters. The file content of each split data block includes some of the characters of the same character group. That is, the unit in which the data block expresses its minimum meaning is itself split.
  • FIG. 1 is a basic flowchart of a data block storage method according to an embodiment of the present invention
  • FIG. 2 shows a basic flow chart of a data query method according to an embodiment of the present invention.
  • Embodiment 1 of the present invention provides a data block storage method, as shown in FIG. 1, which includes the following steps:
  • S101: Acquire a data block to be split, where the data block to be split includes file content indicating its actual meaning;
  • S102: Split the file content of the data block to be split according to a preset first splitting rule to generate the file contents of at least two split data blocks;
  • S103: Store the file contents of different split data blocks in at least two independent storage systems according to a preset first storage location.
  • A data block is a group of records arranged consecutively in order; it is the unit of data transferred between main memory and input/output devices or external memory.
  • A data block consists of two parts: the file header and the file content.
  • The file header carries the data structure, and the file content carries the data that expresses the actual meaning of the file.
  • The data block serves as the carrier of the data.
  • The usual encryption technique adjusts the file data in the data block: by changing how the data is expressed, it prevents a person reading the data from understanding its true meaning.
  • The key to changing the representation is the encryption key; that is, the representation of the data is changed through a computation that uses the encryption key.
  • However, with this way of enhancing data security through encryption, the real data can usually still be obtained by acquiring the secret key or by some other forced reading, which leads to leaks.
  • Given the characteristics of the data block, the file header and the file content can be split and stored separately to improve security. That is, the file content of the data block to be split must first be obtained.
  • The way in which the file content is obtained need not be limited.
  • For example, it can be obtained by a specified terminal or from the cloud, and the file content is then split accordingly.
  • The file content of the data block to be split includes at least one character group, where the character group is the smallest unit in which the file content of the data block to be split represents a meaning, and it includes multiple characters. The file content of each split data block includes some of the characters of the same character group.
  • A character group is a set of codes that represents, or corresponds to, a unit with a specific meaning.
  • For text, the meaning corresponding to a character group is the encoding of a character; for example, if the encoding of the word "I" is "11001110", the character group consists of 8 characters.
  • The content of other files, such as audio, video, graphics, and mail, can also be divided into such character groups, and a fixed code is likewise required within each character group.
  • Since the smallest language unit is the binary 0 and 1, the character group (whose number of characters is variable) is split so that the file content can no longer represent its original meaning.
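To make the idea of splitting inside a character group concrete, the sketch below treats each 8-bit byte as one character group and stores its upper four bits and lower four bits in two different split blocks. Choosing bytes as the group and nibbles as the parts is an assumption for illustration (it corresponds to X = 2, Y = 4 on an 8-character group); the function names are hypothetical.

```python
def split_bytes_by_nibble(data: bytes) -> tuple[bytes, bytes]:
    """Put the high 4 bits of every byte in one block and the low 4 bits in another."""
    high = bytes(b >> 4 for b in data)
    low = bytes(b & 0x0F for b in data)
    return high, low


def join_nibbles(high: bytes, low: bytes) -> bytes:
    """Recombine the two blocks into the original bytes."""
    return bytes((h << 4) | l for h, l in zip(high, low))


if __name__ == "__main__":
    # 0xCE is 11001110 in binary, the example code given in the text.
    high, low = split_bytes_by_nibble(b"\xce")
    print(format(high[0], "04b"), format(low[0], "04b"))
```

Neither block on its own contains a complete code for any character, so a reader holding only one block cannot decode the text.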
  • Common text formats (where the file content is text) have suffixes such as PDF, DOC, TXT, and WPS.
  • For audio, the meaning of a character group is the sound at a certain moment. If the sound is composed of multiple tones, the character group should correspond to each tone; that is, the characters corresponding to each tone need to be split so that the split characters cannot correspond to a specific sound.
  • Common audio and video streaming formats include: asf, Advanced Streaming Format (Microsoft); rm, Real Video/Audio (Progressive Networks); ra, Real Audio (Progressive Networks); swf, Shockwave Flash (Macromedia); QuickTime (Apple); viv, Vivo Movie (Vivo Software); mp4 (Moving Picture Experts Group); mp3 (Moving Picture Experts Group).
  • For images, the meaning represented by a character group is the pixel code of a certain pixel, or a collection of the codes of a certain number of pixels, where these few pixels together are not enough to display a feature of the image.
  • An image feature refers to a set of color regions in the image that can reflect its meaning, such as a collection of several pixels.
  • The codes of these pixels can be split (so that no pixel can display its original color), or the pixels can be split as a whole (to generate sets of sufficiently scattered pixels), for example into several pixels that are not adjacent to each other, so that even if these pixels are restored, the true meaning of the image cannot be known from them.
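Splitting the pixels "as a whole" into scattered, non-adjacent sets could look like the sketch below. The rule used here (group pixels by the parity of their row and column, giving four split blocks) and the storing of coordinates next to each value are assumptions chosen for illustration; the source does not prescribe a particular scattering pattern.

```python
def scatter_pixels(image: list[list[int]]) -> dict[int, list[tuple[int, int, int]]]:
    """Distribute pixels over four blocks so that no block holds two adjacent pixels.

    Each entry is (row, column, value); the block index is decided by the
    parity of the row and the column.
    """
    blocks: dict[int, list[tuple[int, int, int]]] = {0: [], 1: [], 2: [], 3: []}
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            blocks[(r % 2) * 2 + (c % 2)].append((r, c, value))
    return blocks


if __name__ == "__main__":
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    for index, pixels in scatter_pixels(image).items():
        print(index, pixels)
```

Within one block, any two pixels differ by at least two rows or two columns, so no block contains a contiguous patch of the picture.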
  • For video, the meaning represented by a character group is the image at a certain moment, or each pixel of the image at a certain moment.
  • A video consists of relatively static pictures (that is, individual frames of image).
  • Each relatively static picture is an image, so it can be split according to the method described above for images; that is, the image of each frame is split.
  • The file content of text, audio, images, or video can be split into two or more pieces of file content (the file contents of the split data blocks). The more pieces the file content of the data block is split into, the better the security: the split file contents are harder to reassemble, and reassembly is computationally more complex.
  • Step S103 is then performed.
  • The split file contents need to be stored in different independent storage systems; the more scattered the storage, the better the security. For example, if the split file contents A, B, C, and D are stored in storage systems A', B', C', and D' respectively, then even if someone obtains file content A through storage system A', they cannot obtain the file contents in the other storage systems and therefore cannot understand the true meaning of the complete file content.
  • After the files are stored, each storage system has a certain probability of failure. For example, if a storage system is infected by a virus, the split file content in it cannot be obtained; likewise, if the file content in a storage system is deleted, or damaged by any external force, the split file content cannot be obtained from outside.
  • Therefore, the N files obtained by splitting can be stored in N+1 independent storage systems, such that the file content in any one independent storage system can also be found in the other independent storage systems.
  • For example, given a sufficient number of split file contents, say five split file contents A, B, C, D, and E stored in five storage systems, A and B can be stored in one storage system, B and C in another, C and D in another, D and E in another, and E and A in another.
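The five-system example (AB, BC, CD, DE, EA) can be sketched as a ring in which every split file content is stored twice. The function names are hypothetical, and the sketch follows the worked example with N pieces in N systems rather than the N+1 variant mentioned just before it.

```python
def ring_placement(pieces: list[str]) -> list[tuple[str, str]]:
    """System i stores piece i and piece i+1 (wrapping around), as in AB, BC, ..., EA."""
    n = len(pieces)
    return [(pieces[i], pieces[(i + 1) % n]) for i in range(n)]


def surviving_pieces(placement: list[tuple[str, str]], failed: int) -> set[str]:
    """Pieces that can still be read when the storage system at index `failed` is lost."""
    return {piece
            for index, stored in enumerate(placement) if index != failed
            for piece in stored}


if __name__ == "__main__":
    placement = ring_placement(["A", "B", "C", "D", "E"])
    print(placement)
    print(sorted(surviving_pieces(placement, failed=2)))
```

With this layout, losing any single storage system still leaves every piece available in a neighbouring system, while no single system holds more than two of the five pieces.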
  • The data block storage method provided by the present invention further includes the following steps:
  • dividing the data block to be split into a plurality of sub-data blocks to be split, so that the file content of each sub-data block to be split is identical to a continuous portion of the file content of the data block to be split;
  • In step S102, splitting the file content of the data block to be split according to the preset first splitting rule includes:
  • splitting the file content of each sub-data block to be split separately to generate a plurality of split data blocks.
  • Dividing the data block to be split into multiple sub-data blocks to be split means cutting it into several segments: for example, an article is split into several consecutive paragraphs or sentence groups;
  • a video or audio file is split into several consecutive time periods of video or audio; and a picture is split into several contiguous, bordering regions.
  • The split data blocks can be combined into a sub-data block to be split. Since a sub-data block to be split has continuous file content, it can be restored into usable data on its own.
  • This part of the data can then be modified, which takes much less work than combining all the split data blocks into the whole data block to be split.
  • After all the split data blocks are stored in the cloud, modifying the file content does not require downloading all of them; only the split data blocks of the corresponding part need to be downloaded, assembled, and modified, or assembled first and then downloaded and modified. In this way, only part of the original data block (the data block to be split) has to be transmitted over the network, which greatly reduces the amount of data transmitted and saves network resources.
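The saving described above can be illustrated with a small in-memory stand-in for the cloud. Everything here is a sketch under assumptions: the dictionary `cloud` keyed by (segment index, split block index), the fixed segment length, and the round-robin split are hypothetical choices, not the patent's prescribed layout.

```python
def split_round_robin(text: str, x: int, y: int) -> list[list[str]]:
    """Cut text into groups of length y and deal them in order to x blocks."""
    groups = [text[i:i + y] for i in range(0, len(text), y)]
    return [groups[i::x] for i in range(x)]


def merge_round_robin(blocks: list[list[str]]) -> str:
    """Inverse of split_round_robin: group i sits in block i % x at index i // x."""
    x = len(blocks)
    total = sum(len(b) for b in blocks)
    return "".join(blocks[i % x][i // x] for i in range(total))


def store(content: str, segment_len: int, x: int, y: int) -> dict:
    """Divide content into continuous segments (sub-data blocks) and split each one."""
    cloud = {}
    for s, start in enumerate(range(0, len(content), segment_len)):
        segment = content[start:start + segment_len]
        for b, block in enumerate(split_round_robin(segment, x, y)):
            cloud[(s, b)] = block
    return cloud


def modify_segment(cloud: dict, s: int, x: int, y: int, edit) -> None:
    """Fetch only the x split blocks of segment s, edit the segment, store it again."""
    segment = merge_round_robin([cloud[(s, b)] for b in range(x)])
    for b, block in enumerate(split_round_robin(edit(segment), x, y)):
        cloud[(s, b)] = block


if __name__ == "__main__":
    cloud = store("aaaabbbbcccc", segment_len=4, x=2, y=2)
    modify_segment(cloud, 1, x=2, y=2, edit=str.upper)
    print(cloud)
```

Editing the middle segment reads and writes two stored blocks out of six; the blocks of the other segments are never transferred.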
  • The file header can also be split; that is, the data block to be split also includes a file header indicating its file structure;
  • After the data block to be split is obtained in step S101, the method further includes: splitting the file header of the data block to be split according to a preset second splitting rule to generate the file headers of at least two split data blocks. The file header of the data block to be split includes at least one character group; the character group is the smallest unit in which the file header of the data block to be split represents a meaning, and it includes multiple characters. The file header of each split data block includes some of the characters of the same character group, or includes some of the character groups of the file header of the data block to be split;
  • the file headers of different split data blocks are stored in at least two independent storage systems according to a preset initial storage location.
  • The file header can be split in the same way as the file content.
  • The characters carried in the file header represent the structure of the file.
  • The file header can also be split by character group; that is, a split file header can include complete character groups of the file header.
  • In other words, the file header to be split includes multiple character groups that together form the complete file structure; the file header is split into the file headers of multiple split data blocks, and the file header of each split data block includes some of the character groups of the file header of the data block to be split. The file header of a split data block thus contains part of the file structure information with a specific meaning, but knowing this information alone is not enough to reveal sufficient content, so it does not cause a leak.
  • The split file headers can form a file structure corresponding to a sub-data block to be split, and can thereby form part of the file information (a certain segment of audio or video, or a part of an image).
  • Splitting the file header of the data block to be split according to the preset second splitting rule to generate the file headers of at least two split data blocks includes the following steps:
  • S104: Obtain, from the file header, the structure code representing the file structure and the index table of the file content; the structure code is used to form a frame for the file content, so that the file content can be filled into the frame to form a file that can be displayed directly;
  • S105: Split the structure code into a plurality of area file structure codes and split the index table into index tables of a plurality of area files, where each area file structure code and the index table of each area file correspond to one sub-data block to be split, each area file structure code and its corresponding sub-data block to be split can be combined into a whole file that can be displayed directly, and the index table of each area file carries the file content index information of one specified sub-data block to be split;
  • S106: Split each area file structure code and the index table of each area file again to form a plurality of sub-area file structure codes and index tables of a plurality of sub-area files, so that no sub-area file structure code can display the content of the corresponding area file structure code and no index table of a sub-area file can display the content of the index table of the corresponding area file; the sub-area file structure codes and the index tables of the sub-area files are carried in the file headers of at least two split data blocks.
  • In step S104, the structure code and the index table of the file content both describe the file content: the index table describes where specified file content appears, and the structure code describes the presentation form of the file, such as a table or an array. Structure codes and index tables are usually indispensable parts of a file, or at least of a complex file; together they describe the architecture of the file.
  • In step S105, corresponding to the earlier decomposition of the data block to be split into multiple sub-data blocks to be split, the structure code is split into area file structure codes and the index table is split into area file index tables.
  • Both are designed to work with the sub-data blocks to be split, so that a sub-data block can be restored into a complete file, that is, a complete and continuous part of the data block to be split, such as part of a picture, a time segment of audio or video, or a sentence or group of sentences of text.
  • In step S106, the area file structure codes and area file index tables that have already been split are split again to generate multiple sub-area file structure codes and multiple sub-area file index tables.
  • No sub-area file structure code or sub-area file index table can express the real meaning: even if one of them is read, the architecture of the file cannot be learned.
  • The fourth splitting rule splits the code itself, that is, it breaks up the code of an area file structure code or of an area file index table, which protects the security of the file structure.
  • When the sub-area file structure codes and sub-area file index tables are stored, the two can be stored separately to improve security.
  • The sub-area file structure codes, the sub-area file index tables and the split file contents may likewise be stored separately for security.
  • When needed, the three are retrieved to restore a partial data block (one from which continuous file content can be restored).
  • To strengthen data block management, the file content restored from a partial data block may also be discontinuous, that is, discrete.
  • If the character groups of the file content of the data block to be split contain only fixed-length strings, and the total number of characters is divisible by Y, where Y is a fixed value, then the fixed-length strings of the file content are split in sequence according to the number of groups X and the split length Y carried in the first splitting rule, generating the file contents of X split data blocks.
  • The file content of each split data block includes one or more strings of length Y.
  • Most strings are fixed-length strings, that is, their length does not change with the storage location or storage form. The split length Y therefore determines how many pieces of length Y the characters are split into. Once the number of groups X and the split length Y are fixed, splitting is straightforward: all characters are first divided into character clusters of length Y, and each cluster is then placed in turn into its group. For example, each character cluster can be labeled with a number from 1 to X, and after labeling, each cluster is placed into the group that matches its label.
  • For example, take the following character array: 01010010, 00001001, 01010100, 10100101, 01010101, 00101011, 01001010, 10011111, 00101001, 10100100, 01010010, 01011001.
  • If X is 2 and Y is 8, the character clusters can be split into two groups:
  • X1 = 01010010, 01010100, 01010101, 01001010, 00101001, 01010010;
  • X2 = 00001001, 10100101, 00101011, 10011111, 10100100, 01011001.
  • If X is set to 3 and Y to 16, they can be split into three groups:
  • X1 = 01010010, 00001001, 01001010, 10011111;
  • X2 = 01010100, 10100101, 00101001, 10100100;
  • X3 = 01010101, 00101011, 01010010, 01011001.
  • To improve confidentiality, Y should preferably not be set to 1 or to the total number of characters.
  • Splitting can be performed sequentially. Splitting the fixed-length character groups of the file content according to the number of groups X and the split length Y carried in the first splitting rule then includes:
  • splitting all characters of the file content of the data block to be split, in their order of arrangement, into multiple first character clusters of length Y; and assigning the first character clusters, in order, to the file contents of the X split data blocks.
  • The total number of characters may also not be divisible by Y, that is, the remainder of the total divided by Y is not zero. In that case, if the character groups of the file content contain only fixed-length strings and Y is a fixed value, the character groups are filled with padding digits so that the total number of characters after padding is divisible by Y;
  • the padded fixed-length character groups are then split according to X and Y to generate the file contents of X split data blocks.
  • The file content of each split data block includes one or more strings of length Y.
  • In other words, the characters are padded first so that their number becomes divisible by Y; the padding characters can be preset characters with no actual meaning.
  • If the character groups of the file content contain only fixed-length strings, the total number of characters is not divisible by Y, and the split length Y is a variable, then all characters of the file content are split, in their order of arrangement, into multiple second character clusters whose lengths vary with Y;
  • each second character cluster is then assigned, in order, to the file contents of the X split data blocks.
  • Because Y varies, each cluster that is split off can have a different length, so reassembling the clusters requires knowing how Y changes, which further strengthens security.
  • Besides splitting the character clusters in order, they can also be split at alternate positions. If the code of a word is D4C6 4E91 E4 BA 91, the string can be split by its odd and even positions: D4C6 is split into DC and 46, 4E91 into 49 and E1, and E4BA91 into EB9 and 4A1. Alternate-position splitting likewise conceals the meaning of the original characters. Splitting can also follow a preset order, such as taking the first and last characters of each string: for D4C6 4E91 E4 BA 91 these are D (first character) and 1 (last character), so the two resulting pieces are D1 and 4C6 4E91 E4 BA 9.
  • Splitting can also follow a mathematical rule such as the Fibonacci sequence: the characters at positions 1, 2, 3, 5, 8, 13, 21 of a string are taken and the others are left, dividing the original string into two substrings.
  • Such a preset order can be part of the association rule. If Y takes the values 2, 4, 6, 8, 10, ..., increasing by 2 each time, the character clusters 00001111, 00001111, 11110000, 1111000 can be divided into 00, 0011, 110000, 11111111, 0000111100, 00... This increases the security of the data after splitting. Further, splitting into multiple second character clusters whose lengths vary with Y includes:
  • This splitting method ensures that every group receives the same number of characters, so each group needs the same storage space. Storage space can then be allocated more sensibly, which improves the efficiency of reading, storing and internal updating.
  • Strings that express the same meaning can be split in the same way, which reduces the workload and makes it easier to recombine the split data blocks.
  • A function operation can also be used to determine each value of Y.
  • The data block storage method provided by the present invention further includes the following steps:
  • the values to be used are rounded to determine the variable Y.
  • cos A = Y (A is greater than 0 degrees and less than 90 degrees, and not equal to 60 degrees);
  • tan A = Y (A is greater than 0 degrees and less than 90 degrees, and not equal to 45 degrees);
  • the N-th root of A = Y.
  • The result computed from different numbers may be a decimal, so it must be rounded; rounding may be up, down, or to the nearest integer.
  • Besides fixed-length character groups there are variable-length ones, such as UTF-8. If the character groups of the file content of the data block to be split include variable-length strings, the variable-length strings and the fixed-length strings are separated into two string groups;
  • if the total number of characters in the fixed-length string group is divisible by Y, the fixed-length
  • strings are split according to the number of groups X and the split length Y carried in the first splitting rule, generating the file contents of X split data blocks, each of which includes one or more strings of length Y;
  • If the data block to be split is in a text format, each area file structure code and its sub-data block to be split can be combined into a continuous passage of a specified number of words;
  • if it is in a video format, each area file structure code and its sub-data block to be split can be combined into a continuous video image of a specified duration;
  • if it is in an audio format, each area file structure code and its sub-data block to be split can be combined into continuous audio data of a specified duration;
  • if it is in an image format, each area file structure code and its sub-data block to be split can be combined into a specified continuous image.
  • Each sub-data block to be split therefore has a fixed meaning and can express definite content.
  • When the user wants to modify the content of the data block, it is not necessary to recombine and restore all the split data blocks; only one part needs to be restored into a sub-data block to be split.
  • This greatly reduces the amount of data processing. If the split data blocks are stored in a network (cloud) database, a terminal that wants content from the cloud database can likewise fetch only a small portion of the file content
  • and modify that portion, which greatly reduces the amount of data sent over the network while still meeting the modification requirements.
  • Besides splitting the data block, security can also be improved by changing where the data block is stored. Storing the file contents of different split data blocks in at least two independent storage systems according to the preset first storage location includes:
  • storing the file contents of the split data blocks, in a predetermined storage ratio, in different public storage systems and/or private storage systems.
  • Storing the file headers of different split data blocks in at least two independent storage systems according to the preset first storage location includes:
  • storing the file headers of the split data blocks, in a predetermined storage ratio, in different public storage systems and/or private storage systems, with the file header of a split data block stored in a different location
  • from the file content of that split data block.
  • A private storage system is highly secure but less convenient to access.
  • A public storage system is less secure but quick and convenient to access, so the storage ratio between public and private storage systems can be chosen according to
  • the nature of the split data blocks (whether the emphasis is on security or on convenient access and reading).
  • The split data blocks are stored in at least one public storage system and at least one private storage system.
  • According to the user's needs (higher security or better readability of the data block), some of the split data blocks can be stored in public storage systems (possibly spread over several public storage systems to increase the security of the new, split data blocks), and the remaining new data blocks can be stored in private storage systems (possibly spread over several private storage systems).
  • Combining all the split data blocks stored in the public and private storage systems restores the data block to be split (the original data block). Alternatively, a part of the split data blocks may be enough to restore the original data block, in which case that part must also be stored in at least one public storage system and at least one private storage system.
  • In the prior art, the data to be stored is protected only by encryption, so others can obtain the decryption password by reverse cracking and thereby recover the original data,
  • which leaves the data in an unsafe state.
  • By contrast, the data block storage method provided by the embodiment of the present invention splits the file content of the data block to be stored to generate the file contents of at least two split data blocks. The
  • file content of the data block to be split includes at least one character group; the character group is the smallest unit in which the file content expresses its meaning and consists of multiple characters, and the file content of each split data block includes only some characters of the same character group. The unit in which the data block expresses its minimum meaning is thus itself split.
  • Embodiment 2 of the present invention provides a data query method.
  • On the basis of the data block storage method of Embodiment 1, the method further includes the following steps after the split data blocks are stored, as shown in FIG. 2:
  • the file contents of the split data blocks are combined, according to the preset first splitting rule, into the file content of the data block to be split, or the file content of part of the data block to be split.
  • The keyword is a piece of split file content or file header; it may be recorded while the data block storage method of Embodiment 1 is executed.
  • The split data blocks obtained by splitting are identified to determine which split data block corresponds to the keyword, and hence which data block needs to be queried.
  • Alternatively, the keyword is split according to the first splitting rule (provided every string is split by the same rule) to determine the split data blocks that correspond to the keyword.
  • In step S203, the obtained split data blocks can be restored by reversing the split; what is restored may be the whole original data block.
  • The data query method of Embodiment 2 further includes: obtaining a file header keyword that corresponds to the file content keyword of the split data block;
  • the file content of the data block to be split, or of part of it, is combined with the file header of the data block to be split to form the data block to be split, or part of it.
  • The file header contains the structure code that represents the file structure and the index table of the file content; the structure code forms a frame for the file content, so that the file content can be filled in according to the frame to produce a file form that can be displayed directly;
  • combining the file headers of the split data blocks into the file header of the data block to be split according to the preset second splitting rule includes:
  • combining the area file structure codes and the area file index tables according to the preset third splitting rule to form the file header of the data block to be split.
  • The data query method provided by the present invention further includes: obtaining an implicit data code, which identifies a hidden file in the data block;
  • the file contents of the split data blocks that correspond to the split implicit data codes are combined, according to the preset first splitting rule, into the file content of the data block to be split, or the file content of part of it.
  • Embodiment 3 of the present invention provides a data modification method, which includes the data block storage method of Embodiment 1 and further includes the following.
  • The write mode is one of deletion, addition and replacement.
  • If the write mode is deletion, the characters at the write position are deleted from the data block to be modified, according to the write position and the first splitting rule, to generate the file content of the modified data block;
  • if the write mode is addition or replacement, the string to be written is split according to the first splitting rule to generate the characters to be written.
  • The modified content may first be stored separately and added at the specified position only after the modification is confirmed.
  • The specified position can be between two characters or between two paragraphs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

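The present invention relates to the field of data control, and in particular to a data block storage method, a data query method and a data modification method. The data block storage method splits the file content of the data block to be stored to generate the file contents of at least two split data blocks. The file content of the data block to be split includes at least one character group; the character group is the smallest unit in which the file content expresses its meaning and consists of multiple characters, and the file content of each split data block includes only some characters of the same character group. The unit in which the data block expresses its minimum meaning is thus itself split, so that even if someone obtains part of the split data, no valid content can be parsed from it. The split file contents are stored in at least two independent storage systems, which improves data security and remedies the deficiencies of the prior art.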

Description

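Data block storage method, data query method and data modification method
Technical Field
The present invention relates to the field of data control, and in particular to a data block storage method, a data query method and a data modification method.
Background
With the continuous development of Internet technology, products based on various cloud architectures keep emerging. Cloud storage technology is carried by cloud systems, which include public clouds and private clouds. A private cloud is secure but inconvenient to call on at large scale; a public cloud is, both in practice and in perception, somewhat less secure, but its computing, storage and bandwidth resources are far better. Many users therefore face the dilemma of having to put their data on a public network while worrying about its security.
For example, much of today's video, audio, text, mail and picture data is the private data of individuals or public institutions. If it is placed with a single service provider, then however the provider encrypts it, users regard it as out of their control and feel insecure. The main technique used worldwide at present is for the provider to encrypt the user's data by its own means and then store it.
Predictably, by obtaining the encryption password, or by using a brute-force tool to compute the decryption password in reverse, the desired plaintext can be extracted from data stored in a cloud system or another storage system and read directly. Data blocks are the main carrier of data; if the file content of a data block is maliciously obtained, a leak readily occurs. A method is therefore needed to solve the security problem of data blocks in a cloud storage environment.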
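Summary of the Invention
The object of the present invention is to provide a data block storage method, a data query method and a data modification method that solve the above problems.
An embodiment of the present invention provides a data block storage method, which includes:
obtaining a data block to be split, the data block to be split including file content that expresses its actual meaning;
splitting the file content of the data block to be split according to a preset first splitting rule to generate the file contents of at least two split data blocks, where the file content of the data block to be split includes at least one character group, the character group is the smallest unit in which the file content expresses its meaning and includes multiple characters, and the file content of each split data block includes some characters of the same character group;
storing the file contents of the different split data blocks in at least two independent storage systems according to a preset first storage location.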
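Preferably, the data block storage method provided in the embodiment of the present invention further includes:
dividing the data block to be split, in a preset division manner, into multiple sub-data blocks to be split, such that the file content of each sub-data block is identical to a continuous part of the file content of the data block to be split;
splitting the file content of the data block to be split according to the preset first splitting rule then includes:
splitting the file content of each sub-data block to be split according to the data block splitting rule to generate multiple split data blocks.
Preferably, the data block to be split further includes a file header that represents its file structure;
after the data block to be split is obtained, the method further includes: splitting the file header of the data block to be split according to a preset second splitting rule to generate the file headers of at least two split data blocks, where the file header of the data block to be split includes at least one character group, the character group is the smallest unit in which the file header expresses its meaning and includes multiple characters, and the file header of each split data block includes either some characters of the same character group or some of the character groups of the file header of the data block to be split;
storing the file headers of the different split data blocks in at least two independent storage systems according to the preset first storage location.
Preferably, splitting the file header of the data block to be split according to the preset second splitting rule to generate the file headers of at least two split data blocks includes:
obtaining, from the file header, the structure code that represents the file structure and the index table of the file content, where the structure code forms a frame for the file content so that the file content can be filled in according to the frame to produce a file form that can be displayed directly;
splitting the structure code and the index table according to a preset third splitting rule to generate multiple area file structure codes and, for each area file structure code, a corresponding area file index table, where each area file structure code and each area file index table corresponds to one sub-data block to be split, each area file structure code can be combined with its sub-data block into a whole file that can be displayed directly, and each area file index table carries the file content index information of one specified sub-data block to be split;
splitting the code of each area file structure code and of each area file index table according to a preset fourth splitting rule to form multiple sub-area file structure codes and multiple sub-area file index tables, such that no sub-area file structure code can display the content of its area file structure code and no sub-area file index table can display the content of its area file index table, the sub-area file structure codes and sub-area file index tables being carried in the file headers of at least two split data blocks.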
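Preferably, if the character groups of the file content of the data block to be split contain only fixed-length strings, and the total number of characters in those character groups is divisible by Y, where Y is a fixed value, the fixed-length strings of the file content are split in sequence according to the number of groups X and the split length Y carried in the first splitting rule to generate the file contents of X split data blocks, the file content of each split data block including one or more strings of length Y.
Preferably, splitting the fixed-length character groups of the file content according to the number of groups X and the split length Y carried in the first splitting rule includes:
splitting all characters of the file content of the data block to be split, in their order of arrangement, into multiple first character clusters of length Y; and assigning the first character clusters, in order, to the file contents of the X split data blocks.
Preferably, if the character groups of the file content of the data block to be split contain only fixed-length strings, and the total number of characters is not divisible by Y, where Y is a fixed value, the character groups are filled with padding digits so that the total number of characters after padding is divisible by Y;
the padded fixed-length character groups are split according to the number of groups X and the split length Y carried in the first splitting rule to generate the file contents of X split data blocks, the file content of each split data block including one or more strings of length Y.
Preferably, if the character groups of the file content of the data block to be split contain only fixed-length strings, the total number of characters is not divisible by Y, and the split length Y is a variable, all characters of the file content are split, in their order of arrangement, into multiple second character clusters whose lengths vary with Y;
each second character cluster is assigned, in order, to the file contents of the X split data blocks.
Preferably, splitting into multiple second character clusters whose lengths vary with Y includes:
for each value taken by Y, consecutively splitting off X character clusters of equal length.
Preferably, the method further includes: randomly obtaining multiple function values;
performing a function operation on each function value to generate multiple values to be used, the function operations including trigonometric operations and exponential and logarithmic operations;
rounding each value to be used to determine the variable Y.
Preferably, if the character groups of the file content of the data block to be split include variable-length strings, the variable-length strings and the fixed-length strings in the character groups are separated into two string groups;
if the total number of characters in the fixed-length string group is divisible by the split length Y, the fixed-length strings are split according to the number of groups X and the split length Y carried in the first splitting rule to generate the file contents of X split data blocks, the file content of each split data block including one or more strings of length Y;
the character length of the variable-length string group is determined and the group is split according to that length to generate multiple split variable-length string groups, which are assigned in order to the file contents of the X split data blocks.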
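Preferably, if the data block to be split is in a text format, each area file structure code being combinable with its sub-data block to be split into a whole file that can be displayed directly means that they can be combined into a continuous passage of a specified number of words;
if the data block to be split is in a video format, it means that they can be combined into a continuous video image of a specified duration;
if the data block to be split is in an audio format, it means that they can be combined into continuous audio data of a specified duration;
if the data block to be split is in an image format, it means that they can be combined into a specified continuous image.
Preferably, storing the file contents of the different split data blocks in at least two independent storage systems according to the preset first storage location includes:
storing the file contents of the split data blocks, in a predetermined storage ratio, in different public storage systems and/or private storage systems.
Preferably, storing the file headers of the different split data blocks in at least two independent storage systems according to the preset first storage location includes:
storing the file headers of the split data blocks, in a predetermined storage ratio, in different public storage systems and/or private storage systems, with the file header of a split data block stored in a different location from the file content of that split data block.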
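An embodiment of the present invention further provides a data query method, which includes the data block storage method and further includes:
obtaining a file content keyword;
querying the independent storage systems for the file contents of the split data blocks that correspond to the file content keyword;
combining the file contents of the split data blocks, according to the preset first splitting rule, into the file content of the data block to be split, or the file content of part of the data block to be split.
Preferably, the data query method further includes: obtaining a file header keyword that corresponds to the file content keyword of the split data block;
querying the cloud system for the file headers of the split data blocks that contain the file header keyword;
combining the file headers of the split data blocks, according to the preset second splitting rule, into the file header of the data block to be split;
combining the file content of the data block to be split, or of part of it, with the file header of the data block to be split to form the data block to be split, or part of it.
Preferably, the file header contains the structure code that represents the file structure and the index table of the file content, where the structure code forms a frame for the file content so that the file content can be filled in according to the frame to produce a file form that can be displayed directly;
combining the file headers of the split data blocks into the file header of the data block to be split according to the preset second splitting rule includes:
obtaining the area file structure codes and the area file index tables of the split data blocks;
combining the area file structure codes and the area file index tables according to the preset third splitting rule to form the file header of the data block to be split.
Preferably, the data query method further includes: obtaining an implicit data code, which identifies a hidden file in the data block;
splitting the implicit data code according to a preset implicit data code splitting rule to obtain at least two split implicit data codes;
querying the cloud system for the file contents of the split data blocks that contain the implicit data code;
combining, according to the preset first splitting rule, the file contents of the split data blocks that correspond to the split implicit data codes into the file content of the data block to be split, or the file content of part of it.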
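An embodiment of the present invention also provides a data modification method, which includes the data block storage method and further includes:
obtaining a string to be written, a write position and a write mode, the write mode being deletion, addition or replacement;
locating the data block to be modified in the independent storage systems according to the write position and the previously obtained first storage location, where the first storage location includes the storage address of the file content of each split data block;
if the write mode is deletion, deleting the characters at the write position from the data block to be modified, according to the write position and the first splitting rule, to generate the file content of the modified data block;
if the write mode is addition or replacement, splitting the string to be written according to the first splitting rule to generate the characters to be written;
deleting the characters at the write position from the file content of the split data block, and adding the characters to be written, at the write position, to the file content of the new data block to be modified, to generate the file contents of multiple modified data blocks.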
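How a write position maps onto the split data blocks can be sketched as follows. This is only an illustration under stated assumptions: the first splitting rule is taken to be the sequential round-robin split with a fixed number of groups X and a fixed split length Y, each split data block is modeled as one concatenated string, and the names `locate` and `replace_char` are hypothetical. Only same-length replacement is shown; deletion or addition shifts every later character, so under this rule the affected content would have to be split again.

```python
def locate(pos: int, x: int, y: int) -> tuple[int, int]:
    """Map a 0-based position in the original content to (block index, offset in block)."""
    cluster = pos // y                      # which length-Y cluster holds the character
    block = cluster % x                     # clusters are dealt to blocks in turn
    offset = (cluster // x) * y + pos % y   # earlier clusters in that block, then position inside
    return block, offset


def replace_char(blocks: list[str], pos: int, new_char: str, x: int, y: int) -> list[str]:
    """Return a copy of the blocks with the character at `pos` replaced."""
    block, offset = locate(pos, x, y)
    updated = list(blocks)
    updated[block] = blocks[block][:offset] + new_char + blocks[block][offset + 1:]
    return updated


# "abcdefghijkl" split with X=3, Y=2 gives these three blocks.
blocks = ["abgh", "cdij", "efkl"]
print(locate(7, 3, 2))                     # position of "h"
print(replace_char(blocks, 7, "H", 3, 2))
```

Only the one split data block that holds the write position has to be fetched and rewritten.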
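In the prior art, the data to be stored is protected only by encryption, so others can obtain the decryption password by reverse cracking and recover the original data to be stored, which leaves the data in an unsafe state. By contrast, the data block storage method provided by the embodiment of the present invention splits the file content of the data block to be stored to generate the file contents of at least two split data blocks. The file content of the data block to be split includes at least one character group; the character group is the smallest unit in which the file content expresses its meaning and consists of multiple characters, and the file content of each split data block includes only some characters of the same character group. The unit in which the data block expresses its minimum meaning is thus itself split, so that even if someone obtains part of the split data, no valid content can be parsed from it. The split file contents are stored in at least two independent storage systems, which improves data security and remedies the deficiencies of the prior art.
Brief Description of the Drawings
FIG. 1 shows the basic flowchart of the data block storage method of an embodiment of the present invention;
FIG. 2 shows the basic flowchart of the data query method of an embodiment of the present invention.
Detailed Description
The present invention is described in further detail below through specific embodiments and with reference to the drawings.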
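Embodiment 1 of the present invention provides a data block storage method which, as shown in FIG. 1, includes the following steps:
S101: obtain a data block to be split, the data block to be split including file content that expresses its actual meaning;
S102: split the file content of the data block to be split according to a preset first splitting rule to generate the file contents of at least two split data blocks;
S103: store the file contents of the different split data blocks in at least two independent storage systems according to a preset first storage location.
In step S101, a data block is a group of records arranged consecutively in order; it is the unit of data transferred between main memory and input or output devices or external storage. A data block has two parts, the file header and the file content. The file header carries the data architecture, and the file content carries the data that expresses the actual meaning of the file. Conventional encryption adjusts the file data inside the data block, changing how the data is represented so that a reader cannot learn its true meaning; the key to changing the representation is the encryption key, through whose computation the representation is changed. But data protected by encryption (symmetric or asymmetric) can usually still be recovered by obtaining the key or by some other forced reading, which leads to leaks. In view of this, the file header and the file content can each be split and stored according to the characteristics of the data block, to improve security. The first step is therefore to obtain the file content of the data block to be split. The way the file content is obtained need not be limited; to make processing more secure, it can be obtained, and split accordingly, by a designated terminal or by the cloud.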
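In step S102, the file content of the data block to be split includes at least one character group; the character group is the smallest unit in which the file content expresses its meaning and includes multiple characters, and the file content of each split data block includes some characters of the same character group. A character group is a piece of code that represents, or corresponds to, one unit of interpretation with a specific meaning.
If the file content is text, the character group corresponds to the encoding of a character: if the character "我" is encoded as "11001110", the character group consists of those 8 characters. Audio, video, graphics, mail and other file contents can likewise be divided into such character groups, which also need a fixed encoding. In computer language the smallest unit is the binary 0 and 1, so splitting a character group (the number of characters in a character group is variable) prevents the file content from expressing its original meaning. Common text format suffixes (where the file content is text) include PDF, DOC, TXT and WPS.
If the file content is audio, the character group represents the sound at a given moment. If that sound is a composite of several simple tones, each simple tone is what should be split: the characters corresponding to each simple tone are split so that the split characters no longer correspond to any specific sound. Common audio and video streaming formats include: asf, Advanced Streaming Format (Microsoft); rm, Real Video/Audio (Progressive Networks); ra, Real Audio (Progressive Networks); swf, Shock Wave Flash (Macromedia); mov, QuickTime (Apple); viv, Vivo Movie (Vivo Software); mp4 (Motion Picture Experts Group); mp3 (Motion Picture Experts Group).
If the file content is an image, the character group represents the pixel code of one pixel, or a set of the codes of a few pixels that, appearing together, cannot show a feature of the image. A feature of the image is a set of color regions that reflects its meaning, such as a set of several pixels. When such pixels are split, the code of each pixel can be split separately (so that no pixel can show its original color), or the pixels can be split as a whole (to produce sufficiently scattered sets of pixels), for example into several mutually non-adjacent pixels. Then even if these pixels are restored, the real meaning of the image cannot be learned from them. After splitting into scattered pixels, the code of each pixel, or of several pixels, can be split further to improve security again.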
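The idea of splitting an image into sets of mutually non-adjacent pixels can be sketched as below. The text does not say how the scattered sets are chosen; the row and column parity scheme and the name `scatter_pixels` are assumptions made only for illustration.

```python
def scatter_pixels(width: int, height: int) -> dict[tuple[int, int], list[tuple[int, int]]]:
    """Partition pixel coordinates into four sets keyed by (row parity, column parity).

    Two distinct pixels in the same set differ by at least 2 in row or column,
    so no set contains a pair of neighbouring pixels, even diagonally.
    """
    groups = {(r, c): [] for r in (0, 1) for c in (0, 1)}
    for row in range(height):
        for col in range(width):
            groups[(row % 2, col % 2)].append((row, col))
    return groups


for key, pixels in scatter_pixels(4, 4).items():
    print(key, pixels)
```

Each of the four sets could then be stored as the file content of a different split data block, optionally after the pixel codes themselves are split further.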
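If the file content is video, the character group represents the image at a given moment, or each pixel of that image. At any given moment a video is a relatively static picture (one frame), which is simply an image, so a frame can be split in the way described above for images. Whether the file content is text, audio, image or video, it can be split into two or more file contents (the file contents of split data blocks). The more file contents are produced, the better the security, but likewise the harder it is, and the more computation it takes, to reassemble them.
After the file content is split it must be stored, which is step S103. To improve security, the split file contents are stored in different independent storage systems; the more scattered the storage, the better the security. For example, the split file contents A, B, C and D can be stored in A', B', C' and D' respectively. Even if someone obtains file content A from storage system A', they cannot obtain the file contents in the other storage systems and so cannot learn the real meaning of the complete file content.
Once files are stored, each storage system has some probability of failure: it may be invaded by a virus so that the split file contents cannot be retrieved, or its file contents may be deleted or destroyed by external force. To allow for this, the N file contents obtained by splitting can be stored in N+1 independent storage systems in such a way that the file content held in any one independent storage system can also be found in other independent storage systems.
For example, suppose splitting yields five file contents of split data blocks (split file contents) A, B, C, D and E, any three of which can be combined to restore the original data block (the data block to be split). Each storage system can then hold at most two split file contents, so that no single storage system holds enough of them to restore the data block to be split. The five split file contents can be stored in five storage systems as AB, BC, CD, DE and EA. Three or four split file contents could also be stored in each storage system, but the more split file contents one storage system holds, the worse the security, because a single storage system can more easily restore the data block to be split. It is therefore suggested that 2 to 3 split file contents be stored in each independent storage system.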
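The AB, BC, CD, DE, EA arrangement can be sketched as a ring placement. The function names and the way storage systems are modeled (a list of lists) are hypothetical; the threshold of three fragments is the one given in the example above.

```python
def ring_placement(fragments: list[str], copies: int = 2) -> list[list[str]]:
    """Store i holds fragment i and the next (copies - 1) fragments, wrapping around."""
    n = len(fragments)
    return [[fragments[(i + k) % n] for k in range(copies)] for i in range(n)]


def survives_loss(placement: list[list[str]], lost: int) -> bool:
    """Check that every fragment is still available after one store fails."""
    everything = {f for store in placement for f in store}
    remaining = {f for i, store in enumerate(placement) if i != lost for f in store}
    return remaining == everything


placement = ring_placement(["A", "B", "C", "D", "E"])
print(placement)
# No single store holds three fragments, the number assumed sufficient for restoration.
print(all(len(store) < 3 for store in placement))
print(all(survives_loss(placement, i) for i in range(len(placement))))
```

With two copies per fragment, the loss of any one storage system leaves every fragment retrievable, while no single system holds enough fragments to restore the original on its own.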
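Further, the data block storage method provided by the present invention also includes the following steps:
dividing the data block to be split, in a preset division manner, into multiple sub-data blocks to be split, such that the file content of each sub-data block is identical to a continuous part of the file content of the data block to be split;
step S102, splitting the file content of the data block to be split according to the preset first splitting rule, then includes:
splitting the file content of each sub-data block to be split according to the data block splitting rule to generate multiple split data blocks.
Dividing the data block to be split into multiple sub-data blocks means dividing it into several passages: an article is divided into several continuous paragraphs or sentence groups; a video or audio clip into several continuous time segments; a picture into several regions with continuous borders. The next round of splitting then builds on this division. When the file needs to be modified, the relevant split data blocks are combined into a sub-data block to be split. Because a sub-data block has continuous file content, it restores a part of the data that has a specific meaning, and only that part needs to be modified, which takes far less work than combining all the split data blocks into the data block to be split. Moreover, when all the split data blocks are stored in the cloud, modifying the file content does not require downloading all of them: only the relevant split data blocks are downloaded, assembled and then modified, or they are assembled first, then downloaded and modified. Only part of the original data block (the data block to be split) then travels over the network, which greatly reduces the amount of data transmitted and saves network resources.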
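A minimal sketch of this two-level scheme follows, assuming fixed-size segments and a round-robin split inside each segment; the segment length, the function names and the in-memory list standing in for the storage systems are all assumptions for illustration.

```python
def split_round_robin(chars: str, x: int, y: int) -> list[list[str]]:
    """Cut chars into length-y clusters and deal them to x groups in turn."""
    clusters = [chars[i:i + y] for i in range(0, len(chars), y)]
    return [clusters[g::x] for g in range(x)]


def join_round_robin(groups: list[list[str]]) -> str:
    """Inverse of split_round_robin."""
    x = len(groups)
    total = sum(len(g) for g in groups)
    return "".join(groups[i % x][i // x] for i in range(total))


def store_in_segments(content: str, seg_len: int, x: int, y: int) -> list[list[list[str]]]:
    """Divide content into continuous sub-blocks, then split each sub-block separately."""
    segments = [content[i:i + seg_len] for i in range(0, len(content), seg_len)]
    return [split_round_robin(seg, x, y) for seg in segments]


stored = store_in_segments("abcdefghijklmnopqrstuvwx", seg_len=8, x=2, y=2)
print(join_round_robin(stored[1]))                  # restore only the second sub-block
stored[1] = split_round_robin("IJKLMNOP", 2, 2)     # modify it and split it again
print("".join(join_round_robin(s) for s in stored))
```

Only the fragments of the sub-block being edited are fetched and rewritten; the fragments of the other sub-blocks are never touched.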
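Besides the file content, the file header can also be split; that is, the data block to be split further includes a file header that represents its file structure.
In step S101, after the data block to be split is obtained, the method further includes: splitting the file header of the data block to be split according to a preset second splitting rule to generate the file headers of at least two split data blocks, where the file header of the data block to be split includes at least one character group, the character group is the smallest unit in which the file header expresses its meaning and includes multiple characters, and the file header of each split data block includes either some characters of the same character group or some of the character groups of the file header of the data block to be split;
storing the file headers of the different split data blocks in at least two independent storage systems according to the preset first storage location.
The file header can be split in the same way as the file content. The characters carried in the file header represent the file structure. Besides being split and stored like the file content, the file header can be split by character group, so that a split file header may contain complete character groups. That is, the file header to be split contains multiple character groups, and all the character groups together form the complete file structure. The file header is split into the file headers of multiple split data blocks, each of which contains some of the character groups of the file header of the data block to be split. The file header of a split data block thus carries part of the file structure information with a specific meaning, but knowing that information alone is not enough to display meaningful content, so no leak results. Moreover, the split file headers can form the file structure corresponding to a sub-data block to be split, and so can form part of the file information (a segment of audio or video, or part of an image).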
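Further, splitting the file header of the data block to be split according to the preset second splitting rule to generate the file headers of at least two split data blocks includes the following steps:
S104: obtain, from the file header, the structure code that represents the file structure and the index table of the file content, where the structure code forms a frame for the file content so that the file content can be filled in according to the frame to produce a file form that can be displayed directly;
S105: split the structure code and the index table according to a preset third splitting rule to generate multiple area file structure codes and, for each area file structure code, a corresponding area file index table, where each area file structure code and each area file index table corresponds to one sub-data block to be split, each area file structure code can be combined with its sub-data block into a whole file that can be displayed directly, and each area file index table carries the file content index information of one specified sub-data block to be split;
S106: split the code of each area file structure code and of each area file index table according to a preset fourth splitting rule to form multiple sub-area file structure codes and multiple sub-area file index tables, such that no sub-area file structure code can display the content of its area file structure code and no sub-area file index table can display the content of its area file index table, the sub-area file structure codes and sub-area file index tables being carried in the file headers of at least two split data blocks.
In step S104, the structure code and the index table of the file content both describe the file content: the index table describes where specified file content appears, and the structure code describes the presentation form of the file, such as a table or an array. Structure codes and index tables are usually indispensable parts of a file, or at least of a complex file; together they describe the architecture of the file.
In step S105, corresponding to the earlier decomposition of the data block to be split into multiple sub-data blocks to be split, the structure code is split into area file structure codes and the index table into area file index tables. Both work with the sub-data blocks to be split, so that a sub-data block can be restored into a complete file, that is, a complete and continuous part of the data block to be split, such as part of a picture, a time segment of audio or video, or a sentence or sentence group of text.
In step S106, the area file structure codes and area file index tables that have already been split are split again to generate multiple sub-area file structure codes and multiple sub-area file index tables. No sub-area file structure code or sub-area file index table can express the real meaning: even if one of them is read, the architecture of the file cannot be learned. In this step, the fourth splitting rule splits the code itself, that is, it breaks up the code of an area file structure code or of an area file index table, which protects the security of the file structure.
When the sub-area file structure codes and sub-area file index tables are stored, the two can be stored separately to improve security. Likewise, when the split file contents are stored, the sub-area file structure codes, the sub-area file index tables and the split file contents can all be stored separately. When needed, the three are retrieved to restore a partial data block (one from which continuous file content can be restored). To strengthen data block management, the file content restored from a partial data block may also be discontinuous, that is, discrete.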
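Further, if the character groups of the file content of the data block to be split contain only fixed-length strings, and the total number of characters in those character groups is divisible by Y, where Y is a fixed value, the fixed-length strings of the file content are split in sequence according to the number of groups X and the split length Y carried in the first splitting rule to generate the file contents of X split data blocks, the file content of each split data block including one or more strings of length Y.
Most strings are fixed-length strings, that is, their length does not change with the storage location or storage form. The split length Y therefore determines how many pieces of length Y the characters are split into. Once the number of groups X and the split length Y are fixed, splitting is straightforward: all characters are first divided into character clusters of length Y, and each cluster is then placed in turn into its group. For example, each character cluster can be labeled with a number from 1 to X, and after labeling, each cluster is placed into the group that matches its label.
For example, take the following character array: 01010010, 00001001, 01010100, 10100101, 01010101, 00101011, 01001010, 10011111, 00101001, 10100100, 01010010, 01011001. If X is 2 and Y is 8, the character clusters can be split into two groups:
X1 = 01010010, 01010100, 01010101, 01001010, 00101001, 01010010;
X2 = 00001001, 10100101, 00101011, 10011111, 10100100, 01011001.
If X is set to 3 and Y to 16, they can be split into three groups:
X1 = 01010010, 00001001, 01001010, 10011111;
X2 = 01010100, 10100101, 00101001, 10100100;
X3 = 01010101, 00101011, 01010010, 01011001.
Similarly, X can be any number smaller than the total number of characters, and Y can be any number that divides the total number of characters (for example X=9, Y=2 or X=5, Y=4). To improve confidentiality, Y should preferably not be set to 1 or to the total number of characters.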
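The sequential split in the worked example can be reproduced with a short sketch. The function names are hypothetical and the data is modeled as a Python string of "0" and "1" characters; the X and Y values and the input are those of the example.

```python
def split_round_robin(chars: str, x: int, y: int) -> list[list[str]]:
    """Cut chars into clusters of length y and deal them to x groups in turn."""
    if len(chars) % y != 0:
        raise ValueError("total number of characters must be divisible by Y")
    clusters = [chars[i:i + y] for i in range(0, len(chars), y)]
    groups: list[list[str]] = [[] for _ in range(x)]
    for index, cluster in enumerate(clusters):
        groups[index % x].append(cluster)
    return groups


def join_round_robin(groups: list[list[str]]) -> str:
    """Reassemble the original string; cluster i sits in group i % x at index i // x."""
    x = len(groups)
    total = sum(len(g) for g in groups)
    return "".join(groups[i % x][i // x] for i in range(total))


data = "".join([
    "01010010", "00001001", "01010100", "10100101", "01010101", "00101011",
    "01001010", "10011111", "00101001", "10100100", "01010010", "01011001",
])
for name, group in zip(("X1", "X2"), split_round_robin(data, x=2, y=8)):
    print(name, group)
print(join_round_robin(split_round_robin(data, x=3, y=16)) == data)
```

Reassembly needs both X and the order in which clusters were dealt out, which is why the splitting rule has to be kept alongside the storage locations.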
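The order of the labels can also be shuffled when the character clusters are labeled, for example by labeling alternate clusters or by labeling odd and even positions differently, to improve security.
Splitting can be performed sequentially. Splitting the fixed-length character groups of the file content according to the number of groups X and the split length Y carried in the first splitting rule then includes:
splitting all characters of the file content of the data block to be split, in their order of arrangement, into multiple first character clusters of length Y; and assigning the first character clusters, in order, to the file contents of the X split data blocks.
The total number of characters may also not be divisible by Y, that is, the remainder of the total divided by Y is not zero. In that case: if the character groups of the file content of the data block to be split contain only fixed-length strings, the total number of characters is not divisible by Y, and Y is a fixed value, the character groups are filled with padding digits so that the total number of characters after padding is divisible by Y;
the padded fixed-length character groups are split according to the number of groups X and the split length Y carried in the first splitting rule to generate the file contents of X split data blocks, the file content of each split data block including one or more strings of length Y.
In other words, the characters are padded first so that their number becomes divisible by Y; the padding characters can be preset characters with no actual meaning.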
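The padding step can be sketched as follows. The choice of "0" as the padding character and the decision to return the pad length alongside the padded string (so that it can be removed on restoration) are assumptions; the text only requires a preset character with no actual meaning.

```python
PAD_CHAR = "0"  # assumed padding character


def pad_to_multiple(chars: str, y: int, pad: str = PAD_CHAR) -> tuple[str, int]:
    """Append padding so that the length becomes divisible by y; return (padded, pad length)."""
    remainder = len(chars) % y
    pad_len = 0 if remainder == 0 else y - remainder
    return chars + pad * pad_len, pad_len


def strip_padding(padded: str, pad_len: int) -> str:
    """Remove the padding that pad_to_multiple added."""
    return padded[:len(padded) - pad_len]


padded, pad_len = pad_to_multiple("1010101", y=4)
print(padded, pad_len)
print(strip_padding(padded, pad_len))
```

The pad length has to be recorded with the splitting rule, because a padding character that also occurs in the data cannot be recognised by inspection.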
若待拆分数据块的文件内容的字符组只包括定长字符串,且待拆分数据块的文件内容的字符组中字符总量不能被Y整除,其中拆分长度Y为变量,则将待拆分数据块的文件内容的全部字符,按照排列的先后顺序,依次拆分为多组长度为按照Y变化的第二字符群;
按照第二字符群的前后顺序,依次将每组第二字符群分配到X个已拆分数据块的文件内容中。
也就是Y的值是一个变化值,这样每次分出来的字符群的长度均可以是不相同,那么在组装这些字符群的时候,也就需要知悉Y的变化规律,这样又进一步加强了安全性。
除了将字符群按照先后顺序进行拆分,还能够将字符群进行隔位拆分,如某字的编码是D4C6 4E91 E4 BA 91,那么,可以将该字符串的按照其奇偶数位的顺序进行拆分。如D4C6拆分为DC和C6,将4E91拆分为49和E1,将E4BA91拆分为EB9和4A1,如此各位拆分也同样能够起到隐含原字符所对应含义的作用。进一步,除了上述提及的隔位拆分,还可以是按照预先设定的顺序进行拆分,如取每个字符串的首尾字符,D4C6 4E91 E4 BA 91取得首尾字符也就是D(首字符)和1(尾字符),那么拆分出来的两个组合就是D1和4C6 4E91 E4 BA 9。还可以是按照一定的数学规则进行拆分,如,按照斐波那契数列进行拆分,也就是取得一组字符串中1、1、2、3、5、8、13、21位置的数字,并且留下其他的数字,以将原字符串区分为两组子字符串,这种预先设定的顺序可以是关联方式中的一部分。如Y的值为2,4,6,8,10…也就是Y的取值,每次要增加2,那么对于如下的字符群可以如此的划分:00001111,00001111,11110000,1111000划分为,00,0011,110000,11111111,0000111100,00…可通过如此的方式来增加拆分后的数据安全性。进一步,依次拆分为多组长度为按照Y变化的第二字符群包括:
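for each value taken by Y, consecutively splitting off X character clusters of equal length.
On top of splitting according to varying values of Y, to ensure that every group receives the same number of characters, each value of Y can be used several times in a row (X times). For example, the 18 digits 123456789098765432 can be split with Y = 2, 1, 3 into 3 groups (X = 3). The 18 digits are first divided into 9 clusters, namely 12, 34, 56, 7, 8, 9, 098, 765, 432, and these clusters are then assigned to the three groups: X1=12,7,098; X2=34,8,765; X3=56,9,432. This way of splitting guarantees that every group receives the same number of characters, so the storage space needed for each group is the same. Storage space can therefore be allocated more sensibly, which improves the efficiency of reading and storing as well as of updates inside the computer.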
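The equal-size variant, in which each value of Y is applied X times in a row, can be sketched as follows. The function name and the strict length check are assumptions of this illustration.

```python
def split_equal_variable(chars: str, x: int, y_values: list[int]) -> list[list[str]]:
    """For each y in y_values, cut x consecutive clusters of length y and give
    one to each group, so every group ends up with the same character count."""
    if len(chars) != x * sum(y_values):
        raise ValueError("len(chars) must equal x * sum(y_values)")
    groups: list[list[str]] = [[] for _ in range(x)]
    pos = 0
    for y in y_values:
        for group in groups:
            group.append(chars[pos:pos + y])
            pos += y
    return groups


groups = split_equal_variable("123456789098765432", x=3, y_values=[2, 1, 3])
# -> [["12", "7", "098"], ["34", "8", "765"], ["56", "9", "432"]]
```

Each of the three groups receives 2 + 1 + 3 = 6 characters, matching the example in the description.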
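It should be noted that, to make it easier to determine where data blocks are stored, strings that express the same meaning can be split in the same way. This reduces the workload and also makes it easier to combine the split data blocks.
Besides generating multiple Y values in advance and using them as random variables, each varying value of Y can also be determined by a function operation. To this end, the data block storage method provided by the present invention further includes the following steps:
randomly obtaining multiple function values;
performing a function operation on each function value to generate multiple values to be used, the function operation including trigonometric operations and exponential/logarithmic operations;
rounding each value to be used to an integer to determine the variable Y.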
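Examples of trigonometric and exponential/logarithmic operations are:
sin A = Y (A greater than 0 degrees and less than 90 degrees, and not equal to 30 degrees);
cos A = Y (A greater than 0 degrees and less than 90 degrees, and not equal to 60 degrees);
tan A = Y (A greater than 0 degrees and less than 90 degrees, and not equal to 45 degrees);
log of C to base A = Y (C greater than 0 and not equal to 1; C must not be a multiple of A);
ln A = Y (A not equal to 1 and not equal to e);
the N-th root of A = Y.
Different inputs may produce fractional results, which must then be rounded to an integer; the rounding may be up, down, or to the nearest integer.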
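The derivation of Y from a function operation followed by rounding can be sketched as follows. The helper name `derive_lengths`, the clamp to a minimum of 1 (so that no cluster has length zero), and the input range are assumptions of this sketch and are not stated in the original.

```python
import math
import random
from typing import Callable, Iterable


def derive_lengths(
    inputs: Iterable[float],
    func: Callable[[float], float],
    rounding: Callable[[float], int] = math.ceil,
    minimum: int = 1,
) -> list[int]:
    """Apply `func` to every input and round the result to an integer Y.

    `rounding` may be math.ceil, math.floor or round. Values below `minimum`
    are clamped (an assumption of this sketch) so every Y is usable.
    """
    return [max(minimum, rounding(func(value))) for value in inputs]


# Randomly obtained inputs; the fixed seed only makes the sketch repeatable.
rng = random.Random(2015)
inputs = [rng.uniform(2.0, 1000.0) for _ in range(5)]
lengths = derive_lengths(inputs, math.log)        # natural logarithm, rounded up

log_lengths = derive_lengths([10, 100, 1000], math.log)                 # [3, 5, 7]
root_lengths = derive_lengths([5, 20, 50], math.sqrt, rounding=round)   # [2, 4, 7]
```

Whoever reassembles the data needs the same Y sequence, so in practice either the inputs or the resulting lengths would have to be kept with the splitting rule.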
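Besides fixed-length character groups there are variable-length character groups, such as UTF-8. If the character groups of the file content of the data block to be split include variable-length strings, the variable-length strings and the fixed-length strings in those character groups are separated into two string groups;
if the total number of characters in the fixed-length string group is divisible by the split length Y, the fixed-length strings of the file content in the data block to be split are split according to the number of groups X and the split length Y carried in the first splitting rule, to generate the file content of X split data blocks, the file content of each split data block including one or more strings of length Y;
the character length of the variable-length string group is determined and the group is split according to that length, to generate multiple split variable-length string groups, which are assigned in turn to the file content of the X split data blocks.
If the character length of the fixed-length string group is not divisible by Y, filler characters can be added as described above until the length of the string group is divisible by Y, after which the grouping is performed.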
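Further, if the data block to be split is in text format, the condition that each regional file structure code and the corresponding sub-data-block to be split can be combined into a directly displayable whole file includes: each regional file structure code and the corresponding sub-data-block to be split can be combined into a continuous passage of a specified number of words;
if the data block to be split is in video format, the condition includes: each regional file structure code and the corresponding sub-data-block to be split can be combined into continuous video footage of a specified duration;
if the data block to be split is in audio format, the condition includes: each regional file structure code and the corresponding sub-data-block to be split can be combined into continuous audio data of a specified duration;
if the data block to be split is in image format, the condition includes: each regional file structure code and the corresponding sub-data-block to be split can be combined into a specified continuous image.
That is, every sub-data-block to be split has its own fixed meaning and can express certain content. When a user wants to modify the content of a data block, it is therefore unnecessary to combine and restore all of the split data blocks; only part of them needs to be restored into a sub-data-block to be split. This greatly reduces the amount of data to be processed. Moreover, if the split data blocks are stored in a network database (cloud database), a terminal that wants content from the cloud database can fetch only a small part of the file content and modify that part. Provided the modification meets the requirements, the amount of data transmitted over the network is greatly reduced.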
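Besides splitting the data block, security can also be improved by changing where the data blocks are stored. Storing the file content of different split data blocks in at least two independent storage systems according to the preset initial storage locations includes:
storing the file content of multiple split data blocks in different public storage systems and/or private storage systems according to a predetermined storage ratio.
Storing the data in different storage systems also improves security. Different split data blocks may additionally be stored in different cloud systems to improve security further.
Alternatively, storing the file headers of different split data blocks in at least two independent storage systems according to the preset initial storage locations includes:
storing the file headers of multiple split data blocks in different public storage systems and/or private storage systems according to a predetermined storage ratio, the file header of a split data block being stored at a different location from the file content of that split data block.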
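It should also be noted that a private storage system offers high security but is less convenient to access, whereas a public storage system offers lower security but is more convenient to access. The proportion stored in public versus private storage systems can therefore be chosen according to the nature of the data block to be split (whether security or convenient access and reading is emphasised). Preferably, the split data blocks are stored in at least one public storage system and at least one private storage system. Depending on the user's needs (higher security or better readability), some of the split data blocks may be stored in public storage systems (possibly spread across several public storage systems to improve the security of the new, that is, split, data blocks) and the remaining new data blocks in private storage systems (possibly spread across several private storage systems for the same reason). The data block to be split (the original data block) may be restorable only after all the split data blocks stored in the public and private storage systems have been combined; alternatively, it may be restorable after only a subset of the split data blocks has been combined, in which case that subset must also be stored across at least one public storage system and one private storage system. For example, the split data blocks may be stored in the ratio public storage system : private storage system = 3:7.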
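A storage-ratio assignment such as 3:7 can be sketched as follows. The store names, the round-robin spread across several stores of the same kind, and the rule that at least one block goes to each kind of store are assumptions made for this illustration.

```python
def assign_storage(
    block_ids: list[str],
    public_stores: list[str],
    private_stores: list[str],
    public_parts: int = 3,
    private_parts: int = 7,
) -> dict[str, str]:
    """Map each split data block to a storage system so that the number of
    blocks in public vs. private stores approximates the given ratio."""
    if len(block_ids) < 2:
        raise ValueError("need at least two split data blocks")
    total_parts = public_parts + private_parts
    n_public = round(len(block_ids) * public_parts / total_parts)
    # Keep at least one block in each kind of storage system.
    n_public = min(max(n_public, 1), len(block_ids) - 1)

    placement = {}
    for i, block_id in enumerate(block_ids):
        if i < n_public:
            placement[block_id] = public_stores[i % len(public_stores)]
        else:
            placement[block_id] = private_stores[(i - n_public) % len(private_stores)]
    return placement


blocks = [f"block-{n}" for n in range(10)]
placement = assign_storage(blocks, ["public-a", "public-b"], ["private-a", "private-b"])
```

With ten split data blocks, three are placed in the public stores and seven in the private stores. The resulting mapping plays the role of the preset initial storage locations and would have to be kept for later queries.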
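In the prior art, the data to be stored is merely encrypted, so others can obtain the decryption password by reverse cracking and thereby recover the original data to be stored, which leaves the data insecure. By comparison, the data block storage method provided by the embodiments of the present invention splits the file content of the data block to be stored to generate the file content of at least two split data blocks. The file content of the data block to be split includes at least one character group; a character group is the smallest unit in which that file content expresses its meaning and includes multiple characters; and the file content of each split data block includes only some of the characters of the same character group. The smallest meaningful unit of the data block is thus itself split, so even if someone obtains part of the split data, no valid content can be parsed from it. The split file content is furthermore stored in at least two independent storage systems, which improves data security. In addition, the structure code and the index table in the file header are each split and stored separately, and the storage ratio between public and private storage systems can be varied, so that the shortcomings of the prior art are better addressed.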
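Embodiment 2 of the present invention provides a data query method, which includes the data block storage method provided by Embodiment 1 and, after the split data blocks have been stored, further includes the following steps, as shown in FIG. 2:
S201, obtaining a file content keyword;
S202, querying the independent storage systems for the file content of multiple split data blocks corresponding to the file content keyword;
S203, combining, according to the preset first splitting rule, the file content of the multiple split data blocks into the file content of the data block to be split, or into part of the file content of the data block to be split.
In step S201, the keyword is a piece of file content, or of a file header, that has already been split. It can be recorded while the data block storage method of Embodiment 1 is being executed: for example, after the split the terminal keeps a character that serves as an identifier, or, while the split is being performed, the resulting split data blocks are labelled, so that the split data block to which the keyword corresponds can be determined. In this way the data block to be queried can be identified.
In step S202, the keyword is split according to the first splitting rule (provided that every kind of string uses the same splitting rule) to determine the split data blocks to which the keyword corresponds.
In step S203, the retrieved split data blocks are restored by reversing the splitting scheme; either the whole original data block or part of it may be restored.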
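The restoration in step S203 is the inverse of the split. A minimal sketch for the fixed X, Y round-robin case follows; it assumes the groups have already been fetched from the storage systems and are supplied in group order, and the function name is hypothetical.

```python
def merge_fixed(groups: list[list[str]]) -> str:
    """Interleave clusters from the groups round-robin to rebuild the
    original character sequence (inverse of a round-robin split)."""
    x = len(groups)
    total = sum(len(group) for group in groups)
    return "".join(groups[i % x][i // x] for i in range(total))


# Two groups as produced by a split with X = 2, Y = 2 of "12345678".
restored = merge_fixed([["12", "56"], ["34", "78"]])   # "12345678"

# A partial restoration only needs the leading clusters of each group.
partial = merge_fixed([["12"], ["34"]])                # "1234"
```

The partial call illustrates how only part of the original data block can be restored when only some clusters are retrieved.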
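Further, the data query method provided by Embodiment 2 of the present invention also includes: obtaining a file header keyword corresponding to the file content keyword of the split data blocks;
querying the cloud system for the file headers of multiple split data blocks that contain the file header keyword;
combining, according to the preset second splitting rule, the file headers of the multiple split data blocks into the file header of the data block to be split;
combining the file content of the data block to be split, or part of that file content, with the file header of the data block to be split to form the data block to be split, or part of the data block to be split.
Further, the file header includes a structure code representing the file structure and an index table of the file content; the structure code is used to form a framework for the file content so that the file content can be filled in according to that framework to produce a directly displayable file form;
combining, according to the preset second splitting rule, the file headers of the multiple split data blocks into the file header of the data block to be split includes:
obtaining multiple regional file structure codes and multiple regional file index tables of the multiple split data blocks;
combining the multiple regional file structure codes and the multiple regional file index tables, respectively, according to the preset third splitting rule, to form the file header of the data block to be split.
Further, the data query method provided by the present invention also includes: obtaining a hidden data code, the hidden data code being used to identify a hidden file within a data block;
splitting the hidden data code according to a preset hidden data code splitting rule to obtain at least two split hidden data codes;
querying the cloud system for the file content of multiple split data blocks that contain the hidden data code;
combining, according to the preset first splitting rule, the file content of the split data blocks corresponding to the multiple split hidden data codes into the file content of the data block to be split, or into part of the file content of the data block to be split.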
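Embodiment 3 of the present invention provides a data modification method, which includes the data block storage method of Embodiment 1 and further includes:
obtaining a string to be written, a write position and a write mode, the write mode including deletion, addition and replacement;
locating the data block to be modified in the independent storage systems according to the write position and the previously obtained initial storage locations, the initial storage locations including the storage address of the file content of each split data block;
if the write mode is deletion, deleting, according to the write position and the first splitting rule, the characters in the data block to be modified that correspond to the write position, to generate the file content of a modified data block;
if the write mode is addition or replacement, splitting the string to be written according to the first splitting rule to generate characters to be written;
deleting the characters in the file content of the split data blocks that correspond to the write position, and adding the characters to be written, according to the write position, to the file content of the new data blocks to be modified, to generate the file content of multiple modified data blocks.
As for how the modification is carried out, the modified content may first be stored separately and, once the modification has been confirmed, added at the specified position. The specified position may be between two characters, between two paragraphs, and so on.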
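Locating the characters that correspond to a write position can be sketched as follows for the fixed X, Y round-robin split. The sketch assumes each split data block stores its clusters concatenated into one string and covers only a same-length replacement of a single character; additions and deletions shift later characters and would need more bookkeeping. All names are hypothetical.

```python
def locate(position: int, x: int, y: int) -> tuple[int, int]:
    """Map a 0-based character position in the original content to
    (group index, offset inside that group's concatenated content)."""
    cluster = position // y            # which length-y cluster holds the character
    group = cluster % x                # clusters are dealt round-robin to groups
    offset = (cluster // x) * y + position % y
    return group, offset


def replace_char(groups: list[str], position: int, new_char: str, x: int, y: int) -> list[str]:
    """Return a copy of `groups` with the character at `position` replaced."""
    group, offset = locate(position, x, y)
    updated = list(groups)
    updated[group] = groups[group][:offset] + new_char + groups[group][offset + 1:]
    return updated


# "ABCDEFGH" split with X = 2, Y = 2 gives groups "ABEF" and "CDGH".
stored = ["ABEF", "CDGH"]
modified = replace_char(stored, position=5, new_char="x", x=2, y=2)  # replaces "F"
```

Only the group that holds the affected character is rewritten, which reflects the point that a modification does not require restoring every split data block.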
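Obviously, those skilled in the art should understand that the steps of the present invention described above can be implemented with general-purpose computing devices. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they may each be made into an individual integrated circuit module; or several of the modules or steps may be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
The above are merely preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement or the like made within the spirit and principles of the present invention shall fall within its scope of protection.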

Claims (19)

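  1. A data block storage method, characterized by comprising:
    obtaining a data block to be split, the data block to be split including file content that expresses its actual meaning;
    splitting the file content of the data block to be split according to a preset first splitting rule to generate the file content of at least two split data blocks, wherein the file content of the data block to be split includes at least one character group, the character group is the smallest unit in which the file content of the data block to be split expresses its meaning, the character group includes multiple characters, and the file content of each split data block includes some of the characters of the same character group;
    storing the file content of different split data blocks in at least two independent storage systems according to preset initial storage locations.
  2. The data block storage method according to claim 1, characterized by further comprising:
    dividing the data block to be split into multiple sub-data-blocks to be split according to a preset division scheme, so that the file content of each sub-data-block to be split is identical to a continuous portion of the file content of the data block to be split;
    wherein splitting the file content of the data block to be split according to the preset first splitting rule comprises:
    splitting the file content of each sub-data-block to be split according to a data block splitting rule to generate multiple split data blocks.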
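  3. The data block storage method according to claim 2, characterized in that the data block to be split further includes a file header that represents its file structure;
    after obtaining the data block to be split, the method further comprises: splitting the file header of the data block to be split according to a preset second splitting rule to generate the file headers of at least two split data blocks, wherein the file header of the data block to be split includes at least one character group, the character group is the smallest unit in which the file header of the data block to be split expresses its meaning, the character group includes multiple characters, and the file header of each split data block includes some of the characters of the same character group, or the file header of each split data block includes some of the character groups of the file header of the data block to be split;
    storing the file headers of different split data blocks in at least two independent storage systems according to preset initial storage locations.
  4. The data block storage method according to claim 3, characterized in that splitting the file header of the data block to be split according to the preset second splitting rule to generate the file headers of at least two split data blocks comprises:
    obtaining, from the file header, a structure code representing the file structure and an index table of the file content, the structure code being used to form a framework for the file content so that the file content can be filled in according to that framework to produce a directly displayable file form;
    splitting the structure code and the index table, respectively, according to a preset third splitting rule to generate multiple regional file structure codes and multiple regional file index tables corresponding to the regional file structure codes, wherein each regional file structure code and each regional file index table corresponds to one sub-data-block to be split, each regional file structure code and the corresponding sub-data-block to be split can be combined into a directly displayable whole file, and each regional file index table carries the file content index information of one specified sub-data-block to be split;
    splitting the code of each regional file structure code and the code of each regional file index table, respectively, according to a preset fourth splitting rule to form multiple sub-regional file structure codes and multiple sub-regional file index tables, so that no sub-regional file structure code can display the content of its corresponding regional file structure code and no sub-regional file index table can display the content of its corresponding regional file index table, the multiple sub-regional file structure codes and the multiple sub-regional file index tables being carried in the file headers of at least two split data blocks.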
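  5. The data block storage method according to claim 4, characterized in that,
    if the character groups of the file content of the data block to be split include only fixed-length strings, and the total number of characters in those character groups is divisible by Y, where Y is a fixed value, the fixed-length strings of the file content in the data block to be split are split in turn according to the number of groups X and the split length Y carried in the first splitting rule, to generate the file content of X split data blocks, the file content of each split data block including one or more strings of length Y.
  6. The data block storage method according to claim 5, characterized in that splitting the character groups of the file content as fixed-length strings according to the number of groups X and the split length Y carried in the first splitting rule comprises:
    splitting all the characters of the file content of the data block to be split, in the order in which they are arranged, into multiple first character clusters of length Y; and assigning the first character clusters, in their order, one after another to the file content of the X split data blocks.
  7. The data block storage method according to claim 4, characterized in that, if the character groups of the file content of the data block to be split include only fixed-length strings, and the total number of characters in those character groups is not divisible by Y, where Y is a fixed value, the character groups of the file content of the data block to be split are padded with filler digits so that the total number of characters after padding is divisible by Y;
    the character groups of the file content of the data block to be split are split as fixed-length strings according to the number of groups X and the split length Y carried in the first splitting rule, to generate the file content of X split data blocks, the file content of each split data block including one or more strings of length Y.
  8. The data block storage method according to claim 4, characterized in that,
    if the character groups of the file content of the data block to be split include only fixed-length strings, and the total number of characters in those character groups is not divisible by Y, where the split length Y is a variable, all the characters of the file content of the data block to be split are split one after another, in the order in which they are arranged, into multiple second character clusters whose lengths vary according to Y;
    each second character cluster is assigned in turn, in the order of the second character clusters, to the file content of the X split data blocks.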
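  9. The data block storage method according to claim 8, characterized in that splitting one after another into multiple second character clusters whose lengths vary according to Y comprises:
    for each value taken by Y, consecutively splitting off X character clusters of equal length.
  10. The data block storage method according to claim 8, characterized by further comprising: randomly obtaining multiple function values;
    performing a function operation on each function value to generate multiple values to be used, the function operation including trigonometric operations and exponential/logarithmic operations;
    rounding each value to be used to an integer to determine the variable Y.
  11. The data block storage method according to claim 8, characterized in that, if the character groups of the file content of the data block to be split include variable-length strings, the variable-length strings and the fixed-length strings in those character groups are separated into two string groups;
    if the total number of characters in the fixed-length string group is divisible by the split length Y, the fixed-length strings of the file content in the data block to be split are split according to the number of groups X and the split length Y carried in the first splitting rule, to generate the file content of X split data blocks, the file content of each split data block including one or more strings of length Y;
    the character length of the variable-length string group is determined and the group is split according to that length, to generate multiple split variable-length string groups, the multiple split variable-length string groups being assigned in turn to the file content of the X split data blocks.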
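  12. The data block storage method according to claim 4, characterized in that,
    if the data block to be split is in text format, each regional file structure code and the corresponding sub-data-block to be split being combinable into a directly displayable whole file comprises: each regional file structure code and the corresponding sub-data-block to be split being combinable into a continuous passage of a specified number of words;
    if the data block to be split is in video format, each regional file structure code and the corresponding sub-data-block to be split being combinable into a directly displayable whole file comprises: each regional file structure code and the corresponding sub-data-block to be split being combinable into continuous video footage of a specified duration;
    if the data block to be split is in audio format, each regional file structure code and the corresponding sub-data-block to be split being combinable into a directly displayable whole file comprises: each regional file structure code and the corresponding sub-data-block to be split being combinable into continuous audio data of a specified duration;
    if the data block to be split is in image format, each regional file structure code and the corresponding sub-data-block to be split being combinable into a directly displayable whole file comprises: each regional file structure code and the corresponding sub-data-block to be split being combinable into a specified continuous image.
  13. The data block storage method according to claim 1, characterized in that storing the file content of different split data blocks in at least two independent storage systems according to the preset initial storage locations comprises:
    storing the file content of multiple split data blocks in different public storage systems and/or private storage systems according to a predetermined storage ratio.
  14. The data block storage method according to claim 3, characterized in that storing the file headers of different split data blocks in at least two independent storage systems according to the preset initial storage locations comprises:
    storing the file headers of multiple split data blocks in different public storage systems and/or private storage systems according to a predetermined storage ratio, the file header of a split data block being stored at a different location from the file content of that split data block.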
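  15. A data query method, comprising the data block storage method according to any one of claims 1 to 14, characterized by further comprising:
    obtaining a file content keyword;
    querying the independent storage systems for the file content of multiple split data blocks corresponding to the file content keyword;
    combining, according to the preset first splitting rule, the file content of the multiple split data blocks into the file content of the data block to be split, or into part of the file content of the data block to be split.
  16. The data query method according to claim 15, characterized by further comprising: obtaining a file header keyword corresponding to the file content keyword of the split data blocks;
    querying the cloud system for the file headers of multiple split data blocks that contain the file header keyword;
    combining, according to the preset second splitting rule, the file headers of the multiple split data blocks into the file header of the data block to be split;
    combining the file content of the data block to be split, or part of that file content, with the file header of the data block to be split to form the data block to be split, or part of the data block to be split.
  17. The data query method according to claim 16, characterized in that,
    the file header includes a structure code representing the file structure and an index table of the file content, the structure code being used to form a framework for the file content so that the file content can be filled in according to that framework to produce a directly displayable file form;
    combining, according to the preset second splitting rule, the file headers of the multiple split data blocks into the file header of the data block to be split comprises:
    obtaining multiple regional file structure codes and multiple regional file index tables of the multiple split data blocks;
    combining the multiple regional file structure codes and the multiple regional file index tables, respectively, according to the preset third splitting rule, to form the file header of the data block to be split.
  18. The data query method according to claim 15, characterized by further comprising: obtaining a hidden data code, the hidden data code being used to identify a hidden file within a data block;
    splitting the hidden data code according to a preset hidden data code splitting rule to obtain at least two split hidden data codes;
    querying the cloud system for the file content of multiple split data blocks that contain the hidden data code;
    combining, according to the preset first splitting rule, the file content of the split data blocks corresponding to the multiple split hidden data codes into the file content of the data block to be split, or into part of the file content of the data block to be split.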
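  19. A data modification method, comprising the data block storage method according to any one of claims 1 to 14, characterized by further comprising:
    obtaining a string to be written, a write position and a write mode, the write mode including deletion, addition and replacement;
    locating the data block to be modified in the independent storage systems according to the write position and the previously obtained initial storage locations, the initial storage locations including the storage address of the file content of each split data block;
    if the write mode is deletion, deleting, according to the write position and the first splitting rule, the characters in the data block to be modified that correspond to the write position, to generate the file content of a modified data block;
    if the write mode is addition or replacement, splitting the string to be written according to the first splitting rule to generate characters to be written;
    deleting the characters in the file content of the split data blocks that correspond to the write position, and adding the characters to be written, according to the write position, to the file content of the new data blocks to be modified, to generate the file content of multiple modified data blocks.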
PCT/CN2015/090993 2014-09-28 2015-09-28 Data block storage method, data query method and data modification method WO2016045641A2 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP15845133.6A EP3200094A4 (en) 2014-09-28 2015-09-28 Data block storage method, data query method and data modification method
US15/515,125 US10521144B2 (en) 2014-09-28 2015-09-28 Data block storage by splitting file content and file headers for independent storage

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410510613.XA CN105528347B (zh) 2014-09-28 2014-09-28 Data block storage method, data query method and data modification method
CN201410510613.X 2014-09-28

Publications (2)

Publication Number Publication Date
WO2016045641A2 true WO2016045641A2 (zh) 2016-03-31
WO2016045641A3 WO2016045641A3 (zh) 2016-05-19

Family

ID=55582186

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/090993 WO2016045641A2 (zh) 2014-09-28 2015-09-28 Data block storage method, data query method and data modification method

Country Status (4)

Country Link
US (1) US10521144B2 (zh)
EP (1) EP3200094A4 (zh)
CN (1) CN105528347B (zh)
WO (1) WO2016045641A2 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110609984A (zh) * 2019-08-26 2019-12-24 深圳市亿道数码技术有限公司 Method and system for automatically splitting Google keys in a Windows system

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10289310B2 (en) * 2017-06-27 2019-05-14 Western Digital Technologies, Inc. Hybrid data storage system with private storage cloud and public storage cloud
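CN109886725A (zh) * 2018-12-29 2019-06-14 深圳云天励飞技术有限公司 Event processing method and related apparatus
CN110032549B (zh) * 2019-01-28 2023-10-20 北京奥星贝斯科技有限公司 Partition splitting method and apparatus, electronic device, and readable storage medium
CN112118091B (zh) * 2020-09-22 2021-04-23 郑州嘉晨电器有限公司 Remote system upgrade method for industrial equipment with data-encryption bus adaptation
CN115470186A (zh) * 2022-07-29 2022-12-13 天翼云科技有限公司 Data slicing method, apparatus and system
CN116383781B (zh) * 2023-06-06 2023-08-04 中航信移动科技有限公司 Method for controlling software operating parameters, electronic device, and storage medium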

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100343825C (zh) * 2004-01-05 2007-10-17 华为技术有限公司 Method for processing streaming media data
US7747096B2 (en) * 2005-07-15 2010-06-29 Samsung Electronics Co., Ltd. Method, medium, and system encoding/decoding image data
US20140108796A1 (en) * 2006-01-26 2014-04-17 Unisys Corporation Storage of cryptographically-split data blocks at geographically-separated locations
CN101231653B (zh) * 2008-01-24 2010-09-22 创新科存储技术(深圳)有限公司 Data block splitting method and apparatus
US20120136960A1 (en) * 2010-11-29 2012-05-31 Beijing Z & W Technology Consulting Co., Ltd Cloud Storage Data Access Method, Apparatus and System
US9195666B2 (en) * 2012-01-17 2015-11-24 Apple Inc. Location independent files

Also Published As

Publication number Publication date
US10521144B2 (en) 2019-12-31
EP3200094A2 (en) 2017-08-02
CN105528347B (zh) 2019-03-26
EP3200094A4 (en) 2018-03-21
WO2016045641A3 (zh) 2016-05-19
US20170242620A1 (en) 2017-08-24
CN105528347A (zh) 2016-04-27

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 15515125

Country of ref document: US

REEP Request for entry into the european phase

Ref document number: 2015845133

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015845133

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15845133

Country of ref document: EP

Kind code of ref document: A2