Summary of the invention
In view of this, the present invention provides a data storage method, a data restoration method and corresponding devices, so as to store data using DNA as the data storage medium.
The technical solution is as follows:
A data storage method, comprising:
determining one group of target data among at least one group of data into which a file to be stored is divided, wherein the data comprise a first preset number of pieces of subdata, and the data length of each piece of subdata is a preset data length;
calculating the product of a generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix, wherein the generator matrix comprises a second preset number of pieces of data, and the data length of every piece of data in the generator matrix equals the first preset number;
if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, updating that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions;
generating the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix, wherein the DNA data are used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consist of the DNA data of each group of data into which the file to be stored is divided.
Preferably, calculating the product of the generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix comprises:
generating a parity check bit at the tail of every piece of subdata in the target data, to obtain first target data corresponding to the target data;
calculating the product of the generator matrix and the first target data, to obtain the first target subdata of every piece of data in the generator matrix.
Preferably, generating the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix comprises:
deleting the last data item in the first target subdata, to obtain the second target subdata of the first target subdata;
generating the DNA data of the target data based on the second target subdata of each first target subdata of the target data.
Preferably, calculating the product of the generator matrix and the first target data to obtain the first target subdata of every piece of data in the generator matrix comprises:
superimposing the first target data and a non-zero matrix, to obtain second target data corresponding to the first target data, wherein the number of rows of the non-zero matrix equals the first preset number, and the number of columns of the non-zero matrix is the sum of the preset data length and 1;
calculating the product of the generator matrix and the second target data, to obtain the first target subdata of every piece of data in the generator matrix.
Preferably, generating the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix comprises:
generating identification information at the tail of the first target subdata of every piece of data in the generator matrix, wherein the identification information indicates the target data, the number of times the piece of data in the generator matrix was updated when the first target subdata was generated, and the address in the generator matrix of the piece of data used to generate the first target subdata;
generating the DNA data of the target data based on the first target subdata, carrying the identification information, of each piece of data in the generator matrix.
A data storage device, comprising:
a target data determination unit, configured to determine one group of target data among at least one group of data into which a file to be stored is divided, wherein the data comprise a first preset number of pieces of subdata, and the data length of each piece of subdata is a preset data length;
a calculation unit, configured to calculate the product of a generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix, wherein the generator matrix comprises a second preset number of pieces of data, and the data length of every piece of data in the generator matrix equals the first preset number;
an update unit, configured to, if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, update that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions;
a generation unit, configured to generate the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix, wherein the DNA data are used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consist of the DNA data of each group of data into which the file to be stored is divided.
A data restoration method, comprising:
determining DNA data to be restored, wherein the DNA data consist of at least one first target subdata;
among the at least one first target subdata, classifying the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
for each group of first target subdata, selecting from the group a first preset number of first target subdata whose data length is the preset data length, to form third target data corresponding to the group;
for each group of first target subdata, constructing the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
for each group of first target subdata, calculating the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain fourth target data corresponding to the group, wherein the fourth target data are used to restore the file.
Preferably, the method further comprises:
for each group of first target subdata, generating a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
wherein calculating, for each group of first target subdata, the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group comprises:
for each group of first target subdata, calculating the product of the inverse matrix of the group and the fifth target data corresponding to the group, to obtain the fourth target data corresponding to the group.
Preferably, the method further comprises:
deleting the data at the tail of every piece of data in the fourth target data, to obtain the final fourth target data.
A data restoration device, comprising:
a DNA data determination unit, configured to determine DNA data to be restored, wherein the DNA data consist of at least one first target subdata;
a grouping unit, configured to classify, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
a selection unit, configured to, for each group of first target subdata, select from the group a first preset number of first target subdata whose data length is the preset data length, to form third target data corresponding to the group;
an inverse matrix construction unit, configured to, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
a restoration unit, configured to, for each group of first target subdata, calculate the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain fourth target data corresponding to the group, wherein the fourth target data are used to restore the file.
The present application provides a data storage method, a data restoration method and corresponding devices. The method determines one group of target data among at least one group of data into which a file to be stored is divided; calculates the product of a generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix; if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, updates that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions; and generates the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix. In this way the file to be stored is converted into DNA data suitable for storage with DNA as the data storage medium, which achieves the purpose of storing data with DNA as the data storage medium.
Embodiment:
Fig. 1 is a flow chart of a data storage method provided by an embodiment of the present application. As shown in Fig. 1, the method comprises:
S101, determining one group of target data among at least one group of data into which a file to be stored is divided, wherein the data comprise a first preset number of pieces of subdata, and the data length of each piece of subdata is a preset data length;
In the embodiment of the present application, the file to be stored can be compressed with a standard lossless compression method and packed into a compressed file. The compressed file is then divided into non-overlapping groups of length K*L (that is, the compressed file is divided into multiple non-overlapping groups, each of which is regarded as one group of data), and each non-overlapping group contains K pieces of binary data of length L.
Each group of data into which the file to be stored is divided can be regarded as a matrix with K rows and L columns. The matrix can be described by a vector D = (D_1, D_2, ..., D_K), where D_k denotes the elements located in row k of the matrix.
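The compression and grouping step above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the use of `zlib` as the "standard lossless compression", the function name `split_into_groups`, and the zero-padding of a short final group are all assumptions, since the text does not specify them.

```python
import zlib

def split_into_groups(payload, K, L):
    """Split a byte string into non-overlapping groups of K rows of L bits each."""
    # Expand every byte into its 8 bits, most significant bit first.
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    group_size = K * L
    # Assumption: zero-pad the last group; the source does not say how a short tail is handled.
    bits += [0] * (-len(bits) % group_size)
    groups = []
    for start in range(0, len(bits), group_size):
        chunk = bits[start:start + group_size]
        # Each group is a K x L matrix D = (D_1, ..., D_K), stored row by row.
        groups.append([chunk[r * L:(r + 1) * L] for r in range(K)])
    return groups

compressed = zlib.compress(b"example file contents")
groups = split_into_groups(compressed, K=4, L=16)
```

Each element of `groups` then plays the role of one group of target data D in the steps that follow.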
S102, calculating the product of a generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix, wherein the generator matrix comprises a second preset number of pieces of data, and the data length of every piece of data in the generator matrix equals the first preset number;
In the embodiment of the present application, the generator matrix G is initialized, where the initialized generator matrix G is defined as follows:
The initialized generator matrix is a matrix with N rows and K columns. The generator matrix G is a matrix defined over R_p and is a modified form of a generalized circulant matrix.
Each row of elements in the generator matrix can be regarded as one piece of data, so the generator matrix comprises a second preset number of pieces of data. The number of elements in each row of the generator matrix equals the first preset number, and this number can be regarded as the length of the piece of data corresponding to that row. That is, the number of rows of the generator matrix is the second preset number, and the number of columns of the generator matrix equals the first preset number.
S103, if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, updating that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions;
In the embodiment of the present application, the data to be stored are divided into multiple groups of data, and each group can be regarded as one target data. The product of the generator matrix G and each group of target data is calculated separately, to obtain the product result of that group of target data.
The target data is a K*L matrix and the generator matrix G is an N*K matrix, so the product of the generator matrix G and the target data is an N*L matrix, which can be regarded as the product result of this group of target data. Each row of elements in the generator matrix can be regarded as one piece of data in the generator matrix, so calculating the product of the generator matrix and the target data can be regarded as calculating the first target subdata corresponding to every piece of data in the generator matrix. Each row of elements in the product result can be regarded as one first target subdata, and the first target subdata corresponding to a piece of data in the generator matrix is the first target subdata in the corresponding row of the product result.
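The shape relationship described above (an N*K matrix times a K*L matrix gives an N*L matrix, one output row per row of G) can be illustrated with a short sketch. The concrete matrix G, the modulus p = 5 and the helper name `mat_mul_mod` are hypothetical; the source only states that G is defined over R_p, so integer arithmetic modulo p is an assumption made here for illustration.

```python
def mat_mul_mod(G, D, p):
    """Return the product of G (N x K) and D (K x L), with entries reduced modulo p."""
    K = len(D)
    L = len(D[0])
    if any(len(row) != K for row in G):
        raise ValueError("every row of G must have K entries")
    # Row n of the result is the first target subdata produced by row n of G.
    return [[sum(g[k] * D[k][j] for k in range(K)) % p for j in range(L)] for g in G]

p = 5                              # hypothetical modulus
G = [[1, 0], [0, 1], [1, 1]]       # hypothetical generator matrix, N = 3, K = 2
D = [[1, 2, 3], [4, 0, 1]]         # one group of target data, K = 2, L = 3
C = mat_mul_mod(G, D, p)           # N x L product result
```

Here `C[2]` is the first target subdata corresponding to the third row of G, namely the element-wise sum of the two rows of D modulo 5.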
In the embodiment of the present application, when the product of the generator matrix and the target data is calculated, if the first target subdata obtained for a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, that piece of data in the generator matrix needs to be updated until the first target subdata generated for it satisfies the gene coding constraint conditions. At that point the piece of data is no longer updated, the first target subdata obtained is taken as the first target subdata corresponding to that piece of data, and the number of times the piece of data was updated is recorded.
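The update loop described above can be sketched as follows. The text specifies neither the gene coding constraint conditions nor the rule used to update a row of the generator matrix, so both are placeholders here: `satisfies_constraints` assumes a GC-content window and a maximum homopolymer run over symbols 0 to 3 mapped to A, C, G, T, and `update_row` simply adds 1 to every coefficient modulo p. The cap `max_updates` is also an assumption.

```python
def satisfies_constraints(symbols, max_run=3):
    """Hypothetical constraint: GC fraction in [0.4, 0.6] and no run longer than max_run."""
    bases = ["ACGT"[s] for s in symbols]
    gc = sum(b in "GC" for b in bases) / len(bases)
    run = longest = 1
    for prev, cur in zip(bases, bases[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return 0.4 <= gc <= 0.6 and longest <= max_run

def encode_row(row, D, p):
    """Multiply one row of the generator matrix by the K x L target data, modulo p."""
    return [sum(row[k] * D[k][j] for k in range(len(D))) % p for j in range(len(D[0]))]

def update_row(row, p):
    """Hypothetical update rule: add 1 (mod p) to every coefficient of the row."""
    return [(x + 1) % p for x in row]

def encode_with_updates(row, D, p=4, max_updates=16):
    """Update the row until its codeword is admissible; return row, codeword, update count."""
    updates = 0
    codeword = encode_row(row, D, p)
    while not satisfies_constraints(codeword):
        if updates == max_updates:
            raise ValueError("no admissible row found within max_updates")
        row = update_row(row, p)
        updates += 1
        codeword = encode_row(row, D, p)
    return row, codeword, updates

final_row, codeword, updates = encode_with_updates([0, 1], [[0, 1, 2, 3], [0, 0, 0, 0]])
```

In this toy run the initial row yields the all-zero codeword (a run of four identical bases), so the row is updated once before an admissible codeword is found, and the recorded update count is 1.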
By calculating the product of the generator matrix and the target data, the first target subdata corresponding to every piece of data in the generator matrix (also called the first target subdata of that piece of data) can be obtained, and the first target subdata obtained are taken together as the DNA data of the target data.
The DNA data of the target data can be regarded as a matrix composed of multiple first target subdata, in which the row occupied by each first target subdata is the same as the row occupied in the generator matrix by the piece of data corresponding to that first target subdata.
S104, generating the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix, wherein the DNA data are used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consist of the DNA data of each group of data into which the file to be stored is divided.
In the embodiment of the present application, the file to be stored is divided into at least one group of data. After the above steps S101-S104 are executed for each group of data to obtain the DNA data of that group, the DNA data of all groups are taken together as the DNA data of the file to be stored, and the DNA data of the file to be stored are then stored with DNA as the data storage medium.
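As an illustration of how a first target subdata could be written out as a nucleotide sequence, the sketch below maps symbols 0 to 3 to the bases A, C, G and T. This mapping, and the assumption that each symbol takes one of four values, are not stated in the source; they are chosen here only to make the idea of "DNA data" concrete.

```python
BASES = "ACGT"  # hypothetical symbol-to-base mapping: 0 -> A, 1 -> C, 2 -> G, 3 -> T

def to_dna(symbols):
    """Convert a sequence of symbols in {0, 1, 2, 3} into a DNA string."""
    return "".join(BASES[s] for s in symbols)

def from_dna(strand):
    """Convert a DNA string back into its symbol sequence."""
    return [BASES.index(base) for base in strand]

strand = to_dna([0, 1, 2, 3, 3, 0])
symbols = from_dna(strand)
```

Applying `to_dna` to every row of the product result of every group would give the set of strands that make up the DNA data of the file under this assumed mapping.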
Fig. 2 is a flow chart of another data storage method provided by an embodiment of the present application.
As shown in Fig. 2, the method comprises:
S201, determining one group of target data among at least one group of data into which a file to be stored is divided, wherein the data comprise a first preset number of pieces of subdata, and the data length of each piece of subdata is a preset data length;
S202, generating a parity check bit at the tail of every piece of subdata in the target data, to obtain first target data corresponding to the target data;
In the embodiment of the present application, the target data contain multiple pieces of subdata. A parity check bit is generated at the tail of each piece of subdata, and the pieces of subdata carrying their parity check bits are taken together as the first target data corresponding to the target data.
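Step S202 can be sketched as below. The source does not say whether even or odd parity is used, so even parity is an assumption, and `append_parity` is a hypothetical helper name.

```python
def append_parity(row):
    """Return the row with an even-parity bit appended (assumption: even parity)."""
    return row + [sum(row) % 2]

target_data = [[1, 0, 1, 1], [0, 0, 0, 0]]                       # K = 2 pieces of subdata, L = 4
first_target_data = [append_parity(row) for row in target_data]  # each row now has L + 1 bits
```

Every row of `first_target_data` is one bit longer than the corresponding piece of subdata, which matches the L + 1 columns of the non-zero matrix introduced in step S203.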
S203, calculating the product of the generator matrix and the first target data, to obtain the first target subdata of every piece of data in the generator matrix, wherein the generator matrix comprises a second preset number of pieces of data, and the data length of every piece of data in the generator matrix equals the first preset number;
In the embodiment of the present application, preferably, calculating the product of the generator matrix and the first target data to obtain the first target subdata of every piece of data in the generator matrix comprises: superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, wherein the number of rows of the non-zero matrix equals the first preset number and the number of columns of the non-zero matrix is the sum of the preset data length and 1; and calculating the product of the generator matrix and the second target data, to obtain the first target subdata of every piece of data in the generator matrix.
In the embodiment of the present application, a non-zero matrix is provided whose number of rows equals the first preset number and whose number of columns is the sum of the preset data length and 1. That is, relative to the matrix corresponding to the target data, the non-zero matrix has the same number of rows and one more column.
In the embodiment of the present application, superimposing the non-zero matrix and the first target data yields the second target data corresponding to the first target data. The product of the generator matrix and the second target data is then calculated, to obtain the first target subdata of every piece of data in the generator matrix.
S204, if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, updating that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions;
S205, deleting the last data item in the first target subdata, to obtain the second target subdata of the first target subdata;
In the embodiment of the present application, when the first target subdata obtained for a piece of data in the generator matrix satisfies the gene coding constraint conditions, it is taken as the first target subdata of that piece of data. The last data item in the first target subdata is then deleted, and the first target subdata with its last data item deleted is taken as the second target subdata of that first target subdata.
S206, generating the DNA data of the target data based on the second target subdata of each first target subdata of the target data, wherein the DNA data are used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consist of the DNA data of each group of data into which the file to be stored is divided.
In the embodiment of the present application, the second target subdata of all first target subdata of the target data together constitute the DNA data of the target data.
Further, in the data storage method provided by the embodiment of the present application, to facilitate restoration of the data, identification information can also be generated at the tail of the first target subdata of every piece of data in the generator matrix. The identification information indicates the target data, the number of times the piece of data in the generator matrix was updated when the first target subdata was generated, and the address in the generator matrix of the piece of data used to generate the first target subdata. The DNA data of the target data are then generated based on the first target subdata, carrying the identification information, of each piece of data in the generator matrix.
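One way to picture the identification information is as three fixed-width fields appended to the tail of each first target subdata. The sketch below is only an assumption about layout: the field widths (8, 4 and 4 bits), the field order and the helper names `attach_id` and `read_id` are hypothetical, and the payload is shown as bits for simplicity.

```python
def to_bits(value, width):
    """Encode a non-negative integer as a fixed-width list of bits, MSB first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]

def attach_id(payload, group_index, update_count, row_address, widths=(8, 4, 4)):
    """Append group index, update count and row address to the tail of a payload."""
    fields = (group_index, update_count, row_address)
    tag = [bit for value, width in zip(fields, widths) for bit in to_bits(value, width)]
    return payload + tag

def read_id(tagged, widths=(8, 4, 4)):
    """Split a tagged sequence into its payload and (group, updates, address) fields."""
    total = sum(widths)
    payload, tag = tagged[:-total], tagged[-total:]
    fields, pos = [], 0
    for width in widths:
        fields.append(int("".join(str(bit) for bit in tag[pos:pos + width]), 2))
        pos += width
    return payload, tuple(fields)

tagged = attach_id([1, 0, 1], group_index=3, update_count=1, row_address=2)
payload, (group_index, update_count, row_address) = read_id(tagged)
```

During restoration, the group index would support grouping the first target subdata by target data, while the update count and row address identify which (possibly updated) row of the generator matrix produced each one.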
To facilitate understanding of the data storage method provided by the embodiment of the present application, the process of storing the DNA data of the file to be stored with DNA as the data storage medium is now described in detail.
In the embodiment of the present application, the DNA data of the file to be stored can be regarded as the DNA storage sample generated from the file to be stored by the above data storage method. The process of storing the DNA storage sample with DNA as the data storage medium is as follows:
1. Preparation of the sample pool: the DNA storage sample is resuspended and stored (it can be stored after being aliquoted).
PCR is carried out using different DNA polymerases.
a) According to the instructions of the polymerase, add appropriate amounts of the components, the template and the forward and reverse primers, and mix;
b) calculate the optimal Tm value of the primers;
c) carry out the PCR reaction under the reaction conditions given for the polymerase, for n cycles in total;
d) purify and recover the product obtained, and dissolve it in a sample pool of appropriate volume.
2. PCR amplification
Using the product in the main sample pool as the template, PCR reactions are carried out in succession.
The first PCR reaction:
a) According to the instructions of the polymerase, add appropriate amounts of the components, the template and the forward and reverse primers, and mix;
b) calculate the optimal Tm value of the primers;
c) carry out the PCR reaction under the reaction conditions given for the polymerase, for n cycles in total;
d) verify the PCR product obtained, and store it;
e) take an appropriate amount of the PCR product obtained as the template for the next PCR reaction, and carry out the next PCR reaction.
The subsequent N PCR reactions are carried out in the same way. After the N PCR reactions are completed, the N-th PCR product is sequenced and the sequencing result is compared with the sequence, which completes the extraction of the stored information.
1) The synthesized DNA pool is resuspended in 428.4 μL of 0.5x TE to a final concentration of 150 ng/μL (aliquoted at 50 μL per portion, frozen in liquid nitrogen and stored in a -80 °C freezer).
2) The DNA fragments are amplified by PCR using a standard DNA polymerase.
PCR reaction conditions:
The purified product is dissolved in 25 μL ddH2O.
After completion, the PCR product is sequenced and the sequencing result is compared with the sequence, which completes the extraction of the stored information.
Further, the embodiment of the present application also provides a data restoration method, whose flow chart is shown in Fig. 3.
As shown in Fig. 3, the method comprises:
S301, determining DNA data to be restored, wherein the DNA data consist of at least one first target subdata;
S302, among the at least one first target subdata, classifying the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
S303, for each group of first target subdata, selecting from the group a first preset number of first target subdata whose data length is the preset data length, to form third target data corresponding to the group;
S304, for each group of first target subdata, constructing the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
In the embodiment of the present application, the inverse matrix of a group of first target subdata can be regarded as the inverse of the generator matrix used when the data storage method of the above embodiment was executed.
S305, for each group of first target subdata, calculating the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain fourth target data corresponding to the group, wherein the fourth target data are used to restore the file.
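Steps S303 to S305 can be sketched for one group as follows. The sketch assumes arithmetic modulo a prime p so that a matrix inverse can be computed by Gauss-Jordan elimination; the concrete G, the modulus p = 7 and the helper names are hypothetical, and rows are assumed not to have been updated during storage (otherwise the update count in the identification information would first be used to reconstruct the row actually used). It relies on `pow(x, -1, p)`, available in Python 3.8 and later.

```python
def mat_mul_mod(A, B, p):
    """Multiply two matrices with entries reduced modulo p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]

def mat_inv_mod(A, p):
    """Invert a square matrix modulo a prime p using Gauss-Jordan elimination."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[x % p for x in row] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo p")
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, p)          # modular inverse of the pivot
        M[col] = [(x * inv) % p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                factor = M[r][col]
                M[r] = [(x - factor * y) % p for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

p = 7                                   # hypothetical prime modulus
G = [[1, 1], [1, 2], [1, 3]]            # hypothetical generator matrix, N = 3, K = 2
D = [[3, 0, 5], [6, 1, 2]]              # original target data, K = 2, L = 3
C = mat_mul_mod(G, D, p)                # stored first target subdata, one per row of G

addresses = [2, 0]                      # row addresses read from the identification information
G_sub = [G[i] for i in addresses]       # the K rows of G that produced the selected subdata
third_target = [C[i] for i in addresses]
fourth_target = mat_mul_mod(mat_inv_mod(G_sub, p), third_target, p)
```

Because any K linearly independent rows of G suffice, `fourth_target` equals the original D even though the first target subdata at address 1 was not used.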
Fig. 4 is a flow chart of another data restoration method provided by an embodiment of the present application.
As shown in Fig. 4, the method comprises:
S401, determining DNA data to be restored, wherein the DNA data consist of at least one first target subdata;
S402, among the at least one first target subdata, classifying the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
S403, for each group of first target subdata, selecting from the group a first preset number of first target subdata whose data length is the preset data length, to form third target data corresponding to the group;
S404, for each group of first target subdata, generating a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
S405, for each group of first target subdata, constructing the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
S406, for each group of first target subdata, calculating the product of the inverse matrix of the group and the fifth target data corresponding to the group, to obtain fourth target data corresponding to the group;
S407, deleting the data at the tail of every piece of data in the fourth target data, to obtain the final fourth target data, wherein the final fourth target data are used to restore the file.
Correspondingly, Fig. 5 is a schematic structural diagram of a data storage device provided by an embodiment of the present application.
As shown in Fig. 5, the device comprises:
a target data determination unit 51, configured to determine one group of target data among at least one group of data into which a file to be stored is divided, wherein the data comprise a first preset number of pieces of subdata, and the data length of each piece of subdata is a preset data length;
a calculation unit 52, configured to calculate the product of a generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix, wherein the generator matrix comprises a second preset number of pieces of data, and the data length of every piece of data in the generator matrix equals the first preset number;
an update unit 53, configured to, if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, update that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions;
a generation unit 54, configured to generate the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix, wherein the DNA data are used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consist of the DNA data of each group of data into which the file to be stored is divided.
Correspondingly, Fig. 6 is a schematic structural diagram of a data restoration device provided by an embodiment of the present application.
As shown in Fig. 6, the device comprises:
a DNA data determination unit 61, configured to determine DNA data to be restored, wherein the DNA data consist of at least one first target subdata;
a grouping unit 62, configured to classify, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
a selection unit 63, configured to, for each group of first target subdata, select from the group a first preset number of first target subdata whose data length is the preset data length, to form third target data corresponding to the group;
an inverse matrix construction unit 64, configured to, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
a restoration unit 65, configured to, for each group of first target subdata, calculate the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain fourth target data corresponding to the group, wherein the fourth target data are used to restore the file.
The present application provides a data storage method, a data restoration method and corresponding devices. The method determines one group of target data among at least one group of data into which a file to be stored is divided; calculates the product of a generator matrix and the target data to obtain the first target subdata of every piece of data in the generator matrix; if the first target subdata of a piece of data in the generator matrix does not satisfy the gene coding constraint conditions, updates that piece of data in the generator matrix until the first target subdata generated satisfies the gene coding constraint conditions; and generates the DNA data of the target data based on the first target subdata of each piece of data in the generator matrix. In this way the file to be stored is converted into DNA data suitable for storage with DNA as the data storage medium, which achieves the purpose of storing data with DNA as the data storage medium.
The data storage method, data restoration method and devices provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. At the same time, for those of ordinary skill in the art, the specific implementation and the scope of application may change according to the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments can be referred to one another. Because the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, they are described relatively briefly; for the relevant details, refer to the description of the methods.
It should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.