CN109887549A - Data storage and restoration method and device
- Publication number: CN109887549A
- Application number: CN201910132713.6A
- Authority: CN (China)
- Prior art keywords: data, subdata, group, generator matrix, target
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Information Retrieval, Db Structures And Fs Structures Therefor (AREA); Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application provides a data storage method, a data restoration method, and corresponding devices. The storage method determines one group of target data among the at least one group of data into which a file to be stored is divided; computes the product of a generator matrix and the target data to obtain the first target subdata of each data item (row) in the generator matrix; if the first target subdata of a data item in the generator matrix does not satisfy the gene coding constraint, updates that data item until the first target subdata it generates satisfies the constraint; and generates the DNA data of the target data from the first target subdata of each data item in the generator matrix. In this way the file to be stored is converted into DNA data suitable for storage with DNA as the data storage medium, so that data storage on a DNA medium is achieved.
Description
Technical field
The present invention relates to the technical field of computer software, and more specifically to a data storage method, a data restoration method, and corresponding devices.
Background
Data volume is growing exponentially and has outpaced the growth in the capacity of data storage hard disks. This places higher demands on the performance of storage devices, and more efficient storage technology is needed to guarantee data storage.
Research has found that DNA is an excellent data storage medium: it offers an information density on the order of petabytes per gram, together with high durability. Moreover, when DNA is used as the data storage medium, data does not need to be written to a data storage hard disk. Using DNA as the data storage medium therefore effectively avoids the problem that data cannot be stored because hard disk capacity fails to keep up with the growth in data volume.
Summary of the invention
In view of this, the present invention provides a data storage method, a data restoration method, and corresponding devices, so as to achieve data storage with DNA as the data storage medium.
The technical solution is as follows:
A data storage method, comprising:
Determining one group of target data among the at least one group of data into which a file to be stored is divided, where each group of data includes a first preset number of subdata items and the data length of each subdata item is a preset data length;
Computing the product of a generator matrix and the target data to obtain the first target subdata of each data item in the generator matrix, where the generator matrix includes a second preset number of data items and the data length of each data item in the generator matrix equals the first preset number;
If the first target subdata of a data item in the generator matrix does not satisfy a gene coding constraint, updating that data item in the generator matrix until the first target subdata generated satisfies the gene coding constraint;
Generating the DNA data of the target data based on the first target subdata of each data item in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of every group of data into which the file to be stored is divided.
Preferably, computing the product of the generator matrix and the target data to obtain the first target subdata of each data item in the generator matrix comprises:
Generating a parity check bit at the tail of each subdata item in the target data to obtain first target data corresponding to the target data;
Computing the product of the generator matrix and the first target data to obtain the first target subdata of each data item in the generator matrix.
Preferably, generating the DNA data of the target data based on the first target subdata of each data item in the generator matrix comprises:
Deleting the last bit of the first target subdata to obtain the second target subdata of that first target subdata;
Generating the DNA data of the target data based on the second target subdata of each first target subdata of the target data.
Preferably, computing the product of the generator matrix and the first target data to obtain the first target subdata of each data item in the generator matrix comprises:
Superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, where the number of rows of the non-zero matrix equals the first preset number and the number of columns of the non-zero matrix is the preset data length plus 1;
Computing the product of the generator matrix and the second target data to obtain the first target subdata of each data item in the generator matrix.
Preferably, generating the DNA data of the target data based on the first target subdata of each data item in the generator matrix comprises:
Generating identification information at the tail of the first target subdata of each data item of the generator matrix, where the identification information indicates the target data, the number of times the data item in the generator matrix was updated when this first target subdata was generated, and the address within the generator matrix of the data item used to generate this first target subdata;
Generating the DNA data of the target data based on the first target subdata, carrying the identification information, of each data item in the generator matrix.
A data storage device, comprising:
A target data determination unit, configured to determine one group of target data among the at least one group of data into which a file to be stored is divided, where each group of data includes a first preset number of subdata items and the data length of each subdata item is a preset data length;
A computing unit, configured to compute the product of a generator matrix and the target data to obtain the first target subdata of each data item in the generator matrix, where the generator matrix includes a second preset number of data items and the data length of each data item in the generator matrix equals the first preset number;
An updating unit, configured to update a data item in the generator matrix if its first target subdata does not satisfy a gene coding constraint, until the first target subdata generated satisfies the gene coding constraint;
A generation unit, configured to generate the DNA data of the target data based on the first target subdata of each data item in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of every group of data into which the file to be stored is divided.
A data restoration method, comprising:
Determining DNA data to be restored, where the DNA data consists of at least one first target subdata;
Among the at least one first target subdata, classifying the first target subdata that indicate the same target data into one group, obtaining multiple groups of first target subdata;
For each group of first target subdata, selecting from the group a first preset number of first target subdata whose data length is the preset data length, to constitute third target data corresponding to the group;
For each group of first target subdata, constructing the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
For each group of first target subdata, computing the product of the inverse matrix of the group and the third target data corresponding to the group to obtain fourth target data corresponding to the group, where the fourth target data is used for restoring the file.
Preferably, the method further comprises:
For each group of first target subdata, generating a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
In that case, computing, for each group of first target subdata, the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group comprises:
For each group of first target subdata, computing the product of the inverse matrix of the group and the fifth target data corresponding to the group to obtain the fourth target data corresponding to the group.
Preferably, the method further comprises:
Deleting the data at the tail of each data item in the fourth target data to obtain the final fourth target data.
A data restoration device, comprising:
A DNA data determination unit, configured to determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
A grouping unit, configured to classify, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, obtaining multiple groups of first target subdata;
A selection unit, configured to select, for each group of first target subdata, a first preset number of first target subdata whose data length is the preset data length from the group, to constitute third target data corresponding to the group;
An inverse matrix construction unit, configured to construct, for each group of first target subdata, the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
A restoration unit, configured to compute, for each group of first target subdata, the product of the inverse matrix of the group and the third target data corresponding to the group to obtain fourth target data corresponding to the group, where the fourth target data is used for restoring the file.
The application provides a data storage method, a data restoration method, and corresponding devices. The storage method determines one group of target data among the at least one group of data into which a file to be stored is divided; computes the product of a generator matrix and the target data to obtain the first target subdata of each data item in the generator matrix; if the first target subdata of a data item in the generator matrix does not satisfy the gene coding constraint, updates that data item until the first target subdata generated satisfies the constraint; and generates the DNA data of the target data from the first target subdata of each data item in the generator matrix. The file to be stored is thereby converted into DNA data suitable for storage with DNA as the data storage medium, so that data storage on a DNA medium is achieved.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.
Fig. 1 is a flow chart of a data storage method provided by an embodiment of the present application;
Fig. 2 is a flow chart of another data storage method provided by an embodiment of the present application;
Fig. 3 is a flow chart of a data restoration method provided by an embodiment of the present application;
Fig. 4 is a flow chart of another data restoration method provided by an embodiment of the present application;
Fig. 5 is a structural schematic diagram of a data storage device provided by an embodiment of the present application;
Fig. 6 is a structural schematic diagram of a data restoration device provided by an embodiment of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment:
Fig. 1 is a flow chart of a data storage method provided by an embodiment of the present application. As shown in Fig. 1, the method includes:
S101. Determine one group of target data among the at least one group of data into which a file to be stored is divided, where each group of data includes a first preset number of subdata items and the data length of each subdata item is a preset data length;
In this embodiment, the file to be stored may be compressed with a standard lossless compression method and packed into a compressed file. The compressed file is then divided into non-overlapping groups of length K*L (that is, the compressed file is divided into multiple non-overlapping groups, and each non-overlapping group is treated as one group of data). Each non-overlapping group contains K binary data items of length L.
Each group of data into which the file to be stored is divided can be regarded as a matrix with K rows and L columns, which can be described by a vector D = (D_1, D_2, ..., D_K), where D_k denotes the element in row k of the matrix.
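The division in S101 can be sketched as follows. This is only an illustration under stated assumptions: `zlib` stands in for the unspecified "standard lossless compression", the values of K and L are arbitrary, and zero-padding the last group is an assumption, since the text does not say how a final partial group is handled. The function name `split_into_groups` is hypothetical.

```python
import zlib

def split_into_groups(payload, k, l):
    """Compress payload and split its bits into non-overlapping K x L groups."""
    compressed = zlib.compress(payload)  # assumption: zlib as the lossless codec
    # Unpack every byte into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in compressed for shift in range(7, -1, -1)]
    group_size = k * l
    # Assumption: zero-pad so that the last group also holds exactly K*L bits.
    bits += [0] * (-len(bits) % group_size)
    groups = []
    for start in range(0, len(bits), group_size):
        chunk = bits[start:start + group_size]
        # Each group is a K-row, L-column matrix D = (D_1, ..., D_K).
        groups.append([chunk[r * l:(r + 1) * l] for r in range(k)])
    return groups

groups = split_into_groups(b"example file contents", k=4, l=16)
```

Each element of `groups` then plays the role of one group of target data D in the steps that follow.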
S102. Compute the product of the generator matrix and the target data to obtain the first target subdata of each data item in the generator matrix, where the generator matrix includes a second preset number of data items and the data length of each data item in the generator matrix equals the first preset number;
In this embodiment, the generator matrix G is initialized. The initialized generator matrix G is defined as follows:
The initialized generator matrix is a matrix with N rows and K columns. G is a matrix defined over R_p and is a variant of a generalized circulant matrix.
Each row of the generator matrix can be regarded as one data item, so the generator matrix includes the second preset number of data items. The number of elements in each row equals the first preset number, and this number of elements can be regarded as the length of the data item corresponding to that row. That is, the number of rows of the generator matrix is the second preset number, and the number of columns of the generator matrix equals the first preset number.
S103. If the first target subdata of a data item in the generator matrix does not satisfy the gene coding constraint, update that data item in the generator matrix until the first target subdata generated satisfies the gene coding constraint;
In this embodiment, the data to be stored is divided into multiple groups of data, and each group can be treated as one group of target data. The product of the generator matrix G and each group of target data is computed separately to obtain the product result for that group of target data.
The target data is a K*L matrix and the generator matrix G is an N*K matrix, so the product of G and the target data is an N*L matrix, which can be regarded as the product result for this group of target data. Each row of the generator matrix can be regarded as one data item in the generator matrix, and computing the product of the generator matrix and the target data can be regarded as computing the first target subdata corresponding to each data item in the generator matrix. Each row of the product result can be regarded as one first target subdata, and the first target subdata corresponding to a data item in the generator matrix is the first target subdata in the corresponding row of the product result.
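The dimension bookkeeping above can be illustrated with a small sketch. The text does not define the arithmetic of R_p in this passage, so the example assumes ordinary integer arithmetic reduced modulo a hypothetical prime p = 5, and the matrices G and D are made-up values rather than the generator matrix of the application.

```python
def mat_mul_mod(g, d, p):
    """Multiply an N x K matrix g by a K x L matrix d, reducing entries mod p."""
    n, k, l = len(g), len(d), len(d[0])
    assert all(len(row) == k for row in g), "each row of G must have K entries"
    return [[sum(g[i][t] * d[t][j] for t in range(k)) % p for j in range(l)]
            for i in range(n)]

P = 5                                  # hypothetical modulus
G = [[1, 0], [0, 1], [1, 1]]           # N = 3 rows, K = 2 columns (illustrative)
D = [[1, 2, 3, 4], [4, 3, 2, 1]]       # K = 2 rows, L = 4 columns
C = mat_mul_mod(G, D, P)               # N x L result, one row per row of G
# C == [[1, 2, 3, 4], [4, 3, 2, 1], [0, 0, 0, 0]]
```

Row i of `C` is the first target subdata produced by row i of `G`, which matches the row correspondence described above.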
In this embodiment, when the product of the generator matrix and the target data is computed, if the first target subdata obtained for a data item in the generator matrix does not satisfy the gene coding constraint, that data item in the generator matrix is updated, and this is repeated until the first target subdata generated from the data item satisfies the gene coding constraint. At that point the data item is no longer updated, the first target subdata obtained is taken as the first target subdata corresponding to the data item, and the number of updates of the data item is recorded.
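The update loop can be sketched as below. Neither the gene coding constraint nor the rule for updating a row is spelled out in this passage, so both are passed in as functions; the run-length check `no_long_runs`, the update rule `rotate_and_bump`, and the modulus 4 are hypothetical placeholders chosen only to make the example run.

```python
def encode_row(row, d, p, is_valid, update_row, max_updates=1000):
    """Return (first_target_subdata, final_row, update_count) for one row of G."""
    updates = 0
    while True:
        # Product of the 1 x K row with the K x L target data, reduced mod p.
        sub = [sum(row[t] * d[t][j] for t in range(len(d))) % p
               for j in range(len(d[0]))]
        if is_valid(sub):
            return sub, row, updates   # stop updating and record the count
        if updates == max_updates:
            raise RuntimeError("no valid row found within max_updates")
        row = update_row(row)
        updates += 1

def no_long_runs(symbols, max_run=2):
    """Hypothetical constraint: no symbol repeats more than max_run times in a row."""
    run = 1
    for prev, cur in zip(symbols, symbols[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    return True

def rotate_and_bump(row, p=4):
    """Hypothetical update rule: rotate the row, then add 1 to its first entry mod p."""
    rotated = row[1:] + row[:1]
    return [(rotated[0] + 1) % p] + rotated[1:]

D = [[1, 1, 1, 2], [0, 1, 2, 3]]
sub, row, updates = encode_row([1, 0], D, 4, no_long_runs, rotate_and_bump)
# The initial row yields [1, 1, 1, 2], which fails the check; one update gives [1, 2, 3, 1].
```

Recording `updates` alongside the row address is what later allows the restoration side to rebuild the exact row that was used.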
Computing the product of the generator matrix and the target data yields the first target subdata corresponding to each data item in the generator matrix (also called the first target subdata of each data item), and the resulting first target subdata together form the DNA data of the target data.
The DNA data of the target data can be regarded as a matrix composed of multiple first target subdata. The row that each first target subdata occupies in this matrix is the same as the row that its corresponding data item occupies in the generator matrix.
S104. Generate the DNA data of the target data based on the first target subdata of each data item in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of every group of data into which the file to be stored is divided.
In this embodiment, the file to be stored is divided into at least one group of data. After steps S101-S104 above have been executed for each group of data to obtain the DNA data of that group, the DNA data of all groups together serve as the DNA data of the file to be stored, and the DNA data of the file to be stored is then stored with DNA as the data storage medium.
Fig. 2 is a flow chart of another data storage method provided by an embodiment of the present application.
As shown in Fig. 2, the method includes:
S201. Determine one group of target data among the at least one group of data into which a file to be stored is divided, where each group of data includes a first preset number of subdata items and the data length of each subdata item is a preset data length;
S202. Generate a parity check bit at the tail of each subdata item in the target data to obtain first target data corresponding to the target data;
In this embodiment, the target data includes multiple subdata items. A parity check bit is generated at the tail of each subdata item, and the subdata items with their parity check bits together form the first target data corresponding to the target data.
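A minimal sketch of S202 follows, assuming binary subdata and even parity. The text only says that a parity check bit is generated, so the choice of even rather than odd parity is an assumption, and `append_parity` is a hypothetical helper name.

```python
def append_parity(subdata):
    """Append an even-parity bit so the extended row contains an even number of 1s."""
    return subdata + [sum(subdata) % 2]

target = [[1, 0, 1, 1], [0, 0, 1, 1]]               # K = 2 subdata items, L = 4
first_target = [append_parity(row) for row in target]
# first_target == [[1, 0, 1, 1, 1], [0, 0, 1, 1, 0]], so each row now has L + 1 bits
```

The extra column explains why the matrices used in the following steps have L + 1 columns.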
S203. Compute the product of the generator matrix and the first target data to obtain the first target subdata of each data item in the generator matrix, where the generator matrix includes a second preset number of data items and the data length of each data item in the generator matrix equals the first preset number;
In this embodiment, preferably, computing the product of the generator matrix and the first target data to obtain the first target subdata of each data item in the generator matrix comprises: superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, where the number of rows of the non-zero matrix equals the first preset number and the number of columns of the non-zero matrix is the preset data length plus 1; and computing the product of the generator matrix and the second target data to obtain the first target subdata of each data item in the generator matrix.
In this embodiment, a non-zero matrix is provided. Its number of rows equals the first preset number, and its number of columns is the preset data length plus 1. That is, compared with the matrix corresponding to the target data, the non-zero matrix has the same number of rows and one more column.
Superimposing the non-zero matrix and the first target data yields the second target data corresponding to the first target data. The product of the generator matrix and this second target data is then computed to obtain the first target subdata of each data item in the generator matrix.
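The superposition step might look like the sketch below. The passage does not define "superimposing", so the example assumes it means entry-by-entry addition modulo 2 of two matrices of equal size (K rows, L + 1 columns); the contents of `mask` are made up for illustration.

```python
def superimpose(first_target, mask, p=2):
    """Add two equally sized matrices entry by entry, reducing mod p (assumed meaning)."""
    assert len(first_target) == len(mask), "row counts must match"
    assert all(len(a) == len(b) for a, b in zip(first_target, mask)), "column counts must match"
    return [[(a + b) % p for a, b in zip(row, mask_row)]
            for row, mask_row in zip(first_target, mask)]

first_target = [[1, 0, 1, 1, 1], [0, 0, 1, 1, 0]]   # K = 2 rows, L + 1 = 5 columns
mask = [[1, 1, 0, 0, 1], [0, 1, 0, 1, 1]]           # hypothetical non-zero matrix
second_target = superimpose(first_target, mask)
# second_target == [[0, 1, 1, 1, 0], [0, 1, 1, 0, 1]]
```

Under this assumption the step is reversible on the restoration side by adding the same non-zero matrix again.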
S204. If the first target subdata of a data item in the generator matrix does not satisfy the gene coding constraint, update that data item in the generator matrix until the first target subdata generated satisfies the gene coding constraint;
S205. Delete the last bit of the first target subdata to obtain the second target subdata of that first target subdata;
In this embodiment, once the first target subdata of a data item in the generator matrix satisfies the gene coding constraint, it is taken as the first target subdata of that data item. Its last bit is then deleted, and the first target subdata with the last bit removed is taken as the second target subdata of that first target subdata.
S206. Generate the DNA data of the target data based on the second target subdata of each first target subdata of the target data, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of every group of data into which the file to be stored is divided.
In this embodiment, the second target subdata of each first target subdata of the target data together constitute the DNA data of the target data.
Further, in the data storage method provided by the embodiments of the present application, to facilitate restoration of the data, identification information may also be generated at the tail of the first target subdata of each data item of the generator matrix. The identification information indicates the target data, the number of times the data item in the generator matrix was updated when the first target subdata was generated, and the address within the generator matrix of the data item used to generate the first target subdata. The DNA data of the target data is then generated based on the first target subdata, carrying the identification information, of each data item in the generator matrix.
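One possible way to carry the three pieces of identification information is sketched below. The field order, the fixed field width, and the binary encoding are all assumptions made for illustration, since the text does not specify a layout; `append_id` and `split_id` are hypothetical helper names.

```python
def append_id(subdata, group_index, update_count, row_address, width=8):
    """Append three fixed-width binary fields to the tail of a subdata row."""
    tail = []
    for value in (group_index, update_count, row_address):
        if not 0 <= value < 2 ** width:
            raise ValueError("field does not fit in the chosen width")
        tail += [int(bit) for bit in format(value, f"0{width}b")]
    return subdata + tail

def split_id(tagged, width=8):
    """Recover (subdata, group_index, update_count, row_address) from a tagged row."""
    body, tail = tagged[:-3 * width], tagged[-3 * width:]
    fields = [int("".join(map(str, tail[i * width:(i + 1) * width])), 2)
              for i in range(3)]
    return (body, *fields)

tagged = append_id([1, 0, 1, 1], group_index=3, update_count=1, row_address=5, width=4)
# split_id(tagged, width=4) == ([1, 0, 1, 1], 3, 1, 5)
```

The group index identifies the target data, while the update count and row address are what the restoration method later uses to rebuild the row of the generator matrix.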
To aid understanding of the data storage method provided by the embodiments of the present application, the process of storing the DNA data of the file to be stored with DNA as the data storage medium is now described in detail.
In this embodiment, the DNA data of the file to be stored can be regarded as the DNA storage sample generated from the file to be stored by the above data storage method. The process of storing this DNA storage sample with DNA as the data storage medium is as follows:
1. Preparation of the sample pool: the DNA storage sample is resuspended and saved (it may be aliquoted before saving).
PCR is carried out using different DNA polymerases.
a) Following the polymerase's instructions, add appropriate amounts of the components, the template, and the forward and reverse primers, and mix; b) calculate the optimal Tm value of the primers; c) run the PCR under the reaction conditions specified for the polymerase, for n cycles in total; d) purify and recover the product obtained and dissolve it in a sample pool of appropriate volume.
2. PCR amplification
Using the product in the main sample pool as the template, PCR reactions are carried out in succession.
First PCR reaction:
a) Following the polymerase's instructions, add appropriate amounts of the components, the template, and the forward and reverse primers, and mix; b) calculate the optimal Tm value of the primers; c) run the PCR under the reaction conditions specified for the polymerase, for n cycles in total; d) verify the PCR product obtained and save it; e) take an appropriate amount of the PCR product as the template for the next PCR reaction, and carry out the next PCR reaction.
The subsequent N PCR reactions are the same as above. After the N PCR reactions are completed, the product of the N-th PCR is sequenced and the sequencing result is compared against the sequence, which completes the extraction of the stored information.
1) The synthesized DNA pool is resuspended in 428.4 μL of 0.5x TE to a final concentration of 150 ng/μL (aliquoted at 50 μL per portion, frozen in liquid nitrogen, and stored in a -80 °C freezer).
2) The DNA fragments are amplified by PCR using a standard DNA polymerase.
PCR reaction conditions:
The purified product is dissolved in 25 μL ddH2O.
After completion, the PCR product is sequenced and the sequencing result is compared against the sequence, which completes the extraction of the stored information.
Further, the embodiments of the present application also provide a data restoration method, whose flow chart is shown in Fig. 3.
As shown in Fig. 3, the method includes:
S301. Determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
S302. Among the at least one first target subdata, classify the first target subdata that indicate the same target data into one group, obtaining multiple groups of first target subdata;
S303. For each group of first target subdata, select from the group a first preset number of first target subdata whose data length is the preset data length, to constitute the third target data corresponding to the group;
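Steps S302 and S303 can be sketched as follows, assuming each sequenced read has already been parsed into a tuple `(group_index, update_count, row_address, subdata)`. That record layout and the helper name `group_and_select` are assumptions, as is the policy of keeping the first K reads of the correct length.

```python
from collections import defaultdict

def group_and_select(records, k, length):
    """Group records by group index and keep the first K rows of the expected length."""
    groups = defaultdict(list)
    for group_index, update_count, row_address, subdata in records:
        if len(subdata) == length:          # discard reads of the wrong length
            groups[group_index].append((row_address, update_count, subdata))
    # Only groups with at least K usable rows can be restored.
    return {g: rows[:k] for g, rows in groups.items() if len(rows) >= k}

records = [
    (0, 0, 0, [1, 2, 3]),
    (0, 1, 2, [3, 2, 1]),
    (1, 0, 1, [0, 0, 1]),
    (0, 0, 1, [1, 1]),        # wrong length, ignored
    (0, 0, 3, [2, 2, 2]),
]
selected = group_and_select(records, k=2, length=3)
# selected == {0: [(0, 0, [1, 2, 3]), (2, 1, [3, 2, 1])]}
```

The K subdata kept for a group correspond to the third target data, and their addresses and update counts feed the inverse matrix construction in the next step.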
S304. For each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
In this embodiment, the inverse matrix of a group of first target subdata can be regarded as the inverse of the generator matrix used when the data storage method of the above embodiments was executed.
S305. For each group of first target subdata, compute the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group, where the fourth target data is used for restoring the file.
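The inversion and product in S304 and S305 can be illustrated with the sketch below. It assumes arithmetic modulo a hypothetical prime p = 5 and takes the K selected rows of the generator matrix as given; how those rows are rebuilt from the address and update count depends on the update rule, which this passage does not define. Gauss-Jordan elimination is used here as one standard way to invert a matrix modulo a prime, not necessarily the method of the application.

```python
def mat_inv_mod(m, p):
    """Invert a square matrix modulo a prime p by Gauss-Jordan elimination."""
    n = len(m)
    # Augment m with the identity matrix: [m | I].
    aug = [[x % p for x in row] + [int(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo p")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)     # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mat_mul_mod(a, b, p):
    """Matrix product with entries reduced modulo p."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)] for row in a]

P = 5                                   # hypothetical prime modulus
D = [[1, 2, 3, 4], [4, 3, 2, 1]]        # original target data (K = 2, L = 4)
G_rows = [[1, 1], [1, 2]]               # the K rows of G used for the selected subdata
C = mat_mul_mod(G_rows, D, P)           # stands in for the third target data
recovered = mat_mul_mod(mat_inv_mod(G_rows, P), C, P)
# recovered == D
```

Because only K of the N encoded rows are needed, any K rows of the generator matrix that form an invertible K x K matrix are sufficient for this step.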
Fig. 4 is a flow chart of another data restoration method provided by an embodiment of the present application.
As shown in Fig. 4, the method includes:
S401. Determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
S402. Among the at least one first target subdata, classify the first target subdata that indicate the same target data into one group, obtaining multiple groups of first target subdata;
S403. For each group of first target subdata, select from the group a first preset number of first target subdata whose data length is the preset data length, to constitute the third target data corresponding to the group;
S404. For each group of first target subdata, generate a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain the fifth target data corresponding to the group;
S405. For each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
S406. For each group of first target subdata, compute the product of the inverse matrix of the group and the fifth target data corresponding to the group to obtain the fourth target data corresponding to the group;
S407. Delete the data at the tail of each data item in the fourth target data to obtain the final fourth target data, which is used for restoring the file.
Correspondingly, Fig. 5 is a structural schematic diagram of a data storage device provided by an embodiment of the present application.
As shown in Fig. 5, the device includes:
A target data determination unit 51, configured to determine one group of target data among the at least one group of data into which a file to be stored is divided, where each group of data includes a first preset number of subdata items and the data length of each subdata item is a preset data length;
A computing unit 52, configured to compute the product of the generator matrix and the target data to obtain the first target subdata of each data item in the generator matrix, where the generator matrix includes a second preset number of data items and the data length of each data item in the generator matrix equals the first preset number;
An updating unit 53, configured to update a data item in the generator matrix if its first target subdata does not satisfy the gene coding constraint, until the first target subdata generated satisfies the gene coding constraint;
A generation unit 54, configured to generate the DNA data of the target data based on the first target subdata of each data item in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of every group of data into which the file to be stored is divided.
Correspondingly, Fig. 6 is a schematic structural diagram of a data restoration device provided by an embodiment of the present application.
As shown in Fig. 6, the device includes:
A DNA data determination unit 61, configured to determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
A grouping unit 62, configured to classify, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
A selection unit 63, configured to, for each group of first target subdata, choose from the group a first preset number of first target subdata whose data length is the preset data length, to constitute the third target data corresponding to the group;
An inverse matrix construction unit 64, configured to, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
A restoration unit 65, configured to, for each group of first target subdata, calculate the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group; the fourth target data is used to restore the file.
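The flow through units 62 to 65 can be sketched as follows, assuming arithmetic over GF(2). In this sketch each record already carries the generator-matrix row that produced it; in the device that row would be rebuilt from the address and number of updates in the identification information, which depends on an update rule not given here. The names `restore`, `gf2_inverse` and `gf2_matmul` are hypothetical.

```python
from collections import defaultdict


def gf2_matmul(a, b):
    """Multiply two 0/1 matrices over GF(2)."""
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in zip(*b)] for row in a]


def gf2_inverse(matrix):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity; the right half becomes the inverse.
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("selected rows are not invertible over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def restore(records, k, sub_len):
    """records: iterable of (group_id, generator_row, sub_data) tuples."""
    groups = defaultdict(list)
    for group_id, g_row, sub_data in records:
        if len(sub_data) == sub_len:  # keep only subdata of the preset data length
            groups[group_id].append((g_row, sub_data))
    restored = {}
    for group_id, members in groups.items():
        if len(members) < k:
            raise ValueError(f"group {group_id}: fewer than {k} usable subdata")
        rows = [g for g, _ in members[:k]]
        third_target_data = [s for _, s in members[:k]]
        restored[group_id] = gf2_matmul(gf2_inverse(rows), third_target_data)
    return restored
```

This sketch takes the first k usable subdata of each group and raises an error if their rows are not invertible; a fuller implementation would probably try another selection instead.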
The present application provides a data storage and restoration method and device. The method determines one group of target data among the at least one group of data into which a file to be stored is divided; calculates the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix; if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, updates the data in the generator matrix until the generated first target subdata satisfies them; and generates the DNA data of the target data based on the first target subdata of each data entry in the generator matrix. In this way the file to be stored is converted into DNA data that can be stored with DNA as the data storage medium, which achieves the purpose of storing data using DNA as the data storage medium.
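To illustrate the last step, the sketch below maps bits to nucleotides and checks a candidate strand. The two-bits-per-base mapping, the homopolymer limit and the GC-content range are common choices in DNA storage work and are assumptions here; this section does not specify the mapping or the exact gene coding constraints.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bits_to_dna(bits):
    """Map each pair of bits to one nucleotide."""
    if len(bits) % 2:
        raise ValueError("bit length must be even for a 2-bit-per-base mapping")
    return "".join(BASES[2 * bits[i] + bits[i + 1]] for i in range(0, len(bits), 2))


def dna_to_bits(strand):
    """Inverse of bits_to_dna."""
    bits = []
    for base in strand:
        index = BASES.index(base)
        bits += [index >> 1, index & 1]
    return bits


def meets_gene_constraints(strand, max_homopolymer=3, gc_range=(0.4, 0.6)):
    """Assumed constraints: bounded homopolymer runs and balanced GC content."""
    if not strand:
        return False
    run = 1
    for previous, current in zip(strand, strand[1:]):
        run = run + 1 if previous == current else 1
        if run > max_homopolymer:
            return False
    gc = sum(base in "GC" for base in strand) / len(strand)
    return gc_range[0] <= gc <= gc_range[1]


strand = bits_to_dna([0, 0, 0, 1, 1, 0, 1, 1])
```

A strand that fails such a check is what would trigger an update of the data in the generator matrix in the method summarized above.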
The data storage and restoration method and device provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. At the same time, for those of ordinary skill in the art, the specific implementation and the scope of application may change according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should also be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A data storage method, characterized by comprising:
Determining one group of target data among the at least one group of data into which a file to be stored is divided, wherein each group of data includes a first preset number of subdata entries, and the data length of each subdata entry is a preset data length;
Calculating the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix, wherein the generator matrix includes a second preset number of data entries, and the data length of each data entry in the generator matrix is equal to the first preset number;
If the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, updating the data in the generator matrix until the generated first target subdata satisfies the gene coding constraints;
Generating the DNA data of the target data based on the first target subdata of each data entry in the generator matrix, wherein the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
2. The method according to claim 1, characterized in that calculating the product of the generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix comprises:
Generating a parity check bit at the tail of each subdata entry in the target data to obtain first target data corresponding to the target data;
Calculating the product of the generator matrix and the first target data to obtain the first target subdata of each data entry in the generator matrix.
3. The method according to claim 2, characterized in that generating the DNA data of the target data based on the first target subdata of each data entry in the generator matrix comprises:
Deleting the last bit of the first target subdata to obtain the second target subdata of the first target subdata;
Generating the DNA data of the target data based on the second target subdata of each first target subdata of the target data.
4. The method according to claim 2, characterized in that calculating the product of the generator matrix and the first target data to obtain the first target subdata of each data entry in the generator matrix comprises:
Superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, wherein the number of rows of the non-zero matrix is equal to the first preset number, and the number of columns of the non-zero matrix is the sum of the preset data length and 1;
Calculating the product of the generator matrix and the second target data to obtain the first target subdata of each data entry in the generator matrix.
5. The method according to any one of claims 1-4, characterized in that generating the DNA data of the target data based on the first target subdata of each data entry in the generator matrix comprises:
Generating identification information at the tail of the first target subdata of each data entry of the generator matrix, wherein the identification information indicates the target data, the number of times the data in the generator matrix was updated when the first target subdata was generated, and the address, within the generator matrix, of the data used to generate the first target subdata;
Generating the DNA data of the target data based on the first target subdata, carrying the identification information, of each data entry in the generator matrix.
6. A data storage device, characterized by comprising:
A target data determination unit, configured to determine one group of target data among the at least one group of data into which a file to be stored is divided, wherein each group of data includes a first preset number of subdata entries, and the data length of each subdata entry is a preset data length;
A computing unit, configured to calculate the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix, wherein the generator matrix includes a second preset number of data entries, and the data length of each data entry in the generator matrix is equal to the first preset number;
An updating unit, configured to, if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, update the data in the generator matrix until the generated first target subdata satisfies the gene coding constraints;
A generation unit, configured to generate the DNA data of the target data based on the first target subdata of each data entry in the generator matrix, wherein the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
7. A data restoration method, characterized by comprising:
Determining DNA data to be restored, wherein the DNA data consists of at least one first target subdata;
Classifying, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
For each group of first target subdata, choosing from the group the first preset number of first target subdata whose data length is the preset data length, to constitute the third target data corresponding to the group;
For each group of first target subdata, constructing the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
For each group of first target subdata, calculating the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group, wherein the fourth target data is used to restore the file.
8. The method according to claim 7, characterized by further comprising:
For each group of first target subdata, generating a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
Wherein calculating, for each group of first target subdata, the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group comprises:
For each group of first target subdata, calculating the product of the inverse matrix of the group and the fifth target data corresponding to the group to obtain the fourth target data corresponding to the group.
9. The method according to claim 8, characterized by further comprising:
Deleting the data at the tail of each data entry in the fourth target data to obtain the final fourth target data.
10. A data restoration device, characterized by comprising:
A DNA data determination unit, configured to determine DNA data to be restored, wherein the DNA data consists of at least one first target subdata;
A grouping unit, configured to classify, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
A selection unit, configured to, for each group of first target subdata, choose from the group the first preset number of first target subdata whose data length is the preset data length, to constitute the third target data corresponding to the group;
An inverse matrix construction unit, configured to, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
A restoration unit, configured to, for each group of first target subdata, calculate the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group, wherein the fourth target data is used to restore the file.
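The optional steps in claims 2 to 5 can be illustrated with a few small helpers. The following is a sketch under stated assumptions, not the claimed implementation: the parity check bit is assumed to be even parity, superimposing is assumed to mean entry-wise XOR, and the identification information is assumed to use fixed-width binary fields. All names and the widths in `ID_WIDTHS` are hypothetical.

```python
def append_parity(sub_data):
    """Claim 2: append an (assumed even) parity check bit to one subdata entry."""
    return sub_data + [sum(sub_data) % 2]


def drop_last_bit(first_target_sub_data):
    """Claim 3: delete the last bit to obtain the second target subdata."""
    return first_target_sub_data[:-1]


def superimpose(first_target_data, non_zero_matrix):
    """Claim 4: combine data and mask entry-wise (assumed to be XOR)."""
    if not any(bit for row in non_zero_matrix for bit in row):
        raise ValueError("the mask must be a non-zero matrix")
    if [len(r) for r in first_target_data] != [len(r) for r in non_zero_matrix]:
        raise ValueError("mask must match the shape of the first target data")
    return [[a ^ b for a, b in zip(row, mask_row)]
            for row, mask_row in zip(first_target_data, non_zero_matrix)]


ID_WIDTHS = (8, 4, 8)  # hypothetical bit widths: group, update count, address


def attach_identification(sub_data, group, update_count, address):
    """Claim 5: append fixed-width identification fields to the tail."""
    tail = []
    for value, width in zip((group, update_count, address), ID_WIDTHS):
        if not 0 <= value < 2 ** width:
            raise ValueError(f"value {value} does not fit in {width} bits")
        tail += [(value >> shift) & 1 for shift in reversed(range(width))]
    return sub_data + tail


def split_identification(tagged):
    """Recover (sub_data, (group, update_count, address)) from a tagged entry."""
    total = sum(ID_WIDTHS)
    payload, tail = tagged[:-total], tagged[-total:]
    fields, position = [], 0
    for width in ID_WIDTHS:
        fields.append(int("".join(map(str, tail[position:position + width])), 2))
        position += width
    return payload, tuple(fields)
```

One property worth noting under the even-parity and GF(2) assumptions: the XOR of entries that each contain an even number of ones also contains an even number of ones, so a parity bit deleted before storage can be recomputed from the remaining bits on restoration. Whether the application relies on exactly this property is not stated in this section.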
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910132713.6A CN109887549B (en) | 2019-02-22 | 2019-02-22 | Data storage and restoration method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910132713.6A CN109887549B (en) | 2019-02-22 | 2019-02-22 | Data storage and restoration method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109887549A true CN109887549A (en) | 2019-06-14 |
CN109887549B CN109887549B (en) | 2023-01-20 |
Family
ID=66928942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910132713.6A Active CN109887549B (en) | 2019-02-22 | 2019-02-22 | Data storage and restoration method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109887549B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040061702A1 (en) * | 2002-08-08 | 2004-04-01 | Robert Kincaid | Methods and system for simultaneous visualization and manipulation of multiple data types |
US20040221223A1 (en) * | 2003-04-29 | 2004-11-04 | Nam-Yul Yu | Apparatus and method for encoding a low density parity check code |
CN104850411A (en) * | 2015-06-10 | 2015-08-19 | 清华大学 | Storage system reference evaluation program generating method and apparatus |
WO2015180203A1 (en) * | 2014-05-30 | 2015-12-03 | 周家锐 | High-throughput dna sequencing quality score lossless compression system and compression method |
CN105760706A (en) * | 2014-12-15 | 2016-07-13 | 深圳华大基因研究院 | Compression method for next generation sequencing data |
CN107055468A (en) * | 2012-06-01 | 2017-08-18 | 欧洲分子生物学实验室 | The high-capacity storage of digital information in DNA |
CN107798219A (en) * | 2016-08-30 | 2018-03-13 | 清华大学 | Data are subjected to biometric storage and the method reduced |
US20180089369A1 (en) * | 2016-05-19 | 2018-03-29 | Seven Bridges Genomics Inc. | Systems and methods for sequence encoding, storage, and compression |
US20180265921A1 (en) * | 2017-03-15 | 2018-09-20 | Microsoft Technology Licensing, Llc | Random access of data encoded by polynucleotides |
CN109074424A (en) * | 2016-05-04 | 2018-12-21 | 深圳华大生命科学研究院 | Utilize method, its coding/decoding method and the application of DNA storage text information |
Non-Patent Citations (2)
Title |
---|
HANXU HOU: "BASIC Codes: Low-Complexity Regenerating", 《IEEE TRANSACTIONS ON INFORMATION THEORY》 * |
SIDDHARTH JAIN: "Duplication-Correcting Codes for Data Storage", 《IEEE TRANSACTIONS ON INFORMATION THEORY》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113096742A (en) * | 2021-04-14 | 2021-07-09 | 湖南科技大学 | DNA information storage parallel addressing writing method and system |
CN113096742B (en) * | 2021-04-14 | 2022-06-14 | 湖南科技大学 | DNA information storage parallel addressing writing method and system |
CN118227947A (en) * | 2024-05-22 | 2024-06-21 | 北京灵汐科技有限公司 | Data storage and data processing method and device for matrix, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN109887549B (en) | 2023-01-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||