CN109887549A - Data storage and restoration method and device - Google Patents

Publication number
CN109887549A
Authority
CN
China
Prior art keywords
data
subdata
group
generator matrix
target
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910132713.6A
Other languages
Chinese (zh)
Other versions
CN109887549B (en
Inventor
郝建业
齐浩
张程伟
侯韩旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201910132713.6A
Publication of CN109887549A
Application granted
Publication of CN109887549B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a data storage and restoration method and device. The method determines one group of target data among the at least one group of data into which a file to be stored is divided; calculates the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix; if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, updates that data entry until the generated first target subdata satisfies the gene coding constraints; and generates the DNA data of the target data based on the first target subdata of each data entry in the generator matrix. In this way the file to be stored is converted into DNA data that can be stored using DNA as the data storage medium, which achieves data storage with DNA as the data storage medium.

Description

Data storage and restoration method and device
Technical field
The present invention relates to the technical field of computer software, and more specifically to a data storage and restoration method and device.
Background
The exponential growth of data volume has outpaced the growth of hard disk storage capacity. This places higher demands on the performance of storage devices and calls for more effective storage technology to guarantee data storage.
Research has found that DNA is an excellent data storage medium: it exhibits an information density on the order of petabytes per gram and high durability. Moreover, when DNA is used as the data storage medium, data does not need to be stored on a hard disk. Using DNA as the data storage medium can therefore effectively avoid the problem that data cannot be stored effectively because hard disk capacity cannot keep up with the growth of data volume.
Summary of the invention
In view of this, the present invention provides a data storage and restoration method and device, so as to achieve data storage using DNA as the data storage medium.
The technical solution is as follows:
A data storage method, comprising:
Determining one group of target data among the at least one group of data into which a file to be stored is divided, where the data includes a first preset number of subdata and the data length of each subdata is a preset data length;
Calculating the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix, where the generator matrix includes a second preset number of data entries and the data length of each data entry in the generator matrix is equal to the first preset number;
If the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, updating that data entry in the generator matrix until the generated first target subdata satisfies the gene coding constraints;
Generating the DNA data of the target data based on the first target subdata of each data entry in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
Preferably, calculating the product of the generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix comprises:
Generating a parity check bit at the tail of each subdata in the target data to obtain first target data corresponding to the target data;
Calculating the product of the generator matrix and the first target data to obtain the first target subdata of each data entry in the generator matrix.
Preferably, generating the DNA data of the target data based on the first target subdata of each data entry in the generator matrix comprises:
Deleting the last bit of the first target subdata to obtain the second target subdata of the first target subdata;
Generating the DNA data of the target data based on the second target subdata of each target subdata in the target data.
Preferably, calculating the product of the generator matrix and the first target data to obtain the first target subdata of each data entry in the generator matrix comprises:
Superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, where the number of rows of the non-zero matrix is equal to the first preset number and the number of columns of the non-zero matrix is the preset data length plus 1;
Calculating the product of the generator matrix and the second target data to obtain the first target subdata of each data entry in the generator matrix.
Preferably, generating the DNA data of the target data based on the first target subdata of each data entry in the generator matrix comprises:
Generating identification information at the tail of the first target subdata of each data entry of the generator matrix, where the identification information indicates the target data, the number of times the data entry in the generator matrix was updated when the first target subdata was generated, and the address in the generator matrix of the data entry used to generate the first target subdata;
Generating the DNA data of the target data based on the first target subdata, carrying the identification information, of each data entry in the generator matrix.
A data storage device, comprising:
A target data determination unit, configured to determine one group of target data among the at least one group of data into which a file to be stored is divided, where the data includes a first preset number of subdata and the data length of each subdata is a preset data length;
A computing unit, configured to calculate the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix, where the generator matrix includes a second preset number of data entries and the data length of each data entry in the generator matrix is equal to the first preset number;
An updating unit, configured to, if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, update that data entry in the generator matrix until the generated first target subdata satisfies the gene coding constraints;
A generation unit, configured to generate the DNA data of the target data based on the first target subdata of each data entry in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
A data restoration method, comprising:
Determining DNA data to be restored, where the DNA data consists of at least one first target subdata;
Among the at least one first target subdata, classifying the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
For each group of first target subdata, selecting from the group the first preset number of first target subdata whose data length is the preset data length, to form the third target data corresponding to the group;
For each group of first target subdata, constructing the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
For each group of first target subdata, calculating the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain the fourth target data corresponding to the group; the fourth target data is used for restoring the file.
Preferably, the method further comprises:
For each group of first target subdata, generating a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
In this case, for each group of first target subdata, calculating the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group comprises:
For each group of first target subdata, calculating the product of the inverse matrix of the group and the fifth target data corresponding to the group, to obtain the fourth target data corresponding to the group.
Preferably, the method further comprises:
Deleting the data at the tail of each data entry in the fourth target data to obtain the final fourth target data.
A data restoration device, comprising:
A DNA data determination unit, configured to determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
A grouping unit, configured to classify, among the at least one first target subdata, the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
A selection unit, configured to, for each group of first target subdata, select from the group the first preset number of first target subdata whose data length is the preset data length, to form the third target data corresponding to the group;
An inverse matrix construction unit, configured to, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
A restoration unit, configured to, for each group of first target subdata, calculate the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain the fourth target data corresponding to the group; the fourth target data is used for restoring the file.
The application provides a data storage and restoration method and device. The method determines one group of target data among the at least one group of data into which a file to be stored is divided; calculates the product of a generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix; if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, updates that data entry until the generated first target subdata satisfies the gene coding constraints; and generates the DNA data of the target data based on the first target subdata of each data entry in the generator matrix. In this way the file to be stored is converted into DNA data that can be stored using DNA as the data storage medium, which achieves data storage with DNA as the data storage medium.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of a data storage method provided by an embodiment of the present application;
Fig. 2 is a flowchart of another data storage method provided by an embodiment of the present application;
Fig. 3 is a flowchart of a data restoration method provided by an embodiment of the present application;
Fig. 4 is a flowchart of another data restoration method provided by an embodiment of the present application;
Fig. 5 is a structural schematic diagram of a data storage device provided by an embodiment of the present application;
Fig. 6 is a structural schematic diagram of a data restoration device provided by an embodiment of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment:
Fig. 1 is a flowchart of a data storage method provided by an embodiment of the present application. As shown in Fig. 1, the method includes:
S101, determine one group of target data among the at least one group of data into which the file to be stored is divided, where the data includes a first preset number of subdata and the data length of each subdata is a preset data length;
In this embodiment of the application, the file to be stored can be compressed with a standard lossless compression algorithm and packed into a compressed file. The compressed file is then divided into non-overlapping groups of length K*L (that is, the compressed file is divided into multiple non-overlapping groups, each of which is regarded as one group of data), and each non-overlapping group contains K binary data entries of length L.
Each group of data into which the file to be stored is divided can be regarded as a matrix with K rows and L columns, which can be described by a vector D = (D1, D2, ..., DK), where Dk denotes the elements in row k of the matrix.
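As an illustration only, the grouping described for step S101 might be sketched in Python as follows. The use of zlib as the lossless compressor, the zero padding of the final group, and the name split_into_groups are assumptions for this sketch and are not specified by the application.

```python
import zlib

def split_into_groups(payload, K, L):
    """Compress the payload and cut its bit stream into non-overlapping K x L bit matrices."""
    compressed = zlib.compress(payload)
    # Unpack every byte into 8 bits, most significant bit first.
    bits = [(byte >> (7 - i)) & 1 for byte in compressed for i in range(8)]
    group_size = K * L
    # Assumption: zero-pad the tail so that the last group is complete.
    bits += [0] * (-len(bits) % group_size)
    groups = []
    for start in range(0, len(bits), group_size):
        chunk = bits[start:start + group_size]
        # Row k of the matrix plays the role of D_k: one subdata of length L.
        groups.append([chunk[r * L:(r + 1) * L] for r in range(K)])
    return groups

groups = split_into_groups(b"hello DNA storage" * 4, K=4, L=16)
```

Each element of groups is one K-row, L-column matrix D, that is, one group of target data in the sense of S101.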
S102, calculate the product of the generator matrix and the target data to obtain the first target subdata of each data entry in the generator matrix, where the generator matrix includes a second preset number of data entries and the data length of each data entry in the generator matrix is equal to the first preset number;
In this embodiment of the application, the generator matrix G is initialized, where the initialized generator matrix G is defined as follows:
The initialized generator matrix is a matrix with N rows and K columns. The generator matrix G is a matrix defined over Rp and is a modified form of a generalized circulant matrix.
Each row of elements in the generator matrix can be regarded as one data entry, so the generator matrix includes the second preset number of data entries. The number of elements in each row of the generator matrix is equal to the first preset number, and the number of elements in a row can be regarded as the length of the data entry corresponding to that row. That is, the number of rows of the generator matrix is the second preset number, and the number of columns of the generator matrix is equal to the first preset number.
S103, if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, update that data entry in the generator matrix until the generated first target subdata satisfies the gene coding constraints;
In this embodiment of the application, the data to be stored is divided into multiple groups of data, and each group of data can be regarded as one target data. The product of the generator matrix G and each group of target data is calculated separately to obtain the product result of that group of target data.
The target data is a K*L matrix and the generator matrix G is an N*K matrix, so the product of the generator matrix G and the target data is an N*L matrix, which can be regarded as the product result of the group of target data. Each row of elements in the generator matrix can be regarded as one data entry in the generator matrix, so calculating the product of the generator matrix and the target data can be regarded as the process of calculating the first target subdata corresponding to each data entry in the generator matrix. Each row of elements in the product result of the group of target data can be regarded as one first target subdata, and the first target subdata corresponding to a data entry in the generator matrix is the first target subdata in the corresponding row of the product result.
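The dimensions of this product can be checked with a small Python sketch. It assumes, purely for illustration, that the arithmetic is bitwise over GF(2); the application defines G over Rp, whose arithmetic is not reproduced here, and the matrices G and D below are made-up examples.

```python
def gf2_matmul(G, D):
    """Multiply an N x K bit matrix by a K x L bit matrix over GF(2) (assumed arithmetic)."""
    k = len(D)
    assert all(len(row) == k for row in G), "columns of G must equal rows of D"
    return [
        [sum(g_row[t] & D[t][j] for t in range(k)) % 2 for j in range(len(D[0]))]
        for g_row in G
    ]

# Hypothetical example with N = 4, K = 3, L = 4.
G = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1],
     [1, 1, 1]]
D = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 0]]
C = gf2_matmul(G, D)  # N x L: row i is the first target subdata of row i of G
```

Row i of C depends only on row i of G, which is why a single data entry of the generator matrix can be updated without recomputing the other rows.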
In this embodiment of the application, when the product of the generator matrix and the target data is calculated, if the first target subdata obtained for a data entry in the generator matrix does not satisfy the gene coding constraints, that data entry in the generator matrix needs to be updated until the first target subdata generated from it satisfies the gene coding constraints. At that point the data entry is no longer updated, the first target subdata obtained at that time is taken as the first target subdata corresponding to the data entry, and the number of updates of the data entry is recorded.
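The update loop can be sketched in Python as below. The application does not state the gene coding constraints, the bit-to-base mapping, or the update rule, so all three are hypothetical here: two bits per base, a homopolymer run of at most 2, a GC fraction between 0.25 and 0.75, and an update rule that steps the row through the non-zero K-bit patterns. GF(2) arithmetic is likewise an assumption.

```python
BASES = "ACGT"  # hypothetical mapping: 00->A, 01->C, 10->G, 11->T

def to_bases(bits):
    return "".join(BASES[2 * bits[i] + bits[i + 1]] for i in range(0, len(bits), 2))

def satisfies_constraints(strand, max_run=2, gc_range=(0.25, 0.75)):
    """Hypothetical constraints: bounded homopolymer runs and balanced GC content."""
    run = longest = 1
    for prev, cur in zip(strand, strand[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    gc = sum(base in "GC" for base in strand) / len(strand)
    return longest <= max_run and gc_range[0] <= gc <= gc_range[1]

def encode_row(row, D):
    """One row of the product G*D over GF(2): XOR of the subdata selected by the row."""
    out = [0] * len(D[0])
    for coeff, sub in zip(row, D):
        if coeff:
            out = [a ^ b for a, b in zip(out, sub)]
    return out

def next_row(row):
    """Hypothetical update rule: step to the next non-zero K-bit pattern."""
    k = len(row)
    value = int("".join(map(str, row)), 2) % (2 ** k - 1) + 1
    return [int(ch) for ch in format(value, f"0{k}b")]

def encode_with_updates(row, D, max_updates=16):
    for updates in range(max_updates + 1):
        strand = to_bases(encode_row(row, D))
        if satisfies_constraints(strand):
            return row, updates, strand  # the update count is recorded for restoration
        row = next_row(row)
    raise ValueError("no row satisfied the constraints")

D = [[0, 0, 0, 1, 1, 0, 1, 1],
     [1, 1, 1, 1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 1, 1, 0]]
result = encode_with_updates([0, 1, 0], D)
```

The returned update count is the quantity that the description says must be recorded, since the restoration side needs it to rebuild the row that was actually used.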
Calculating the product of the generator matrix and the target data yields the first target subdata corresponding to each data entry in the generator matrix (also called the first target subdata of each data entry), and the obtained first target subdata together serve as the DNA data of the target data.
The DNA data of the target data can be regarded as a matrix composed of multiple first target subdata, in which the row of each first target subdata is the same as the row at which the data entry corresponding to that first target subdata is located in the generator matrix.
S104, generate the DNA data of the target data based on the first target subdata of each data entry in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
In this embodiment of the application, the file to be stored is divided into at least one group of data. After steps S101-S104 above are executed for each group of data to obtain the DNA data of that group, the DNA data of every group among the at least one group of data is taken as the DNA data of the file to be stored, and the DNA data of the file to be stored is then stored using DNA as the data storage medium.
Fig. 2 is a flowchart of another data storage method provided by an embodiment of the present application.
As shown in Fig. 2, the method includes:
S201, determine one group of target data among the at least one group of data into which the file to be stored is divided, where the data includes a first preset number of subdata and the data length of each subdata is a preset data length;
S202, generate a parity check bit at the tail of each subdata in the target data to obtain first target data corresponding to the target data;
In this embodiment of the application, the target data includes multiple subdata. A parity check bit is generated at the tail of each subdata, and the subdata carrying the generated parity check bits together form the first target data corresponding to the target data.
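A minimal Python sketch of step S202 follows. Even parity is an assumption; the application does not say whether the parity check bit is even or odd.

```python
def append_parity(subdata):
    """Append an even-parity bit (assumed convention) to the tail of one subdata."""
    return subdata + [sum(subdata) % 2]

# Hypothetical target data with K = 2 subdata of length L = 4.
target_data = [[1, 0, 1, 1],
               [0, 1, 1, 0]]
first_target_data = [append_parity(row) for row in target_data]  # K x (L + 1)
```

With this convention every row of first_target_data contains an even number of ones, and each row is one bit longer than the preset data length.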
S203, calculate the product of the generator matrix and the first target data to obtain the first target subdata of each data entry in the generator matrix, where the generator matrix includes a second preset number of data entries and the data length of each data entry in the generator matrix is equal to the first preset number;
In this embodiment of the application, preferably, calculating the product of the generator matrix and the first target data to obtain the first target subdata of each data entry in the generator matrix comprises: superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, where the number of rows of the non-zero matrix is equal to the first preset number and the number of columns of the non-zero matrix is the preset data length plus 1; and calculating the product of the generator matrix and the second target data to obtain the first target subdata of each data entry in the generator matrix.
In this embodiment of the application, a non-zero matrix is provided whose number of rows is equal to the first preset number and whose number of columns is the preset data length plus 1. That is, compared with the matrix corresponding to the target data, the non-zero matrix has the same number of rows and one more column.
In this embodiment of the application, superimposing the non-zero matrix and the first target data yields the second target data corresponding to the first target data. The product of the generator matrix and the second target data is then calculated to obtain the first target subdata of each data entry in the generator matrix.
S204, if the first target subdata of a data entry in the generator matrix does not satisfy the gene coding constraints, update that data entry in the generator matrix until the generated first target subdata satisfies the gene coding constraints;
S205, delete the last bit of the first target subdata to obtain the second target subdata of the first target subdata;
In this embodiment of the application, when the first target subdata obtained for a data entry in the generator matrix satisfies the gene coding constraints, that first target subdata is taken as the first target subdata of the data entry, its last bit is deleted, and the first target subdata with the last bit deleted is taken as the second target subdata of the first target subdata.
S206, generate the DNA data of the target data based on the second target subdata of each target subdata in the target data, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
In this embodiment of the application, the second target subdata of each target subdata in the target data together form the DNA data of the target data.
Further, in the data storage method provided by the embodiments of the application, to facilitate restoration of the data, identification information can also be generated at the tail of the first target subdata of each data entry of the generator matrix. The identification information indicates the target data, the number of times the data entry in the generator matrix was updated when the first target subdata was generated, and the address in the generator matrix of the data entry used to generate the first target subdata. The DNA data of the target data is then generated based on the first target subdata, carrying the identification information, of each data entry in the generator matrix.
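One way to picture the identification information is as three fixed-width bit fields appended to the tail of a first target subdata. The Python sketch below is only an illustration: the field widths, the field order, and the names Tag, attach_tag, and detach_tag are hypothetical and are not given in the application.

```python
from typing import NamedTuple

class Tag(NamedTuple):
    group: int    # which target data (group) the subdata belongs to
    updates: int  # how many times the generator-matrix row was updated
    address: int  # row index of the data entry in the generator matrix

WIDTHS = (8, 4, 4)  # hypothetical field widths in bits

def to_bits(value, width):
    if not 0 <= value < 2 ** width:
        raise ValueError("value does not fit in the field")
    return [int(ch) for ch in format(value, f"0{width}b")]

def attach_tag(subdata, tag):
    """Append the identification fields to the tail of a first target subdata."""
    bits = list(subdata)
    for value, width in zip(tag, WIDTHS):
        bits += to_bits(value, width)
    return bits

def detach_tag(tagged):
    """Split a tagged subdata back into its payload and identification fields."""
    total = sum(WIDTHS)
    payload, tail = tagged[:-total], tagged[-total:]
    fields, pos = [], 0
    for width in WIDTHS:
        fields.append(int("".join(map(str, tail[pos:pos + width])), 2))
        pos += width
    return payload, Tag(*fields)

tagged = attach_tag([1, 0, 1, 1], Tag(group=3, updates=2, address=1))
payload, tag = detach_tag(tagged)
```

Carrying the group, the update count, and the address with every subdata is what later allows the restoration side to group the subdata and rebuild the rows of the generator matrix.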
To facilitate understanding of the data storage method provided by the embodiments of the application, the process of storing the DNA data of the file to be stored using DNA as the data storage medium is now described in detail.
In this embodiment of the application, the DNA data of the file to be stored can be regarded as the DNA storage sample generated from the file to be stored by the above data storage method. The process of storing the DNA storage sample using DNA as the data storage medium is as follows:
1. Preparation of the sample pool: resuspend the DNA storage sample and store it (it can be aliquoted before storage).
PCR is carried out using different DNA polymerases.
a) Following the polymerase instructions, add appropriate amounts of the components, template, and forward and reverse primers, and mix; b) calculate the optimal Tm value of the primers; c) carry out the PCR reaction under the reaction conditions specified for the polymerase, running n cycles in total; d) purify and recover the resulting product and dissolve it in a sample pool of appropriate volume.
2. PCR amplification
Using the product in the main sample pool as the template, carry out the PCR reactions in sequence.
First PCR reaction:
a) Following the polymerase instructions, add appropriate amounts of the components, template, and forward and reverse primers, and mix; b) calculate the optimal Tm value of the primers; c) carry out the PCR reaction under the reaction conditions specified for the polymerase, running n cycles in total; d) verify and store the resulting PCR product; e) take an appropriate amount of the resulting PCR product as the template for the next PCR reaction, and carry out the next PCR reaction.
The subsequent N PCR reactions are the same as above. After the N PCR rounds are complete, the product of the N-th PCR is sequenced and the sequencing result is compared with the sequence, which completes the extraction of the stored information.
1) Resuspend the synthesized DNA pool in 428.4 μL of 0.5x TE to a final concentration of 150 ng/μL (aliquot at 50 μL per portion, flash-freeze in liquid nitrogen, and store in a -80 °C freezer).
2) Amplify the DNA fragments by PCR using a standard DNA polymerase.
PCR reaction conditions:
Dissolve the purified product in 25 μL ddH2O.
After completion, sequence the PCR product and compare the sequencing result with the sequence, which completes the extraction of the stored information.
Further, an embodiment of the application also provides a data restoration method, whose flowchart is shown in Fig. 3.
As shown in Fig. 3, the method includes:
S301, determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
S302, among the at least one first target subdata, classify the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
S303, for each group of first target subdata, select from the group the first preset number of first target subdata whose data length is the preset data length, to form the third target data corresponding to the group;
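Steps S302 and S303 can be pictured with the following Python sketch. The record layout (group_id, address, bits), the choice to keep the first K valid entries, and the decision to skip groups with fewer than K valid entries are assumptions made for illustration; the application does not describe how such cases are handled.

```python
from collections import defaultdict

def group_and_select(records, K, L):
    """Group decoded subdata by target data, then keep K entries of length L per group.

    records: iterable of (group_id, address, bits) tuples (hypothetical layout).
    """
    groups = defaultdict(list)
    for group_id, address, bits in records:
        if len(bits) == L:  # only subdata with the preset data length qualify
            groups[group_id].append((address, bits))
    selected = {}
    for group_id, entries in groups.items():
        if len(entries) >= K:  # assumption: groups with too few entries are skipped
            selected[group_id] = entries[:K]
    return selected

records = [
    (0, 0, [1, 0, 1, 1]),
    (0, 2, [1, 1, 0, 0]),
    (0, 1, [0, 1, 1]),       # wrong length, ignored
    (0, 3, [0, 0, 0, 1]),
    (1, 0, [1, 1, 1, 1]),    # group 1 has too few entries for K = 2
]
third_target = group_and_select(records, K=2, L=4)
```

The addresses are kept next to the selected bits because they determine which rows of the generator matrix are needed when the inverse matrix is constructed.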
S304, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
In this embodiment of the application, the inverse matrix of a group of first target subdata can be regarded as the inverse of the generator matrix that was used when the data storage method of the above embodiments was executed.
S305, for each group of first target subdata, calculate the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain the fourth target data corresponding to the group; the fourth target data is used for restoring the file.
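Steps S304 and S305 amount to inverting the K rows of the generator matrix that produced the selected subdata and multiplying the inverse by the third target data. The Python sketch below assumes GF(2) arithmetic and made-up matrices for illustration; the application defines the generator matrix over Rp, and the actual inversion would follow that arithmetic.

```python
def gf2_inverse(M):
    """Invert a square bit matrix over GF(2) by Gauss-Jordan elimination."""
    k = len(M)
    # Augment M with the identity matrix: [M | I].
    aug = [list(row) + [int(i == j) for j in range(k)] for i, row in enumerate(M)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def gf2_matmul(A, B):
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)] for row in A]

# Hypothetical example: K = 3 surviving rows of a generator matrix and the data they encoded.
G_sub = [[1, 0, 0],
         [0, 0, 1],
         [1, 1, 1]]
D = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 0]]
third_target_data = gf2_matmul(G_sub, D)   # what the reader sees after selection
inverse = gf2_inverse(G_sub)               # S304
fourth_target_data = gf2_matmul(inverse, third_target_data)  # S305
```

In this toy case fourth_target_data equals D again, which is the sense in which the fourth target data is used for restoring the file.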
Fig. 4 is a flowchart of another data restoration method provided by an embodiment of the present application.
As shown in Fig. 4, the method includes:
S401, determine DNA data to be restored, where the DNA data consists of at least one first target subdata;
S402, among the at least one first target subdata, classify the first target subdata that indicate the same target data into one group, to obtain multiple groups of first target subdata;
S403, for each group of first target subdata, select from the group the first preset number of first target subdata whose data length is the preset data length, to form the third target data corresponding to the group;
S404, for each group of first target subdata, generate a parity check bit at the tail of each first target subdata in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
S405, for each group of first target subdata, construct the inverse matrix of the group according to the address and the number of updates indicated by the identification information of each first target subdata in the group;
S406, for each group of first target subdata, calculate the product of the inverse matrix of the group and the fifth target data corresponding to the group, to obtain the fourth target data corresponding to the group;
S407, delete the data at the tail of each data entry in the fourth target data to obtain the final fourth target data, which is used for restoring the file.
Correspondingly, Fig. 5 is a schematic structural diagram of a data storage device provided by an embodiment of the present application.
As shown in Fig. 5, the device includes:
a target data determination unit 51, configured to determine one group of target data among the at least one group of data into which a file to be stored is divided, where the data includes a first preset number of pieces of sub-data and the data length of each piece of sub-data is a preset data length;
a computing unit 52, configured to calculate the product of a generator matrix and the target data to obtain the first target sub-data of each piece of data in the generator matrix, where the generator matrix includes a second preset number of pieces of data and the data length of each piece of data in the generator matrix equals the first preset number;
an updating unit 53, configured to, if the first target sub-data of a piece of data in the generator matrix does not satisfy the gene coding constraint, update the data in the generator matrix until the generated first target sub-data satisfies the gene coding constraint;
a generation unit 54, configured to generate the DNA data of the target data based on the first target sub-data of each piece of data in the generator matrix, where the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
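As a non-limiting illustration, the cooperation of the computing unit, the updating unit and the generation unit can be sketched in Python as follows. Several details are assumptions made only for this sketch and are not stated in this passage: arithmetic over GF(2), treating each piece of data in the generator matrix as a row, a two-bits-per-base mapping, a gene coding constraint that forbids runs of more than three identical bases, and the rule in candidate_row for deriving a row from its address and update count. All function names are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def to_dna(bits):
    """Map consecutive bit pairs to bases (assumed mapping)."""
    return "".join(BASES[2 * hi + lo] for hi, lo in zip(bits[0::2], bits[1::2]))


def satisfies_constraint(strand, max_run=3):
    """Assumed constraint: no run of more than max_run identical bases."""
    run = 1
    for prev, cur in zip(strand, strand[1:]):
        run = run + 1 if prev == cur else 1
        if run > max_run:
            return False
    return True


def candidate_row(address, updates, k):
    """Hypothetical rule: a nonzero k-bit row fixed by (address, updates)."""
    value = (address + updates) % (2 ** k - 1) + 1
    return [(value >> j) & 1 for j in range(k)]


def encode_group(target, n, max_updates=64):
    """Return (address, update count, strand) for each of the n rows."""
    k = len(target)
    if any(len(row) % 2 for row in target):
        raise ValueError("sub-data length must be even for the 2-bit mapping")
    result = []
    for address in range(n):
        for updates in range(max_updates):
            row = candidate_row(address, updates, k)
            # Product of one generator row with the target data over GF(2).
            bits = [sum(g & d for g, d in zip(row, col)) % 2
                    for col in zip(*target)]
            strand = to_dna(bits)
            if satisfies_constraint(strand):
                result.append((address, updates, strand))
                break
        else:
            raise RuntimeError(f"no valid row found for address {address}")
    return result
```

Recording the address and the update count next to each strand is what would let a decoder rebuild the same row later. The candidate rule shown here does not guarantee that the rows finally chosen are jointly invertible; a practical scheme would need to ensure that.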
Correspondingly, Fig. 6 is a schematic structural diagram of a data restoration device provided by an embodiment of the present application.
As shown in Fig. 6, the device includes:
a DNA data determination unit 61, configured to determine DNA data to be restored, where the DNA data consists of at least one first target sub-data;
a grouping unit 62, configured to classify, among the at least one first target sub-data, the first target sub-data that indicate the same target data into one group, to obtain multiple groups of first target sub-data;
a selection unit 63, configured to, for each group of first target sub-data, select from the group a first preset number of first target sub-data whose data length equals the preset data length, to form the third target data corresponding to the group;
an inverse matrix construction unit 64, configured to, for each group of first target sub-data, construct the inverse matrix of the group according to the address and the update count indicated by the identification information of each first target sub-data in the group;
a restoration unit 65, configured to, for each group of first target sub-data, calculate the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain the fourth target data corresponding to the group, where the fourth target data is used to restore the file.
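As a non-limiting illustration, the grouping unit and the selection unit can be sketched in Python as follows. The Strand record and its field names are hypothetical; the sketch assumes that the identification information of each first target sub-data has already been parsed into a target-data identifier, an address and an update count. Keeping only one strand per address is an additional assumption of the sketch, made so that the rows selected for one group are distinct.

```python
from collections import defaultdict
from typing import NamedTuple


class Strand(NamedTuple):
    """Hypothetical parsed form of one first target sub-data."""
    target_id: int   # which target data the strand belongs to
    address: int     # address of the generating row in the generator matrix
    updates: int     # how many times that row was updated
    payload: str     # the data-carrying part of the strand


def group_and_select(strands, k, payload_len):
    """Group strands by target data and pick k well-formed strands per group."""
    groups = defaultdict(dict)
    for strand in strands:
        # Selection unit: discard strands whose length is not the preset length.
        if len(strand.payload) != payload_len:
            continue
        # Grouping unit: same target_id means same group; keep one per address.
        groups[strand.target_id].setdefault(strand.address, strand)

    selected = {}
    for target_id, by_address in groups.items():
        if len(by_address) < k:
            raise ValueError(f"group {target_id} has fewer than {k} usable strands")
        selected[target_id] = [by_address[a] for a in sorted(by_address)[:k]]
    return selected
```

The address and update count of each selected strand are then available for constructing the inverse matrix of the group.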
The present application provides a data storage method, a data restoration method, and corresponding devices. The method determines one group of target data among the at least one group of data into which a file to be stored is divided; calculates the product of a generator matrix and the target data to obtain the first target sub-data of each piece of data in the generator matrix; if the first target sub-data of a piece of data in the generator matrix does not satisfy the gene coding constraint, updates the data in the generator matrix until the generated first target sub-data satisfies the gene coding constraint; and generates the DNA data of the target data based on the first target sub-data of each piece of data in the generator matrix. In this way the file to be stored is converted into DNA data for data storage with DNA as the data storage medium, so that data storage on DNA can be realized.
The data storage and restoration methods and devices provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. At the same time, for those of ordinary skill in the art, the specific implementation and the scope of application may change in accordance with the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the devices disclosed in the embodiments correspond to the methods disclosed in the embodiments, they are described relatively briefly; for relevant details, refer to the description of the methods.
It should also be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A data storage method, characterized by comprising:
determining one group of target data among the at least one group of data into which a file to be stored is divided, wherein the data includes a first preset number of pieces of sub-data, and the data length of each piece of sub-data is a preset data length;
calculating the product of a generator matrix and the target data to obtain the first target sub-data of each piece of data in the generator matrix, wherein the generator matrix includes a second preset number of pieces of data, and the data length of each piece of data in the generator matrix equals the first preset number;
if the first target sub-data of a piece of data in the generator matrix does not satisfy a gene coding constraint, updating the data in the generator matrix until the generated first target sub-data satisfies the gene coding constraint;
generating the DNA data of the target data based on the first target sub-data of each piece of data in the generator matrix, wherein the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
2. The method according to claim 1, characterized in that calculating the product of the generator matrix and the target data to obtain the first target sub-data of each piece of data in the generator matrix comprises:
generating a parity bit at the tail of each piece of sub-data in the target data to obtain first target data corresponding to the target data;
calculating the product of the generator matrix and the first target data to obtain the first target sub-data of each piece of data in the generator matrix.
3. The method according to claim 2, characterized in that generating the DNA data of the target data based on the first target sub-data of each piece of data in the generator matrix comprises:
deleting the last bit of the first target sub-data to obtain second target sub-data of the first target sub-data;
generating the DNA data of the target data based on the second target sub-data of each target sub-data in the target data.
4. The method according to claim 2, characterized in that calculating the product of the generator matrix and the first target data to obtain the first target sub-data of each piece of data in the generator matrix comprises:
superimposing the first target data and a non-zero matrix to obtain second target data corresponding to the first target data, wherein the number of rows of the non-zero matrix equals the first preset number, and the number of columns of the non-zero matrix is the sum of the preset data length and 1;
calculating the product of the generator matrix and the second target data to obtain the first target sub-data of each piece of data in the generator matrix.
5. The method according to any one of claims 1 to 4, characterized in that generating the DNA data of the target data based on the first target sub-data of each piece of data in the generator matrix comprises:
generating identification information at the tail of the first target sub-data of each piece of data in the generator matrix, wherein the identification information indicates the target data, the number of times the data in the generator matrix was updated when the first target sub-data was generated, and the address, within the generator matrix, of the data in the generator matrix used to generate the first target sub-data;
generating the DNA data of the target data based on the first target sub-data, carrying the identification information, of each piece of data in the generator matrix.
6. A data storage device, characterized by comprising:
a target data determination unit, configured to determine one group of target data among the at least one group of data into which a file to be stored is divided, wherein the data includes a first preset number of pieces of sub-data, and the data length of each piece of sub-data is a preset data length;
a computing unit, configured to calculate the product of a generator matrix and the target data to obtain the first target sub-data of each piece of data in the generator matrix, wherein the generator matrix includes a second preset number of pieces of data, and the data length of each piece of data in the generator matrix equals the first preset number;
an updating unit, configured to, if the first target sub-data of a piece of data in the generator matrix does not satisfy a gene coding constraint, update the data in the generator matrix until the generated first target sub-data satisfies the gene coding constraint;
a generation unit, configured to generate the DNA data of the target data based on the first target sub-data of each piece of data in the generator matrix, wherein the DNA data is used for data storage with DNA as the data storage medium, and the DNA data of the file to be stored consists of the DNA data of each group of data into which the file to be stored is divided.
7. A data restoration method, characterized by comprising:
determining DNA data to be restored, wherein the DNA data consists of at least one first target sub-data;
classifying, among the at least one first target sub-data, the first target sub-data that indicate the same target data into one group, to obtain multiple groups of first target sub-data;
for each group of first target sub-data, selecting from the group a first preset number of first target sub-data whose data length equals the preset data length, to form third target data corresponding to the group;
for each group of first target sub-data, constructing the inverse matrix of the group according to the address and the update count indicated by the identification information of each first target sub-data in the group;
for each group of first target sub-data, calculating the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain fourth target data corresponding to the group, wherein the fourth target data is used to restore the file.
8. The method according to claim 7, characterized by further comprising:
for each group of first target sub-data, generating a parity bit at the tail of each first target sub-data in the third target data corresponding to the group, to obtain fifth target data corresponding to the group;
wherein, for each group of first target sub-data, calculating the product of the inverse matrix of the group and the third target data corresponding to the group to obtain the fourth target data corresponding to the group comprises:
for each group of first target sub-data, calculating the product of the inverse matrix of the group and the fifth target data corresponding to the group, to obtain the fourth target data corresponding to the group.
9. The method according to claim 8, characterized by further comprising:
deleting the data at the tail of each piece of data in the fourth target data to obtain the final fourth target data.
10. A data restoration device, characterized by comprising:
a DNA data determination unit, configured to determine DNA data to be restored, wherein the DNA data consists of at least one first target sub-data;
a grouping unit, configured to classify, among the at least one first target sub-data, the first target sub-data that indicate the same target data into one group, to obtain multiple groups of first target sub-data;
a selection unit, configured to, for each group of first target sub-data, select from the group a first preset number of first target sub-data whose data length equals the preset data length, to form third target data corresponding to the group;
an inverse matrix construction unit, configured to, for each group of first target sub-data, construct the inverse matrix of the group according to the address and the update count indicated by the identification information of each first target sub-data in the group;
a restoration unit, configured to, for each group of first target sub-data, calculate the product of the inverse matrix of the group and the third target data corresponding to the group, to obtain fourth target data corresponding to the group, wherein the fourth target data is used to restore the file.
CN201910132713.6A 2019-02-22 2019-02-22 Data storage and restoration method and device Active CN109887549B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910132713.6A CN109887549B (en) 2019-02-22 2019-02-22 Data storage and restoration method and device

Publications (2)

Publication Number Publication Date
CN109887549A true CN109887549A (en) 2019-06-14
CN109887549B CN109887549B (en) 2023-01-20

Family

ID=66928942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910132713.6A Active CN109887549B (en) 2019-02-22 2019-02-22 Data storage and restoration method and device

Country Status (1)

Country Link
CN (1) CN109887549B (en)


Also Published As

Publication number Publication date
CN109887549B (en) 2023-01-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant