CN113268461B - Method and device for gene sequencing data recombination packaging - Google Patents
- Publication number
- CN113268461B (application CN202110810347.2A)
- Authority
- CN
- China
- Prior art keywords
- gene
- data
- sequence
- nucleotides
- gene sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/50—Compression of genetic data
Abstract
The invention discloses a method for gene sequencing data recombination packaging, comprising the following steps. Step 1: constructing a reference genome database and a gene dictionary. Step 2: obtaining a second gene sequence of a chromosome in the sample. Step 3: comparing the second gene sequence from Step 2 with a plurality of first gene sequences. Step 4: comparing the second gene sequence with the standard gene. Step 5: grouping the nucleotides in each gene fragment sequentially, N nucleotides per group. Step 6: representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data. Step 7: counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data. Step 8: restoring the second gene sequence of the sample. Because short runs of nucleotides are encoded with a dictionary, the data can be compressed effectively. The invention also provides a device based on the method.
Description
Technical Field
The invention relates to the field of electric digital data processing of a new generation of information technology, in particular to a method and a device for gene sequencing data recombination and encapsulation.
Background
CN202010457824.7 discloses a lossless compression method for deep-sequencing gene sequence data files. Its technical solution compares the data against a built-in standard reference genome and a built-in dictionary file, neither of which needs to be transmitted. Consequently, if the converted or compressed gene sequence data is lost during transmission or storage, the sequence cannot be restored by anyone who lacks the built-in standard genome and the built-in dictionary file, which greatly improves security. For data that does not match, a temporary dictionary is built from the variations and is compressed and transmitted together with the file. Once a special variation that fails to match for the first time has been written into the dictionary, the hundreds or even tens of thousands of further occurrences of that variation in the sequencing data need not be stored separately, which saves a great deal of space.
That method uses a dictionary file to reduce the nucleotide sequence data so that it can be compressed and transmitted, but it does not investigate or explain whether there is an effective way to reduce the amount of transmitted data further. Finding such a way is an urgent need in the field.
Disclosure of Invention
The invention aims to provide a method for gene sequencing data recombination packaging that applies dictionary coding to short runs of nucleotides and can thereby compress the data effectively.
The invention also provides a device based on the method.
To achieve this purpose, the invention provides the following technical solution: a method for gene sequencing data recombination packaging, comprising the following steps:
Step 1: constructing a reference genome database and a gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
Step 2: obtaining a second gene sequence of a chromosome in the sample;
Step 3: comparing the second gene sequence from Step 2 with the plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
Step 4: comparing the second gene sequence with the standard gene to separate out each gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides before the gene fragment are defined as the front section, and the N nucleotides after the gene fragment are defined as the rear section;
Step 5: grouping the nucleotides in the gene fragment sequentially, N nucleotides per group;
Step 6: representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data;
Step 7: counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data, and sending the genome data and the number of the first gene sequence corresponding to the standard gene to a data receiving end;
Step 8: when the data receiving end receives the genome data and the number of the first gene sequence, decompressing the genome data, extracting the nucleotide data of each chromosome by reference to the gene dictionary, determining the position of the gene fragment on the standard gene from the nucleotide sequences of the front section and the rear section and the number of nucleotides between them, and restoring the second gene sequence of the sample.
In the above method for gene sequencing data recombination packaging, N is 3, 4, 5 or 6.
In the above method for gene sequencing data recombination packaging, the gene fragment is longer than N nucleotides.
In the above method for gene sequencing data recombination packaging, the first gene sequences in the reference genome database comprise first gene sequences of autosomes and first gene sequences of sex chromosomes.
The invention also discloses a device for gene sequencing data recombination packaging, comprising the following modules:
a storage module: for storing the constructed reference genome database and gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
a standard genome selection module: for comparing the second gene sequence of each chromosome of the sample with the plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
a comparison module: for comparing the second gene sequence with the standard gene to separate out each gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides before the gene fragment are defined as the front section, and the N nucleotides after the gene fragment are defined as the rear section;
a dictionary module: for grouping the nucleotides in the gene fragment sequentially, N nucleotides per group; for representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data; and for counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data, and sending the genome data and the code number of the reference gene corresponding to the standard gene to a data receiving end.
In the above device for gene sequencing data recombination packaging, N is 3, 4, 5 or 6.
In the above device for gene sequencing data recombination packaging, the gene fragment is longer than N nucleotides.
In the above device for gene sequencing data recombination packaging, the first gene sequences in the reference genome database comprise first gene sequences of autosomes and first gene sequences of sex chromosomes.
Compared with the prior art, the invention has the following beneficial effects:
The gene dictionary restores the front section, the rear section and the gene fragment from the data; the exact position in the first gene sequence is determined from the fragment length, the sequences of the front and rear sections and the number of the first gene sequence; and the corresponding position in the first gene sequence is replaced to obtain the second gene sequence.
The compressed data volume is small and the calculation is fast.
Drawings
FIG. 1 is a flow chart of Example 1 of the present invention;
FIG. 2 is a topology diagram of Example 2 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to FIG. 1, a method for gene sequencing data recombination packaging comprises the following steps:
Step 1: constructing a reference genome database and a gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N; each first gene sequence is numbered;
In practice, when N is 3, three nucleotides can form 64 combinations; adding the 4 single-nucleotide cases and the 16 two-nucleotide combinations gives 84 combinations in total.
When N is 4, four nucleotides can form 256 combinations; adding the 4 single-nucleotide cases, the 16 two-nucleotide combinations and the 64 three-nucleotide combinations gives 340 combinations in total.
Taking N = 4 as an example, each of the 340 combinations is represented by a symbol in the gene dictionary.
The reference genome database does not contain only a single set of 23 chromosome pairs for men and women; it contains the first gene sequences of a plurality of chromosome sets, each set consisting of 23 chromosome pairs.
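The combination counts above can be checked with a minimal Python sketch. The function name `build_gene_dictionary`, the alphabet order `ACGT` and the use of consecutive integers as codes are illustrative assumptions; the text only says that each combination is represented by a symbol.

```python
from itertools import product


def build_gene_dictionary(n: int) -> dict[str, int]:
    """Map every nucleotide string of length 1..n to an integer code.

    Assumption: codes are consecutive integers assigned in order of
    length and then alphabetically; the patent does not fix a scheme.
    """
    alphabet = "ACGT"
    dictionary: dict[str, int] = {}
    for length in range(1, n + 1):
        for combo in product(alphabet, repeat=length):
            dictionary["".join(combo)] = len(dictionary)
    return dictionary


# 4 + 16 + 64 = 84 entries for N = 3, and 84 + 256 = 340 for N = 4.
print(len(build_gene_dictionary(3)), len(build_gene_dictionary(4)))
```

Entries shorter than N are needed because a gene fragment whose length is not a multiple of N leaves a shorter final group.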
Step 2: obtaining a second gene sequence of a chromosome in the sample;
Step 3: comparing the second gene sequence from Step 2 with the plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
Each person has 23 second gene sequences; these are compared one by one with the first gene sequences in the reference genome database, yielding a plurality of first gene sequences that serve as standard genes.
As a further optimization, the positions at which human genes are known to differ can be marked in the first gene sequences of the reference genome database, so that each first gene sequence carries a plurality of marker points. When the first gene sequences are compared with a second gene sequence, only the sequence at the corresponding sites of the second gene sequence is compared with the sequence at the marker points, and the first gene sequence with the fewest differences is taken as the standard gene. This markedly shortens the time needed to determine the standard gene and speeds up Step 3.
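The selection in Step 3, with and without marker points, might look like the following sketch. It assumes the sequences are already aligned and of equal length and measures similarity as the number of mismatching positions; the names `count_mismatches`, `select_standard_gene` and `marker_sites` are hypothetical, and the patent does not prescribe a similarity measure.

```python
from typing import Optional, Sequence


def count_mismatches(sample: str, candidate: str,
                     marker_sites: Optional[Sequence[int]] = None) -> int:
    """Count differing positions, optionally only at the marked sites."""
    sites = range(min(len(sample), len(candidate))) if marker_sites is None else marker_sites
    return sum(1 for i in sites if sample[i] != candidate[i])


def select_standard_gene(sample: str, references: dict[int, str],
                         marker_sites: Optional[Sequence[int]] = None) -> int:
    """Return the number of the first gene sequence with the fewest differences."""
    return min(references,
               key=lambda ref_id: count_mismatches(sample, references[ref_id], marker_sites))


# Toy data: numbered first gene sequences and one second gene sequence.
references = {1: "ACGTACGTAC", 2: "ACGTTCGTAC", 3: "TCGTACGTAA"}
sample = "ACGTTCGAAC"
print(select_standard_gene(sample, references))                       # full comparison
print(select_standard_gene(sample, references, marker_sites=[4, 7]))  # marker points only
```

Comparing only at marker points reduces the work per candidate from the sequence length to the number of markers, which is the speed-up the paragraph above describes.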
Step 4: comparing the second gene sequence with the standard gene to separate out each gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides before the gene fragment are defined as the front section, and the N nucleotides after the gene fragment are defined as the rear section;
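A minimal sketch of Step 4 follows, under the simplifying assumption that the second gene sequence and the standard gene are aligned, equal in length and differ only by substitutions. The `Variant` type and `extract_variants` function are hypothetical names; differing runs separated by fewer than N matching nucleotides are merged here so that each front and rear section matches the standard gene, which is a design choice of this sketch rather than something the text specifies.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    front: str     # N nucleotides before the differing fragment
    fragment: str  # nucleotides of the sample that differ from the standard gene
    rear: str      # N nucleotides after the differing fragment


def extract_variants(sample: str, standard: str, n: int) -> list[Variant]:
    """Separate differing fragments and their N-nucleotide flanks."""
    if len(sample) != len(standard):
        raise ValueError("this sketch assumes equal-length, pre-aligned sequences")
    mismatches = [i for i, (a, b) in enumerate(zip(sample, standard)) if a != b]
    runs: list[list[int]] = []  # inclusive [start, end] of each differing run
    for pos in mismatches:
        # Merge with the previous run when fewer than n matching bases separate them.
        if runs and pos - runs[-1][1] - 1 < n:
            runs[-1][1] = pos
        else:
            runs.append([pos, pos])
    return [Variant(sample[max(0, s - n):s], sample[s:e + 1], sample[e + 1:e + 1 + n])
            for s, e in runs]


standard = "AAAACCCCGGGGTTTTAAAACCCC"
sample = "AAAACCCCCACACTTTAAAACCCC"  # positions 8..12 differ from the standard gene
print(extract_variants(sample, standard, n=4))
# [Variant(front='CCCC', fragment='CACAC', rear='TTTA')]
```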
Step 5: grouping the nucleotides in the gene fragment sequentially, N nucleotides per group;
For example, if the gene fragment is 101 nucleotides long and N is 4, it is divided into 26 groups: 25 groups of 4 nucleotides and a final group of 1.
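The grouping in Step 5 can be sketched in a few lines. The function name `group_nucleotides` and the repeated `ACGT` test fragment are illustrative only.

```python
def group_nucleotides(fragment: str, n: int) -> list[str]:
    """Split a fragment into consecutive groups of n nucleotides.

    The last group is shorter when the length is not a multiple of n.
    """
    return [fragment[i:i + n] for i in range(0, len(fragment), n)]


fragment = "ACGT" * 25 + "A"            # a 101-nucleotide fragment
groups = group_nucleotides(fragment, 4)
print(len(groups), repr(groups[-1]))    # 26 'A'
```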
Step 6: representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data;
The nucleotide data consists of several codes in sequence.
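As an illustration of Step 6, the sketch below turns one front section, gene fragment and rear section into a sequence of codes. The integer code assignment and the function names are assumptions made for the example; the patent only requires that each combination of at most N nucleotides has a code.

```python
from itertools import product


def build_gene_dictionary(n: int) -> dict[str, int]:
    """Assumed code scheme: consecutive integers by length, then alphabetically."""
    dictionary: dict[str, int] = {}
    for length in range(1, n + 1):
        for combo in product("ACGT", repeat=length):
            dictionary["".join(combo)] = len(dictionary)
    return dictionary


def encode_variant(front: str, fragment: str, rear: str,
                   n: int, dictionary: dict[str, int]) -> list[int]:
    """Encode front section, grouped fragment and rear section as codes in order."""
    groups = [fragment[i:i + n] for i in range(0, len(fragment), n)]
    return [dictionary[piece] for piece in [front, *groups, rear]]


dictionary = build_gene_dictionary(4)
codes = encode_variant("CCCC", "CACAC", "TTTA", 4, dictionary)
print(codes)  # [169, 152, 1, 336]
```

The first and last codes of each record are the front and rear sections, so a decoder that knows the record boundaries can recover all three parts from the code sequence alone.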
Step 7: counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data, and sending the genome data and the number of the first gene sequence corresponding to the standard gene to a data receiving end;
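The text does not name a counting or compression scheme, so the following is only a sketch under stated assumptions: each record of codes is serialized as 16-bit integers with a length prefix, "counting" is read as collecting code frequencies, and `zlib` stands in for whatever compressor is actually used. All function names and the packet layout are hypothetical.

```python
import struct
import zlib
from collections import Counter


def code_frequencies(records: list[list[int]]) -> Counter:
    """Count how often each dictionary code occurs (one reading of 'counting')."""
    return Counter(code for record in records for code in record)


def pack_records(records: list[list[int]]) -> bytes:
    """Serialize records as: 2-byte code count, then that many 2-byte codes."""
    out = bytearray()
    for codes in records:
        out += struct.pack(">H", len(codes))
        out += struct.pack(f">{len(codes)}H", *codes)
    return bytes(out)


def unpack_records(payload: bytes) -> list[list[int]]:
    """Inverse of pack_records."""
    records, offset = [], 0
    while offset < len(payload):
        (count,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        records.append(list(struct.unpack_from(f">{count}H", payload, offset)))
        offset += 2 * count
    return records


def compress_chromosome(records: list[list[int]], reference_id: int) -> dict:
    """Bundle the compressed codes with the number of the first gene sequence."""
    return {"reference_id": reference_id,
            "payload": zlib.compress(pack_records(records), 9)}


def decompress_chromosome(packet: dict) -> tuple[int, list[list[int]]]:
    """Recover the reference number and the code records at the receiving end."""
    return packet["reference_id"], unpack_records(zlib.decompress(packet["payload"]))


packet = compress_chromosome([[169, 152, 1, 336], [85, 3, 90]], reference_id=7)
print(decompress_chromosome(packet))
```

Sixteen bits per code is enough for N up to 6, where the dictionary holds 5460 combinations.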
Step 8: when the data receiving end receives the genome data and the number of the first gene sequence, decompressing the genome data, extracting the nucleotide data of each chromosome by reference to the gene dictionary, determining the position of the gene fragment on the standard gene from the nucleotide sequences of the front section and the rear section and the number of nucleotides between them, and restoring the second gene sequence of the sample.
The data receiving end receives 23 groups of data, each comprising genome data and the code of the reference gene.
When restoring the genes of a human chromosome, what matters is the sequences of the front and rear sections and the length between them, which can be calculated from the codes above.
In general, whether N = 3 or N = 4, the same front section and rear section rarely occur at the same distance from each other, so this localization method is regarded as unique, and the data set need not carry position data for the differing genes.
This effectively reduces the data volume.
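The localization and restoration in Step 8 can be sketched as below, assuming substitution-only differences so that the gap between the front and rear sections on the standard gene equals the fragment length. The function names are hypothetical. Because the text asserts uniqueness rather than guaranteeing it, the sketch raises an error when the flank pattern is not found exactly once.

```python
def locate_fragment(standard: str, front: str, fragment_len: int, rear: str) -> int:
    """Return the start index of the fragment on the standard gene.

    A position qualifies when the front section ends there and the rear
    section begins fragment_len nucleotides later.
    """
    hits = []
    start = standard.find(front)
    while start != -1:
        frag_start = start + len(front)
        if standard.startswith(rear, frag_start + fragment_len):
            hits.append(frag_start)
        start = standard.find(front, start + 1)
    if len(hits) != 1:
        raise ValueError(f"expected exactly one location, found {len(hits)}")
    return hits[0]


def restore_sequence(standard: str, variants: list[tuple[str, str, str]]) -> str:
    """Replace each located region of the standard gene with the sample fragment."""
    sequence = list(standard)
    for front, fragment, rear in variants:
        pos = locate_fragment(standard, front, len(fragment), rear)
        sequence[pos:pos + len(fragment)] = fragment
    return "".join(sequence)


standard = "AAAACCCCGGGGTTTTAAAACCCC"
restored = restore_sequence(standard, [("CCCC", "CACAC", "TTTA")])
print(restored)  # AAAACCCCCACACTTTAAAACCCC
```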
Example 2
Referring to FIG. 2, a device for gene sequencing data recombination packaging, used to implement the method of Example 1, comprises the following modules:
the storage module 1: for storing the constructed reference genome database and gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
the standard genome selection module 2: for comparing the second gene sequence of each chromosome of the sample with the plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
the comparison module 3: for comparing the second gene sequence with the standard gene to separate out each gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides before the gene fragment are defined as the front section, and the N nucleotides after the gene fragment are defined as the rear section;
the dictionary module 4: for grouping the nucleotides in the gene fragment sequentially, N nucleotides per group; for representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data; and for counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data, and sending the genome data and the code number of the reference gene corresponding to the standard gene to a data receiving end.
The working process is as follows:
the whole genome sequence of the tested person is obtained by manual sequencing and consists of 23 second gene sequences;
the standard genome selection module finds the closest first gene sequence for each second gene sequence in turn to serve as its standard gene, so there are a plurality of standard genes;
the dictionary module encodes the positions at which the first gene sequence and the second gene sequence differ, so that the front section, the gene fragment and the rear section at each differing position form a continuous run of codes; the 23 second gene sequences are encoded one by one in this way and then compressed to obtain the compressed genome data.
An identical storage module is provided in the server at the remote data receiving end. The gene dictionary in that storage module restores the front section, the rear section and the gene fragment from the data; the exact position in the first gene sequence is determined from the fragment length, the sequences of the front and rear sections and the number of the first gene sequence; and the corresponding position in the first gene sequence is replaced to obtain the second gene sequence.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (8)
1. A method for gene sequencing data recombination packaging, characterized by comprising the following steps:
Step 1: constructing a reference genome database and a gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
Step 2: obtaining a second gene sequence of a chromosome in the sample;
Step 3: comparing the second gene sequence from Step 2 with the plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
Step 4: comparing the second gene sequence with the standard gene to separate out each gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides before the gene fragment are defined as the front section, and the N nucleotides after the gene fragment are defined as the rear section;
Step 5: grouping the nucleotides in the gene fragment sequentially, N nucleotides per group;
Step 6: representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data;
Step 7: counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data, and sending the genome data and the number of the first gene sequence corresponding to the standard gene to a data receiving end;
Step 8: when the data receiving end receives the genome data and the number of the first gene sequence, decompressing the genome data, extracting the nucleotide data of each chromosome by reference to the gene dictionary, determining the position of the gene fragment on the standard gene from the nucleotide sequences of the front section and the rear section and the number of nucleotides between them, and restoring the second gene sequence of the sample.
2. The method for gene sequencing data recombination packaging according to claim 1, wherein N is 3, 4, 5 or 6.
3. The method for gene sequencing data recombination packaging according to claim 1, wherein the gene fragment is longer than N nucleotides.
4. The method for gene sequencing data recombination packaging according to claim 1, wherein the first gene sequences in the reference genome database comprise first gene sequences of autosomes and first gene sequences of sex chromosomes.
5. A device for gene sequencing data recombination packaging, characterized by comprising the following modules:
a storage module: for storing the constructed reference genome database and gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
a standard genome selection module: for comparing the second gene sequence of each chromosome of the sample with the plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
a comparison module: for comparing the second gene sequence with the standard gene to separate out each gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides before the gene fragment are defined as the front section, and the N nucleotides after the gene fragment are defined as the rear section;
a dictionary module: for grouping the nucleotides in the gene fragment sequentially, N nucleotides per group; for representing the front section, the gene fragment and the rear section by codes from the gene dictionary to form a group of nucleotide data; and for counting and compressing the nucleotide data of the different chromosomes to obtain compressed genome data, and sending the genome data and the code number of the reference gene corresponding to the standard gene to a data receiving end.
6. The device for gene sequencing data recombination packaging according to claim 5, wherein N is 3, 4, 5 or 6.
7. The device for gene sequencing data recombination packaging according to claim 5, wherein the gene fragment is longer than N nucleotides.
8. The device for gene sequencing data recombination packaging according to claim 5, wherein the first gene sequences in the reference genome database comprise first gene sequences of autosomes and first gene sequences of sex chromosomes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110810347.2A CN113268461B (en) | 2021-07-19 | 2021-07-19 | Method and device for gene sequencing data recombination packaging |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110810347.2A CN113268461B (en) | 2021-07-19 | 2021-07-19 | Method and device for gene sequencing data recombination packaging |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113268461A CN113268461A (en) | 2021-08-17 |
CN113268461B true CN113268461B (en) | 2021-09-17 |
Family
ID=77236633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110810347.2A Active CN113268461B (en) | 2021-07-19 | 2021-07-19 | Method and device for gene sequencing data recombination packaging |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113268461B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104335213A (en) * | 2012-05-18 | 2015-02-04 | 国际商业机器公司 | Minimization of surprisal data through application of hierarchy of reference genomes |
CN104699998A (en) * | 2013-12-06 | 2015-06-10 | 国际商业机器公司 | Method and device for compressing and decompressing genome |
CN106971090A (en) * | 2017-03-10 | 2017-07-21 | 首度生物科技(苏州)有限公司 | A kind of gene sequencing data compression and transmission method |
CN108197434A (en) * | 2018-01-16 | 2018-06-22 | 深圳市泰康吉音生物科技研发服务有限公司 | The method for removing human source gene sequence in macro gene order-checking data |
CN109256178A (en) * | 2018-07-26 | 2019-01-22 | 中山大学 | The Leon-RC compression method of gene order-checking data |
CN109450452A (en) * | 2018-11-27 | 2019-03-08 | 中国科学院计算技术研究所 | A kind of compression method and system of the sampling dictionary tree index for gene data |
CN110491441A (en) * | 2019-05-06 | 2019-11-22 | 西安交通大学 | A kind of gene sequencing data simulation system and method for simulation crowd background information |
CN111625509A (en) * | 2020-05-26 | 2020-09-04 | 福州数据技术研究院有限公司 | Lossless compression method for deep sequencing gene sequence data file |
CN112309501A (en) * | 2019-08-02 | 2021-02-02 | 华为技术有限公司 | Gene comparison technology |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9483610B2 (en) * | 2013-01-17 | 2016-11-01 | Edico Genome, Corp. | Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform |
US11393559B2 (en) * | 2016-03-09 | 2022-07-19 | Sophia Genetics S.A. | Methods to compress, encrypt and retrieve genomic alignment data |
-
2021
- 2021-07-19 CN CN202110810347.2A patent/CN113268461B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN113268461A (en) | 2021-08-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||