CN113268461B - Method and device for gene sequencing data recombination packaging - Google Patents

Method and device for gene sequencing data recombination packaging

Info

Publication number
CN113268461B
Authority
CN
China
Prior art keywords
gene
data
sequence
nucleotides
gene sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110810347.2A
Other languages
Chinese (zh)
Other versions
CN113268461A (en)
Inventor
郭祥学
张巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jiajian Medical Testing Co ltd
Original Assignee
Guangzhou Jiajian Medical Testing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jiajian Medical Testing Co ltd
Priority to CN202110810347.2A
Publication of CN113268461A
Application granted
Publication of CN113268461B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/50 Compression of genetic data

Abstract

The invention discloses a method for gene sequencing data recombination packaging, comprising the following steps. Step 1: constructing a reference genome database and a gene dictionary; Step 2: obtaining a second gene sequence of a chromosome in the sample; Step 3: comparing the second gene sequence of step 2 with a plurality of first gene sequences; Step 4: comparing the second gene sequence with the standard gene; Step 5: sequentially grouping the nucleotides in the gene fragment in groups of N nucleotides; Step 6: representing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data; Step 7: counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data; Step 8: restoring the second gene sequence of the sample. Because short runs of nucleotides are encoded with a dictionary, the data can be compressed effectively. The invention also provides a device based on the method.

Description

Method and device for gene sequencing data recombination packaging
Technical Field
The invention relates to the field of electric digital data processing of a new generation of information technology, in particular to a method and a device for gene sequencing data recombination and encapsulation.
Background
CN202010457824.7 discloses a lossless compression method for deep sequencing gene sequence data files. The technical solution of that application uses, for comparison, a built-in standard reference genome and a built-in dictionary file, neither of which needs to be transmitted. Therefore, if the converted or compressed gene sequence data is lost during transmission or storage, the sequence cannot be restored by anyone who cannot obtain the built-in standard gene and the built-in dictionary file, which greatly enhances security. For unmatched files, a temporary dictionary is added according to the variations and is compressed and transmitted along with the file. Once a special variation that is unmatched on its first occurrence has been written into the dictionary, it need not be stored again even if it appears hundreds or tens of thousands of times in the sequencing data, which greatly saves space.
That method uses a dictionary file to reduce the nucleotide sequence data so that it can be compressed and transmitted, but it does not investigate or explain whether there is an effective way to reduce the amount of transmitted data further. Finding such a way is an urgent need in the field.
Disclosure of Invention
The invention aims to provide a method for gene sequencing data recombination packaging which applies dictionary coding to short runs of nucleotides and can thereby compress the data effectively.
The invention also provides a device based on the method.
To achieve this purpose, the invention provides the following technical solution: a method for gene sequencing data recombination packaging, comprising the following steps:
Step 1: constructing a reference genome database and a gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
Step 2: obtaining a second gene sequence of a chromosome in the sample;
Step 3: comparing the second gene sequence obtained in step 2 with a plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
Step 4: comparing the second gene sequence with the standard gene to separate out the gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides at the front end of the gene fragment are defined as the front section, and the N nucleotides at the rear end of the gene fragment are defined as the rear section;
Step 5: sequentially grouping the nucleotides in the gene fragment in groups of N nucleotides;
Step 6: representing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data;
Step 7: counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data, and sending the genome data and the serial number of the first gene sequence corresponding to the standard gene to a data receiving end;
Step 8: when the data receiving end receives the genome data and the serial number of the first gene sequence, decompressing the genome data, extracting the nucleotide data on each chromosome by reference to the gene dictionary, determining the position of the gene fragment on the standard gene from the nucleotide sequences of the front section and the rear section and the number of nucleotides between them, and restoring the second gene sequence of the sample.
In the above method for gene sequencing data recombination packaging, N is 3, 4, 5 or 6.
In the above method for gene sequencing data recombination packaging, the length of the gene fragment is greater than N nucleotides.
In the above method for gene sequencing data recombination packaging, the first gene sequences in the reference genome database comprise first gene sequences of autosomes and first gene sequences of sex chromosomes.
The invention also discloses a device for gene sequencing data recombination packaging, comprising the following modules:
a storage module: used for storing the constructed reference genome database and gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
a standard genome selection module: used for comparing the second gene sequence of each chromosome of the sample with a plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
a comparison module: used for comparing the second gene sequence with the standard gene and separating out the gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides at the front end of the gene fragment are defined as the front section, and the N nucleotides at the rear end of the gene fragment are defined as the rear section;
a dictionary module: used for sequentially grouping the nucleotides in the gene fragment in groups of N; for representing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data; and for counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data, and sending the genome data and the number of the first gene sequence corresponding to the standard gene to a data receiving end.
In the above device for gene sequencing data recombination packaging, N is 3, 4, 5 or 6.
In the above device for gene sequencing data recombination packaging, the length of the gene fragment is greater than N nucleotides.
In the above device for gene sequencing data recombination packaging, the first gene sequences in the reference genome database comprise first gene sequences of autosomes and first gene sequences of sex chromosomes.
Compared with the prior art, the invention has the following beneficial effects:
The gene dictionary restores the front section, the rear section and the gene fragment from the data; the exact position on the first gene sequence is determined from the length, the sequences of the front section and the rear section, and the number of the first gene sequence; and the corresponding position in the first gene sequence is replaced to obtain the second gene sequence.
The compressed data volume is small, and the calculation is fast.
Drawings
FIG. 1 is a flow chart of Example 1 of the present invention;
FIG. 2 is a topology diagram of Example 2 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to FIG. 1, a method for gene sequencing data recombination packaging comprises the following steps:
Step 1: constructing a reference genome database and a gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N; each first gene sequence is numbered;
In practice, when N is 3, three nucleotides can form 64 combinations; together with the 4 single nucleotides and the 16 combinations of 2 nucleotides, there are 84 combinations in total.
When N is 4, four nucleotides can form 256 combinations; together with the 4 single nucleotides, the 16 combinations of 2 nucleotides and the 64 combinations of 3 nucleotides, there are 340 combinations in total.
Taking N = 4 as an example, each of the 340 combinations is represented by a symbol in the gene dictionary.
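The totals above are the sum 4 + 4^2 + ... + 4^N. A minimal Python sketch of such a dictionary follows, assuming the four-letter alphabet A, C, G, T and plain integer codes; the patent does not specify the code symbols, and the name `build_gene_dictionary` is hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed alphabet; the patent does not list the symbols


def build_gene_dictionary(n: int) -> dict[str, int]:
    """Map every nucleotide string of length 1..n to an integer code."""
    codes: dict[str, int] = {}
    for length in range(1, n + 1):
        for combo in product(NUCLEOTIDES, repeat=length):
            codes["".join(combo)] = len(codes)  # next unused integer
    return codes


dictionary_n3 = build_gene_dictionary(3)
dictionary_n4 = build_gene_dictionary(4)
print(len(dictionary_n3), len(dictionary_n4))  # 84 340
```

Entries shorter than N are needed because the last group of a fragment may contain fewer than N nucleotides.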
The reference genome database contains not just a single set of the 23 chromosome pairs for men and women, but first gene sequences for a plurality of chromosomes, organized in groups of 23 chromosome pairs.
Step 2: obtaining a second gene sequence of a chromosome in the sample;
Step 3: comparing the second gene sequence obtained in step 2 with a plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
Each person has 23 second gene sequences. These are compared one by one with the first gene sequences in the reference genome database to obtain a plurality of first gene sequences as standard genes, one for each second gene sequence.
As a further optimization, the positions at which human genes are known to differ can be marked in the first gene sequences of the reference genome database, producing a plurality of marker points in each first gene sequence. When a first gene sequence is aligned with the second gene sequence, only the second gene sequence at those same sites is compared with the first gene sequence at the marker points, and the first gene sequence with the fewest differences is taken as the standard gene. This markedly shortens the time needed to determine the standard gene and speeds up step 3.
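The marker-point optimization can be sketched as follows. This is an illustration only: it assumes all sequences have equal length, that marker points are 0-based indices shared by every first gene sequence, and that similarity is simply the mismatch count at those indices. The function name and the toy sequences are hypothetical.

```python
def select_standard_gene(sample: str, references: dict[int, str],
                         markers: list[int]) -> int:
    """Return the number of the reference with the fewest marker mismatches."""

    def mismatches(reference: str) -> int:
        # Compare only the marked sites instead of the whole sequence.
        return sum(1 for pos in markers if sample[pos] != reference[pos])

    # min() keeps the first reference encountered when counts are tied.
    return min(references, key=lambda number: mismatches(references[number]))


references = {1: "ACGTACGTAC", 2: "ACGTTCGTAA", 3: "TCGTACGTAC"}
sample = "ACGTTCGTAA"
print(select_standard_gene(sample, references, markers=[0, 4, 9]))  # 2
```

The cost per reference is proportional to the number of marker points rather than to the chromosome length, which is where the speed-up described above would come from.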
Step 4: comparing the second gene sequence with the standard gene to separate out the gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides at the front end of the gene fragment are defined as the front section, and the N nucleotides at the rear end of the gene fragment are defined as the rear section;
Step 5: sequentially grouping the nucleotides in the gene fragment in groups of N nucleotides;
For example, if the gene fragment is 101 nucleotides long and N is 4, the fragment is divided into 26 groups: 25 groups of 4 nucleotides and a final group of 1.
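Steps 4 and 5 can be sketched together in Python. The sketch makes simplifying assumptions that the patent does not state: the two sequences have equal length (substitutions only, no insertions or deletions), and differing positions separated by fewer than N matching nucleotides are merged into one fragment so that each flank agrees with the standard gene. Both function names are hypothetical.

```python
def extract_differences(sample: str, standard: str,
                        n: int) -> list[tuple[str, str, str]]:
    """Return (front, fragment, rear) for each differing stretch of `sample`."""
    if len(sample) != len(standard):
        raise ValueError("this sketch only handles equal-length sequences")
    mismatches = [i for i, (a, b) in enumerate(zip(sample, standard)) if a != b]
    runs: list[list[int]] = []  # half-open [start, end) index pairs
    for pos in mismatches:
        # Merge with the previous run when fewer than n matching nucleotides
        # separate them, so the flanks themselves match the standard gene.
        if runs and pos - runs[-1][1] < n:
            runs[-1][1] = pos + 1
        else:
            runs.append([pos, pos + 1])
    return [(sample[max(0, s - n):s], sample[s:e], sample[e:e + n])
            for s, e in runs]


def group_nucleotides(fragment: str, n: int) -> list[str]:
    """Split a fragment into consecutive groups of n; the last may be shorter."""
    return [fragment[i:i + n] for i in range(0, len(fragment), n)]


standard = "AAAACCCCGGGGTTTTAAAA"
sample = "AAAACCTTAAAGTTTTAAAA"
front, fragment, rear = extract_differences(sample, standard, n=3)[0]
print(front, fragment, rear)            # ACC TTAAA GTT
print(group_nucleotides(fragment, 3))   # ['TTA', 'AA']
print(len(group_nucleotides("A" * 101, 4)))  # 26
```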
Step 6: representing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data;
The nucleotide data consists of a sequence of codes.
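A sketch of step 6 under stated assumptions: integer codes, one code for the front section, one code per group of the fragment, and one code for the rear section, in that order. The patent does not describe how records are delimited from one another, so this example handles a single record, and it assumes the fragment is not at the very start or end of the sequence (both flanks are non-empty). All function names are hypothetical.

```python
from itertools import product


def build_gene_dictionary(n: int) -> dict[str, int]:
    """Map every string over A, C, G, T of length 1..n to an integer code."""
    codes: dict[str, int] = {}
    for length in range(1, n + 1):
        for combo in product("ACGT", repeat=length):
            codes["".join(combo)] = len(codes)
    return codes


def encode_record(front: str, fragment: str, rear: str, n: int,
                  dictionary: dict[str, int]) -> list[int]:
    """Encode front section, grouped fragment and rear section as codes."""
    groups = [fragment[i:i + n] for i in range(0, len(fragment), n)]
    return [dictionary[piece] for piece in [front, *groups, rear]]


def decode_record(codes: list[int],
                  dictionary: dict[str, int]) -> tuple[str, str, str]:
    """Invert encode_record: first code is the front, last is the rear."""
    inverse = {code: piece for piece, code in dictionary.items()}
    pieces = [inverse[code] for code in codes]
    return pieces[0], "".join(pieces[1:-1]), pieces[-1]


dictionary = build_gene_dictionary(3)
codes = encode_record("ACC", "TTAAA", "GTT", 3, dictionary)
print(len(codes))                        # 4
print(decode_record(codes, dictionary))  # ('ACC', 'TTAAA', 'GTT')
```

Thirteen nucleotides become four codes here; the saving depends on how compactly each code is stored.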
Step 7: counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data, and sending the genome data and the serial number of the first gene sequence corresponding to the standard gene to a data receiving end;
Step 8: when the data receiving end receives the genome data and the serial number of the first gene sequence, decompressing the genome data, extracting the nucleotide data on each chromosome by reference to the gene dictionary, determining the position of the gene fragment on the standard gene from the nucleotide sequences of the front section and the rear section and the number of nucleotides between them, and restoring the second gene sequence of the sample.
The data receiving end receives 23 groups of data, each comprising genome data and the number of the corresponding first gene sequence;
When restoring the genes of a human chromosome, the main considerations are the sequences of the front section and the rear section and the length between them, which can be calculated from the codes described above.
In general, whether N = 3 or N = 4, an identical front section and rear section separated by the same length will hardly ever occur elsewhere, so this localization method is unique and the data set need not carry position data for the differing genes, which effectively reduces the data volume.
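The position-free localization can be illustrated with a short Python sketch. It assumes substitutions only, so the distance between the two flanks on the standard gene equals the fragment length; the function names are hypothetical. Because the text only says that a repeated match "hardly" occurs, the sketch raises an error when the placement is not unique instead of guessing.

```python
def locate_fragment(standard: str, front: str, fragment: str,
                    rear: str) -> list[int]:
    """Return every index of `standard` where the fragment could start."""
    hits = []
    start = standard.find(front)
    while start != -1:
        fragment_start = start + len(front)
        rear_start = fragment_start + len(fragment)
        # The rear section must follow at exactly the fragment's length.
        if standard[rear_start:rear_start + len(rear)] == rear:
            hits.append(fragment_start)
        start = standard.find(front, start + 1)
    return hits


def restore_sequence(standard: str, front: str, fragment: str,
                     rear: str) -> str:
    """Replace the located stretch of the standard gene with the fragment."""
    hits = locate_fragment(standard, front, fragment, rear)
    if len(hits) != 1:
        raise ValueError(f"expected exactly one placement, found {len(hits)}")
    pos = hits[0]
    return standard[:pos] + fragment + standard[pos + len(fragment):]


standard = "AAAACCCCGGGGTTTTAAAA"
print(restore_sequence(standard, "ACC", "TTAAA", "GTT"))
# AAAACCTTAAAGTTTTAAAA
```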
Example 2
Referring to FIG. 2, a device for gene sequencing data recombination packaging, used to implement the method of Example 1, comprises the following modules:
the storage module 1: used for storing the constructed reference genome database and gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent the different combinations of nucleotide sequences of length less than or equal to N;
the standard genome selection module 2: used for comparing the second gene sequence of each chromosome of the sample with a plurality of first gene sequences, and taking the first gene sequence with the highest similarity to the second gene sequence as the standard gene;
the comparison module 3: used for comparing the second gene sequence with the standard gene and separating out the gene fragment of the second gene sequence that differs from the standard gene, together with the N nucleotides before it and the N nucleotides after it; the N nucleotides at the front end of the gene fragment are defined as the front section, and the N nucleotides at the rear end of the gene fragment are defined as the rear section;
the dictionary module 4: used for sequentially grouping the nucleotides in the gene fragment in groups of N; for representing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data; and for counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data, and sending the genome data and the number of the first gene sequence corresponding to the standard gene to a data receiving end.
The working process is as follows:
a whole genome sequence of the subject, consisting of 23 second gene sequences, is obtained by manual sequencing;
the standard genome selection module finds, for each second gene sequence in turn, the closest first gene sequence to serve as its standard gene, so that there are a plurality of standard genes;
the dictionary module dictionary-encodes the positions at which the first gene sequence and the second gene sequence differ, so that the front section, the gene fragment and the rear section at each differing position form a continuous run of codes; the 23 second gene sequences are dictionary-encoded one by one and then compressed to obtain the compressed genome data.
An identical storage module is provided in the server at the external data receiving end. The gene dictionary in that storage module restores the front section, the rear section and the gene fragment from the data; the exact position on the first gene sequence is determined from the length, the sequences of the front section and the rear section, and the number of the first gene sequence; and the corresponding position in the first gene sequence is replaced to obtain the second gene sequence.
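The compression performed by the dictionary module and its inverse at the receiving end can be sketched as a round trip. The patent does not name a compression algorithm, so `zlib` from the Python standard library is used here purely as a stand-in; packing each code into two bytes is likewise an assumption (it is enough for N up to 6, where the dictionary has 5460 entries). The function names and the sample codes are hypothetical.

```python
import zlib


def pack_and_compress(codes: list[int]) -> bytes:
    """Pack each code into two big-endian bytes, then compress with zlib."""
    raw = b"".join(code.to_bytes(2, "big") for code in codes)
    return zlib.compress(raw, level=9)


def decompress_and_unpack(blob: bytes) -> list[int]:
    """Invert pack_and_compress at the data receiving end."""
    raw = zlib.decompress(blob)
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


codes = [27, 339, 12, 0, 83] * 200   # made-up codes for one chromosome
blob = pack_and_compress(codes)
assert decompress_and_unpack(blob) == codes
print(len(codes) * 2, "bytes packed,", len(blob), "bytes compressed")
```

Only the compressed bytes and the number of the first gene sequence would need to be transmitted; the receiving end relies on its own copy of the reference genome database and gene dictionary.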
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.

Claims (8)

1. A method for gene sequencing data recombination packaging, characterized by comprising the following steps:
Step 1: constructing a reference genome database and a gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary uses codes to represent different combinations of nucleotide sequences which are less than or equal to N;
Step 2: obtaining a second gene sequence of a chromosome in the sample;
Step 3: comparing the second gene sequence obtained in step 2 with a plurality of first gene sequences, and finding out the first gene sequence with the highest similarity to the second gene sequence as a standard gene;
Step 4: comparing the second gene sequence with the standard gene to separate out a gene fragment which is different from the standard gene in the second gene sequence and the N nucleotides in front of and behind the gene fragment; the N nucleotides at the front end of the gene fragment are defined as a front section, and the N nucleotides at the rear end of the gene fragment are defined as a rear section;
Step 5: sequentially grouping the nucleotides in the gene fragment by taking N nucleotides as a group;
Step 6: expressing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data;
Step 7: counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data, and sending the genome data and the serial number of the first gene sequence corresponding to the standard gene to a data receiving end;
Step 8: when the data receiving end receives the genome data and the serial number of the first gene sequence, decompressing the genome data, extracting the nucleotide data on each chromosome by referring to the gene dictionary, determining the position of the gene fragment on the standard gene according to the number of the nucleotide sequences of the front section and the rear section and the number of the nucleotides between the front section and the rear section, and restoring the second gene sequence of the sample.
2. The method for gene sequencing data recombination packaging according to claim 1, wherein N is 3 or 4 or 5 or 6.
3. The method for gene sequencing data recombination packaging according to claim 1, wherein the gene fragment is longer than N nucleotides.
4. The method for gene sequencing data recombination packaging according to claim 1, wherein the first gene sequences in the reference genome database comprise a first gene sequence of an autosome and a first gene sequence of a sex chromosome.
5. A device for gene sequencing data recombination packaging, characterized by comprising the following modules:
a storage module: used for storing the constructed reference genome database and gene dictionary, wherein the reference genome database stores first gene sequences of a plurality of chromosomes, and the gene dictionary represents different combinations of nucleotide sequences which are less than or equal to N by codes;
a standard genome selection module: used for comparing the second gene sequence of each chromosome of the sample with a plurality of first gene sequences, and finding out the first gene sequence with the highest similarity to the second gene sequence as a standard gene;
a comparison module: used for comparing the second gene sequence with the standard gene and separating out a gene fragment which is different from the standard gene in the second gene sequence and the N nucleotides in front of and behind the gene fragment; the N nucleotides at the front end of the gene fragment are defined as a front section, and the N nucleotides at the rear end of the gene fragment are defined as a rear section;
a dictionary module: used for sequentially grouping the nucleotides in the gene fragment by taking N as a group; for representing the front section, the gene fragment and the rear section by codes in the gene dictionary to form a group of nucleotide data; and for counting and compressing the nucleotide data on different chromosomes to obtain compressed genome data, and sending the genome data and the code number of the reference gene corresponding to the standard gene to a data receiving end.
6. The device for gene sequencing data recombination packaging according to claim 5, wherein N is 3 or 4 or 5 or 6.
7. The device for gene sequencing data recombination packaging according to claim 5, wherein the gene fragment is longer than N nucleotides.
8. The device for gene sequencing data recombination packaging according to claim 5, wherein the first gene sequences in the reference genome database comprise a first gene sequence of an autosome and a first gene sequence of a sex chromosome.
CN202110810347.2A 2021-07-19 2021-07-19 Method and device for gene sequencing data recombination packaging Active CN113268461B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110810347.2A CN113268461B (en) 2021-07-19 2021-07-19 Method and device for gene sequencing data recombination packaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110810347.2A CN113268461B (en) 2021-07-19 2021-07-19 Method and device for gene sequencing data recombination packaging

Publications (2)

Publication Number Publication Date
CN113268461A CN113268461A (en) 2021-08-17
CN113268461B CN113268461B (en) 2021-09-17

Family

ID=77236633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110810347.2A Active CN113268461B (en) 2021-07-19 2021-07-19 Method and device for gene sequencing data recombination packaging

Country Status (1)

Country Link
CN (1) CN113268461B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104335213A (en) * 2012-05-18 2015-02-04 国际商业机器公司 Minimization of surprisal data through application of hierarchy of reference genomes
CN104699998A (en) * 2013-12-06 2015-06-10 国际商业机器公司 Method and device for compressing and decompressing genome
CN106971090A (en) * 2017-03-10 2017-07-21 首度生物科技(苏州)有限公司 A kind of gene sequencing data compression and transmission method
CN108197434A (en) * 2018-01-16 2018-06-22 深圳市泰康吉音生物科技研发服务有限公司 The method for removing human source gene sequence in macro gene order-checking data
CN109256178A (en) * 2018-07-26 2019-01-22 中山大学 The Leon-RC compression method of gene order-checking data
CN109450452A (en) * 2018-11-27 2019-03-08 中国科学院计算技术研究所 A kind of compression method and system of the sampling dictionary tree index for gene data
CN110491441A (en) * 2019-05-06 2019-11-22 西安交通大学 A kind of gene sequencing data simulation system and method for simulation crowd background information
CN111625509A (en) * 2020-05-26 2020-09-04 福州数据技术研究院有限公司 Lossless compression method for deep sequencing gene sequence data file
CN112309501A (en) * 2019-08-02 2021-02-02 华为技术有限公司 Gene comparison technology

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9483610B2 (en) * 2013-01-17 2016-11-01 Edico Genome, Corp. Bioinformatics systems, apparatuses, and methods executed on an integrated circuit processing platform
US11393559B2 (en) * 2016-03-09 2022-07-19 Sophia Genetics S.A. Methods to compress, encrypt and retrieve genomic alignment data

Also Published As

Publication number Publication date
CN113268461A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
CN112711935B (en) Encoding method, decoding method, apparatus, and computer-readable storage medium
CN110603595B (en) Methods and systems for reconstructing genomic reference sequences from compressed genomic sequence reads
US8972201B2 (en) Compression of genomic data file
WO2011007956A4 (en) Data compression method
CN104579360B (en) A kind of method and apparatus of data processing
CN101350858A (en) Method for decoding short message and user terminal
CN113539370B (en) Encoding method, decoding method, device, terminal device and readable storage medium
CN104937599A (en) Data analysis device and method therefor
CN115276666B (en) Efficient data transmission method for equipment training simulator
CN116151740A (en) Inventory transaction data process safety management system and cloud platform
CN112100982A (en) DNA storage method, system and storage medium
AU2021376411A1 (en) Quality score compression
JP2012124679A (en) Apparatus and method for decoding encoded data
CN111526151A (en) Data transmission method and device, electronic equipment and storage medium
CN113268461B (en) Method and device for gene sequencing data recombination packaging
JP2956704B2 (en) Variable length code converter
CN116827354B (en) File data distributed storage management system
CN112016270B (en) Logistics information coding method, device and equipment of Chinese-character codes
CN113990393B (en) Data processing method and device for gene detection and electronic equipment
CN115865099A (en) Multi-type data segmentation compression method and system based on Huffman coding
CN110111852A (en) A kind of magnanimity DNA sequencing data lossless Fast Compression platform
CN113779932A (en) Digital formatting method, device, terminal equipment and storage medium
CN114025024A (en) Data transmission method and device
CN108629157B (en) Method for compressing and encrypting nucleic acid sequencing data
CN110111851B (en) Gene sequencing data compression method, system and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant