CN113025697A - Rapid sequencing method based on nanopore - Google Patents

Info

Publication number
CN113025697A
Authority
CN
China
Prior art keywords
module
nanopore
data
signal
computational analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110305615.5A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Tianqing Intelligent Technology Co ltd
Original Assignee
Suzhou Tianqing Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Tianqing Intelligent Technology Co ltd
Priority to CN202110305615.5A
Publication of CN113025697A
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a rapid sequencing method based on a nanopore. The method comprises the following steps: step S1, detecting the signal change caused by the passing of the molecule by using the nanopore chip; step S2, the encoding module encodes the signal change into electronic data; step S3, uploading the electronic data to a server through a transmission module; step S4, the decoding module at the server end decodes the uploaded electronic data into signal data; in step S5, the computational analysis module at the server identifies the signal data as a molecular sequence. The invention can be applied to scenes needing molecular sequencing.

Description

Rapid sequencing method based on nanopore
Technical Field
The invention relates to the field of molecular sequencing, in particular to a rapid sequencing method based on a nanopore.
Background
Nanopore sequencing is a new generation of sequencing technology that has emerged in recent years. The most widely accepted nanopore sequencing platform on the market today is the biological nanopore sequencer from Oxford Nanopore Technologies (ONT). Compared with second-generation sequencing, it offers single-molecule sequencing, long read lengths (the longest reported read exceeds 2 Mb), real-time acquisition of sequencing data, no need for amplification, and the ability to recognize nucleic acid modifications. It is becoming irreplaceable in a number of specific application fields such as metagenome sequencing, genome sequencing of new species, pathogen sequencing, and epigenetic sequencing.
The principle of nanopore sequencing is as follows. A motor protein brings the DNA/RNA to the nanopore protein and unwinds it, and the potential difference across the membrane drives the unwound strand through the nanopore. Because bases differ in structure and charge, they present different resistances and therefore produce different electrical signals. Bases are finally identified by reading the raw electrical signal (base-calling).
Because the current is typically sampled at 7 to 9 times the rate at which the DNA sequence passes through the nanopore, base-calling poses a significant technical challenge. Compared with second-generation Illumina sequencing data, nanopore data have longer reads, a higher error rate, and an uneven length distribution.
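To make the oversampling problem concrete, the following minimal sketch collapses a raw current trace into events by starting a new event wherever two consecutive samples differ by more than a threshold. The trace values, the threshold, and the function name `segment_events` are assumptions made for illustration; the source does not specify a segmentation method.

```python
from statistics import mean


def segment_events(samples, jump_threshold):
    """Split a current trace at large jumps; return (mean level, sample count) per event."""
    if not samples:
        return []
    events = []
    current = [samples[0]]
    for value in samples[1:]:
        # A jump larger than the threshold is treated here as a change of
        # the bases occupying the pore (a simplifying assumption).
        if abs(value - current[-1]) > jump_threshold:
            events.append((mean(current), len(current)))
            current = [value]
        else:
            current.append(value)
    events.append((mean(current), len(current)))
    return events


# Hypothetical trace: three current levels, each sampled 7 to 9 times.
trace = [80.0] * 8 + [95.0] * 7 + [70.0] * 9
events = segment_events(trace, jump_threshold=5.0)
print(events)
```

Each event carries several redundant samples, so a base-caller has to decide how many bases a run of samples represents; real traces are noisy, which makes this much harder than in the toy trace.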
Sequencing generally produces massive amounts of data, on the order of gigabytes or even terabytes, which are difficult to store by conventional means. In addition, the base-calling algorithm and the downstream analysis algorithms are computationally very expensive and place extremely high demands on software and hardware, which makes the technology difficult to use.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a rapid sequencing method based on a nanopore.
The technical scheme of the invention is as follows: a rapid sequencing method based on a nanopore.
As shown in FIG. 1, the method comprises the following steps:
step S1, drawing the pre-prepared molecules to be detected through the nanopore, and detecting the signal change caused by the passage of the molecules;
step S2, after the signal change is detected, encoding it into electronic data through an encoding module for transmission and storage;
step S3, uploading the electronic data to the server side through the transmission module;
step S4, decoding the uploaded electronic data into the original signal data by a decoding module at the server side;
step S5, identifying the signal data as a molecular sequence by the computational analysis module at the server side.
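Steps S2 to S5 can be sketched end to end as follows. This is only an illustration under stated assumptions: the signal is a list of integers, encoding is JSON plus zlib, the "upload" is an in-memory list standing in for a network transfer, and identification is a toy lookup table. None of these choices, nor the names `encode`, `upload`, `decode`, `identify`, and `LEVEL_TO_BASE`, come from the source.

```python
import json
import zlib


def encode(signal):
    """S2: encode the detected signal into bytes for transmission and storage."""
    return zlib.compress(json.dumps(signal).encode("utf-8"))


def upload(payload, server_inbox):
    """S3: stand-in for the transmission module (no real network involved)."""
    server_inbox.append(payload)


def decode(payload):
    """S4: recover the original signal data at the server side."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


def identify(signal, level_to_base):
    """S5: toy identification that maps each signal level to one base."""
    return "".join(level_to_base[level] for level in signal)


# Hypothetical mapping and signal; real base-calling is far more involved.
LEVEL_TO_BASE = {10: "A", 20: "C", 30: "G", 40: "T"}
signal = [10, 30, 30, 40, 20]  # assumed output of S1

inbox = []
upload(encode(signal), inbox)
restored = decode(inbox[0])
sequence = identify(restored, LEVEL_TO_BASE)
print(sequence)
```

The point of the split is that only `encode` and `upload` run at the instrument, while `decode` and `identify` run at the server side.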
Further, the nanopore chip described in step S1 is composed of a nanopore element, a signal capture module, and a signal transmission module, integrated as a chipset.
Furthermore, the nanopore element is a solid-state nanopore consisting of a silicon substrate and a silicon nitride film.
Further, the signal encoding in step S2 also includes an encryption module, a compression module, and a data caching module. The encryption module can use a symmetric or an asymmetric encryption algorithm. The compression module is an efficient compression algorithm designed specifically for the characteristics of the signal data. The data cache consists of two layers: the first layer is memory and the second layer is a solid state disk.
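One way to picture a compression scheme tailored to signal data together with a two-layer cache is sketched below. It assumes integer samples, uses delta encoding followed by zlib as a stand-in for the unspecified compression algorithm, and uses a temporary directory as a stand-in for the solid state disk. Encryption is omitted because the source names no specific cipher. The names `delta_compress`, `delta_decompress`, and `TwoLayerCache` are hypothetical.

```python
import os
import struct
import tempfile
import zlib


def delta_compress(samples):
    """Store differences between neighbouring samples, then deflate them."""
    deltas, previous = [], 0
    for value in samples:
        deltas.append(value - previous)
        previous = value
    return zlib.compress(struct.pack(f"<{len(deltas)}i", *deltas))


def delta_decompress(blob):
    """Inverse of delta_compress: inflate, then accumulate the differences."""
    raw = zlib.decompress(blob)
    samples, total = [], 0
    for delta in struct.unpack(f"<{len(raw) // 4}i", raw):
        total += delta
        samples.append(total)
    return samples


class TwoLayerCache:
    """First layer: an in-memory dict. Second layer: files in spill_dir."""

    def __init__(self, memory_slots, spill_dir):
        self.memory_slots = memory_slots
        self.spill_dir = spill_dir
        self.memory = {}  # dicts keep insertion order, so the first key is the oldest

    def put(self, key, blob):
        self.memory[key] = blob
        while len(self.memory) > self.memory_slots:
            oldest = next(iter(self.memory))
            with open(os.path.join(self.spill_dir, oldest), "wb") as handle:
                handle.write(self.memory.pop(oldest))

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        with open(os.path.join(self.spill_dir, key), "rb") as handle:
            return handle.read()


with tempfile.TemporaryDirectory() as spill_dir:
    cache = TwoLayerCache(memory_slots=1, spill_dir=spill_dir)
    cache.put("read_1", delta_compress([500, 502, 501, 640, 642]))
    cache.put("read_2", delta_compress([300, 301]))  # pushes read_1 to disk
    roundtrip = delta_decompress(cache.get("read_1"))
print(roundtrip)
```

Delta encoding is chosen here because neighbouring samples of a slowly changing signal tend to be close in value, which leaves small numbers that a general-purpose compressor handles well; whether this matches the compression actually intended is not stated in the source.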
Further, the decoding module of step S4 includes a decryption module, a decompression module, and a data caching module. The algorithms of the decryption and decompression modules correspond to those of the encryption and compression modules in step S2. The data cache consists of three layers: the first layer is memory, the second layer is a solid state disk, and the third layer is a mechanical hard disk.
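A layered cache of this kind is usually read from the fastest layer first. The sketch below shows such a lookup with promotion of hits into the faster layers; plain dictionaries stand in for the memory, solid state, and mechanical disk layers, and both the class name `LayeredCache` and the promotion policy are assumptions rather than details from the source.

```python
class LayeredCache:
    """Look up a key layer by layer, fastest first, promoting hits upward."""

    def __init__(self, layers):
        self.layers = layers  # list of dict-like stores, fastest first

    def get(self, key):
        for index, layer in enumerate(self.layers):
            if key in layer:
                value = layer[key]
                # Copy the value into every faster layer so the next read is cheaper.
                for faster in self.layers[:index]:
                    faster[key] = value
                return value
        raise KeyError(key)


# Dictionaries standing in for memory, solid state disk, and mechanical hard disk.
memory, solid_state, hard_disk = {}, {}, {"read_7": b"payload"}
server_cache = LayeredCache([memory, solid_state, hard_disk])
value = server_cache.get("read_7")
print(value, "read_7" in memory, "read_7" in solid_state)
```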
Further, the server side in step S5 is a cloud server cluster, and the computational analysis module is a parallel algorithm running in the cloud server cluster.
Further, the computational analysis module of step S5 also performs calibration and data quality evaluation on the sequence.
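The source does not say which quality metric is evaluated. As one common possibility, the sketch below assumes per-base Phred quality scores, where Q = -10·log10(p) and p is the estimated error probability, and summarizes a read by averaging the error probabilities rather than the scores. The function names are hypothetical.

```python
import math


def phred_to_error(quality):
    """Convert a Phred score Q to an error probability p = 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def mean_read_quality(qualities):
    """Average the error probabilities, then convert back to the Phred scale."""
    mean_error = sum(phred_to_error(q) for q in qualities) / len(qualities)
    return -10 * math.log10(mean_error)


# Hypothetical per-base scores: one poor base dominates the read-level quality.
print(round(mean_read_quality([10, 30]), 2))
```

Averaging probabilities instead of scores means a few low-quality bases pull the read-level value down sharply, which is usually the desired behaviour for filtering.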
Further, the molecular sequences in step S5 are DNA sequences and RNA sequences, as well as DNA modifications and RNA modifications.
Further, the computational analysis module in step S5 also performs further computational or statistical analysis after the sequence is identified, and all of this analysis is carried out in parallel by the cluster at the server side.
Further, the computational and statistical analysis of step S5 includes analyzing structural variations, repeat regions, single nucleotide polymorphisms (SNPs), modified bases, haplotypes, metagenomes, isoforms, indirect variants, and fusions. Each analysis is a separate algorithm that runs in parallel on the server cluster, so that the analysis completes quickly and a full analysis report can be generated.
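The idea of independent analyses running side by side and being merged into one report can be sketched as below. A local thread pool stands in for the server cluster, and the three toy analyses (GC content, longest homopolymer, base counts) are placeholders chosen for brevity; they are not the analyses listed above, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence):
    """Fraction of bases that are G or C."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def longest_homopolymer(sequence):
    """Length of the longest run of one repeated base."""
    best = run = 1
    for previous, current in zip(sequence, sequence[1:]):
        run = run + 1 if current == previous else 1
        best = max(best, run)
    return best


def base_counts(sequence):
    """Number of occurrences of each base."""
    return {base: sequence.count(base) for base in "ACGT"}


ANALYSES = {
    "gc_content": gc_content,
    "longest_homopolymer": longest_homopolymer,
    "base_counts": base_counts,
}


def run_report(sequence):
    """Submit every analysis at once and collect the results into one report."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(func, sequence) for name, func in ANALYSES.items()}
        return {name: future.result() for name, future in futures.items()}


report = run_report("ACGGGTTA")
print(report)
```

Because each analysis only reads the identified sequence and writes its own result, adding another analysis is a matter of adding one entry to `ANALYSES`.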
The beneficial effects of the invention are as follows: the nanopore-based rapid sequencing method greatly reduces the computational cost of sequencing analysis and the storage cost of sequencing results, reduces the cost of updating and using sequencing software and analysis algorithms, and improves the overall performance of the whole sequencing process. It lowers the threshold for users to adopt the sequencing technology, increases speed, and greatly improves the overall efficiency of the industry.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of steps of a method for nanopore-based molecular sequencing according to an embodiment of the present invention;
FIG. 2 is a block diagram of a nanopore based molecular sequencing system according to an embodiment of the present invention;
FIG. 3 is a signal schematic of nanopore-based whole genome sequencing according to an embodiment of the invention;
FIG. 4 is a DNA sequence schematic of nanopore-based whole genome sequencing according to an embodiment of the invention;
Detailed Description of the Preferred Embodiments
In order to make the technical solutions of the present invention better understood by those skilled in the art, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings.
A rapid sequencing method based on a nanopore is shown in FIG. 2: at the nanopore chip terminal, a signal is generated when the target to be detected passes through the nanopore; the system captures the signal and encodes it into electronic data; the data is uploaded to the server side through a network transmission module; and sequence recognition and the computational analysis report are carried out at the server side.
Optionally, the transmission module may include, but is not limited to, a wired network or a wireless network. The wired network includes a local area network, a metropolitan area network, and a wide area network; the wireless network includes Bluetooth, Wi-Fi, and other networks that enable wireless communication.
Optionally, in this embodiment, some local processing may be performed when the signal is encoded into electronic data, including but not limited to data compression, data encryption, data source quality evaluation, and transmission efficiency evaluation.
Optionally, in this embodiment, the server-side recognition algorithm includes, but is not limited to, machine learning based algorithms and consensus based algorithms.
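As a minimal illustration of the consensus idea, the sketch below takes several already-aligned reads of equal length and keeps the most frequent base at each position. Real consensus algorithms also handle alignment, insertions, and deletions; this simplification and the function name `consensus` are assumptions.

```python
from collections import Counter


def consensus(aligned_reads):
    """Column-wise majority vote over equal-length, pre-aligned reads."""
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    # On a tie, Counter.most_common keeps the base that was seen first.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


# Hypothetical reads of the same region, each with at most one error.
reads = ["ACGTA", "ACGAA", "TCGTA"]
print(consensus(reads))
```

Errors that occur at different positions in different reads are voted out, which is why repeated reads of the same region can compensate for a high per-read error rate.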
Two specific examples follow.
Whole genome sequencing based on the invention:
Following the steps of use, a laboratory first collects a biological sample and prepares the sample to be tested, then places the fully prepared sample in the detection well of a nanopore instrument. When DNA in the sample passes through the nanopore, a current signal such as that shown in FIG. 3 is generated. After the instrument transmits the signal data to the cloud, the cloud generates the DNA sequence in real time through a recognition algorithm, as shown in FIG. 4, and builds a sequence library of the biological sample in the cloud. Subsequently, the laboratory may also use a series of quality testing and sequencing tools in the cloud to obtain a Whole Genome Sequencing (WGS) dataset.
Rapid detection of the novel coronavirus based on the invention:
An institution or laboratory develops an identification algorithm from the known novel coronavirus sequence and deploys it to the cloud in advance, where it can be made available to other institutions or laboratories. In use, an institution or laboratory prepares the sample to be tested, sequences it, and uploads the data according to the steps of use. After the signal data is transmitted to the cloud, the cloud runs the corresponding novel coronavirus sequence recognition algorithm, builds the library in real time, and generates an analysis report.
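The source does not describe how the recognition algorithm compares data against the known sequence. One simple possibility, applied here to already-identified bases rather than to raw signal, is to measure how many of a read's k-mers also occur in the reference. The reference string below is made up for illustration and is not a viral sequence; the function names, `k`, and `threshold` are likewise assumptions.

```python
def kmers(sequence, k):
    """Set of all substrings of length k."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmer_fraction(read, reference, k):
    """Fraction of the read's distinct k-mers that also occur in the reference."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & kmers(reference, k)) / len(read_kmers)


def matches_reference(read, reference, k=4, threshold=0.8):
    """Report a match when enough of the read's k-mers are found in the reference."""
    return shared_kmer_fraction(read, reference, k) >= threshold


# Made-up reference used only to exercise the functions.
TOY_REFERENCE = "ACGTTGCAAGGCTTAACCGT"
print(matches_reference("TGCAAGGCTT", TOY_REFERENCE))  # read taken from the reference
print(matches_reference("GGGGGGGGGG", TOY_REFERENCE))  # unrelated read
```

A threshold below 1.0 tolerates some sequencing errors in the read; the value 0.8 is arbitrary and would need tuning against real data.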
The novel coronavirus spreads extremely fast and extremely widely. For its detection the method has clear advantages: detection is fast, and results can be reported in real time.
If the novel coronavirus identification algorithm is developed as an intelligent learning algorithm, the virus can be detected rapidly and variant structures of the virus can be found quickly, which is of great significance for the prevention and treatment of the novel coronavirus.
The above description is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the protective scope of the present invention.

Claims (10)

1. A rapid sequencing method based on a nanopore is characterized by comprising the following steps:
step S1, detecting the signal change caused by the passing of the molecule by using the nanopore chip;
step S2, the encoding module encodes the signal change into electronic data;
step S3, uploading the electronic data to a server through a transmission module;
step S4, the decoding module at the server end decodes the uploaded electronic data into signal data;
in step S5, the computational analysis module at the server identifies the signal data as a molecular sequence.
2. The method of claim 1, wherein the nanopore chip of step S1 is comprised of a nanopore element, a signal capture module, and a signal transmission module.
3. The method of claim 2, wherein the nanopore element is a solid state nanopore.
4. The method as claimed in claim 1, 2 or 3, wherein the signal encoding of step S2 includes an encryption module, a compression module, and a data buffering module.
5. The method of claim 4, wherein the decoding module of step S4 comprises a decryption module, a decompression module, and a data buffering module.
6. The method according to claim 1, 2, 3 or 5, wherein the server side in step S5 is a cloud server cluster, and the computational analysis module runs in the cloud server cluster.
7. The method of claim 1, 2, 3, 5 or 6, wherein the computational analysis module of step S5 further comprises performing calibration and data quality assessment on the sequence.
8. The method of claim 1, 2, 3, 5, 6 or 7, wherein the molecular sequences of step S5 are DNA sequences and RNA sequences, and DNA modifications and RNA modifications.
9. The method of claim 8, wherein the computational analysis module of step S5 further comprises further computational analysis or statistical analysis after identifying the sequence.
10. The method of claim 9, wherein the further computational analysis and statistical analysis module can analyze structural variations, repeat regions, Single Nucleotide Polymorphisms (SNPs), modified bases, haplotypes, metagenomes, isoforms, indirect variants, and fusions.
CN202110305615.5A 2021-03-24 2021-03-24 Rapid sequencing method based on nanopore Pending CN113025697A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110305615.5A CN113025697A (en) 2021-03-24 2021-03-24 Rapid sequencing method based on nanopore

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110305615.5A CN113025697A (en) 2021-03-24 2021-03-24 Rapid sequencing method based on nanopore

Publications (1)

Publication Number Publication Date
CN113025697A true CN113025697A (en) 2021-06-25

Family

ID=76472828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110305615.5A Pending CN113025697A (en) 2021-03-24 2021-03-24 Rapid sequencing method based on nanopore

Country Status (1)

Country Link
CN (1) CN113025697A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114615567A (en) * 2022-03-08 2022-06-10 东南大学 Solid-state nanopore gene sequencing data communication method based on wireless communication
WO2023123344A1 (en) * 2021-12-31 2023-07-06 深圳华大生命科学研究院 Nucleic acid molecule capable of blocking motor protein, and construction method and application thereof

Legal Events

Date Code Title Description
PB01 Publication