WO2021164472A1 - Library construction method based on nanopore sequencing platform, microorganism identification method, and application - Google Patents

Publication number
WO2021164472A1
WO2021164472A1 PCT/CN2021/071423 CN2021071423W WO2021164472A1 WO 2021164472 A1 WO2021164472 A1 WO 2021164472A1 CN 2021071423 W CN2021071423 W CN 2021071423W WO 2021164472 A1 WO2021164472 A1 WO 2021164472A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
highly sensitive
data
bacteria
comparison data
Application number
PCT/CN2021/071423
Inventor
辜家爽
付爱思
Original Assignee
武汉臻熙医学检验实验室有限公司
Application filed by 武汉臻熙医学检验实验室有限公司
Publication of WO2021164472A1

Classifications

    • C: CHEMISTRY; METALLURGY
      • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
        • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
          • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
            • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
              • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
              • C12Q1/6869: Methods for sequencing
              • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
                • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
                  • C12Q1/689: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
                  • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
            • C12Q1/70: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
              • C12Q1/701: Specific hybridization probes
          • C12Q2600/00: Oligonucleotides characterized by their use
            • C12Q2600/16: Primer sets for multiplex assays
      • C40: COMBINATORIAL TECHNOLOGY
        • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
          • C40B40/00: Libraries per se, e.g. arrays, mixtures
            • C40B40/04: Libraries containing only organic compounds
              • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
                • C40B40/08: Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
          • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
            • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • G: PHYSICS
      • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
        • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
          • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
            • G16B30/10: Sequence alignment; Homology search

Definitions

  • The present disclosure relates to the field of gene sequencing, in particular to a method for constructing a library based on a nanopore sequencing platform, a method for identifying microorganisms, and applications thereof.
  • Bacteria, fungi, and viruses are three types of pathogens that cause clinical infections. Culture is the common clinical method for detecting bacteria and fungi. Because growth conditions differ naturally among bacteria and fungi, culture-based detection is slow and often takes 2-7 days; in addition, the growth of some pathogens is affected by other pathogenic bacteria, which lowers the sensitivity of culture-based detection. Different types of bacteria or fungi also require different culture methods, so the type of infection must be predicted in advance, which limits the application of culture.
  • PCR methods can detect nucleic acids in specimens without pre-judging the types of bacterial and fungal infections, and detect bacteria, fungi, and viruses in specimens at one time without distinction.
  • Conventional PCR methods are limited by PCR technical conditions and can generally diagnose only 1-15 specific pathogens in a specimen at the same time.
  • Antibody-based detection can only target specific pathogens.
  • Pathogen diagnosis by gene sequencing uses a gene sequencer to sequence specimens that have undergone nucleic acid extraction and sequencing library construction, and then compares the obtained gene sequences with a database to determine the species of the fungi and/or bacteria that the specimen contains.
  • The comprehensiveness of the test (detecting bacteria and fungi at the same time), its speed (time from specimen to report), its sensitivity (detecting pathogens present at very low levels in clinical specimens of complex composition), and its convenience (the site environment required to carry out the test and the minimum sample size requirement) are extremely important for clinical testing.
  • The overall detection cycle of second-generation metagenomic sequencing is long, often requiring 1 to 3 days.
  • The amount of data required for detection is large, and the detection cost is relatively high.
  • The sequencing and analysis equipment is demanding and occupies a large area.
  • Samples therefore have to be collected and sent for testing, which can only be carried out in professional testing companies or large central laboratories.
  • Nanopore metagenomic sequencing analysis has the advantages of short sequencing time and excellent sequencing results.
  • However, it may require a large amount of sequencing data, for example, generally 5-10 Gb/sample, and the cost of reagents is 5-10 times higher than that of second-generation sequencing, so it is difficult to implement in clinical practice.
  • The present disclosure solves at least one of the problems existing in the prior art to a certain extent, and provides a method for constructing a library based on a nanopore sequencing platform, a method for identifying microorganisms, and applications thereof.
  • The inventors of the present disclosure researched and designed target gene sites for microbial identification, for example target gene sites for bacteria, fungi, and viruses; at the same time, they optimized the detection workflow across the different target genes, so that bacteria, fungi, viruses, etc. in a sample can be detected simultaneously in one sequencing run.
  • With the target gene loci provided by the present disclosure, the nanopore sequencing platform can be combined with a targeted enrichment method to increase the proportion of microbial nucleic acid during sequencing and to reduce the amount of data required for detection to 10-50 Mb/sample, which greatly reduces the cost of testing.
  • The present disclosure provides a method for constructing a library based on a nanopore sequencing platform, including: enriching target genes from microorganisms to obtain enriched products, the microorganisms including at least one selected from bacteria, fungi, or viruses; and constructing a library based on the enriched products, so as to obtain the sequencing library.
  • In this method, target genes from microorganisms are enriched and a library is constructed from the obtained enriched products, for example by following the library construction process of the nanopore sequencing platform, to obtain a sequencing library. The obtained sequencing library contains only the enriched target gene nucleic acid from microorganisms.
  • The obtained sequencing library can be used for rapid sequencing analysis on the nanopore sequencing platform.
  • The detection time from sample to result is less than 12 hours and can be as short as 6 hours, and bacteria, fungi, and viruses can be detected in a single broad-spectrum test.
  • Because the target gene nucleic acid is enriched, the amount of data required for detection is reduced to 10-50 Mb/sample, which greatly reduces the cost of detection.
  • The method for constructing a library based on the nanopore sequencing platform described above may further include the following technical features:
  • The bacterial target genes include universal bacterial target genes and/or highly sensitive bacterial target genes. The universal bacterial target genes include at least one selected from the group consisting of 16S rRNA, rpoB, gyrB, hsp60, ISR, and 23S rRNA, preferably including the target regions listed in Table 5 and the 500 bp regions before and after them. The highly sensitive bacterial target genes include at least one selected from the target genes of the highly sensitive bacteria shown in Table 1; these target genes are specifically 16S rRNA, rpoB, gyrB, hsp60, ISR, dnaJ, tuf, atpD, rnpB, sodA, inhA, mip, recA, trkA, femA, gap, katG, mabA, gacA and other genes, preferably including the target regions listed in Table 7 and the 500 bp regions before and after them.
  • the universal bacterial target gene has strong
  • The fungal target genes include universal fungal target genes and/or highly sensitive fungal target genes. The universal fungal target genes include at least one selected from ITS1-4, LSU (D1/2), 18S rRNA, and RPB2, preferably including the target regions listed in Table 6 and the 500 bp regions before and after them; for example, the flanking region may be 100 to 450 bp before and after, 100 to 400 bp before and after, 100 to 350 bp before and after, and 100 bp before and after.
  • The highly sensitive fungal target genes include at least one selected from the target genes of highly sensitive fungi shown in Table 2; these target genes may specifically be RPB1, RPB2, TEF1, BenA, CaM, ND6, MCM7, CAL, TUB2, ACT and other genes, preferably including the table
  • The viral target genes include multiple viral target genes and/or new coronavirus target genes. The multiple viral target genes include at least one selected from the viral target genes shown in Table 3, preferably including the target regions listed in Table 9 and the 200 bp regions before and after them; for example, the flanking region may be 100 to 200 bp before and after, 100 to 150 bp before and after, or within 100 bp before and after. The new coronavirus target genes include at least one selected from the target genes shown in Table 4, preferably including the target regions listed in Table 10 and the 200 bp regions before and after them; for example, the flanking region may be 100 to 200 bp before and after, 100 to 150 bp before and after, within 100 bp before and after, or 50 to 100 bp before and after.
  • The bacterial target genes include universal bacterial target genes and highly sensitive bacterial target genes, the fungal target genes include universal fungal target genes and highly sensitive fungal target genes, and the viral target genes include multiple viral target genes and new coronavirus target genes. The method further comprises: mixing the enriched products of the universal bacterial target genes, the universal fungal target genes, the highly sensitive bacterial target genes, the highly sensitive fungal target genes, the multiple viral target genes, and the new coronavirus target genes in a mass ratio of 20-60:5-15:10-25:10-25:10-25, and constructing a library from the mixed product to obtain the sequencing library.
  • The enriched products of the universal bacterial target genes, the universal fungal target genes, the highly sensitive bacterial target genes, the highly sensitive fungal target genes, and the multiple viral target genes can be mixed in a mass ratio of 30-50:6-10:12-20:12-20:12-20.
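The mass-ratio mixing described above can be illustrated with a short calculation. This is only a sketch: the function name `pool_masses`, the pool labels, the 1000 ng total, and the particular ratio 40:8:16:16:20 are assumptions chosen for illustration; the source states only the ranges 30-50:6-10:12-20:12-20:12-20.

```python
# Allowed ratio ranges for the five enriched products, as stated in the text.
RATIO_RANGES = {
    "universal_bacterial": (30, 50),
    "universal_fungal": (6, 10),
    "sensitive_bacterial": (12, 20),
    "sensitive_fungal": (12, 20),
    "multiple_viral": (12, 20),
}


def pool_masses(total_ng, ratio):
    """Split a total mass (ng) across enriched products in proportion to `ratio`."""
    for name, weight in ratio.items():
        low, high = RATIO_RANGES[name]
        if not low <= weight <= high:
            raise ValueError(f"{name} ratio {weight} is outside {low}-{high}")
    weight_sum = sum(ratio.values())
    return {name: total_ng * weight / weight_sum for name, weight in ratio.items()}


# Hypothetical ratio picked inside the stated ranges (not taken from the source).
ratio = {
    "universal_bacterial": 40,
    "universal_fungal": 8,
    "sensitive_bacterial": 16,
    "sensitive_fungal": 16,
    "multiple_viral": 20,
}
masses = pool_masses(1000.0, ratio)
for name, mass in masses.items():
    print(f"{name}: {mass:.1f} ng")
```

Any other ratio inside the stated ranges can be substituted; the function rejects values that fall outside them.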
  • The method for constructing a library based on a nanopore sequencing platform further includes: performing PCR amplification on the target genes from the microorganisms using the primers in a primer pool, so as to enrich the target genes.
  • The primer pool contains at least one primer.
  • The primers in the primer pool all meet the following conditions:
    a. the length of the primer is 18-30 bases;
    b. the melting temperature Tm of the primer is 57-64°C;
    c. the GC content of the primer is 40-60%;
    d. the Gibbs free energy ΔG of the 5 bases at the 3' end of the primer is greater than or equal to -9 kcal/mol;
    e. the primer self-complementarity value is less than 8.0, and the 3' end self-complementarity parameter of the primer is less than 3.0;
    f. the length of the amplified product of the primer is 200 to 1500 bases.
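Conditions a to f can be applied as a simple filter over candidate primers. The sketch below assumes that Tm, ΔG, and the two self-complementarity values have already been computed by a primer design tool and are passed in as numbers, since the source does not specify the models used to derive them. The `PrimerMetrics` class, the example sequence, and all example values are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PrimerMetrics:
    sequence: str            # primer sequence, 5' to 3'
    tm_celsius: float        # melting temperature (precomputed elsewhere)
    delta_g_3prime: float    # kcal/mol for the last 5 bases at the 3' end
    self_any: float          # primer self-complementarity value
    self_3prime: float       # 3' end self-complementarity parameter
    amplicon_length: int     # length of the amplified product, in bases


def gc_percent(sequence):
    """Percentage of G and C bases in the sequence."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def passes_conditions(p):
    """Return True when a primer satisfies conditions a-f from the text."""
    return (
        18 <= len(p.sequence) <= 30                    # a. length
        and 57.0 <= p.tm_celsius <= 64.0               # b. Tm
        and 40.0 <= gc_percent(p.sequence) <= 60.0     # c. GC content
        and p.delta_g_3prime >= -9.0                   # d. 3' end stability
        and p.self_any < 8.0 and p.self_3prime < 3.0   # e. self-complementarity
        and 200 <= p.amplicon_length <= 1500           # f. product length
    )


# Hypothetical candidate: the sequence and the metric values are illustrative only.
candidate = PrimerMetrics("AGAGTTTGATCCTGGCTCAG", 58.5, -7.2, 4.0, 1.0, 1450)
too_short_product = replace(candidate, amplicon_length=100)
print(passes_conditions(candidate), passes_conditions(too_short_product))
```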
  • The primer self-complementarity value characterizes the tendency of each primer's own base sequence to form a complementary structure; the 3' end self-complementarity parameter of the primer characterizes the tendency of the base sequences at the 3' ends of different primers to form complementary structures.
  • This tendency is evaluated numerically.
  • The scoring logic is: each complementary base pair counts as 1 point, complementation involving an N base counts as -0.25 points, a non-complementary base pair counts as -1 point, and a gap in the complementary structure counts as -2 points.
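The scoring logic above can be sketched for a single, already-aligned pair of strands. This is a simplification under stated assumptions: the two strings are taken to be pre-aligned position by position, with `-` marking a gap, and the search over all possible alignments that a primer design tool would perform is not shown. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def position_score(x, y):
    """Score one aligned position using the point values given in the text."""
    if x == "-" or y == "-":
        return -2.0      # gap
    if x == "N" or y == "N":
        return -0.25     # pairing that involves an N base
    return 1.0 if COMPLEMENT.get(x) == y else -1.0


def alignment_score(top, bottom):
    """Sum the per-position scores of two pre-aligned strands of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned strands must have the same length")
    return sum(position_score(x, y) for x, y in zip(top.upper(), bottom.upper()))


# Four complementary pairs, one N pairing, one gap.
print(alignment_score("ACGTN-", "TGCAAG"))
# Three complementary pairs and one mismatch.
print(alignment_score("ACGA", "TGCA"))
```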
  • The primer pool includes at least one selected from the following primer pools: a universal bacterial primer pool including the primers listed in Table 5; a universal fungal primer pool including the primers listed in Table 6; a highly sensitive bacterial primer pool including the primers listed in Table 7; a highly sensitive fungal primer pool including the primers listed in Table 8; a multiple virus primer pool including the primers listed in Table 9; and a new coronavirus primer pool including the primers listed in Table 10.
  • The present disclosure provides a sequencing method, including: obtaining a sequencing library based on the method described in any one of the embodiments of the first aspect of the present disclosure; and sequencing the sequencing library.
  • Compared with second-generation sequencing technology, sequencing on the nanopore sequencing platform can be carried out on-site in non-professional laboratories, and detection and data analysis can be performed on laptops and through cloud workflows.
  • The present disclosure provides a method for identifying microorganisms, including: obtaining a sequencing library from the nucleic acid of the sample to be tested according to the method of any one of the embodiments of the first aspect of the present disclosure; sequencing the sequencing library using a nanopore sequencing platform to obtain sequencing results; and comparing the sequencing results with a reference database and determining the microorganisms in the sample to be tested based on the comparison results.
  • The provided method for identifying microorganisms can be used to identify pathogens, and pathogen identification can assist clinical medication.
  • The provided method can identify the types of bacteria, fungi or viruses in the sample to be tested. It can be used to assist clinical medication and also for other, non-diagnostic purposes, such as big data collection and the construction of commercial kits or commercial platforms.
  • The method for identifying microorganisms described above may further include the following technical features:
  • The reference database includes at least one of the following: a general bacterial database containing 16S rRNA, rpoB, gyrB, hsp60, 23S rRNA, and ISR gene data; a general fungal database containing ITS1-4, LSU (D1/2), 18S rRNA and RPB2 gene data; a highly sensitive bacteria database containing the highly sensitive bacterial target gene data shown in Table 1; a highly sensitive fungus database containing the highly sensitive fungal target gene data shown in Table 2; a multiple virus database containing the virus target gene data shown in Table 3; and a new coronavirus database containing genomic data of SARS-CoV-2.
  • The method for identifying microorganisms described above further includes: comparing the sequencing results with the general bacterial database and the general fungal database, respectively, so as to obtain first comparison data and first uncompared data; comparing the first uncompared data with the highly sensitive bacteria database and the highly sensitive fungus database, respectively, so as to obtain second comparison data and second uncompared data; comparing the second uncompared data with the multiple virus database and the new coronavirus database to obtain third comparison data; and determining the general bacteria and general fungi contained in the sample based on the first comparison data, the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second comparison data, and the viruses contained in the sample based on the third comparison data.
  • Determining the general bacteria and general fungi contained in the sample based on the first comparison data further includes: dividing the first comparison data into first unique comparison data and at least one set of first cross-alignment data, each set of first cross-alignment data containing multiple alignment sequences; using some of the alignment sequences in each set of first cross-alignment data as seed sequences, and using the remaining alignment sequences in the set to correct the seed sequences, so as to obtain corrected seed sequences; and combining the optimal alignment results of the corrected seed sequences against the general bacterial database and the general fungal database with the first unique comparison data to determine the general bacteria and general fungi contained in the sample.
  • Determining the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second comparison data further includes: dividing the second comparison data into second unique comparison data and at least one set of second cross-alignment data, each set of second cross-alignment data containing multiple alignment sequences; using some of the alignment sequences in each set of second cross-alignment data as seed sequences, and using the remaining alignment sequences in the set to correct the seed sequences, so as to obtain corrected seed sequences; and combining the optimal alignment results of the corrected seed sequences against the highly sensitive bacteria database and the highly sensitive fungus database with the second unique comparison data to determine the highly sensitive bacteria and highly sensitive fungi contained in the sample.
  • Based on the general bacteria and highly sensitive bacteria determined in the sample, all bacteria contained in the sample are determined in combination; based on the general fungi and highly sensitive fungi determined in the sample, all fungi contained in the sample are determined in combination.
  • Determining the virus contained in the sample based on the third comparison data further includes:
  • The alignment sequences in the third comparison data that are aligned to the same gene region are treated as a group; some of the alignment sequences in each group are used as seed sequences, and the remaining alignment sequences in the group are used to correct the seed sequences, so as to obtain corrected seed sequences;
  • It further includes: determining the viral mutation sites present in the sample to be tested based on base differences, relative to the multiple virus database and the new coronavirus database, that are the same in at least 80% of the corrected seed sequences.
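The 80% criterion for mutation sites can be sketched as a per-position vote. The sketch assumes that the corrected seed sequences have already been aligned to the reference and have the same length as it, with no insertions or deletions; the source does not describe how alignment or indels are handled. The function name `shared_differences` and the toy sequences are hypothetical.

```python
from collections import Counter


def shared_differences(reference, sequences, min_fraction=0.8):
    """Report positions where at least `min_fraction` of sequences carry the
    same base and that base differs from the reference.

    Returns a list of (1-based position, reference base, observed base).
    """
    if not sequences:
        return []
    if any(len(seq) != len(reference) for seq in sequences):
        raise ValueError("sequences must be aligned to the reference length")
    sites = []
    for index, ref_base in enumerate(reference):
        counts = Counter(seq[index] for seq in sequences)
        base, count = counts.most_common(1)[0]
        if base != ref_base and count / len(sequences) >= min_fraction:
            sites.append((index + 1, ref_base, base))
    return sites


# Toy data: four of five sequences share a C-to-T difference at position 6.
reference = "ACGTACGT"
corrected = ["ACGTATGT", "ACGTATGT", "ACGTATGT", "ACGTATGT", "ACGTACGA"]
print(shared_differences(reference, corrected))
```

The single-sequence difference at position 8 is not reported, because it falls well below the 80% threshold.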
  • The present disclosure provides an apparatus for identifying microorganisms, including: a data processing unit that compares the sequencing result of the nucleic acid of the sample to be tested with a reference database to determine the microorganisms in the sample to be tested.
  • The apparatus for identifying microorganisms further includes: a library construction unit that obtains a sequencing library from the nucleic acid of the sample to be tested according to the method of any one of the embodiments of the third aspect of the present disclosure; and a sequencing unit that sequences the sequencing library using a nanopore sequencing platform, so as to obtain the sequencing result.
  • The apparatus for identifying microorganisms may further include the following technical features:
  • The reference database includes at least one of the following: a general bacterial database containing 16S rRNA, rpoB, gyrB, hsp60, 23S rRNA, and ISR gene data; a general fungal database containing ITS1-4, LSU (D1/2), 18S rRNA and RPB2 gene data; a highly sensitive bacteria database containing the bacterial target gene data shown in Table 1; a highly sensitive fungus database containing the fungal target gene data shown in Table 2; a multiple virus database containing the virus target gene data shown in Table 3; and a new coronavirus database containing genomic data of SARS-CoV-2.
  • The data processing unit further performs the following: comparing the sequencing results with the general bacterial database and the general fungal database, respectively, so as to obtain first comparison data and first uncompared data; comparing the first uncompared data with the highly sensitive bacteria database and the highly sensitive fungus database, so as to obtain second comparison data and second uncompared data; comparing the second uncompared data with the multiple virus database and the new coronavirus database to obtain third comparison data; and determining the general bacteria and general fungi contained in the sample based on the first comparison data, the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second comparison data, and the viruses contained in the sample based on the third comparison data.
  • The first comparison data refers to the data that can be aligned to the general bacteria database and the general fungus database.
  • The first uncompared data refers to the data that cannot be aligned to the general bacteria database or the general fungus database.
  • The first uncompared data is then compared with the highly sensitive bacteria database and the highly sensitive fungus database.
  • The data that can be aligned in this step is the second comparison data.
  • The data that still cannot be aligned is the second uncompared data.
  • The second uncompared data is then compared with the multiple virus database and the new coronavirus database, and the data that can be aligned is the third comparison data.
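The three-tier comparison described above amounts to a cascade in which each tier only sees the reads left unaligned by the previous tier. The sketch below shows that control flow only. The `substring_hit` predicate is a deliberately crude stand-in for a real sequence aligner, and the function names, toy reads, and toy databases are all hypothetical.

```python
def split_by_alignment(reads, databases, hits):
    """Partition reads into those that hit any database in the tier and the rest."""
    aligned, unaligned = [], []
    for read in reads:
        if any(hits(read, database) for database in databases):
            aligned.append(read)
        else:
            unaligned.append(read)
    return aligned, unaligned


def tiered_comparison(reads, tiers, hits):
    """Run the tiers in order; each tier receives only the reads not yet aligned."""
    comparison_data = []
    remaining = list(reads)
    for databases in tiers:
        aligned, remaining = split_by_alignment(remaining, databases, hits)
        comparison_data.append(aligned)
    return comparison_data, remaining


def substring_hit(read, database):
    """Stand-in for an aligner: a read 'aligns' if it occurs in any reference."""
    return any(read in reference for reference in database)


# Toy reference databases (hypothetical sequences), grouped by tier.
tiers = [
    [["AAAACCCCGGGG"], ["TTTTAAAACCCC"]],      # general bacteria, general fungi
    [["GATTACAGATTACA"], ["CATCATCATCAT"]],    # highly sensitive bacteria, fungi
    [["GGGTTTGGGTTT"], ["ACGTACGTACGT"]],      # multiple virus, new coronavirus
]
reads = ["CCCCGG", "TTACAG", "GTACGT", "TTTTTTTT"]
(first, second, third), unaligned = tiered_comparison(reads, tiers, substring_hit)
print(first, second, third, unaligned)
```

In practice the `hits` argument would wrap a real aligner with its own identity and coverage thresholds, which the source does not specify.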
  • Determining the general bacteria and general fungi contained in the sample based on the first comparison data further includes: dividing the first comparison data into first unique comparison data and at least one set of first cross-alignment data, each set of first cross-alignment data containing multiple sequences; using some of the sequences in each set of first cross-alignment data as seed sequences, and using the remaining sequences in the set to correct the seed sequences, so as to obtain corrected seed sequences; and combining the optimal alignment results of the corrected seed sequences against the general bacterial database and the general fungal database with the first unique comparison data to determine the general bacteria and general fungi contained in the sample.
  • A random 20%-40% (for example, 30%) of the sequences in each group of first cross-alignment data can be used as seed sequences.
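The random selection of seed sequences can be sketched as follows. Only the split into seed sequences and remaining sequences is shown; the correction of the seeds by the remaining sequences is not specified in enough detail in the source to reproduce here. The function name, the rounding rule, and the toy group are assumptions.

```python
import random


def split_seed_and_rest(sequences, seed_fraction=0.3, rng=None):
    """Randomly pick `seed_fraction` of a group as seed sequences.

    Returns (seeds, rest), both in their original order. The fraction must lie
    in the 20%-40% range given in the text.
    """
    if not 0.2 <= seed_fraction <= 0.4:
        raise ValueError("seed_fraction must be between 0.2 and 0.4")
    if not sequences:
        return [], []
    rng = rng or random.Random()
    # Assumption: round to the nearest whole sequence and keep at least one seed.
    n_seed = max(1, round(len(sequences) * seed_fraction))
    seed_indices = set(rng.sample(range(len(sequences)), n_seed))
    seeds = [s for i, s in enumerate(sequences) if i in seed_indices]
    rest = [s for i, s in enumerate(sequences) if i not in seed_indices]
    return seeds, rest


# Toy group of ten reads (hypothetical); a fixed seed makes the split repeatable.
group = [f"read_{i}" for i in range(10)]
seeds, rest = split_seed_and_rest(group, 0.3, random.Random(0))
print(len(seeds), len(rest))
```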
  • Determining the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second comparison data further includes:
  • Each group of second cross-alignment data contains multiple sequences; a random 20%-40% (for example, 30%) of the sequences in each group can be used as seed sequences.
  • Determining the virus contained in the sample based on the third comparison data further includes: treating the sequences in the third comparison data that are aligned to the same gene region as one group; using some of the sequences in each group as seed sequences, and using the remaining sequences in the group to correct the seed sequences, so as to obtain corrected seed sequences; and determining the virus contained in the sample based on the optimal alignment results of the corrected seed sequences against the multiple virus database and the new coronavirus database.
  • A random 20%-40% (for example, 30%) of the sequences in each group can be used as seed sequences.
  • The present disclosure provides a kit including a primer pool, the primer pool including at least one selected from the following: a universal bacterial primer pool including the primers listed in Table 5; a universal fungal primer pool including the primers listed in Table 6; a highly sensitive bacterial primer pool including the primers listed in Table 7; a highly sensitive fungal primer pool including the primers listed in Table 8; a multiple virus primer pool including the primers listed in Table 9; and a new coronavirus primer pool including the primers listed in Table 10.
  • the beneficial effects achieved by the present disclosure are: applying the sequencing method provided by the present disclosure can quickly obtain microbial information in a sample, for example, it can achieve a detection result of a virus in a sample on the same day.
  • the detection sensitivity is high: it detects low-abundance viruses, bacteria, and fungi, helping early diagnosis and prompting the risk of early infection.
  • the detection range is wide, for example, it can be applied to the detection of the new coronavirus SARS-CoV-2, and it can also detect co-infections caused by other viruses to help quickly determine the diagnosis and treatment plan. It can further detect bacteria, fungi, atypical pathogens, and viruses at one time.
  • applying this method to identify microorganisms allows bacteria, fungi, viruses, etc. to be detected simultaneously, so the information obtained is more comprehensive. For example, virus detection and virus mutation monitoring can be carried out at the same time, providing rapid, real-time epidemic information that can be interpreted and acted upon for epidemic monitoring.
  • the provided method uses the nanopore sequencing platform for sequencing and identification of microorganisms. It is simple and portable, and is suitable for use in laboratories such as those of hospitals and CDCs. The sequencing data can be analyzed in the cloud, so that epidemic monitoring can be established quickly.
  • Fig. 1 is a schematic structural diagram of an apparatus for identifying microorganisms according to an embodiment of the present disclosure.
  • Fig. 2 shows the result of mutation analysis of the viral genome carried by the new coronavirus-infected patient numbered C1, according to an embodiment of the present disclosure.
  • “plurality” means at least two, such as two, three, etc., unless otherwise specifically defined.
  • “universal bacteria” refers to some common bacteria, which can usually be distinguished by combining some target genes. These target genes were screened and finally determined to be at least one of 16s rRNA, rpob, gyrB, hsp60, ISR, and 23s rRNA.
  • the general bacterial database refers to a reference database that can be used for the identification of these general bacteria. According to the embodiments of the present disclosure, these general bacterial databases contain data of these general bacterial target genes.
  • the universal bacterial primer pool refers to primers that can be used for the amplification or identification of these universal bacteria.
  • highly sensitive bacteria refers to some common pathogenic bacteria in the clinic.
  • the target genes used to characterize or identify these bacteria are usually different, but they often cause some clinical diseases.
  • the target genes on bacteria have been studied, and target genes that can characterize these bacteria have been found to be used for the identification of these bacteria.
  • general bacteria and highly sensitive bacteria are not deliberately distinguished, and there may be crossover, that is, some bacteria may belong to both general bacteria and highly sensitive bacteria.
  • the highly sensitive bacteria database refers to a reference database that can be used for the identification of these highly sensitive bacteria.
  • these highly sensitive bacteria databases contain the data of the target genes of these highly sensitive bacteria.
  • a highly sensitive bacterial primer pool refers to primers that can be used for the amplification or identification of these highly sensitive bacteria.
  • universal fungi refers to some common fungi, which can usually be distinguished by combining some target genes. These target genes were screened and finally determined to be at least one of ITS1-4, LSU(D1/2), 18s rRNA, and RPB2.
  • the general fungal database refers to a reference database that can be used for the identification of these general fungi. According to the embodiments of the present disclosure, these general fungal databases contain data on these general fungal target genes.
  • the universal fungal primer pool refers to primers that can be used for the amplification or identification of these universal fungi.
  • highly sensitive fungi refers to some common pathogenic fungi in the clinic.
  • the target genes used to characterize or identify these fungi are usually different, but they often cause some clinical diseases.
  • the target genes on the fungi have been studied, and the target genes that can characterize these fungi have been found and used for the identification of these fungi.
  • general fungi and highly sensitive fungi are not deliberately distinguished, and there may be overlap, that is, some fungi may belong to both general fungi and highly sensitive fungi.
  • the identification results of the highly sensitive fungi shall be used as the standard.
  • a highly sensitive fungus database refers to a reference database that can be used for the identification of these highly sensitive fungi.
  • these highly sensitive fungal databases contain data on target genes of these highly sensitive fungi.
  • the highly sensitive fungal primer pool refers to the primers that can be used for the amplification or identification of these highly sensitive fungi.
  • when referring to “multiple viruses”, it means that at least one virus is included.
  • multiple viral target genes refer to target genes present on these viruses.
  • the new coronavirus refers to the SARS-CoV-2 virus.
  • the multiple virus database refers to a reference database that can be used for the identification of these viruses.
  • the multiple virus database contains data on target genes of these viruses.
  • multiple virus primer pools refer to primers that can be used for the amplification or identification of these viruses.
  • the present disclosure provides a sequencing method, including: enriching target genes from microorganisms to obtain enriched products, the microorganisms including at least one selected from bacteria, fungi, or viruses; constructing a library based on the enriched products to obtain a sequencing library; and sequencing the sequencing library using a nanopore sequencing platform.
  • the obtained enriched products are constructed in accordance with the library construction process of the nanopore sequencing platform to obtain a sequencing library.
  • the obtained sequencing library only contains the enriched target gene nucleic acid from microorganisms.
  • the obtained sequencing library can be used to realize rapid sequencing analysis with the help of the nanopore sequencing platform.
  • the detection time from sample to result is less than 12 hours and can be as short as 6 hours, while achieving one-time broad-spectrum detection of bacteria, fungi, and viruses. Moreover, because the target gene nucleic acid is enriched, the amount of data required for detection is reduced to 10-50 Mb/sample, which greatly reduces the cost of detection.
  • compared with second-generation sequencing technology, sequencing on the nanopore sequencing platform allows on-site testing in non-specialist laboratories, and detection and data analysis can be performed with a laptop and cloud-based workflows.
  • these target genes may be at least one selected from 16s rRNA, rpob, gyrB, hsp60, 23s rRNA, and ISR. These target genes can be used as detection areas for bacteria, and the specific types of bacteria can be determined by comparing the sequencing data of these detection areas with a reference database. According to embodiments of the present disclosure, these target genes may also be at least one of the target genes of the highly sensitive bacteria described in Table 1. These bacteria are clinically important and highly sensitive bacteria, and the specific species of the corresponding bacteria can be determined by detecting the corresponding target genes.
  • these target genes may be at least one selected from ITS1-4, LSU(D1/2), 18s rRNA, and RPB2. These target genes can be used as detection regions for fungi, and the specific species of fungi can be determined by comparing the sequencing data of these detection regions with a reference database. According to the embodiments of the present disclosure, these target genes may also be the target genes of the fungi described in Table 2. These fungi are clinically important highly sensitive fungi. By detecting the target genes of these fungi, the specific species of the corresponding fungi can be determined.
  • the viral target genes include multiple viral target genes and/or new coronavirus target genes; the multiple viral target genes include at least one selected from the viral target genes shown in Table 3, and the new coronavirus target genes include at least one selected from the target genes shown in Table 4.
  • Virus name: Target gene
  • Boca virus: NP1, VP1-2
  • Rhinovirus: VP4/VP2, 5’UTR
  • Human metapneumovirus: N gene, F gene, glycoprotein G, N gene
  • Respiratory syncytial virus: N gene, P gene
  • Coronavirus: 1a, 1b
  • Adenovirus: hexon gene
  • Parainfluenza: H gene
  • Influenza A virus: M gene, H gene
  • Influenza B virus: M gene, HA, NA, NS
  • Influenza C virus: M gene
  • Enterovirus: VP1
  • Herpes virus: glycoprotein G
  • Rubella virus: E1
  • the method for constructing a library based on a nanopore sequencing platform further includes: performing PCR amplification on the target gene from the microorganism based on the primers in the primer pool, so as to realize the enrichment of the target gene.
  • the primer pool contains at least one primer.
  • the primer pool includes at least one selected from the following primer pools: a universal bacterial primer pool, which includes the primers listed in Table 5; a universal fungal primer pool, which includes the primers listed in Table 6; a highly sensitive bacterial primer pool, which includes the primers listed in Table 7; a highly sensitive fungal primer pool, which includes the primers listed in Table 8; a multiple virus primer pool, which includes the primers listed in Table 9; and a new coronavirus primer pool, which includes the primers listed in Table 10.
  • the enriched products of universal bacterial target genes, the enriched products of universal fungal target genes, the enriched products of highly sensitive bacteria and highly sensitive fungi target genes, the enriched products of multiple viral target genes, and the enriched products of new coronavirus target genes are mixed in a mass ratio of 20-60:5-15:10-25:10-25:10-25, and a library is constructed from the mixed products to obtain the sequencing library. According to a preferred embodiment of the present disclosure, these enriched products are mixed in a ratio of 30-50:6-10:12-20:12-20:12-20.
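The mass-ratio mixing described above is simple arithmetic, sketched here for illustration. The dictionary keys, the function names, and the example ratio 40:8:16:16:16 (chosen only because it falls inside the preferred ranges) are assumptions, not values stated in the disclosure.

```python
# Allowed ratio parts for each enriched product, taken from the
# 20-60:5-15:10-25:10-25:10-25 mass ratio in the text.
RATIO_RANGES = {
    "universal_bacteria": (20, 60),
    "universal_fungi": (5, 15),
    "highly_sensitive_bacteria_fungi": (10, 25),
    "multiple_virus": (10, 25),
    "new_coronavirus": (10, 25),
}


def within_ranges(parts, ranges=RATIO_RANGES):
    """Return True if every ratio part lies inside its allowed range."""
    return all(ranges[name][0] <= value <= ranges[name][1]
               for name, value in parts.items())


def mix_masses(parts, total_mass_ng):
    """Split a total mass (ng) across the enriched products according to the ratio."""
    ratio_sum = sum(parts.values())
    return {name: total_mass_ng * value / ratio_sum
            for name, value in parts.items()}


# Hypothetical ratio inside the preferred 30-50:6-10:12-20:12-20:12-20 window.
example_parts = {
    "universal_bacteria": 40,
    "universal_fungi": 8,
    "highly_sensitive_bacteria_fungi": 16,
    "multiple_virus": 16,
    "new_coronavirus": 16,
}
```

Dividing each resulting mass by the measured concentration of the corresponding product would give the volume to pipette.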
  • the present disclosure also provides an apparatus for identifying microorganisms, as shown in FIG. 1, comprising: a library construction unit, which obtains a sequencing library from the nucleic acid of the sample to be tested according to the above method; a sequencing unit, which sequences the sequencing library using a nanopore sequencing platform to obtain a sequencing result; and a data processing unit, which compares the sequencing result of the nucleic acid of the sample to be tested with a reference database to determine the microorganisms in the sample to be tested.
  • the sequencing unit can be connected to the library construction unit, and the data processing unit can be connected to the sequencing unit.
  • “connection” should be understood in a broad sense: for example, it may be a fixed connection, a detachable connection, or an integral whole; it may be a mechanical connection, an electrical connection, or a communicative connection; it may be a direct connection or an indirect connection through an intermediary; and it may be internal communication between two components or an interaction between two components, unless specifically defined otherwise.
  • the specific meaning of the above-mentioned terms in the present disclosure can be understood according to specific circumstances.
  • rpob bacterial RNA polymerase β subunit
  • gyrB gyrase B subunit
  • hsp60 Heat shock protein 60
  • ISR ribosomal 16s rRNA/23s rRNA intergenic region
  • atpD ATP synthase β subunit
  • dnaJ chaperone protein DnaJ (heat shock protein 40)
  • Using the database constructed in Example 1 and a self-written, integrated analysis pipeline, conservation and specificity analyses were performed on the data of each target.
  • Level 2: according to the species information of each data source in the database, all sequences in the database are flattened by genus and species name, keeping no more than 5 sequences for the same species.
  • after the target gene database is flattened, all the data in the database are used for analysis to obtain "comparison analysis file 2".
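The flattening step above (no more than 5 sequences per species) can be sketched as follows. The record layout and the choice to keep the first sequences encountered are assumptions; the text does not say which sequences are retained.

```python
from collections import defaultdict


def flatten_by_species(records, max_per_species=5):
    """Keep at most `max_per_species` sequences for each species name.

    `records` is an iterable of (species_name, sequence) pairs, where
    species_name is the full "Genus species" label. Assumption: the first
    sequences encountered for a species are the ones kept.
    """
    kept = []
    counts = defaultdict(int)
    for species, sequence in records:
        if counts[species] < max_per_species:
            counts[species] += 1
            kept.append((species, sequence))
    return kept
```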
  • Level 3: according to the important pathogen information in Table 1 and Table 2, each target gene database is split into an independent clinically important bacterial database (60 genera in total) and an independent clinically important fungal database (15 genera in total), and each independent database is analyzed separately to obtain "comparison analysis file 3".
  • the target gene can be used for bacterial primer design.
  • the criteria for determining conserved regions are:
  • the final universal bacterial target genes are: 16s rRNA, rpob, gyrB, hsp60, ISR, 23s rRNA.
  • conservation frequency analysis and variable pattern analysis were performed on the conserved region sites obtained in the "target gene evaluation file", yielding the "site frequency file" and the "variable pattern file", respectively.
  • the "site frequency file" records the exact probability of A, T, C, G, and deletion at each base site in each conserved region. Based on this file, the maximum probability sequence for each region and the other possible permutations and combinations are calculated.
  • the "variable pattern file” records the relationship between variable bases at different positions on each conserved region.
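As an illustration of what a "site frequency file" could contain, the sketch below computes the per-site probabilities of A, T, C, G, and deletion from a set of aligned sequences and derives the maximum probability sequence. It assumes equal-length aligned input with "-" marking a deletion; the function names and data layout are hypothetical, not the disclosure's file format.

```python
from collections import Counter

ALPHABET = "ATCG-"  # "-" stands for a deletion at that site


def site_frequencies(aligned_sequences):
    """Return, for each site, the probability of A, T, C, G, and deletion."""
    n = len(aligned_sequences)
    frequencies = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        frequencies.append({base: counts.get(base, 0) / n for base in ALPHABET})
    return frequencies


def max_probability_sequence(frequencies):
    """Pick the most probable symbol at each site (ties follow ALPHABET order)."""
    return "".join(max(ALPHABET, key=lambda base: site[base])
                   for site in frequencies)
```

Sites where the top probability is well below 1 are the ones where degenerate or multiple primers would be considered.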
  • for each region on the target gene that can be used for primer design, the GC content, Tm value, primer length, and amplification product length are calculated, together with the GC content of the last 5 bases at the 3' end of the sequence and the number of regions with more than 3 consecutive identical bases.
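Some of the sequence features named above can be computed directly from a candidate sequence. The sketch below covers GC content, GC content of the last 5 bases at the 3' end, and the number of runs of more than 3 identical bases; Tm is omitted because the text does not say which Tm model is used. Function names are illustrative.

```python
import re


def gc_content(sequence):
    """Fraction of G and C bases in the sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def three_prime_gc(sequence, n=5):
    """GC fraction of the last n bases at the 3' end."""
    return gc_content(sequence[-n:])


def long_homopolymer_runs(sequence, min_length=4):
    """Count runs of at least min_length identical bases (4 means more than 3)."""
    pattern = r"(.)\1{%d,}" % (min_length - 1)
    return len(re.findall(pattern, sequence.upper()))
```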
  • 1. the primer amplification product is 200-1500 bases, optimally 800 bp; 2. reduce the risk of lowered amplification efficiency caused by mutations in conserved regions: according to the results of the "site frequency file", use a shingle-like structure to design multiple primers for a single site, improving the amplification efficiency of the primers for different types of microorganisms; 3. increase the specificity of the primers as much as possible: according to the results of the "variable pattern file", select the variable combinations with the highest frequency, design specific primers, and mix them with universal degenerate primers to form a new primer pool that improves the amplification efficiency of the primers; 4.
  • the length of the primer is 18-30 bases;
  • the melting temperature Tm of the primer is 57-64°C;
  • the GC content in the primer is 40-60%;
  • the Gibbs free energy ΔG of the 5 bases at the 3' end of the primer is greater than or equal to -9 kcal/mol;
  • the primer self-complementarity value is less than 8.0, and the 3' end self-complementarity parameter of the primer is less than 3.0;
  • the length of the amplified product of the primer is 300 to 1500 bases.
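The primer criteria listed above can be applied as a simple filter, sketched below. Tm, the 3' end ΔG, and the self-complementarity scores are treated as precomputed inputs (for example from a primer design tool), because the text does not specify how they are calculated; the `PrimerCandidate` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrimerCandidate:
    sequence: str
    tm_celsius: float        # melting temperature, precomputed elsewhere
    end_dg_kcal_mol: float   # Gibbs free energy of the 5 bases at the 3' end
    self_comp: float         # primer self-complementarity value
    end_self_comp: float     # 3' end self-complementarity value
    product_length: int      # length of the amplified product in bases


def failed_criteria(primer):
    """Return the names of the listed criteria that the candidate does not meet."""
    sequence = primer.sequence.upper()
    gc_percent = 100 * (sequence.count("G") + sequence.count("C")) / len(sequence)
    checks = {
        "length": 18 <= len(sequence) <= 30,
        "tm": 57 <= primer.tm_celsius <= 64,
        "gc": 40 <= gc_percent <= 60,
        "end_dg": primer.end_dg_kcal_mol >= -9,
        "self_comp": primer.self_comp < 8.0,
        "end_self_comp": primer.end_self_comp < 3.0,
        "product_length": 300 <= primer.product_length <= 1500,
    }
    return [name for name, passed in checks.items() if not passed]
```

A candidate is kept only when `failed_criteria` returns an empty list; returning the failed names rather than a bare boolean makes it easier to see which threshold rejected a primer.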
  • the designed primers are as follows:
  • the primers in the provided universal fungal primer pool are as follows:
  • Some target genes do not have universal conserved regions in all bacteria or fungi, and cannot be used for the detection of all bacteria or fungi. They are non-universal target genes.
  • primers were designed one by one for the highly sensitive bacteria listed in Table 1 and the highly sensitive fungi listed in Table 2.
  • although these non-universal target genes are not conserved in all bacteria or fungi, they are highly conserved within a specific genus, so primers designed at the genus level can specifically amplify bacteria/fungi of that genus. Adding such primers to the detection scheme therefore improves the sensitivity of identification for bacteria/fungi of that genus.
  • the non-universal genes differ more in sequence between closely related bacteria/fungi, so the bacteria/fungi can be better distinguished through the sequencing results, which improves the resolving power of the method for the identification of bacteria/fungi.
  • both ends of the amplified products of all primer pools need to carry a specific 24 base "tag sequence".
  • Different specimens carry different tag sequences, and different primer pools in the same specimen carry the same tag sequences. Assign each piece of sequencing data to a specific specimen through the tag sequence.
  • the tag sequence can be introduced to both ends of the targeted gene by means of amplification.
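Assigning reads to specimens by tag sequence can be sketched as below. This is a deliberately simplified illustration: it assumes the tag appears exactly at the start of the read and as its reverse complement at the end, whereas real nanopore reads contain errors and would need tolerant matching. The function names and the tag layout are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return sequence.translate(COMPLEMENT)[::-1]


def assign_specimen(read, tags):
    """Return the specimen whose tag flanks the read at both ends, else None.

    `tags` maps a specimen name to its 24-base tag sequence. Assumption:
    exact matching, with the tag at the 5' end and its reverse complement
    at the 3' end of the read.
    """
    for specimen, tag in tags.items():
        if read.startswith(tag) and read.endswith(reverse_complement(tag)):
            return specimen
    return None
```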
  • the primers in the universal bacterial primer pool, universal fungal primer pool, and highly sensitive bacteria/fungal primer pool are used to amplify target genes to ensure the sensitivity of amplification and the high efficiency of tag sequence introduction.
  • the enriched products of universal bacterial target genes, the enriched products of universal fungal target genes, and the enriched products of highly sensitive bacteria/fungal target genes are obtained respectively.
  • the method mentioned in Example 3 was used to design amplification primers suitable for single-molecule sequencing platforms such as nanopore sequencing, to obtain a "multiple virus primer pool".
  • the primers in the provided multiple virus primer pool are shown in the following table:
  • the "new coronavirus primer pool" was designed to cover a 9094 bp gene region of the viral genome, with 100% coverage of the virulence-related S, E, and M genes.
  • Example 9 Multiplex virus target gene amplification method and amplification primer combination method
  • the “multiple virus primer pool” and the “new coronavirus primer pool” choose the same method for amplification.
  • the detailed process includes reverse transcription and cDNA amplification, as follows:
  • Denaturation program incubate at 65°C for 5 minutes, and cool quickly on ice (the PCR program can be set to 4°C).
  • the enriched products obtained by PCR amplification using universal bacterial primer pool, universal fungal primer pool, highly sensitive bacteria/fungal primer pool, virus primer pool, and novel coronavirus primer pool can be fully detected by one sequencing.
  • the enriched products obtained from five primer pool amplifications are selected and mixed according to the following proportions in Table 11 to obtain mixed products.
  • libraries were constructed from the mixed products using Oxford Nanopore Technologies' Ligation Sequencing Kit SQK-LSK109 and were sequenced on Oxford Nanopore Technologies' MinION, GridION, or PromethION sequencers.
  • the nucleic acid was extracted from 45 throat swab specimens with clinically suspected novel coronavirus infection, and the “new coronavirus primer pool” provided in the above example was used to amplify 45 specimens, and the “multiple virus primer pool” was used for 16 specimens.
  • a different tag sequence is added to each sample. The concentration of the amplified product and the mixing amount of the sample are shown in Table 12 below:
  • libraries are constructed from the amplified products using a library construction kit suitable for the nanopore sequencing platform, and the nanopore sequencing platform is then used for sequencing analysis.
  • the sequencing results were compared in a blinded manner against a new coronavirus 2019-nCoV detection kit approved by the cFDA, which served as the reference; the results are shown in Table 13 below:
  • the fluorescent quantitative PCR method has fewer cases where PCR is negative but sequencing is positive. Therefore, the above two indicators show that the nanopore sequencing diagnostic program of this application is not weaker than the current fluorescent quantitative PCR method in detection sensitivity.
  • the top sequence is the standard reference sequence of the new coronavirus
  • the bottom 30 are the corrected "seed sequence”.
  • the corrected seed sequence and the virus reference sequence differ at this location, which indicates that there is a genetic mutation at this location.
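The comparison described above, in which a position where corrected seed sequences differ from the reference is read as a mutation, can be sketched as follows. It assumes the reference and the corrected seeds are already aligned and of equal length; the function names are illustrative, and counting how many seeds support each site is an added convenience rather than something stated in the text.

```python
from collections import Counter


def mutation_sites(reference, seed):
    """List (1-based position, reference base, seed base) where the two differ."""
    if len(reference) != len(seed):
        raise ValueError("reference and seed must be aligned to equal length")
    return [(i + 1, ref_base, seed_base)
            for i, (ref_base, seed_base) in enumerate(zip(reference, seed))
            if ref_base != seed_base]


def mutation_support(reference, seeds):
    """Count how many corrected seeds support each candidate mutation."""
    support = Counter()
    for seed in seeds:
        support.update(mutation_sites(reference, seed))
    return support
```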
  • Patient A sent a blood culture for detection of bloodstream infection on May 6, and the same blood was sent for testing with the primers and methods provided in this disclosure.
  • the blood culture result report was positive on May 12, and after purification and culture, it was identified as C.
  • Blood culture is currently the most commonly used clinical method, and it is also considered the gold standard for the diagnosis of bloodstream infections.
  • the two clinical cases shown above are serious infections and initial infections.
  • the solutions provided by the present disclosure are consistent with the results of blood culture testing, and at the same time, the time required for testing is greatly reduced.
  • the inventors analyzed more than one thousand clinical cases, comparing the results of culture with the results of detection using the primers and methods provided in the present disclosure. The comparison showed that detection using the primers and methods provided in the present disclosure is much more accurate than culture and takes less time; the time advantage is especially pronounced for the detection of fungi.
  • because Mycobacterium tuberculosis grows very slowly, culture and identification take nearly a month. Therefore, clinical diagnosis has often relied on the GeneXpert nucleic acid detection method (the WHO-recommended gold standard), the T-SPOT antigen detection method, or acid-fast staining. These methods require 1-8 hours to identify Mycobacterium tuberculosis. However, because of their technical limitations, they can only detect the single pathogen Mycobacterium tuberculosis in a targeted manner. In clinical use, clinicians therefore often need to pre-judge based on clinical symptoms or screen multiple pathogens one by one, a process that is time-consuming and laborious and is easily affected by subjective factors such as the doctor's experience.

Abstract

The present application provides a library construction method based on a nanopore sequencing platform, a microorganism identification method, and an application. The provided construction method comprises: enriching a target gene from a microorganism to obtain an enriched product; and constructing a library on the basis of the enriched product to obtain a sequencing library. The use of a nanopore sequencing platform for sequencing of a sequencing library can achieve one-step broad-spectrum detection of bacteria, fungi, and viruses in a short time. Moreover, because the target gene nucleic acid is enriched, the testing sample amount and cost are reduced.

Description

Method for constructing a library based on a nanopore sequencing platform, method for identifying microorganisms, and application
Technical field
The present disclosure relates to the field of gene sequencing, and in particular to a method for constructing a library based on a nanopore sequencing platform, a method for identifying microorganisms, and applications thereof.
Background art
Bacteria, fungi, and viruses are the three classes of pathogens that cause clinical infections. Culture, a common method for the clinical detection of bacteria and fungi, is affected by natural differences in the growth conditions of bacteria and fungi, which makes detection slow, often taking 2-7 days; moreover, the growth of some pathogens is affected by other pathogenic bacteria, so the sensitivity of culture-based detection is low. In addition, different types of bacteria or fungi require different culture methods, so the type of infection has to be predicted in advance, which limits the application of culture.
Conventional PCR methods detect the nucleic acids in a specimen, so there is no need to predict the type of bacterial or fungal infection, and the bacteria, fungi, and viruses in the specimen can be detected at one time without distinction. However, conventional PCR is limited by the technical conditions of PCR and can generally diagnose only 1-15 specific pathogens in a specimen at the same time. Specific detection of pathogens using antibodies can likewise target only specific pathogens.
Pathogen diagnosis by gene sequencing uses a gene sequencer to sequence specimens that have been processed by nucleic acid extraction and sequencing library construction, and then compares the obtained gene sequences with a database to determine the genus and species of the fungi and/or bacteria contained in the specimen. Comprehensiveness (simultaneous detection of bacteria and fungi), speed (time from specimen to report), sensitivity (detection of pathogens present at very low levels in clinical specimens of complex composition), and convenience (the site and environment required to carry out testing, and the minimum number of samples required) are extremely important for clinical testing.
Second-generation metagenomic sequencing has a long overall detection cycle, often 1-3 days. It requires a large amount of data, so the detection cost is high. The sequencing and analysis equipment is demanding and occupies a large area; to reduce the cost per sample, samples have to be batched for testing, so testing can only be carried out in specialized testing companies or large central laboratories. Nanopore-based metagenomic sequencing has the advantages of short sequencing time and good sequencing results. However, it can require a large amount of sequencing data, typically 5-10 Gb/sample, and its reagent cost is 5-10 times higher than that of second-generation sequencing, so it is difficult to implement in clinical practice.
The means for identifying microorganisms in samples still need further improvement.
Summary of the invention
The present disclosure solves, at least to a certain extent, at least one of the problems existing in the prior art, and provides a method for constructing a library based on a nanopore sequencing platform, a method for identifying microorganisms, and applications thereof.
The inventors of the present disclosure creatively researched and designed target gene loci for microorganism identification, for example target gene loci for bacteria, for fungi, and for viruses, and at the same time optimized the experimental workflow across the different target genes, so that bacteria, fungi, viruses, etc. in a sample are detected simultaneously in a single sequencing run. Using the target gene loci provided by the present disclosure, targeted enrichment on the nanopore sequencing platform increases the proportion of microbial nucleic acid in the sequencing process, reducing the amount of data required for detection to 10-50 Mb/sample and greatly reducing the cost of detection.
In a first aspect, the present disclosure provides a method for constructing a library based on a nanopore sequencing platform, including: enriching target genes from microorganisms to obtain enriched products, the microorganisms including at least one selected from bacteria, fungi, or viruses; and constructing a library based on the enriched products to obtain the sequencing library. In this method, target genes from microorganisms are enriched and a library is constructed from the enriched products, for example by following the library construction workflow of the nanopore sequencing platform, to obtain a sequencing library. The sequencing library contains only the enriched target gene nucleic acids from microorganisms and can be sequenced and analyzed rapidly on the nanopore sequencing platform: the time from sample to result is less than 12 hours and can be as short as 6 hours, while achieving one-time broad-spectrum detection of bacteria, fungi, and viruses. Moreover, because the target gene nucleic acids are enriched, the amount of data required for detection is reduced to 10-50 Mb/sample, which greatly reduces the cost of detection.
According to embodiments of the present disclosure, the above method for constructing a library based on a nanopore sequencing platform may further include the following technical features:
According to embodiments of the present disclosure, the bacterial target genes include universal bacterial target genes and/or highly sensitive bacterial target genes. The universal bacterial target genes include at least one selected from 16s rRNA, rpob, gyrB, hsp60, ISR, and 23s rRNA, and preferably include the target regions listed in Table 5 together with the 500 bp flanking each region on either side. The highly sensitive bacterial target genes include at least one selected from the target genes of the highly sensitive bacteria shown in Table 1, specifically genes such as 16s rRNA, rpob, gyrB, hsp60, ISR, dnaJ, tuf, atpD, rnpB, sodA, inhA, mip, recA, trkA, femA, gap, katG, mabA, and gacA, and preferably include the target regions listed in Table 7 together with the 500 bp flanking each region on either side. The universal bacterial target gene regions are highly specific and can be used to detect different bacteria. The bacteria listed in Table 1 are clinically important highly sensitive bacteria, and detecting the listed target genes allows these bacteria to be detected.
According to embodiments of the present disclosure, the fungal target genes include universal fungal target genes and/or highly sensitive fungal target genes. The universal fungal target genes include at least one selected from ITS1-4, LSU(D1/2), 18s rRNA, and RPB2, and preferably include the target regions listed in Table 6 together with the 500 bp flanking each region on either side; the flanking region may be, for example, 100 to 450 bp, 100 to 400 bp, 100 to 350 bp, 100 to 300 bp, 100 to 250 bp, 100 to 200 bp, or 100 to 150 bp on either side, or within 100 bp, or 50 to 100 bp on either side. The highly sensitive fungal target genes include at least one selected from the target genes of the highly sensitive fungi shown in Table 2, specifically genes such as RPB1, RPB2, TEF1, BenA, CaM, ND6, MCM7, CAL, TUB2, and ACT, and preferably include the target regions listed in Table 8 together with the 500 bp flanking each region on either side; the flanking region may be, for example, 100 to 450 bp, 100 to 400 bp, 100 to 350 bp, 100 to 300 bp, 100 to 250 bp, 100 to 200 bp, or 100 to 150 bp on either side, or within 100 bp, or 50 to 100 bp on either side. These target gene regions are highly specific and can be used to detect different bacteria.
According to embodiments of the present disclosure, the viral target genes include multiplex viral target genes and/or novel coronavirus target genes. The multiplex viral target genes include at least one selected from the viral target genes shown in Table 3, and preferably include the target regions listed in Table 9 together with the 200 bp flanking each region on either side; the flanking region may be, for example, 100 to 200 bp or 100 to 150 bp on either side, or within 100 bp on either side. The novel coronavirus target genes include at least one selected from the target genes shown in Table 4, and preferably include the target regions listed in Table 10 together with the 200 bp flanking each region on either side; the flanking region may be, for example, 100 to 200 bp or 100 to 150 bp on either side, or within 100 bp, or 50 to 100 bp on either side.
According to embodiments of the present disclosure, the bacterial target genes include universal bacterial target genes and highly sensitive bacterial target genes, the fungal target genes include universal fungal target genes and highly sensitive fungal target genes, and the viral target genes include multiplex viral target genes and novel coronavirus target genes. The method further comprises: mixing the enriched product of the universal bacterial target genes, the enriched product of the universal fungal target genes, the enriched product of the highly sensitive bacterial target genes, the enriched product of the highly sensitive fungal target genes, the enriched product of the multiplex viral target genes, and the enriched product of the novel coronavirus target genes in a mass ratio of 20~60:5~15:10~25:10~25:10~25, and constructing a library from the mixed product to obtain the sequencing library. According to embodiments of the present disclosure, these enriched products may also be mixed in a mass ratio of 30~50:6~10:12~20:12~20:12~20.
According to embodiments of the present disclosure, the method for constructing a library based on a nanopore sequencing platform further comprises: amplifying the target genes from the microorganisms by PCR using primers from a primer pool, thereby enriching the target genes and obtaining the enriched product, wherein the primer pool contains at least one primer.
According to embodiments of the present disclosure, every primer in the primer pool satisfies the following conditions: a. the primer is 18 to 30 bases long; b. the melting temperature Tm of the primer is 57 to 64°C; c. the GC content of the primer is 40 to 60%; d. the Gibbs free energy ΔG of the 5 bases at the 3' end of the primer is greater than or equal to -9 kcal/mol; e. the primer self-complementarity value is less than 8.0, and the primer 3' end self-complementarity parameter is less than 3.0; f. there is no degenerate base within the last 3 consecutive bases at the 3' end of the primer; g. the amplification product of the primer is 200 to 1500 bases long. Here, the primer self-complementarity value characterizes the tendency of the bases within a single primer to form complementary structures with one another, and the primer 3' end self-complementarity parameter measures the tendency of the bases at the 3' ends of different primers to form complementary structures. Both tendencies are evaluated numerically with the following scoring logic: each complementary base pair scores 1 point, a pairing involving an N base scores -0.25 points, a non-complementary pair scores -1 point, and a gap scores -2 points. For a more detailed description of the calculation, see Shen, Z., et al., MPprimer: a program for reliable multiplex PCR primer design. BMC Bioinformatics, 2010. 11: p. 143.
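As an illustration of the screening conditions and the scoring logic above, the following is a minimal Python sketch, not the implementation used in the disclosure. It checks only conditions a, c, and f (length, GC content, and no degenerate base in the last three 3' bases); Tm and ΔG require a thermodynamic model that is not specified here and are left out. The complementarity score is a simplified, ungapped version of the scoring rule (+1 for a complementary pair, -0.25 for a pairing involving N, -1 otherwise), so the -2 gap penalty never applies; the function names and the choice to treat other degenerate codes as mismatches are assumptions made for the example.

```python
# Hypothetical helper names; a simplified sketch of the primer screening rules.
STANDARD_BASES = set("ACGT")
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def passes_basic_checks(primer: str) -> bool:
    """Conditions a, c and f only (length, GC content, clean 3' end)."""
    primer = primer.upper()
    length_ok = 18 <= len(primer) <= 30
    gc_ok = 0.40 <= gc_fraction(primer) <= 0.60
    three_prime_ok = all(base in STANDARD_BASES for base in primer[-3:])
    return length_ok and gc_ok and three_prime_ok


def pair_score(a: str, b: str) -> float:
    """+1 complementary, -0.25 if either base is N, -1 otherwise."""
    if a == "N" or b == "N":
        return -0.25
    if COMPLEMENT.get(a) == b:
        return 1.0
    return -1.0


def ungapped_complementarity(p: str, q: str) -> float:
    """Best score over all ungapped antiparallel overlaps of p and q.

    Assumption: gaps are not modelled, so this is only an approximation
    of a gapped complementarity score. Returns 0.0 if no overlap scores
    above zero.
    """
    p, q_rev = p.upper(), q.upper()[::-1]
    best = 0.0
    for offset in range(-(len(q_rev) - 1), len(p)):
        score = 0.0
        for i, base in enumerate(p):
            j = i - offset
            if 0 <= j < len(q_rev):
                score += pair_score(base, q_rev[j])
        best = max(best, score)
    return best
```

Passing the same primer as both arguments gives a rough self-complementarity estimate; passing two different primers gives a rough cross-primer estimate.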
According to embodiments of the present disclosure, the primer pool includes at least one selected from the following primer pools: a universal bacterial primer pool, which includes the primers listed in Table 5; a universal fungal primer pool, which includes the primers listed in Table 6; a highly sensitive bacterial primer pool, which includes the primers listed in Table 7; a highly sensitive fungal primer pool, which includes the primers listed in Table 8; a multiplex viral primer pool, which includes the primers listed in Table 9; and a novel coronavirus primer pool, which includes the primers listed in Table 10.
In a second aspect, the present disclosure provides a sequencing method, comprising: obtaining a sequencing library by the method according to any embodiment of the first aspect of the present disclosure; and sequencing the sequencing library on a nanopore sequencing platform. Compared with second-generation sequencing technology, sequencing on a nanopore platform allows on-site testing outside specialized laboratories, and detection and data analysis can be carried out with a laptop and a cloud-based analysis pipeline.
In a third aspect, the present disclosure provides a method for identifying microorganisms, comprising: obtaining a sequencing library from the nucleic acid of a sample to be tested by the method according to any embodiment of the first aspect of the present disclosure; sequencing the sequencing library on a nanopore sequencing platform to obtain sequencing results; and aligning the sequencing results against a reference database and determining the microorganisms in the sample to be tested from the alignment results. The method can be used to identify pathogens, and identifying the pathogen can help guide clinical medication. The method can identify the species of bacteria, fungi, or viruses in the sample to be tested; it can be used to support clinical medication, and it can also be used for other, non-diagnostic purposes, such as aggregating large data sets or building commercial kits or commercial platforms.
According to embodiments of the present disclosure, the above method for identifying microorganisms may further include the following technical features:
According to embodiments of the present disclosure, the reference database includes at least one of the following: a universal bacterial database, which contains 16s rRNA, rpob, gyrB, hsp60, 23s rRNA, and ISR gene data; a universal fungal database, which contains ITS1-4, LSU(D1/2), 18s rRNA, and RPB2 gene data; a highly sensitive bacterial database, which contains the highly sensitive bacterial target gene data shown in Table 1; a highly sensitive fungal database, which contains the highly sensitive fungal target gene data shown in Table 2; a multiplex viral database, which contains the viral target gene data shown in Table 3; and a novel coronavirus database, which contains SARS-CoV-2 genome data.
According to embodiments of the present disclosure, the above method for identifying microorganisms further comprises: aligning the sequencing results against the universal bacterial database and the universal fungal database to obtain first alignment data and first unaligned data; aligning the first unaligned data against the highly sensitive bacterial database and the highly sensitive fungal database to obtain second alignment data and second unaligned data; aligning the second unaligned data against the multiplex viral database and the novel coronavirus database to obtain third alignment data; and determining the universal bacteria and universal fungi contained in the sample from the first alignment data, the highly sensitive bacteria and highly sensitive fungi contained in the sample from the second alignment data, and the viruses contained in the sample from the third alignment data.
According to embodiments of the present disclosure, determining the universal bacteria and universal fungi contained in the sample from the first alignment data further comprises: dividing the first alignment data into first unique alignment data and at least one group of first cross-alignment data, each group of first cross-alignment data containing multiple aligned sequences; taking a subset of the aligned sequences in each group of first cross-alignment data as seed sequences and correcting the seed sequences with the remaining aligned sequences in the group to obtain corrected seed sequences; and merging the best alignment data of the corrected seed sequences against the universal bacterial database and the universal fungal database with the first unique alignment data to determine the universal bacteria and universal fungi contained in the sample.
According to embodiments of the present disclosure, determining the highly sensitive bacteria and highly sensitive fungi contained in the sample from the second alignment data further comprises: dividing the second alignment data into second unique alignment data and at least one group of second cross-alignment data, each group of second cross-alignment data containing multiple aligned sequences; taking a subset of the aligned sequences in each group of second cross-alignment data as seed sequences and correcting the seed sequences with the remaining aligned sequences in the group to obtain corrected seed sequences; and merging the best alignment results of the corrected seed sequences against the highly sensitive bacterial database and the highly sensitive fungal database with the second unique alignment data to determine the highly sensitive bacteria and highly sensitive fungi contained in the sample.
According to embodiments of the present disclosure, the universal bacteria and highly sensitive bacteria determined in the sample are combined to determine all bacteria contained in the sample, and the universal fungi and highly sensitive fungi determined in the sample are combined to determine all fungi contained in the sample. A bacterial list or a fungal list is generated from all the bacteria or all the fungi so determined. Taking the bacterial list as an example, if the universal bacteria and the highly sensitive bacteria in the list overlap, the overlapping results are compared for agreement at the genus and species levels. If the two agree at the species level, a single result is reported; if the two agree at the genus level but disagree at the species level, the highly sensitive bacterial result is reported as the single result; if the two are assigned to different genera, both results are output. According to embodiments of the present disclosure, determining the viruses contained in the sample from the third alignment data further comprises:
grouping together the aligned sequences in the third alignment data that align to the same gene region, taking a subset of the aligned sequences in each group as seed sequences, and correcting the seed sequences with the remaining aligned sequences in the group to obtain corrected seed sequences;
determining the viruses contained in the sample from the best alignment results of the corrected seed sequences against the multiplex viral database and the novel coronavirus database;
optionally, further comprising: determining the viral mutation sites present in the sample to be tested from identical base differences, relative to the multiplex viral database and the novel coronavirus database, that are shared by at least 80% of the corrected seed sequences.
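The optional mutation-site step above can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions: the corrected seed sequences are taken to be already aligned to the reference without gaps, so that position i in every seed corresponds to position i in the reference, and the function name `shared_differences` is hypothetical. It reports a position only when at least 80% of the seeds carry the same base and that base differs from the reference.

```python
from collections import Counter
from typing import Dict, List


def shared_differences(reference: str, seeds: List[str],
                       min_fraction: float = 0.8) -> Dict[int, str]:
    """Map 0-based position -> variant base shared by >= min_fraction of seeds.

    Assumption: every seed is gap-free and in the same coordinates as
    `reference`; positions beyond the end of a seed are ignored for that seed.
    """
    if not seeds:
        return {}
    sites: Dict[int, str] = {}
    for pos, ref_base in enumerate(reference):
        counts = Counter(seed[pos] for seed in seeds if pos < len(seed))
        for base, count in counts.items():
            if base != ref_base and count / len(seeds) >= min_fraction:
                sites[pos] = base
    return sites
```

With five seeds, for example, a substitution seen in four of them meets the 80% threshold, while one seen in three does not.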
In a fourth aspect, the present disclosure provides an apparatus for identifying microorganisms, comprising: a data processing unit that aligns the sequencing results of the nucleic acid of a sample to be tested against a reference database in order to determine the microorganisms in the sample to be tested. According to embodiments of the present disclosure, the apparatus for identifying microorganisms further comprises: a library construction unit that obtains a sequencing library from the nucleic acid of the sample to be tested by the method according to any embodiment of the third aspect of the present disclosure; and a sequencing unit that sequences the sequencing library on a nanopore sequencing platform to obtain the sequencing results.
According to embodiments of the present disclosure, the apparatus for identifying microorganisms may further include the following technical features:
According to embodiments of the present disclosure, the reference database includes at least one of the following: a universal bacterial database, which contains 16s rRNA, rpob, gyrB, hsp60, 23s rRNA, and ISR gene data; a universal fungal database, which contains ITS1-4, LSU(D1/2), 18s rRNA, and RPB2 gene data; a highly sensitive bacterial database, which contains the bacterial target gene data shown in Table 1; a highly sensitive fungal database, which contains the fungal target gene data shown in Table 2; a multiplex viral database, which contains the viral target gene data shown in Table 3; and a novel coronavirus database, which contains SARS-CoV-2 genome data.
According to embodiments of the present disclosure, the data processing unit is further configured to: align the sequencing results against the universal bacterial database and the universal fungal database to obtain first alignment data and first unaligned data; align the first unaligned data against the highly sensitive bacterial database and the highly sensitive fungal database to obtain second alignment data and second unaligned data; align the second unaligned data against the multiplex viral database and the novel coronavirus database to obtain third alignment data; and determine the universal bacteria and universal fungi contained in the sample from the first alignment data, the highly sensitive bacteria and highly sensitive fungi contained in the sample from the second alignment data, and the viruses contained in the sample from the third alignment data. The first alignment data are the data that align to the universal bacterial database or the universal fungal database, and the first unaligned data are the data that align to neither. The first unaligned data are then aligned against the highly sensitive bacterial database and the highly sensitive fungal database; the data that align become the second alignment data, and the data that do not become the second unaligned data. The second unaligned data are in turn aligned against the multiplex viral database and the novel coronavirus database, and the data that align become the third alignment data.
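The three-tier cascade described above can be sketched in Python as follows. This is only a sketch of the control flow: the actual aligner is not specified in this passage, so it is represented by a caller-supplied predicate `aligns(read, database_name)`, and the database names and function names are placeholders invented for the example.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumption: the aligner is abstracted as a predicate returning True when
# `read` aligns to the database identified by `database_name`.
Aligner = Callable[[str, str], bool]


def split_by_alignment(reads: Iterable[str], databases: List[str],
                       aligns: Aligner) -> Tuple[List[str], List[str]]:
    """Split reads into those aligning to any database and those aligning to none."""
    aligned, unaligned = [], []
    for read in reads:
        if any(aligns(read, db) for db in databases):
            aligned.append(read)
        else:
            unaligned.append(read)
    return aligned, unaligned


def tiered_alignment(reads: Iterable[str], aligns: Aligner) -> Dict[str, List[str]]:
    """Run the three tiers; each tier only sees reads unaligned in the previous one."""
    first, first_unaligned = split_by_alignment(
        reads, ["universal_bacteria", "universal_fungi"], aligns)
    second, second_unaligned = split_by_alignment(
        first_unaligned, ["sensitive_bacteria", "sensitive_fungi"], aligns)
    third, _ = split_by_alignment(
        second_unaligned, ["multiplex_virus", "sars_cov_2"], aligns)
    return {"first": first, "second": second, "third": third}
```

Because each tier consumes only the unaligned remainder of the previous tier, a read that matches both a universal database and a highly sensitive database is assigned to the first tier only.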
According to embodiments of the present disclosure, determining the universal bacteria and universal fungi contained in the sample from the first alignment data further comprises: dividing the first alignment data into first unique alignment data and at least one group of first cross-alignment data, each group of first cross-alignment data containing multiple sequences; taking a subset of the sequences in each group of first cross-alignment data as seed sequences and correcting the seed sequences with the remaining sequences in the group to obtain corrected seed sequences; and merging the best alignment results of the corrected seed sequences against the universal bacterial database and the universal fungal database with the first unique alignment data to determine the universal bacteria and universal fungi contained in the sample. According to embodiments of the present disclosure, a random 20% to 40% (for example, 30%) of the sequences in each group of first cross-alignment data may be used as seed sequences.
According to embodiments of the present disclosure, determining the highly sensitive bacteria and highly sensitive fungi contained in the sample from the second alignment data further comprises:
dividing the second alignment data into second unique alignment data and at least one group of second cross-alignment data, each group of second cross-alignment data containing multiple sequences;
taking a subset of the sequences in each group of second cross-alignment data as seed sequences and correcting the seed sequences with the remaining sequences in the group to obtain corrected seed sequences;
merging the best alignment results of the corrected seed sequences against the highly sensitive bacterial database and the highly sensitive fungal database with the second unique alignment data to determine the highly sensitive bacteria and highly sensitive fungi contained in the sample. According to embodiments of the present disclosure, a random 20% to 40% (for example, 30%) of the sequences in each group of second cross-alignment data may be used as seed sequences.
According to embodiments of the present disclosure, determining the viruses contained in the sample from the third alignment data further comprises: grouping together the sequences in the third alignment data that align to the same gene region, taking a subset of the sequences in each group as seed sequences, and correcting the seed sequences with the remaining sequences in the group to obtain corrected seed sequences; and determining the viruses contained in the sample from the best alignment results of the corrected seed sequences against the multiplex viral database and the novel coronavirus database. According to embodiments of the present disclosure, a random 20% to 40% (for example, 30%) of the sequences in each group may be used as seed sequences.
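The random seed selection used in each of the three determinations above can be sketched as follows. This Python sketch covers only the split of a group into seed sequences and the remaining sequences; how the remaining sequences are used to correct the seeds is not detailed in this passage and is deliberately not modelled. The function name `split_seeds`, the rounding rule, and the choice to keep at least one seed are assumptions made for the example.

```python
import random
from typing import List, Optional, Tuple


def split_seeds(sequences: List[str], fraction: float = 0.3,
                rng: Optional[random.Random] = None) -> Tuple[List[str], List[str]]:
    """Randomly pick `fraction` of a group as seeds; return (seeds, remainder).

    Assumptions: the seed count is rounded to the nearest integer and is at
    least 1 for a non-empty group; original order is preserved in both lists.
    """
    if not 0.2 <= fraction <= 0.4:
        raise ValueError("fraction is expected to lie between 0.2 and 0.4")
    if not sequences:
        return [], []
    rng = rng or random.Random()
    n_seeds = min(len(sequences), max(1, round(len(sequences) * fraction)))
    chosen = set(rng.sample(range(len(sequences)), n_seeds))
    seeds = [s for i, s in enumerate(sequences) if i in chosen]
    rest = [s for i, s in enumerate(sequences) if i not in chosen]
    return seeds, rest
```

Passing a seeded `random.Random` instance makes the selection reproducible, which is convenient when comparing runs on the same group.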
According to embodiments of the present disclosure, the data processing further comprises: determining the mutation sites of the virus in the sample to be tested from identical base differences, relative to the multiplex viral database and the novel coronavirus database, that are shared by at least 80% of the corrected seed sequences.
In a fifth aspect, the present disclosure provides a kit comprising a primer pool, the primer pool including at least one selected from the following: a universal bacterial primer pool, which includes the primers listed in Table 5; a universal fungal primer pool, which includes the primers listed in Table 6; a highly sensitive bacterial primer pool, which includes the primers listed in Table 7; a highly sensitive fungal primer pool, which includes the primers listed in Table 8; a multiplex viral primer pool, which includes the primers listed in Table 9; and a novel coronavirus primer pool, which includes the primers listed in Table 10.
The beneficial effects of the present disclosure are as follows. With the sequencing method provided here, microbial information can be obtained from a sample quickly; for example, a test result for viruses in a sample can be issued on the same day. Detection sensitivity is high: low-abundance viruses, bacteria, and fungi can be detected, which supports early diagnosis and flags the risk of early infection. The detection range is broad: for example, the method can be applied to the detection of the novel coronavirus SARS-CoV-2 and can also detect co-infections caused by other viruses, helping to settle on a diagnosis and treatment plan quickly. It can further serve as an assay that detects bacteria, fungi, atypical pathogens, and viruses in a single run. Identifying microorganisms with this method allows bacteria, fungi, viruses, and so on to be detected at the same time, giving more complete information; for example, virus detection and monitoring of viral mutations can be carried out together, providing rapid, real-time epidemiological information that can be interpreted and acted on for outbreak surveillance. In addition, because the method sequences and identifies microorganisms on a nanopore sequencing platform, it is simple and portable and suitable for laboratories such as those of hospitals and CDCs; the sequencing data can be analyzed in the cloud, so the method can be set up quickly to monitor an outbreak even in resource-limited settings.
Description of the drawings
Fig. 1 is a schematic structural diagram of an apparatus for identifying microorganisms according to an embodiment of the present disclosure.
Fig. 2 shows the results of mutation analysis of the viral genome carried by the COVID-19 patient numbered C1, according to an embodiment of the present disclosure.
Detailed description
Embodiments of the present disclosure are described in detail below. The described embodiments are exemplary and are intended to explain the present disclosure; they should not be construed as limiting it. Some terms used herein are also explained for the convenience of those skilled in the art; these explanations should not be regarded as limiting the scope of protection of the present disclosure.
The terms "first", "second", "third", and so on are used herein only for convenience of description; they should not be understood as indicating or implying relative importance, nor as indicating or implying an order. In the description of the present disclosure, "a plurality" means at least two, for example two or three, unless otherwise specifically defined. Herein, "universal bacteria" refers to common bacteria that can usually be distinguished from one another by a combination of certain target genes. After screening, these target genes were determined to be at least one of 16S rRNA, rpoB, gyrB, hsp60, ISR, and 23S rRNA. Correspondingly, the universal bacterial database is a reference database that can be used to identify these universal bacteria; according to embodiments of the present disclosure, it contains data for these universal bacterial target genes. Likewise, the universal bacterial primer pool refers to primers that can be used to amplify or identify these universal bacteria.
Herein, "highly sensitive bacteria" refers to pathogenic bacteria that are common in clinical practice. The target genes that characterize or discriminate these bacteria usually differ from one organism to another, yet the bacteria frequently cause clinical disease. Their target genes were therefore studied specifically to find genes that characterize each of them and can be used for identification. Universal bacteria and highly sensitive bacteria are not strictly separate categories and may overlap: a given bacterium may belong to both. When the universal and highly sensitive identifications agree at the genus level but disagree at the species level, the highly sensitive identification is taken as the result. Correspondingly, the highly sensitive bacterial database is a reference database that can be used to identify these highly sensitive bacteria; according to embodiments of the present disclosure, it contains data for the target genes of these bacteria. Likewise, the highly sensitive bacterial primer pool refers to primers that can be used to amplify or identify these highly sensitive bacteria.
Herein, "universal fungi" refers to common fungi that can usually be distinguished from one another by a combination of certain target genes. After screening, these target genes were determined to be at least one of ITS1-4, LSU (D1/2), 18S rRNA, and RPB2. Correspondingly, the universal fungal database is a reference database that can be used to identify these universal fungi; according to embodiments of the present disclosure, it contains data for these universal fungal target genes. Likewise, the universal fungal primer pool refers to primers that can be used to amplify or identify these universal fungi.
Herein, "highly sensitive fungi" refers to pathogenic fungi that are common in clinical practice. The target genes that characterize or discriminate these fungi usually differ from one organism to another, yet the fungi frequently cause clinical disease. Their target genes were therefore studied specifically to find genes that characterize each of them and can be used for identification. Universal fungi and highly sensitive fungi are not strictly separate categories and may overlap: a given fungus may belong to both. When the universal and highly sensitive identifications agree at the genus level but disagree at the species level, the highly sensitive identification is taken as the result. Correspondingly, the highly sensitive fungal database is a reference database that can be used to identify these highly sensitive fungi; according to embodiments of the present disclosure, it contains data for the target genes of these fungi. Likewise, the highly sensitive fungal primer pool refers to primers that can be used to amplify or identify these highly sensitive fungi.
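The species-level precedence rule stated above for both bacteria and fungi can be sketched as follows. This is a minimal illustration, not the disclosed software: the function name `reconcile_calls` and the (genus, species) tuple representation are hypothetical, and reporting both calls when the genera differ is an assumption, since the text only specifies the case of matching genus and differing species.

```python
def reconcile_calls(universal_call, sensitive_call):
    """Combine a universal call and a highly sensitive call.

    Each call is a (genus, species) tuple. Returns the list of calls to report.
    """
    if universal_call == sensitive_call:
        # Both pools agree completely: report the single shared call.
        return [universal_call]
    same_genus = universal_call[0] == sensitive_call[0]
    if same_genus:
        # Rule from the text: same genus, different species ->
        # the highly sensitive identification is taken as the result.
        return [sensitive_call]
    # Assumption (not specified in the text): different genera are
    # treated as two independent findings and both are reported.
    return [universal_call, sensitive_call]


example = reconcile_calls(("Streptococcus", "mitis"),
                          ("Streptococcus", "pneumoniae"))
print(example)
```

Keeping the rule in one small function makes the precedence explicit and easy to audit when the two databases disagree.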
Herein, "multiplex viruses" refers to a set containing one or more viruses. Correspondingly, "multiplex virus target genes" refers to target genes present in these viruses. Herein, the novel coronavirus refers to SARS-CoV-2. The multiplex virus database is a reference database that can be used to identify these viruses; according to embodiments of the present disclosure, it contains data for the target genes of these viruses. Likewise, the multiplex virus primer pool refers to primers that can be used to amplify or identify these viruses.
The present disclosure provides a sequencing method comprising: enriching target genes from microorganisms to obtain enriched products, the microorganisms including at least one selected from bacteria, fungi, and viruses; constructing a library from the enriched products to obtain a sequencing library; and sequencing the library on a nanopore sequencing platform. The enriched products are processed according to the library construction workflow of the nanopore sequencing platform, so the resulting sequencing library contains only enriched target gene nucleic acid from microorganisms. Sequencing this library on a nanopore platform allows rapid analysis: the time from sample to result is under 12 hours and can be as short as 6 hours, while bacteria, fungi, and viruses are covered in a single broad-spectrum test. Because the target gene nucleic acid is enriched, the amount of data required is reduced to 10-50 Mb per sample, which greatly lowers the cost of testing. Compared with second-generation sequencing, nanopore sequencing allows on-site testing outside specialized laboratories, with detection and data analysis carried out on a laptop computer and a cloud-based workflow.
According to embodiments of the present disclosure, these target genes may be at least one selected from 16S rRNA, rpoB, gyrB, hsp60, 23S rRNA, and ISR. These target genes serve as detection regions for bacteria: the bacterial species is determined by aligning the sequencing data for these regions against a reference database. According to embodiments of the present disclosure, the target genes may also be at least one of the target genes of the highly sensitive bacteria listed in Table 1. These are clinically important highly sensitive bacteria, and detecting the corresponding target genes determines the species of the bacterium.
Table 1 Highly sensitive bacteria and their target genes
Figure PCTCN2021071423-appb-000001
Figure PCTCN2021071423-appb-000002
Figure PCTCN2021071423-appb-000003
According to embodiments of the present disclosure, these target genes may be at least one selected from ITS1-4, LSU (D1/2), 18S rRNA, and RPB2. These target genes serve as detection regions for fungi: the fungal species is determined by aligning the sequencing data for these regions against a reference database. According to embodiments of the present disclosure, the target genes may also be the target genes of the fungi listed in Table 2. These are clinically important highly sensitive fungi, and detecting their target genes determines the species of the fungus.
Table 2 Highly sensitive fungi and their target genes
Figure PCTCN2021071423-appb-000004
According to embodiments of the present disclosure, the viral target genes include multiplex virus target genes and/or SARS-CoV-2 target genes. The multiplex virus target genes include at least one selected from the viral target genes shown in Table 3; the SARS-CoV-2 target genes include at least one selected from the target genes shown in Table 4.
Table 3 Multiplex viruses and their target genes
Virus name | Target genes
Bocavirus | NP1, VP1-2
Rhinovirus | VP4/VP2, 5'UTR
Human metapneumovirus | N gene, F gene, glycoprotein G
Respiratory syncytial virus | N gene, P gene
Coronavirus | 1a, 1b
Adenovirus | hexon gene
Parainfluenza virus | H gene
Influenza A virus | M gene, H gene
Influenza B virus | M gene, HA, NA, NS
Influenza C virus | M gene
Enterovirus | VP1
Herpes virus | glycoprotein G
Rubella virus | E1
Table 4 SARS-CoV-2 target genes
Figure PCTCN2021071423-appb-000005
According to embodiments of the present disclosure, the method for constructing a library for a nanopore sequencing platform further comprises: PCR-amplifying the target genes from the microorganisms using primers from a primer pool, thereby enriching the target genes and obtaining the enriched products, the primer pool containing at least one primer.
According to embodiments of the present disclosure, the primer pool includes at least one selected from the following: a universal bacterial primer pool, which includes the primers listed in Table 5; a universal fungal primer pool, which includes the primers listed in Table 6; a highly sensitive bacterial primer pool, which includes the primers listed in Table 7; a highly sensitive fungal primer pool, which includes the primers listed in Table 8; a multiplex virus primer pool, which includes the primers listed in Table 9; and a SARS-CoV-2 primer pool, which includes the primers listed in Table 10.
According to embodiments of the present disclosure, the enriched products of the universal bacterial target genes, the universal fungal target genes, the highly sensitive bacterial and fungal target genes, the multiplex virus target genes, and the SARS-CoV-2 target genes are mixed at a mass ratio of 20-60 : 5-15 : 10-25 : 10-25 : 10-25, and a library is constructed from the mixture to obtain the sequencing library. According to a preferred embodiment of the present disclosure, these enriched products are mixed at a ratio of 30-50 : 6-10 : 12-20 : 12-20 : 12-20.
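As a worked illustration of the mass-ratio mixing described above, the sketch below converts one ratio inside the preferred range (40:8:16:16:16) into the mass of each enriched product to pool. The total input of 960 ng, the dictionary keys, and the function name `pool_masses` are assumptions made only for this example; the disclosure does not specify a total input mass.

```python
def pool_masses(total_ng, ratio):
    """Split a total mass (ng) across products according to a parts ratio."""
    total_parts = sum(ratio.values())
    return {name: total_ng * parts / total_parts for name, parts in ratio.items()}


# Hypothetical ratio chosen inside the preferred range 30-50:6-10:12-20:12-20:12-20.
ratio = {
    "universal_bacteria": 40,
    "universal_fungi": 8,
    "sensitive_bacteria_fungi": 16,
    "multiplex_virus": 16,
    "sars_cov_2": 16,
}

# Hypothetical total input mass; 96 parts in total, so each part is 10 ng.
masses = pool_masses(960.0, ratio)
for name, ng in masses.items():
    print(f"{name}: {ng:.1f} ng")
```

With 96 parts in total and 960 ng of input, each part corresponds to 10 ng, so the universal bacterial product contributes 400 ng and each of the last three products 160 ng.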
The present disclosure also provides an apparatus for identifying microorganisms which, as shown in Fig. 1, comprises: a library construction unit, which obtains a sequencing library from the nucleic acid of the sample to be tested according to the method described above; a sequencing unit, which sequences the library on a nanopore sequencing platform to obtain a sequencing result; and a data processing unit, which aligns the sequencing result for the sample nucleic acid against a reference database to determine the microorganisms in the sample. According to embodiments of the present disclosure, the sequencing unit may be connected to the library construction unit, and the data processing unit may be connected to the sequencing unit. In the present disclosure, unless otherwise expressly specified and defined, the term "connected" is to be understood broadly: the connection may be fixed, detachable, or integral; mechanical, electrical, or communicative; direct, or indirect through an intermediary; and it may be an internal communication between two elements or an interaction between two elements. Those of ordinary skill in the art can understand the specific meaning of the term in the present disclosure according to the circumstances.
The solutions of the present disclosure are explained below with reference to examples. Those skilled in the art will understand that the following examples only illustrate the present disclosure and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in an example, the techniques or conditions described in the literature of the field, or the product instructions, are followed. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products.
Example 1 Construction of the target gene design database
Universal bacterial target gene data were downloaded from existing databases such as NCBI. Incomplete gene sequences, misnamed sequences that do not belong to the intended target gene, and similar entries were filtered out, yielding high-quality bacterial 16S rRNA (ribosomal 16S rRNA subunit) and fungal ITS (internal transcribed spacer) data.
In addition, 44,493 high-quality complete bacterial and fungal genomes were downloaded from the EnsemblBacteria database. By analyzing the annotation of all genomes, and following a method described in the literature, the following databases were extracted from the genome data. For bacterial identification: rpoB (bacterial RNA polymerase β subunit), gyrB (gyrase B subunit), hsp60 (heat shock protein 60), ISR (ribosomal 16S rRNA/23S rRNA intergenic spacer region), atpD, dnaJ, tuf, sodA, rnpB, inhA, IS900, katG, RecA, fumC, icd, mdh, and purA. For fungal identification: LSU, ITS1-4, LSU (D1/2), 18S rRNA, RPB1/2, TEF1, BenA (β-tubulin), CaM (calmodulin), cyp51A, ND6, MCM7, CAL, TUB2, ACT, and others.
Finally, the resulting databases were filtered and optimized by gene length, gene name, and gene completeness to obtain a high-quality database for each target gene.
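The final filtering step (by gene name, completeness, and length) can be sketched as below. This is only an illustration under stated assumptions: the record fields `gene`, `complete`, and `seq`, the function name `filter_records`, and the length bounds are hypothetical, since the disclosure does not give the actual thresholds or file formats.

```python
def filter_records(records, gene, min_len, max_len):
    """Keep records that match the gene name, are complete, and fall in the length range."""
    kept = []
    for rec in records:
        if rec["gene"].lower() != gene.lower():
            continue  # misnamed or unrelated sequence
        if not rec["complete"]:
            continue  # partial gene sequence
        if not (min_len <= len(rec["seq"]) <= max_len):
            continue  # implausible length for this target gene
        kept.append(rec)
    return kept


# Toy records; sequences are placeholders, not real gene sequences.
records = [
    {"id": "r1", "gene": "gyrB", "complete": True, "seq": "A" * 2400},
    {"id": "r2", "gene": "gyrB", "complete": False, "seq": "A" * 2400},
    {"id": "r3", "gene": "rpoB", "complete": True, "seq": "A" * 4000},
    {"id": "r4", "gene": "gyrB", "complete": True, "seq": "A" * 300},
]

# Hypothetical length window for gyrB.
clean = filter_records(records, gene="gyrB", min_len=2000, max_len=2800)
print([rec["id"] for rec in clean])
```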
Example 2 Analysis scheme for conserved and specific regions of target genes
Using the databases constructed in Example 1 and an in-house integrated analysis pipeline, the data for each target were analyzed for conservation and specificity.
First, the sequences were aligned with multiple sequence alignment software such as ClustalW/ClustalX, and the conservation of each alignment position was calculated with an in-house statistics pipeline. Each target was analyzed at three levels:
Level one: all data for a single target are analyzed, giving "alignment analysis file 1".
Level two: using the source-species information of each record, the database is flattened by genus and species name so that no more than 5 sequences are retained per species, giving a flattened target gene database. All data in this database are analyzed, giving "alignment analysis file 2".
Level three: based on the important pathogens in Table 1 and Table 2, each target gene database is split into independent databases of clinically important bacteria (60 genera in total) and clinically important fungi (15 genera in total). Each independent database is analyzed separately, giving "alignment analysis file 3".
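The level two flattening step, which caps each species at 5 sequences so that heavily sequenced organisms do not dominate the conservation statistics, might look like the sketch below. The record fields and the function name `cap_per_species` are hypothetical, and keeping the first 5 records encountered is an assumption; the disclosure does not state how the retained sequences are chosen.

```python
from collections import defaultdict


def cap_per_species(records, max_per_species=5):
    """Retain at most `max_per_species` records for each (genus, species) pair."""
    counts = defaultdict(int)
    kept = []
    for rec in records:
        key = (rec["genus"], rec["species"])
        if counts[key] < max_per_species:
            counts[key] += 1
            kept.append(rec)
    return kept


# Toy database: 7 records of one species and 2 of another.
database = (
    [{"genus": "Escherichia", "species": "coli", "id": f"ec{i}"} for i in range(7)]
    + [{"genus": "Klebsiella", "species": "pneumoniae", "id": f"kp{i}"} for i in range(2)]
)

flattened = cap_per_species(database)
print(len(database), "->", len(flattened))
```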
The data in alignment analysis files 1, 2, and 3 are combined to evaluate, for each target gene, statistics such as conservation, specificity, conserved regions, and specific regions, both across all bacteria or fungi and within the clinically important pathogenic genera. These statistics are used for the subsequent design of new primers.
Example 3 Design of universal bacterial target genes and amplification primers
Based on the analysis results for the different target genes described above, all bacterial targets were tallied and classified.
First, the conservation data in alignment analysis file 1 and alignment analysis file 2 were analyzed to determine whether a target gene contains conserved regions. If it does, the target gene can be used for bacterial primer design. A region is judged to be conserved if it meets any of the following criteria:
more than 15 consecutive bases each have a conservation above 80%;
or a region 15-25 bases long in which more than 90% of the bases have a conservation above 85%;
or a contiguous region of more than 25 bases in which more than 80% of the bases have a conservation above 85%.
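The first criterion above (more than 15 consecutive bases, each with conservation above 80%) can be sketched on a multiple sequence alignment as follows. This is not the in-house pipeline: the function names are hypothetical, conservation is assumed here to mean the fraction of aligned sequences carrying the most common non-gap base at a column, and the second and third criteria are not implemented.

```python
from collections import Counter


def column_conservation(aligned):
    """Per-column fraction of sequences carrying the most common non-gap base."""
    n = len(aligned)
    scores = []
    for col in zip(*aligned):
        counts = Counter(b for b in col if b != "-")
        scores.append(max(counts.values()) / n if counts else 0.0)
    return scores


def conserved_runs(conservation, min_len=16, threshold=0.80):
    """Return (start, end) half-open intervals of >= min_len consecutive
    columns whose conservation exceeds the threshold (criterion 1 only)."""
    runs, start = [], None
    # A trailing 0.0 sentinel closes a run that reaches the end of the alignment.
    for i, score in enumerate(list(conservation) + [0.0]):
        if score > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


toy_alignment = ["ACGT", "ACGA", "ACG-", "ACCT", "ACGT"]
print(column_conservation(toy_alignment))

toy_scores = [0.9] * 20 + [0.5] + [0.9] * 10
print(conserved_runs(toy_scores))
```

In the toy score list, the first run of 20 columns qualifies, while the trailing run of 10 columns is too short to count as a conserved region.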
Based on the conservation analysis of each target gene, the range of application of the bacterial target genes was classified, giving the "target gene evaluation file". The universal bacterial target genes finally selected were 16S rRNA, rpoB, gyrB, hsp60, ISR, and 23S rRNA.
The conserved region sites identified in the "target gene evaluation file" were analyzed for conservation frequency and variation pattern, giving a "site frequency file" and a "variable pattern file" respectively. The "site frequency file" records, for each base position in each conserved region, the exact probability of A, T, C, G, and deletion. From this file, the most probable sequence for each region and the other possible base combinations are calculated. The "variable pattern file" records the relationships between variable bases at different positions within each conserved region.
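A "site frequency file" of the kind described above could be derived from an aligned conserved region roughly as follows. The in-memory layout (one dictionary per position), the function names, and the use of "-" for a deletion are assumptions for illustration; the actual file format is not disclosed. Ties between symbols are resolved in the fixed order A, T, C, G, deletion, which is also an assumption.

```python
from collections import Counter

SYMBOLS = "ATCG-"  # the four bases plus "-" for a deletion


def site_frequencies(aligned_region):
    """For each alignment column, return the frequency of A, T, C, G and deletion."""
    n = len(aligned_region)
    table = []
    for col in zip(*aligned_region):
        counts = Counter(col)
        table.append({s: counts.get(s, 0) / n for s in SYMBOLS})
    return table


def most_probable_sequence(table):
    """Concatenate the highest-frequency symbol at every position."""
    return "".join(max(SYMBOLS, key=lambda s: site[s]) for site in table)


# Toy aligned region (placeholder sequences, not taken from the disclosure).
region = ["ACGT", "ACGA", "ACGT", "TCGT"]
freq_table = site_frequencies(region)
consensus = most_probable_sequence(freq_table)
print(freq_table[0])
print(consensus)
```

Positions where the top frequency is well below 1 are the candidates for degenerate bases or for the additional tiled primers discussed in the primer design factors.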
Based on the nucleic acid lengths compatible with nanopore sequencing and on the results in the "site frequency file" and "variable pattern file", the following were calculated for each region of a target gene usable for primer design: GC content, Tm, primer length, amplicon length, GC content of the last 5 bases at the 3' end, and the number of runs of more than 3 identical consecutive bases. The sequence, degeneracy, and primer combinations at each primer site were then determined by weighing the following factors:
1. Maximize the resolving power of the data: amplicons are 200-1500 bases long, optimally 800 bp.
2. Reduce the risk of lower amplification efficiency caused by mutations in conserved regions: based on the "site frequency file", multiple primers are designed for a single site in a tiled ("shingled") arrangement, which improves amplification efficiency across different types of microorganisms.
3. Maximize primer specificity: based on the "variable pattern file", the most frequent variable combinations are selected and specific primers are designed for them; these are mixed with universal degenerate primers to form a new primer pool, which improves amplification efficiency.
4. Ensure compatibility between primers: based on the "site frequency file", key parameters of each primer, such as GC content, are held within a fixed range, which improves the uniformity of amplification.
Through this process, primers on the 16S rRNA, rpoB, gyrB, hsp60, and ISR genes were designed and tested, yielding the "universal bacterial primer pool".
Every primer in the designed universal bacterial primer pool satisfies the following conditions:
a. the primer is 18-30 bases long;
b. the melting temperature (Tm) of the primer is 57-64°C;
c. the GC content of the primer is 40-60%;
d. the Gibbs free energy ΔG of the 5 bases at the 3' end of the primer is greater than or equal to -9 kcal/mol;
e. the self-complementarity value of the primer is less than 8.0, and the 3' end self-complementarity value is less than 3.0;
f. there is no degenerate base among the 3 consecutive bases at the 3' end of the primer;
g. the amplification product of the primer is 300-1500 bases long.
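Conditions a, c, and f above depend only on the primer sequence and can be checked with a few lines of code, as sketched below. Conditions b, d, and e require a thermodynamic model and condition g requires the template, so they are deliberately left out. The function name `check_primer`, the choice to compute GC content over unambiguous bases only, and both example sequences are hypothetical and are not taken from Tables 5-10.

```python
# IUPAC degenerate nucleotide codes.
DEGENERATE = set("RYSWKMBDHVN")


def check_primer(seq):
    """Check conditions a (length), c (GC content) and f (3' end) for one primer."""
    seq = seq.upper()
    plain = [b for b in seq if b in "ACGT"]
    # Assumption: GC content is computed over unambiguous bases only.
    gc_percent = 100.0 * sum(b in "GC" for b in plain) / len(plain) if plain else 0.0
    return {
        "length_ok": 18 <= len(seq) <= 30,          # condition a
        "gc_ok": 40.0 <= gc_percent <= 60.0,        # condition c
        "three_prime_ok": not any(b in DEGENERATE for b in seq[-3:]),  # condition f
    }


# Hypothetical primers, for illustration only.
print(check_primer("ACGTTGCAGGMTCCATGACA"))   # degenerate M in the middle
print(check_primer("ACGTTGCAGGATCCATGARA"))   # degenerate R near the 3' end
```

A full screen would add Tm, 3' end ΔG, and self-complementarity from a dedicated primer analysis tool before a primer is admitted to the pool.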
Specifically, the designed primers are as follows:
Table 5 Universal bacterial primer pool
Figure PCTCN2021071423-appb-000006
Figure PCTCN2021071423-appb-000007
Example 4 Design of universal fungal target genes and amplification primers
Using the same method as in Example 3, the universal fungal target genes and the regions within them suitable for primer amplification were analyzed in detail. Two target sites, ITS1-4 and LSU (D1/2), were finally selected, and the primers at these sites were redesigned, yielding the "universal fungal primer pool".
The primers in the universal fungal primer pool are as follows:
Table 6 Universal fungal primer pool
Figure PCTCN2021071423-appb-000008
Example 5 Combination of highly sensitive bacterial and fungal target genes and amplification primers
Some target genes have no conserved region shared by all bacteria or all fungi and so cannot be used to detect every bacterium or fungus; these are non-universal target genes. For such target genes, primers were designed one organism at a time for the highly sensitive bacteria listed in Table 1 and the highly sensitive fungi listed in Table 2, based on "alignment analysis file 3" and using the same method as in Example 3. Although these non-universal target genes are not conserved across all bacteria or fungi, they are highly conserved within a particular genus, so primers designed at the genus level can specifically amplify the bacteria or fungi of that genus. Adding such primers to the detection scheme therefore increases the sensitivity with which that genus is identified. In addition, compared with universal target genes, non-universal genes differ more in sequence between closely related bacteria or fungi, so the sequencing results discriminate these organisms better and the resolving power of the method is improved. Using the primer combination and selection method of Example 3, one highly sensitive bacterial primer pool and one highly sensitive fungal primer pool were finally designed.
The designed highly sensitive bacterial and fungal primer pools are shown in Tables 7 and 8 below:
Table 7 Highly sensitive bacterial primer pool
Figure PCTCN2021071423-appb-000009
Figure PCTCN2021071423-appb-000010
Figure PCTCN2021071423-appb-000011
Figure PCTCN2021071423-appb-000012
Figure PCTCN2021071423-appb-000013
Figure PCTCN2021071423-appb-000014
Figure PCTCN2021071423-appb-000015
Figure PCTCN2021071423-appb-000016
Table 8 Highly sensitive fungal primer pool
Figure PCTCN2021071423-appb-000017
Example 6 Multiplex bacterial and fungal target gene amplification method and primer combination method
To detect multiple bacteria and fungi while balancing high sensitivity against a broad identification range, a primer combination that can be used in a single amplification reaction, and a suitable amplification method, must be selected. Based on the foregoing, a total of 3 independently amplified primer pools were designed: the universal bacterial primer pool, the universal fungal primer pool, and the highly sensitive bacterial/fungal primer pool.
To improve efficiency and reduce cost, the amplification products of several different specimens can be mixed before sequencing library construction and sequencing. The amplification products of every primer pool therefore need to carry a specific 24-base "tag sequence" at both ends. Different specimens carry different tag sequences, while the different primer pools used on the same specimen carry the same tag sequence. The tag sequence is used to assign each sequencing read to its specimen.
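Assigning reads to specimens by their 24-base tag can be sketched as below. This is a simplified illustration only: it requires an exact tag match at the start of the read and the reverse complement of the same tag at the end, whereas real nanopore reads contain errors and would need tolerant matching. The tag sequences, specimen names, and function names are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def assign_specimen(read, tags, tag_len=24):
    """Return the specimen whose tag flanks the read on both ends, else None."""
    head, tail = read[:tag_len], read[-tag_len:]
    for specimen, tag in tags.items():
        # A tag added to the 5' end of both primers appears as the tag at the
        # start of a read and as its reverse complement at the end.
        if head == tag and tail == revcomp(tag):
            return specimen
    return None


# Hypothetical 24-base tags, one per specimen.
tags = {
    "specimen_A": "ACGTACGTACGTACGTACGTACGA",
    "specimen_B": "GATTACAGATTACAGATTACAGAT",
}

# Toy read: tag + placeholder insert + reverse complement of the tag.
toy_read = tags["specimen_B"] + "TTGACCGATA" * 5 + revcomp(tags["specimen_B"])
print(assign_specimen(toy_read, tags))
```

Because every primer pool used on one specimen carries the same tag, a single lookup table of this kind is enough to sort reads from all pools back to their specimen.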
The tag sequence can be introduced at both ends of the target gene by amplification. The target genes are amplified separately with the primers in the universal bacterial primer pool, the universal fungal primer pool, and the highly sensitive bacterial/fungal primer pool, which ensures sensitive amplification and efficient introduction of the tag sequence. This yields, respectively, the universal bacterial target gene enriched product, the universal fungal target gene enriched product, and the highly sensitive bacterial/fungal target gene enriched product.
Example 7 Design of virus-targeted amplification primers
Complete reference genomes of the viruses to be detected were downloaded from existing databases such as NCBI, and the scheme of Example 2 was used to analyze the relatively conserved and variable regions of each viral genome. In parallel, the viral genes used for PCR identification of each type of virus in literature indexed in NCBI PubMed, and those selected in FDA-approved viral nucleic acid detection kits, were consulted. After combined analysis, one or more target genes were selected for the identification of each type of virus. Incomplete gene sequences among the target gene records of each type of virus from the NCBI database were then filtered out, together with misnamed sequences that do not belong to the intended target gene, yielding a high-quality multiplex virus database.
In addition, using the method described in Example 3, amplification primers suitable for single-molecule sequencing platforms such as nanopore sequencing were designed, yielding the "multiplex virus primer pool".
The primers in the multiplex virus primer pool are shown in the following table:
表9多重病毒引物池Table 9 Multiple virus primer pool
Figure PCTCN2021071423-appb-000018
Figure PCTCN2021071423-appb-000019
Figure PCTCN2021071423-appb-000020
Example 8 Design of highly sensitive, high-coverage primers for the novel coronavirus 2019-nCoV (SARS-CoV-2)
Download the publicly available 2019-nCoV genome data from the GISAID database, and identify the conserved regions of the existing viral genomes using the scheme of Example 2. Targeting these conserved regions together with nine genes on the genome (rdrp within ORF1a/b, S, ORF3a, E, M, ORF6, ORF7a, ORF8 and N), a "novel coronavirus primer pool" was designed that covers 9094 bp of gene regions on the viral genome, including 100% of the virulence-related S, E and M genes.
Table 10 Novel coronavirus primer pool
Figure PCTCN2021071423-appb-000021
Figure PCTCN2021071423-appb-000022
Example 9 Multiplex virus target gene amplification method and amplification primer combination method
The "multiplex virus primer pool" and the "novel coronavirus primer pool" are amplified separately using the same method. The workflow consists of reverse transcription followed by cDNA amplification, as detailed below:
1. Reverse transcription
1.1 Denaturation reaction setup:
Component | Volume (µL)
Random 6 mers | 1
dNTP Mixture | 1
Nucleic acid sample | 8
Total volume | 10
Denaturation program: incubate at 65°C for 5 minutes, then cool rapidly on ice (the PCR program can be set to 4°C).
1.2 Reverse transcription reaction setup:
Component | Volume (µL)
Denatured product from step 1.1 | 10
RNase Inhibitor | 0.5
5X PrimeScript II Buffer | 4
PrimeScript II RTase | 1
RNase Free dH2O | 4.5
Total volume | 20
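When several specimens are processed together, the reverse transcription reagents above are usually prepared as a master mix. The sketch below scales the per-reaction volumes from the table to a given number of reactions. The 10% pipetting overage and the function name are assumptions for illustration and are not specified in this disclosure.

```python
# Per-reaction reagent volumes (µL) taken from the reverse transcription table.
RT_MIX_UL = {
    "RNase Inhibitor": 0.5,
    "5X PrimeScript II Buffer": 4.0,
    "PrimeScript II RTase": 1.0,
    "RNase Free dH2O": 4.5,
}
DENATURED_PRODUCT_UL = 10.0  # added to each tube individually, not to the master mix


def master_mix(n_reactions: int, overage: float = 0.1) -> dict:
    """Scale reagent volumes for n reactions plus a fractional overage (assumed)."""
    if n_reactions < 1 or overage < 0:
        raise ValueError("need at least one reaction and a non-negative overage")
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in RT_MIX_UL.items()}
```

Each tube then receives 10 µL of master mix and 10 µL of denatured product, which gives the 20 µL total stated in the table.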
Reverse transcription program:
Number of cycles | Temperature (°C) | Time (min:s)
1 | 30 | 10:00
1 | 42 | 30:00
1 | 95 | 5:00
1 | 4 | -
1.3 Purify the amplified product using magnetic beads.
2. cDNA amplification
2.1 First-round amplification reaction setup:
Figure PCTCN2021071423-appb-000023
2.2 First-round amplification PCR program:
Figure PCTCN2021071423-appb-000024
2.3 Purify the amplified product using magnetic beads.
2.4 Second-round amplification reaction setup:
Figure PCTCN2021071423-appb-000025
2.5 Second-round amplification PCR program:
Figure PCTCN2021071423-appb-000026
Figure PCTCN2021071423-appb-000027
Example 10 Pooled library construction and sequencing of amplification products
To ensure that the enrichment products obtained by PCR amplification with the universal bacterial primer pool, the universal fungal primer pool, the highly sensitive bacterial/fungal primer pool, the multiplex virus primer pool and the novel coronavirus primer pool can all be detected in a single sequencing run, the enrichment products from the five primer pools are mixed, according to the number of samples, in the proportions given in Table 11 to obtain a mixed product. A library is constructed from the mixed product using the Ligation Sequencing Kit SQK-LSK109 from Oxford Nanopore Technologies, and is sequenced on an Oxford Nanopore Technologies sequencer such as the MinION, GridION or PromethION.
Table 11 Mixing scheme for enrichment products
Figure PCTCN2021071423-appb-000028
Example 11 Simultaneous detection of 11 common respiratory viruses and the novel coronavirus in specimens
Nucleic acid was extracted from 45 throat swab specimens from clinically suspected cases of novel coronavirus infection. All 45 specimens were amplified with the "novel coronavirus primer pool" provided in the above examples, and 16 specimens were amplified with the "multiplex virus primer pool". A different tag sequence was added to each sample. The amplification product concentrations and the amount of each sample mixed are shown in Table 12 below:
Table 12 Sample mixing amounts
Figure PCTCN2021071423-appb-000029
Figure PCTCN2021071423-appb-000030
Figure PCTCN2021071423-appb-000031
A library was constructed from the amplification products using a library construction kit suitable for the nanopore sequencing platform, and was then sequenced and analyzed on the nanopore sequencing platform. The sequencing results were compared in a blinded manner against a cFDA-approved 2019-nCoV detection kit as the reference. The results are shown in Table 13 below:
Table 13 Comparison results
Figure PCTCN2021071423-appb-000032
Taking the diagnostic results of the fluorescent quantitative PCR kit as the reference, sensitivity was calculated as (PCR and sequencing both positive) / (PCR and sequencing both positive + PCR positive but sequencing negative) × 100%, giving 100%. The negative predictive value was calculated as (PCR and sequencing both negative) / (PCR and sequencing both negative + PCR positive but sequencing negative) × 100%, also giving 100%.
Cases that are PCR negative but sequencing positive are rare for the fluorescent quantitative PCR method. These two indicators therefore show that the nanopore sequencing diagnostic scheme of this application is no less sensitive than the current fluorescent quantitative PCR method.
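The two formulas above can be written out as a short sketch, with fluorescent quantitative PCR as the reference. Table 13 is not reproduced here, so the counts in the usage note are hypothetical and are not the study data. The function names are also illustrative.

```python
def sensitivity(both_positive: int, pcr_pos_seq_neg: int) -> float:
    """Both positive / (both positive + PCR positive but sequencing negative) x 100%."""
    denominator = both_positive + pcr_pos_seq_neg
    if denominator == 0:
        raise ValueError("no PCR-positive specimens")
    return 100.0 * both_positive / denominator


def negative_predictive_value(both_negative: int, pcr_pos_seq_neg: int) -> float:
    """Both negative / (both negative + PCR positive but sequencing negative) x 100%."""
    denominator = both_negative + pcr_pos_seq_neg
    if denominator == 0:
        raise ValueError("no sequencing-negative specimens")
    return 100.0 * both_negative / denominator
```

For example, with a hypothetical 20 double-positive specimens and none that are PCR positive but sequencing negative, `sensitivity(20, 0)` returns 100.0. Both indicators stay at 100% whenever sequencing misses no PCR-positive specimen.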
Mutation analysis capability test
The viral genomes contained in the 45 specimens were analyzed. The reads in each sample that aligned to each target region were grouped, and 30 reads were randomly selected from each group as "seed sequences" for correction. For each seed sequence, another 50 reads were randomly selected from the same group and used to correct it with Medaka (version 0.10.1). The 30 corrected seed sequences were then compared with the standard reference genome of the virus. When more than 80% of the seed sequences showed the same base difference from the reference genome, that locus of the virus carried by the tested sample was considered mutated.
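The final voting step of this procedure can be sketched as follows. The sketch covers only the 80% rule. It assumes the corrected seed sequences have already been aligned to the reference and trimmed to the same length without gaps, and it does not perform the Medaka correction itself. The function name and the toy sequences in the usage note are hypothetical.

```python
from collections import Counter


def call_mutations(reference: str, corrected_seeds: list, min_fraction: float = 0.8):
    """Report positions where more than min_fraction of the seeds share one base change.

    Returns a list of (1-based position, reference base, variant base).
    Seeds are assumed to be pre-aligned to the reference, gap-free and equal in length.
    """
    if not corrected_seeds:
        raise ValueError("no seed sequences given")
    if any(len(seed) != len(reference) for seed in corrected_seeds):
        raise ValueError("seeds must be aligned to the reference length")
    n_seeds = len(corrected_seeds)
    calls = []
    for index, ref_base in enumerate(reference):
        variants = Counter(
            seed[index] for seed in corrected_seeds if seed[index] != ref_base
        )
        if not variants:
            continue
        variant_base, count = variants.most_common(1)[0]
        # "More than 80%" in the text, hence the strict comparison.
        if count / n_seeds > min_fraction:
            calls.append((index + 1, ref_base, variant_base))
    return calls
```

With a toy reference `"ACGTG"` and 30 seeds of which 25 read `"ACATG"`, the function reports a G to A change at position 3. With only 24 of 30 seeds carrying the change, the threshold is not exceeded and nothing is reported.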
Using the above procedure, among the 45 specimens tested, a single-base mutation was found in the novel coronavirus genome carried by the infected individual numbered C1. As shown in Figure 2, base position 22097 of the genome of the virus carried by this patient was mutated from the original G to A.
In Figure 2, the top sequence is the standard reference sequence of the novel coronavirus, and the 30 sequences below it are the corrected seed sequences. The corrected seed sequences differ from the viral reference sequence at this position, indicating a mutation at this site.
Example 12 Rapid identification of fungal infection in the bloodstream
Patients A and B were both in the intensive care transplant ward. Both were clinically positive for fungal antigen, with a clinical diagnosis of severe community-acquired pneumonia (Pneumocystis infection) and a clinical presentation of infection associated with immunodeficiency. Treatment with anti-Pneumocystis drugs had no obvious effect.
For patient A, blood was sent for culture on May 6 to test for bloodstream infection, and the same blood sample was tested with the primers and methods provided in the present disclosure. The blood culture was reported positive on May 12, and after purification culture the organism was identified by mass spectrometry as Talaromyces marneffei on May 13, so identification took one week in total. With the method provided in the present disclosure, the specimen identification result was obtained on May 7 after 8 hours of sequencing. Eight reads in the fungal detection data aligned correctly to Talaromyces marneffei, and on this basis the case was determined to be a fungal bloodstream infection caused by Talaromyces marneffei, taking 1 day in total.
Similarly, for patient B, blood was sent for culture at 2 a.m. on May 28. The blood culture was reported positive at 10 a.m. on May 29, and after purification culture a bloodstream infection caused by Talaromyces marneffei was reported at 10 a.m. on May 30, taking nearly 3 days in total. With the method provided in the present disclosure, because the patient's bloodstream infection was severe and the proportion of Talaromyces marneffei in the blood was high, more than 50 reads aligned to Talaromyces marneffei could be detected in the blood after only 10 minutes of sequencing. The result was reported to the clinic on May 28, taking 12 hours in total.
Blood culture is currently the most commonly used clinical method for diagnosing bloodstream infection and is regarded as the gold standard. The two clinical cases shown above represent a severe infection and an early-stage infection, respectively. In both clinical situations, the results of the scheme provided in the present disclosure were consistent with those of blood culture, while the time required for testing was greatly reduced.
The details described above are given only as two clinical cases to illustrate the effect of the present disclosure. Using the same method, the inventors analyzed more than one thousand clinical cases and compared the results of culture with those of testing with the primers and methods provided in the present disclosure. The comparison showed that testing with the primers and methods provided in the present disclosure was far more accurate than culture and took less time, with a particularly marked reduction in time for the detection of fungi.
Example 13 Rapid identification of Mycobacterium tuberculosis infection in the respiratory system
Because Mycobacterium tuberculosis grows extremely slowly, culture-based identification takes nearly a month. Clinical diagnosis has therefore usually relied on the GeneXpert nucleic acid test (the WHO-recommended gold standard), the T-SPOT antigen test, or acid-fast staining, which can identify Mycobacterium tuberculosis in 1-8 hours. Owing to their technical limitations, however, these methods detect only the single pathogen Mycobacterium tuberculosis. In clinical use, the clinician therefore often has to make a preliminary judgment from the clinical symptoms or screen for multiple pathogens one by one. The whole process is time-consuming and laborious, and is easily affected by subjective factors such as the clinician's experience. Using the primers and methods provided in the present disclosure, sputum or bronchoalveolar lavage fluid specimens from more than 500 patients with respiratory infections admitted to the respiratory medicine department were tested for all bacteria and fungi, and Mycobacterium tuberculosis infection was identified in the specimens of 34 patients (Table 14).
Of these 34 patients, 27 were also retested with the gold-standard GeneXpert nucleic acid test. All 27 were positive for Mycobacterium tuberculosis, in 100% agreement with the results obtained using the primers and methods provided in the present disclosure.
Table 14 Concordance of Mycobacterium tuberculosis detection
Figure PCTCN2021071423-appb-000033
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples" and the like mean that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present disclosure. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present disclosure have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the present disclosure. Those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present disclosure.

Claims (31)

  1. A method for constructing a library based on a nanopore sequencing platform, characterized by comprising:
    enriching target genes from microorganisms to obtain an enrichment product, the microorganisms including at least one selected from bacteria, fungi and viruses;
    constructing a library based on the enrichment product to obtain the sequencing library.
  2. The method according to claim 1, characterized in that the microorganism is a fungus.
  3. The method according to claim 1, characterized in that the bacterial target genes comprise universal bacterial target genes and/or highly sensitive bacterial target genes,
    the universal bacterial target genes include at least one selected from 16s rRNA, rpob, gyrB, hsp60, ISR and 23s rRNA;
    the highly sensitive bacterial target genes include at least one selected from the target genes of the highly sensitive bacteria shown in Table 1.
  4. The method according to claim 3, characterized in that the universal bacterial target genes include the target regions listed in Table 5 and the regions 500 bp upstream and downstream of them.
  5. The method according to claim 3, characterized in that the highly sensitive bacterial target genes include the target regions listed in Table 7 and the regions 500 bp upstream and downstream of them.
  6. The method according to claim 1, characterized in that the fungal target genes comprise universal fungal target genes and/or highly sensitive fungal target genes,
    the universal fungal target genes include at least one selected from ITS1-4, LSU(D1/2), 18s rRNA and RPB2;
    the highly sensitive fungal target genes include at least one selected from the target genes of the highly sensitive fungi shown in Table 2.
  7. The method according to claim 6, characterized in that the universal fungal target genes include the target regions listed in Table 6 and the regions 500 bp upstream and downstream of them.
  8. The method according to claim 6, characterized in that the highly sensitive fungal target genes include the target regions listed in Table 8 and the regions 500 bp upstream and downstream of them.
  9. The method according to claim 1, characterized in that the viral target genes comprise multiplex virus target genes and/or novel coronavirus target genes,
    the multiplex virus target genes include at least one selected from the viral target genes shown in Table 3;
    the novel coronavirus target genes include at least one selected from the target genes shown in Table 4.
  10. The method according to claim 9, characterized in that the multiplex virus target genes include the target regions listed in Table 9 and the regions 200 bp upstream and downstream of them.
  11. The method according to claim 9, characterized in that the novel coronavirus target genes include the target regions listed in Table 10 and the regions 200 bp upstream and downstream of them.
  12. The method according to any one of claims 1 to 11, characterized in that the bacterial target genes comprise universal bacterial target genes and highly sensitive bacterial target genes,
    the fungal target genes comprise universal fungal target genes and highly sensitive fungal target genes,
    the viral target genes comprise multiplex virus target genes and novel coronavirus target genes,
    the method further comprising:
    mixing the universal bacterial target gene enrichment product, the universal fungal target gene enrichment product, the highly sensitive bacterial and highly sensitive fungal target gene enrichment product, the multiplex virus target gene enrichment product and the novel coronavirus target gene enrichment product in a mass ratio of 20~60:5~15:10~25:10~25:10~25, and constructing a library from the mixed product to obtain the sequencing library.
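As an illustrative aside on the mass ratio above, the sketch below converts a chosen ratio into the mass of each enrichment product for a given total input. The dictionary keys, the function name and the 1000 ng total in the usage note are assumptions for illustration. Only the ratio ranges come from the text.

```python
# Allowed parts for each enrichment product, from the stated mass ratio
# 20~60 : 5~15 : 10~25 : 10~25 : 10~25.
RATIO_RANGES = {
    "universal bacterial": (20, 60),
    "universal fungal": (5, 15),
    "highly sensitive bacterial/fungal": (10, 25),
    "multiplex virus": (10, 25),
    "novel coronavirus": (10, 25),
}


def pooled_masses(ratio: dict, total_ng: float) -> dict:
    """Split total_ng across the five products according to the chosen ratio."""
    if set(ratio) != set(RATIO_RANGES):
        raise ValueError("ratio must name exactly the five enrichment products")
    for name, part in ratio.items():
        low, high = RATIO_RANGES[name]
        if not low <= part <= high:
            raise ValueError(f"{name}: {part} is outside {low}~{high}")
    total_parts = sum(ratio.values())
    return {name: total_ng * part / total_parts for name, part in ratio.items()}
```

For instance, a ratio of 40:10:20:15:15 applied to a hypothetical 1000 ng total gives 400, 100, 200, 150 and 150 ng. Dividing each mass by the measured product concentration then gives the volume to pipette.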
  13. The method according to claim 1, characterized by further comprising:
    performing PCR amplification of the target genes from the microorganisms with the primers in a primer pool to enrich the target genes and obtain the enrichment product, the primer pool containing at least one primer.
  14. The method according to claim 13, characterized in that every primer in the primer pool satisfies the following conditions:
    a. the primer is 18-30 bases long;
    b. the melting temperature Tm of the primer is 57-64°C;
    c. the GC content of the primer is 40-60%;
    d. the Gibbs free energy ΔG of the 5 bases at the 3' end of the primer is greater than or equal to -9 kcal/mol;
    e. the self-complementarity value of the primer is less than 8.0, and the 3' end self-complementarity parameter of the primer is less than 3.0;
    f. there is no degenerate base among the 3 consecutive bases at the 3' end of the primer;
    g. the amplification product of the primer is 200 to 1500 bases long.
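Some of the conditions above can be checked directly on the primer sequence. The sketch below covers only conditions a, c, f and g. Conditions b, d and e (Tm, 3' end ΔG and self-complementarity) depend on a thermodynamic model and would normally come from a primer design tool, so they are deliberately left out. The function names are hypothetical, the primer is assumed to be written 5' to 3', and degenerate bases are counted as non-GC, which is a simplification.

```python
DEGENERATE_BASES = set("RYSWKMBDHVN")  # IUPAC ambiguity codes


def gc_percent(primer: str) -> float:
    """GC content in percent; degenerate bases are counted as non-GC (assumption)."""
    sequence = primer.upper()
    if not sequence:
        raise ValueError("empty primer")
    return 100.0 * sum(base in "GC" for base in sequence) / len(sequence)


def check_primer(primer: str, amplicon_length: int) -> dict:
    """Evaluate conditions a, c, f and g for a primer written 5' to 3'."""
    sequence = primer.upper()
    return {
        "a_length_18_30": 18 <= len(sequence) <= 30,
        "c_gc_40_60": 40.0 <= gc_percent(sequence) <= 60.0,
        "f_no_degenerate_at_3prime": not any(
            base in DEGENERATE_BASES for base in sequence[-3:]
        ),
        "g_amplicon_200_1500": 200 <= amplicon_length <= 1500,
    }
```

Returning one flag per condition, rather than a single pass or fail, makes it easier to see which criterion a candidate primer misses.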
  15. The method according to claim 13, characterized in that the primer pool includes at least one selected from the following primer pools:
    a universal bacterial primer pool, which includes the primers listed in Table 5;
    a universal fungal primer pool, which includes the primers listed in Table 6;
    a highly sensitive bacterial primer pool, which includes the primers listed in Table 7;
    a highly sensitive fungal primer pool, which includes the primers listed in Table 8;
    a multiplex virus primer pool, which includes the primers listed in Table 9;
    a novel coronavirus primer pool, which includes the primers listed in Table 10.
  16. A sequencing method, characterized by comprising:
    obtaining a sequencing library by the method according to any one of claims 1 to 15;
    sequencing the sequencing library on a nanopore sequencing platform.
  17. A method for identifying microorganisms, characterized by comprising:
    obtaining a sequencing library from the nucleic acid of a sample to be tested by the method according to any one of claims 1 to 15;
    sequencing the sequencing library on a nanopore sequencing platform to obtain sequencing results;
    aligning the sequencing results against a reference database, and determining the microorganisms in the sample to be tested based on the alignment results.
  18. The method according to claim 17, characterized in that the reference database includes at least one of the following:
    a universal bacterial database, which contains 16s rRNA, rpob, gyrB, hsp60, 23s rRNA and ISR gene data;
    a universal fungal database, which contains ITS1-4, LSU(D1/2), 18s rRNA and RPB2 gene data;
    a highly sensitive bacterial database, which contains the target gene data of the highly sensitive bacteria shown in Table 1;
    a highly sensitive fungal database, which contains the target gene data of the highly sensitive fungi shown in Table 2;
    a multiplex virus database, which contains the viral target gene data shown in Table 3;
    a novel coronavirus database, which contains genome data of the SARS-CoV-2 virus.
  19. The method according to claim 18, characterized by further comprising:
    aligning the sequencing results against the universal bacterial database and the universal fungal database, respectively, to obtain first alignment data and first unaligned data;
    aligning the first unaligned data against the highly sensitive bacterial database and the highly sensitive fungal database, respectively, to obtain second alignment data and second unaligned data;
    aligning the second unaligned data against the multiplex virus database and the novel coronavirus database to obtain third alignment data;
    determining the universal bacteria and universal fungi contained in the sample based on the first alignment data, determining the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second alignment data, and determining the viruses contained in the sample based on the third alignment data.
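The three-stage alignment described above is a cascade in which each stage only sees the reads left unaligned by the previous one. The sketch below shows that control flow only. The aligner is passed in as a function `align(read, database)` that returns a hit or `None`. The `toy_align` stand-in, the marker strings and the database labels are hypothetical and do not represent any real alignment tool or database.

```python
# Stage order follows the text; the database names are illustrative labels.
TIERS = [
    ("first", ["universal bacterial", "universal fungal"]),
    ("second", ["highly sensitive bacterial", "highly sensitive fungal"]),
    ("third", ["multiplex virus", "novel coronavirus"]),
]


def tiered_alignment(reads, align, tiers=TIERS):
    """Align reads stage by stage, passing only unaligned reads to the next stage.

    Returns (alignment data per stage, reads unaligned after the last stage).
    """
    results = {}
    remaining = list(reads)
    for tier_name, databases in tiers:
        aligned, unaligned = [], []
        for read in remaining:
            hits = []
            for database in databases:
                hit = align(read, database)
                if hit is not None:
                    hits.append((database, hit))
            if hits:
                aligned.append((read, hits))
            else:
                unaligned.append(read)
        results[tier_name] = aligned
        remaining = unaligned
    return results, remaining


# Toy stand-in for a real aligner: a read "aligns" if it contains a marker string.
TOY_MARKERS = {
    "universal bacterial": {"AAAA": "bacterium X"},
    "highly sensitive fungal": {"CCCC": "fungus Y"},
    "novel coronavirus": {"GGGG": "SARS-CoV-2"},
}


def toy_align(read, database):
    for marker, taxon in TOY_MARKERS.get(database, {}).items():
        if marker in read:
            return taxon
    return None
```

Passing the aligner in as a parameter keeps the cascade logic independent of whichever alignment software is actually used.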
  20. The method according to claim 19, characterized in that
    determining the universal bacteria and universal fungi contained in the sample based on the first alignment data further comprises:
    dividing the first alignment data into first uniquely aligned data and at least one group of first cross-aligned data, each group of first cross-aligned data containing multiple sequences;
    taking some of the sequences in each group of first cross-aligned data as seed sequences, and correcting the seed sequences using the remaining sequences in the group to obtain corrected seed sequences;
    merging the optimal alignment data of the corrected seed sequences against the universal bacterial database and the universal fungal database with the first uniquely aligned data, to determine the universal bacteria and universal fungi contained in the sample.
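The split into uniquely aligned and cross-aligned data, and the choice of seed sequences within each group, can be sketched as below. The grouping key used here (the set of references a read aligns to equally well) is an assumption, since the text does not define how the cross-aligned groups are formed. The correction step itself is not implemented, and all names are hypothetical.

```python
import random
from collections import defaultdict


def split_alignments(candidates: dict):
    """Separate reads with one candidate reference from reads with several.

    candidates maps read ID -> set of reference names the read aligned to.
    Cross-aligned reads are grouped by their candidate set (assumed grouping).
    """
    unique = {}
    cross_groups = defaultdict(list)
    for read_id, references in candidates.items():
        if len(references) == 1:
            unique[read_id] = next(iter(references))
        else:
            cross_groups[frozenset(references)].append(read_id)
    return unique, dict(cross_groups)


def pick_seeds(group: list, n_seeds: int, rng: random.Random):
    """Randomly choose seed reads; the remaining reads in the group correct them."""
    seeds = rng.sample(group, min(n_seeds, len(group)))
    chosen = set(seeds)
    helpers = [read_id for read_id in group if read_id not in chosen]
    return seeds, helpers
```

The corrected seeds would then be realigned, and their best hits merged with the uniquely aligned reads, as the claim describes.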
  21. The method according to claim 19, characterized in that
    determining the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second alignment data further comprises:
    dividing the second alignment data into second uniquely aligned data and at least one group of second cross-aligned data, each group of second cross-aligned data containing multiple sequences;
    taking some of the sequences in each group of second cross-aligned data as seed sequences, and correcting the seed sequences using the remaining sequences in the group to obtain corrected seed sequences;
    merging the optimal alignment results of the corrected seed sequences against the highly sensitive bacterial database and the highly sensitive fungal database with the second uniquely aligned data, to determine the highly sensitive bacteria and highly sensitive fungi contained in the sample.
  22. The method according to claim 19, characterized in that
    determining the viruses contained in the sample based on the third alignment data further comprises:
    grouping together the sequences in the third alignment data that align to the same gene region, taking some of the sequences in each group as seed sequences, and correcting the seed sequences using the remaining sequences in the group to obtain corrected seed sequences;
    determining the viruses contained in the sample based on the optimal alignment results of the corrected seed sequences against the multiplex virus database and the novel coronavirus database.
  23. The method according to claim 22, characterized by further comprising: determining the viral mutation sites present in the sample to be tested based on identical base differences between at least 80% of the corrected seed sequences and the multiplex virus database and the novel coronavirus database.
  24. A device for identifying microorganisms, characterized by comprising:
    a library construction unit, which obtains a sequencing library from the nucleic acid of the sample to be tested by the method according to any one of claims 1 to 15;
    a sequencing unit, which sequences the sequencing library on a nanopore sequencing platform to obtain the sequencing results;
    a data processing unit, which aligns the sequencing results of the nucleic acid of the sample to be tested against a reference database in order to determine the microorganisms in the sample to be tested.
  25. The device according to claim 24, characterized in that the reference database includes at least one of the following:
    a universal bacterial database, which contains 16s rRNA, rpob, gyrB, hsp60, 23s rRNA and ISR gene data;
    a universal fungal database, which contains ITS1-4, LSU(D1/2), 18s rRNA and RPB2 gene data;
    a highly sensitive bacterial database, which contains the target gene data of the highly sensitive bacteria shown in Table 1;
    a highly sensitive fungal database, which contains the target gene data of the highly sensitive fungi shown in Table 2;
    a multiplex virus database, which contains the viral target gene data shown in Table 3;
    a novel coronavirus database, which contains genome data of the SARS-CoV-2 virus.
  26. The device according to claim 24, wherein the data processing unit is further configured for:
    comparing the sequencing result with the universal bacterial database and the universal fungal database, respectively, so as to obtain first comparison data and first uncompared data;
    comparing the first uncompared data with the highly sensitive bacteria database and the highly sensitive fungus database, respectively, so as to obtain second comparison data and second uncompared data;
    comparing the second uncompared data with the multiple virus database and the new coronavirus database so as to obtain third comparison data;
    determining the universal bacteria and universal fungi contained in the sample based on the first comparison data, determining the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second comparison data, and determining the viruses contained in the sample based on the third comparison data.
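The three-stage comparison recited in claim 26 can be pictured as a cascade in which only the reads left unmatched by one tier are passed to the next. The following is a minimal sketch under stated assumptions, not the applicant's implementation: the database names, the `Aligner` signature and the `toy_align` substring matcher are all hypothetical stand-ins for a real aligner and real reference databases.

```python
from typing import Callable, Dict, List, Optional, Tuple

Reads = Dict[str, str]   # read id -> sequence
Hits = Dict[str, str]    # read id -> name of the matched reference
# Hypothetical aligner signature: (sequence, database name) -> reference name or None
Aligner = Callable[[str, str], Optional[str]]


def align_tier(reads: Reads, databases: List[str], align: Aligner) -> Tuple[Hits, Reads]:
    """Compare reads with every database of one tier; return (compared, uncompared)."""
    compared: Hits = {}
    uncompared: Reads = {}
    for read_id, seq in reads.items():
        hit = None
        for db in databases:
            hit = align(seq, db)
            if hit is not None:
                break
        if hit is None:
            uncompared[read_id] = seq
        else:
            compared[read_id] = hit
    return compared, uncompared


def cascade(reads: Reads, align: Aligner) -> Tuple[Hits, Hits, Hits]:
    """Tier 1: universal databases; tier 2: highly sensitive; tier 3: viral."""
    first, rest1 = align_tier(reads, ["universal_bacteria", "universal_fungi"], align)
    second, rest2 = align_tier(rest1, ["sensitive_bacteria", "sensitive_fungi"], align)
    third, _ = align_tier(rest2, ["multi_virus", "sars_cov_2"], align)
    return first, second, third


# Toy databases (assumed): each reference is represented by one marker substring.
TOY_DBS = {
    "universal_bacteria": {"16S_demo": "ACGTAC"},
    "universal_fungi": {"18S_demo": "CATCAT"},
    "sensitive_bacteria": {"target_demo": "GGGTTT"},
    "sensitive_fungi": {"fungal_target_demo": "CGCGCG"},
    "multi_virus": {"virus_demo": "TTTAAA"},
    "sars_cov_2": {"sars_cov_2_demo": "GATTACA"},
}


def toy_align(seq: str, db: str) -> Optional[str]:
    """Stand-in for a real aligner: report the first reference whose marker occurs in seq."""
    for name, marker in TOY_DBS.get(db, {}).items():
        if marker in seq:
            return name
    return None
```

In practice `toy_align` would be replaced by a call to an actual alignment tool; the point of the sketch is only that each later tier sees the uncompared data of the previous one.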
  27. The device according to claim 26, wherein
    the determining of the universal bacteria and universal fungi contained in the sample based on the first comparison data further comprises:
    dividing the first comparison data into first unique comparison data and at least one group of first cross comparison data, each group of first cross comparison data containing multiple sequences;
    taking some of the multiple sequences in each group of first cross comparison data as seed sequences, and correcting the seed sequences using the remaining sequences in the group, so as to obtain corrected seed sequences;
    merging the optimal comparison results of the corrected seed sequences against the universal bacterial database and the universal fungal database with the first unique comparison data, so as to determine the universal bacteria and universal fungi contained in the sample.
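The split into unique and cross comparison data, followed by seed correction, can be sketched as below. This is an illustration under loud assumptions: "cross comparison" is read here as a read that hits more than one reference, the reads in a group are assumed to be pre-aligned and of equal length (no insertions or deletions), and correction is done by a simple per-position majority vote. The claim itself does not specify the correction algorithm, and both function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def split_unique_and_cross(
    hits: Dict[str, List[str]],
) -> Tuple[Dict[str, str], Dict[Tuple[str, ...], List[str]]]:
    """hits maps read id -> references it matched.

    Reads with a single reference are 'unique'; reads matching several
    references are grouped by the (sorted) set of references they share.
    """
    unique: Dict[str, str] = {}
    cross: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for read_id, refs in hits.items():
        if len(refs) == 1:
            unique[read_id] = refs[0]
        else:
            cross[tuple(sorted(refs))].append(read_id)
    return unique, dict(cross)


def correct_seed(seed: str, others: List[str]) -> str:
    """Majority-vote correction of a seed using the other reads in its group.

    Assumes the reads are already aligned column by column to the seed.
    The seed itself contributes one vote at each position.
    """
    corrected = []
    for i, base in enumerate(seed):
        votes = Counter(o[i] for o in others if i < len(o))
        votes[base] += 1
        corrected.append(votes.most_common(1)[0][0])
    return "".join(corrected)
```

The corrected seed would then be compared with the databases again, and its best hit merged with the unique comparison data, as the claim recites.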
  28. The device according to claim 26, wherein the determining of the highly sensitive bacteria and highly sensitive fungi contained in the sample based on the second comparison data further comprises:
    dividing the second comparison data into second unique comparison data and at least one group of second cross comparison data, each group of second cross comparison data containing multiple sequences;
    taking some of the multiple sequences in each group of second cross comparison data as seed sequences, and correcting the seed sequences using the remaining sequences in the group, so as to obtain corrected seed sequences;
    merging the optimal comparison results of the corrected seed sequences against the highly sensitive bacteria database and the highly sensitive fungus database with the second unique comparison data, so as to determine the highly sensitive bacteria and highly sensitive fungi contained in the sample.
  29. The device according to claim 26, wherein the determining of the viruses contained in the sample based on the third comparison data further comprises:
    grouping together the multiple sequences in the third comparison data that are aligned to the same gene region, taking some of the sequences in each group as seed sequences, and correcting the seed sequences using the remaining sequences in the group, so as to obtain corrected seed sequences;
    determining the viruses contained in the sample based on the optimal comparison results of the corrected seed sequences against the multiple virus database and the new coronavirus database.
  30. The device according to claim 29, further comprising: determining the viral mutation sites present in the sample to be tested based on identical base differences, relative to the multiple virus database and the new coronavirus database, that occur in at least 80% of the corrected seed sequences.
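The 80% criterion of claims 23 and 30 amounts to a per-position frequency threshold. The sketch below is one possible reading, assuming the corrected seed sequences are already aligned to the reference without gaps and that positions are counted from zero; `call_mutations` and its parameters are hypothetical names, not taken from the application.

```python
from collections import Counter
from typing import Dict, List, Tuple


def call_mutations(
    reference: str,
    corrected_seeds: List[str],
    min_fraction: float = 0.8,
) -> Dict[int, Tuple[str, str]]:
    """Report positions where the same non-reference base is seen in at least
    min_fraction of the corrected seed sequences.

    Returns {0-based position: (reference base, alternative base)}.
    Assumes the seeds are gap-free alignments starting at reference position 0.
    """
    calls: Dict[int, Tuple[str, str]] = {}
    total = len(corrected_seeds)
    if total == 0:
        return calls
    for pos, ref_base in enumerate(reference):
        # Count only bases that differ from the reference at this position.
        diffs = Counter(
            s[pos] for s in corrected_seeds if pos < len(s) and s[pos] != ref_base
        )
        if not diffs:
            continue
        alt, count = diffs.most_common(1)[0]
        if count / total >= min_fraction:
            calls[pos] = (ref_base, alt)
    return calls
```

A difference seen in only a minority of corrected seeds is treated as residual sequencing error rather than a mutation, which is the role the threshold plays in the claim.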
  31. A kit, comprising a primer pool, the primer pool comprising at least one selected from the following:
    a universal bacterial primer pool comprising the primers listed in Table 5;
    a universal fungal primer pool comprising the primers listed in Table 6;
    a highly sensitive bacterial primer pool comprising the primers listed in Table 7;
    a highly sensitive fungal primer pool comprising the primers listed in Table 8;
    a multiple virus primer pool comprising the primers listed in Table 9;
    a new coronavirus primer pool comprising the primers listed in Table 10.
PCT/CN2021/071423 2020-02-18 2021-01-13 Library construction method based on nanopore sequencing platform, microorganism identification method, and application WO2021164472A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202010100272.4 2020-02-18
CN202010100272 2020-02-18
CN202010306570.9A CN111662958B (en) 2020-02-18 2020-04-17 Construction method of library based on nanopore sequencing platform, method for identifying microorganisms and application
CN202010306570.9 2020-04-17

Publications (1)

Publication Number Publication Date
WO2021164472A1 (en) 2021-08-26

Family

ID=72382846

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/071423 WO2021164472A1 (en) 2020-02-18 2021-01-13 Library construction method based on nanopore sequencing platform, microorganism identification method, and application

Country Status (2)

Country Link
CN (1) CN111662958B (en)
WO (1) WO2021164472A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114196779A (en) * 2021-12-27 2022-03-18 武汉明德生物科技股份有限公司 Pathogenic microorganism detection method and kit based on targeted sequencing
CN114196743A (en) * 2021-12-27 2022-03-18 武汉明德生物科技股份有限公司 Rapid detection method for pathogenic microorganisms and kit thereof
CN114351261A (en) * 2022-02-28 2022-04-15 江苏先声医学诊断有限公司 Method for detecting respiratory tract sample difficultly-detected pathogenic microorganisms based on nanopore sequencing platform
CN115101126A (en) * 2022-02-22 2022-09-23 中国医学科学院北京协和医院 Respiratory tract virus and/or bacterial subtype primer design method and system based on CE platform
CN117701780A (en) * 2024-02-06 2024-03-15 江西师范大学 Hantaan virus whole genome targeting capture primer group and application thereof
CN117701780B (en) * 2024-02-06 2024-05-28 江西师范大学 Hantaan virus whole genome targeting capture primer group and application thereof

Families Citing this family (10)

Publication number Priority date Publication date Assignee Title
CN111662958B (en) * 2020-02-18 2022-12-06 武汉臻熙医学检验实验室有限公司 Construction method of library based on nanopore sequencing platform, method for identifying microorganisms and application
CN112176032B (en) * 2020-10-16 2021-10-26 广州市达瑞生物技术股份有限公司 Primer combination for nanopore sequencing and library building of respiratory pathogens and application thereof
CN112501268B (en) * 2020-11-23 2023-04-07 广州市达瑞生物技术股份有限公司 Nanopore sequencing-based primer group and kit for rapidly identifying respiratory microorganisms and application of primer group and kit
CN112967753B (en) * 2021-02-25 2022-04-22 美格医学检验所(广州)有限公司 Pathogenic microorganism detection system and method based on nanopore sequencing
CN113106171A (en) * 2021-03-18 2021-07-13 中国计量大学 Newcastle disease virus vaccine strain identification and mutation detection method based on nanopore sequencing platform and application
CN113744806B (en) * 2021-06-23 2024-03-12 杭州圣庭医疗科技有限公司 Fungus sequencing data identification method based on nanopore sequencer
CN113637796B (en) * 2021-07-06 2023-09-01 杭州圣庭医疗科技有限公司 Rapid identification method for 8 herpesviruses based on nanopore sequencer
CN113667729A (en) * 2021-07-29 2021-11-19 杭州圣庭医疗科技有限公司 Rapid microorganism identification method based on nanopore sequencer
CN114381535A (en) * 2021-12-24 2022-04-22 天津科技大学 Method for rapidly detecting total bacteria and 5 main genus bacteria in oral biomembrane without sequencing
CN114438182B (en) * 2022-02-18 2024-04-05 杭州柏熠科技有限公司 Inlet plant quarantine virus identification method based on nanopore sequencing and application

Citations (5)

Publication number Priority date Publication date Assignee Title
CN108611350A (en) * 2018-05-04 2018-10-02 广州金域医学检验集团股份有限公司 16S rDNA microorganism fungus kinds identification primer system, kit and application
CN108885649A (en) * 2015-11-12 2018-11-23 塞缪尔·威廉姆斯 Short dna segment is quickly sequenced using nano-pore technology
CN109797438A (en) * 2019-01-17 2019-05-24 武汉康测科技有限公司 A kind of joint component and library constructing method quantifying sequencing library building for the variable region 16S rDNA
CN110438199A (en) * 2019-08-15 2019-11-12 深圳谱元科技有限公司 A kind of method of novel the pathogenic microorganism examination
CN111662958A (en) * 2020-02-18 2020-09-15 武汉臻熙医学检验实验室有限公司 Construction method of library based on nanopore sequencing platform, method for identifying microorganisms and application

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US11400454B2 (en) * 2017-01-10 2022-08-02 Mriglobal Modular mobile field-deployable laboratory for rapid, on-site detection and analysis of biological targets
US20210246519A1 (en) * 2018-05-04 2021-08-12 The Regents Of The University Of California Spiked primers for enrichment of pathogen nucleic acids among background of nucleic acids

Cited By (7)

Publication number Priority date Publication date Assignee Title
CN114196779A (en) * 2021-12-27 2022-03-18 武汉明德生物科技股份有限公司 Pathogenic microorganism detection method and kit based on targeted sequencing
CN114196743A (en) * 2021-12-27 2022-03-18 武汉明德生物科技股份有限公司 Rapid detection method for pathogenic microorganisms and kit thereof
CN115101126A (en) * 2022-02-22 2022-09-23 中国医学科学院北京协和医院 Respiratory tract virus and/or bacterial subtype primer design method and system based on CE platform
CN114351261A (en) * 2022-02-28 2022-04-15 江苏先声医学诊断有限公司 Method for detecting respiratory tract sample difficultly-detected pathogenic microorganisms based on nanopore sequencing platform
CN114351261B (en) * 2022-02-28 2023-12-15 江苏先声医学诊断有限公司 Detection method for difficult-to-detect pathogenic microorganisms in respiratory tract sample based on nanopore sequencing platform
CN117701780A (en) * 2024-02-06 2024-03-15 江西师范大学 Hantaan virus whole genome targeting capture primer group and application thereof
CN117701780B (en) * 2024-02-06 2024-05-28 江西师范大学 Hantaan virus whole genome targeting capture primer group and application thereof

Also Published As

Publication number Publication date
CN111662958B (en) 2022-12-06
CN111662958A (en) 2020-09-15

Legal Events

Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21756743; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21756743; Country of ref document: EP; Kind code of ref document: A1)