WO2017129110A1 - Method for qualitative and quantitative detection of human microorganisms - Google Patents

Method for qualitative and quantitative detection of human microorganisms

Info

Publication number
WO2017129110A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
microorganism
characteristic
region
group
Prior art date
Application number
PCT/CN2017/072441
Other languages
English (en)
French (fr)
Inventor
彭海
张英
卢龙
Original Assignee
江汉大学
辛辛那提儿童医院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 江汉大学, 辛辛那提儿童医院 filed Critical 江汉大学
Priority to US16/073,395 priority Critical patent/US20190048393A1/en
Priority to EP17743725.8A priority patent/EP3409789A4/en
Publication of WO2017129110A1 publication Critical patent/WO2017129110A1/zh

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6848 Nucleic acid amplification reactions characterised by the means for preventing contamination or increasing the specificity or sensitivity of an amplification reaction
    • C12Q1/686 Polymerase chain reaction [PCR]
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12Q1/02 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving viable microorganisms
    • C12Q1/04 Determining presence or kind of microorganism; Use of selective media for testing antibiotics or bacteriocides; Compositions containing a chemical indicator therefor
    • C12Q1/06 Quantitative determination
    • C12Q1/70 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C12Q2600/16 Primer sets for multiplex assays
    • C12Q2600/166 Oligonucleotides used as internal standards, controls or normalisation probes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics

Definitions

  • The invention relates to the field of biotechnology, and in particular to a method for the qualitative and quantitative detection of human microorganisms.
  • Human microbes are an important basis for the diagnosis of human diseases, so they must be detected accurately, both qualitatively and quantitatively.
  • Existing qualitative and quantitative microbial detection technologies include morphological counting, chip detection, 16S rRNA sequencing, metagenomic sequencing, and real-time quantitative PCR (polymerase chain reaction).
  • Morphological counting requires pre-culturing the microorganisms, which takes a long time, and unculturable microorganisms cannot be detected. Only one microorganism can be detected at a time, so throughput is low; the sample amount at counting is limited, so the result is coarse; and taxa below the species level cannot be distinguished. Chip detection requires a large amount of sample DNA, so the microorganisms must be pre-cultured and enriched; the results are inaccurate, and quantitative detection is not possible.
  • 16S rRNA sequencing cannot distinguish taxa below the species level.
  • Metagenomic sequencing has a limited sequencing depth, so its quantification of low-abundance microorganisms is very inaccurate.
  • Real-time quantitative PCR can detect only one microorganism at a time, so throughput is low.
  • A common drawback of the existing methods is that the reliability of the qualitative and quantitative results cannot be calculated, which limits the practical value of their conclusions.
  • These technical defects lead to problems such as delayed diagnosis, inaccurate diagnosis, and misdiagnosis of human diseases.
  • Embodiments of the present invention provide a method for the qualitative and quantitative detection of human microorganisms.
  • The technical solution is as follows:
  • Embodiments of the present invention provide a method for the qualitative and quantitative detection of human microorganisms, the method including:
  • The number of target microbial groups is ≥ 1, and each target microbial group includes ≥ 0 kinds of target microorganisms;
  • The target microorganism is at least one of bacteria, viruses, fungi, actinomycetes, rickettsia, mycoplasma, chlamydia, spirochetes, and protozoa;
  • The reference microorganism is at least one of bacteria, viruses, fungi, actinomycetes, rickettsia, mycoplasma, chlamydia, spirochetes, and protozoa.
  • The method for determining the non-target organisms in the sample to be tested is as follows: the non-target organisms are first defined as all organisms other than the target microbial group. If a characteristic region of the target microbial group can be obtained under this definition, the non-target organisms are all organisms other than the target microbial group; if no characteristic region of the target microbial group can be obtained, the non-target organisms are the organisms in the mixed sample other than the target microbial group.
  • The characteristic region of the target microbial group is a nucleic acid sequence on the reference genome of a microorganism within the target microbial group; the sequences on both sides of the characteristic region are unique sequences in the reference genome; the sequences on both sides of the characteristic region are conserved among the different microorganisms in the target microbial group; and the degree of discrimination of the characteristic region of the target microbial group is ≥ 3;
  • The characteristic region of the target microorganism is homologous to the characteristic region of the target microbial group, and its m2 value is ≥ 2, where the m2 value is the minimum number of differing bases between the characteristic region of the target microorganism and the microorganisms in the target microbial group other than the target microorganism;
  • The characteristic region of the reference microorganism is a nucleic acid sequence on the reference genome of the reference microorganism; the sequences on both sides of the characteristic region are unique sequences in the reference genome of the reference microorganism; and the sequences on both sides of the characteristic region have no homologs in organisms other than the reference microorganism.
  • The degree of discrimination is the minimum number of differing bases between the characteristic region of any target microbial group and any non-characteristic region amplified by the same mixed multiplex amplification primers, where a non-characteristic region is an amplification product of the mixed multiplex amplification primers, with the nucleic acid of the mixed sample as template, that is not a characteristic region of the target microbial group. If no non-characteristic region exists, the degree of discrimination is 3 × L1/4, where L1 is the length of the nucleic acid sequence of the characteristic region of the target microbial group.
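The definition of the degree of discrimination can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length, so that differing bases can be counted position by position; the function names `count_diff_bases` and `discrimination_degree` are hypothetical and do not come from the patent.

```python
def count_diff_bases(a: str, b: str) -> int:
    """Count differing positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def discrimination_degree(region: str, non_characteristic_regions: list[str]) -> float:
    """Minimum number of differing bases between one characteristic region and
    every non-characteristic region produced by the same mixed primers.

    If there is no non-characteristic region, fall back to 3 * L1 / 4,
    where L1 is the length of the characteristic region.
    """
    if not non_characteristic_regions:
        return 3 * len(region) / 4
    return min(count_diff_bases(region, other) for other in non_characteristic_regions)


# Hypothetical toy sequences, for illustration only.
print(discrimination_degree("ACGTACGT", ["ACGTACGA", "TCGTACGA"]))
print(discrimination_degree("ACGTACGT", []))
```

A real pipeline would obtain the differing-base counts from an alignment tool rather than from a position-wise comparison.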
  • The mixed multiplex amplification primers must not amplify exogenous nucleic acid introduced while the nucleic acid of the mixed sample is being extracted.
  • The qualitative analysis method for the target microbial group and the target microorganism is as follows:
  • the sequencing fragment is a characteristic sequencing fragment of the target microbial group;
  • If the probability P5 that characteristic sequencing fragments of the target microbial group exist satisfies P5 ≥ α5, the target microbial group is determined to be present in the sample to be tested, where α5 is a probability guarantee; if P5 < α5, the target microbial group is determined to be absent from the sample to be tested;
  • If the probability P6 that characteristic sequencing fragments of the target microorganism exist satisfies P6 ≥ α6, the target microorganism is determined to be present in the sample to be tested, where α6 is a probability guarantee; if P6 < α6, the target microorganism is determined to be absent from the sample to be tested;
  • N1 is chosen such that P1 ≤ α1 and P3 ≤ α3, where P1 is the false-positive probability that a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microbial group is misjudged as one; P3 is the false-negative probability that a characteristic sequencing fragment of the target microbial group is misjudged as not being one; and α1 and α3 are judgment thresholds;
  • S1 is the median, over all characteristic regions of the target microbial group, of the number of characteristic sequencing fragments of the target microbial group; S3 is the median, over all characteristic regions of the target microorganism, of the number of characteristic sequencing fragments of the target microorganism; FALSE is a parameter value; and the BINOM.DIST function returns the individual-term binomial distribution probability.
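For readers without a spreadsheet at hand, the BINOM.DIST call mentioned above can be mirrored in Python. This is a sketch of the spreadsheet function only: the arguments the patent passes to it when computing P1 and P3 are not given in this excerpt, so none are assumed here, and the name `binom_dist` is hypothetical.

```python
from math import comb


def binom_dist(k: int, n: int, p: float, cumulative: bool = False) -> float:
    """Binomial probability in the style of the spreadsheet BINOM.DIST function.

    With cumulative=False (the FALSE parameter value in the text) this returns
    the probability of exactly k successes in n trials with success rate p.
    With cumulative=True it returns the probability of at most k successes.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if cumulative:
        return sum(binom_dist(i, n, p) for i in range(k + 1))
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 2 successes in 4 trials at p = 0.5.
print(binom_dist(2, 4, 0.5))
```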
  • The quantitative analysis method for the target microbial group and the target microorganism is as follows:
  • The amount of the target microorganism is M2 = M1 × S3/S1;
  • The confidence interval of the amount of the target microorganism is [M21, M22],
  • where M21 and M22 are the lower and upper limits, respectively, of the confidence interval of the M2 value.
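The point estimate can be sketched in a few lines of Python, assuming the relation is M2 = M1 × S3/S1 and that M1 denotes the amount of the target microbial group (M1 is not defined in this excerpt, so that reading is an assumption). The confidence limits M21 and M22 are not computed because their formula is not given here; the function name and the fragment counts are hypothetical.

```python
from statistics import median


def target_amount(m1: float, group_counts: list[int], target_counts: list[int]) -> float:
    """Estimate M2 = M1 * S3 / S1.

    group_counts:  characteristic-fragment counts, one per characteristic
                   region of the target microbial group (median gives S1).
    target_counts: characteristic-fragment counts, one per characteristic
                   region of the target microorganism (median gives S3).
    """
    s1 = median(group_counts)
    s3 = median(target_counts)
    if s1 == 0:
        raise ValueError("S1 is zero; the ratio S3/S1 is undefined")
    return m1 * s3 / s1


# Made-up counts for illustration only.
print(target_amount(1000, [100, 120, 80], [50, 60, 40]))
```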
  • The technical solution provided by the embodiments of the invention has the following beneficial effects: the method does not require pre-culturing and proliferating the microorganisms, so it is fast; it can detect multiple microorganisms simultaneously, so throughput is high; and the sample amount at counting is large, so the detection results are fine-grained. Taxa can be distinguished without a large amount of DNA, enrichment culture is avoided, the detection results are noise-free and accurate, and quantification of low-abundance microorganisms is accurate. The qualitative and quantitative results are accurate, with high resolution, high sensitivity, and a probability guarantee, and the detection process is simple, fast, and standardized.
  • The method provided by the present invention facilitates timely and accurate diagnosis of blood diseases.
  • The sample to be tested is human tissue, body fluid, or excrement; among these, blood microbes are a basis for the diagnosis and treatment of human diseases.
  • The sample to be tested in the present embodiment is human blood taken from a patient diagnosed by a doctor with bacteremia; detecting the microorganisms in the blood provides a basis for a treatment plan.
  • Step 1: Determine the target microbial group, the target microorganism, and the non-target organisms in the sample to be tested, as well as a reference microorganism that is not present in the sample to be tested. The specific method is as follows:
  • The number of target microbial groups is ≥ 1, and each target microbial group includes ≥ 0 target microorganisms; the target microorganism may be at least one of bacteria, viruses, fungi, actinomycetes, rickettsia, mycoplasma, chlamydia, spirochetes, and protozoa.
  • The purpose of this example is to identify Pseudomonas aeruginosa (Latin name: Pseudomonas aeruginosa) in the sample to be tested; Pseudomonas aeruginosa strains with known reference genomes are available at NCBI (National Center for Biotechnology Information).
  • The reference microorganism may be at least one of bacteria, viruses, fungi, actinomycetes, rickettsia, mycoplasma, chlamydia, spirochetes, and protozoa.
  • The reference microorganism is not present in the sample to be tested.
  • The role of the reference microorganism is to provide a reference for quantifying the target microbial group and the target microorganism in the sample to be tested.
  • Agrobacterium tumefaciens lives in the roots of plants and is therefore not present in the sample to be tested. In the present example, Agrobacterium tumefaciens was therefore selected as the reference microorganism; its Latin name is Agrobacterium tumefaciens K84.
  • The method for determining the non-target organisms in the sample to be tested is as follows: the non-target organisms are first defined as all organisms other than the target microbial group. If a characteristic region of the target microbial group can be obtained under this definition, the non-target organisms are all organisms other than the target microbial group, where "all organisms" means organisms with reference genomes; this is the most stringent criterion for non-target organisms.
  • In this embodiment, when the non-target organisms are defined as all known organisms other than the target microbial group, characteristic regions of the target microbial group can be found (the procedure for obtaining the characteristic regions is described below, and the results are shown in Table 1). Therefore, the non-target organisms in this embodiment are the set of all organisms other than the target microbial group.
  • If no characteristic region of the target microbial group is obtained when the non-target organisms are defined as all organisms other than the target microbial group, the non-target organisms are instead defined as the organisms in the mixed sample other than the target microbial group. Narrowing the scope of the non-target organisms in this way increases the likelihood of finding a characteristic region of the target microbial group.
  • The organisms in the mixed sample other than the target microbial group can be determined empirically.
  • In this embodiment, the mixed sample includes blood and the reference microorganism, so it cannot contain plant components or microorganisms that are obligate parasites of plants. Therefore, if no characteristic region could be obtained when the non-target organisms in this embodiment were defined as all known organisms other than the target microbial group, the non-target organisms could be defined as the set of organisms other than the target microbial group, plants, and microorganisms that are obligate parasites of plants.
  • Step 2: Obtain the characteristic regions of the target microbial group, of the target microorganism, and of the reference microorganism from the reference genome sequences of the target microbial group, the target microorganism, the reference microorganism, and the non-target organisms. The characteristic regions are defined as follows:
  • The characteristic region of the target microbial group is a nucleic acid sequence on the reference genome of a microorganism in the target microbial group; the sequences on both sides of the characteristic region are unique sequences in the reference genome; the sequences on both sides are conserved among the different microorganisms in the target microbial group; and the degree of discrimination of the characteristic region of the target microbial group is ≥ 3.
  • A non-characteristic region is an amplification product of the mixed multiplex amplification primers, with the nucleic acid of the mixed sample as template, that is not a characteristic region of the target microbial group. The degree of discrimination is the minimum number of differing bases between the characteristic region of any target microbial group and any non-characteristic region amplified by the same mixed multiplex amplification primers. If no non-characteristic region exists, the degree of discrimination is 3 × L1/4, where L1 is the length of the nucleic acid sequence of the characteristic region of the target microbial group.
  • The characteristic region of the target microbial group is used to represent the target microbial group: the presence of the characteristic region indicates the presence of the target microbial group, and the number of sequencing fragments of the characteristic region represents the amount of the target microbial group.
  • Ideally, the multiplex primers for the characteristic regions of the target microbial group amplify only those characteristic regions and cannot amplify non-target organisms. This requires that the sequences on both sides of the characteristic region, that is, the primer design regions, have no homologs in the non-target organisms; the non-target organisms then cannot be amplified, and no non-characteristic region is generated.
  • In that case, the degree of discrimination is 3 × L1/4.
  • The degree of discrimination of the characteristic region of the target microbial group is required to be ≥ 3 so that both the false positive rate and the false negative rate in identifying characteristic sequencing fragments of the target microbial group are low; the principle is shown in Table 2.
  • Because the sequences on both sides of the characteristic region are conserved among the different microorganisms in the target microbial group, the same primers can amplify the different microorganisms in the group, which excludes the influence of differences in amplification efficiency on the relative quantification of those microorganisms.
  • The characteristic region of the target microorganism is homologous to the characteristic region of the target microbial group, and its m2 value is ≥ 2, where the m2 value is the minimum number of differing bases between the characteristic region of the target microorganism and the microorganisms in the target microbial group other than the target microorganism. These other microorganisms are the other physiological races in the target microbial group; the characteristic region is compared with each of them in turn, and m2 is the smallest of the resulting numbers of differing bases.
  • When detecting a target microorganism, the focus is on distinguishing it from the other microorganisms in the target microbial group.
  • The target microorganism is closely related to the other microorganisms in the target microbial group, and their sequences are highly similar, so they are difficult to distinguish.
  • In the qualitative and quantitative analysis of the target microorganism, only the standard genotypes in the amplicon that differ from those of the other microorganisms in the target microbial group are considered, which reduces the sources of error and thus better distinguishes the target microorganism from the rest of the target microbial group.
  • When m2 ≥ 2, a sequencing fragment can be judged to be a characteristic sequencing fragment of the target microorganism with both a low false positive rate and a low false negative rate. The target microorganism can therefore be distinguished within the target microbial group; the principle is shown in Table 2.
  • The characteristic region of the reference microorganism is a nucleic acid sequence on the reference genome of the reference microorganism; the sequences on both sides of the characteristic region are unique sequences in the reference genome of the reference microorganism; and the sequences on both sides of the characteristic region have no homologs in organisms other than the reference microorganism.
  • The degree of discrimination is the only selection criterion for the characteristic region of the target microbial group. Depending on the purpose of the detection, microorganisms carrying a specific gene sequence may be treated as the target microbial group, with that gene sequence as the characteristic region of the target microbial group.
  • For example, microorganisms carrying a specific pathogenic gene can be treated as the target microbial group, with the pathogenic gene as the characteristic region, so that medication can be guided by the type of pathogenic gene.
  • Drug resistance genes can also be used as the specific gene sequence.
  • Step 3: Prepare first multiplex amplification primers that amplify the characteristic regions of the target microbial group, second multiplex amplification primers that amplify the characteristic regions of the target microorganism, and third multiplex amplification primers that amplify the characteristic regions of the reference microorganism, and mix the first, second, and third multiplex amplification primers to obtain the mixed multiplex amplification primers.
  • Step 2 and Step 3 are carried out together as follows:
  • The genome sequences of the different races in the target microbial group were downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/, and these genomes were aligned against the query sequence using the software Megablast (version 2.2.26).
  • The query sequence is the genome sequence with accession number AE004091 at NCBI.
  • The Megablast alignment parameters were set as follows: parameter -e was set to 1e-5; parameter -p was set to 0; parameter -v was set to 5000; parameter -m was set to 1. After the alignment was completed, the sequences homologous among all microorganisms of the target microbial group were obtained, and the homologous sequences that appear only once in the query sequence were selected.
  • A sliding window was then moved along the selected homologous sequences.
  • For each window, the bases that differ between at least two microorganisms in the target microbial group are identified; the region from the first differing base to the last differing base in the window is taken as the characteristic region, and the number of differing bases in the characteristic region is counted.
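The per-window step can be sketched as follows. This is a minimal illustration, assuming each window is supplied as equal-length aligned sequences, one per microorganism in the target microbial group; the function name `characteristic_region` and the 0-based inclusive coordinates are choices made for this sketch, not taken from the patent.

```python
def characteristic_region(window_seqs: list[str]):
    """Locate the characteristic region inside one alignment window.

    A position counts as a differing base when at least two sequences
    disagree there. Returns (first, last, n_diff) with 0-based inclusive
    coordinates, or None when the window has no differing base.
    """
    length = len(window_seqs[0])
    if any(len(s) != length for s in window_seqs):
        raise ValueError("all sequences in a window must have the same length")
    diff_positions = [
        i for i in range(length) if len({s[i] for s in window_seqs}) > 1
    ]
    if not diff_positions:
        return None
    return diff_positions[0], diff_positions[-1], len(diff_positions)


# Toy window with three hypothetical strains; positions 2 and 5 differ.
print(characteristic_region(["AACGTTA", "AATGTCA", "AACGTTA"]))
```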
  • The regions extending on both sides of the characteristic region by a length of 160 bp minus the length of the characteristic region are used as the primer search regions. Within the primer search regions, regions longer than 20 bp with no base differences among all microorganisms in the target microbial group are sought as the primer design regions for the characteristic region; characteristic regions that lack a primer design region are discarded.
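The search for primer design regions can be sketched as a scan for conserved runs. The sketch assumes the primer search region is given as equal-length aligned sequences for all microorganisms in the target microbial group; the name `conserved_runs` and the half-open coordinates are illustrative assumptions.

```python
def conserved_runs(aligned_seqs: list[str], min_len: int = 20):
    """Return (start, end) half-open intervals of maximal runs in which all
    sequences carry the same base and whose length is greater than min_len."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    runs, start = [], None
    for i in range(length + 1):
        # The extra iteration at i == length closes a run that reaches the end.
        conserved = i < length and len({s[i] for s in aligned_seqs}) == 1
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start > min_len:
                runs.append((start, i))
            start = None
    return runs


# Two toy sequences that differ only at position 25.
a = "A" * 25 + "C" + "G" * 10
b = "A" * 25 + "T" + "G" * 10
print(conserved_runs([a, b]))
```

A characteristic region for which this scan returns no run on either side would be discarded under the rule stated above.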
  • The characteristic regions of all the target microbial groups obtained above, together with their corresponding primer design regions, are joined by 100 bases N (N represents any one of the four bases A, T, C and G) to generate a reference genome for primer design.
  • The generated reference genome for primer design is uploaded.
  • Under the "Add Hotspot" option, the start and end positions of each characteristic region in the generated reference genome for primer design are filled in.
  • Clicking the "Submit targets" button submits the design and yields the multiplex primer sequences for the characteristic regions of the target microbial group.
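The construction of the primer-design reference and of the hotspot coordinates can be sketched as below. The sketch assumes each characteristic region comes with a left and a right flank containing its primer design regions; the function name, the tuple layout, and the 0-based half-open coordinates are assumptions, and the design tool may expect a different coordinate convention.

```python
SPACER = "N" * 100  # 100 bases N between consecutive segments, as in the text


def build_design_reference(amplicons: list[tuple[str, str, str]]):
    """Join (left_flank, region, right_flank) segments with N spacers.

    Returns the concatenated reference and, for each characteristic region,
    its (start, end) position in that reference (0-based, end exclusive).
    """
    parts, hotspots, offset = [], [], 0
    for idx, (left, region, right) in enumerate(amplicons):
        if idx > 0:
            parts.append(SPACER)
            offset += len(SPACER)
        start = offset + len(left)
        hotspots.append((start, start + len(region)))
        segment = left + region + right
        parts.append(segment)
        offset += len(segment)
    return "".join(parts), hotspots


# Toy segments for illustration only.
reference, hotspots = build_design_reference(
    [("AAAA", "CC", "GGGG"), ("TTT", "AC", "TTT")]
)
print(len(reference), hotspots)
```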
  • The designed multiplex primers are aligned against the target microbial group using BLASTN (Basic Local Alignment Search Tool, version 2.2.26), and the primer pairs in which at least one of the forward and reverse primers is specific are selected.
  • The selected primers are then aligned by BLASTN against the genomes of non-target organisms to check whether they can amplify those genomes.
  • The non-target organisms are all organisms other than the target microbial group, and their genomes are taken from NCBI's NT/NR library.
  • The amplification product of each non-target organism is compared with the characteristic region of each target microbial group. Over all alignments, the minimum number of differential bases is the discrimination degree m1, and the characteristic regions of the target microbial group with m1 ≥ 3 are retained.
  • Characteristic regions that contain simple repeat sequences or are present in multiple copies on the genome are then removed. From the retained characteristic regions, the characteristic regions of the target microbial group are further screened and the characteristic regions of the target microorganism are selected.
  • The characteristic regions of the target microbial group are screened as follows. The characteristic regions are aligned by BLASTN against the reference genomes of non-target organisms, and those with more than 95% homology to a non-target organism are removed. Each remaining characteristic region is then aligned between the target microorganism and the other microorganisms in the target microbial group using the software MUSCLE (version V3.6) with its default parameters, and the minimum number of differential bases is taken as the m2 value. Characteristic regions with m2 ≥ 2 are retained, and from these, two or more characteristic regions with larger discrimination degrees m1 and m2 are selected to serve as both the characteristic regions of the target microbial group and the characteristic regions of the target microorganism. The corresponding multiplex amplification primers serve as both the first multiplex amplification primers and the second multiplex amplification primers.
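As a rough sketch of this screening step, the m2 value and the threshold filter can be written as follows. The helper names and the dictionary layout are hypothetical, the sequences are assumed to be already aligned to equal length, and ranking by m1 and then m2 is only one possible reading of "larger discrimination degrees", since the text does not give an exact ranking rule.

```python
def n_diff(a: str, b: str) -> int:
    """Number of differing bases between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def m2_value(target_region: str, other_regions) -> int:
    """Minimum number of differential bases between the target and any other group member."""
    return min(n_diff(target_region, other) for other in other_regions)


def select_regions(candidates, m1_min: int = 3, m2_min: int = 2, keep: int = 2):
    """Keep regions with m1 >= m1_min and m2 >= m2_min, then take the `keep` best.

    Sorting by (m1, m2) in descending order is an assumption of this sketch.
    """
    passing = [c for c in candidates if c["m1"] >= m1_min and c["m2"] >= m2_min]
    passing.sort(key=lambda c: (c["m1"], c["m2"]), reverse=True)
    return passing[:keep]
```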
  • The characteristic regions of the reference microorganism and the corresponding third multiplex amplification primers are obtained in much the same way as the characteristic regions of the target microbial group. Only the differences are described below.
  • The reference microorganism genome was also aligned with the query sequence (reference sequence) using the software Megablast (version 2.2.26); the query sequence is the genome sequence of Agrobacterium tumefaciens K84. After the alignment is completed, the single-copy sequences of the reference microorganism genome, which appear only once in the query sequence, are obtained.
  • The single-copy sequences were aligned with NCBI's NT/NR library, and those with homologous sequences in non-target organisms were discarded. Non-overlapping regions 110 bp in length were randomly selected from the remaining single-copy sequences as characteristic regions, and the sequences on both sides were used as primer design regions.
  • Multiplex amplification primers for the characteristic regions were designed on the online multiplex primer design page https://ampliseq.com, and the characteristic regions for which multiplex primers were successfully designed were screened further.
  • The specific method is as follows: characteristic regions that contain simple repeat sequences or are present in multiple copies in the genome are removed; the remaining characteristic regions are aligned by BLASTN against the reference genomes of non-target organisms, and those with more than 95% homology to a non-target organism are removed. From the remaining characteristic regions, two or more are randomly selected as the characteristic regions of the reference microorganism, and the corresponding multiplex amplification primers are used as the third multiplex amplification primers.
  • Each multiplex primer has a corresponding amplification template sequence, which is the amplified region entered for that primer in the Add Hotspot option.
  • the amplification efficiency of each multiplex primer was tested according to the operating manual of StepOne Real-Time PCR (Part Number 4376784 Rev.
  • The retained first, second and third multiplex amplification primers obtained above were combined using the pooling procedure of the online multiplex primer design page https://ampliseq.com to obtain the mixed multiplex amplification primers, which were synthesized and supplied in liquid form.
  • The characteristic-region information finally obtained in this embodiment is shown in Table 1.
  • The start and end positions in Table 1 are the start and end positions of each characteristic region on the query sequence used as the reference genome.
  • Step 4: Add a reference microorganism to the sample to be tested to obtain a mixed sample. The specific method is as follows:
  • The reference microorganism is not present in the sample to be tested, so it can serve as an internal reference that is processed in parallel with the microorganisms in the sample and used to quantify the target microbial group and the target microorganism in the sample.
  • The amount of reference microorganism added is controlled so that about 10 ng of mixed-sample nucleic acid (DNA) can be extracted, which is enough to construct a high-throughput sequencing library normally, while not adding so much that the reference microorganism makes up too large a proportion and takes up an excessive share of the high-throughput sequencing data.
  • The mixed sample in the present embodiment is obtained as follows: 0.2 mL of a bacterial suspension of the reference microorganism at a concentration of 2 OD (OD is the maximum absorbance value of the bacterial suspension) is placed in a 1.5 mL centrifuge tube, vacuum freeze-dried, and then added to the sample to be tested and mixed, giving a mixed sample of the sample to be tested and the reference microorganism. The amount of reference microorganism added to the mixed sample was determined by hemocytometer counting, as shown in Table 2.
  • Step 5: Extract the nucleic acid of the mixed sample. The specific method is as follows:
  • If the nucleic acid content of the sample to be tested is too low, an exogenous nucleic acid that the mixed multiplex amplification primers cannot amplify may be added while the nucleic acid of the mixed sample is being extracted.
  • The added exogenous nucleic acid does not exist in nature and therefore does not interfere with microbial detection.
  • The External RNA Controls Consortium (ERCC) has designed and validated a set of nucleic acid sequences that are not found in nature and can be used as exogenous nucleic acids in the examples of the present invention.
  • The sequences can be found at https://tools.lifetechnologies.com/content/sfs/manuals/cms_095047.txt.
  • The amount of exogenous nucleic acid added is about 1 µg, which ensures that the nucleic acid of the mixed sample can be extracted normally.
  • In this embodiment the sample to be tested is blood, whose nucleic acid content is normal, so no exogenous nucleic acid needs to be added to the mixed sample.
  • The nucleic acid of the obtained mixed sample was extracted using a blood genomic DNA extraction kit (manufacturer: Tiangen Biochemical Technology (Beijing) Co., Ltd., product number: DP348) according to the method provided in the operation manual.
  • Step 6: Perform an amplification reaction using the mixed multiplex amplification primers and the nucleic acid of the mixed sample to obtain an amplification product. The specific method is as follows:
  • After multiplex PCR amplification of the nucleic acid of the mixed sample using the Library Construction Kit 2.0 (manufactured by Life Technologies, Cat. No. 4475345), a high-throughput sequencing library was constructed from the amplification product.
  • The kit comprises the following reagents: 5× Ion AmpliSeq™ HiFi Mix, FuPa reagent, conversion reagent, sequencing adapters and DNA ligase solution. Library construction was performed according to the kit's operating manual, "Ion AmpliSeq™ Library Preparation" (Publication number: MAN0006735, Version: A.0).
  • The multiplex PCR reaction system was as follows: 4 µl of 5× Ion AmpliSeq™ HiFi Mix, 4 µl of the synthesized mixed multiplex amplification primers, 10 ng of the extracted mixed-sample nucleic acid, and 11 µl of nuclease-free water.
  • The multiplex PCR program was as follows: 99 °C for 2 minutes; (99 °C for 15 seconds; 60 °C for 4 minutes) × 25 cycles; hold at 10 °C.
  • The excess primers in the multiplex PCR amplification product were digested with FuPa reagent, and the product was then phosphorylated.
  • The specific method was as follows: 2 µL of FuPa reagent was added to the multiplex PCR amplification product and mixed, and the following program was run on the PCR instrument: 50 °C for 10 minutes; 55 °C for 10 minutes; 60 °C for 10 minutes; hold at 10 °C. This gave mixture a, a solution containing the phosphorylated amplification product.
  • The phosphorylated amplification product was ligated to the sequencing adapters by adding 4 µL of the conversion reagent, 2 µL of the sequencing adapter solution and 2 µL of the DNA ligase to mixture a, mixing, and running the following program on the PCR instrument: 22 °C for 30 minutes; 72 °C for 10 minutes; hold at 10 °C. This gave mixture b.
  • Mixture b was purified by the standard ethanol precipitation method and dissolved in 10 µL of nuclease-free water. The dsDNA HS Assay Kit made by Invitrogen, USA (Cat. No.
  • The purified mixture b was diluted to 15 ng/ml to obtain a high-throughput sequencing library with a concentration of about 100 pM.
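The relation between the mass concentration and the molar concentration of the library can be checked with a short calculation. The sketch below uses the common approximation of about 660 g/mol per base pair of double-stranded DNA; the function name and the average library fragment length of 227 bp are hypothetical values chosen only to show that 15 ng/ml and roughly 100 pM are mutually consistent, since the text does not state the fragment length.

```python
def dsdna_picomolar(ng_per_ml: float, length_bp: float, g_per_mol_per_bp: float = 660.0) -> float:
    """Convert a dsDNA mass concentration (ng/mL) into a molar concentration (pM).

    Assumes an average molar mass of `g_per_mol_per_bp` per base pair (approximation).
    """
    grams_per_litre = ng_per_ml * 1e-6                      # 1 ng/mL equals 1e-6 g/L
    mol_per_litre = grams_per_litre / (length_bp * g_per_mol_per_bp)
    return mol_per_litre * 1e12                             # mol/L to pmol/L


# Hypothetical average fragment length (amplicon plus adapters) of 227 bp.
library_pm = dsdna_picomolar(15.0, 227)
```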
  • Step 7: Perform high-throughput sequencing using the amplification product to obtain high-throughput sequencing fragments. The specific method is as follows:
  • Before sequencing, the obtained high-throughput sequencing library was amplified by ePCR (emulsion PCR, emulsion polymerase chain reaction) using the Ion PI Template OT2 200 Kit v2 (manufactured by Invitrogen, USA, Cat. No. 4485146), following the kit's operating manual.
  • Based on the primers found in each sequencing fragment, the high-throughput sequencing fragments are aligned to the corresponding characteristic regions of the target microbial group, the target microorganism and the reference microorganism.
  • Sequencing fragments that fail to align or whose characteristic region is incomplete are removed; most fragments that fail to align are non-specific amplification products.
  • A fragment with an incomplete characteristic region is one in which the sequence from the start position to the end position of the characteristic region in Table 1 was not fully sequenced.
  • Step 8: Perform qualitative and quantitative analysis of the target microbial group and the target microorganism according to the high-throughput sequencing fragments. The specific method is as follows:
  • The basic principle of the qualitative and quantitative analysis is that the characteristic regions represent the target microbial group and the target microorganism: if sequencing fragments of a characteristic region are found, the target microbial group or target microorganism is present, and the number of sequencing fragments of the characteristic region reflects the amount of the target microbial group or target microorganism.
  • The embodiments of the present invention also calculate the reliability of the qualitative and quantitative results, which makes the conclusions more useful in practice.
  • To achieve reliable qualitative and quantitative detection of any microorganism, the embodiments of the present invention need to make the relationships between the parameters explicit.
  • The qualitative analysis method is as follows: each high-throughput sequencing fragment is compared with the characteristic region of each target microbial group. When the number of differential bases is ≤ n1, the alignment is successful and the fragment is assigned to that characteristic region, making it a characteristic sequencing fragment of the target microbial group. Here n1 is the maximum fault-tolerant base number of a characteristic sequencing fragment of the target microbial group.
  • The characteristic region of the target microorganism is compared with each homologous characteristic region of the target microbial group, and the differential bases in the characteristic region of the target microorganism are extracted to form the standard genotype of the target microorganism. The differential bases are all the bases in the characteristic region of the target microorganism that differ from those of any other microorganism in the target microbial group.
  • On each characteristic sequencing fragment of the target microbial group, the bases corresponding to the standard genotype are extracted to form the test genotype of the target microorganism. If the number of differential bases between the test genotype and the standard genotype is ≤ n2, where n2 is the maximum fault-tolerant base number of a characteristic sequencing fragment of the target microorganism, the high-throughput sequencing fragment containing the test genotype is a characteristic sequencing fragment of the target microorganism.
  • In this case the standard genotype and the test genotype each contain zero bases, so the number of differential bases between them is also zero, and the high-throughput sequencing fragment containing the test genotype of the target microorganism is determined to be a characteristic sequencing fragment of the target microorganism.
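The two-level assignment (first to the target microbial group with tolerance n1, then to the target microorganism with tolerance n2) can be illustrated with a minimal sketch. It assumes that every read has already been trimmed and aligned to the same length as the characteristic region, with no gaps; the function names and the three return labels are hypothetical.

```python
def n_diff(a: str, b: str) -> int:
    """Number of differing bases between two aligned sequences of equal length."""
    return sum(x != y for x, y in zip(a, b))


def standard_genotype_positions(target_region: str, group_regions):
    """Positions where the target microorganism differs from any other group member."""
    return [
        i for i in range(len(target_region))
        if any(region[i] != target_region[i] for region in group_regions)
    ]


def classify_read(read: str, group_regions, target_region: str, n1: int, n2: int) -> str:
    """Return 'target', 'group' or 'none' for one aligned read.

    `group_regions` holds the homologous characteristic region of every member of the
    target microbial group (including the target microorganism itself).
    """
    hit = any(len(read) == len(r) and n_diff(read, r) <= n1 for r in group_regions)
    if not hit:
        return "none"
    positions = standard_genotype_positions(target_region, group_regions)
    standard = [target_region[i] for i in positions]   # standard genotype
    test = [read[i] for i in positions]                # test genotype taken from the read
    mismatches = sum(s != t for s, t in zip(standard, test))
    return "target" if mismatches <= n2 else "group"
```

When the group has a single member, `positions` is empty, both genotypes have zero bases and every group-level hit is also labelled 'target'.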
  • The numbers of characteristic sequencing fragments for the characteristic regions of the target microbial group and of the target microorganism were obtained; the results are shown in Table 1.
  • The values of n1 and n2 are shown in Table 2, and the calculation process is as follows.
  • n1 is chosen so that P1 ≤ α1 and P3 ≤ α3, where P1 is the probability of a false positive, in which a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microbial group is misjudged as one; P3 is the probability of a false negative, in which a characteristic sequencing fragment of the target microbial group is misjudged as not being one; α1 and α3 are judgment thresholds.
  • n2 is chosen so that P2 ≤ α2 and P4 ≤ α4, where P2 is the probability of a false positive, in which a fragment that is not a characteristic sequencing fragment of the target microorganism is misjudged as one; P4 is the probability of a false negative, in which a characteristic sequencing fragment of the target microorganism is misjudged as not being one; α2 and α4 are judgment thresholds. The magnitudes of the various thresholds in the embodiments of the present invention are determined by actual needs. For example, when a pathogen is extremely harmful and a missed detection (false negative) would have serious consequences, false negatives must be tightly controlled and the values of α2 and α4 are set low.
  • The values of α1 and α3 are 0.01%, that is, about 1 false positive or false negative in every 10,000 characteristic sequences, so the accuracy is very high.
  • This high accuracy can be achieved because the m1 value of the characteristic sequences is large, so they are easily distinguished from other non-target organisms, which keeps both the false positive rate and the false negative rate under control.
  • The values of α2 and α4 are 0.5%, that is, about 5 false positives or false negatives in every 1,000 characteristic sequences, which is also a high accuracy.
  • m1 is the degree of discrimination, specifically referring to the degree of discrimination corresponding to the feature region of the target microbial group used for calculating S1, in this embodiment, m1
  • m2 is the minimum number of differential bases between the characteristic region of the target microorganism and those of the other microorganisms in the target microbial group, specifically the m2 value of the characteristic region of the target microorganism used for calculating S3.
  • The values of m2 are shown in Table 1 and Table 2. L1 is the length of the characteristic region of the target microbial group; in the present example, the value of L1 is shown in Table 2. L2 is the length of the standard genotype of the target microorganism; in the present example, the value of L2 is shown in Table 2. E is the base error rate, which consists of the sequencing error rate E1 and the natural mutation rate E2.
  • The sequencing error rate of the PROTON high-throughput sequencer is E1 ≤ 1%.
  • The variation rate between the reference genomes of microbial physiological races (such as the P1-P6 bacterial leaf blight races) is generally less than 0.5%, and the natural mutation rate is lower than the variation rate between races.
  • The natural mutation rate is therefore E2 < 0.5%; to make the method of the present invention more widely applicable, E2 ≤ 1% is assumed. In this embodiment E ≤ 2%, and to make the probability guarantees of the qualitative and quantitative conclusions more reliable, the maximum value of E, 2%, is used in the calculations.
  • The value of n1 is increased stepwise from 0, and the values of P1 and P3 are calculated at each step.
  • n1 = 13 (see Table 2).
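The stepwise search for n1 can be sketched with the formulas given in the claims, P1 = BINOM.DIST(n1, m1, 1-E, TRUE) and P3 = 1 - BINOM.DIST(n1, L1, E, TRUE). The helper names below are hypothetical, and returning the first n1 that meets both thresholds is an assumption about how the search terminates; any numeric inputs used with the sketch are illustrative and are not the Table 2 values.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); mirrors BINOM.DIST(k, n, p, TRUE)."""
    k = min(k, n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def false_positive_p1(n1: int, m1: int, E: float) -> float:
    """P1 = BINOM.DIST(n1, m1, 1-E, TRUE)."""
    return binom_cdf(n1, m1, 1 - E)


def false_negative_p3(n1: int, L1: int, E: float) -> float:
    """P3 = 1 - BINOM.DIST(n1, L1, E, TRUE)."""
    return 1 - binom_cdf(n1, L1, E)


def smallest_n1(m1: int, L1: int, E: float, alpha1: float, alpha3: float):
    """Increase n1 from 0 and return the first value with P1 <= alpha1 and P3 <= alpha3."""
    for n1 in range(m1 + 1):
        if false_positive_p1(n1, m1, E) <= alpha1 and false_negative_p3(n1, L1, E) <= alpha3:
            return n1
    return None  # no tolerance satisfies both thresholds for these inputs
```

The same search applies to n2 by replacing m1 with m2, L1 with L2 and the thresholds with α2 and α4.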
  • The reference microorganism is treated as a target microbial group containing only one target microorganism, and the characteristic sequencing fragments of the target microorganism calculated in this way are the characteristic sequencing fragments of the reference microorganism.
  • The numbers of characteristic sequencing fragments for the characteristic regions of the reference microorganism are shown in Tables 1 and 2.
  • If the probability P5 that characteristic sequencing fragments of the target microbial group exist satisfies P5 ≥ α5, the target microbial group is judged to be present in the sample to be tested; if P5 < α5, the target microbial group is judged to be absent from the sample to be tested.
  • α5 is a probability guarantee; in the present embodiment, the value of α5 is 99.99%.
  • P5 = 1 - BINOM.DIST(S1, S1, P1, FALSE),
  • where S1 is the median of the numbers of characteristic sequencing fragments of the target microbial group over all characteristic regions of the target microbial group. In this embodiment, the number of characteristic sequencing fragments of the second characteristic region of the target microbial group is that median, so the value of S1 is as shown in Table 1 and Table 2. Substituting the values of S1 and P1 into the formula for P5 gives P5 ≥ α5, so the target microbial group is judged to be present in the sample to be tested. FALSE is a parameter value, and the BINOM.DIST function returns the probability of the binomial distribution.
  • If the probability P6 that characteristic sequencing fragments of the target microorganism exist satisfies P6 ≥ α6, the target microorganism is judged to be present in the sample to be tested.
  • P6 = 1 - BINOM.DIST(S3, S3, P2, FALSE), where the BINOM.DIST function returns the probability of the binomial distribution and S3 is the median of the numbers of characteristic sequencing fragments of the target microorganism over all characteristic regions of the target microorganism. In the present embodiment, the number of characteristic sequencing fragments of the second characteristic region of the target microorganism is that median, and the corresponding value of S3 is shown in Table 1 and Table 2.
  • Substituting the values of S3 and P2 into the formula for P6 gives P6 ≥ α6, so the target microorganism is judged to be present in the sample to be tested in the present embodiment.
  • Both α5 and α6 are set according to actual needs and may be the same or different. When a microorganism must be strictly controlled, larger values of α5 and α6 are used; otherwise smaller values are used. All α values in the embodiments of the present invention follow this principle.
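Because BINOM.DIST(S, S, P, FALSE) is the binomial probability of S successes in S trials, it reduces to P raised to the power S, so P5 and P6 have a simple closed form. The sketch below shows the presence test under that reading; the function names are hypothetical.

```python
def presence_probability(s: int, p_false_positive: float) -> float:
    """1 - BINOM.DIST(s, s, p, FALSE), i.e. 1 - p**s.

    This is the probability that the s characteristic sequencing fragments
    are not all false positives.
    """
    return 1.0 - p_false_positive ** s


def is_present(s: int, p_false_positive: float, alpha: float) -> bool:
    """Judge presence when the probability reaches the probability guarantee alpha."""
    return presence_probability(s, p_false_positive) >= alpha
```

With S1 and P1 this gives the decision for the target microbial group (threshold α5); with S3 and P2 it gives the decision for the target microorganism (threshold α6).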
  • Mr is the amount of the reference microorganism added to the sample to be tested; the value of Mr is shown in Table 2.
  • S2 is the median of the numbers of characteristic sequencing fragments of the reference microorganism over all characteristic regions of the reference microorganism. In the present embodiment, the number of characteristic sequencing fragments of the second characteristic region of the reference microorganism is that median.
  • the confidence interval for the amount of the target microbial group is [M11, M12], and M11 and M12 are the lower and upper limits of the confidence interval of the M1 value, respectively.
  • M11 = M1 × (1 - S4/S1)
  • M12 = M1 × (1 + S5/S1)
  • S4 is the number of false-positive characteristic sequencing fragments of the target microbial group:
  • S4 = CRITBINOM(nS, P1, α9)
  • S5 is the number of false-negative characteristic sequencing fragments of the target microbial group:
  • S5 = CRITBINOM(S1, P3, α9)
  • α9 is a probability guarantee; in this embodiment, α9 is 99.50%.
  • The CRITBINOM function returns the smallest value for which the cumulative binomial distribution is greater than or equal to the criterion value.
  • nS is the number of high-throughput sequencing fragments from non-characteristic regions amplified by the multiplex amplification primers of the characteristic region of the target microbial group used for calculating S1.
  • In this embodiment, nS is the number of high-throughput sequencing fragments from non-characteristic regions produced by the multiplex amplification primers of the second characteristic region of the target microbial group.
  • the values of nS are shown in Table 2.
  • Substituting the values of nS and P1 into the formula for S4 gives the value of S4, and substituting the values of S1 and P3 of the present embodiment into the formula for S5 gives the value of S5.
  • the values of M11 and M12 in the present embodiment are calculated, and the confidence interval of M1 is obtained, that is, the confidence interval of the amount of the target microbial group is [2871226, 2871455].
  • The value of α10 is 99.50%.
  • The values of S1 and S3 of the present embodiment, together with the values of P2 and P4, are substituted into the formulas for S6 and S7 to calculate S6 and S7.
  • The values of S6, S7, M1 and S3 are substituted into the formulas for M21 and M22 to calculate M21 and M22, giving the confidence interval of the amount of the target microorganism as [2534067, 2539614].
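The quantification of the target microbial group can be sketched as follows, using M1 = Mr × S1/S2 from the claims together with M11 = M1 × (1 - S4/S1) and M12 = M1 × (1 + S5/S1). The `critbinom` helper is a hand-written stand-in for the spreadsheet function, built from its stated definition (the smallest value at which the cumulative binomial distribution reaches the criterion); the function names are hypothetical and any inputs used with it are made-up numbers, not the Table 2 values.

```python
def critbinom(trials: int, p: float, alpha: float) -> int:
    """Smallest k with P(X <= k) >= alpha for X ~ Binomial(trials, p)."""
    if p >= 1.0:
        return trials
    pmf = (1.0 - p) ** trials          # P(X = 0)
    cumulative = pmf
    k = 0
    while cumulative < alpha and k < trials:
        # Recurrence: P(X = k+1) = P(X = k) * (trials - k) / (k + 1) * p / (1 - p)
        pmf *= (trials - k) / (k + 1) * p / (1.0 - p)
        k += 1
        cumulative += pmf
    return k


def quantify_group(Mr: float, S1: int, S2: int, nS: int, P1: float, P3: float, alpha9: float):
    """Return (M1, M11, M12): the estimated amount and its confidence interval."""
    M1 = Mr * S1 / S2
    S4 = critbinom(nS, P1, alpha9)     # false-positive characteristic fragments
    S5 = critbinom(S1, P3, alpha9)     # false-negative characteristic fragments
    return M1, M1 * (1 - S4 / S1), M1 * (1 + S5 / S1)
```

The interval for the target microorganism follows the same pattern with M2 = M1 × S3/S1, S6, S7 and α10.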
  • Table 2 is the qualitative and quantitative analysis parameters of microorganisms in this example and its calculation principle
  • Example 2: Identification of microorganisms in human feces
  • The sample to be tested in this embodiment is human feces taken from a patient diagnosed by a doctor as having an intestinal disease.
  • Detecting the microorganisms in the feces provides a basis for the treatment plan.
  • This embodiment is similar to the first embodiment; methods, parameters and results that are not mentioned are the same as in the first embodiment and are not repeated.
  • Step 1: Determine the target microbial group, the target microorganism and the non-target organisms in the sample to be tested, and a reference microorganism not present in the sample to be tested.
  • The purpose of this example is to identify Salmonella enterica (Latin name: Salmonella enterica) in the sample to be tested. At NCBI (National Center for Biotechnology Information), a total of 33 physiological races of Salmonella enterica with known reference genomes were listed (as of June 2, 2015); see http://www.ncbi.nlm.nih.gov/genome/genomegroups/152. These physiological races constitute the target microbial group of this example. Among them, Salmonella enterica subsp. houtenae str. ATCC BAA-1581 is highly pathogenic and serves as the target microorganism of the present example.
  • Step 2: According to the reference genome sequence of the target microbial group, the reference genome sequence of the target microorganism, the reference genome sequence of the reference microorganism and the reference genome sequence of the non-target organisms, obtain the characteristic regions of the target microbial group, the characteristic regions of the target microorganism and the characteristic regions of the reference microorganism.
  • The characteristic-region information finally obtained in this embodiment is shown in Table 3.
  • Table 3 Primer related information provided in the second embodiment of the present invention
  • Step 4: Add a reference microorganism to the sample to be tested to obtain a mixed sample. The specific method is as follows:
  • The mixed sample of the present embodiment is obtained as follows: 0.2 mL of a bacterial suspension of the reference microorganism at a concentration of 2 OD (OD is the maximum absorbance value of the bacterial suspension) is placed in a 1.5 mL centrifuge tube, vacuum freeze-dried, added to 100 mg of the sample to be tested and mixed, giving a mixed sample of the sample to be tested and the reference microorganism. The amount of reference microorganism added to the mixed sample was determined by hemocytometer counting, as shown in Table 4.
  • Step 5: Extract the nucleic acid of the mixed sample. The specific method is as follows:
  • The sample to be tested is feces, whose nucleic acid content is low. Therefore an exogenous nucleic acid, namely 1 µg of the ERCC-00014 gene designed by the External RNA Controls Consortium, is added to the mixed sample.
  • The nucleic acid of the obtained mixed sample is extracted with the DNA kit according to the method provided in the operation manual.
  • Step 6: Perform an amplification reaction using the mixed multiplex amplification primers and the nucleic acid of the mixed sample to obtain an amplification product. The specific method is the same as in the first embodiment.
  • Step 7: Perform high-throughput sequencing using the amplification product to obtain high-throughput sequencing fragments. The specific method is the same as in the first embodiment.
  • Step 8: Perform qualitative and quantitative analysis of the target microbial group and the target microorganism according to the high-throughput sequencing fragments. The specific method is as follows:
  • the specific parameters of the embodiment of the present invention and the calculation principle thereof are shown in Table 4.
  • Table 4 is the microbial qualitative and quantitative analysis parameters of this example and its calculation principle
  • The detection method provided by the embodiments of the present invention can be applied to various aspects of medicine.
  • Microbial nucleic acid separation methods differ slightly between sample types.
  • For example, blood and feces use different genomic DNA extraction kits, each of which must be operated according to its own manual.
  • Apart from the nucleic acid separation method, the other steps are substantially the same, so the detection method provided by the embodiments of the present invention is highly versatile.
  • The invention overcomes many problems of existing methods: only a few microorganisms can be detected at a time; microorganisms can only be distinguished to the species level; quantification is inaccurate; detection results carry no probability guarantee; pre-culture is required and the detection period is long; some microorganisms cannot be cultured and therefore cannot be detected; and differences in culturability between microorganisms cause quantitative distortion and coarse quantification. It provides a comprehensive, rapid and precise qualitative and quantitative detection method for human microorganisms, giving fast, accurate and comprehensive data support for medical diagnosis.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Toxicology (AREA)
  • Virology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

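The invention discloses a method for the qualitative and quantitative detection of human microorganisms, comprising the following steps: determining the target microbial group, the target microorganism and the non-target organisms in a sample to be tested, as well as a reference microorganism not present in the sample to be tested; designing the characteristic regions of the target microbial group and the target microorganism; designing multiplex amplification primers for the characteristic regions; adding the reference microorganism and an exogenous nucleic acid to the sample to be tested and then extracting the nucleic acid of the microorganisms in the sample; amplifying the microbial nucleic acid with the designed multiplex primers to obtain characteristic sequencing fragments; and using the characteristic sequencing fragments to qualitatively and quantitatively analyze the microorganisms in the sample to be tested.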

Description

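A method for the qualitative and quantitative detection of human microorganisms
Technical Field
The present invention relates to the field of biotechnology, and in particular to a method for the qualitative and quantitative detection of human microorganisms.
Background
Human microorganisms are an important basis for diagnosing human disease, so accurate qualitative and quantitative detection of human microorganisms is essential.
Existing techniques for the qualitative and quantitative detection of human microorganisms include morphological counting, chip detection, 16S rRNA sequencing, metagenomic sequencing and real-time quantitative PCR (Polymerase Chain Reaction). Morphological counting requires pre-culture of the microorganisms and is time-consuming; unculturable microorganisms cannot be detected; only one microorganism can be detected at a time, so throughput is low; the amount sampled during counting is limited; and the results are coarse and cannot distinguish taxonomic units below the species level. Chip detection requires a large amount of DNA from the sample to be tested, requires pre-culture and enrichment of the microorganisms, gives inaccurate results and cannot be used for quantitative detection. 16S rRNA sequencing cannot distinguish taxonomic units below the species level. Metagenomic sequencing has limited depth, so its quantitative accuracy for low-abundance microorganisms is very poor. Real-time quantitative PCR can detect only one microorganism at a time, so throughput is low. In addition, a shortcoming shared by the existing methods is that the reliability of the qualitative and quantitative results cannot be calculated, which makes the conclusions of little practical use. These technical shortcomings lead to problems such as delayed diagnosis, imprecise diagnosis and misdiagnosis of human disease.
Summary of the Invention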
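To solve the problem of inaccurate qualitative and quantitative detection of microorganisms in the prior art, embodiments of the present invention provide a method for the qualitative and quantitative detection of human microorganisms. The technical solution is as follows:
An embodiment of the present invention provides a method for the qualitative and quantitative detection of human microorganisms, the method comprising:
determining the target microbial group, the target microorganism and the non-target organisms in a sample to be tested, as well as a reference microorganism not present in the sample to be tested, the sample to be tested being human tissue, body fluids and excreta;
obtaining the characteristic regions of the target microbial group, the characteristic regions of the target microorganism and the characteristic regions of the reference microorganism according to the reference genome sequences of the target microbial group, the target microorganism, the reference microorganism and the non-target organisms;
preparing first multiplex amplification primers that amplify the characteristic regions of the target microbial group, second multiplex amplification primers that amplify the characteristic regions of the target microorganism and third multiplex amplification primers that amplify the characteristic regions of the reference microorganism, and mixing the first, second and third multiplex amplification primers to obtain mixed multiplex amplification primers;
adding the reference microorganism to the sample to be tested to obtain a mixed sample;
extracting the nucleic acid of the mixed sample;
performing an amplification reaction using the mixed multiplex amplification primers and the nucleic acid of the mixed sample to obtain an amplification product;
performing high-throughput sequencing using the amplification product to obtain high-throughput sequencing fragments;
performing qualitative and quantitative analysis of the target microbial group and the target microorganism according to the high-throughput sequencing fragments.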
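Specifically, the number of target microbial groups is ≥ 1, and each target microbial group includes ≥ 0 target microorganisms;
the target microorganism is at least one of bacteria, viruses, fungi, actinomycetes, rickettsiae, mycoplasmas, chlamydiae, spirochetes and protozoa;
the reference microorganism is at least one of bacteria, viruses, fungi, actinomycetes, rickettsiae, mycoplasmas, chlamydiae, spirochetes and protozoa.
Specifically, the method of determining the non-target organisms in the sample to be tested includes: defining the non-target organisms as all organisms other than the target microbial group; if the characteristic regions of the target microbial group can be obtained, the non-target organisms are all organisms other than the target microbial group; if the characteristic regions of the target microbial group cannot be obtained, the non-target organisms are the organisms in the mixed sample other than the target microbial group.
Specifically, a characteristic region of the target microbial group is a nucleic acid sequence on the reference genome of a microorganism within the target microbial group; the sequences on both sides of the characteristic region of the target microbial group are single-copy sequences in the reference genome; the sequences on both sides of the characteristic region of the target microbial group are conserved among the different microorganisms within the target microbial group; and the discrimination degree of the characteristic region of the target microbial group is ≥ 3;
a characteristic region of the target microorganism is homologous to a characteristic region of the target microbial group; the m2 value of the characteristic region of the target microorganism is ≥ 2, where the m2 value is the minimum number of differential bases between the characteristic region of the target microorganism and those of the other microorganisms within the target microbial group;
a characteristic region of the reference microorganism is a nucleic acid sequence on the reference genome of the reference microorganism; the sequences on both sides of the characteristic region of the reference microorganism are single-copy sequences in the reference genome of the reference microorganism; and the sequences on both sides of the characteristic region of the reference microorganism have no homology in any organism other than the reference microorganism.
Further, the discrimination degree is the minimum number of differential bases between any characteristic region of the target microbial group and any non-characteristic region amplified by the same mixed multiplex amplification primers, where a non-characteristic region is an amplification product of the mixed multiplex amplification primers, with the nucleic acid of the mixed sample as template, that is not a characteristic region of the target microbial group; if there is no non-characteristic region, the discrimination degree = 3 × L1/4, where L1 is the length of the nucleic acid sequence of the characteristic region of the target microbial group.
Specifically, when extracting the nucleic acid of the mixed sample, if the nucleic acid content of the sample to be tested is too low, an exogenous nucleic acid that the mixed multiplex amplification primers cannot amplify is added during the extraction.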
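Specifically, the qualitative analysis of the target microbial group and the target microorganism is performed as follows:
the high-throughput sequencing fragments are aligned with each characteristic region of the target microbial group; when the number of differing bases is ≤n1, the alignment is successful and the corresponding high-throughput sequencing fragment is assigned to that characteristic region of the target microbial group, where n1 is the maximum number of erroneous bases tolerated in a characteristic sequencing fragment of the target microbial group; if a fragment aligns successfully to ≥1 characteristic region of the target microbial group, the high-throughput sequencing fragment is judged to be a characteristic sequencing fragment of the target microbial group;
the characteristic region of the target microorganism is aligned with each homologous characteristic region of the target microbial group, and the differing bases in the characteristic region of the target microorganism are extracted to form the standard genotype of the target microorganism; on each characteristic sequencing fragment of the target microbial group, the bases at the positions of the standard genotype are extracted to form the test genotype of the target microorganism; if the number of differing bases between the test genotype and the standard genotype is ≤n2, where n2 is the maximum number of erroneous bases tolerated in a characteristic sequencing fragment of the target microorganism, the high-throughput sequencing fragment carrying the test genotype is a characteristic sequencing fragment of the target microorganism;
the reference microorganism is treated as a target microbial group containing only one target microorganism, and the characteristic sequencing fragments of the target microorganism obtained in this way are the characteristic sequencing fragments of the reference microorganism;
if the probability P5 that characteristic sequencing fragments of the target microbial group are present satisfies P5≥α5, the target microbial group is judged to be present in the sample, where α5 is the probability guarantee; if P5<α5, the target microbial group is judged to be absent from the sample;
if the probability P6 that characteristic sequencing fragments of the target microorganism are present satisfies P6≥α6, the target microorganism is judged to be present in the sample, where α6 is the probability guarantee; if P6<α6, the target microorganism is judged to be absent from the sample;
n1 is chosen such that P1≤α1 and P3≤α3, where P1 is the false positive probability that a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microbial group is misjudged as one; P3 is the false negative probability that a characteristic sequencing fragment of the target microbial group is misjudged as not being one; and α1 and α3 are decision thresholds;
n2 is chosen such that P2≤α2 and P4≤α4, where P2 is the false positive probability that a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microorganism is misjudged as one; P4 is the false negative probability that a characteristic sequencing fragment of the target microorganism is misjudged as not being one; and α2 and α4 are decision thresholds;
P5=1-BINOM.DIST(S1,S1,P1,FALSE) and P6=1-BINOM.DIST(S3,S3,P2,FALSE), where S1 is the median, over all characteristic regions of the target microbial group, of the number of characteristic sequencing fragments of the target microbial group; S3 is the median, over all characteristic regions of the target microorganism, of the number of characteristic sequencing fragments of the target microorganism; FALSE is a parameter value; and the BINOM.DIST function returns the individual-term binomial distribution probability.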
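Further, the quantitative analysis of the target microbial group and the target microorganism is performed as follows:
the amount of the target microbial group is M1=Mr×S1/S2, with confidence interval [M11,M12], where Mr is the amount of the reference microorganism added to the sample; S2 is the median, over all characteristic regions of the reference microorganism, of the number of characteristic sequencing fragments of the reference microorganism; and M11 and M12 are the lower and upper limits of the confidence interval of M1;
the amount of the target microorganism is M2=M1×S3/S1, with confidence interval [M21,M22], where M21 and M22 are the lower and upper limits of the confidence interval of M2;
M11=M1×(1-S4/S1), M12=M1×(1+S5/S1), M21=M2×(1-S6/S3), M22=M2×(1+S7/S3); where S4 is the number of false positive characteristic sequencing fragments of the target microbial group and S4=CRITBINOM(nS,P1,α9), nS being the number of high-throughput sequencing fragments of non-characteristic regions amplified by the multiplex amplification primers of the characteristic region of the target microbial group used to calculate S1; S5 is the number of false negative characteristic sequencing fragments of the target microbial group and S5=CRITBINOM(S1,P3,α9), where α9 is the probability guarantee; S6 is the number of false positive characteristic sequencing fragments of the target microorganism and S6=CRITBINOM(S1,P2,α10); S7 is the number of false negative characteristic sequencing fragments of the target microorganism and S7=CRITBINOM(S3,P4,α10), where α10 is the probability guarantee; the CRITBINOM function returns the smallest value for which the cumulative binomial distribution is greater than or equal to the criterion value.
Further, P1=BINOM.DIST(n1,m1,1-E,TRUE), P2=BINOM.DIST(n2,m2,1-E,TRUE), P3=1-BINOM.DIST(n1,L1,E,TRUE) and P4=1-BINOM.DIST(n2,L2,E,TRUE), where m1 is the discrimination degree; m2 is the minimum number of differing bases between the characteristic region of the target microorganism and the other microorganisms of the target microbial group; L1 is the length of the characteristic region of the target microbial group; L2 is the length of the standard genotype of the target microorganism; and E is the base error rate.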
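The beneficial effects of the technical solution provided by the embodiments of the present invention are as follows: the method does not require pre-culture or propagation of the microorganisms and is fast; it can detect multiple microorganisms simultaneously, so throughput is high; the amount sampled for counting is large; the results are fine-grained and can distinguish taxonomic units; it does not require large amounts of DNA and avoids enrichment culture; the detection results are noise-free and accurate; quantification of low-abundance microorganisms is accurate; the qualitative and quantitative results are accurate, high in resolution and sensitivity, and backed by a probability guarantee; and the procedure is simple, fast and standardized. The method provided by the present invention helps blood diseases to be diagnosed promptly and precisely.
Detailed Description
To make the objectives, technical solutions and advantages of the present invention clearer, embodiments of the present invention are described in further detail below. Reagents not otherwise specified in the present invention are common commercially available reagents that can be purchased from most biotechnology companies with almost no difference in performance.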
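Embodiment 1: Identification of microorganisms in human blood
The sample to be tested may be human tissue, body fluid or excreta; blood microorganisms are a basis for the diagnosis and treatment of human diseases. The sample in this embodiment is human blood taken from a patient diagnosed by a physician with bacteremia; the microorganisms in the blood are detected to inform the treatment plan.
Step 1: Determine the target microbial group, the target microorganism and the non-target organisms in the sample to be tested, as well as a reference microorganism that is not present in the sample. The specific method is as follows:
The number of target microbial groups is ≥1, and each target microbial group includes ≥0 target microorganisms; the target microorganism may be at least one of bacteria, viruses, fungi, actinomycetes, rickettsiae, mycoplasmas, chlamydiae, spirochetes and protozoa. The purpose of this embodiment is to identify Pseudomonas aeruginosa in the sample. In NCBI (National Center for Biotechnology Information), 30 physiological races of Pseudomonas aeruginosa had known reference genomes as of June 2, 2015 (see http://www.ncbi.nlm.nih.gov/genome/genomegroups/187); these races constitute the target microbial group of this embodiment. Among them, Pseudomonas aeruginosa PA7 is relatively highly pathogenic and serves as the target microorganism of this embodiment.
The reference microorganism may be at least one of bacteria, viruses, fungi, actinomycetes, rickettsiae, mycoplasmas, chlamydiae, spirochetes and protozoa. The reference microorganism is not present in the sample. Its role is to provide a reference for quantifying the target microbial group and the target microorganism in the sample. Because Agrobacterium tumefaciens lives in plant roots, it is not present in the sample; therefore, Agrobacterium tumefaciens K84 is selected as the reference microorganism in this embodiment.
Specifically, the method for determining the non-target organisms in the sample comprises: first defining the non-target organisms as all organisms other than the target microbial group; if characteristic regions of the target microbial group can be obtained, the non-target organisms are all organisms other than the target microbial group, where "all organisms" means all organisms that have a reference genome, which is the strictest standard for non-target organisms. In this embodiment, when the non-target organisms are defined as all known organisms other than the target microbial group, characteristic regions of the target microbial group can be found (the procedure for obtaining them is described below and the results are given in Table 1); therefore, the non-target organisms in this embodiment are the set of all organisms other than the target microbial group.
If characteristic regions of the target microbial group cannot be obtained when the non-target organisms are defined as all organisms other than the target microbial group, the non-target organisms are instead the organisms in the mixed sample other than the target microbial group. This narrows the range of non-target organisms and increases the chance of finding characteristic regions of the target microbial group. The organisms in the mixed sample other than the target microbial group can be determined from experience. For example, in this embodiment the mixed sample consists of blood and the reference microorganism, so it cannot contain plant material or microorganisms that are obligate parasites of plants; therefore, if characteristic regions of the target microorganism could not be obtained with the non-target organisms defined as all known organisms other than the target microbial group, the non-target organisms could be defined as the set of organisms other than the target microorganism, plants and obligate plant-parasitic microorganisms.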
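Step 2: From the reference genome sequences of the target microbial group, the target microorganism, the reference microorganism and the non-target organisms, obtain the characteristic regions of the target microbial group, of the target microorganism and of the reference microorganism. The specific method is as follows:
The characteristic region of the target microbial group is a nucleic acid sequence on the reference genome of a microorganism within the target microbial group; the sequences flanking the characteristic region are single-copy sequences in the reference genome; the flanking sequences are conserved among the different microorganisms within the target microbial group; and the discrimination degree of the characteristic region is ≥3. A non-characteristic region is an amplification product obtained with the mixed multiplex amplification primers on the nucleic acid of the mixed sample as template that is not a characteristic region of the target microbial group. The discrimination degree is the minimum number of differing bases between any characteristic region of the target microbial group and any non-characteristic region amplified by the same mixed multiplex amplification primers; if there is no non-characteristic region, the discrimination degree = 3×L1/4, where L1 is the length of the nucleic acid sequence of the characteristic region of the target microbial group.
Specifically, the characteristic region of the target microbial group represents the group: the presence of the characteristic region indicates the presence of the group, and the number of sequencing fragments of the characteristic region represents the amount of the group. Ideally, the multiplex primers for the characteristic regions of the target microbial group amplify only those regions and cannot amplify non-target organisms. This requires that the sequences flanking the characteristic region, that is, the primer design regions, have no homology in non-target organisms; non-target organisms then cannot be amplified and cannot produce non-characteristic regions. In that case a characteristic region and a non-characteristic region can share bases only by chance; since there are four bases, the probabilities that a position is identical or different are 1/4 and 3/4 respectively, so the discrimination degree is 3×L1/4. Requiring a discrimination degree ≥3 ensures that both the false positive rate and the false negative rate in identifying characteristic sequencing fragments of the target microbial group are low; the rationale is given in Table 2. Because the flanking sequences are conserved among the different microorganisms within the target microbial group, the same primers can amplify the different microorganisms within the group, which removes the influence of amplification efficiency on the relative quantification of different microorganisms within the group.
The characteristic region of the target microorganism is homologous to the characteristic region of the target microbial group; the m2 value of the characteristic region of the target microorganism is ≥2, where the m2 value is the minimum number of differing bases between the characteristic region of the target microorganism and the other microorganisms within the target microbial group. In this embodiment, the other microorganisms are the other physiological races in the target microbial group, and the m2 value is the minimum of the numbers of differing bases obtained by comparing the characteristic region of the target microorganism with the homologous region of each of the other races. In the qualitative and quantitative analysis of the target microorganism, the key is to distinguish it from the other microorganisms within the target microbial group. The target microorganism is usually closely related to the rest of the group and their sequences are highly similar, so they are difficult to distinguish. The analysis therefore considers only the standard genotype, that is, the positions in the amplicon that differ from the other microorganisms within the group; this reduces the sources of error and allows the target microorganism to be better distinguished within the group. When m2≥2, both the false positive rate and the false negative rate in identifying characteristic sequencing fragments of the target microorganism are low, so the target microorganism can be distinguished from the rest of the group; the rationale is given in Table 2.
The characteristic region of the reference microorganism is a nucleic acid sequence on the reference genome of the reference microorganism; the sequences flanking the characteristic region of the reference microorganism are single-copy sequences in the reference genome of the reference microorganism; and the flanking sequences have no homology in organisms other than the reference microorganism.
In this embodiment, the discrimination degree is the only criterion for selecting characteristic regions of the target microbial group. Depending on the purpose of the test, microorganisms carrying a specific gene sequence may instead be taken as the target microbial group, with that gene sequence as the characteristic region of the group. For example, microorganisms carrying a specific pathogenic gene may be taken as the target microbial group and that pathogenic gene as the characteristic region of the target microorganism, so that medication can be guided by the type of pathogenic gene. Likewise, a drug-resistance gene used as the specific gene sequence can also guide medication.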
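Step 3: Prepare first multiplex amplification primers that amplify the characteristic regions of the target microbial group, second multiplex amplification primers that amplify the characteristic regions of the target microorganism and third multiplex amplification primers that amplify the characteristic regions of the reference microorganism, and mix the first, second and third multiplex amplification primers to obtain the mixed multiplex amplification primers.
The specific method, combining Step 2 and Step 3, is as follows:
The genome sequences of the different physiological races in the target microbial group are downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/, and their genomes are aligned against the query sequence (reference sequence) using Megablast (version 2.2.26); in this embodiment the query sequence is the genome sequence with NCBI accession number AE004091. The Megablast parameters are set as follows: -e is set to 1e-5; -p is set to 0; -v is set to 5000; -m is set to 1. After the alignment, the sequences that are homologous among all microorganisms of the target microbial group are obtained, and those that occur only once in the query sequence are selected. A window of 110 bp is slid along the selected homologous sequences with a step of 10 bp. For each window, the bases that differ between at least two microorganisms within the target microbial group are identified; the part of the window from the first differing base to the last differing base is taken as a characteristic region, and the number of differing bases in that characteristic region is counted. On each side of the characteristic region, a stretch whose length equals 160 bp minus the length of the characteristic region is taken as the primer search region. Within the primer search region, a stretch longer than 20 bp with no base difference among all microorganisms of the target microbial group is searched for and used as the primer design region of the characteristic region; characteristic regions that lack a primer design region are discarded.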
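The window scan described above can be sketched roughly as follows. This minimal Python illustration assumes the homologous sequences have already been aligned to equal length without gaps; the function name `candidate_regions` and the toy sequences are hypothetical, and the search for primer design regions is not covered.

```python
def candidate_regions(aligned, window=110, step=10):
    """Slide a window over aligned sequences and trim it to the variable span.

    Returns sorted (start, end, n_differences) tuples with 0-based,
    end-exclusive coordinates.
    """
    length = len(aligned[0])
    # A column is variable if at least two sequences differ at that position.
    variable = [len({seq[i] for seq in aligned}) > 1 for i in range(length)]
    regions = set()
    for start in range(0, max(length - window, 0) + 1, step):
        cols = [i for i in range(start, min(start + window, length)) if variable[i]]
        if cols:
            # Keep the span from the first to the last differing base.
            regions.add((cols[0], cols[-1] + 1, len(cols)))
    return sorted(regions)


# Toy example with two hypothetical 10-base sequences and a 5-base window.
print(candidate_regions(["AAAAAAAAAA", "AAATAAGAAA"], window=5, step=5))
```

Overlapping windows often trim to the same span, so the sketch collects the results in a set to avoid reporting a region twice.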
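Log in to the online multiplex amplification primer design website https://ampliseq.com and, under "Application type", select "DNA Hotspot designs(single-pool)". If multi-pool were selected in this embodiment, the multiplex PCR would be run in several tubes and the cost would increase. With single-pool primers only one multiplex PCR is needed, which saves cost; the drawback is that primer design may fail for some characteristic regions, but because the genome contains many characteristic regions, failure for a few of them does not affect the result. Single-pool is therefore selected in this embodiment. All the characteristic regions of the target microbial group obtained above, together with their primer design regions, are joined with runs of 100 N bases (N denoting any of the four bases A, T, C and G) to generate a reference genome for primer design. Under "Select the genome you wish to use", select "Custom" and upload the generated reference genome. For DNA Type select "Standard DNA", and in the Add Hotspot option enter the start and end positions of each characteristic region in the generated reference genome. Finally, click "Submit targets" to submit the design and obtain the multiplex amplification primer sequences for the characteristic regions of the target microbial group.
The designed multiplex amplification primers are aligned against the target microbial group using BLASTN (Basic Local Alignment Search Tool) (version 2.2.26), and primer pairs in which at least one of the forward and reverse primers is specific are selected. The selected primers are then aligned with BLASTN against the genomes of the non-target organisms to check whether they can amplify those genomes. In this embodiment the non-target organisms are all organisms other than the target microbial group, and their genomes are represented by the NCBI NT/NR databases. A primer pair is judged able to amplify if the amplified region is no longer than 200 bp, the matched primer length is greater than 15 bp, and there is no deletion or mismatch within the 5 bases at the 3' end of the primer. If the primers cannot amplify any non-target organism, the discrimination degree of the corresponding characteristic region of the target microorganism is m1=3×L1/4. If the primers can amplify some non-target organisms, the amplification product of each non-target organism amplified by the primers is aligned with each characteristic region of the target microbial group, and the minimum number of differing bases over all alignments is taken as the discrimination degree m1. Characteristic regions of the target microbial group with m1≥3 are retained, and those that contain simple repeats or occur in multiple copies in the genome are removed. From the retained characteristic regions of the target microbial group, the characteristic regions of the target microbial group are further refined and the characteristic regions of the target microorganism are selected.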
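The calculation of the discrimination degree m1 described above can be sketched as follows. This is an illustrative Python fragment under the simplifying assumption that all sequences are already aligned to the same length (the text itself uses BLASTN alignments); the function names and toy sequences are hypothetical.

```python
def count_differences(a, b):
    """Number of positions at which two equal-length aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def discrimination_degree(characteristic_regions, non_characteristic_regions):
    """Minimum difference between any characteristic and any non-characteristic region.

    If the primers amplify no non-characteristic region, fall back to 3 * L1 / 4.
    """
    if not non_characteristic_regions:
        return 3 * len(characteristic_regions[0]) / 4
    return min(
        count_differences(c, n)
        for c in characteristic_regions
        for n in non_characteristic_regions
    )


# Toy 4-base regions; a characteristic region is kept only if m1 >= 3.
m1 = discrimination_degree(["ACGT", "ACGA"], ["ACCC", "TTTT"])
print(m1, m1 >= 3)
```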
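Further, the characteristic regions of the target microbial group are refined as follows: the characteristic regions are aligned with BLASTN against the reference genomes of the non-target organisms, and those with more than 95% homology to a non-target organism are removed. The remaining characteristic regions are aligned between the target microorganism and the other microorganisms within the target microbial group using the software muscle (version V3.6) with default parameters, and the minimum number of differing bases, that is, the m2 value, is obtained. Characteristic regions with m2≥2 are retained. From the retained characteristic regions, two or more regions with relatively large m1 and m2 are chosen to serve simultaneously as characteristic regions of the target microbial group and of the target microorganism, and their multiplex amplification primers serve simultaneously as the first and second multiplex amplification primers.
The characteristic regions of the reference microorganism and the corresponding third multiplex amplification primers are obtained in a way similar to that used for the target microbial group; only the differences are described here. Megablast (version 2.2.26) is again used to align the genome of the reference microorganism against the query sequence (reference sequence), which is the genome sequence of Agrobacterium tumefaciens K84. After the alignment, the single-copy sequences of the reference microorganism genome, which occur only once in the query sequence, are obtained. The single-copy sequences are aligned against the NCBI NT/NR databases, and those with homologous sequences in non-target organisms are discarded. Non-overlapping 110 bp segments are randomly selected from the remaining single-copy sequences as characteristic regions, and the sequences flanking them serve as primer design regions. Multiplex amplification primers for the characteristic regions are designed on the online design website https://ampliseq.com, and the characteristic regions for which primers were successfully designed are screened further as follows: characteristic regions that contain simple repeats or occur in multiple copies in the genome are removed; the remaining regions are aligned with BLASTN against the reference genomes of the non-target organisms, and those with more than 95% homology to a non-target organism are removed. From the retained characteristic regions, two or more are randomly selected as characteristic regions of the reference microorganism, and their multiplex amplification primers serve as the third multiplex amplification primers.
Each primer of the first, second and third multiplex amplification primers obtained above, together with the template sequence amplified by each multiplex amplification primer, is synthesized individually by Sangon Biotech (Shanghai) Co., Ltd.; the template sequence is the amplified region entered for each multiplex amplification primer in the Add Hotspot option. The amplification efficiency of each multiplex amplification primer is measured according to the operating manual of the StepOne real-time quantitative PCR instrument from Thermo Fisher Scientific (USA) (Part Number 4376784 Rev. E), and only primers with an amplification efficiency of 95%~105% are retained, to reduce the influence of differences in amplification efficiency on the qualitative and quantitative results. Because the influence of amplification efficiency is then small, the characteristic regions of the target microbial group and of the target microorganism may also differ, which makes it easier to find each of them. The primers retained from the first, second and third multiplex amplification primers are merged using the merging procedure of the online design website https://ampliseq.com to obtain the mixed multiplex amplification primers, which are synthesized by Thermo Fisher Scientific (USA) and supplied in liquid form. Information on the characteristic regions finally obtained in this embodiment is given in Table 1. The start and end positions in Table 1 are the start and end positions of the characteristic regions on the reference genome used as the query sequence.
Table 1. Primer-related information provided in Embodiment 1 of the present invention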
Figure PCTCN2017072441-appb-000001
Figure PCTCN2017072441-appb-000002
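Step 4: Add the reference microorganism to the sample to obtain the mixed sample. The specific method is as follows:
Because the reference microorganism is not present in the sample, it can serve as an internal reference that is processed in parallel with the microorganisms in the sample and used to quantify the target microbial group and the target microorganism. The amount of reference microorganism added is chosen so that about 10 ng of nucleic acid (DNA) can be extracted from the mixed sample, enough to build a high-throughput sequencing library normally, while not being so large that the reference microorganism takes up too much of the sequencing data. In this embodiment the mixed sample is obtained as follows: 0.2 mL of a suspension of the reference microorganism at a concentration of 2 OD (OD being the maximum absorbance of the bacterial suspension) is placed in a 1.5 mL centrifuge tube, dried by vacuum freeze centrifugation, added to the sample and mixed well to give the mixed sample of the sample and the reference microorganism. The amount of reference microorganism added to the mixed sample, determined by hemocytometer counting, is given in Table 2.
Step 5: Extract the nucleic acid of the mixed sample. The specific method is as follows:
When extracting the nucleic acid of the mixed sample, if the nucleic acid content of the sample is too low (below 1 μg), extraction will be impaired; in that case an exogenous nucleic acid that cannot be amplified by the mixed multiplex amplification primers may be added during the extraction. The added exogenous nucleic acid does not exist in nature and therefore does not interfere with the detection of microorganisms. The External RNA Controls Consortium has designed and validated a set of nucleic acid sequences that do not exist in nature and can be used as the exogenous nucleic acid in the embodiments of the present invention; for the sequences see https://tools.lifetechnologies.com/content/sfs/manuals/cms_095047.txt. About 1 μg of exogenous nucleic acid is added, which ensures that the nucleic acid of the mixed sample can be extracted normally. In this embodiment the sample is blood, whose nucleic acid content is normal, so no exogenous nucleic acid needs to be added to the mixed sample. The nucleic acid of the mixed sample is extracted with a blood genomic DNA extraction kit (manufacturer: TIANGEN Biotech (Beijing) Co., Ltd.; catalog no. DP348) according to its operating manual.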
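Step 6: Perform an amplification reaction using the mixed multiplex amplification primers and the nucleic acid of the mixed sample to obtain amplification products. The specific method is as follows:
The nucleic acid of the mixed sample is amplified by multiplex PCR with Library Kit 2.0 (produced by Life Technologies, USA; catalog no. 4475345), and the amplification products are used to build a high-throughput sequencing library. The kit contains the following reagents: 5× Ion AmpliSeq™ HiFi Mix, FuPa reagent, switch reagent, sequencing adapter solution and DNA ligase. The library is built according to the kit's operating manual, "Ion AmpliSeq™ Library Preparation" (publication no. MAN0006735, revision A.0). The multiplex PCR reaction mixture is: 5× Ion AmpliSeq™ HiFi Mix, 4 μl; the synthesized mixed multiplex amplification primers, 4 μl; the extracted nucleic acid of the mixed sample, 10 ng; and nuclease-free water, 11 μl. The multiplex PCR program is: 99°C for 2 minutes; 25 cycles of (99°C for 15 seconds; 60°C for 4 minutes); hold at 10°C. Excess primers in the multiplex PCR products are digested with the FuPa reagent and the products are then phosphorylated, as follows: 2 μL of FuPa reagent is added to the multiplex PCR products and mixed, and the mixture is incubated in a PCR instrument with the program 50°C for 10 minutes; 55°C for 10 minutes; 60°C for 10 minutes; hold at 10°C, giving mixture a, a solution containing the phosphorylated amplification products. Sequencing adapters are then ligated to the phosphorylated amplification products, as follows: 4 μL of switch reagent, 2 μL of sequencing adapter solution and 2 μL of DNA ligase are added to mixture a and mixed, and the mixture is incubated in a PCR instrument with the program 22°C for 30 minutes; 72°C for 10 minutes; hold at 10°C, giving mixture b. Mixture b is purified by standard ethanol precipitation and dissolved in 10 μL of nuclease-free water. Its mass concentration is measured with the following kit produced by Invitrogen (USA), according to the kit's instructions:
Figure PCTCN2017072441-appb-000003
dsDNA HS Assay Kit (catalog no. Q32852). Once the mass concentration of mixture b has been obtained, the purified mixture b is diluted to 15 ng/mL, giving a high-throughput sequencing library with a concentration of about 100 pM.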
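The relation between the mass concentration (15 ng/mL) and the molar concentration (about 100 pM) can be checked with a short calculation. The sketch below assumes an average mass of 660 g/mol per base pair of double-stranded DNA and a hypothetical mean library fragment length of 227 bp (amplicon plus adapters); neither number is stated in the text, and the function name is illustrative.

```python
def library_picomolar(ng_per_ml, mean_length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/mL) to a molar concentration (pM)."""
    grams_per_litre = ng_per_ml * 1e-9 * 1e3          # ng/mL -> g/L
    mol_per_litre = grams_per_litre / (g_per_mol_per_bp * mean_length_bp)
    return mol_per_litre * 1e12                       # mol/L -> pmol/L


# 15 ng/mL at a hypothetical mean fragment length of 227 bp.
print(round(library_picomolar(15, 227), 1))
```

Under these assumptions the result is close to 100 pM; shorter or longer fragments shift the molar concentration proportionally.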
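Step 7: Perform high-throughput sequencing on the amplification products to obtain high-throughput sequencing fragments. The specific method is as follows:
The high-throughput sequencing library and the Ion PI Template OT2 200 Kit v2 (produced by Invitrogen, USA; catalog no. 4485146) are used for pre-sequencing ePCR (Emulsion PCR) amplification, following the kit's operating manual. The ePCR products and the Ion PI Sequencing 200 Kit v2 (produced by Invitrogen, USA; catalog no. 4485149) are used for high-throughput sequencing on a Proton next-generation high-throughput sequencer, following the kit's operating manual. In this embodiment the sequencing output is set to 1 M sequencing fragments (1 M = 1 million).
Based on the primers of the sequencing fragments, the high-throughput sequencing fragments are aligned to the corresponding characteristic regions of the target microbial group, the target microorganism and the reference microorganism. Fragments that fail to align and fragments with an incomplete characteristic region are removed. Fragments that fail to align are mostly non-specific amplification products; fragments with an incomplete characteristic region are those that do not fully cover the sequence from the start position to the end position of the characteristic region in Table 1.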
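Step 8: Perform qualitative and quantitative analysis of the target microbial group and the target microorganism based on the high-throughput sequencing fragments. The specific method is as follows:
The basic principle of the qualitative and quantitative analysis provided by the present invention is that the characteristic regions represent the target microbial group and the target microorganism: if sequencing fragments of a characteristic region are present, the target microbial group or target microorganism is present, and the number of sequencing fragments of the characteristic region represents the amount of the target microbial group or target microorganism. Unlike other qualitative and quantitative detection methods, the embodiments of the present invention also calculate the reliability of the qualitative and quantitative results, which makes the conclusions more useful in practice. The relationships among the parameters must first be worked out before any microorganism can be detected qualitatively and quantitatively with a reliable conclusion; the parameters of the present invention and the principles for deriving them are given in Table 2. Cells, symbols and formulas in Table 2 are defined as in Excel 2010: the cell "基本参数" (Basic parameters) is A1, and the other cells are addressed relative to A1 according to the rules of Excel 2010.
The qualitative analysis is performed as follows: the high-throughput sequencing fragments are aligned with each characteristic region of the target microbial group; when the number of differing bases is ≤n1, the alignment is successful and the corresponding high-throughput sequencing fragment is assigned to that characteristic region of the target microbial group, where n1 is the maximum number of erroneous bases tolerated in a characteristic sequencing fragment of the target microbial group; if a fragment aligns successfully to ≥1 characteristic region of the target microbial group, the high-throughput sequencing fragment is judged to be a characteristic sequencing fragment of the target microbial group.
The characteristic region of the target microorganism is aligned with each homologous characteristic region of the target microbial group, and the differing bases in the characteristic region of the target microorganism are extracted to form the standard genotype of the target microorganism; here the differing bases are the union of all bases at which the characteristic region of the target microorganism differs from any microorganism within the target microbial group. On each characteristic sequencing fragment of the target microbial group, the bases at the positions of the standard genotype are extracted to form the test genotype of the target microorganism. If the number of differing bases between the test genotype and the standard genotype is ≤n2, where n2 is the maximum number of erroneous bases tolerated in a characteristic sequencing fragment of the target microorganism, the high-throughput sequencing fragment carrying the test genotype is a characteristic sequencing fragment of the target microorganism. In particular, when the target microbial group contains only one target microorganism, the standard genotype and the test genotype both contain 0 bases, so the number of differing bases between them is also 0; regardless of the value of n2, the high-throughput sequencing fragment carrying the test genotype is then judged to be a characteristic sequencing fragment of the target microorganism. The numbers of characteristic fragments for the characteristic regions of the target microbial group and of the target microorganism obtained in this way are listed in Table 1. The values of n1 and n2 in this embodiment are given in Table 2, and their derivation is described below.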
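The genotype comparison described above can be illustrated with a small Python sketch. It assumes that the characteristic regions and the sequencing fragment are already aligned to the same coordinates and have equal length; the function names and the toy sequences are hypothetical.

```python
def genotype_positions(target_region, other_regions):
    """Positions where the target differs from at least one other group member."""
    return [
        i
        for i, base in enumerate(target_region)
        if any(other[i] != base for other in other_regions)
    ]


def is_target_fragment(fragment, target_region, other_regions, n2):
    """True if the fragment's test genotype is within n2 bases of the standard genotype."""
    positions = genotype_positions(target_region, other_regions)
    standard = [target_region[i] for i in positions]  # standard genotype
    test = [fragment[i] for i in positions]           # test genotype
    mismatches = sum(s != t for s, t in zip(standard, test))
    return mismatches <= n2


# Toy group: the target differs from the two other members at positions 0 and 5.
others = ["ACGTAA", "TCGTAC"]
print(is_target_fragment("ACGTAC", "ACGTAC", others, n2=0))
print(is_target_fragment("TCGTAA", "ACGTAC", others, n2=1))
```

When `other_regions` is empty, the standard genotype has no positions and every fragment passes, which matches the special case described for a group that contains a single target microorganism.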
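n1 is chosen such that P1≤α1 and P3≤α3, where P1 is the false positive probability that a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microbial group is misjudged as one; P3 is the false negative probability that a characteristic sequencing fragment of the target microbial group is misjudged as not being one; and α1 and α3 are decision thresholds.
n2 is chosen such that P2≤α2 and P4≤α4, where P2 is the false positive probability that a fragment that is not a characteristic sequencing fragment of the target microorganism is misjudged as one; P4 is the false negative probability that a characteristic sequencing fragment of the target microorganism is misjudged as not being one; and α2 and α4 are decision thresholds. The thresholds in the embodiments of the present invention are set according to practical needs. For example, some pathogens are extremely harmful and a missed detection (false negative) would have serious consequences; false negatives must then be controlled, and the values of α2 and α4 should be low. If there is no special requirement, the principle is to keep both false positives and false negatives low, which is the case in this embodiment. α1 and α3 are set to 0.01%, that is, about 1 false positive or false negative per 10,000 characteristic sequences, which is a very high accuracy; such accuracy is achievable because the m1 value of the characteristic sequences is large, so they are easily distinguished from non-target organisms and both error rates can be kept very low. α2 and α4 are set to 0.5%, that is, about 5 false positives or false negatives per 1,000 characteristic sequences, which is also highly accurate. P1=BINOM.DIST(n1,m1,1-E,TRUE), P2=BINOM.DIST(n2,m2,1-E,TRUE), P3=1-BINOM.DIST(n1,L1,E,TRUE) and P4=1-BINOM.DIST(n2,L2,E,TRUE), where m1 is the discrimination degree, specifically that of the characteristic region of the target microbial group used to calculate S1 (for its value in this embodiment see Table 1 and Table 2); m2 is the minimum number of differing bases between the characteristic region of the target microorganism and the other microorganisms of the target microbial group, specifically the m2 value of the characteristic region of the target microorganism used to calculate S3 (see Table 1 and Table 2); L1 is the length of the characteristic region of the target microbial group (see Table 2); L2 is the length of the standard genotype of the target microorganism (see Table 2); and E is the base error rate, which consists of the sequencing error rate E1 and the natural mutation rate E2. In this embodiment the sequencing error rate of the PROTON high-throughput sequencer is E1≤1%. According to our survey, the variation rate between the reference genomes of microbial races (for example, the bacterial blight races P1-P6) is generally below 0.5%, and the natural mutation rate is lower than the variation rate between races, so E2≤0.5%; to make the method more widely applicable, E2≤1% is used, so that E≤2% in this embodiment. To make the probability attached to the qualitative and quantitative conclusions more reliable, the maximum value E=2% is used in the calculations. After these parameter values are substituted into the formulas for P1 and P3, n1 is increased step by step from 0 and P1 and P3 are calculated; at n1=13, P1≤α1 and P3≤α3, so n1=13 in this embodiment (see Table 2), and the corresponding values of P1 and P3 are those used in this embodiment. Similarly, after the parameter values are substituted into the formulas for P2 and P4, n2 is increased step by step from 0 and P2 and P4 are calculated; at n2=2, P2≤α2 and P4≤α4, so n2=2 in this embodiment (see Table 2), and the corresponding values of P2 and P4 are those used in this embodiment.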
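The stepwise search for n1 and n2 described above can be sketched as follows. This is an illustrative Python version that uses a plain binomial sum in place of BINOM.DIST; the function names are hypothetical, and the inputs in the example are made-up values rather than the parameters in Table 2.

```python
from math import comb


def binom_cdf(k, n, p):
    """Cumulative binomial probability P(X <= k), like BINOM.DIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def smallest_tolerance(m, length, e, alpha_fp, alpha_fn):
    """Return the first n, counting up from 0, with P_fp <= alpha_fp and P_fn <= alpha_fn."""
    for n in range(m + 1):
        p_fp = binom_cdf(n, m, 1 - e)       # P1 or P2
        p_fn = 1 - binom_cdf(n, length, e)  # P3 or P4
        if p_fp <= alpha_fp and p_fn <= alpha_fn:
            return n
    return None  # no tolerance satisfies both thresholds


# Hypothetical inputs: m = 10, L = 10, E = 2%, both thresholds 1%.
print(smallest_tolerance(10, 10, 0.02, 0.01, 0.01))
```

A return value of `None` signals that the region cannot meet both thresholds, for example when the discrimination degree is too small for the chosen error rate.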
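The reference microorganism is treated as a target microbial group containing only one target microorganism, and the characteristic sequencing fragments of the target microorganism obtained in this way are the characteristic sequencing fragments of the reference microorganism. The numbers of characteristic fragments for the characteristic regions of the reference microorganism are given in Table 1 and Table 2.
If the probability P5 that characteristic sequencing fragments of the target microbial group are present satisfies P5≥α5, the target microbial group is judged to be present in the sample; if P5<α5, the target microbial group is judged to be absent from the sample, where α5 is the probability guarantee, set to 99.99% in this embodiment. P5=1-BINOM.DIST(S1,S1,P1,FALSE), where S1 is the median, over all characteristic regions of the target microbial group, of the number of characteristic sequencing fragments of the target microbial group. In this embodiment the fragment count of the second characteristic region of the target microbial group is the median of the fragment counts of all characteristic regions of the group; the value of S1 is given in Table 1 and Table 2. Substituting the values of S1 and P1 into the formula for P5 gives P5≥α5, so the target microbial group is judged to be present in the sample in this embodiment. FALSE is a parameter value, and the BINOM.DIST function returns the individual-term binomial distribution probability.
If the probability P6 that characteristic sequencing fragments of the target microorganism are present satisfies P6≥α6, the target microorganism is judged to be present in the sample; if P6<α6, the target microorganism is judged to be absent from the sample; α6 is the probability guarantee, set to 99.99% in this embodiment. P6=1-BINOM.DIST(S3,S3,P2,FALSE), where the BINOM.DIST function returns the individual-term binomial distribution probability and S3 is the median, over all characteristic regions of the target microorganism, of the number of characteristic sequencing fragments of the target microorganism. In this embodiment the fragment count of the second characteristic region of the target microorganism is the median of the fragment counts of all characteristic regions of the target microorganism; the value of S3 is given in Table 1 and Table 2. Substituting the values of S3 and P2 into the formula for P6 gives P6≥α6, so the target microorganism is judged to be present in the sample in this embodiment.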
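Because BINOM.DIST(S,S,P,FALSE) is the probability that all S fragments are false positives, it reduces to P raised to the power S, so P5 and P6 can be computed directly. The following Python sketch shows this under that reading; the function names and the example numbers are illustrative assumptions.

```python
def presence_probability(s, p_false_positive):
    """P5 or P6 = 1 - BINOM.DIST(S, S, P, FALSE) = 1 - P**S."""
    return 1 - p_false_positive**s


def is_present(s, p_false_positive, alpha):
    """Judge presence when the probability reaches the probability guarantee alpha."""
    return presence_probability(s, p_false_positive) >= alpha


# Hypothetical values: 3 characteristic fragments, per-fragment false positive rate 1%.
print(is_present(3, 0.01, 0.9999))
```

With a small per-fragment false positive probability, even a handful of characteristic fragments pushes the presence probability above a 99.99% guarantee, while zero fragments always gives a probability of 0.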
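In addition, α5 and α6 are both set according to practical needs and may be equal or different. When a microorganism must be strictly controlled, α5 and α6 are set to larger values; otherwise they are set to smaller values. All α values in the embodiments of the present invention follow this principle.
The quantitative analysis is performed as follows: the amount of the target microbial group is M1=Mr×S1/S2, where Mr is the amount of the reference microorganism added to the sample (for its value in this embodiment see Table 2), and S2 is the median, over all characteristic regions of the reference microorganism, of the number of characteristic sequencing fragments of the reference microorganism. In this embodiment the fragment count of the second characteristic region of the reference microorganism is the median of the fragment counts of all characteristic regions of the reference microorganism; the value of S2 is given in Table 1 and Table 2. Substituting these parameters and the value of S1 obtained in the qualitative analysis into the formula for M1 gives M1, that is, the amount of microorganisms of the target microbial group in the sample is M1=2871226.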
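The quantification formula above amounts to scaling the known amount of reference microorganism by a ratio of median fragment counts. A minimal Python sketch, with a hypothetical function name and made-up counts:

```python
from statistics import median


def quantify_group(mr, group_counts, reference_counts):
    """M1 = Mr * S1 / S2, with S1 and S2 the median fragment counts per region."""
    s1 = median(group_counts)      # target microbial group
    s2 = median(reference_counts)  # reference microorganism
    return mr * s1 / s2


# Hypothetical counts for three characteristic regions each; Mr = 1000 cells.
print(quantify_group(1000, [10, 20, 30], [5, 10, 15]))
```

Using the median rather than the mean keeps a single poorly or unusually well amplified region from dominating the estimate.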
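The confidence interval of the amount of the target microbial group is [M11,M12], where M11 and M12 are the lower and upper limits of the confidence interval of M1. M11=M1×(1-S4/S1) and M12=M1×(1+S5/S1), where S4 is the number of false positive characteristic sequencing fragments of the target microbial group, S4=CRITBINOM(nS,P1,α9), and S5 is the number of false negative characteristic sequencing fragments of the target microbial group, S5=CRITBINOM(S1,P3,α9); α9 is the probability guarantee, set to 99.50% in this embodiment. The CRITBINOM function returns the smallest value for which the cumulative binomial distribution is greater than or equal to the criterion value. nS is the number of high-throughput sequencing fragments of non-characteristic regions amplified by the multiplex amplification primers of the characteristic region of the target microbial group used to calculate S1, that is, the high-throughput sequencing fragments amplified by those primers other than the characteristic sequencing fragments of the target microorganism. In this embodiment, nS is the number of high-throughput sequencing fragments of non-characteristic regions produced by the multiplex amplification primers of the second characteristic region of the target microbial group; its value is given in Table 2. Substituting the values of nS and P1 into the formula for S4 gives S4, and substituting the values of S1 and P3 into the formula for S5 gives S5. With all parameters in the formulas for M11 and M12 known, M11 and M12 are calculated, giving the confidence interval of M1: the confidence interval of the amount of the target microbial group is [2871226,2871455].
The amount of the target microorganism is M2=M1×S3/S1; substituting the values of M1, S3 and S1 into this formula gives M2=2534075.
The confidence interval of the amount of the target microorganism is [M21,M22], where M21 and M22 are the lower and upper limits of the confidence interval of M2; M21=M2×(1-S6/S3) and M22=M2×(1+S7/S3), where S6 is the number of false positive characteristic sequencing fragments of the target microorganism, S6=CRITBINOM(S1,P2,α10), and S7 is the number of false negative characteristic sequencing fragments of the target microorganism, S7=CRITBINOM(S3,P4,α10); α10 is the probability guarantee, and the CRITBINOM function returns the smallest value for which the cumulative binomial distribution is greater than or equal to the criterion value. In this embodiment α10 is set to 99.50%. Substituting the values of S1 and S3 and of P2 and P4 into the formulas for S6 and S7 gives S6 and S7. Substituting the values of S6, S7, M2 and S3 into the formulas for M21 and M22 gives M21 and M22, so the confidence interval of the amount of the target microorganism is [2534067,2539614].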
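The confidence limits above can be reproduced with a small stand-in for CRITBINOM. The Python sketch below is meant for illustration with small counts only: it sums binomial terms directly, which is not numerically robust for realistic read counts (a log-space or library implementation would be needed there). The function names and numbers are hypothetical.

```python
from math import comb


def critbinom(trials, probability_s, alpha):
    """Smallest k whose cumulative binomial probability is >= alpha (like CRITBINOM)."""
    cumulative = 0.0
    for k in range(trials + 1):
        cumulative += comb(trials, k) * probability_s**k * (1 - probability_s) ** (trials - k)
        if cumulative >= alpha:
            return k
    return trials


def confidence_interval(amount, s, false_pos_count, false_neg_count):
    """[M*(1 - S_fp/S), M*(1 + S_fn/S)], the form used for both M1 and M2."""
    return amount * (1 - false_pos_count / s), amount * (1 + false_neg_count / s)


# Hypothetical example: 100 fragments, P_fp = 0, P_fn = 2%, guarantee 99.5%.
s4 = critbinom(100, 0.0, 0.995)
s5 = critbinom(100, 0.02, 0.995)
print(confidence_interval(1000, 100, s4, s5))
```

When the false positive probability is effectively zero, the false positive count is 0 and the lower limit coincides with the point estimate, as in the interval reported for M1.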
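Table 2. Parameters for the qualitative and quantitative analysis of microorganisms in this embodiment and the principles for deriving them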
Figure PCTCN2017072441-appb-000004
Figure PCTCN2017072441-appb-000005
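Embodiment 2: Identification of microorganisms in human feces
The sample in this embodiment is human feces taken from a patient diagnosed by a physician with an intestinal disease; the microorganisms in the feces are detected to inform the treatment plan. This embodiment follows a method similar to that of Embodiment 1; methods, parameters and results not mentioned here are the same as in Embodiment 1 and are not repeated.
Step 1: Determine the target microbial group, the target microorganism and the non-target organisms in the sample to be tested, as well as a reference microorganism that is not present in the sample.
The purpose of this embodiment is to identify Salmonella enterica in the sample. In NCBI (National Center for Biotechnology Information), 33 physiological races of Salmonella enterica had known reference genomes as of June 2, 2015 (see http://www.ncbi.nlm.nih.gov/genome/genomegroups/152); these races constitute the target microbial group of this embodiment. Among them, Salmonella enterica subsp. houtenae str. ATCC BAA-1581 is relatively highly pathogenic and serves as the target microorganism of this embodiment.
Step 2: From the reference genome sequences of the target microbial group, the target microorganism, the reference microorganism and the non-target organisms, obtain the characteristic regions of the target microbial group, of the target microorganism and of the reference microorganism. Information on the characteristic regions finally obtained in this embodiment is given in Table 3.
Table 3. Primer-related information provided in Embodiment 2 of the present invention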
Figure PCTCN2017072441-appb-000006
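Step 4: Add the reference microorganism to the sample to obtain the mixed sample. The specific method is as follows:
In this embodiment the mixed sample is obtained as follows: 0.2 mL of a suspension of the reference microorganism at a concentration of 2 OD (OD being the maximum absorbance of the bacterial suspension) is placed in a 1.5 mL centrifuge tube, dried by vacuum freeze centrifugation, added to 100 mg of the sample and mixed well to give the mixed sample of the sample and the reference microorganism. The amount of reference microorganism added to the mixed sample, determined by hemocytometer counting, is given in Table 4.
Step 5: Extract the nucleic acid of the mixed sample. The specific method is as follows:
In this embodiment the sample is feces, whose nucleic acid content is low, so an exogenous nucleic acid is added to the mixed sample, namely 1 μg of the ERCC-00014 gene designed by the External RNA Controls Consortium. The nucleic acid of the mixed sample is extracted with a fecal DNA kit (manufacturer: MP Biomedicals, USA; catalog no. 116570200; product name: FastDNA SPIN kit for feces) according to its operating manual.
Step 6: Perform an amplification reaction using the mixed multiplex amplification primers and the nucleic acid of the mixed sample to obtain amplification products; the specific method is the same as in Embodiment 1.
Step 7: Perform high-throughput sequencing on the amplification products to obtain high-throughput sequencing fragments; the specific method is the same as in Embodiment 1.
Step 8: Perform qualitative and quantitative analysis of the target microbial group and the target microorganism based on the high-throughput sequencing fragments. The specific method is as follows:
The parameters of this embodiment and the principles for deriving them are given in Table 4. The result of the analysis is that the target microbial group and the target microorganism are both present in the sample; the amount of microorganisms of the target microbial group is M1=3942647, with confidence interval [3942647,3943113], and the amount of the target microorganism is M2=1787805, with confidence interval [1777581,1788849].
Table 4. Parameters for the qualitative and quantitative analysis of microorganisms in this embodiment and the principles for deriving them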
Figure PCTCN2017072441-appb-000007
Figure PCTCN2017072441-appb-000008
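The detection method provided by the embodiments of the present invention can be applied in many areas of medicine. The nucleic acid isolation method differs slightly between applications; for example, blood and feces each have their own genome extraction kits, and nucleic acid must be isolated according to the respective operating manuals. Apart from nucleic acid isolation, the other steps are essentially the same, so the detection method is broadly applicable. The present invention overcomes many problems of existing methods: only a few microorganisms can be detected at a time; microorganisms can be distinguished only to the species level; quantification is inaccurate; results carry no probability guarantee; pre-culture is required; the detection cycle is long; some microorganisms cannot be cultured and therefore cannot be detected; quantification is distorted by differences in culturability; and quantification is coarse. It provides a comprehensive, fast and fine-grained new method for the qualitative and quantitative detection of human microorganisms, and fast, accurate and comprehensive data support for medical diagnosis.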

Claims (9)

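  1. A method for the qualitative and quantitative detection of human microorganisms, characterized in that the method comprises:
    determining the target microbial group, the target microorganism and the non-target organisms in a sample to be tested, as well as a reference microorganism that is not present in the sample, the sample being human tissue, body fluid or excreta;
    obtaining, from the reference genome sequences of the target microbial group, the target microorganism, the reference microorganism and the non-target organisms, the characteristic regions of the target microbial group, the characteristic regions of the target microorganism and the characteristic regions of the reference microorganism;
    preparing first multiplex amplification primers that amplify the characteristic regions of the target microbial group, second multiplex amplification primers that amplify the characteristic regions of the target microorganism and third multiplex amplification primers that amplify the characteristic regions of the reference microorganism, and mixing the first, second and third multiplex amplification primers to obtain mixed multiplex amplification primers;
    adding the reference microorganism to the sample to obtain a mixed sample;
    extracting the nucleic acid of the mixed sample;
    performing an amplification reaction using the mixed multiplex amplification primers and the nucleic acid of the mixed sample to obtain amplification products;
    performing high-throughput sequencing on the amplification products to obtain high-throughput sequencing fragments;
    performing qualitative and quantitative analysis of the target microbial group and the target microorganism based on the high-throughput sequencing fragments.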
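  2. The method according to claim 1, characterized in that the number of target microbial groups is ≥1, and each target microbial group includes ≥0 target microorganisms;
    the target microorganism is at least one of bacteria, viruses, fungi, actinomycetes, rickettsiae, mycoplasmas, chlamydiae, spirochetes and protozoa;
    the reference microorganism is at least one of bacteria, viruses, fungi, actinomycetes, rickettsiae, mycoplasmas, chlamydiae, spirochetes and protozoa.
  3. The method according to claim 1, characterized in that the method for determining the non-target organisms in the sample comprises: first defining the non-target organisms as all organisms other than the target microbial group; if characteristic regions of the target microbial group can be obtained, the non-target organisms are all organisms other than the target microbial group; if characteristic regions of the target microbial group cannot be obtained, the non-target organisms are the organisms in the mixed sample other than the target microbial group.
  4. The method according to claim 1, characterized in that the characteristic region of the target microbial group is a nucleic acid sequence on the reference genome of a microorganism within the target microbial group; the sequences flanking the characteristic region of the target microbial group are single-copy sequences in the reference genome; the sequences flanking the characteristic region of the target microbial group are conserved among the different microorganisms within the target microbial group; and the discrimination degree of the characteristic region of the target microbial group is ≥3;
    the characteristic region of the target microorganism is homologous to the characteristic region of the target microbial group; the m2 value of the characteristic region of the target microorganism is ≥2, where the m2 value is the minimum number of differing bases between the characteristic region of the target microorganism and the other microorganisms within the target microbial group;
    the characteristic region of the reference microorganism is a nucleic acid sequence on the reference genome of the reference microorganism; the sequences flanking the characteristic region of the reference microorganism are single-copy sequences in the reference genome of the reference microorganism; and the sequences flanking the characteristic region of the reference microorganism have no homology in organisms other than the reference microorganism.
  5. The method according to claim 4, characterized in that the discrimination degree is the minimum number of differing bases between any characteristic region of the target microbial group and any non-characteristic region amplified by the same mixed multiplex amplification primers, where a non-characteristic region is an amplification product obtained with the mixed multiplex amplification primers on the nucleic acid of the mixed sample as template that is not a characteristic region of the target microbial group; if there is no non-characteristic region, the discrimination degree = 3×L1/4, where L1 is the length of the nucleic acid sequence of the characteristic region of the target microbial group.
  6. The method according to claim 1, characterized in that the method further comprises:
    when extracting the nucleic acid of the mixed sample, if the nucleic acid content of the sample is too low, adding, during the extraction, an exogenous nucleic acid that cannot be amplified by the mixed multiplex amplification primers.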
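  7. The method according to claim 1, characterized in that the qualitative analysis of the target microbial group and the target microorganism is performed as follows:
    the high-throughput sequencing fragments are aligned with each characteristic region of the target microbial group; when the number of differing bases is ≤n1, the alignment is successful and the corresponding high-throughput sequencing fragment is assigned to that characteristic region of the target microbial group, where n1 is the maximum number of erroneous bases tolerated in a characteristic sequencing fragment of the target microbial group; if a fragment aligns successfully to ≥1 characteristic region of the target microbial group, the high-throughput sequencing fragment is judged to be a characteristic sequencing fragment of the target microbial group;
    the characteristic region of the target microorganism is aligned with each homologous characteristic region of the target microbial group, and the differing bases in the characteristic region of the target microorganism are extracted to form the standard genotype of the target microorganism; on each characteristic sequencing fragment of the target microbial group, the bases at the positions of the standard genotype are extracted to form the test genotype of the target microorganism; if the number of differing bases between the test genotype and the standard genotype is ≤n2, where n2 is the maximum number of erroneous bases tolerated in a characteristic sequencing fragment of the target microorganism, the high-throughput sequencing fragment carrying the test genotype is a characteristic sequencing fragment of the target microorganism;
    the reference microorganism is treated as a target microbial group containing only one target microorganism, and the characteristic sequencing fragments of the target microorganism obtained in this way are the characteristic sequencing fragments of the reference microorganism;
    if the probability P5 that characteristic sequencing fragments of the target microbial group are present satisfies P5≥α5, the target microbial group is judged to be present in the sample, where α5 is the probability guarantee; if P5<α5, the target microbial group is judged to be absent from the sample;
    if the probability P6 that characteristic sequencing fragments of the target microorganism are present satisfies P6≥α6, the target microorganism is judged to be present in the sample, where α6 is the probability guarantee; if P6<α6, the target microorganism is judged to be absent from the sample;
    n1 is chosen such that P1≤α1 and P3≤α3, where P1 is the false positive probability that a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microbial group is misjudged as one; P3 is the false negative probability that a characteristic sequencing fragment of the target microbial group is misjudged as not being one; and α1 and α3 are decision thresholds;
    n2 is chosen such that P2≤α2 and P4≤α4, where P2 is the false positive probability that a high-throughput sequencing fragment that is not a characteristic sequencing fragment of the target microorganism is misjudged as one; P4 is the false negative probability that a characteristic sequencing fragment of the target microorganism is misjudged as not being one; and α2 and α4 are decision thresholds;
    P5=1-BINOM.DIST(S1,S1,P1,FALSE) and P6=1-BINOM.DIST(S3,S3,P2,FALSE), where S1 is the median, over all characteristic regions of the target microbial group, of the number of characteristic sequencing fragments of the target microbial group; S3 is the median, over all characteristic regions of the target microorganism, of the number of characteristic sequencing fragments of the target microorganism; FALSE is a parameter value; and the BINOM.DIST function returns the individual-term binomial distribution probability.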
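  8. The method according to claim 7, characterized in that the quantitative analysis of the target microbial group and the target microorganism is performed as follows:
    the amount of the target microbial group is M1=Mr×S1/S2, with confidence interval [M11,M12], where Mr is the amount of the reference microorganism added to the sample; S2 is the median, over all characteristic regions of the reference microorganism, of the number of characteristic sequencing fragments of the reference microorganism; and M11 and M12 are the lower and upper limits of the confidence interval of M1;
    the amount of the target microorganism is M2=M1×S3/S1, with confidence interval [M21,M22], where M21 and M22 are the lower and upper limits of the confidence interval of M2;
    M11=M1×(1-S4/S1), M12=M1×(1+S5/S1), M21=M2×(1-S6/S3), M22=M2×(1+S7/S3); where S4 is the number of false positive characteristic sequencing fragments of the target microbial group and S4=CRITBINOM(nS,P1,α9), nS being the number of high-throughput sequencing fragments of non-characteristic regions amplified by the multiplex amplification primers of the characteristic region of the target microbial group used to calculate S1; S5 is the number of false negative characteristic sequencing fragments of the target microbial group and S5=CRITBINOM(S1,P3,α9), where α9 is the probability guarantee; S6 is the number of false positive characteristic sequencing fragments of the target microorganism and S6=CRITBINOM(S1,P2,α10); S7 is the number of false negative characteristic sequencing fragments of the target microorganism and S7=CRITBINOM(S3,P4,α10), where α10 is the probability guarantee; the CRITBINOM function returns the smallest value for which the cumulative binomial distribution is greater than or equal to the criterion value.
  9. The method according to claim 8, characterized in that P1=BINOM.DIST(n1,m1,1-E,TRUE), P2=BINOM.DIST(n2,m2,1-E,TRUE), P3=1-BINOM.DIST(n1,L1,E,TRUE) and P4=1-BINOM.DIST(n2,L2,E,TRUE), where m1 is the discrimination degree; m2 is the minimum number of differing bases between the characteristic region of the target microorganism and the other microorganisms of the target microbial group; L1 is the length of the characteristic region of the target microbial group; L2 is the length of the standard genotype of the target microorganism; and E is the base error rate.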
PCT/CN2017/072441 2016-01-29 2017-01-24 Method for the qualitative and quantitative detection of human microorganisms WO2017129110A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/073,395 US20190048393A1 (en) 2016-01-29 2017-01-24 Method for qualitative and quantitative detection of microorganism in human body
EP17743725.8A EP3409789A4 (en) 2016-01-29 2017-01-24 METHOD FOR THE QUALITATIVE AND QUANTITATIVE DETECTION OF MICROORGANISMS IN A HUMAN BODY

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610061014.3 2016-01-29
CN201610061014.3A CN105671150A (zh) 2016-01-29 2016-01-29 Method for the qualitative and quantitative detection of human microorganisms

Publications (1)

Publication Number Publication Date
WO2017129110A1 true WO2017129110A1 (zh) 2017-08-03

Family

ID=56303083

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/072441 WO2017129110A1 (zh) 2016-01-29 2017-01-24 Method for the qualitative and quantitative detection of human microorganisms

Country Status (4)

Country Link
US (1) US20190048393A1 (zh)
EP (1) EP3409789A4 (zh)
CN (1) CN105671150A (zh)
WO (1) WO2017129110A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
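CN105671150A * 2016-01-29 2016-06-15 江汉大学 Method for the qualitative and quantitative detection of human microorganisms
CN110875082B * 2018-09-04 2022-05-31 深圳华大因源医药科技有限公司 Microorganism detection method and device based on targeted amplification sequencing
CN112980937A * 2021-03-17 2021-06-18 自然资源部第二海洋研究所 Rapid molecular detection method for harmful algal bloom species based on high-throughput sequencing
CN113270145B * 2021-04-28 2022-05-06 广州微远基因科技有限公司 Method for identifying microbial sequences introduced by background, and application thereof
CN115862735B * 2022-12-28 2024-02-27 郑州思昆生物工程有限公司 Nucleic acid sequence detection method and apparatus, computer device and storage medium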

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
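CN102534042A * 2012-03-14 2012-07-04 上海翼和应用生物技术有限公司 Multiplex competitive PCR method for quantitative gene expression profiling
CN105567831A * 2016-01-29 2016-05-11 江汉大学 Method for the qualitative and quantitative detection of food microorganisms
CN105603076A * 2016-01-29 2016-05-25 江汉大学 Method for the qualitative and quantitative detection of soil microorganisms
CN105603081A * 2016-01-29 2016-05-25 北京工商大学 Method for the qualitative and quantitative detection of intestinal microorganisms
CN105603082A * 2016-01-29 2016-05-25 中国科学院遗传与发育生物学研究所 Method for the qualitative and quantitative detection of rice microorganisms
CN105603075A * 2016-01-29 2016-05-25 江汉大学 Method for the qualitative and quantitative detection of wheat microorganisms
CN105603074A * 2016-01-29 2016-05-25 江汉大学 Method for the qualitative and quantitative detection of microorganisms
CN105671150A * 2016-01-29 2016-06-15 江汉大学 Method for the qualitative and quantitative detection of human microorganisms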

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
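JP5498954B2 * 2007-12-21 2014-05-21 ジェン−プロウブ インコーポレイテッド Detection of antibiotic-resistant microorganisms
KR102649364B1 * 2013-11-07 2024-03-20 더 보드 어브 트러스티스 어브 더 리랜드 스탠포드 주니어 유니버시티 Cell-free nucleic acids for the analysis of the human microbiome and components thereof
CN104846076B * 2015-03-31 2019-02-05 江汉大学 Method for determining the distinctness, uniformity and stability of new hybrid rapeseed varieties
CN104830975A * 2015-04-08 2015-08-12 江汉大学 New method for testing the authenticity and proportion of maize parental origin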

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3409789A4 *

Also Published As

Publication number Publication date
US20190048393A1 (en) 2019-02-14
CN105671150A (zh) 2016-06-15
EP3409789A4 (en) 2019-10-02
EP3409789A1 (en) 2018-12-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17743725

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2017743725

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2017743725

Country of ref document: EP

Effective date: 20180829