CN116254356B - Method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and application thereof

Info

Publication number
CN116254356B
Authority
CN
China
Prior art keywords
mycobacterium tuberculosis
chromosome
nucleotide sequence
snp
site
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310340239.2A
Other languages
Chinese (zh)
Other versions
CN116254356A (en
Inventor
朱晨迪
李卫民
侯悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Chest Hospital
Beijing Tuberculosis and Thoracic Tumor Research Institute
Original Assignee
Beijing Chest Hospital
Beijing Tuberculosis and Thoracic Tumor Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chest Hospital, Beijing Tuberculosis and Thoracic Tumor Research Institute
Priority to CN202310340239.2A
Publication of CN116254356A
Application granted
Publication of CN116254356B

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R: INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00: Microorganisms; Processes using microorganisms
    • C12R2001/01: Bacteria or Actinomycetales; using bacteria or Actinomycetales
    • C12R2001/32: Mycobacterium
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and an application thereof, and relates to the field of biology. The 23 specific SNP loci screened by the method reflect more comprehensively the identification and evolution of the six branches of Mycobacterium tuberculosis lineage 2.3. The SNP combination maintains high resolution while avoiding the detection of unnecessary SNP loci, balancing the two. In addition, using the 23 SNP loci to identify global Mycobacterium tuberculosis lineage 2.3 strains gives high accuracy and strong specificity, enabling the identification and precise prevention and control of the sub-branches of this lineage that have strong pathogenicity and transmissibility.

Description

Method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and application thereof
Technical Field
The invention relates to the field of biology, and in particular to a method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and an application thereof.
Background
Tuberculosis (TB), one of the oldest diseases in human history, is also the disease that causes the largest number of deaths from a single pathogen worldwide. It has become a major global public health problem, and its prevention and treatment remain a long and arduous task. Mycobacterium tuberculosis (M. tuberculosis), commonly known as the tubercle bacillus, is the causative agent of tuberculosis. As the Mycobacterium tuberculosis complex (MTBC) followed human migration out of Africa, it differentiated into nine human-adapted lineages (L1-L9). The lineages show strong biogeographic associations and differ greatly in their spread and prevalence.
Lineage 2 originated in or near Southeast Asia and underwent genetic differentiation in situ into three major subtypes: L2.1, L2.2 and L2.3. Of these, L2.1 differentiated earliest, has a low prevalence, and is found only in Southeast Asia. L2.2, although widely distributed in East Asia, has rarely spread to countries outside Asia. Earlier studies found that L2.3 is in fact a late-differentiating branch within L2.2 (formed about 500 years ago), and that more than 90% of all epidemic Mycobacterium tuberculosis lineage 2 strains belong to L2.3. Molecular epidemiological studies have found that L2.3 causes more recent transmission than L2.2, and animal infection experiments show that L2.3 is also more pathogenic. It follows that the transmission advantage of Mycobacterium tuberculosis lineage 2 is in fact the transmission advantage of L2.3 subtype strains. Efficient strain-level identification of Mycobacterium tuberculosis lineage 2.3 is therefore needed.
In view of this, the present invention has been made.
Disclosure of Invention
The invention aims at providing a method for identifying a mycobacterium tuberculosis lineage 2.3 subtype strain and application thereof.
The invention is realized in the following way:
In a first aspect, embodiments of the present invention provide a reagent or kit comprising reagents for detecting specific SNP loci, wherein the specific SNP loci comprise any one or more of combinations 1 to 6. Each locus below is given by its position on the Mycobacterium tuberculosis chromosome, followed by its nucleotides:
Combination 1 comprises at least one of SNP loci 1 to 4: SNP locus 1, position 655559, G/A; SNP locus 2, position 2093991, C/T; SNP locus 3, position 3218997, T/A; SNP locus 4, position 3684649, G/A;
Combination 2 comprises at least one of SNP loci 5 to 8: SNP locus 5, position 36577, C/T; SNP locus 6, position 1674210, A/C; SNP locus 7, position 1772288, T/G; SNP locus 8, position 3529598, C/G;
Combination 3 comprises at least one of SNP loci 9 to 12: SNP locus 9, position 594351, T/C; SNP locus 10, position 4131585, A/C; SNP locus 11, position 3897155, T/C; SNP locus 12, position 1600039, G/T;
Combination 4 comprises at least one of SNP loci 13 to 16: SNP locus 13, position 1451405, G/A; SNP locus 14, position 4219219, G/A; SNP locus 15, position 1279968, G/C; SNP locus 16, position 2185884, T/C;
Combination 5 comprises at least one of SNP loci 17 to 19: SNP locus 17, position 2474686, C/T; SNP locus 18, position 3355057, C/T; SNP locus 19, position 4215793, A/C;
Combination 6 comprises at least one of SNP loci 20 to 23: SNP locus 20, position 1151304, C/A; SNP locus 21, position 1446733, G/A; SNP locus 22, position 2838897, C/G; SNP locus 23, position 3056767, G/T.
The physical positions of the SNP loci are determined by alignment to the whole genome sequence of Mycobacterium tuberculosis H37Rv, whose accession number is NC_000962.3.
In a second aspect, embodiments of the present invention provide the use of a reagent or kit as described in the previous embodiments for the preparation of a product having any one of the following uses: (1) sub-lineage determination of Mycobacterium tuberculosis lineage 2.3; (2) strain-level identification of Mycobacterium tuberculosis lineage 2.3 subtype strains; (3) evolutionary analysis of Mycobacterium tuberculosis lineage 2.3.
In a third aspect, embodiments of the present invention provide the use of a reagent or kit as described in the preceding embodiments for the preparation of a product for the diagnosis or auxiliary diagnosis of a disease or condition associated with a Mycobacterium tuberculosis lineage 2.3 subtype strain.
In a fourth aspect, embodiments of the present invention provide a method for sub-lineage determination and/or strain-level identification and/or evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, the method comprising detecting the specific SNP loci described in the previous embodiments.
The invention has the following beneficial effects:
(1) The 23 SNP loci involved in the kit, application and method were obtained by genetic differentiation analysis of the whole genome sequences of 200 Mycobacterium tuberculosis lineage 2.3 strains, and were verified against the genome sequences of 2000 L2.3 Mycobacterium tuberculosis strains from 51 countries/regions worldwide. Functional enrichment analysis of the 23 specific SNP loci indicates that they are all key SNP loci that represent the six genetic evolutionary branches of Mycobacterium tuberculosis lineage 2.3 and are closely related to Mycobacterium tuberculosis gene function. Detection of these SNP loci is therefore of great importance for the identification and evolutionary analysis of the six branches of Mycobacterium tuberculosis lineage 2.3.
(2) The 23 SNP loci involved in the kit, application and method achieve high accuracy, with specificity reaching 100%, when identifying a single Mycobacterium tuberculosis strain, and can accurately identify the six genetic branches of Mycobacterium tuberculosis lineage 2.3 at the strain level.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the present invention and should therefore not be regarded as limiting its scope; a person skilled in the art may obtain other related drawings from them without inventive effort.
FIG. 1 is a phylogenetic tree constructed from the whole-genome SNP set;
FIG. 2 shows the validation results for global lineage 2.3 strains.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below. Where specific conditions are not stated in the examples, conventional conditions or the conditions recommended by the manufacturer were used. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
At present, the main typing methods for Mycobacterium tuberculosis are variable number tandem repeat (VNTR) typing, single nucleotide polymorphism (SNP) typing, and whole genome sequence (WGS) analysis. VNTR is the most common typing method. WGS is currently the best genotyping method for M. tuberculosis: it has high resolution and overcomes the drawbacks associated with homology and heterogeneity. However, using all genomic information requires a huge amount of analysis and computation, and most loci carry identical information across strains, so strain-level identification can be achieved by extracting only the loci that differ, namely SNP loci.
The embodiment of the invention provides a reagent or kit comprising reagents for detecting specific SNP loci, wherein the specific SNP loci comprise any one or more of combinations 1 to 6. Each locus below is given by its position on the Mycobacterium tuberculosis chromosome, followed by its nucleotides:
Combination 1 comprises at least one of SNP loci 1 to 4 and is used to identify Mycobacterium tuberculosis L2.3.1: SNP locus 1, position 655559, G/A; SNP locus 2, position 2093991, C/T; SNP locus 3, position 3218997, T/A; SNP locus 4, position 3684649, G/A;
Combination 2 comprises at least one of SNP loci 5 to 8 and is used to identify Mycobacterium tuberculosis L2.3.2: SNP locus 5, position 36577, C/T; SNP locus 6, position 1674210, A/C; SNP locus 7, position 1772288, T/G; SNP locus 8, position 3529598, C/G;
Combination 3 comprises at least one of SNP loci 9 to 12 and is used to identify Mycobacterium tuberculosis L2.3.3: SNP locus 9, position 594351, T/C; SNP locus 10, position 4131585, A/C; SNP locus 11, position 3897155, T/C; SNP locus 12, position 1600039, G/T;
Combination 4 comprises at least one of SNP loci 13 to 16 and is used to identify Mycobacterium tuberculosis L2.3.4: SNP locus 13, position 1451405, G/A; SNP locus 14, position 4219219, G/A; SNP locus 15, position 1279968, G/C; SNP locus 16, position 2185884, T/C;
Combination 5 comprises at least one of SNP loci 17 to 19 and is used to identify Mycobacterium tuberculosis L2.3.5: SNP locus 17, position 2474686, C/T; SNP locus 18, position 3355057, C/T; SNP locus 19, position 4215793, A/C;
Combination 6 comprises at least one of SNP loci 20 to 23 and is used to identify Mycobacterium tuberculosis L2.3.6: SNP locus 20, position 1151304, C/A; SNP locus 21, position 1446733, G/A; SNP locus 22, position 2838897, C/G; SNP locus 23, position 3056767, G/T.
Specific information on the specific SNP loci can be found in Table 1.
Table 1. Specific information on the specific SNP loci
The physical positions of the SNP loci are determined by alignment to the whole genome sequence of Mycobacterium tuberculosis H37Rv, whose accession number is NC_000962.3.
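As a minimal sketch of how the six combinations could be applied to already-called genotypes, the panel can be held as a lookup table keyed by sub-lineage. The positions and bases are transcribed from the combinations above. Two things are assumptions made for illustration and are not stated in the text: that in "X/Y" the second base is the sub-lineage-defining allele, and the names `PANEL`, `assign_sublineage` and `min_hits`, which are hypothetical.

```python
# Marker positions (NC_000962.3 coordinates) transcribed from combinations 1 to 6.
# ASSUMPTION: in "X/Y", Y is taken to be the allele that marks the sub-lineage.
PANEL = {
    "L2.3.1": {655559: "A", 2093991: "T", 3218997: "A", 3684649: "A"},
    "L2.3.2": {36577: "T", 1674210: "C", 1772288: "G", 3529598: "G"},
    "L2.3.3": {594351: "C", 4131585: "C", 3897155: "C", 1600039: "T"},
    "L2.3.4": {1451405: "A", 4219219: "A", 1279968: "C", 2185884: "C"},
    "L2.3.5": {2474686: "T", 3355057: "T", 4215793: "C"},
    "L2.3.6": {1151304: "A", 1446733: "A", 2838897: "G", 3056767: "T"},
}


def assign_sublineage(calls, min_hits=1):
    """Count marker alleles observed for each sub-lineage.

    calls: dict mapping chromosome position -> observed base for one strain.
    Returns a dict of sub-lineage -> number of matching markers, keeping
    only sub-lineages with at least `min_hits` matches.
    """
    hits = {}
    for name, markers in PANEL.items():
        n = sum(1 for pos, base in markers.items() if calls.get(pos) == base)
        if n >= min_hits:
            hits[name] = n
    return hits
```

Raising `min_hits` above 1 requires several markers of the same combination to agree before a sub-lineage is reported, which trades some sensitivity for robustness against a single miscalled site.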
The reagent or kit can detect the SNP sets of the six sub-branches of Mycobacterium tuberculosis lineage 2.3 and improves the accuracy of identification or evolutionary analysis of the six branches, so that the dominant epidemic branches within lineage 2.3 can be found and the capacity for precise prevention and control of lineage 2.3 is improved. Compared with other SNP locus combinations, the combination provided by the embodiments of the invention for identifying the six branches has high specificity and good sensitivity.
In some embodiments, the specific SNP loci include combinations 1 to 6.
The test set used for screening the specific SNP loci consists of the whole genome sequences of 200 Mycobacterium tuberculosis lineage 2.3 strains from East Asia, and the training set covers 2000 Mycobacterium tuberculosis lineage 2.3 strains discovered and sequenced worldwide to date, so the SNP set of 23 loci reflects more comprehensively the identification and evolution of the six branches of Mycobacterium tuberculosis lineage 2.3. Furthermore, the SNP set maintains high resolution while avoiding the detection of unnecessary SNP loci, balancing the two. In addition, using the 23 SNP loci to identify global Mycobacterium tuberculosis lineage 2.3 strains gives high accuracy and strong specificity, enabling the identification and precise prevention and control of the sub-branches of this lineage that have strong pathogenicity and transmissibility.
In some embodiments, the reagent or kit has any one of the following uses: (1) sub-lineage determination of Mycobacterium tuberculosis lineage 2.3; (2) strain-level identification of Mycobacterium tuberculosis lineage 2.3 subtype strains; (3) evolutionary analysis of Mycobacterium tuberculosis lineage 2.3.
In some embodiments, the reagent or kit comprises: any one or a combination of a plurality of PCR primers, molecular probes, biosensors and chips.
In another aspect, embodiments of the present invention provide the use of a reagent or kit as described in any of the preceding embodiments for the preparation of a product having any of the following uses: (1) sub-lineage determination of Mycobacterium tuberculosis lineage 2.3; (2) strain-level identification of Mycobacterium tuberculosis lineage 2.3 subtype strains; (3) evolutionary analysis of Mycobacterium tuberculosis lineage 2.3.
In another aspect, embodiments of the present invention provide the use of a reagent or kit as described in any of the preceding embodiments for the preparation of a product for the diagnosis or auxiliary diagnosis of a disease or condition associated with a Mycobacterium tuberculosis lineage 2.3 subtype strain.
In some embodiments, the related disease comprises tuberculosis associated with a Mycobacterium tuberculosis lineage 2.3 subtype strain.
In some embodiments, the Mycobacterium tuberculosis lineage 2.3 subtype strain includes any one or more of Mycobacterium tuberculosis 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.3.5 and 2.3.6.
In addition, embodiments of the present invention provide a method for sub-lineage determination and/or strain-level identification and/or evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, which comprises detecting the specific SNP loci described in the previous embodiments.
In some embodiments, the methods are not directed to diagnosis or treatment of a disease.
In some embodiments, the detecting comprises: gel electrophoresis-based SNP detection, DNA sequencing, DNA chip methods, denaturing high performance liquid chromatography, and mass spectrometry.
In some embodiments, the gel electrophoresis-based SNP detection method comprises one or more of single-strand conformation polymorphism detection, denaturing gradient gel electrophoresis, cleaved amplified polymorphic sequence detection, and allele-specific PCR.
In some embodiments, the mass spectrometry detection method comprises matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF).
The 23 SNP loci were obtained by comparing the whole genome data of 140 Mycobacterium tuberculosis lineage 2.3 strains and were verified against a phylogenetic tree constructed from the genome data of 2000 L2.3 strains from 51 countries/regions worldwide, so that they represent most of the SNP loci of each of the six evolutionary branches on the lineage 2.3 tree. The SNP loci characterize well the genetic evolution information of the lineage 2.3 branches and the mutation information of key genes. Detecting these SNP loci of the six genetic evolutionary branches of Mycobacterium tuberculosis lineage 2.3 provides the above information and is of great significance for the genetic classification of the lineage 2.3 branches and for the prevention and control of tuberculosis.
The features and capabilities of the present invention are described in further detail below in connection with the examples.
Example 1: Screening whole-genome SNPs and constructing the L2.3 evolutionary tree
SNP loci were screened at the whole-genome scale of Mycobacterium tuberculosis as follows:
1. Strain data download
Whole-genome data for 300 representative East Asian Mycobacterium tuberculosis strains of different lineages were downloaded from the NCBI database.
2. Quality assessment of strain sequencing data
The GC content of each strain's sequencing fastq files was assessed for quality control using FastQC (version 0.10.1): (1) sequencing data with a GC content below 60% or above 70% were removed; (2) strains with a SNP missing-call rate above 15% were removed. Sequencing data from the final 200 Mycobacterium tuberculosis lineage 2.3 strains were included in the study.
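The two exclusion criteria above can be written as a single predicate. This is only an illustrative sketch: the `SampleQC` record, its field names and `passes_qc` are hypothetical, and it assumes the GC content and SNP missing-call rate have already been computed for each strain (for example from the FastQC report and the variant calls).

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    name: str
    gc_percent: float     # GC content of the sequencing reads, in percent
    snp_miss_rate: float  # fraction of SNP positions without a call (0 to 1)


def passes_qc(sample, gc_min=60.0, gc_max=70.0, max_miss=0.15):
    """Keep a strain only if its GC content lies within [gc_min, gc_max]
    and its SNP missing-call rate does not exceed max_miss."""
    gc_ok = gc_min <= sample.gc_percent <= gc_max
    miss_ok = sample.snp_miss_rate <= max_miss
    return gc_ok and miss_ok
```

A typical use would be `kept = [s for s in samples if passes_qc(s)]`, with the thresholds left as parameters so they can be adjusted for other data sets.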
3. Mapping rate statistics
The sequencing files of each strain were aligned to the Mycobacterium tuberculosis reference genome H37Rv (NC_000962.3) using Bowtie 2 (version 2.2.9), and the mapping rate of each strain was calculated.
4. SNP calling
This example uses a resequencing comparative analysis approach in which short sequencing reads are mapped to a reference genome. Specifically, the Sickle tool was used to trim the WGS data. Sequencing reads with a Phred base quality above 20 and a length above 30 were retained for analysis. The whole genome sequence of the Mycobacterium tuberculosis H37Rv strain (NC_000962.2) was used as the reference template for read mapping. Sequencing reads were mapped to the reference genome using Bowtie 2 (version 2.2.9). SAMtools (version 1.3.1) was used for SNP calling with a mapping quality greater than 30. Fixed mutations (frequency ≥ 75%) supported by at least 10 reads were identified using VarScan (version 2.3.9) with the strand bias filter option enabled. All SNPs located in repetitive regions of the genome that are difficult to characterize by short-read sequencing (e.g., PPE/PE-PGRS family genes, phage sequences, insertion sequences or mobile genetic elements) were excluded. Small insertions or deletions identified by VarScan (version 2.3.9) were also excluded. Finally, 8012 SNP loci were obtained from the 300 sequenced strains.
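The fixed-mutation criteria above (allele frequency of at least 75%, at least 10 supporting reads, and exclusion of repetitive regions) can be sketched as a per-site filter. This is not VarScan's implementation. The function names, the read-count inputs and the `excluded_regions` list of (start, end) intervals are hypothetical and stand in for whatever annotation of PPE/PE-PGRS genes, phage sequences and mobile elements is used.

```python
from bisect import bisect_right


def in_regions(pos, regions):
    """Return True if pos falls inside one of the regions.

    regions: list of (start, end) tuples, inclusive, sorted by start
    and non-overlapping.
    """
    starts = [start for start, _ in regions]
    i = bisect_right(starts, pos) - 1
    return i >= 0 and regions[i][0] <= pos <= regions[i][1]


def keep_snp(pos, alt_reads, total_reads, excluded_regions,
             min_freq=0.75, min_alt_reads=10):
    """Keep a SNP only if it looks fixed and lies outside excluded regions."""
    if total_reads == 0 or alt_reads < min_alt_reads:
        return False
    if alt_reads / total_reads < min_freq:
        return False
    return not in_regions(pos, excluded_regions)
```

For example, a site with 12 of 15 reads supporting the variant (80%) outside any excluded interval is kept, while a site with 20 of 20 supporting reads inside an excluded interval is dropped.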
The Mycobacterium tuberculosis evolutionary tree was constructed according to the following steps:
Mixed-infection isolates were excluded from phylogenetic reconstruction by examining the genotype heterozygosity of SNPs. For all phylogenetic reconstructions, the SNPs of the MTBC isolates were combined into a common, non-redundant list, and nucleotide positions with gaps in more than 5% of taxa (probably due to insertions or deletions, low coverage, etc.) were excluded. The alignment of polymorphic positions from all strains was used for phylogenetic reconstruction in MEGA X. When the number of taxa was large, the tree structure was first inferred using the neighbor-joining method. For the final phylogenetic estimate, the maximum likelihood method under the general time-reversible (GTR) model was applied, with at least 100 bootstrap replicates to assess confidence. Phylogenetic trees were visualized in FigTree (version 1.4.3) (http://tree.bio.ed.ac.uk/software/figtree/) or iTOL (FIG. 1).
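The exclusion of positions with gaps in more than 5% of taxa can be illustrated with a small column filter over a SNP alignment. This is a sketch under stated assumptions: the alignment is represented as a plain dict of equal-length strings, both `-` and `N` are treated as gaps, and the function name is hypothetical. The patent does not say which tool performed this step.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.05, gap_chars="-N"):
    """Remove alignment columns in which too many strains have a gap.

    alignment: dict mapping strain name -> string of bases, all strings
    having the same length.
    Returns (filtered_alignment, kept_column_indices).
    """
    names = list(alignment)
    n_strains = len(names)
    length = len(alignment[names[0]])
    keep = []
    for i in range(length):
        gaps = sum(1 for name in names if alignment[name][i] in gap_chars)
        if gaps / n_strains <= max_gap_fraction:
            keep.append(i)
    filtered = {
        name: "".join(alignment[name][i] for i in keep) for name in names
    }
    return filtered, keep
```

The filtered alignment can then be written out in whatever format the tree-building program expects.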
Example 2: Screening L2.3-specific SNPs and constructing the subtype classification
In this example, the 200 L2.3 strains were used to identify the SNPs unique to each of the six sub-branches. Briefly, the ancestral sequences of the different L2.3 sub-branches were reconstructed using Baseml, and the sequence of the most recent common ancestor (MRCA) of each sub-branch was generated and compared with the closest L2.2 MRCA sequence. The minimal set of SNPs able to identify the L2.3 sub-branches was thus obtained, as listed in Table 1. The Mycobacterium tuberculosis evolutionary tree was then constructed according to the following steps:
Mixed-infection isolates were excluded from phylogenetic reconstruction by examining the genotype heterozygosity of SNPs. For all phylogenetic reconstructions, the SNPs of the MTBC isolates were combined into a common, non-redundant list, and nucleotide positions with gaps in more than 5% of taxa (probably due to insertions or deletions, low coverage, etc.) were excluded. The alignment of polymorphic positions from all strains was used for phylogenetic reconstruction in MEGA X. When the number of taxa was large, the tree structure was first inferred using the neighbor-joining method. For the final phylogenetic estimate, the maximum likelihood method under the general time-reversible (GTR) model was applied, with at least 100 bootstrap replicates to assess confidence. Phylogenetic trees were visualized in FigTree (version 1.4.3) (http://tree.bio.ed.ac.uk/software/figtree/) or iTOL. The differentiation nodes of the lineage 2.3 phylogenetic tree were finally obtained (see FIG. 1). Specifically, the above SNP loci were used to subdivide Mycobacterium tuberculosis lineage 2.3 into the following six branches: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.3.5 and 2.3.6.
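The notion of a SNP that is unique to one sub-branch can be illustrated with simple set operations. Note that this is a deliberately simplified stand-in and not the ancestral sequence reconstruction with Baseml described in this example: it just reports variants carried by every strain of a branch and by no strain outside it. The data layout and the function name are hypothetical.

```python
def branch_specific_snps(genotypes, branch_of):
    """Find variants shared by all strains of a branch and absent elsewhere.

    genotypes: dict strain -> set of (position, base) variants.
    branch_of: dict strain -> branch label.
    Returns dict branch -> sorted list of branch-specific variants.
    """
    members = {}
    for strain, branch in branch_of.items():
        members.setdefault(branch, []).append(strain)

    result = {}
    for branch, strains in members.items():
        # Variants present in every member of this branch.
        shared = set.intersection(*(genotypes[s] for s in strains))
        # Variants seen in any strain that belongs to another branch.
        outside = set()
        for strain, other in branch_of.items():
            if other != branch:
                outside |= genotypes[strain]
        result[branch] = sorted(shared - outside)
    return result
```

Unlike an ancestral reconstruction, this approach is sensitive to miscalled or missing genotypes in individual strains, so it is best read as an explanation of the selection criterion rather than as a replacement for the method used here.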
Example 3: Evaluation of strain-level identification by the SNP set
The genome sequences of 2000 L2.3 Mycobacterium tuberculosis strains from 51 countries worldwide were downloaded from the NCBI database (FIG. 2). These strains were typed using the resulting set of 23 SNPs (see Table 1), and the typing results based on the 23-SNP set were fully consistent with those based on whole genome sequence data. For each subtype, the sensitivity was 86.4% with single-site identification and 100% with multi-site identification, and all strains were classified correctly.
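The difference between single-site and multi-site sensitivity can be made concrete with the standard definition, sensitivity = TP / (TP + FN). How the figures in this example were actually computed is not stated, so the interpretation below is an assumption: a strain counts as detected at a single site if it carries that marker, and as detected by the combination if it carries at least one marker of the set. The function names and the toy input are hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive samples")
    return true_positives / total


def per_site_and_combined(marker_calls):
    """Compare single-site and multi-site detection for one sub-branch.

    marker_calls: one row per strain known to belong to the sub-branch;
    each row is a list of booleans, one per marker site, that is True
    when the marker allele was detected at that site.
    Returns (list of per-site sensitivities, combined sensitivity).
    """
    n_sites = len(marker_calls[0])
    per_site = []
    for j in range(n_sites):
        tp = sum(1 for row in marker_calls if row[j])
        fn = sum(1 for row in marker_calls if not row[j])
        per_site.append(sensitivity(tp, fn))
    tp_any = sum(1 for row in marker_calls if any(row))
    fn_any = sum(1 for row in marker_calls if not any(row))
    return per_site, sensitivity(tp_any, fn_any)
```

With four strains and two markers where each marker is missed in one different strain, each site alone detects 3 of 4 strains, while the two sites together detect all four.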
Finally, it should be noted that the prior art does not clearly define the typing of the subtypes within L2.3, and conventional experimental methods (such as VNTR and 16S rRNA analysis) cannot identify and distinguish the multiple subtypes of L2.3. Genome-wide assays often require tens of thousands of loci to obtain an accurate structure. The invention screens out an optimal combination of 23 loci (at least 3 specific targets for each subtype) that is suitable for identifying the multiple subtypes of L2.3, ensures classification accuracy, greatly reduces the amount of computation, and achieves identification at the subspecies level.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (8)

1. Use of a reagent in the preparation of a product for determining the sub-lineage of Mycobacterium tuberculosis lineage 2.3, characterized in that
the reagent comprises a reagent for detecting SNP-specific sites; the SNP-specific sites consist of combinations 1 to 6;
combination 1 consists of SNP loci 1 to 4: SNP locus 1 is located at position 655559 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 2 is located at position 2093991 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 3 is located at position 3218997 on the Mycobacterium tuberculosis chromosome, with nucleotide T/A; SNP locus 4 is located at position 3684649 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A;
combination 2 consists of SNP loci 5 to 8: SNP locus 5 is located at position 36577 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 6 is located at position 1674210 on the Mycobacterium tuberculosis chromosome, with nucleotide A/C; SNP locus 7 is located at position 1772288 on the Mycobacterium tuberculosis chromosome, with nucleotide T/G; SNP locus 8 is located at position 3529598 on the Mycobacterium tuberculosis chromosome, with nucleotide C/G;
combination 3 consists of SNP loci 9 to 12: SNP locus 9 is located at position 594351 on the Mycobacterium tuberculosis chromosome, with nucleotide T/C; SNP locus 10 is located at position 4131585 on the Mycobacterium tuberculosis chromosome, with nucleotide A/C; SNP locus 11 is located at position 3897155 on the Mycobacterium tuberculosis chromosome, with nucleotide T/C; SNP locus 12 is located at position 1600039 on the Mycobacterium tuberculosis chromosome, with nucleotide G/T;
combination 4 consists of SNP loci 13 to 16: SNP locus 13 is located at position 1451405 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 14 is located at position 4219219 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 15 is located at position 1279968 on the Mycobacterium tuberculosis chromosome, with nucleotide G/C; SNP locus 16 is located at position 2185884 on the Mycobacterium tuberculosis chromosome, with nucleotide T/C;
combination 5 consists of SNP loci 17 to 19: SNP locus 17 is located at position 2474686 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 18 is located at position 3355057 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 19 is located at position 4215793 on the Mycobacterium tuberculosis chromosome, with nucleotide A/C;
combination 6 consists of SNP loci 20 to 23: SNP locus 20 is located at position 1151304 on the Mycobacterium tuberculosis chromosome, with nucleotide C/A; SNP locus 21 is located at position 1446733 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 22 is located at position 2838897 on the Mycobacterium tuberculosis chromosome, with nucleotide C/G; SNP locus 23 is located at position 3056767 on the Mycobacterium tuberculosis chromosome, with nucleotide G/T;
the physical positions of the SNP loci are determined by alignment to the whole-genome sequence of Mycobacterium tuberculosis H37Rv, the accession number of which is NC_000962.3.
2. Use of a reagent in the preparation of a product for the evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, characterized in that
the reagent comprises a reagent for detecting SNP-specific sites; the SNP-specific sites consist of combinations 1 to 6;
combination 1 consists of SNP loci 1 to 4: SNP locus 1 is located at position 655559 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 2 is located at position 2093991 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 3 is located at position 3218997 on the Mycobacterium tuberculosis chromosome, with nucleotide T/A; SNP locus 4 is located at position 3684649 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A;
combination 2 consists of SNP loci 5 to 8: SNP locus 5 is located at position 36577 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 6 is located at position 1674210 on the Mycobacterium tuberculosis chromosome, with nucleotide A/C; SNP locus 7 is located at position 1772288 on the Mycobacterium tuberculosis chromosome, with nucleotide T/G; SNP locus 8 is located at position 3529598 on the Mycobacterium tuberculosis chromosome, with nucleotide C/G;
combination 3 consists of SNP loci 9 to 12: SNP locus 9 is located at position 594351 on the Mycobacterium tuberculosis chromosome, with nucleotide T/C; SNP locus 10 is located at position 4131585 on the Mycobacterium tuberculosis chromosome, with nucleotide A/C; SNP locus 11 is located at position 3897155 on the Mycobacterium tuberculosis chromosome, with nucleotide T/C; SNP locus 12 is located at position 1600039 on the Mycobacterium tuberculosis chromosome, with nucleotide G/T;
combination 4 consists of SNP loci 13 to 16: SNP locus 13 is located at position 1451405 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 14 is located at position 4219219 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 15 is located at position 1279968 on the Mycobacterium tuberculosis chromosome, with nucleotide G/C; SNP locus 16 is located at position 2185884 on the Mycobacterium tuberculosis chromosome, with nucleotide T/C;
combination 5 consists of SNP loci 17 to 19: SNP locus 17 is located at position 2474686 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 18 is located at position 3355057 on the Mycobacterium tuberculosis chromosome, with nucleotide C/T; SNP locus 19 is located at position 4215793 on the Mycobacterium tuberculosis chromosome, with nucleotide A/C;
combination 6 consists of SNP loci 20 to 23: SNP locus 20 is located at position 1151304 on the Mycobacterium tuberculosis chromosome, with nucleotide C/A; SNP locus 21 is located at position 1446733 on the Mycobacterium tuberculosis chromosome, with nucleotide G/A; SNP locus 22 is located at position 2838897 on the Mycobacterium tuberculosis chromosome, with nucleotide C/G; SNP locus 23 is located at position 3056767 on the Mycobacterium tuberculosis chromosome, with nucleotide G/T;
the physical positions of the SNP loci are determined by alignment to the whole-genome sequence of Mycobacterium tuberculosis H37Rv, the accession number of which is NC_000962.3.
3. The use according to claim 1 or 2, wherein the reagent comprises any one or a combination of several of PCR primers, molecular probes, biosensors and chips.
4. A method for determining the sub-lineage of Mycobacterium tuberculosis lineage 2.3, comprising detecting the SNP-specific sites as set forth in claim 1, wherein the method is not directly aimed at the diagnosis or treatment of a disease.
5. A method for the evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, comprising detecting the SNP-specific sites as set forth in claim 2, wherein the method is not directly aimed at the diagnosis or treatment of a disease.
6. The method according to claim 4 or 5, wherein the detecting comprises: gel electrophoresis-based SNP detection method, DNA sequencing method, DNA chip method, denaturing high performance liquid chromatography and mass spectrometry detection method.
7. The method of claim 6, wherein the gel electrophoresis-based SNP detection method comprises: single strand conformational polymorphism detection, denaturing gradient gel electrophoresis detection, enzyme digestion amplification polymorphic sequence detection, and allele-specific PCR detection.
8. The method of claim 6, wherein the mass spectrometry detection method comprises matrix-assisted laser desorption ionization time-of-flight mass spectrometry detection.
CN202310340239.2A 2023-03-31 2023-03-31 Method for identifying mycobacterium tuberculosis pedigree 2.3 subtype strain and application thereof Active CN116254356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310340239.2A CN116254356B (en) 2023-03-31 2023-03-31 Method for identifying mycobacterium tuberculosis pedigree 2.3 subtype strain and application thereof

Publications (2)

Publication Number Publication Date
CN116254356A (en) 2023-06-13
CN116254356B (en) 2024-04-30

Family

ID=86682612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310340239.2A Active CN116254356B (en) 2023-03-31 2023-03-31 Method for identifying mycobacterium tuberculosis pedigree 2.3 subtype strain and application thereof

Country Status (1)

Country Link
CN (1) CN116254356B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106701979A (en) * 2017-02-04 2017-05-24 北京市结核病胸部肿瘤研究所 Kit used for mycobacterium tuberculosis typing SNP site and application thereof
CN110453001A (en) * 2019-08-28 2019-11-15 北京市结核病胸部肿瘤研究所 Application of 1073 SNP sites in mycobacterium tuberculosis pedigree 3

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Revised nomenclature and SNP barcode for Mycobacterium tuberculosis lineage 2; Yuttapong Thawornwattana et al.; Microbiology Society; Vol. 7 (No. 11); pp. 000697-000710 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant