CN116254356B - Method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and application thereof
- Publication number
- CN116254356B (CN202310340239.2A)
- Authority
- CN
- China
- Prior art keywords
- mycobacterium tuberculosis
- chromosome
- nucleotide sequence
- snp
- site
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/689—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12R—INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
- C12R2001/00—Microorganisms ; Processes using microorganisms
- C12R2001/01—Bacteria or Actinomycetales ; using bacteria or Actinomycetales
- C12R2001/32—Mycobacterium
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The invention discloses a method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and an application thereof, and relates to the field of biology. The 23 SNP-specific loci screened by the method comprehensively reflect the identification and evolution of the six branches of Mycobacterium tuberculosis lineage 2.3. The SNP combination maintains high resolution while avoiding the detection of unnecessary SNP loci, balancing the two. In addition, using the 23 SNP loci to identify global Mycobacterium tuberculosis lineage 2.3 strains gives high accuracy and strong specificity, thereby enabling the identification and precise prevention and control of the sub-branches of the Mycobacterium tuberculosis lineage with strong pathogenicity and transmissibility.
Description
Technical Field
The invention relates to the field of biology, in particular to a method for identifying Mycobacterium tuberculosis lineage 2.3 subtype strains and an application thereof.
Background
Tuberculosis (TB), one of the oldest diseases in human history, is also the disease that causes the largest number of deaths from a single pathogen worldwide. It has become a major global public health problem, and its prevention and treatment remain a long and arduous task. Mycobacterium tuberculosis (M. tuberculosis), commonly known as the tubercle bacillus, is the causative agent of tuberculosis. As the Mycobacterium tuberculosis complex (MTBC) followed human migration out of Africa, it differentiated into nine human-adapted lineages (L1-L9). Different lineages show strong biogeographic associations and differ greatly in their spread and prevalence.
Lineage 2 originated in or near Southeast Asia and underwent in situ genetic differentiation into 3 major subtypes: L2.1, L2.2 and L2.3. Among them, L2.1 differentiated earliest, has a low prevalence, and is found only in Southeast Asia. L2.2, although widely distributed in East Asia, has spread less to countries outside Asia. Earlier studies found that L2.3 is in fact a late-differentiating branch within L2.2 (formed about 500 years ago), and that L2.3 accounts for more than 90% of all epidemic Mycobacterium tuberculosis lineage 2 strains. Molecular epidemiological studies have found that L2.3 causes more recent transmission than L2.2; animal model infection experiments also show that L2.3 is more pathogenic. It follows that the transmission advantage of Mycobacterium tuberculosis lineage 2 is in fact that of the L2.3 subtype strains. It is therefore necessary to achieve efficient strain-level identification of Mycobacterium tuberculosis lineage 2.3.
In view of this, the present invention has been made.
Disclosure of Invention
The invention aims at providing a method for identifying a mycobacterium tuberculosis lineage 2.3 subtype strain and application thereof.
The invention is realized in the following way:
In a first aspect, embodiments of the present invention provide a reagent or kit comprising reagents for detecting SNP-specific loci; the SNP-specific loci comprise any one or more of combinations 1 to 6. In each combination below, positions refer to the Mycobacterium tuberculosis chromosome and the nucleotide at each locus is given as X/Y;
Combination 1 comprises at least one of SNP loci 1 to 4: SNP locus 1, position 655559, G/A; SNP locus 2, position 2093991, C/T; SNP locus 3, position 3218997, T/A; SNP locus 4, position 3684649, G/A;
Combination 2 comprises at least one of SNP loci 5 to 8: SNP locus 5, position 36577, C/T; SNP locus 6, position 1674210, A/C; SNP locus 7, position 1772288, T/G; SNP locus 8, position 3529598, C/G;
Combination 3 comprises at least one of SNP loci 9 to 12: SNP locus 9, position 594351, T/C; SNP locus 10, position 4131585, A/C; SNP locus 11, position 3897155, T/C; SNP locus 12, position 1600039, G/T;
Combination 4 comprises at least one of SNP loci 13 to 16: SNP locus 13, position 1451405, G/A; SNP locus 14, position 4219219, G/A; SNP locus 15, position 1279968, G/C; SNP locus 16, position 2185884, T/C;
Combination 5 comprises at least one of SNP loci 17 to 19: SNP locus 17, position 2474686, C/T; SNP locus 18, position 3355057, C/T; SNP locus 19, position 4215793, A/C;
Combination 6 comprises at least one of SNP loci 20 to 23: SNP locus 20, position 1151304, C/A; SNP locus 21, position 1446733, G/A; SNP locus 22, position 2838897, C/G; SNP locus 23, position 3056767, G/T.
The physical positions of the SNP loci are determined by alignment against the whole-genome sequence of Mycobacterium tuberculosis H37Rv, whose accession number is NC_000962.3.
In a second aspect, embodiments of the present invention provide the use of the reagent or kit described in the preceding embodiments in the preparation of a product having any one of the following uses: (1) sub-lineage determination of Mycobacterium tuberculosis lineage 2.3; (2) strain-level identification of Mycobacterium tuberculosis lineage 2.3 subtype strains; (3) evolutionary analysis of Mycobacterium tuberculosis lineage 2.3.
In a third aspect, embodiments of the present invention provide the use of the reagent or kit described in the preceding embodiments in the preparation of a product for the diagnosis or auxiliary diagnosis of a disease or condition associated with a Mycobacterium tuberculosis lineage 2.3 subtype strain.
In a fourth aspect, embodiments of the present invention provide a method for sub-lineage determination and/or strain-level identification and/or evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, the method comprising detecting the SNP-specific loci described in the preceding embodiments.
The invention has the following beneficial effects:
(1) The 23 SNP loci involved in the kit, use and method were obtained by genetic differentiation analysis of the whole-genome sequences of 200 Mycobacterium tuberculosis lineage 2.3 strains, and were verified against the genome sequences of 2000 L2.3 Mycobacterium tuberculosis strains from 51 countries/regions worldwide. Functional enrichment of the 23 specific SNP loci indicates that they are all key SNP loci that characterize the 6 genetic evolutionary branches of Mycobacterium tuberculosis lineage 2.3 and are closely related to Mycobacterium tuberculosis gene function. Detection of these SNP loci is therefore of great importance for the identification and evolutionary analysis of the 6 branches of Mycobacterium tuberculosis lineage 2.3.
(2) The 23 SNP loci involved in the kit, use and method have high accuracy and specificity, reaching 100%, when identifying single Mycobacterium tuberculosis strains, and can accurately identify the 6 genetic branches of Mycobacterium tuberculosis lineage 2.3 at the strain level.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a phylogenetic tree constructed from a whole genome SNP set;
FIG. 2 shows the global lineage 2.3 validation results.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below. Where specific conditions are not noted in the examples, the procedures are carried out under conventional conditions or the conditions recommended by the manufacturer. Reagents or apparatus for which no manufacturer is indicated are conventional, commercially available products.
At present, the typing methods for Mycobacterium tuberculosis mainly comprise variable number tandem repeat (VNTR) typing, single nucleotide polymorphism (SNP) typing, and whole genome sequence (WGS) analysis. VNTR is the most common typing method. WGS is currently the best genotyping method for M. tuberculosis: it has high resolution and overcomes the limitations arising from homology and heterogeneity. However, using all genome information requires a huge amount of analysis and computation, and the information at many gene loci is completely identical across strains, so strain-level identification can be achieved by extracting only the loci that differ, namely the SNP loci.
The embodiment of the invention provides a reagent or kit comprising reagents for detecting SNP-specific loci; the SNP-specific loci comprise any one or more of combinations 1 to 6. In each combination below, positions refer to the Mycobacterium tuberculosis chromosome and the nucleotide at each locus is given as X/Y;
Combination 1 comprises at least one of SNP loci 1 to 4 and is used to identify Mycobacterium tuberculosis L2.3.1: SNP locus 1, position 655559, G/A; SNP locus 2, position 2093991, C/T; SNP locus 3, position 3218997, T/A; SNP locus 4, position 3684649, G/A;
Combination 2 comprises at least one of SNP loci 5 to 8 and is used to identify Mycobacterium tuberculosis L2.3.2: SNP locus 5, position 36577, C/T; SNP locus 6, position 1674210, A/C; SNP locus 7, position 1772288, T/G; SNP locus 8, position 3529598, C/G;
Combination 3 comprises at least one of SNP loci 9 to 12 and is used to identify Mycobacterium tuberculosis L2.3.3: SNP locus 9, position 594351, T/C; SNP locus 10, position 4131585, A/C; SNP locus 11, position 3897155, T/C; SNP locus 12, position 1600039, G/T;
Combination 4 comprises at least one of SNP loci 13 to 16 and is used to identify Mycobacterium tuberculosis L2.3.4: SNP locus 13, position 1451405, G/A; SNP locus 14, position 4219219, G/A; SNP locus 15, position 1279968, G/C; SNP locus 16, position 2185884, T/C;
Combination 5 comprises at least one of SNP loci 17 to 19 and is used to identify Mycobacterium tuberculosis L2.3.5: SNP locus 17, position 2474686, C/T; SNP locus 18, position 3355057, C/T; SNP locus 19, position 4215793, A/C;
Combination 6 comprises at least one of SNP loci 20 to 23 and is used to identify Mycobacterium tuberculosis L2.3.6: SNP locus 20, position 1151304, C/A; SNP locus 21, position 1446733, G/A; SNP locus 22, position 2838897, C/G; SNP locus 23, position 3056767, G/T.
Specific information on the SNP-specific loci can be found in Table 1.
TABLE 1 Specific information on the SNP-specific loci
The physical positions of the SNP loci are determined by alignment against the whole-genome sequence of Mycobacterium tuberculosis H37Rv, whose accession number is NC_000962.3.
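As an illustration only, the six combinations can be written as a lookup table and used to assign an isolate to a sub-lineage from its base calls. The sketch below assumes that the second nucleotide of each X/Y pair is the sub-lineage-specific allele, which the text does not state explicitly; the names PANEL and assign_sublineage are hypothetical and are not part of the invention.

```python
# Marker panel transcribed from combinations 1 to 6.
# Key: 1-based position on H37Rv (NC_000962.3); value: (first allele, second allele).
# ASSUMPTION: the second allele of each "X/Y" pair is the sub-lineage-specific one.
PANEL = {
    "L2.3.1": {655559: ("G", "A"), 2093991: ("C", "T"),
               3218997: ("T", "A"), 3684649: ("G", "A")},
    "L2.3.2": {36577: ("C", "T"), 1674210: ("A", "C"),
               1772288: ("T", "G"), 3529598: ("C", "G")},
    "L2.3.3": {594351: ("T", "C"), 4131585: ("A", "C"),
               3897155: ("T", "C"), 1600039: ("G", "T")},
    "L2.3.4": {1451405: ("G", "A"), 4219219: ("G", "A"),
               1279968: ("G", "C"), 2185884: ("T", "C")},
    "L2.3.5": {2474686: ("C", "T"), 3355057: ("C", "T"),
               4215793: ("A", "C")},
    "L2.3.6": {1151304: ("C", "A"), 1446733: ("G", "A"),
               2838897: ("C", "G"), 3056767: ("G", "T")},
}


def assign_sublineage(calls):
    """Return the sub-lineages whose marker allele is observed in `calls`.

    `calls` maps a 1-based H37Rv position to the base observed in the isolate.
    A sub-lineage is reported when at least one of its loci carries the
    assumed marker (second) allele, mirroring "at least one of SNP loci".
    """
    hits = []
    for sublineage, loci in PANEL.items():
        if any(calls.get(pos) == marker for pos, (_first, marker) in loci.items()):
            hits.append(sublineage)
    return hits
```

For example, assign_sublineage({655559: "A"}) reports L2.3.1 under the stated assumption, whereas an isolate carrying G at that position and no other marker allele is not assigned to any of the six branches.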
The reagent or kit can identify the SNP sets of the 6 sub-branches of Mycobacterium tuberculosis lineage 2.3 and improves the accuracy of identification or evolutionary analysis of these 6 branches, so that the dominant branches circulating within lineage 2.3 can be found and the capacity for precise prevention and control of lineage 2.3 is improved. Compared with other SNP locus combinations, the SNP locus combination for identifying the 6 branches provided by the embodiments of the invention has high specificity and good sensitivity.
In some embodiments, the SNP specific sites include combinations 1-6.
The data set used to screen the SNP-specific loci consists of the whole-genome sequences of 200 Mycobacterium tuberculosis lineage 2.3 strains from East Asia, and the validation set covers 2000 Mycobacterium tuberculosis lineage 2.3 strains discovered and sequenced worldwide to date, so the SNP set comprising the 23 SNP loci can comprehensively reflect the identification and evolution of the six branches of Mycobacterium tuberculosis lineage 2.3. Secondly, the SNP set maintains high resolution while avoiding the detection of unnecessary SNP loci, balancing the two. In addition, using the 23 SNP loci to identify global Mycobacterium tuberculosis lineage 2.3 strains gives high accuracy and strong specificity, thereby enabling the identification and precise prevention and control of the sub-branches of the Mycobacterium tuberculosis lineage with strong pathogenicity and transmissibility.
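Sensitivity and specificity for a given sub-branch can be computed from the true and predicted branch labels of a validation set. The following is a minimal sketch using the standard definitions; the function name and the toy labels in the usage note are hypothetical and do not reproduce the validation data of the invention.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """Return (sensitivity, specificity) for one target sub-branch.

    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A value is NaN when its denominator is zero.
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == target:
            if predicted == target:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == target:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

With the toy labels true = ["L2.3.1", "L2.3.1", "L2.3.2", "L2.3.3"] and predicted = ["L2.3.1", "L2.3.2", "L2.3.2", "L2.3.3"], the call for target "L2.3.1" gives a sensitivity of 0.5 and a specificity of 1.0.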
In some embodiments, the reagent or kit has any one of the following uses: (1) sub-lineage determination of Mycobacterium tuberculosis lineage 2.3; (2) strain-level identification of Mycobacterium tuberculosis lineage 2.3 subtype strains; (3) evolutionary analysis of Mycobacterium tuberculosis lineage 2.3.
In some embodiments, the reagent or kit comprises: any one or a combination of a plurality of PCR primers, molecular probes, biosensors and chips.
In another aspect, embodiments of the present invention provide the use of the reagent or kit described in any of the preceding embodiments in the preparation of a product having any one of the following uses: (1) sub-lineage determination of Mycobacterium tuberculosis lineage 2.3; (2) strain-level identification of Mycobacterium tuberculosis lineage 2.3 subtype strains; (3) evolutionary analysis of Mycobacterium tuberculosis lineage 2.3.
In another aspect, embodiments of the present invention provide the use of the reagent or kit described in any of the preceding embodiments in the preparation of a product for the diagnosis or auxiliary diagnosis of a disease or condition associated with a Mycobacterium tuberculosis lineage 2.3 subtype strain.
In some embodiments, the related disease comprises tuberculosis associated with a mycobacterium tuberculosis lineage 2.3 subtype strain.
In some embodiments, the mycobacterium tuberculosis lineage 2.3 subtype strain includes: any one or a combination of more of mycobacterium tuberculosis 2.3.1, mycobacterium tuberculosis 2.3.2, mycobacterium tuberculosis 2.3.3, mycobacterium tuberculosis 2.3.4, mycobacterium tuberculosis 2.3.5 and mycobacterium tuberculosis 2.3.6.
In addition, the embodiment of the present invention provides a method for performing sub-lineage determination and/or strain level identification and/or evolution analysis on Mycobacterium tuberculosis lineage 2.3, which comprises detecting the SNP specific site described in the previous embodiment.
In some embodiments, the methods are not directed to diagnosis or treatment of a disease.
In some embodiments, the detecting comprises: gel electrophoresis-based SNP detection method, DNA sequencing method, DNA chip method, denaturing high performance liquid chromatography and mass spectrometry detection method.
In some embodiments, the gel electrophoresis-based SNP detection method comprises one or more of a single-strand conformational polymorphism detection method, a denaturing gradient gel electrophoresis detection method, an enzyme digestion amplification polymorphic sequence detection method and an allele-specific PCR detection method.
In some embodiments, the mass spectrometry detection method comprises matrix-assisted laser desorption ionization time-of-flight mass spectrometry detection (MALDI-TOF).
The 23 SNP loci were obtained by comparison of the whole-genome data of 140 Mycobacterium tuberculosis lineage 2.3 strains and were verified on a genetic evolutionary tree constructed from the genome data of 2000 L2.3 strains from 51 countries/regions worldwide, so that they are representative of the SNP loci of each of the 6 evolutionary branches on the Mycobacterium tuberculosis lineage 2.3 evolutionary tree. These SNP loci characterize well the genetic evolution information of the lineage 2.3 branches and the mutation information of key genes. Reagents capable of identifying these SNP loci of the 6 genetic evolutionary branches of Mycobacterium tuberculosis lineage 2.3 provide this information, which is of great significance for the genetic classification of the lineage 2.3 branches and for the prevention and control of tuberculosis.
The features and capabilities of the present invention are described in further detail below in connection with the examples.
Example 1: Screening all whole-genome SNPs and constructing the L2.3 evolutionary tree
The SNP sites were screened on the whole genome scale of Mycobacterium tuberculosis as follows:
1. Strain data download
Whole-genome data of 300 representative East Asian Mycobacterium tuberculosis strains of different lineages were downloaded from the NCBI database.
2. Quality assessment of strain sequencing data
Quality control assessment of the GC content of the strain sequencing fastq files was performed using FastQC (version 0.10.1): (1) sequencing data with a GC content of less than 60% or greater than 70% were removed; (2) strain sequencing data with an SNP missing-call rate greater than 15% were removed. The final 200 Mycobacterium tuberculosis lineage 2.3 sequencing data sets were all included in the study.
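The two exclusion criteria above amount to a simple per-sample filter. The sketch below assumes that the GC content (in percent) and the SNP missing-call rate (as a fraction) have already been computed for each sample; the sample names and values are hypothetical and serve only as an illustration.

```python
def passes_qc(gc_percent, snp_missing_rate):
    """Keep a sample with GC content within 60-70% and an SNP
    missing-call rate of at most 15% (thresholds taken from the text)."""
    return 60.0 <= gc_percent <= 70.0 and snp_missing_rate <= 0.15


# Hypothetical samples: name -> (GC content in %, SNP missing-call rate).
samples = {
    "strain_A": (65.6, 0.02),  # passes both criteria
    "strain_B": (58.9, 0.01),  # GC content below 60%
    "strain_C": (65.4, 0.20),  # missing-call rate above 15%
}

kept = [name for name, (gc, missing) in samples.items() if passes_qc(gc, missing)]
```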
3. Alignment rate statistics
The sequencing files of the strains were aligned to the Mycobacterium tuberculosis reference genome H37Rv (NC_000962.3) using Bowtie 2 (version 2.2.9), and the alignment rate of each strain was calculated.
4. SNP calling
This example uses a resequencing comparative analysis method in which short sequencing reads are mapped to a reference genome. Specifically, the Sickle tool was used to trim the WGS data. Sequencing reads with a Phred base quality above 20 and a length above 30 were retained for analysis. The whole genome sequence of the Mycobacterium tuberculosis H37Rv strain (NC_000962.2) was used as the reference template for read mapping. Sequencing reads were mapped to the reference genome using Bowtie 2 (version 2.2.9). SAMtools (version 1.3.1) was used for SNP calling with mapping quality greater than 30. Fixed mutations (frequency ≥ 75%) supported by at least 10 reads were identified using VarScan (version 2.3.9) with the strand bias filter option enabled. All SNPs located in repetitive regions of the genome that are difficult to characterize by short-read sequencing (e.g., PPE/PE-PGRS family genes, phage sequences, insertion or mobile genetic elements) were excluded. Small insertions or deletions identified by VarScan (version 2.3.9) were also excluded. Finally, 8012 SNP sites were obtained from the 300 sequenced strains.
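The filtering criteria applied to candidate variants can be sketched as a single predicate. The code below is only an illustration of the thresholds stated above (frequency of at least 75%, at least 10 supporting reads, mapping quality above 30, no indels, no repetitive regions); it is not the SAMtools or VarScan implementation, and the Variant record, its field names and the region coordinates in the test are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    pos: int          # 1-based position on the reference genome
    ref: str          # reference allele
    alt: str          # alternative allele
    alt_reads: int    # reads supporting the alternative allele
    total_reads: int  # total reads covering the position
    map_qual: int     # mapping quality associated with the call


def in_excluded_region(pos, regions):
    """True if `pos` falls inside any (start, end) interval, inclusive."""
    return any(start <= pos <= end for start, end in regions)


def keep_variant(variant, excluded_regions):
    """Apply the fixed-SNP criteria described in the text to one variant."""
    # Small insertions or deletions are excluded: keep single-base changes only.
    if len(variant.ref) != 1 or len(variant.alt) != 1:
        return False
    if variant.total_reads == 0:
        return False
    frequency = variant.alt_reads / variant.total_reads
    return (
        frequency >= 0.75
        and variant.alt_reads >= 10
        and variant.map_qual > 30
        and not in_excluded_region(variant.pos, excluded_regions)
    )
```

The excluded regions would in practice be the coordinates of the repetitive elements listed above (PPE/PE-PGRS genes, phage sequences and mobile elements), supplied as a list of (start, end) pairs.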
The Mycobacterium tuberculosis phylogenetic tree was then constructed according to the following steps:
Mixed-infection isolates were excluded from phylogenetic reconstruction by examining the genotype heterozygosity of SNPs. For all phylogenetic reconstructions, the SNPs of the MTBC isolates were combined into a common, non-redundant list, and nucleotide positions with gaps in more than 5% of the taxa were excluded (the gaps were probably due to insertions or deletions, low coverage, etc.). The alignment of polymorphic positions from all strains was used for phylogenetic reconstruction in MEGA X. When the number of taxa was large, an initial inference of the tree structure was made using the neighbor-joining method. For the final phylogenetic estimate, however, the maximum likelihood method under the general time reversible (GTR) model was applied, with at least 100 bootstrap replicates to assess confidence. Phylogenetic trees were visualized in FigTree (version 1.4.3) (http://tree.bio.ed.ac.uk/software/figtree/) or iTOL (FIG. 1).
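The 5% gap rule for alignment positions can be illustrated with a short sketch. It assumes the alignment is held as a dictionary of equal-length strings and that both `-` and `N` count as gaps; the function name and both of these choices are assumptions made for illustration, not details given in the text.

```python
def drop_gappy_columns(alignment: dict, max_gap_fraction: float = 0.05,
                       gap_chars: str = "-N") -> dict:
    """Remove alignment columns in which more than max_gap_fraction
    of the taxa carry a gap character."""
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same length")
    keep = []
    for i in range(length):
        gaps = sum(seq[i] in gap_chars for seq in seqs)
        if gaps / len(seqs) <= max_gap_fraction:
            keep.append(i)
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment: with four taxa, a single gap already exceeds 5% (1/4 = 25%).
toy = {"s1": "ACGT", "s2": "A-GT", "s3": "ACGT", "s4": "ACNT"}
filtered = drop_gappy_columns(toy)
print(filtered)
```

With several hundred taxa, as in this example, a column would tolerate a handful of missing calls before being dropped.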
Example 2: L2.3-specific SNP screening and construction of the subtype classification
In this example, 200 L2.3 strains were used to calculate the SNPs unique to the six sub-branches. Briefly, ancestral sequences of the different L2.3 sub-lineages were reconstructed using the Baseml software, a most recent common ancestor (MRCA) sequence was generated, and it was compared with the MRCA sequence of the closest lineage, L2.2. The smallest set of SNPs able to identify L2.3 was thereby obtained. The Mycobacterium tuberculosis phylogenetic tree was then constructed as described in Table 1, according to the following steps:
Mixed-infection isolates were excluded from phylogenetic reconstruction by examining the genotype heterozygosity of SNPs. For all phylogenetic reconstructions, the SNPs of the MTBC isolates were combined into a common, non-redundant list, and nucleotide positions with gaps in more than 5% of the taxa were excluded (the gaps were probably due to insertions or deletions, low coverage, etc.). The alignment of polymorphic positions from all strains was used for phylogenetic reconstruction in MEGA X. When the number of taxa was large, an initial inference of the tree structure was made using the neighbor-joining method. For the final phylogenetic estimate, however, the maximum likelihood method under the general time reversible (GTR) model was applied, with at least 100 bootstrap replicates to assess confidence. Phylogenetic trees were visualized in FigTree (version 1.4.3) (http://tree.bio.ed.ac.uk/software/figtree/) or iTOL. The differentiating nodes of the lineage 2.3 phylogenetic tree were finally obtained (see FIG. 1). Specifically, the above SNP sites were used to subdivide Mycobacterium tuberculosis lineage 2.3 into the following 6 branches: 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.3.5 and 2.3.6.
3. Evaluation of strain-level identification by the SNP set
Known genome sequences of 2000 L2.3 Mycobacterium tuberculosis strains from 51 countries worldwide were downloaded from the NCBI database (FIG. 2). These strains were typed with the resulting set of 23 SNPs (see Table 1), and the typing results based on the 23-SNP set were completely consistent with those based on whole-genome sequence data. For each subtype, the sensitivity was 86.4% with single-site identification and 100% with multiple-site identification, and all strains were classified correctly.
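Typing a strain with the 23-SNP set can be sketched as a lookup against the six site combinations. The positions and alleles below are copied from the claims (H37Rv NC_000962.3 coordinates). Two points are assumptions of this sketch and not statements of the patent: an entry such as "G/A" is read as reference allele G and derived allele A, and a strain is assigned to the combination with the most matching derived alleles. The function name and the `min_sites` parameter are hypothetical.

```python
# Position -> assumed derived allele, grouped by the combinations in the claims.
# ASSUMPTION: in "X/Y", X is the reference allele and Y the derived allele.
SNP_SET = {
    "combination 1": {655559: "A", 2093991: "T", 3218997: "A", 3684649: "A"},
    "combination 2": {36577: "T", 1674210: "C", 1772288: "G", 3529598: "G"},
    "combination 3": {594351: "C", 4131585: "C", 3897155: "C", 1600039: "T"},
    "combination 4": {1451405: "A", 4219219: "A", 1279968: "C", 2185884: "C"},
    "combination 5": {2474686: "T", 3355057: "T", 4215793: "C"},
    "combination 6": {1151304: "A", 1446733: "A", 2838897: "G", 3056767: "T"},
}


def classify(calls: dict, min_sites: int = 1):
    """Return the combination whose derived alleles best match the calls.

    calls maps a genome position to the observed base. min_sites=1 mimics
    single-site identification; a higher value requires several sites.
    Returns None when no combination reaches min_sites or the best score is tied.
    """
    scores = {
        name: sum(calls.get(pos) == allele for pos, allele in sites.items())
        for name, sites in SNP_SET.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] < min_sites:
        return None
    if list(scores.values()).count(scores[best]) > 1:
        return None  # ambiguous: two combinations match equally well
    return best


# Made-up calls for illustration.
print(classify({36577: "T", 1674210: "C", 655559: "G"}))
print(classify({36577: "T"}, min_sites=2))
```

Requiring more than one matching site trades a little convenience for robustness against a single missing or erroneous call, which is consistent with the higher sensitivity the text reports for multiple-site identification.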
Finally, it should be noted that the prior art does not clearly define the typing of the subtypes of lineage L2.3, and conventional experimental methods (such as VNTR, 16S rRNA, etc.) cannot identify and distinguish the multiple subtypes of L2.3. Whole-genome-based analyses often require tens of thousands of sites to obtain an accurate structure. The invention screens out an optimal combination of 23 sites (at least 3 specific targets per subtype) that is suitable for identifying the multiple subtypes of L2.3, ensures classification accuracy, greatly reduces the amount of computation, and achieves identification at the subspecies level.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (8)
1. Use of a reagent in the preparation of a product for determining the sub-lineage of Mycobacterium tuberculosis lineage 2.3, characterized in that
the reagent comprises reagents for detecting specific SNP sites; the specific SNP sites consist of combinations 1 to 6;
combination 1 consists of SNP sites 1 to 4: SNP site 1 is located at position 655559 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 2 is located at position 2093991 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 3 is located at position 3218997 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/A; SNP site 4 is located at position 3684649 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A;
combination 2 consists of SNP sites 5 to 8: SNP site 5 is located at position 36577 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 6 is located at position 1674210 on the Mycobacterium tuberculosis chromosome, and its nucleotide is A/C; SNP site 7 is located at position 1772288 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/G; SNP site 8 is located at position 3529598 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/G;
combination 3 consists of SNP sites 9 to 12: SNP site 9 is located at position 594351 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/C; SNP site 10 is located at position 4131585 on the Mycobacterium tuberculosis chromosome, and its nucleotide is A/C; SNP site 11 is located at position 3897155 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/C; SNP site 12 is located at position 1600039 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/T;
combination 4 consists of SNP sites 13 to 16: SNP site 13 is located at position 1451405 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 14 is located at position 4219219 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 15 is located at position 1279968 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/C; SNP site 16 is located at position 2185884 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/C;
combination 5 consists of SNP sites 17 to 19: SNP site 17 is located at position 2474686 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 18 is located at position 3355057 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 19 is located at position 4215793 on the Mycobacterium tuberculosis chromosome, and its nucleotide is A/C;
combination 6 consists of SNP sites 20 to 23: SNP site 20 is located at position 1151304 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/A; SNP site 21 is located at position 1446733 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 22 is located at position 2838897 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/G; SNP site 23 is located at position 3056767 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/T;
the physical positions of the SNP sites are determined by alignment to the whole genome sequence of Mycobacterium tuberculosis H37Rv, the accession number of which is NC_000962.3.
2. Use of a reagent in the preparation of a product for the evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, characterized in that
the reagent comprises reagents for detecting specific SNP sites; the specific SNP sites consist of combinations 1 to 6;
combination 1 consists of SNP sites 1 to 4: SNP site 1 is located at position 655559 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 2 is located at position 2093991 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 3 is located at position 3218997 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/A; SNP site 4 is located at position 3684649 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A;
combination 2 consists of SNP sites 5 to 8: SNP site 5 is located at position 36577 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 6 is located at position 1674210 on the Mycobacterium tuberculosis chromosome, and its nucleotide is A/C; SNP site 7 is located at position 1772288 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/G; SNP site 8 is located at position 3529598 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/G;
combination 3 consists of SNP sites 9 to 12: SNP site 9 is located at position 594351 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/C; SNP site 10 is located at position 4131585 on the Mycobacterium tuberculosis chromosome, and its nucleotide is A/C; SNP site 11 is located at position 3897155 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/C; SNP site 12 is located at position 1600039 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/T;
combination 4 consists of SNP sites 13 to 16: SNP site 13 is located at position 1451405 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 14 is located at position 4219219 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 15 is located at position 1279968 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/C; SNP site 16 is located at position 2185884 on the Mycobacterium tuberculosis chromosome, and its nucleotide is T/C;
combination 5 consists of SNP sites 17 to 19: SNP site 17 is located at position 2474686 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 18 is located at position 3355057 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/T; SNP site 19 is located at position 4215793 on the Mycobacterium tuberculosis chromosome, and its nucleotide is A/C;
combination 6 consists of SNP sites 20 to 23: SNP site 20 is located at position 1151304 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/A; SNP site 21 is located at position 1446733 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/A; SNP site 22 is located at position 2838897 on the Mycobacterium tuberculosis chromosome, and its nucleotide is C/G; SNP site 23 is located at position 3056767 on the Mycobacterium tuberculosis chromosome, and its nucleotide is G/T;
the physical positions of the SNP sites are determined by alignment to the whole genome sequence of Mycobacterium tuberculosis H37Rv, the accession number of which is NC_000962.3.
3. The use according to claim 1 or 2, wherein the reagent comprises any one or a combination of PCR primers, molecular probes, biosensors and chips.
4. A method for determining the sub-lineage of Mycobacterium tuberculosis lineage 2.3, comprising detecting the specific SNP sites set forth in claim 1, wherein the method is not directly aimed at the diagnosis or treatment of a disease.
5. A method for the evolutionary analysis of Mycobacterium tuberculosis lineage 2.3, comprising detecting the specific SNP sites set forth in claim 2, wherein the method is not directly aimed at the diagnosis or treatment of a disease.
6. The method according to claim 4 or 5, wherein the detecting comprises: gel electrophoresis-based SNP detection method, DNA sequencing method, DNA chip method, denaturing high performance liquid chromatography and mass spectrometry detection method.
7. The method of claim 6, wherein the gel electrophoresis-based SNP detection method comprises: single-strand conformation polymorphism detection, denaturing gradient gel electrophoresis detection, cleaved amplified polymorphic sequence detection, and allele-specific PCR detection.
8. The method of claim 6, wherein the mass spectrometry detection method comprises matrix-assisted laser desorption ionization time-of-flight mass spectrometry detection.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310340239.2A CN116254356B (en) | 2023-03-31 | 2023-03-31 | Method for identifying mycobacterium tuberculosis pedigree 2.3 subtype strain and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116254356A (en) | 2023-06-13 |
CN116254356B (en) | 2024-04-30 |
Family
ID=86682612
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310340239.2A Active CN116254356B (en) | 2023-03-31 | 2023-03-31 | Method for identifying mycobacterium tuberculosis pedigree 2.3 subtype strain and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116254356B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106701979A (en) * | 2017-02-04 | 2017-05-24 | Beijing Tuberculosis and Thoracic Tumor Research Institute | Kit used for mycobacterium tuberculosis typing SNP site and application thereof |
CN110453001A (en) * | 2019-08-28 | 2019-11-15 | Beijing Tuberculosis and Thoracic Tumor Research Institute | Application of 1073 SNP sites in mycobacterium tuberculosis pedigree 3 |
Non-Patent Citations (1)
Title |
---|
Revised nomenclature and SNP barcode for Mycobacterium tuberculosis lineage 2; Yuttapong Thawornwattana et al.; Microbiology Society; Vol. 7 (No. 11); pp. 000697-000710 * |
Also Published As
Publication number | Publication date |
---|---|
CN116254356A (en) | 2023-06-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||