CN108165562B - Mycobacterium tuberculosis H37Rv encoding gene and application thereof - Google Patents
- Publication number
- CN108165562B (application CN201711250534.XA)
- Authority
- CN
- China
- Prior art keywords
- gene
- mycobacterium tuberculosis
- sequence
- h37rv
- dna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
- C07K14/35—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Mycobacteriaceae (F)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/689—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
Abstract
The invention relates to a Mycobacterium tuberculosis H37Rv coding gene that can serve as a standard gene for molecular identification and clinical detection of the Mycobacterium tuberculosis complex.
Description
Technical Field
The invention relates to the field of gene detection, in particular to identification of pathogenic bacteria species.
Background
Mycobacterium tuberculosis (MTB) is the pathogenic bacterium that causes tuberculosis in humans. It can invade every organ of the body, but pulmonary tuberculosis is the most common form. Tuberculosis remains an extremely important infectious disease and seriously threatens human life and health. According to WHO reports, about 8 million new cases occur each year and at least 3 million people die from the disease. Clinical MTB strains are difficult to culture and grow slowly, cross-infection with other mycobacteria occurs, and the symptoms of tuberculosis are difficult to distinguish from those of other respiratory tract infections, all of which greatly hinder rapid clinical diagnosis and treatment. A rapid, accurate, specific, sensitive and inexpensive tuberculosis detection method is therefore a prerequisite for effectively treating tuberculosis and controlling its spread, and it represents a new challenge and task for mycobacterial testing in clinical laboratories.
The Mycobacterium tuberculosis complex (MTBC) includes M. tuberculosis, M. africanum, M. orygis, M. bovis, M. microti, M. canettii, M. caprae, M. pinnipedii, M. suricattae and M. mungi, all of which cause tuberculosis in humans or other animals. At present, MTBC identification methods used in China and abroad fall into three main categories: traditional isolation and culture; molecular-level detection (IS6110, restriction fragment length polymorphism analysis, multilocus variable-number tandem repeat analysis, etc.); and chromatographic analysis of bacterial cell components (fatty acids, mycolic acids). Each has its advantages, but each also has drawbacks: isolation and culture take a long time and the culturable rate is low; current molecular-level detection has poor specificity, sensitivity and simplicity; and analysis of cell components is costly and complex to operate.
The whole genome of MTB H37Rv was sequenced in 1998, making it the first MTB strain with a complete genome sequence. Since then, researchers in many countries have been refining and supplementing the H37Rv gene annotation database using strategies such as algorithm optimization, annotation software updates, transcriptomics and proteomics. However, because MTB is a prokaryote and prokaryotic genome annotation technology has inherent shortcomings, annotation errors (over-annotation, gene boundary errors, errors in ORF start and stop sites, alternative splicing, ribosome translocation, missed annotation) may still exist in the genome annotation, which hampers deep and accurate analysis of biological mechanisms. Proteomics has been used to correct the annotated genes of H37Rv; however, a high proportion of false positives and the difficulty of predicting annotated genes, verifying new genes, and analyzing and applying new gene functions remain problems in the field.
In general, traditional Mycobacterium tuberculosis complex (MTBC) identification strategies suffer from long turnaround times, tedious steps, and low specificity and sensitivity. To further improve the re-annotation of the H37Rv whole genome, discover missed-annotation genes in H37Rv, and effectively protect those genes and their application in MTBC molecular identification, it is imperative to develop a method that uses new H37Rv genes to identify the MTBC rapidly and accurately.
Disclosure of Invention
One object of the present invention is to provide a new coding gene of Mycobacterium tuberculosis H37Rv, namely the H37Rv missed-annotation coding gene Rv2815A (+ |3123619-3123756|). The gene can serve as a barcode molecular marker of the Mycobacterium tuberculosis complex for detecting the complex, and its sequence is shown in SEQ ID NO.1.
Other objects of the present invention include providing specific PCR primers for amplifying the above coding gene and providing a method for detecting or identifying the presence of the Mycobacterium tuberculosis complex in a sample. The invention also provides a detection kit related to the coding gene and applications of the gene.
According to one aspect of the invention, comparative proteomic research techniques revealed a protein-coding sequence of H37Rv that is difficult to find with gene prediction software and that effectively distinguishes MTBC from other species of the same genus. The gene is a missed-annotation gene of Mycobacterium tuberculosis H37Rv, namely Rv2815A (+ |3123619-3123756|). Comparative genomics studies show that the gene sequence can distinguish Mycobacterium tuberculosis complex (MTBC) strains from other Mycobacterium species.
Specifically, primers were designed that specifically amplify the Rv2815A (+ |3123619-3123756|) gene of MTBC. These are the primers provided by the invention, with the following sequences:
F:5’-CAGCGTGTGGTAACAATGCC-3’;
R:5’-AGCGATGCTGACGAAGGG-3’.
MTBC can be identified rapidly and accurately according to whether a PCR product of this gene's DNA sequence is present in the sample to be tested, or according to differences in the DNA sequence.
According to another aspect of the present invention, based on the above new standard coding gene of Mycobacterium tuberculosis H37Rv, the invention establishes a method for detecting or identifying the Mycobacterium tuberculosis complex, comprising the following steps:
(1) isolating and extracting genomic DNA from the sample to be tested;
(2) performing PCR amplification with the DNA obtained in step (1) as the template and the following primers:
F:5’-CAGCGTGTGGTAACAATGCC-3’(SEQ ID NO.4);
R:5’-AGCGATGCTGACGAAGGG-3’(SEQ ID NO.5);
(3) performing gel electrophoresis analysis or sequencing on the DNA product amplified in step (2);
(4) comparing the result of step (3) with the barcode gene Rv2815A (+ |3123619-3123756|).
Further, following the DNA barcode principle, the PCR product is first analyzed by electrophoresis. If the strain to be tested shows no target band, it is not MTBC. If the band is present, sequencing verification can follow: the sequence obtained is aligned with the standard Rv2815A (+ |3123619-3123756|) sequence of H37Rv to obtain the similarity between the two, and if the sequence homology is greater than 99%, the strain can be judged to be MTBC. The MTBC family is then distinguished from nontuberculous mycobacteria, common respiratory pathogenic bacteria and common respiratory viruses according to how the DNA barcode sequence of the strain to be identified clusters with the standard sequence.
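The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the ungapped identity measure and the result labels are assumptions, while the 99% threshold and the band-then-sequencing order come from the text. A real workflow would first align the sequencing read to the reference (for example with BLAST) before scoring.

```python
from typing import Optional

# Rv2815A reference sequence (SEQ ID NO.1), 138 bp.
REFERENCE = (
    "ATGATGTCAAAAGAACACAAACTCCTCTGCGCTGACAAGCCGTCCCCTTCCGTAGAACGT"
    "AACTGCCGCAACACCTCTTATCTTATAGATCCGGATGTTGTCGCAGTCGATGGCGAAGCG"
    "GTCGATACGTGCAACTAG"
)


def percent_identity(query: str, reference: str) -> float:
    """Ungapped percent identity of two pre-aligned, equal-length sequences."""
    if not reference or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def call_mtbc(band_present: bool, sequenced: Optional[str],
              reference: str = REFERENCE, threshold: float = 99.0) -> str:
    """Apply the band / sequencing rule; the labels are hypothetical."""
    if not band_present:
        return "not MTBC"                      # no target band
    if sequenced is None:
        return "band present, sequencing needed"
    if percent_identity(sequenced, reference) > threshold:
        return "MTBC"                          # homology greater than 99%
    return "not confirmed as MTBC"             # outcome not specified in the text


if __name__ == "__main__":
    print(call_mtbc(True, REFERENCE))
```

With a 138 bp reference, a single mismatch still gives more than 99% identity, whereas two mismatches fall below the threshold.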
The detection method can be used for strain identification research on the Mycobacterium tuberculosis complex and for rapid clinical testing. The sample to be tested can be the H37Rv strain, other MTBC strains, nontuberculous mycobacteria, common respiratory pathogenic bacteria or common respiratory virus strains; sputum, saliva or blood from patients with tuberculosis or other respiratory diseases can also be used directly.
Based on the above method, the present invention also provides a detection kit. The kit contains, in a container, a reagent for detecting the new standard coding gene of Mycobacterium tuberculosis H37Rv, together with manufacturing, use and marketing information for the drug or biological product as approved by a government drug administration. For example, reagents for directly detecting the Rv2815A (+ |3123619-3123756|) gene in a sample after PCR amplification may comprise one or more of the following: amplification primers, dNTPs, the DNA polymerase used for the PCR reaction and its buffer, and reagents required for enzyme digestion and/or sequencing reactions. Those skilled in the art will understand that these components are merely illustrative; for example, the primers may be the specific PCR primers described above, and the DNA polymerase may be any enzyme suitable for PCR amplification. Detection of the coding gene of the present invention can also be provided in an integrated form, e.g. a gene chip.
Advantageous effects: The invention provides a standard gene and a molecular identification method for the Mycobacterium tuberculosis complex (MTBC). The gene effectively distinguishes MTBC from other species of the same genus. The identification method based on it overcomes drawbacks of existing MTBC identification procedures, such as the need for multiple primer designs and poor reproducibility of results; it is universal, easy to amplify and easy to compare; and it accurately distinguishes the complex from closely related mycobacteria and from other respiratory pathogens. It thus provides a powerful technical means and research tool for epidemiological investigation of tuberculosis and for rapid clinical diagnosis and identification of tuberculosis patients.
Drawings
FIG. 1: evidence of peptide profile matching supporting the discovery of new coding genes;
FIG. 2: comparing the mass spectrogram of the synthesized peptide fragment with the mass spectrogram of the original identified peptide fragment;
FIG. 3: a corresponding diagram of a protein sequence coded by ORF of the peptide fragment locus region; the underlined part is the peptide identified in proteomics and verified by the synthetic peptide;
FIG. 4: homology comparison of the Rv2815A (+ |3123619-3123756|) standard gene sequence;
FIG. 5: the result of BLASTP of a protein sequence corresponding to the Rv2815A (+ |3123619-3123756|) gene of the H37Rv strain;
FIG. 6: the result of agarose gel electrophoresis of the PCR amplification product of the Rv2815A (+ |3123619-3123756|) specific primer; wherein, the specific information of each lane sample is shown in Table 1;
FIG. 7: the PCR amplification sequencing result of the Rv2815A (+ |3123619-3123756|) gene is compared with a standard sequence.
Detailed Description
The invention is further described with reference to specific embodiments, but the scope of the claims is not limited thereto. The reagents used in the present invention are all commercially available.
Example 1: search for genes encoding missing release of the genome of strain H37Rv
1.1 high coverage proteomic validation of the genome of the H37Rv strain
The deep coverage study of proteome was performed on the H37Rv strain using the high coverage proteome technique. Annotated encoding gene validation was performed on its genome using the pFind 3 engine based on the Tuberculosis (20160307) database. To find new protein coding regions, we performed six-reading-frame database translation of H37Rv in the genome-wide (NC _000962.3) file published at NCBI using pAnno software based on proteomic technology, and identified new peptide fragments and new proteins using this database for mass spectrometry data. To reduce the false positive rate, we used 3 filtering methods to separately estimate class FDR for the annotated and new peptide fragments, S-FDR, T-FDR I and T-FDR II, respectively, during the data filtering.
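To make the idea of a six-reading-frame translation database concrete, here is a minimal Python sketch. It is not pAnno's implementation, which is not described in this document; the function names are hypothetical, and the standard codon assignments are assumed (they are the same in the bacterial genetic code). The demonstration input is the ORF of SEQ ID NO.2 and the peptide of SEQ ID NO.6 rather than the whole genome.

```python
from itertools import product

# Standard codon table built in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict:
    """Return the translations of frames +1..+3 and -1..-3."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# ORF DNA from SEQ ID NO.2 and the identified peptide (SEQ ID NO.6).
ORF_DNA = (
    "TAACAATGCCTGCTGATGATGTCAAAAGAACACAAACTCCTCTGCGCTGACAAGCCGTCC"
    "CCTTCCGTAGAACGTAACTGCCGCAACACCTCTTATCTTATAGATCCGGATGTTGTCGCA"
    "GTCGATGGCGAAGCGGTCGATACGTGCAACTAG"
)
PEPTIDE = "LLCADKPSPSVER"

if __name__ == "__main__":
    for name, protein in six_frame_translation(ORF_DNA).items():
        if PEPTIDE in protein:
            print(name, protein)
```

Run on SEQ ID NO.2, frame +1 is the frame that contains the peptide LLCADKPSPSVER.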
Through data analysis, a total of 3238H 37Rv annotated genes are identified, and the coverage is as high as more than 80% of the strain, which is the largest mass spectrum data of the H37Rv protein reported so far. In addition, we obtained new peptide fragments after 3 FDRs ≤ 1 filtration. In order to further ensure the quality of the new peptide fragments, spectrogram quality screening is carried out on spectrograms corresponding to the new peptide fragments left after filtration, and finally some peptide fragments with good spectrogram quality are reserved. To further investigate that these peptides with higher spectral quality were not due to single amino acid mutations in the annotated peptide, we performed amino acid mutation checks to ensure that these new peptides were newly identified peptides of H37 Rv.
1.2 Verification of the protein encoded by the Rv2815A (+ |3123619-3123756|) gene and database analysis
After high-coverage proteome verification, we found several suspected new peptide fragments missing from the annotation and carried out peptide synthesis verification on the high-confidence candidates. Using a similarity score of at least 0.8 between the original spectrum and the synthetic-peptide spectrum as the threshold, several peptide fragments passed verification. They correspond to new open reading frames (ORFs), i.e. potential missed-annotation genes of the current H37Rv strain.
Among them, we found a new missed-annotation gene, Rv2815A (+ |3123619-3123756|). The peptide LLCADKPSPSVER (SEQ ID NO.6) was detected in proteomic data from three different laboratories and corresponds to the new gene Rv2815A (+ |3123619-3123756|). As shown in FIG. 1, the spectrum quality was good: 4 b/y ions were matched consecutively, the noise-peak signal was low, and the result was very reliable.
To further confirm this identification, we chemically synthesized the peptide according to the amino acid sequence of the newly identified peptide and generated a secondary (MS2) spectrum of the synthetic peptide under the mass spectrometry conditions described above.
We verified the synthetic peptide fragment by high-energy collision MS2. Both the primary parent ions and the secondary daughter ions agreed with theoretical values, confirming that the sequence of the synthesized peptide is correct. On this basis, we manually compared the MS2 spectrum of the synthetic peptide with the spectrum of the new peptide identified from the large-scale proteomic data. The two are almost identical, and the cosine value obtained from daughter-ion similarity is 0.98, which shows that the new peptide fragment we identified from H37Rv is correct (FIG. 2).
After confirming the sequence of the missed-annotation peptide fragment, we located its position in the genome and took the region bounded by the preceding stop codon and the following stop codon to obtain the open reading frame (ORF) DNA sequence containing the new peptide fragment, shown in SEQ ID NO.2.
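The spectrum comparison above relies on a cosine score between daughter-ion intensities. The following Python sketch shows one common way to compute such a score, assuming the two spectra have already been reduced to intensity vectors over the same ordered set of fragment ions. The peak-matching step, the function names and the example intensities are hypothetical; only the 0.8 threshold comes from the text.

```python
import math
from typing import Sequence

SIMILARITY_THRESHOLD = 0.8  # similarity threshold stated in the text


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equally long intensity vectors."""
    if len(a) != len(b):
        raise ValueError("intensity vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("at least one spectrum has no signal")
    return dot / norm


def passes_synthetic_check(observed: Sequence[float],
                           synthetic: Sequence[float],
                           threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """True when the two spectra are at least as similar as the threshold."""
    return cosine_similarity(observed, synthetic) >= threshold


if __name__ == "__main__":
    # Hypothetical matched fragment-ion intensities, for illustration only.
    observed = [100.0, 80.0, 0.0, 45.0, 60.0]
    synthetic = [95.0, 85.0, 5.0, 40.0, 62.0]
    print(round(cosine_similarity(observed, synthetic), 3))
```

Because the score depends only on relative intensities, scaling one spectrum by a constant factor does not change it.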
TAACAATGCCTGCTGATGATGTCAAAAGAACACAAACTCCTCTGCGCTGACAAGCCGTCCCCTTCCGTAGAACGTAACTGCCGCAACACCTCTTATCTTATAGATCCGGATGTTGTCGCAGTCGATGGCGAAGCGGTCGATACGTGCAACTAG(SEQ ID NO.2)
The correspondence between the open reading frame code and the amino acid sequence is shown in FIG. 3.
Further translation verification revealed that the true gene sequence (SEQ ID NO.1) starts at the ATG in the above open reading frame DNA (SEQ ID NO.2). It is 138 bp long and encodes 45 amino acids with a theoretical molecular weight of 4.95 kDa; this is the Rv2815A (+ |3123619-3123756|) gene.
ATGATGTCAAAAGAACACAAACTCCTCTGCGCTGACAAGCCGTCCCCTTCCGTAGAACGTAACTGCCGCAACACCTCTTATCTTATAGATCCGGATGTTGTCGCAGTCGATGGCGAAGCGGTCGATACGTGCAACTAG(SEQ ID NO.1)
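The relationship between SEQ ID NO.2 and SEQ ID NO.1 can be checked with a short Python sketch: walk the ORF codon by codon, take the first in-frame ATG, and read through to the next in-frame stop codon. The function name is hypothetical, and considering only ATG as a start codon is an assumption made here because the text names ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# ORF DNA bounded by two stop codons (SEQ ID NO.2), 153 bp.
ORF_DNA = (
    "TAACAATGCCTGCTGATGATGTCAAAAGAACACAAACTCCTCTGCGCTGACAAGCCGTCC"
    "CCTTCCGTAGAACGTAACTGCCGCAACACCTCTTATCTTATAGATCCGGATGTTGTCGCA"
    "GTCGATGGCGAAGCGGTCGATACGTGCAACTAG"
)


def gene_from_orf(orf_dna: str, start_codon: str = "ATG") -> str:
    """Return the sequence from the first in-frame start codon through
    the first in-frame stop codon that follows it."""
    orf_dna = orf_dna.upper()
    codons = [orf_dna[i:i + 3] for i in range(0, len(orf_dna) - 2, 3)]
    try:
        begin = codons.index(start_codon)
    except ValueError:
        raise ValueError("no in-frame start codon found") from None
    for idx in range(begin, len(codons)):
        if codons[idx] in STOP_CODONS:
            return "".join(codons[begin:idx + 1])
    raise ValueError("no in-frame stop codon after the start codon")


if __name__ == "__main__":
    gene = gene_from_orf(ORF_DNA)
    # Gene length in bp, and number of amino acids excluding the stop codon.
    print(len(gene), len(gene) // 3 - 1)  # prints "138 45"
```

The ATG at positions 6 to 8 of SEQ ID NO.2 is out of frame relative to the bounding stop codons, which is why the scan works on codons rather than on a plain substring search.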
The theoretical coding product amino acid sequence of the gene is shown as SEQ ID NO. 3:
MMSKEHKLLCADKPSPSVERNCRNTSYLIDPDVVAVDGEAVDTCN(SEQ ID NO.3)
NCBI-BLASTP analysis of the theoretical gene product shown in SEQ ID NO.3 found no alignment to any sequence, indicating a protein that has not previously been discovered (see FIG. 4). This shows that the Rv2815A (+ |3123619-3123756|) gene product we detected is missing from the annotation in the H37Rv strain database.
We performed a comparative-genomics local BLAST analysis of the DNA sequence of the Rv2815A (+ |3123619-3123756|) gene. As shown in FIG. 5, the gene sequence is specific to the MTBC family and has no closely homologous sequence in other species. The Rv2815A (+ |3123619-3123756|) sequence found in the H37Rv strain therefore has good sequence specificity and can distinguish MTBC from other mycobacteria of the same genus and from other bacteria that infect the respiratory tract.
Example 2: method for establishing and identifying MTBC complex group
(1) Designing a primer:
based on the Rv2815A (+ |3123619-3123756|) gene sequence shown in SEQ ID NO.1, the PCR primers were designed using Oligo7.0, and the primer sequences were as follows:
F:5’-CAGCGTGTGGTAACAATGCC-3’(SEQ ID NO.4);
R:5’-AGCGATGCTGACGAAGGG-3’(SEQ ID NO.5)
the position relationship between the above primers and the Rv2815A (+ |3123619-3123756|) gene is shown below, the primers are respectively positioned at the upstream and downstream of the structural gene, wherein the positions corresponding to the primers are marked by single-dashed lines, and the double-dashed lines are an initiator and a terminator.
GGACAATTCGTCCAGCGTGTGGTAACAATGCCTGCTGATGATGTCAAAAGAACACAAACTCCTCTGCGCTGACAAGCCGTCCCCTTCCGTAGAACGTAACTGCCGCAACACCTCTTATCTTATAGATCCGGATGTTGTCGCAGTCGATGGCGAAGCGGTCGATACGTGCAACTAGTTTCGCGAGCTGGCCCTTCGTCAGCATCGCTTCGAATG(SEQ ID NO.7)
(2) Total DNA was extracted from the strains to be tested, including M. tuberculosis H37Rv. Of these, 40 standard mycobacterial strains were preserved by the National Center for Medical Culture Collections (CMCC), and the other 16 nontuberculous mycobacteria were clinical isolates from the 309th Hospital of the Chinese People's Liberation Army, for which 16S rRNA gene sequencing, strain comparison and sequence submission to NCBI had been completed. The strains to be tested are listed in Table 1:
TABLE 1 related strains selected
(3) Amplify the DNA fragment by polymerase chain reaction (PCR) using the F/R primers above.
PCR system (25 µL): ddH2O (9.5 µL), 2× Taq PCR MasterMix (TIANGEN, 12.5 µL), primer F (10 µM, 1 µL), primer R (10 µM, 1 µL), DNA template (1 µL).
Amplification program: pre-denaturation at 94 °C for 3 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 30 s, and extension at 72 °C for 1 min; final extension at 72 °C for 5 min.
(4) Detect the amplified product by agarose gel electrophoresis in 1× TBE buffer. As shown in FIG. 6, the MTBC strains and the positive control showed an amplification band at 194 bp. The actual amplification result is consistent with the expected result, with a specificity of 98.3%.
(5) To further verify the amplified DNA, we sequenced the amplification product and compared it with the original sequence. As shown in FIG. 7, it matches the expected sequence perfectly, without errors, further confirming the existence of the new gene that is missing from the annotation.
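The arithmetic behind the reaction system and the amplification program can be written out as follows. This is a minimal sketch for illustration: the component volumes, temperatures, times, and cycle count are those of step (3), while the `master_mix` helper and its `overage` parameter are hypothetical additions, and the run-time estimate ignores ramping between temperatures.

```python
# Per-reaction volumes in microlitres, as listed in step (3)
REACTION_UL = {
    "ddH2O": 9.5,
    "2x Taq PCR MasterMix": 12.5,
    "primer F (10 uM)": 1.0,
    "primer R (10 uM)": 1.0,
    "DNA template": 1.0,
}

# Amplification program: (stage, temperature in deg C, seconds)
PRE_DENATURATION = ("pre-denaturation", 94, 180)
CYCLE = [("denaturation", 94, 30), ("annealing", 58, 30), ("extension", 72, 60)]
FINAL_EXTENSION = ("final extension", 72, 300)
N_CYCLES = 35


def master_mix(n_reactions, overage=0.0):
    """Scale all components except the template (hypothetical helper).

    overage is an assumed fractional pipetting excess, e.g. 0.1 for 10 %.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if name != "DNA template"
    }


def hold_time_seconds():
    """Sum of programmed hold times, ignoring temperature ramping."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE)
    return PRE_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


total_ul = sum(REACTION_UL.values())
primer_final_um = 10.0 * REACTION_UL["primer F (10 uM)"] / total_ul

print("total volume (uL):", total_ul)
print("final primer concentration (uM):", primer_final_um)
print("programmed hold time (min):", hold_time_seconds() / 60)
print("master mix for 4 reactions:", master_mix(4))
```

The listed components add up to the stated 25 µL, each primer ends up at 0.4 µM, and the programmed holds total 78 minutes.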
This indicates that the method for identifying the MTBC based on the Rv2815A (+ |3123619-3123756|) gene is reliable.
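The comparison of a sequenced product with the reference gene can be sketched in a few lines. The example below is an assumption-laden illustration, not the alignment procedure used for FIG. 7: it slides SEQ ID NO.1 along both strands of a read and reports the best ungapped percent identity. The function names are hypothetical, and a real workflow would normally use an alignment tool that handles gaps.

```python
# SEQ ID NO.1 (138 nt), copied from the sequence listing
SEQ1 = (
    "atgatgtcaaaagaacacaaactcctctgcgctgacaagccgtccccttccgtagaacgt"
    "aactgccgcaacacctcttatcttatagatccggatgttgtcgcagtcgatggcgaagcg"
    "gtcgatacgtgcaactag"
).upper()


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def percent_identity(a, b):
    """Percent of matching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_identity(read, gene=SEQ1):
    """Best ungapped identity of gene against any window on either strand of read.

    Returns 0.0 when the read is shorter than the gene.
    """
    read = read.upper()
    best = 0.0
    for strand in (read, reverse_complement(read)):
        for i in range(len(strand) - len(gene) + 1):
            window = strand[i:i + len(gene)]
            best = max(best, percent_identity(window, gene))
    return best


# Hypothetical read: the gene with a few flanking bases on each side
example_read = "CCTGCTG" + SEQ1 + "TTTCG"
print(best_ungapped_identity(example_read))
print(best_ungapped_identity(reverse_complement(example_read)))
```

Checking both strands matters because a read sequenced from the reverse primer covers the gene as its reverse complement.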
SEQUENCE LISTING
<110> Beijing Proteome Research Center
<120> Mycobacterium tuberculosis H37Rv encoding gene and application thereof
<130> BJ1936-17P121793
<160> 7
<170> PatentIn version 3.3
<210> 1
<211> 138
<212> DNA
<213> Artificial
<220>
<223> Mycobacterium tuberculosis H37Rv encoding gene Rv2815A (+ |3123619-3123756|)
<400> 1
atgatgtcaa aagaacacaa actcctctgc gctgacaagc cgtccccttc cgtagaacgt 60
aactgccgca acacctctta tcttatagat ccggatgttg tcgcagtcga tggcgaagcg 120
gtcgatacgt gcaactag 138
<210> 2
<211> 153
<212> DNA
<213> Artificial
<220>
<223> open reading frame DNA sequence comprising peptide fragment with missing annotation
<400> 2
taacaatgcc tgctgatgat gtcaaaagaa cacaaactcc tctgcgctga caagccgtcc 60
ccttccgtag aacgtaactg ccgcaacacc tcttatctta tagatccgga tgttgtcgca 120
gtcgatggcg aagcggtcga tacgtgcaac tag 153
<210> 3
<211> 45
<212> PRT
<213> Artificial
<220>
<223> amino acid sequence of the theoretical product encoded by the Rv2815A (+ |3123619-3123756|) gene
<400> 3
Met Met Ser Lys Glu His Lys Leu Leu Cys Ala Asp Lys Pro Ser Pro
1 5 10 15
Ser Val Glu Arg Asn Cys Arg Asn Thr Ser Tyr Leu Ile Asp Pro Asp
20 25 30
Val Val Ala Val Asp Gly Glu Ala Val Asp Thr Cys Asn
35 40 45
<210> 4
<211> 20
<212> DNA
<213> Artificial
<220>
<223> F primer sequences
<400> 4
cagcgtgtgg taacaatgcc 20
<210> 5
<211> 18
<212> DNA
<213> Artificial
<220>
<223> R primer sequences
<400> 5
agcgatgctg acgaaggg 18
<210> 6
<211> 13
<212> PRT
<213> Artificial
<220>
<223> peptide fragment with missing annotation
<400> 6
Leu Leu Cys Ala Asp Lys Pro Ser Pro Ser Val Glu Arg
1 5 10
<210> 7
<211> 213
<212> DNA
<213> Artificial
<220>
<223> sequence showing the positions of the upstream and downstream primers relative to the Rv2815A (+ |3123619-3123756|) gene
<400> 7
ggacaattcg tccagcgtgt ggtaacaatg cctgctgatg atgtcaaaag aacacaaact 60
cctctgcgct gacaagccgt ccccttccgt agaacgtaac tgccgcaaca cctcttatct 120
tatagatccg gatgttgtcg cagtcgatgg cgaagcggtc gatacgtgca actagtttcg 180
cgagctggcc cttcgtcagc atcgcttcga atg 213
Claims (5)
1. A Mycobacterium tuberculosis H37Rv encoding gene, wherein the gene is the H37Rv encoding gene Rv2815A (+ |3123619-3123756|) that is missing from the annotation, and the nucleotide sequence of the encoding gene is shown in SEQ ID NO. 1.
2. The Mycobacterium tuberculosis H37Rv encoding gene of claim 1, wherein the gene encodes the amino acid sequence shown in SEQ ID NO. 3.
3. A barcode molecular marker for detecting and/or identifying the Mycobacterium tuberculosis complex, comprising the Mycobacterium tuberculosis H37Rv encoding gene of claim 1 as a standard detection gene.
4. Specific PCR primers for amplifying the Mycobacterium tuberculosis H37Rv encoding gene of claim 1, wherein the primer sequences are as follows:
F: 5’-CAGCGTGTGGTAACAATGCC-3’;
R: 5’-AGCGATGCTGACGAAGGG-3’.
5. A method, not for the purpose of diagnosing or treating disease, for distinguishing strains of the Mycobacterium tuberculosis complex from other species of the genus Mycobacterium, comprising the following steps:
(1) isolating and extracting genomic DNA from a sample to be tested;
(2) performing a polymerase chain reaction using the DNA obtained in step (1) as the template, with amplification primers added;
(3) analyzing the DNA product amplified in step (2) by gel electrophoresis and sequencing it;
(4) comparing the result of step (3) with the encoding gene of claim 1, and determining from the homology whether the Mycobacterium tuberculosis complex is present in the sample to be tested;
wherein the sequences of the amplification primers in step (2) are as follows:
F: 5’-CAGCGTGTGGTAACAATGCC-3’;
R: 5’-AGCGATGCTGACGAAGGG-3’.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711250534.XA CN108165562B (en) | 2017-12-01 | 2017-12-01 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711250534.XA CN108165562B (en) | 2017-12-01 | 2017-12-01 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108165562A | 2018-06-15 |
CN108165562B | 2021-06-08 |
Family
ID=62525034
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711250534.XA Active CN108165562B (en) | 2017-12-01 | 2017-12-01 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108165562B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110343706A (en) * | 2019-01-25 | 2019-10-18 | 北京蛋白质组研究中心 | Mycobacterium tuberculosis H37Rv encoding gene and its application |
CN116732211B (en) * | 2023-08-09 | 2023-10-27 | 湖南工程学院 | Probe set and method for detecting mycobacterium bovis based on 8-17 deoxyribozyme and CRISPR-Cas13a trans-cleavage |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6074820A (en) * | 1994-05-16 | 2000-06-13 | De Staat Der Nederlanden, Vertegenwoordigd Door De Minister Van Welzijn, Volksgezondhed En Cultuur | Detection and differentiation of mycobacterium tuberculosis complex bacteria by direct variant repeat oligotyping |
WO2006035317A2 (en) * | 2004-09-30 | 2006-04-06 | Institut Pasteur | Immunogenic glycopeptides for diagnosing pathogenic microorganisms infections |
WO2009017902A2 (en) * | 2007-06-22 | 2009-02-05 | Ibis Biosciences, Inc. | Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis |
WO2012135815A2 (en) * | 2011-04-01 | 2012-10-04 | Occam Biolabs, Inc. | Methods and kits for detecting cell-free pathogen-specific nucleic acids |
CN103063843A (en) * | 2011-10-20 | 2013-04-24 | 上海交通大学医学院 | Specific marker of mycobacterium tuberculosis and application thereof |
CN103882130A (en) * | 2014-03-24 | 2014-06-25 | 中国科学院微生物研究所 | Primer pair for detecting mycobacterium tuberculosis complex group and application of primer pair |
Non-Patent Citations (2)
Title |
---|
"Mycobacterium tuberculosis H37Rv complete genome,ACCESSION:AL123456.3";Cole,S.T. et al.;《Genbank》;20150227;第1页 * |
"Mycobacterium tuberculosis strain TBV4952 chromosome, complete genome,ACCESSION:CP023640.1";Sheen,P. et al.;《Genbank》;20171005;第1页 * |
Also Published As
Publication number | Publication date |
---|---|
CN108165562A (en) | 2018-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lang et al. | Genomics-based diagnostic marker development for Xanthomonas oryzae pv. oryzae and X. oryzae pv. oryzicola | |
CN110408629B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN110408630B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN108913768B (en) | Multiplex liquid phase gene chip primer, kit and analysis method for simultaneously detecting seven glucosamine drug resistance genes | |
CN111893198B (en) | Specific molecular target for identifying staphylococcus albus, detection primer group and rapid detection method thereof | |
CN108165562B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN108004253B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN108165561B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN108165564B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN110408632B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN108165565B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN108165560B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN105603081B (en) | Non-diagnosis-purpose qualitative and quantitative detection method for intestinal microorganisms | |
CN108165563B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN110408631B (en) | Mycobacterium tuberculosis H37Rv encoding gene and application thereof | |
CN110423835A (en) | Primer composition for lower respiratory tract the pathogenic microorganism examination | |
CN110923349B (en) | Species-specific detection molecular tags 3283 and 3316 of yersinia enterocolitica and rapid detection method thereof | |
CN114196779A (en) | Pathogenic microorganism detection method and kit based on targeted sequencing | |
KR102204398B1 (en) | Middle East respiratory syndrome-related corona virus universal primer sets for whole genome detection method and diagnosis kit | |
CN113652474A (en) | Detection method for DMD gene exon copy number variation and application thereof | |
CN113151516A (en) | Human high-risk zoonosis type streptococcus suis specific sequence, detection primer and application | |
Mironov et al. | Multilocus sequence-typing scheme for Borrelia miyamotoi—the erythema-free ixodid tick-borne borreliosis pathogens | |
CN110343706A (en) | Mycobacterium tuberculosis H37Rv encoding gene and its application | |
CN113174443B (en) | Mycobacterium identification method and biological material thereof | |
CN113151309B (en) | Streptococcus suis specific sequence with high risk of human beings and livestock and application thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20240201; Address after: 100850 No. 27 Taiping Road, Haidian District, Beijing; Patentee after: ACADEMY OF MILITARY MEDICAL SCIENCES; Country or region after: China; Address before: Building 1, No.33, kekeyuan Road, Changping District, Beijing; Patentee before: BEIJING PROTEOME RESEARCH CENTER; Country or region before: China |