CN109385484B - DNA bar code, primer, kit, method and application - Google Patents


Publication number
CN109385484B
Authority
CN
China
Prior art keywords
strain
sequence
seq
tmcc70007
dna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710656798.9A
Other languages
Chinese (zh)
Other versions
CN109385484A (en)
Inventor
施佳辉
徐平
唐蜀昆
田飞
高林瑞
高慧英
职晓阳
丁章贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MENGHAI TEA INDUSTRY Co.,Ltd.
Yunnan Dayi Microbial Technology Co., Ltd
Original Assignee
Yunnan Dayi Microbial Technology Co ltd
Menghai Tea Industry Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan Dayi Microbial Technology Co ltd, Menghai Tea Industry Co ltd
Priority to CN201710656798.9A
Publication of CN109385484A
Application granted
Publication of CN109385484B
Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Mycology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Botany (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention belongs to the field of species and strain identification, and in particular relates to a DNA barcode, a primer pair, a kit, a method and uses thereof for identifying adenine-feeding arthrospore yeast strains. The DNA barcode accurately identifies the Pu'er tea fermentation strain, adenine-feeding arthrospore yeast TMCC70007, and distinguishes it quickly and reliably from easily confused strains and from other strains of the same species.

Description

DNA bar code, primer, kit, method and application
Technical Field
The invention belongs to the field of species and strain identification, and in particular relates to a DNA barcode, a primer pair, a kit, a method and uses thereof.
Background
Pu'er tea is a post-fermented tea carrying a Yunnan geographical indication, made through a series of processing steps from sun-dried raw tea of the large-leaf variety. The traditional process is as follows: freshly picked leaves are rolled, sun-dried, cleaned of impurities, moistened, pile-fermented, air-dried, sieved, pressed into shape, and packaged for delivery. Pile fermentation is the main factor in the formation of Pu'er tea: during this step the tea polyphenols, caffeine, polysaccharides and other constituents of the leaves change substantially, giving Pu'er tea its characteristic flavor, taste, quality and various health effects.
In traditional Pu'er tea production, the warm, humid environment activates enzymes in the tea leaves, and some leaf constituents are converted into substances that microorganisms can use. The microorganisms then multiply in large numbers and produce abundant intracellular and extracellular enzymes, which catalyze a series of conversions of the leaf constituents; together with other factors such as heat and humidity, this gradually gives rise to Pu'er teas of different qualities. Teas from different origins differ in flavor and quality because their microbial community structures differ.
Besides its unique flavor and cultural significance, Pu'er tea attracts attention for its reported health effects, such as weight loss, lowering of blood sugar and blood lipids, prevention and improvement of cardiovascular disease, anti-aging, anti-cancer and anti-inflammatory activity, aiding digestion and nourishing the stomach. In recent years Pu'er tea has become increasingly popular, and growing market demand has driven the development of the Pu'er tea industry and promoted economic growth in Yunnan.
However, Pu'er tea also faces a number of problems, such as unstable product quality, potential safety hazards, long production cycles, high labor input, excessive microbial counts and mite infestation. As living standards rise, consumers pay increasing attention to food hygiene and safety. Food quality and safety incidents have been frequent in recent years, and the safety of tea has become an increasingly prominent concern; besides pesticide residues, the microorganisms involved in pile fermentation can also be an important factor affecting tea quality.
At present, most manufacturers still produce Pu'er tea by experience-based, semi-natural artificial pile fermentation. Although the communities in the pile, dominated by common microorganisms such as the adenine-feeding arthrospore yeast (Saccharomyces adeninivorans), are relatively stable, product stability still has considerable room for improvement. Potential safety hazards inevitably exist in the production process. To win the favor of consumers and the acceptance of the market, break through foreign trade barriers and improve the competitiveness of Pu'er tea enterprises, the production process must be made artificially controllable, clean and efficient, new products must be developed continuously, and the industrial chain must be extended. To this end, technological innovation is required: only by devising new Pu'er tea processes that are safe, clean, efficient, artificially controllable and automated can the healthy development of the Pu'er tea industry be ensured and long-term benefits be brought to the country and its people.
As research on Pu'er tea has advanced, more and more people have begun to explore artificially inoculated fermentation, and a number of strains are now used in the artificially controlled fermentation of Pu'er tea. In recent years, with growing market demand, many small manufacturers that lack systematic, in-depth research on Pu'er tea have exploited other parties' patents for profit, for example by using patented strains and patented inoculation-fermentation methods to process and produce Pu'er tea. This has seriously damaged the economic interests of the Pu'er tea producers that hold the relevant patents.
To protect the germplasm resources of Pu'er tea fermentation microorganisms effectively, to prevent infringing misuse of these microorganisms, and to overcome the difficulty of producing evidence when strains used in artificially controlled Pu'er tea fermentation are infringed, a method for identifying Pu'er tea fermentation strains quickly and accurately is urgently needed. A molecular identification method for these strains must be established, and a method that uses DNA barcoding to identify and distinguish closely similar strains must be developed, so that industrial production strains can be identified and authenticated by combining morphological characteristics with molecular data analysis. This provides a theoretical basis for developing controlled industrial fermentation processes for Pu'er tea and for protecting the resources involved, and promotes the healthy and prosperous development of the Pu'er tea industry.
Disclosure of Invention
To overcome the shortcomings of morphological identification of the adenine-feeding arthrospore yeast strains used in the fermentative production of Pu'er tea, the invention provides a DNA barcode, a primer pair, a kit, a method and uses thereof for identifying such strains. With these, the adenine-feeding arthrospore yeast strain TMCC70007 can be identified and distinguished quickly, new Pu'er tea fermentation processes can be evaluated rapidly, interference from contaminating microorganisms during fermentation can be prevented, and a means and basis for producing evidence is available for artificially controlled Pu'er tea fermentation processes and against illegal misuse of the strain.
To achieve this, the invention adopts the following technical solutions:
(1) A DNA barcode for identifying an adenine-feeding arthrospore yeast strain, the DNA barcode being derived from the genome of the adenine-feeding arthrospore yeast strain TMCC70007 and comprising a sequence of at least 500 bp selected from the DNA sequence shown as SEQ ID No.1, the length of the DNA barcode being 500 bp to 3500 bp, preferably 500 bp to 2200 bp, more preferably 500 bp to 1600 bp.
(2) The DNA barcode according to (1), wherein the nucleotide sequence thereof is shown as SEQ ID No.1, SEQ ID No.2, SEQ ID No.4 or SEQ ID No. 7.
(3) A primer pair for amplifying the DNA barcode according to (1).
(4) The primer pair according to (3), wherein the nucleotide sequence of the forward primer is identical to a sequence in the genome of the adenine-feeding arthrospore yeast strain TMCC70007 that lies in the region from 1000 bp upstream of position 1 of the nucleotide sequence shown as SEQ ID No.1 to position 926 of the nucleotide sequence shown as SEQ ID No.1, the forward primer being 20-30 bp long; and the reverse primer is the reverse complement of a sequence in the genome of strain TMCC70007 that lies in the region from position 501 of the nucleotide sequence shown as SEQ ID No.1 to 1000 bp downstream of the last position of the nucleotide sequence shown as SEQ ID No.1, the reverse primer being 20-30 bp long.
(5) The primer pair according to (4), wherein the nucleotide sequences of the forward primer and the reverse primer are, respectively:
forward primer: 5'-GCCGCACGCTCAATATTTTTC-3';
reverse primer: 5'-GCTGATCGGGTAGAGCAAGT-3'.
(6) A kit for identifying an adenine-feeding arthrospore yeast strain, comprising the primer pair according to (3).
(7) A method for identifying an adenine-feeding arthrospore yeast strain, comprising the steps of:
a) providing genomic DNA of the strain to be tested;
b) performing PCR amplification using the genomic DNA of step a) as the template and the primer pair according to (3) to obtain a PCR product;
c) detecting the PCR product by electrophoresis; if there is no target band, the strain to be tested is judged not to be the adenine-feeding arthrospore yeast strain TMCC70007, and if the target band is present, step d) is performed;
d) sequencing the PCR product to obtain the nucleotide sequence to be tested, and comparing it for homology with the nucleotide sequence of the DNA barcode according to (1); if the homology is greater than 99%, the strain to be tested is judged to be the adenine-feeding arthrospore yeast strain TMCC70007.
(8) Use of the DNA barcode according to (1) for identifying an adenine-feeding arthrospore yeast strain.
(9) Use of the primer pair according to (3) for identifying an adenine-feeding arthrospore yeast strain.
(10) Use of the kit according to (6) for identifying an adenine-feeding arthrospore yeast strain.
Compared with the prior art, the invention has the following advantages and positive effects:
1. Using proteogenomics, the invention discovered a protein-coding gene missed by annotation in the genome of the adenine-feeding arthrospore yeast strain TMCC70007, and found that the coding sequence of this protein (SEQ ID No.1) is unique to the genome of strain TMCC70007. Through careful study and comparative analysis, the invention further developed this specific DNA sequence into a DNA barcode that serves as an effective tool for identifying strain TMCC70007, which is used in the industrial fermentation of Pu'er tea. The barcode sequence allows strains within the adenine-feeding arthrospore yeast species to be identified and distinguished quickly and accurately.
2. The DNA barcode discovered by proteogenomics has better specificity than those of the prior art.
3. The invention further found that sequences within the SOF1 gene (e.g. SEQ ID No.1) are more universal, easier to amplify and easier to align than those of other genes.
4. The invention establishes a standard gene sequence and a sample identification method for the adenine-feeding arthrospore yeast strain TMCC70007 used in the industrial fermentation of Pu'er tea, and markedly improves identification efficiency compared with traditional morphological identification. The method places low demands on sample integrity, and its identification criterion is quantifiable, which provides an effective basis for the timely authentication of Pu'er tea and its germplasm resources. In addition, when morphologically confusable species are included, an identification rule can be established using a phylogenetic tree based on cluster analysis. The reliability and accuracy of the identification are therefore much better than those of existing molecular identification methods, and the invention fills the gap in DNA-barcode-based identification of Pu'er tea fermentation production strains.
Drawings
FIGS. 1A and 1B show the mass spectra of peptide fragments SSDTYLPAR and NLDPALHPFER, respectively.
FIGS. 2A and 2B compare the mass spectra of the originally identified peptide fragments with those of the synthetic peptide fragments SSDTYLPAR and NLDPALHPFER, respectively. The originally identified peptide fragments were obtained by mass spectrometric identification; the upper part of each panel is the spectrum of the originally identified peptide fragment and the lower part is the spectrum of the synthetic peptide fragment.
FIG. 3 shows the correspondence between the mRNA sequence of the protein coding frame and the encoded protein sequence in the region of the peptide fragments.
FIG. 4 shows the correspondence between the mRNA sequence transcribed from the gene encoding SOF1 and the protein sequence it encodes.
FIG. 5A shows the SDS-PAGE separation of the whole-cell protein of TMCC70007; FIG. 5B shows the molecular weight verification plot for the SOF1 protein (theoretical molecular weight of SOF1 is 50.78 kDa, lg(50.78) = 1.71, located at abscissa 1.71 in FIG. 5B).
FIG. 6A shows the NCBI-BLASTN alignment of the barcode sequence SEQ ID NO.1 with sequences homologous to it; the two short grey segments at the bottom of FIG. 6A represent 2 homologous sequences. FIG. 6B shows the alignment of SEQ ID NO.1 with the covered portions of these two sequences. FIG. 6C shows the alignment of SEQ ID NO.1 with the annotated protein of Millerozyma farinosa (Pichia farinosa) strain CBS 7064T (accession XM_004200814.1).
FIG. 7 shows a comparison of the barcode sequence SEQ ID NO.1 of the adenine-feeding arthrospore yeast strain TMCC70007 with the homologous sequence of the LS3 strain of the same species.
FIG. 8 shows the positions of the primers used to amplify SEQ ID NO.1 in one embodiment of the invention. The underlined regions of the sequence are the primer sites, the bold ATG and TAG are the start and stop sites, the grey-background region is the intron of the gene, and the amplified sequence is SEQ ID NO.7.
FIG. 9 shows the results of agarose gel electrophoresis of PCR amplification products of test strains obtained using the primers of the present invention.
FIG. 10 shows a comparison of the sequence of SEQ ID NO.1 with the PCR amplified sequence of the test strain.
Detailed Description
The invention is further described below through embodiments and with reference to the drawings. These are not intended to limit the invention; those skilled in the art may make various modifications or improvements based on the basic idea of the invention, and all of these fall within its scope provided they do not depart from that basic idea.
Using proteogenomics, the invention found a protein-coding gene missed by annotation in the genome of the adenine-feeding arthrospore yeast strain TMCC70007. Because the encoded protein has high homology with proteins of the SOF superfamily, it is referred to as SOF1. From the newly found protein sequence the invention obtained the corresponding gene sequence in the TMCC70007 genome, referred to as the SOF1 gene, and found that this gene sequence (SEQ ID No.1) is unique to the genome of strain TMCC70007. This specific DNA sequence can therefore be developed into a DNA barcode for identifying strain TMCC70007, the production strain used in the industrial fermentation of Pu'er tea. The DNA barcode obtained in this way has higher specificity than those of the prior art.
Considering that a DNA barcode should have a suitable length and sufficient specificity between strains, the invention, through careful study and comparative analysis of the above characteristic DNA sequence, further obtained a DNA barcode that identifies the adenine-feeding arthrospore yeast strain TMCC70007 accurately and efficiently.
Thus, in one aspect, the invention provides a DNA barcode for identifying an adenine-feeding arthrospore yeast strain, the DNA barcode being derived from the genome of the adenine-feeding arthrospore yeast strain TMCC70007 and comprising a sequence of at least 500 bp selected from the DNA sequence shown as SEQ ID No.1, the length of the DNA barcode being 500 bp to 3500 bp, preferably 500 bp to 2200 bp, more preferably 500 bp to 1600 bp.
The barcode sequence of the invention allows strains within the adenine-feeding arthrospore yeast species to be identified and distinguished quickly and accurately.
SEQ ID No.1 is 1426 bp long. Because it is specific to the genome of the adenine-feeding arthrospore yeast strain TMCC70007, the sequence can be used as a DNA barcode for identifying that strain. Theoretical analysis and experimental verification also show that a longer DNA barcode containing this sequence gives even better assurance of specificity. However, when the DNA barcode is too long (e.g. longer than 3500 bp), amplification becomes less practical. Conversely, a DNA barcode of at least 500 bp meets the operational requirements of easy amplification and easy alignment; theoretical analysis and experimental verification show that a sequence of at least 500 bp selected from the DNA sequence shown as SEQ ID No.1 is sufficient to identify and distinguish strains within the species quickly and accurately.
In a preferred embodiment of the invention, the DNA barcode sequence is as shown in SEQ ID No.1, SEQ ID No.2, SEQ ID No.4 or SEQ ID No. 7.
In this context, the term "missed-annotation gene" means a gene that, after genome sequencing of the species has been completed, cannot be predicted by gene prediction software (e.g. GeneMark, Augustus, Glimmer). Such genes are generally expressed at low levels or only under specific conditions, and are therefore difficult to find in research.
The term "DNA barcoding" refers to a molecular identification technique that identifies species using a standard, short DNA fragment of the genome, allowing rapid and accurate species identification.
The term "six-frame translation" is an established term in proteomics and genomics. Briefly, because proteins are encoded by triplet codons, a given DNA sequence can be read in 3 reading frames, and its complementary strand in another 3, for a total of 6 coding possibilities (+1, +2, +3, -3, -2, -1).
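The definition above can be illustrated with a minimal Python sketch. It is not the software used in the invention; the function names are hypothetical, and the standard nuclear genetic code is assumed.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return the six translations keyed '+1', '+2', '+3', '-1', '-2', '-3'."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(name, protein)
```

Stop codons are written as `*`, so stop-to-stop segments can be read directly from each translated frame.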
Another embodiment of the present invention provides a primer pair for amplifying the DNA barcode according to the present invention.
Preferably, the nucleotide sequence of the forward primer is identical to a sequence in the genome of the adenine-feeding arthrospore yeast strain TMCC70007 that lies in the region from 1000 bp upstream of position 1 of the nucleotide sequence shown as SEQ ID No.1 to position 926 of the nucleotide sequence shown as SEQ ID No.1, the forward primer being 20-30 bp long; and the reverse primer is the reverse complement of a sequence in the genome of strain TMCC70007 that lies in the region from position 501 of the nucleotide sequence shown as SEQ ID No.1 to 1000 bp downstream of the last position of the nucleotide sequence shown as SEQ ID No.1, the reverse primer being 20-30 bp long.
In a more preferred embodiment, the nucleotide sequences of the forward primer and the reverse primer are as follows:
SOF1-F: 5'-GCCGCACGCTCAATATTTTTC-3' (SEQ ID No.5);
SOF1-R: 5'-GCTGATCGGGTAGAGCAAGT-3' (SEQ ID No.6).
The primers of the invention enable specific amplification of the DNA barcode sequence.
The invention also provides a kit for identifying an adenine-feeding arthrospore yeast strain, which comprises the primer pair.
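As a quick sanity check on the two primers, the following Python sketch verifies the 20-30 bp length requirement and derives the genomic-strand sequence that the reverse primer anneals against. The helper names are hypothetical, and the GC fraction is only an informal indicator that the text itself does not specify.

```python
# Primer sequences as given in the text (SEQ ID No.5 and SEQ ID No.6).
SOF1_F = "GCCGCACGCTCAATATTTTTC"
SOF1_R = "GCTGATCGGGTAGAGCAAGT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def length_ok(primer, shortest=20, longest=30):
    """Check the 20-30 bp primer length range stated in the text."""
    return shortest <= len(primer) <= longest


def gc_fraction(primer):
    """Fraction of G and C bases (illustrative only; not a criterion in the text)."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


if __name__ == "__main__":
    for name, primer in (("SOF1-F", SOF1_F), ("SOF1-R", SOF1_R)):
        print(name, len(primer), "bp", "length ok:", length_ok(primer),
              "GC fraction:", round(gc_fraction(primer), 3))
    # The reverse primer is the reverse complement of its genomic site, so the
    # site on the SEQ ID No.1 strand reads as the primer's reverse complement.
    print("SOF1-R site on the forward strand:", reverse_complement(SOF1_R))
```

Locating the forward primer and the reverse primer's site within the claimed regions would additionally require the TMCC70007 genomic sequence around SEQ ID No.1, which is not reproduced here.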
In another embodiment, the kit further comprises a DNA barcode according to the present invention. The DNA barcode may be present on a recording medium. The recording medium is, for example, an optical disc. The kit may also comprise any tools and reagents for the experimental procedure.
In yet another aspect, the invention provides a method for identifying an adenine-feeding arthrospore yeast strain, comprising the steps of:
a) providing genomic DNA of the strain to be tested;
b) performing PCR amplification using the genomic DNA of step a) as the template and the primer pair of the invention to obtain a PCR product;
c) detecting the PCR product by electrophoresis; if there is no target band, the strain to be tested is judged not to be the adenine-feeding arthrospore yeast strain TMCC70007, and if the target band is present, step d) is performed;
d) sequencing the PCR product to obtain the nucleotide sequence to be tested, and comparing it for homology with the nucleotide sequence of the DNA barcode of the invention; if the homology is greater than 99%, the strain to be tested is judged to be the adenine-feeding arthrospore yeast strain TMCC70007.
In a specific embodiment of the invention, the PCR amplification program is: 1) pre-denaturation at 94-96 °C for 8 min; 2) denaturation at 94-96 °C for 45 s, annealing at 54-57 °C for 45 s, and extension at 72 °C for 1 min 15 s, with step 2) repeated for 30-35 cycles; 3) final extension at 72 °C for 10 min.
In another embodiment of the invention, the method further comprises performing cluster analysis (e.g. a phylogenetic tree) on the sequence to be tested obtained from the sequencing result together with the DNA barcode of the invention; if the sequence to be tested clusters with the DNA barcode, the strain to be tested is judged to be the adenine-feeding arthrospore yeast strain TMCC70007.
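For planning purposes, the program above can be written down as data and its total hold time computed. This is a minimal sketch with hypothetical names; it ignores instrument ramp times, which the text does not give.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: tuple   # (lowest, highest) temperature in degrees Celsius
    seconds: int    # hold time


# Values taken from the PCR program described in the text.
PRE_DENATURATION = Step("pre-denaturation", (94, 96), 8 * 60)
CYCLE = (
    Step("denaturation", (94, 96), 45),
    Step("annealing", (54, 57), 45),
    Step("extension", (72, 72), 75),
)
FINAL_EXTENSION = Step("final extension", (72, 72), 10 * 60)


def hold_time_seconds(cycles):
    """Total programmed hold time, excluding ramping between temperatures."""
    if not 30 <= cycles <= 35:
        raise ValueError("the text specifies 30-35 cycles")
    per_cycle = sum(step.seconds for step in CYCLE)
    return PRE_DENATURATION.seconds + cycles * per_cycle + FINAL_EXTENSION.seconds


if __name__ == "__main__":
    for n in (30, 35):
        print(n, "cycles:", hold_time_seconds(n) / 60, "min")
```

With these values the programmed hold time is 100.5 min for 30 cycles and 114.25 min for 35 cycles.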
In a specific embodiment of the invention, genomic DNA extracted from the strain to be identified is amplified by PCR with the primer pair of the invention and then examined by agarose gel electrophoresis. The strain is first assessed by the presence or absence of a PCR product: if no corresponding target band is amplified, the strain is not TMCC70007; if the target band is amplified, the strain may be TMCC70007. For further identification, the PCR product is sequenced and the sequencing result is compared with the DNA barcode sequence to obtain the similarity (i.e. homology) between the two sequences. If the sequence homology is less than 99%, the strain to be tested is judged not to be the adenine-feeding arthrospore yeast strain TMCC70007; if the sequence homology is greater than or equal to 99%, the strain to be tested is judged to be the adenine-feeding arthrospore yeast strain TMCC70007.
If cluster analysis such as a phylogenetic tree is performed, the DNA barcode is used together with the DNA sequencing results (i.e. the sequences to be tested) of each strain to be identified to construct a neighbor-joining (NJ) phylogenetic tree with MEGA 5.1 or PAUP software. If the sequence to be tested clusters with the DNA barcode of the adenine-feeding arthrospore yeast strain TMCC70007, the strain is identified as the adenine-feeding arthrospore yeast strain TMCC70007.
The term "cluster" as used herein means that, after phylogenetic tree analysis, the sequences fall on the same branch and have the same evolutionary distance.
The invention also provides use of the DNA barcode for identifying an adenine-feeding arthrospore yeast strain.
The invention also provides use of the primer pair for identifying an adenine-feeding arthrospore yeast strain.
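The two-stage decision described above (band check, then a 99% homology threshold) can be sketched in Python as follows. The functions are hypothetical illustrations: they assume the two sequences have already been aligned to equal length with `-` as the gap character, and they count identity over aligned columns, which is only one of several possible conventions and is not specified in the text.

```python
def percent_identity(aligned_query, aligned_reference):
    """Percent identity over aligned columns; a gap opposite a base counts as a mismatch."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(q, r)
               for q, r in zip(aligned_query.upper(), aligned_reference.upper())
               if not (q == "-" and r == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for q, r in columns if q == r)
    return 100.0 * matches / len(columns)


def identify(band_present, identity=None, threshold=99.0):
    """Apply the decision rule: no band -> not TMCC70007; otherwise use the threshold."""
    if not band_present:
        return "not TMCC70007"
    if identity is None:
        return "possibly TMCC70007; sequence the PCR product"
    return "TMCC70007" if identity >= threshold else "not TMCC70007"


if __name__ == "__main__":
    print(identify(False))
    print(identify(True))
    print(identify(True, percent_identity("ACGTACGT", "ACGTACGT")))
    print(identify(True, percent_identity("ACGTACGT", "ACGAACGT")))
```

In practice the alignment itself would come from a dedicated tool (the figures refer to NCBI BLASTN); the sketch only covers the thresholding step.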
Examples
The present invention is further illustrated by the following specific examples. The methods used in the examples, unless otherwise specified, were carried out using conventional methods and known tools.
Example 1: Acquisition of the SOF1 gene and the DNA barcode
1. Using high-coverage proteomics with the pFind and pAnno software, a deep-coverage proteome study was carried out on the adenine-feeding arthrospore yeast strain TMCC70007, an industrial Pu'er tea fermentation strain, and the annotated coding genes of its genome were verified. Specifically, to find novel protein-coding regions, a six-frame translation strategy was applied to the genome data of strain TMCC70007: all 6 coding possibilities of the genome (+1, +2, +3, -1, -2, -3) were enumerated exhaustively to build a six-frame translation database. Each resulting protein sequence is referred to as a "six-frame translated protein sequence" and the corresponding nucleic acid sequence as a "six-frame translated nucleic acid sequence". In general, a six-frame translated nucleic acid sequence runs from one stop codon to the next, and is also referred to herein as a "protein coding frame". New peptide fragments and new proteins were discovered and identified by searching the whole-protein mass spectrometry data of strain TMCC70007 against this six-frame translation database with pFind and pAnno.
This identified the peptide fragments SSDTYLPAR and NLDPALHPFER, which are absent from the prior-art annotated genome of the adenine-feeding arthrospore yeast strain TMCC70007; their mass spectra are shown in FIG. 1.
Manual inspection of the spectra showed that almost the entire y-ion series was detected in the tandem mass spectra (MS2) of peptide fragments SSDTYLPAR and NLDPALHPFER, with good matching and strong signals, so the identifications are reliable. To confirm them further, peptides were chemically synthesized according to the amino acid sequences of the newly identified peptides SSDTYLPAR and NLDPALHPFER, and MS2 spectra of the synthetic peptides were acquired under the same mass spectrometry conditions (see FIG. 2). The MS2 spectra of the synthetic peptides agree with those of the new peptide fragments originally identified in the high-coverage proteogenomic study.
Next, the high-energy collision MS2 spectra of the synthetic peptides were checked: both the precursor ions and the fragment ions agreed with the theoretical values, indicating that the synthetic peptides have the correct sequences (FIG. 2). On this basis, the MS2 spectra of the synthetic peptides were compared manually with the spectra of the new peptide fragments identified from the large-scale proteome data. The two were almost completely consistent, with cosine similarity values of the fragment ions as high as 0.96 and 0.99, respectively, confirming that the new peptide fragments identified from the Pu'er tea industrial fermentation strain TMCC70007 are correct.
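The cosine similarity used to compare fragment-ion spectra can be sketched as below. This is an assumption-laden illustration: the text does not say how peaks were matched or weighted, so the sketch simply bins peaks by rounded m/z (a hypothetical `bin_width` parameter) and compares summed intensities. The example peak lists are invented for demonstration and are not data from the patent.

```python
import math


def bin_spectrum(peaks, bin_width=0.5):
    """Sum intensities of (m/z, intensity) peaks falling into the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(spectrum_a, spectrum_b):
    """Cosine of the angle between two binned intensity vectors (0.0 if either is empty)."""
    dot = sum(value * spectrum_b.get(key, 0.0) for key, value in spectrum_a.items())
    norm_a = math.sqrt(sum(v * v for v in spectrum_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spectrum_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Invented peak lists, for illustration only.
    identified = bin_spectrum([(175.1, 40.0), (246.2, 100.0), (343.2, 65.0)])
    synthetic = bin_spectrum([(175.1, 35.0), (246.2, 100.0), (343.2, 70.0)])
    print(round(cosine_similarity(identified, synthetic), 3))
```

A value close to 1 indicates that the relative fragment-ion intensities of the two spectra are nearly the same.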
2. Based on the positions of the new peptide fragments, the region between the preceding stop codon and the following stop codon was delimited, giving the six-frame translated nucleic acid sequence (protein coding frame) SEQ ID NO.2. The mRNA corresponding to SEQ ID NO.2 and the amino acid sequence it encodes are shown in FIG. 3.
3. To determine the coding start and end sites of the gene (SOF1), the protein coding frame was extended by 1000 bp both upstream and downstream, and gene prediction was performed with GeneMark.hmm using Schizosaccharomyces pombe as the reference species. A gene was predicted in this region whose coding frame is consistent with the protein coding frame sequence; this determined the coding start and stop sites of the gene, whose sequence is SEQ ID NO.1. Positions 4 to 76 of SEQ ID NO.1 form the intron of the gene.
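The stop-to-stop delimitation in step 2 can be sketched in Python. The sketch works on one already-translated frame in which stop codons are written as `*`; the function names and the toy frame are hypothetical, and only forward-strand coordinates are handled.

```python
def stop_to_stop_segment(frame_protein, peptide):
    """Return (start, end) protein indices of the stop-to-stop segment containing peptide.

    start is the first residue after the preceding '*', end is the index of the
    following '*' (or the end of the frame), so frame_protein[start:end] is the segment.
    Returns None if the peptide does not occur in this frame.
    """
    pos = frame_protein.find(peptide)
    if pos < 0:
        return None
    start = frame_protein.rfind("*", 0, pos) + 1   # rfind gives -1 if no stop precedes
    end = frame_protein.find("*", pos + len(peptide))
    if end < 0:
        end = len(frame_protein)
    return start, end


def to_nucleotide_span(start, end, frame_offset):
    """Map protein indices in a forward frame (offset 0, 1 or 2) to 0-based nucleotide coordinates."""
    return frame_offset + 3 * start, frame_offset + 3 * end


if __name__ == "__main__":
    toy_frame = "MK*AQSSDTYLPARGL*TT"   # invented frame containing one peptide from the text
    span = stop_to_stop_segment(toy_frame, "SSDTYLPAR")
    print(span, toy_frame[span[0]:span[1]], to_nucleotide_span(*span, frame_offset=0))
```

The nucleotide span returned here excludes the flanking stop codons; whether SEQ ID NO.2 includes them is not stated in the text.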
The correspondence between the mRNA sequence transcribed from the gene and the amino acid sequence of the protein translated from the mRNA sequence is shown in FIG. 4. Peptide fragments SSDTYLPAR and NLDPALHPFER are located upstream of the gene.
The nucleotide sequence of the gene is 1426bp in total, the intron region is removed, 450 amino acids are coded, the theoretical molecular weight is 50.78kDa, and the theoretically coded amino acid sequence is shown as SEQ ID NO. 3.
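The lengths quoted above can be cross-checked with simple arithmetic: removing the 73-nt intron (positions 4 to 76) from the 1426-bp gene leaves 1353 coding nucleotides, i.e. 451 codons, or 450 amino acids plus one stop codon. The sketch below performs this check and splices the first 120 nt of SEQ ID NO.1 as copied from the sequence listing; the helper name `remove_intron` is hypothetical.

```python
# First 120 nt of SEQ ID NO.1, copied from the sequence listing.
SEQ1_PREFIX = (
    "atggtaagtttatttcctcggagctataaaaggccaatcggctgctcccttgtatggttc"
    "ttgtgttctaactcagaaaatcaaaacaatctctcggtccagcgatacgtacttgccggc"
)

GENE_LENGTH = 1426                 # length of SEQ ID NO.1 in bp
INTRON_START, INTRON_END = 4, 76   # 1-based, inclusive, as stated in the text


def remove_intron(seq, start, end):
    """Drop the 1-based inclusive interval [start, end] from seq."""
    return seq[:start - 1] + seq[end:]


intron_length = INTRON_END - INTRON_START + 1   # 73 nt
coding_length = GENE_LENGTH - intron_length     # 1353 nt
codons = coding_length // 3                     # 451 codons including the stop
residues = codons - 1                           # 450 amino acids

spliced_prefix = remove_intron(SEQ1_PREFIX, INTRON_START, INTRON_END)
```

After splicing, the sequence starts with atg aaa atc aaa aca, which corresponds to the Met Lys Ile Lys Thr at the start of SEQ ID NO.3.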
4. To further confirm the identified sequences, strain TMCC70007 was cultured in YPD medium, and total protein was extracted, separated by SDS-PAGE, and stained in the gel with Coomassie Brilliant Blue; the molecular weight was then checked against the molecular weight information recorded when the proteome sample was prepared, as shown in FIG. 5. In addition, the gel lane was cut into 25 slices for in-gel digestion and mass spectrometry, and the SOF1 protein was identified in the mass spectrometry data of the 10th slice, matching the predicted molecular weight. The theoretical molecular weight calculated from the amino acid sequence was 50.78 kDa, consistent with the position in the SDS-PAGE gel of the slice containing the protein. This confirmed that the identified gene and protein are correct.
5. The amino acid sequence of the theoretical product of the gene was analyzed by blastp. It shows high homology to the WD40 superfamily and the SOF superfamily; the results are summarized in Table 1. WD40 proteins occur in eukaryotes, act as adaptor/regulatory components in signal transduction, and take part in pre-mRNA processing and cytoskeleton assembly. The WD40 repeat is characterized by a GH dipeptide 11-24 residues from its N-terminus and a WD dipeptide at its C-terminus, and is about 40 amino acids long, hence the name WD40. The SOF1 protein is essential for cell growth and is a component of the RNA processing machinery in the nucleus.
Figure BDA0001369493860000131
The blastp results indicate that the detected, previously unannotated protein is probably the product of an SOF1 homolog and that its sequence is highly conserved.
6. The ORF sequence of this SOF1 gene (i.e., SEQ ID NO.1) was analyzed by NCBI-BLASTN; the results are shown in FIG. 6A and summarized in Table 2. The DNA has essentially no homologous sequences, indicating that the previously unannotated gene SOF1 detected in Blastobotrys adeninivorans is sequence-specific.
TABLE 2 Nucleic acid sequences with higher similarity to SEQ ID NO.1
Figure BDA0001369493860000141
The two sequences found by BLASTN were manually aligned with the SOF1 gene sequence of Blastobotrys adeninivorans, as shown in FIG. 6B. Although the SOF1 amino acid sequences are similar, the DNA sequences differ considerably, and only parts of the DNA sequences are highly similar.
In the NCBI database, the protein sequence has the highest homology to an annotated protein (accession number: XM_004200814.1) of Pichia farinosa strain CBS 7064T; DNAMAN analysis shows that the identity of the two sequences is only 56.48% (FIG. 6C).
This further demonstrates that the ORF sequence of the gene can be used to develop a DNA barcode for the Pu'er tea fermentation strain Blastobotrys adeninivorans TMCC70007.
7. The region from 1000 bp upstream to 1000 bp downstream of SEQ ID NO.1 was then selected (shown as SEQ ID NO.4). SEQ ID NO.4 was analyzed by NCBI-BLASTN against the NCBI Nucleotide collection (nr/nt) database. Only 7% of the sequence showed homology, at 76%, to chromosome H of Pichia farinosa strain CBS 7064 (see Table 3), demonstrating that SEQ ID NO.4 is highly specific. Based on these results, SEQ ID NO.4 can serve as a DNA barcode sequence for identifying Blastobotrys adeninivorans strain TMCC70007. SEQ ID NO.4 was also analyzed by NCBI-BLASTN against the whole-genome shotgun contigs (wgs) database, restricted to the whole genome sequence of Blastobotrys adeninivorans (taxonomy ID: taxid:409370). The homology of SEQ ID NO.4 to the whole-genome sequence Arad1A_contig_1 (CBZY010000006.1) of Blastobotrys adeninivorans LS3 was 90%, with a sequence coverage of 97% (see Table 4). Although TMCC70007 and LS3 belong to the same species, they differ markedly over this DNA segment, enough to distinguish TMCC70007 from the closely related LS3 strain. SEQ ID NO.4 of Blastobotrys adeninivorans TMCC70007 is therefore sufficiently distinct from its homologous sequence in the genome of Blastobotrys adeninivorans LS3 and can be used as a DNA barcode sequence. In conclusion, the sequence from 1000 bp upstream to 1000 bp downstream of SEQ ID NO.1 (SEQ ID NO.4) can be used as a DNA barcode for strain-level identification of Blastobotrys adeninivorans.
TABLE 3 Nucleic acid sequences with higher similarity to SEQ ID NO.4 after searching the NCBI Nucleotide collection (nr/nt) database
Figure BDA0001369493860000151
TABLE 4 Nucleic acid sequences with higher similarity to SEQ ID NO.4 after searching the whole-genome shotgun contigs (wgs) database
Figure BDA0001369493860000152
8. Currently, within the species Blastobotrys adeninivorans, only the genome of strain LS3 has been reported (Kunze et al., Biotechnology for Biofuels 2014, 7:66). Local-BLASTN analysis showed that the gene sequence SEQ ID NO.1 has a homologous sequence in the genome of the LS3 strain. DNAMAN analysis gave a homology of 93.20% between the two sequences: although TMCC70007 and LS3 belong to the same species (Blastobotrys adeninivorans), 96 of the 1426 bases of the DNA barcode sequence differ, as shown in FIG. 7. This further demonstrates that this DNA barcode sequence can effectively distinguish strain TMCC70007 from other strains within the species.
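As a rough illustration of how such a figure relates to the number of differing sites, the sketch below computes percent identity over the columns of a pairwise alignment. The function name and the toy alignment are hypothetical. Note that the plain ratio (1426 - 96)/1426 gives about 93.27%, slightly different from the 93.20% reported by DNAMAN; a plausible but unverified explanation is that alignment tools differ in how gapped columns and alignment length are counted.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base.

    Both inputs must be rows of the same pairwise alignment (equal length,
    '-' for gaps). Gap columns count as mismatches in this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1 for x, y in zip(aligned_a.lower(), aligned_b.lower())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Ungapped estimate from the counts given in the text.
simple_estimate = 100.0 * (1426 - 96) / 1426

# Toy gapped alignment: 5 matching columns out of 6.
toy_identity = percent_identity("acgt-a", "acgtta")
```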
Example 2 identification of strains Using DNA barcodes
Whether a test sample is the strain TMCC70007 used in the Pu'er tea industry is judged from the amplification result of the sample and from its sequence homology to SEQ ID NO.1 of Blastobotrys adeninivorans strain TMCC70007.
(1) Based on SEQ ID NO.1, PCR primers were designed at both ends of the gene using the NCBI primer design tool; the amplification product must include the start and stop sites of the gene. The resulting forward and reverse primer sequences are SOF1-F: 5'-GCCGCACGCTCAATATTTTTC-3' and SOF1-R: 5'-GCTGATCGGGTAGAGCAAGT-3'.
FIG. 8 shows the positions of primers used for amplification of SEQ ID NO.1 in one embodiment of the present invention, the underlined regions in the sequence are the regions where the primers are located, the bold font ATG and TAG are the start and stop sites, the gray background region is the intron region of the gene, and the amplified sequence is SEQ ID NO. 7.
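The relationship between the two primers and the amplified sequence can be checked directly against the sequence listing: the forward primer should match the start of SEQ ID NO.7, and the reverse complement of the reverse primer should match its end. The sketch below does this using only the first 60 nt and the last 26 nt of SEQ ID NO.7 as copied from the listing; the function names are hypothetical.

```python
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned in lower case."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def primers_flank(head, tail, forward, reverse):
    """True if `forward` opens `head` and revcomp(`reverse`) closes `tail`."""
    return (head.lower().startswith(forward.lower())
            and tail.lower().endswith(reverse_complement(reverse)))


SOF1_F = "GCCGCACGCTCAATATTTTTC"   # SEQ ID NO.5
SOF1_R = "GCTGATCGGGTAGAGCAAGT"    # SEQ ID NO.6

# First 60 nt and last 26 nt of SEQ ID NO.7, copied from the sequence listing.
AMPLICON_HEAD = "gccgcacgctcaatatttttcttatcgaaatctgctcaaaatttgtgctccatttctgct"
AMPLICON_TAIL = "cccgctacttgctctacccgatcagc"

flanking_ok = primers_flank(AMPLICON_HEAD, AMPLICON_TAIL, SOF1_F, SOF1_R)
```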
(2) The source of the strain
TABLE 5 information on selected related strains
Figure BDA0001369493860000161
(3) Extraction of strain DNA: yeast genomic DNA was extracted from each strain using the OMEGA E.Z.N.A.™ Yeast DNA Kit (Bio Rad laboratories), and the sample DNA was diluted to 0.5 μg/μL with sterile deionized water.
(4) Amplification of the DNA fragment by polymerase chain reaction (PCR), using the following primers:
forward primer SOF1-F: 5'-GCCGCACGCTCAATATTTTTC-3';
reverse primer SOF1-R: 5'-GCTGATCGGGTAGAGCAAGT-3'.
The PCR reaction volume was 50 μL, and the PCR reagent was Thermo Scientific™ Taq DNA Polymerase (recombinant): ddH2O 37.7 μL, MgCl2 (volume not given), dNTPs 4 μL, forward primer 1 μL, reverse primer 1 μL, Taq DNA polymerase 0.3 μL, and DNA template 1 μL, without dye. The amplification program was: pre-denaturation at 94 °C for 8 min; then 32-35 cycles of denaturation at 94 °C for 45 s, annealing at 56 °C for 45 s, and extension at 72 °C for 1 min 15 s; and a final extension at 72 °C for 10 min.
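For planning purposes, the hold time of the amplification program can be added up from the step durations given above. The sketch below is only an illustration: the function name is hypothetical, and ramp times between temperatures are ignored, so a real thermocycler run will take longer.

```python
PRE_DENATURATION_S = 8 * 60    # 94 °C for 8 min
DENATURATION_S = 45            # 94 °C for 45 s
ANNEALING_S = 45               # 56 °C for 45 s
EXTENSION_S = 75               # 72 °C for 1 min 15 s
FINAL_EXTENSION_S = 10 * 60    # 72 °C for 10 min


def program_seconds(cycles):
    """Total programmed hold time in seconds for a given number of cycles."""
    per_cycle = DENATURATION_S + ANNEALING_S + EXTENSION_S
    return PRE_DENATURATION_S + cycles * per_cycle + FINAL_EXTENSION_S


shortest = program_seconds(32)   # lower end of the 32-35 cycle range
longest = program_seconds(35)    # upper end of the range
```

With these values the programmed hold time is 106 min for 32 cycles and about 114 min for 35 cycles.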
(5) Detection of amplification products: the size of the PCR fragment was checked by electrophoresis on a 1.0% agarose gel in 1× TBE alongside a DNA marker. If the strain to be tested gives no amplified band, it is not Blastobotrys adeninivorans TMCC70007; if a clear band appears with no nonspecific bands, the DNA fragment is sent to a sequencing company for sequencing.
(6) The primers amplify only in Blastobotrys adeninivorans TMCC70007; no amplification is obtained in strains CBS 8244T, CBS 7350, CBS 7370 and CBS 8335 of the same species, or in Blastobotrys raffinosifermentans CBS 6800T. The results are shown in FIG. 9. The sequence amplified by the PCR primers is 1586 bp, as expected. To further verify the amplified DNA, it was sequenced and compared with homologous sequences.
(7) For samples that gave a band, the quality of the sequencing chromatograms was first checked with the software Chromas; once the chromatogram quality met the requirements for data analysis, the forward and reverse reads were assembled with SeqMan from the DNASTAR software package. The assembled sequencing result was proofread manually, and if the homology between the DNA fragment of the strain to be tested and the standard gene sequence of Blastobotrys adeninivorans TMCC70007 is more than 99%, the strain to be tested may be judged to be Blastobotrys adeninivorans strain TMCC70007. For example, alignment with DNAMAN showed that the sequence obtained by sequencing strain TMCC70007 is completely identical to the DNA barcode sequence SEQ ID NO.1, further demonstrating the reliability of the method. The results are shown in Table 6 and FIG. 10.
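The decision logic of steps (5) to (7) can be summarized in a small function. This is a hedged sketch: the function name and the returned labels are hypothetical, and the handling of a band whose sequence homology is 99% or lower ("not confirmed") is an assumption, since the text only states the outcome for homology above 99%.

```python
def classify_sample(band_present, homology_percent=None, threshold=99.0):
    """Apply the band check and the homology threshold described in the text.

    band_present: whether the gel shows the expected amplification band.
    homology_percent: homology of the sequenced product to the barcode
        sequence, or None if the product has not been sequenced yet.
    """
    if not band_present:
        return "not TMCC70007"
    if homology_percent is None:
        return "sequence the PCR product"
    if homology_percent > threshold:
        return "consistent with TMCC70007"
    return "not confirmed"  # assumption: the text does not define this case


results = [
    classify_sample(False),
    classify_sample(True),
    classify_sample(True, 100.0),
    classify_sample(True, 93.2),
]
```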
TABLE 6
Figure BDA0001369493860000171
Figure BDA0001369493860000181
SEQUENCE LISTING
<110> Menghai tea industry, Limited liability company
<120> DNA barcodes, primers, kits, methods and uses
<130> FI-163711-59:52/C
<160> 7
<170> PatentIn version 3.5
<210> 1
<211> 1426
<212> DNA
<213> Arthrospora adenantha (Blastobotrys adeninivorans)
<400> 1
atggtaagtt tatttcctcg gagctataaa aggccaatcg gctgctccct tgtatggttc 60
ttgtgttcta actcagaaaa tcaaaacaat ctctcggtcc agcgatacgt acttgccggc 120
tcgaaatact gaagcacaac gtctgcctcg gaacttggac cctgcacttc acccgtttga 180
gcgagctaga gaatacacca gagctctaaa tgccacaaag ctagaacgta tgttcgccca 240
gccattcatt ggccagctgg gtaacggaca tatcgatgga gtgtactcta ttgccaaaaa 300
ccttcactcg ctgtcccgac tggccacggg atctggtgat ggaatcgtaa agctgtggga 360
tctcagcacc cgagacgaaa tcttctctgt aaaggcccac acaaatattg tccgaggact 420
cacaatgacc cccaacggaa acctcttgtc ctgtgctact gacaagtcca tcaagctgtg 480
ggatatcacc gacaaggcat ccagccagga gcccatgcaa acatacctag gagctagtgg 540
attcagtggc attgaccatc accgagacga gggcaagttt gtcaccgccg gagacactgt 600
agagctgtgg gatacaaatc gatcaaagcc aatctcaaac ctgtcatggg gcgccgacag 660
cgtgcactct gttcgattca accagaccga aacctccatt gtcgcatctt caggagccga 720
tcgctctatt gtcatttacg atcttcgtac ctcgtcacct gtacaaaagc tggtggctac 780
catgtccacc aacgccattg cctggaaccc aatggaagca ttcaactttg ctgccgctag 840
cgaagatcac aatgtctatc tttacgatat gcgtaagtta agccggtccc tcaacgttta 900
caaggaccac gttgccgccg ttatggatgt ggacttttcc cccaccggac aagagctggt 960
cactggttct tacgaccgat ccatccgcct tttccgggtc cgcgagggcc actccaggga 1020
gatctaccac accaagcgta tgcaaagggt gttttgcgtt aaattctcca tggattcaaa 1080
gtacacagtg tccgggtccg atgacggtaa cgtgcgtctc tggcgagcca atgcttccga 1140
acgcgcaggt gttaagtcag ccaagcagcg tgcaaagctt gagtatgatg cagcgcttaa 1200
ggaacggttc aagcacatgc ccgagattcg ccggattgct cgtcatcgtc acgtccccaa 1260
gccgatcaag aaggccgggg aaatcaagaa ggtggagctg gaatctctgc gcagaaagga 1320
ggacaatgtg cgccgtcact caaagaaggg cgcggtcccg tttgcaaagg agcgagagaa 1380
gcatattgtt ggtactgccg tcaaggacga cgactctcac aaatga 1426
<210> 2
<211> 1422
<212> DNA
<213> Arthrospora adenantha (Blastobotrys adeninivorans)
<400> 2
taagtttatt tcctcggagc tataaaaggc caatcggctg ctcccttgta tggttcttgt 60
gttctaactc agaaaatcaa aacaatctct cggtccagcg atacgtactt gccggctcga 120
aatactgaag cacaacgtct gcctcggaac ttggaccctg cacttcaccc gtttgagcga 180
gctagagaat acaccagagc tctaaatgcc acaaagctag aacgtatgtt cgcccagcca 240
ttcattggcc agctgggtaa cggacatatc gatggagtgt actctattgc caaaaacctt 300
cactcgctgt cccgactggc cacgggatct ggtgatggaa tcgtaaagct gtgggatctc 360
agcacccgag acgaaatctt ctctgtaaag gcccacacaa atattgtccg aggactcaca 420
atgaccccca acggaaacct cttgtcctgt gctactgaca agtccatcaa gctgtgggat 480
atcaccgaca aggcatccag ccaggagccc atgcaaacat acctaggagc tagtggattc 540
agtggcattg accatcaccg agacgagggc aagtttgtca ccgccggaga cactgtagag 600
ctgtgggata caaatcgatc aaagccaatc tcaaacctgt catggggcgc cgacagcgtg 660
cactctgttc gattcaacca gaccgaaacc tccattgtcg catcttcagg agccgatcgc 720
tctattgtca tttacgatct tcgtacctcg tcacctgtac aaaagctggt ggctaccatg 780
tccaccaacg ccattgcctg gaacccaatg gaagcattca actttgctgc cgctagcgaa 840
gatcacaatg tctatcttta cgatatgcgt aagttaagcc ggtccctcaa cgtttacaag 900
gaccacgttg ccgccgttat ggatgtggac ttttccccca ccggacaaga gctggtcact 960
ggttcttacg accgatccat ccgccttttc cgggtccgcg agggccactc cagggagatc 1020
taccacacca agcgtatgca aagggtgttt tgcgttaaat tctccatgga ttcaaagtac 1080
acagtgtccg ggtccgatga cggtaacgtg cgtctctggc gagccaatgc ttccgaacgc 1140
gcaggtgtta agtcagccaa gcagcgtgca aagcttgagt atgatgcagc gcttaaggaa 1200
cggttcaagc acatgcccga gattcgccgg attgctcgtc atcgtcacgt ccccaagccg 1260
atcaagaagg ccggggaaat caagaaggtg gagctggaat ctctgcgcag aaaggaggac 1320
aatgtgcgcc gtcactcaaa gaagggcgcg gtcccgtttg caaaggagcg agagaagcat 1380
attgttggta ctgccgtcaa ggacgacgac tctcacaaat ga 1422
<210> 3
<211> 450
<212> PRT
<213> Arthrospora adenantha (Blastobotrys adeninivorans)
<400> 3
Met Lys Ile Lys Thr Ile Ser Arg Ser Ser Asp Thr Tyr Leu Pro Ala
1 5 10 15
Arg Asn Thr Glu Ala Gln Arg Leu Pro Arg Asn Leu Asp Pro Ala Leu
20 25 30
His Pro Phe Glu Arg Ala Arg Glu Tyr Thr Arg Ala Leu Asn Ala Thr
35 40 45
Lys Leu Glu Arg Met Phe Ala Gln Pro Phe Ile Gly Gln Leu Gly Asn
50 55 60
Gly His Ile Asp Gly Val Tyr Ser Ile Ala Lys Asn Leu His Ser Leu
65 70 75 80
Ser Arg Leu Ala Thr Gly Ser Gly Asp Gly Ile Val Lys Leu Trp Asp
85 90 95
Leu Ser Thr Arg Asp Glu Ile Phe Ser Val Lys Ala His Thr Asn Ile
100 105 110
Val Arg Gly Leu Thr Met Thr Pro Asn Gly Asn Leu Leu Ser Cys Ala
115 120 125
Thr Asp Lys Ser Ile Lys Leu Trp Asp Ile Thr Asp Lys Ala Ser Ser
130 135 140
Gln Glu Pro Met Gln Thr Tyr Leu Gly Ala Ser Gly Phe Ser Gly Ile
145 150 155 160
Asp His His Arg Asp Glu Gly Lys Phe Val Thr Ala Gly Asp Thr Val
165 170 175
Glu Leu Trp Asp Thr Asn Arg Ser Lys Pro Ile Ser Asn Leu Ser Trp
180 185 190
Gly Ala Asp Ser Val His Ser Val Arg Phe Asn Gln Thr Glu Thr Ser
195 200 205
Ile Val Ala Ser Ser Gly Ala Asp Arg Ser Ile Val Ile Tyr Asp Leu
210 215 220
Arg Thr Ser Ser Pro Val Gln Lys Leu Val Ala Thr Met Ser Thr Asn
225 230 235 240
Ala Ile Ala Trp Asn Pro Met Glu Ala Phe Asn Phe Ala Ala Ala Ser
245 250 255
Glu Asp His Asn Val Tyr Leu Tyr Asp Met Arg Lys Leu Ser Arg Ser
260 265 270
Leu Asn Val Tyr Lys Asp His Val Ala Ala Val Met Asp Val Asp Phe
275 280 285
Ser Pro Thr Gly Gln Glu Leu Val Thr Gly Ser Tyr Asp Arg Ser Ile
290 295 300
Arg Leu Phe Arg Val Arg Glu Gly His Ser Arg Glu Ile Tyr His Thr
305 310 315 320
Lys Arg Met Gln Arg Val Phe Cys Val Lys Phe Ser Met Asp Ser Lys
325 330 335
Tyr Thr Val Ser Gly Ser Asp Asp Gly Asn Val Arg Leu Trp Arg Ala
340 345 350
Asn Ala Ser Glu Arg Ala Gly Val Lys Ser Ala Lys Gln Arg Ala Lys
355 360 365
Leu Glu Tyr Asp Ala Ala Leu Lys Glu Arg Phe Lys His Met Pro Glu
370 375 380
Ile Arg Arg Ile Ala Arg His Arg His Val Pro Lys Pro Ile Lys Lys
385 390 395 400
Ala Gly Glu Ile Lys Lys Val Glu Leu Glu Ser Leu Arg Arg Lys Glu
405 410 415
Asp Asn Val Arg Arg His Ser Lys Lys Gly Ala Val Pro Phe Ala Lys
420 425 430
Glu Arg Glu Lys His Ile Val Gly Thr Ala Val Lys Asp Asp Asp Ser
435 440 445
His Lys
450
<210> 4
<211> 3417
<212> DNA
<213> Arthrospora adenantha (Blastobotrys adeninivorans)
<400> 4
tgaacaccct ccgaggccat ggtggatctg gttctttgat gtgctcaagc aggtgattgg 60
ggccgggtgt ctgcactttt tgaacctgct gcagtcaatt atcttttcta acagtggaga 120
gccggatcta gacaagaatc cttgtacgtg gtacttttta aatgtgctgt tagacaccac 180
cattggtgtg ccagtgctgt ggttttttct gtactttgtg cactcagcag cgtatcggtt 240
tggagttcgg cagattgtat cgggccagta tggtcatcct cctaaatgga tccccttctt 300
caaacaagca ttgctctatc tggtggctct ggttagcatg aagctgttgc tctatctgtt 360
tgtatggtgg atgccactga tcgatgattt gggtaacttt ttaatatctt ggtccaattt 420
cgatgctcga gtacaggtag catttgtggt tctggtattt cctcttatca tgaacaccct 480
ccagtactat ctagtggact ctatcattca atccccagag tatcacaatc ccaaacttgc 540
acaacttgct ccagatacag aaaataggcc atcagggcct acggagccaa ctagaccagc 600
agattctaac cgatctgacc ttcctacaca agctgacaga tcccatcatg ttaagggtaa 660
acagaaatcc actaaccagg ctgaccactc taaccaggct aacgagacta ctagactcct 720
cggttcccac taacttcatt aataatattt aactccatac tcactattat cttttgtgaa 780
gcttatggta ctcaatgaat attctattta ttattctttt cttgctattt gatactgtcc 840
acaacattac tagcttgaag gtcgagcccg atatcgagag atgcgccgca cgccgcacgc 900
tcaatatttt tcttatcgaa atctgctcaa aatttgtgct ccatttctgc tcaaataaat 960
tctctcaaaa ttctctcatc agcaaccttc cttaatggta agtttatttc ctcggagcta 1020
taaaaggcca atcggctgct cccttgtatg gttcttgtgt tctaactcag aaaatcaaaa 1080
caatctctcg gtccagcgat acgtacttgc cggctcgaaa tactgaagca caacgtctgc 1140
ctcggaactt ggaccctgca cttcacccgt ttgagcgagc tagagaatac accagagctc 1200
taaatgccac aaagctagaa cgtatgttcg cccagccatt cattggccag ctgggtaacg 1260
gacatatcga tggagtgtac tctattgcca aaaaccttca ctcgctgtcc cgactggcca 1320
cgggatctgg tgatggaatc gtaaagctgt gggatctcag cacccgagac gaaatcttct 1380
ctgtaaaggc ccacacaaat attgtccgag gactcacaat gacccccaac ggaaacctct 1440
tgtcctgtgc tactgacaag tccatcaagc tgtgggatat caccgacaag gcatccagcc 1500
aggagcccat gcaaacatac ctaggagcta gtggattcag tggcattgac catcaccgag 1560
acgagggcaa gtttgtcacc gccggagaca ctgtagagct gtgggataca aatcgatcaa 1620
agccaatctc aaacctgtca tggggcgccg acagcgtgca ctctgttcga ttcaaccaga 1680
ccgaaacctc cattgtcgca tcttcaggag ccgatcgctc tattgtcatt tacgatcttc 1740
gtacctcgtc acctgtacaa aagctggtgg ctaccatgtc caccaacgcc attgcctgga 1800
acccaatgga agcattcaac tttgctgccg ctagcgaaga tcacaatgtc tatctttacg 1860
atatgcgtaa gttaagccgg tccctcaacg tttacaagga ccacgttgcc gccgttatgg 1920
atgtggactt ttcccccacc ggacaagagc tggtcactgg ttcttacgac cgatccatcc 1980
gccttttccg ggtccgcgag ggccactcca gggagatcta ccacaccaag cgtatgcaaa 2040
gggtgttttg cgttaaattc tccatggatt caaagtacac agtgtccggg tccgatgacg 2100
gtaacgtgcg tctctggcga gccaatgctt ccgaacgcgc aggtgttaag tcagccaagc 2160
agcgtgcaaa gcttgagtat gatgcagcgc ttaaggaacg gttcaagcac atgcccgaga 2220
ttcgccggat tgctcgtcat cgtcacgtcc ccaagccgat caagaaggcc ggggaaatca 2280
agaaggtgga gctggaatct ctgcgcagaa aggaggacaa tgtgcgccgt cactcaaaga 2340
agggcgcggt cccgtttgca aaggagcgag agaagcatat tgttggtact gccgtcaagg 2400
acgacgactc tcacaaatga ttgccttgtt gaggtggagg atgatgatct acccgctact 2460
tgctctaccc gatcagccca atcaccaaaa aaagtgggta tctaccgctg caagtcctac 2520
tacatatgtg ccaagactga aattattttg tgtccctcga acgcttccag gatgagggga 2580
agcagttgtg atctgaatca aggttcactc ctggtggagt tcccatgtat tgcataattt 2640
gatattaact ttcggctaca aacttaacct tgtgtacgcc cctctgtacg ccctgaacca 2700
cggttggatt actgttcggc taatgtagta gggggtgacg atcttcgtct tccttttttt 2760
tttttttggt tggtcatcaa actgaaaacg catctacgtg catttttgct gctgtatgcg 2820
gcatctgatc gtttttgtcc tcggtgataa tcagataatc cctatttagg aaaccccgac 2880
tttgaccgcc cccaactaca actgatggac cagatggggt ataaattatg ggccttgttg 2940
ctgctcattt gagctaagtt ttttgaaaag tcaatcagct ttccagtttg aaaaagagaa 3000
caatgaagtt tggacttgtt gcaactgtag ccacaatcat ctcgtccgtg tcggcggtca 3060
ctactggcaa gctcggggac gctaaggaag tgcaggacaa ttgtccacga gccatcttcc 3120
gatcgttcct tcgttctgat gaggtggagg gcattgtgag cttccttcct accacgaacg 3180
gaactggcct cacggtggtg gccgagtttt caaagctgcc tgctgatgat gacaagatca 3240
tgtaccatat tcatgagaag cgcttggagg gtggttctac aaactgcacc tctactggag 3300
gacacttgga cccttaccag cgcggagacg agcctcctgc tgaggagggc catccagagc 3360
gtgctgaggt cggagacctg tctggtaagc acggaacctt gtctggagcc cctgtgg 3417
<210> 5
<211> 21
<212> DNA
<213> Artificial sequence
<400> 5
gccgcacgct caatattttt c 21
<210> 6
<211> 20
<212> DNA
<213> Artificial sequence
<400> 6
gctgatcggg tagagcaagt 20
<210> 7
<211> 1586
<212> DNA
<213> Arthrospora adenantha (Blastobotrys adeninivorans)
<400> 7
gccgcacgct caatattttt cttatcgaaa tctgctcaaa atttgtgctc catttctgct 60
caaataaatt ctctcaaaat tctctcatca gcaaccttcc ttaatggtaa gtttatttcc 120
tcggagctat aaaaggccaa tcggctgctc ccttgtatgg ttcttgtgtt ctaactcaga 180
aaatcaaaac aatctctcgg tccagcgata cgtacttgcc ggctcgaaat actgaagcac 240
aacgtctgcc tcggaacttg gaccctgcac ttcacccgtt tgagcgagct agagaataca 300
ccagagctct aaatgccaca aagctagaac gtatgttcgc ccagccattc attggccagc 360
tgggtaacgg acatatcgat ggagtgtact ctattgccaa aaaccttcac tcgctgtccc 420
gactggccac gggatctggt gatggaatcg taaagctgtg ggatctcagc acccgagacg 480
aaatcttctc tgtaaaggcc cacacaaata ttgtccgagg actcacaatg acccccaacg 540
gaaacctctt gtcctgtgct actgacaagt ccatcaagct gtgggatatc accgacaagg 600
catccagcca ggagcccatg caaacatacc taggagctag tggattcagt ggcattgacc 660
atcaccgaga cgagggcaag tttgtcaccg ccggagacac tgtagagctg tgggatacaa 720
atcgatcaaa gccaatctca aacctgtcat ggggcgccga cagcgtgcac tctgttcgat 780
tcaaccagac cgaaacctcc attgtcgcat cttcaggagc cgatcgctct attgtcattt 840
acgatcttcg tacctcgtca cctgtacaaa agctggtggc taccatgtcc accaacgcca 900
ttgcctggaa cccaatggaa gcattcaact ttgctgccgc tagcgaagat cacaatgtct 960
atctttacga tatgcgtaag ttaagccggt ccctcaacgt ttacaaggac cacgttgccg 1020
ccgttatgga tgtggacttt tcccccaccg gacaagagct ggtcactggt tcttacgacc 1080
gatccatccg ccttttccgg gtccgcgagg gccactccag ggagatctac cacaccaagc 1140
gtatgcaaag ggtgttttgc gttaaattct ccatggattc aaagtacaca gtgtccgggt 1200
ccgatgacgg taacgtgcgt ctctggcgag ccaatgcttc cgaacgcgca ggtgttaagt 1260
cagccaagca gcgtgcaaag cttgagtatg atgcagcgct taaggaacgg ttcaagcaca 1320
tgcccgagat tcgccggatt gctcgtcatc gtcacgtccc caagccgatc aagaaggccg 1380
gggaaatcaa gaaggtggag ctggaatctc tgcgcagaaa ggaggacaat gtgcgccgtc 1440
actcaaagaa gggcgcggtc ccgtttgcaa aggagcgaga gaagcatatt gttggtactg 1500
ccgtcaagga cgacgactct cacaaatgat tgccttgttg aggtggagga tgatgatcta 1560
cccgctactt gctctacccg atcagc 1586

Claims (10)

1. A DNA barcode for identifying Blastobotrys adeninivorans strain TMCC70007, which is derived from the genome of that strain and whose nucleotide sequence is selected from within the sequence shown as SEQ ID No.4 and comprises the sequence shown as SEQ ID No.2.
2. The DNA barcode according to claim 1, wherein the nucleotide sequence is shown as SEQ ID No.1, SEQ ID No.2, SEQ ID No.4 or SEQ ID No. 7.
3. A primer pair for amplifying the DNA barcode of claim 1.
4. The primer pair according to claim 3, wherein the forward primer is identical to a sequence in the genome of Blastobotrys adeninivorans strain TMCC70007 that lies in the region from position 1 of the nucleotide sequence shown as SEQ ID No.4 to position 20 of the nucleotide sequence shown as SEQ ID No.2, and the forward primer is 20-30 bp long; and the reverse primer is the reverse complement of a sequence in the genome of strain TMCC70007 that lies in the region from position 1403 of the nucleotide sequence shown as SEQ ID No.2 to the last position of the nucleotide sequence shown as SEQ ID No.4, and the reverse primer is 20-30 bp long.
5. The primer pair according to claim 4, wherein the nucleotide sequences of the forward primer and the reverse primer are as follows:
forward primer: 5'-GCCGCACGCTCAATATTTTTC-3';
reverse primer: 5'-GCTGATCGGGTAGAGCAAGT-3'.
6. A kit for identifying Blastobotrys adeninivorans strain TMCC70007, comprising the primer pair according to claim 3.
7. A method for identifying Blastobotrys adeninivorans strain TMCC70007, comprising the steps of:
a) providing the genome DNA of a strain to be tested;
b) performing PCR amplification by using the genomic DNA of the step a) as a template and using the primer pair of claim 3 to obtain a PCR product;
c) detecting the PCR product by electrophoresis; if no target band is present, judging that the strain to be tested is not Blastobotrys adeninivorans strain TMCC70007, and if the target band is present, performing step d);
d) sequencing the obtained PCR product to obtain the nucleotide sequence to be tested; comparing the homology of that sequence with the nucleotide sequence of the DNA barcode of claim 1, and if the homology is more than 99%, judging that the strain to be tested is Blastobotrys adeninivorans strain TMCC70007.
8. The use of the DNA barcode according to claim 1 for identifying Blastobotrys adeninivorans strain TMCC70007.
9. The use of the primer pair according to claim 3 for identifying Blastobotrys adeninivorans strain TMCC70007.
10. The use of the kit according to claim 6 for identifying Blastobotrys adeninivorans strain TMCC70007.
CN201710656798.9A 2017-08-03 2017-08-03 DNA bar code, primer, kit, method and application Active CN109385484B (en)

Publications (2)

Publication Number Publication Date
CN109385484A CN109385484A (en) 2019-02-26
CN109385484B true CN109385484B (en) 2021-01-29

Family

ID=65412959

Country Status (1)

Country Link
CN (1) CN109385484B (en)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Erik Böer, et al. Characterization and expression analysis of a gene cluster for nitrate assimilation from the yeast Arxula adeninivorans. Yeast, 2009-02-03, vol. 26, pp. 83-93 *
Kunze et al. The complete genome of Blastobotrys (Arxula) adeninivorans LS3 - a yeast of biotechnological interest. Biotechnology for Biofuels, 2014-04-24, vol. 7, no. 66, pp. 1-15 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200507

Address after: 666200 Yunnan Province, Xishuangbanna Dai Autonomous Prefecture of Menghai Menghai County town of Tea Road No. 9

Applicant after: MENGHAI TEA INDUSTRY Co.,Ltd.

Applicant after: Yunnan Dayi Microbial Technology Co., Ltd

Address before: 666200 Yunnan Province, Xishuangbanna Dai Autonomous Prefecture of Menghai Menghai County town of Tea Road No. 9

Applicant before: MENGHAI TEA INDUSTRY Co.,Ltd.

GR01 Patent grant