CN116426671A - Traditional Chinese medicine identification method based on time-base method and application - Google Patents

Publication number
CN116426671A
Authority
CN
China
Prior art keywords
sequence
seq
species
candidate
traditional chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310217601.7A
Other languages
Chinese (zh)
Other versions
CN116426671B (en
Inventor
宋经元
郝利军
许文杰
齐桂红
甘雨桐
辛天怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Medicinal Plant Development of CAMS and PUMC
Original Assignee
Institute of Medicinal Plant Development of CAMS and PUMC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Medicinal Plant Development of CAMS and PUMC
Priority to CN202310217601.7A
Priority claimed from CN202310217601.7A
Publication of CN116426671A
Application granted
Publication of CN116426671B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/686: Polymerase chain reaction [PCR]
    • C12Q1/6869: Methods for sequencing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Abstract

The application discloses a traditional Chinese medicine identification method based on a time-base method, and its application. Standard specific target sequences are obtained from the nuclear, chloroplast and mitochondrial genomes of a traditional Chinese medicine base species, so that species identification of a sample to be detected can be performed based on those sequences. The time-base method can screen for standard specific target sequences that determine whether any traditional Chinese medicine sample to be detected has identity with a specified traditional Chinese medicine base species, and it eliminates missed-detection risks such as off-target effects, so that this identity can be judged accurately.

Description

Traditional Chinese medicine identification method based on time-base method and application
Technical Field
The application relates to the technical field of traditional Chinese medicine identification, in particular to a traditional Chinese medicine identification method based on the time-base method (Analysis of whole-GEnome, AGE; also called the whole-genome analysis method or Argo method) and application thereof.
Background
Traditional Chinese medicine is a great treasury inherited over thousands of years, and it embodies valuable experience and wisdom accumulated in the struggle against disease. Traditional identification methods (such as visual inspection, touch and taste) cannot meet the fast, time-sensitive identification requirements of the developing traditional Chinese medicine field, and physicochemical identification methods based on the detection of index components cannot solve identification problems such as deliberate adulteration. The DNA molecular identification method identifies specific DNA sequences inherent to the species concerned; it is objective, sensitive, highly specific, simple and easy to standardize, and it plays an increasingly important role in the field of traditional Chinese medicine identification.
The nuclear, chloroplast and mitochondrial genomes, as carriers of all the genetic information of a species, can provide a large number of species-specific DNA sequences and are an ideal database for identifying traditional Chinese medicines. However, owing to technical limitations, no molecular identification method for traditional Chinese medicines based on the nuclear, chloroplast and mitochondrial genomes exists at present.
Various methods for detecting species-specific DNA sequences have been described, including CRISPR/Cas systems, TaqMan probe-based real-time PCR systems, Sanger sequencing systems, and the like. CRISPR/Cas systems depend on the trans-cleavage activity of Cas proteins (such as Cas12a and Cas13a): after the crRNA specifically recognizes the target sequence, the trans-cleavage activity of the Cas protein is activated, so that single-stranded DNA fluorescent signal molecules are cleaved non-specifically and detectable fluorescence is produced. The reaction is carried out at 37 ℃, the operation is simple, and only a constant-temperature device and a fluorescence detection instrument are needed. The TaqMan probe-based real-time PCR system identifies species by adding a fluorescent probe that specifically recognizes the target sequence to the PCR reaction and detecting the fluorescent signal. This method requires a real-time fluorescence quantitative PCR instrument, and its results are accurate. The PCR system is the simplest: primers that specifically recognize the target sequence are designed and the species is identified by electrophoresis after PCR, but its specificity is weaker. The Sanger sequencing system gives the most accurate results, but it requires a sequencing instrument, so detection is relatively cumbersome and slow. Besides these detection methods, other sequence detection methods are also applicable, each suited to different application scenarios. However, the species-specific target sequences screened so far often suffer from low specificity, high off-target risk and similar defects, and how to construct a library of suitable species-specific target sequences has become a key factor restricting the wide application of existing DNA sequence recognition methods in the molecular identification of traditional Chinese medicines.
Disclosure of Invention
The species identification method of the invention (hereinafter referred to as the time-base method) combines nuclear, chloroplast and mitochondrial genome analysis with different target sequence detection methods, and realizes traditional Chinese medicine identification at the level of the nuclear, chloroplast and mitochondrial genomes. Compared with the prior art, the time-base method fully exploits the potential of these genomes for species identification: candidate target sequences for identification are screened from the nuclear, chloroplast and mitochondrial genomes, a detection method is selected according to the application scenario, and the corresponding species standard specific target sequences are screened. Given the large amount of information contained in the nuclear, chloroplast and mitochondrial genomes and the diversity of target sequence choices offered by different detection methods, the time-base method can, in theory, determine whether a specific target sequence is present in the genome of a sample to be detected by various relatively simple and low-cost methods. The identity of any traditional Chinese medicine sample to be detected with a specified traditional Chinese medicine base species can therefore be determined without sequencing the whole genome of the sample, and missed-detection risks such as off-target effects are eliminated, so that this identity can be judged accurately.
In one aspect, the invention provides a method for identifying traditional Chinese medicines based on a time-base method, which comprises the following steps:
(1) Obtaining nuclear, chloroplast and mitochondrial genome sequences of Chinese medicinal primordial species, and constructing a small fragment genome library;
(2) Extracting candidate target sequences from the small fragment genome library and analyzing, and screening candidate sequences meeting at least one selected from screening conditions (a) to (d) from the candidate target sequences as standard specific target sequences, wherein the screening conditions comprise:
(a) The GC content of the candidate sequence is 40% -60%;
the candidate sequence contains no run of four or more repeated nucleotides, no consecutive trinucleotide repeats, and no three or more discrete trinucleotide repeats;
the candidate sequence is not complementary to the crRNA repeat sequence;
the content of G+C in 6 nucleotides at the 5' end of the candidate sequence is 30% -80%;
the candidate sequence, when aligned with the nuclear, chloroplast and mitochondrial genome sequences of adulterants and closely related species, contains at least 3 differing nucleotides; and/or
within the region from -50 bp to +300 bp or from -300 bp to +50 bp relative to the candidate sequence, there are no more than 4 consecutive A or GT repeats;
(b) The content of G+C in the primer of the region where the candidate sequence is amplified is 30% -80%;
The annealing temperatures of the fluorescent probe designed according to the candidate sequence and the primer of the region where the upstream and downstream amplification candidate sequences are located are 55 ℃ to 60 ℃ and 68 ℃ to 70 ℃ respectively, and the annealing temperatures among the primer pairs differ by not more than 2 ℃;
the length of the primer of the region where the candidate sequence is amplified is 15-30bp;
primers of the region where the upstream and downstream amplification candidate sequences are located are spaced by 50-150bp, and a forward primer is close to a fluorescent probe designed according to the candidate sequences;
the fluorescent probe designed according to the candidate sequence and the primer of the region where the upstream and downstream amplification candidate sequences are located do not contain hairpin structures and the primer pair cannot form a dimer;
the G/C base in 5bp of the 3' end of the primer of the region where the upstream and downstream amplification candidate sequences are located is not more than two;
more C bases than G bases in the fluorescent probe designed according to the candidate sequence; and/or
the candidate sequence, when aligned with the nuclear, chloroplast and mitochondrial genomes of closely related species, contains at least 2 differing nucleotides;
(c) The GC content of the amplification primer designed according to the candidate sequence is 40% -60%, the four nucleotides are evenly distributed, and the primer contains no polypurine or polypyrimidine tract and no GC-rich region;
the length of the amplification primer designed according to the candidate sequence is 18-30bp, and the difference between the primer pairs is not more than 3 bases;
The amplification primer designed according to the candidate sequence does not contain an inverted repeat sequence and a self-complementary sequence larger than 3bp, and a dimer cannot be formed between the primers;
the annealing temperature of the amplification primers designed according to the candidate sequence is 55-60 ℃, and the annealing temperature difference between the primer pairs is not more than 2 ℃;
more than one but not more than three G/C bases in 5 bases at the 3' -end of the amplification primer designed according to the candidate sequence; and/or
the candidate sequence, when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species, contains at least 2 differing nucleotides;
or (d) the region in which the candidate sequence is located is homozygous;
the GC content of the region where the candidate sequence is located is 30% -80%;
the region of the candidate sequence cannot contain more than four consecutive repeated nucleotides;
the region in which the candidate sequence is located cannot contain a methylated sequence;
the area where the candidate sequence is located cannot have a hairpin structure; and/or
the candidate sequence, when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species, contains at least 2 differing nucleotides;
(3) Extracting genomic DNA of a traditional Chinese medicine sample to be detected, using the genomic DNA or, optionally, an amplification product thereof as a DNA substrate, and detecting whether the standard specific target sequence exists in the DNA substrate by using a target sequence detection system, wherein the target sequence detection system comprises a CRISPR/Cas12a system, a TaqMan probe-based real-time PCR system, a PCR system or a Sanger sequencing system. For the CRISPR/Cas12a system, if the detection generates a significant fluorescent signal, the traditional Chinese medicine sample to be detected has identity with the specified traditional Chinese medicine base species; otherwise it does not. For the TaqMan probe-based real-time PCR system, if the CT value obtained by the detection is larger than 37, the sample to be detected has identity with the specified traditional Chinese medicine base species; otherwise it does not. For the PCR system, if the electrophoresis result shows a distinct band, the sample to be detected has identity with the specified traditional Chinese medicine base species; otherwise it does not. For the Sanger sequencing system, if the sequencing result is the same as the standard specific target sequence, the sample to be detected has identity with the specified traditional Chinese medicine base species; otherwise it does not.
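The sequence-level checks listed under screening condition (a) in step (2) lend themselves to a small filter. The following is a minimal Python sketch, not the inventors' implementation: the function names are hypothetical, the test sequences are invented, and combining the three checks with a logical AND is an assumption made for illustration (the text joins the conditions with "and/or"). Only the overall GC content (40%-60%), the ban on runs of four or more repeated nucleotides, and the G+C content of the six 5'-terminal nucleotides (30%-80%) are covered.

```python
import re


def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in seq (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_run(seq: str, min_len: int = 4) -> bool:
    """True if any single nucleotide occurs min_len or more times in a row."""
    pattern = r"(.)\1{%d,}" % (min_len - 1)
    return re.search(pattern, seq.upper()) is not None


def passes_basic_checks(seq: str) -> bool:
    """Apply three of the condition (a) checks; AND-ing them is an assumption."""
    return (
        0.40 <= gc_fraction(seq) <= 0.60          # overall GC content
        and not has_run(seq, 4)                   # no run of four or more
        and 0.30 <= gc_fraction(seq[:6]) <= 0.80  # G+C of the 5'-terminal 6 nt
    )


# Invented sequences, used only to exercise the filter.
print(passes_basic_checks("ACGTTGCAAGCTTGCATCGA"))
print(passes_basic_checks("AAAATTTTGGGGCCCC"))
```

The remaining checks in condition (a), such as complementarity to the crRNA repeat or flanking-region repeats, need additional inputs and are left out of this sketch.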
In some embodiments, the nuclear, chloroplast, and mitochondrial genomic sequences of the traditional Chinese medicine primordial species are obtained by constructing a genomic map or shallow sequencing.
In some embodiments, the nuclear, chloroplast and mitochondrial genome sequences of the Chinese herb primordial species are divided into L-K+1 fragments of length K to construct the small fragment genomic library, and the copy number of each fragment is calculated, and the genomic position of each fragment is determined by alignment with the genome, wherein L represents the genomic sequence length and K represents the library fragment length. In some embodiments, K is 15-750bp, e.g., 18-30bp, 20-750bp, 25-28bp.
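The fragment library described above can be illustrated with a sliding window. This is a minimal sketch under stated assumptions: the function name and the toy sequence are hypothetical, only the forward strand is considered, and positions come from exact matching rather than from the alignment step mentioned in the text.

```python
from collections import defaultdict


def build_fragment_library(genome: str, k: int) -> dict[str, list[int]]:
    """Map every length-k fragment to the 0-based positions where it starts."""
    genome = genome.upper()
    if not 0 < k <= len(genome):
        raise ValueError("k must be between 1 and the genome length")
    library: dict[str, list[int]] = defaultdict(list)
    # A sequence of length L yields exactly L - K + 1 overlapping fragments.
    for start in range(len(genome) - k + 1):
        library[genome[start:start + k]].append(start)
    return dict(library)


toy_genome = "TTTCTTGTGGCGGAGTTTCTTG"  # hypothetical toy sequence, not real data
library = build_fragment_library(toy_genome, k=7)

# Copy number of a fragment is the number of positions at which it occurs.
copy_number = {fragment: len(positions) for fragment, positions in library.items()}
total_fragments = sum(copy_number.values())
print(total_fragments, copy_number["TTTCTTG"], library["TTTCTTG"])
```

For real genomes a dictionary of all fragments grows quickly with L, so a dedicated k-mer counter or an index would be a more practical choice; the sketch only shows the L-K+1 bookkeeping.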
In some embodiments, the traditional Chinese medicine base species specified for the above method is, but is not limited to, Rheum palmatum, Rheum tanguticum or Rheum officinale. In this context, a traditional Chinese medicine base species refers to a species to which a sample to be detected may belong. For example, for a sample suspected to be derived from rhubarb, Rheum palmatum, Rheum tanguticum, Rheum officinale, etc. may be selected as base species; further, if it is desired to determine whether the sample to be detected has identity with Rheum officinale, then Rheum officinale is the specified traditional Chinese medicine base species.
In some embodiments, in step (2), the candidate target sequences are extracted according to the detection system that is employed in step (3) to detect the standard specific target sequence. Those skilled in the art are familiar with the requirements that various sequence detection systems place on the sequence to be detected. For example, for CRISPR/Cas12a systems, a motif with the PAM TTTV at the 5' end or VAAA at the 3' end can be selected (preferably, each fragment in the small fragment genomic library is checked for the PAM motif, and the fragments carrying a PAM are extracted to construct a candidate target sequence library); for TaqMan probe-based real-time PCR systems, motifs that do not contain more than 3 consecutive repeated bases or 3 consecutive G bases can be selected; for PCR systems, motifs that do not contain a GC-rich region can be selected; and for Sanger sequencing systems, motifs suitable for sequencing can be selected, although the scope of the invention is not limited in this respect.
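The PAM-based extraction for the CRISPR/Cas12a system can be sketched with two regular expressions. This is an illustrative sketch only: the function names and the non-patent test fragments are hypothetical, V is read as the IUPAC code for A, C or G, and the two patterns follow the motifs exactly as they are written in the text (TTTV at the 5' end, VAAA at the 3' end).

```python
import re

# IUPAC code V stands for A, C or G (any base except T).
PAM_5PRIME = re.compile(r"^TTT[ACG]")
PAM_3PRIME = re.compile(r"[ACG]AAA$")


def has_pam(fragment: str) -> bool:
    """True if the fragment starts with TTTV or ends with VAAA."""
    fragment = fragment.upper()
    return bool(PAM_5PRIME.search(fragment) or PAM_3PRIME.search(fragment))


def extract_candidates(fragments: list[str]) -> list[str]:
    """Keep only the library fragments that carry a PAM at either end."""
    return [fragment for fragment in fragments if has_pam(fragment)]


fragments = [
    "TTTCTTGTGGCGGAGTGCCTGACTT",  # SEQ ID NO: 4 as printed in the text
    "TTTATATTGATTGTTTTATATTGAT",  # SEQ ID NO: 2 as printed in the text
    "GGGCCCGAAA",                 # hypothetical fragment ending in VAAA
    "ACGTACGTACGT",               # hypothetical fragment without a PAM
]
candidates = extract_candidates(fragments)
print(candidates)
```

Applied to every fragment of a small fragment genomic library, the surviving fragments would form the candidate target sequence library that the text describes.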
In some embodiments, the specificity of the standard specific target sequence can be further increased by increasing the number of nucleotides by which the candidate sequence differs from the nuclear, chloroplast and mitochondrial genome sequences of adulterants and closely related species; alternatively, a number of standard specific target sequences within a preset range can be obtained by adjusting that number of differing nucleotides.
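The effect of tightening or loosening the difference threshold can be shown on toy data. The sketch below is an assumption-laden illustration rather than the method used in the application: it counts substitutions only (no insertions or deletions), scans only the forward strand of each reference, and uses invented sequences and hypothetical function names.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_differences(candidate: str, reference: str) -> int:
    """Fewest differing nucleotides between candidate and any window of reference."""
    k = len(candidate)
    return min(
        hamming(candidate, reference[i:i + k])
        for i in range(len(reference) - k + 1)
    )


def select_targets(candidates: list[str], references: list[str], min_diff: int) -> list[str]:
    """Keep candidates that differ from every reference by at least min_diff."""
    return [
        c for c in candidates
        if all(min_differences(c, r) >= min_diff for r in references)
    ]


# Invented toy data standing in for candidate targets and a related genome.
candidates = ["ACGTACGT", "TTGACCGA"]
references = ["GGACGTACCTGG"]

for threshold in (1, 2, 6):
    print(threshold, select_targets(candidates, references, threshold))
```

Raising `min_diff` shrinks the selected set and lowering it enlarges the set, which is the lever the paragraph describes for reaching a preset number of standard specific target sequences.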
In a preferred embodiment, the specified traditional Chinese medicine base species is Rheum officinale, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequences shown in SEQ ID NOs: 1 and 4. In a preferred embodiment, the specified traditional Chinese medicine base species is Rheum palmatum, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 2. In a preferred embodiment, the specified traditional Chinese medicine base species is Rheum tanguticum, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 3.
In a preferred embodiment, the specified traditional Chinese medicine base species is Rheum officinale, the target sequence detection system is a TaqMan probe-based real-time PCR system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 17.
In a preferred embodiment, the specified traditional Chinese medicine base species is Rheum officinale, the target sequence detection system is a PCR system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 19.
In a preferred embodiment, the specified traditional Chinese medicine base species is Rheum officinale, the target sequence detection system is a Sanger sequencing system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 20.
In step (3), the requirements of the various sequence detection systems for detection sequences are well known to those skilled in the art, and, given the standard specific target sequences obtained according to the present invention, the corresponding detection sequences can be routinely designed and synthesized using known molecular biology websites or tools. In some embodiments, for example, for the CRISPR/Cas12a system, CRISPR RNA (crRNA) can be designed and synthesized from the standard specific target sequences; preferably, a crRNA sequence library can be constructed that matches the library of standard specific target sequences of a given traditional Chinese medicine base species relative to its adulterants and closely related species. For TaqMan probe-based real-time PCR systems, primers and fluorescent probes can be designed and synthesized according to the standard specific target sequences and the regions in which they are located; for PCR systems, amplification primers can be designed and synthesized based on the standard specific target sequences; and for Sanger sequencing systems, sequencing primers can be synthesized based on a selected standard specific target sequence. It will be appreciated by those skilled in the art that the target sequence detection system is not limited to those exemplified above; any detection system known in the art may be used, with the corresponding detection sequence designed accordingly.
In a preferred embodiment, a CRISPR/Cas12a system is employed with a crRNA shown in SEQ ID NO: 5, 6, 7 or 8 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
In a preferred embodiment, TaqMan probe-based real-time PCR is used with the primers shown in SEQ ID NOs: 18 and 16 and a TaqMan probe shown as FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ to detect the presence or absence of the standard specific target sequence in the DNA substrate.
In a preferred embodiment, a PCR system is used with the primers shown in SEQ ID NOs: 19 and 16 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
In a preferred embodiment, a Sanger sequencing system is used with the primers shown in SEQ ID NOs: 19 and 21 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
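Two of the probe-related rules stated earlier (more C than G bases in the fluorescent probe; no more than 3 consecutive repeated bases and no 3 consecutive G bases) can be checked mechanically against the TaqMan probe sequence quoted above. This is a small illustrative sketch with hypothetical helper names; it only inspects the string as printed and says nothing about melting temperature or secondary structure.

```python
import re

# Probe sequence as printed in the text (also listed as SEQ ID NO: 17).
PROBE = "GCTTGAATGAAAGTCAGGCACTCCGCCA"


def more_c_than_g(seq: str) -> bool:
    """True if the sequence contains more C bases than G bases."""
    seq = seq.upper()
    return seq.count("C") > seq.count("G")


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated nucleotide."""
    return max(len(match.group(0)) for match in re.finditer(r"(.)\1*", seq.upper()))


probe_ok = (
    more_c_than_g(PROBE)        # more C than G
    and longest_run(PROBE) <= 3  # no more than 3 consecutive repeated bases
    and "GGG" not in PROBE       # no 3 consecutive G bases
)
print(PROBE.count("C"), PROBE.count("G"), longest_run(PROBE), probe_ok)
```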
In some embodiments, in step (3), for example, the genomic DNA of the traditional Chinese medicine to be detected may be amplified using a primer pair that specifically amplifies a standard specific target sequence, and the amplified standard specific target sequence recovered as the DNA substrate; or the genomic DNA to be detected may be amplified using primers that specifically amplify a DNA sequence comprising a standard specific target sequence, and the amplified DNA sequence comprising the standard specific target sequence recovered as the DNA substrate.
For example, when the CRISPR/Cas12a system is selected as the target sequence detection system, the primer pairs shown in SEQ ID NOs: 9-10, SEQ ID NOs: 11-12, SEQ ID NOs: 13-14 or SEQ ID NOs: 15-16 can be used to amplify the genomic DNA, and the amplified standard specific target sequences recovered as DNA substrates.
In some preferred embodiments, the CRISPR/Cas12a system comprises: gene editing buffer, Cas12a, crRNA, nuclease-free water, DNA substrate, and fluorescent signal molecules (e.g., ssDNA reporters).
As an example, for the CRISPR/Cas12a system, NEBuffer 2.1 and Lba Cas12a (Cpf1) can be selected, with Poly_C_FQ (5'-FAM-CCCCCCCCCC-BHQ-3') as the fluorescent signal molecule, and the reaction performed as follows:
(1) The following reaction system (final concentration in brackets) was prepared
Figure SMS_1
(2) Incubate for 10 minutes at room temperature.
(3) Using the standard specific target sequence recovered after amplification as the DNA substrate, 10 μL of the recovered standard specific target sequence (1 ng/μL) and 4 μL of Poly_C_FQ (400 nM) were added and incubated at 37 ℃, and fluorescence values were measured with a microplate reader at λex 483 nm/λem 535 nm (determined by the selected fluorescent signal molecule) at 0, 3, 6, 9, 12, 15, 25, 35 and 45 minutes. If the detection result differs significantly from the blank control (P < 0.01), the sample to be detected can be judged to have identity with the specified species; otherwise it does not.
In some preferred embodiments, the Taqman probe-based real-time PCR system comprises: buffer solution, primer, nuclease-free water, DNA substrate and Taqman probe.
As an example, in the case of a Taqman Probe-based real-time PCR system, a Probe qPCR Mix may be selected, a Taqman Probe fluorophore may be FAM, and a quencher may be BHQ. The reaction was performed as follows:
(1) The following reaction system (final concentration in brackets) was prepared
Figure SMS_2
(2) The reaction conditions were 95 ℃ for 30 sec, followed by 40 cycles of 95 ℃ for 5 sec and 60 ℃ for 30 sec.
(3) If the CT value of the detection result is larger than 37, the sample to be detected can be judged to have the same property with the appointed species, otherwise, the sample to be detected does not have the same property.
In some preferred embodiments, the PCR system comprises: buffer, forward/reverse primer, nuclease-free water and DNA substrate.
As an example, for the PCR system Taq PCR MasterMix can be chosen, the reaction is performed as follows:
(1) The following reaction system (final concentration in brackets) was prepared
Figure SMS_3
Figure SMS_4
(2) The reaction conditions were 95 ℃ for 30 sec; 30 cycles of 95 ℃ for 5 sec, Tm for 30 sec and 72 ℃ for 30 sec; then 72 ℃ for 10 min.
(3) The PCR product was electrophoresed on a 2% agarose gel at 120V for 30min. If the electrophoresis result has a specific band, the species to be detected can be judged to have the identity with the designated species, otherwise, the species to be detected does not have the identity.
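The PCR thermal profile in step (2) can be written down as data, which also makes the total hold time easy to compute. This is a minimal sketch: the `Hold` class and `program_seconds` function are hypothetical names, the annealing temperature of 58 ℃ is an arbitrary stand-in for the primer-specific Tm, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hold:
    """One temperature hold: target temperature (deg C) and duration (s)."""
    temp_c: float
    seconds: int


def program_seconds(initial, cycle, n_cycles, final) -> int:
    """Sum of hold times only; ramping between temperatures is ignored."""
    return (
        sum(h.seconds for h in initial)
        + n_cycles * sum(h.seconds for h in cycle)
        + sum(h.seconds for h in final)
    )


ANNEAL_C = 58.0  # hypothetical value; the text specifies the primer Tm here

initial = [Hold(95.0, 30)]                                   # 95 C for 30 sec
cycle = [Hold(95.0, 5), Hold(ANNEAL_C, 30), Hold(72.0, 30)]  # repeated 30 times
final = [Hold(72.0, 600)]                                    # 72 C for 10 min

total = program_seconds(initial, cycle, 30, final)
print(f"hold time: {total} s ({total / 60:.0f} min)")
```

With the stated profile the holds add up to 30 + 30 × 65 + 600 = 2580 seconds, about 43 minutes before ramping and the 30-minute electrophoresis step.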
In some preferred embodiments, the Sanger sequencing system comprises: buffer, sequencing primer, nuclease-free water and DNA substrate.
As an example, for the Sanger sequencing system, Taq PCR MasterMix can be selected and the reaction performed as follows:
(1) The following reaction system (final concentration in brackets) was prepared
Figure SMS_5
(2) The reaction conditions were 95 ℃ for 30 sec; 30 cycles of 95 ℃ for 5 sec, Tm for 30 sec and 72 ℃ for 30 sec; then 72 ℃ for 10 min.
(3) The PCR products were Sanger sequenced using sequencing primers.
(4) The sequencing results were analyzed: if the sequencing result contains the standard specific target sequence, the species to be detected can be judged to have identity with the specified species; otherwise it does not.
In one aspect, the invention provides a standard specific target nucleotide of rhubarb, which is at least one selected from the group consisting of: (1) the sequence set forth in SEQ ID NO: 1 (AATATGGTTATGTTATATTAATAAA); (2) the sequence set forth in SEQ ID NO: 2 (TTTATATTGATTGTTTTATATTGAT); (3) the sequence set forth in SEQ ID NO: 3 (TTTCGCCAGTATCATTATTATTTAATTT); (4) the sequence set forth in SEQ ID NO: 4 (TTTCTTGTGGCGGAGTGCCTGACTT); (5) the sequence set forth in SEQ ID NO: 17 (GCTTGAATGAAAGTCAGGCACTCCGCCA); (6) the sequence set forth in SEQ ID NO: 19 (AAGCTGGCTGTCATTCAGCT); or (7) the sequence set forth in SEQ ID NO: 20 (ATAACCTGCATTCTATGGTTTGGTT).
In another aspect, the invention provides the use of a standard specific target nucleotide as described above for species identification of rhubarb or a material derived from rhubarb, or for distinguishing rhubarb from its closely related species.
The present invention discovers for the first time that the species-specific standard specific target nucleotide sequences shown above exist in rhubarb. The target nucleotide sequences shown in SEQ ID NOs: 1, 4, 17, 19 and 20 are found in Rheum officinale; the target nucleotide sequence shown in SEQ ID NO: 2 is found in Rheum palmatum; and the target nucleotide sequence shown in SEQ ID NO: 3 is found in Rheum tanguticum. In some embodiments, the rhubarb may be selected from Rheum palmatum, Rheum tanguticum or Rheum officinale.
In one aspect, the invention provides a primer pair for species identification of rhubarb or a material derived from rhubarb, comprising:
(1) Ro_cp_1F: CCAAATTGCCCGAAGCCTATG (SEQ ID NO: 9), and Ro_cp_1R: ATCGCTTTCCGACCCACAAT (SEQ ID NO: 10);
(2) Rp_cp_1F: GTTTAGGCGGTACGTACATAGA (SEQ ID NO: 11), and Rp_cp_1R: GATCTCAGTAAGAAGGGTTTACGA (SEQ ID NO: 12);
(3) Rt_cp_1F: CGCTTTCGCCAGTATCATTAT (SEQ ID NO: 13), and Rt_cp_1R: CCATTCCACAAAGGGATCC (SEQ ID NO: 14);
(4) Ro_wg_1F: ATGGCGAGAGAGGTGTTCCTAAA (SEQ ID NO: 15), and Ro_wg_1R: GTTGTGAATCCGACACGACCAATAT (SEQ ID NO: 16);
(5) Ro_wg_2F: GCTGTCATTCAGCTGTTCTCTGT (SEQ ID NO: 18), and Ro_wg_2R: GTTGTGAATCCGACACGACCAATAT (SEQ ID NO: 16);
(6) Ro_wg_3F: AAGCTGGCTGTCATTCAGCT (SEQ ID NO: 19), and Ro_wg_3R: GTTGTGAATCCGACACGACCAATAT (SEQ ID NO: 16); or
(7) Ro_wg_4F: AAGCTGGCTGTCATTCAGCT (SEQ ID NO: 19), and Ro_wg_4R: ATATTGGTCGTGTCGGATTCACAAC (SEQ ID NO: 21).
The primer pairs of the invention can be used to amplify the species standard specific target sequences of rhubarb, thereby realizing species identification of rhubarb. For example, the primer pairs shown in SEQ ID NOs: 9-10, 15-16, 18 and 16, 19 and 16, and 19 and 21 may preferably be used for species identification of materials derived from Rheum officinale (medicinal rhubarb); the primer pair shown in SEQ ID NOs: 11-12 may preferably be used for species identification of materials derived from Rheum palmatum; and the primer pair shown in SEQ ID NOs: 13-14 may preferably be used for species identification of materials derived from Rheum tanguticum. In some embodiments, the rhubarb may be selected from Rheum palmatum, Rheum tanguticum or Rheum officinale.
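When working with the primer list above, a reverse-complement helper is useful for comparing sequences written on opposite strands. The sketch below uses a hypothetical function name and compares only the strings as printed in the text; it shows that the sequence given as SEQ ID NO: 21 is the reverse complement of the sequence given as SEQ ID NO: 16, without drawing any conclusion about how the two primers are used.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


SEQ_ID_16 = "GTTGTGAATCCGACACGACCAATAT"  # reverse primer of pairs (4) to (6), as printed
SEQ_ID_21 = "ATATTGGTCGTGTCGGATTCACAAC"  # reverse primer of pair (7), as printed

print(reverse_complement(SEQ_ID_16) == SEQ_ID_21)
```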
In another aspect, the invention relates to a kit for species identification of rheum officinale or a material derived from rheum officinale, wherein the kit comprises the primer pair described above.
In some embodiments, the kit further comprises PCR reaction reagents. In some preferred embodiments, the PCR reaction reagents comprise: PCR amplification buffer, dNTPs, taq DNA polymerase, mgCl 2 Sterile ultrapure water.
In some embodiments, the kit further comprises at least one selected from the group consisting of: reagents of TaqMan probe system, CRISPR/Cas12a system reagents.
In some preferred embodiments, the reagents of the Taqman probe system comprise: DNA polymerase, dNTPs, buffer solution and MgCl 2 Nuclease-free water and Taqman probes. In specific embodiments, the fluorescent moiety of the Taqman probe comprises FAM and the quenching moiety comprises BHQ. At the position ofIn a specific embodiment, the Taqman probe is shown as FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ.
In some preferred embodiments, the CRISPR/Cas12a system reagent comprises: gene editing buffer, cas protein, crRNA, nuclease-free water, and fluorescent signaling molecules (e.g., ssDNA fluorescent reporter genes). In specific embodiments, the Cas12a is an Lba Cas12a (Cpf 1); the fluorescent signal molecules include Poly_C_FQ (5 '-FAM-CCCCCCCCCC-BHQ-3'). In particular embodiments, the crRNA can be a nucleic acid sequence represented by SEQ ID NOs: 5. 6, 7 or 8.
In another aspect, the invention provides the use of the primer pair or kit described above in: identifying rheum officinale ingredients in a sample to be detected, and identifying species of rheum officinale or a material derived from rheum officinale, distinguishing traditional Chinese medicinal materials derived from rheum officinale from mixed and fake products thereof, or using the rheum officinale or the material derived from rheum officinale in safety detection of food, medicine or health care products.
In some embodiments, the sample to be tested may be rhubarb, a tissue or organ of rhubarb (e.g., a root or stem of rhubarb), a Chinese medicinal material containing rhubarb, or a rhubarb mix.
In the present invention, identity of a sample to be detected (e.g., a tissue or organ of rheum officinale, a mixed fake, a traditional Chinese medicine, a decoction piece, a Chinese patent medicine, a food, a health care product) to be detected is determined by obtaining genomic DNA of the sample to be detected and detecting whether the genomic DNA contains the standard specific target sequence of a specified species (e.g., rheum officinale).
Herein, the sample to be detected may be a sample from rheum officinale. In some exemplary embodiments, the rhubarb sample to be detected may be a sample from rheum officinale, rheum palmatum or rheum tanguticum. In some embodiments, the rhubarb sample to be detected is a sample of single species origin or a mixture of samples of multiple species origins.
Exemplary embodiments of the invention may be illustrated herein by the following numbered paragraphs:
1. A traditional Chinese medicine identification method based on the time-base method, comprising the following steps:
(1) Obtaining nuclear, chloroplast and mitochondrial genome sequences of the source species of the traditional Chinese medicine, and constructing a small-fragment genome library;
(2) Extracting candidate target sequences from the small-fragment genome library and analyzing them, and selecting, as standard specific target sequences, candidate sequences that satisfy at least one of screening conditions (a) to (d), wherein the screening conditions comprise:
(a) the GC content of the candidate sequence is 40%-60%;
the candidate sequence does not contain four or more repeats, consecutive trinucleotide repeats, or three or more dispersed trinucleotide repeats;
the candidate sequence is not complementary to the crRNA repeat sequence;
the G+C content of the 6 nucleotides at the 5' end of the candidate sequence is 30%-80%;
the candidate sequence contains at least 3 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genome sequences of adulterants, counterfeits and related species; and/or
the region from -50 bp to +300 bp or from -300 bp to +50 bp around the candidate sequence contains no run of more than 4 consecutive A or GT repeats;
(b) the G+C content of the primers that amplify the region where the candidate sequence is located is 30%-80%;
the annealing temperatures of the upstream and downstream primers that amplify the region where the candidate sequence is located and of the fluorescent probe designed from the candidate sequence are 55 °C to 60 °C and 68 °C to 70 °C, respectively, and the annealing temperatures of the two primers of a pair differ by no more than 2 °C;
the length of the primers that amplify the region where the candidate sequence is located is 15-30 bp;
the upstream and downstream primers that amplify the region where the candidate sequence is located are 50-150 bp apart, and the forward primer is located close to the fluorescent probe designed from the candidate sequence;
the fluorescent probe designed from the candidate sequence and the upstream and downstream primers contain no hairpin structures, and the primer pair cannot form a dimer;
there are no more than two G/C bases among the 5 bases at the 3' end of each of the upstream and downstream primers;
the fluorescent probe designed from the candidate sequence contains more C bases than G bases; and/or
the candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of related species;
(c) the GC content of the amplification primers designed from the candidate sequence is 40%-60%, the four nucleotides are evenly distributed, and the primers contain no polypyrimidine tracts and no GC-rich regions;
the length of the amplification primers designed from the candidate sequence is 18-30 bp, and the two primers of a pair differ in length by no more than 3 bases;
the amplification primers designed from the candidate sequence contain no inverted repeats and no self-complementary sequences longer than 3 bp, and the primers cannot form dimers with each other;
the annealing temperature of the amplification primers designed from the candidate sequence is 55-60 °C, and the annealing temperatures of the two primers of a pair differ by no more than 2 °C;
there is more than one but no more than three G/C bases among the 5 bases at the 3' end of each amplification primer designed from the candidate sequence; and/or
the candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants, counterfeits and related species;
or, (d) the region where the candidate sequence is located is homozygous;
the GC content of the region where the candidate sequence is located is 30%-80%;
the region where the candidate sequence is located does not contain more than four consecutive repeated nucleotides;
the region where the candidate sequence is located does not contain a methylated sequence;
the region where the candidate sequence is located does not form a hairpin structure; and/or
the candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants, counterfeits and related species;
(3) Extracting genomic DNA from the traditional Chinese medicine sample to be detected, using the genomic DNA or, optionally, an amplification product thereof as a DNA substrate, and detecting whether the standard specific target sequence is present in the DNA substrate using a target sequence detection system, wherein the target sequence detection system comprises a CRISPR/Cas12a system, a TaqMan probe-based real-time PCR system, a PCR system or a Sanger sequencing system, and wherein: for the CRISPR/Cas12a system, if the detection produces a significant fluorescent signal, the sample to be detected is identical to the specified source species, and otherwise it is not; for the TaqMan probe-based real-time PCR system, if the detection gives a Ct value of less than 37, the sample to be detected is identical to the specified source species, and otherwise it is not; for the PCR system, if electrophoresis of the detection product shows a distinct band, the sample to be detected is identical to the specified source species, and otherwise it is not; and for the Sanger sequencing system, if the sequencing result is identical to the standard specific target sequence, the sample to be detected is identical to the specified source species, and otherwise it is not.
2. The method of paragraph 1, wherein the nuclear, chloroplast and mitochondrial genome sequences of the source species of the traditional Chinese medicine are obtained by constructing a genome map or by shallow sequencing.
3. The method of paragraph 1 or 2, wherein the nuclear, chloroplast and mitochondrial genome sequences of the source species are each divided into L-K+1 fragments of length K to construct the small-fragment genome library, the copy number of each fragment is calculated, and the genomic position of each fragment is determined by alignment to the genome, wherein L is the length of the genome sequence and K is the length of the library fragments, K being 15-750 bp.
4. The method of any one of paragraphs 1-3, wherein the source species of the traditional Chinese medicine is Rheum palmatum, Rheum tanguticum or Rheum officinale.
5. The method of any one of paragraphs 1-4, wherein the source species is Rheum officinale, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequences shown in SEQ ID NOs: 1 and 4;
the source species is Rheum officinale, the target sequence detection system is a TaqMan probe-based real-time PCR system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 17;
the source species is Rheum officinale, the target sequence detection system is a PCR system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 19; or
the source species is Rheum officinale, the target sequence detection system is a Sanger sequencing system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 20.
6. The method of any one of paragraphs 1-4, wherein the source species is Rheum palmatum, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 2.
7. The method of any one of paragraphs 1-4, wherein the source species is Rheum tanguticum, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 3.
8. The method of any one of paragraphs 1-7, wherein the CRISPR/Cas12a system uses a crRNA shown in SEQ ID NO: 5, 6, 7 or 8 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
9. The method of any one of paragraphs 1-5, wherein the TaqMan probe-based real-time PCR system uses the primer pair shown in SEQ ID NOs: 18 and 16 and the TaqMan probe FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ to detect the presence or absence of the standard specific target sequence in the DNA substrate.
10. The method of any one of paragraphs 1-5, wherein the PCR system uses the primer pair shown in SEQ ID NOs: 19 and 16 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
11. The method of any one of paragraphs 1-5, wherein the Sanger sequencing system uses the primer pair shown in SEQ ID NOs: 19 and 21 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
12. The method of any one of paragraphs 1-11, wherein, in step (3), the genomic DNA of the traditional Chinese medicine sample to be detected is amplified using a primer pair that specifically amplifies the standard specific target sequence, and the amplified standard specific target sequence is recovered as the DNA substrate; or the genomic DNA to be detected is amplified using primers that specifically amplify a DNA sequence containing the standard specific target sequence, and the amplified DNA sequence containing the standard specific target sequence is recovered as the DNA substrate.
13. The method of any of paragraphs 1-8, wherein the CRISPR/Cas12a system comprises: gene editing buffer, cas12a, crRNA, nuclease-free water, DNA substrate, and fluorescent signal molecule.
14. The method of any one of paragraphs 1-5 and 9, wherein the Taqman probe-based real-time PCR system comprises: buffer solution, primer, nuclease-free water, DNA substrate and Taqman probe.
15. The method of any one of paragraphs 1-5 and 10, wherein the PCR system comprises: buffer, forward primer, reverse primer, nuclease-free water and DNA substrate.
16. The method of any one of paragraphs 1-5 and 11, wherein the Sanger sequencing system comprises: buffer, sequencing primer, nuclease-free water and DNA substrate.
17. A standard specific target nucleotide of rhubarb, wherein the target nucleotide is at least one selected from the group consisting of: (1) the target nucleotide shown in SEQ ID NO: 1; (2) the target nucleotide shown in SEQ ID NO: 2; (3) the target nucleotide shown in SEQ ID NO: 3; (4) the target nucleotide shown in SEQ ID NO: 4; (5) the target nucleotide shown in SEQ ID NO: 17; (6) the target nucleotide shown in SEQ ID NO: 19; or (7) the target nucleotide shown in SEQ ID NO: 20.
18. The use of the standard specific target nucleotide of paragraph 17 for species identification of rhubarb or a material derived from rhubarb, or for distinguishing rhubarb from its closely related species.
19. The use of paragraph 18, wherein the rhubarb is selected from Rheum palmatum, Rheum tanguticum or Rheum officinale.
20. A primer pair for species identification of rhubarb or a material derived from rhubarb, comprising:
(1) SEQ ID NO:9 and SEQ ID NO:10;
(2) SEQ ID NO:11 and SEQ ID NO:12;
(3) SEQ ID NO:13 and SEQ ID NO:14;
(4) SEQ ID NO:15 and SEQ ID NO:16;
(5) SEQ ID NO:18 and SEQ ID NO:16;
(6) SEQ ID NO:19 and SEQ ID NO:16; or alternatively
(7) SEQ ID NO:19 and SEQ ID NO:21.
21. The primer pair of paragraph 20, wherein the rhubarb is selected from Rheum palmatum, Rheum tanguticum or Rheum officinale.
22. A kit for species identification of rhubarb or a material derived from rhubarb, wherein the kit comprises the primer pair of paragraph 20 or 21.
23. The kit of paragraph 22, wherein the kit further comprises PCR reaction reagents.
24. The kit of paragraph 23, wherein the PCR reaction reagents comprise: PCR amplification buffer, dNTPs, Taq DNA polymerase, MgCl2 and sterile ultrapure water.
25. The kit of any one of paragraphs 22-24, wherein the kit further comprises at least one selected from the group consisting of: reagents of a TaqMan probe system and reagents of a CRISPR/Cas12a system.
26. The kit of paragraph 25, wherein the reagents of the TaqMan probe system comprise: DNA polymerase, dNTPs, buffer, MgCl2, nuclease-free water and a TaqMan probe.
27. The kit of paragraph 26, wherein the TaqMan probe is FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ.
28. The kit of paragraph 25, wherein the reagents of the CRISPR/Cas12a system comprise: gene editing buffer, Cas protein, crRNA, nuclease-free water, and a fluorescent signal molecule.
29. The kit of paragraph 28, wherein the crRNA has the nucleic acid sequence shown in SEQ ID NO: 5, 6, 7 or 8.
30. Use of the primer pair of paragraph 20 or 21 or the kit of any one of paragraphs 22-29 for: identifying rhubarb ingredients in a sample to be detected; identifying the species of rhubarb or of a material derived from rhubarb; distinguishing traditional Chinese medicinal materials derived from rhubarb from their adulterants and counterfeits; or safety testing of food, medicines or health care products that contain rhubarb or a material derived from rhubarb.
31. The use of paragraph 30, wherein the sample to be detected is rhubarb, a tissue or organ of rhubarb, a Chinese medicinal material containing rhubarb, or an adulterant or counterfeit of rhubarb.
32. The use of paragraph 30 or 31, wherein the rhubarb sample to be detected is a sample from Rheum officinale, Rheum palmatum or Rheum tanguticum.
33. The use of any of paragraphs 30-32, wherein the rhubarb sample to be tested is a sample of single species origin or a mixture of samples of multiple species origins.
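The library construction described in paragraph 3 can be illustrated with a minimal Python sketch. It is not the tooling used in the disclosure (the examples use Jellyfish and Bowtie); the function name `build_kmer_library` and the toy genome are hypothetical and serve only to show that a sequence of length L yields L-K+1 fragments of length K, each with a copy number and a list of positions.

```python
from collections import Counter, defaultdict


def build_kmer_library(genome: str, k: int):
    """Split a genome of length L into its L-K+1 overlapping fragments of length K.

    Returns (copy_number, positions): how often each fragment occurs and the
    0-based start coordinates at which it occurs.
    """
    genome = genome.upper()
    if not 1 <= k <= len(genome):
        raise ValueError("k must be between 1 and the genome length")
    copy_number = Counter()
    positions = defaultdict(list)
    for start in range(len(genome) - k + 1):
        fragment = genome[start:start + k]
        copy_number[fragment] += 1
        positions[fragment].append(start)
    return copy_number, positions


# Hypothetical toy sequence, not real rhubarb data.
toy_genome = "ATGCGATGCA"
counts, where = build_kmer_library(toy_genome, 4)
```

For real nuclear genomes an in-memory dictionary of this kind would likely be too large, which is presumably why a dedicated k-mer counter is used in the examples.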
Exemplary aspects of the present invention will be further described with reference to the accompanying drawings to fully illustrate the objects, technical features, and technical effects of the present invention. The following drawings are merely examples of the present disclosure, and the scope of the present invention is not limited thereto.
Drawings
FIG. 1 is a schematic flow chart of the time-base method of the present disclosure;
FIG. 2 shows the results of applying an exemplary CRISPR/Cas12a system-based time-base method of the present disclosure to the identification of Rheum officinale (with Ro_cp_target1 as the standard specific target sequence);
FIG. 3 shows the results of applying an exemplary CRISPR/Cas12a system-based time-base method of the present disclosure to the identification of Rheum palmatum;
FIG. 4 shows the results of applying an exemplary CRISPR/Cas12a system-based time-base method of the present disclosure to the identification of Rheum tanguticum;
FIG. 5 shows the results of applying an exemplary CRISPR/Cas12a system-based time-base method of the present disclosure to the identification of Rheum officinale (with Ro_wg_target1 as the standard specific target sequence);
FIG. 6 shows the results of applying an exemplary time-base method of the present disclosure, based on the TaqMan probe-based real-time PCR system, to the identification of Rheum officinale;
FIG. 7 shows the results of applying an exemplary PCR system-based time-base method of the present disclosure to the identification of Rheum officinale;
FIG. 8 shows the results of applying an exemplary time-base method of the present disclosure to the identification of rhubarb medicinal materials to be detected;
FIG. 9 shows the results of applying an exemplary time-base method of the present disclosure to the identification of rhubarb decoction pieces to be detected;
FIG. 10 shows the results of applying an exemplary time-base method of the present disclosure to the identification of rhubarb-containing Chinese patent medicines to be detected;
FIG. 11 shows the results of the Ro_cp_medium, Rp_cp_medium and Rt_cp_medium groups when the time-base method was applied to the detection of a rhubarb-containing Chinese patent medicine.
Detailed Description
FIG. 1 shows a schematic flow chart of the time-base method of the present application. The traditional Chinese medicine identification method based on the time-base method is further described below, taking the identification of the traditional Chinese medicine rhubarb as a specific example. Experimental procedures not specified in the following examples were carried out under conventional conditions.
Example 1: Identification of rhubarb plants by the time-base method based on the CRISPR/Cas12a system
The 2020 edition of the Chinese Pharmacopoeia specifies that rhubarb is the dried root and rhizome of Rheum palmatum L., Rheum tanguticum Maxim. ex Balf. or Rheum officinale Baill. of the family Polygonaceae. Rhubarb is a widely used bulk traditional Chinese medicinal material with the effects of purging accumulation, clearing heat and purging fire, cooling the blood and removing toxins, removing blood stasis and unblocking the channels, and draining dampness to relieve jaundice. Chemical analysis shows that the contents of active ingredients differ among the three rhubarb species, so distinguishing the three species is necessary to ensure safe and effective clinical use. However, common molecular identification methods such as DNA barcoding cannot distinguish the three source species of rhubarb.
In this example, the time-base method based on the CRISPR/Cas12a system was used to identify rhubarb plants according to the following procedure:
(1) The nuclear, chloroplast and mitochondrial genome sequences of the three rhubarb species were obtained by shallow sequencing.
(2) The nuclear, chloroplast and mitochondrial genome sequences of the three rhubarb species (L = genome sequence length) were each split with Jellyfish (v1.1.12) into (L-25+1) sequences of 25 bp or (L-28+1) sequences of 28 bp to construct a small-fragment genome library.
(3) Sequences carrying a PAM were extracted from the small-fragment genome library of rhubarb (in this example the CRISPR/Cas12a system is used as the target sequence detection system, so sequences with TTTV at the 5' end or VAAA at the 3' end (PAM) were selected), and standard specific target sequences for identifying rhubarb plants were selected using Bowtie (v1.1.0) according to the following screening principles:
The GC content of the candidate sequence is 40%-60%; the candidate sequence does not contain four or more repeats, consecutive trinucleotide repeats, or three or more dispersed trinucleotide repeats; the candidate sequence is not complementary to the crRNA repeat (5'-UAAUUUCUACUAAGUGUAGAU-3'; SEQ ID NO: 22); the G+C content of the 6 nucleotides at the 5' end of the candidate sequence is 30%-80%; the candidate sequence contains at least 3 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genome sequences of adulterants, counterfeits and related species; and the region from -50 bp to +300 bp or from -300 bp to +50 bp around the candidate sequence contains no run of more than 4 consecutive A or GT repeats.
To demonstrate the versatility of the time-base method, in this example one standard specific target sequence was selected from the chloroplast genome of each of the three rhubarb species; these were named Ro_cp_target1 (Rheum officinale, medicinal rhubarb), Rp_cp_target1 (Rheum palmatum) and Rt_cp_target1 (Rheum tanguticum). One further standard specific target sequence was selected from the nuclear genome of Rheum officinale and named Ro_wg_target1. The sequences are as follows (5'→3'):
Ro_cp_target1: AATATGGTTATGTTATATTAATAAA (SEQ ID NO: 1)
Rp_cp_target1: TTTATATTGATTGTTTTATATTGAT (SEQ ID NO: 2)
Rt_cp_target1: TTTCGCCAGTATCATTATTATTTAATTT (SEQ ID NO: 3)
Ro_wg_target1: TTTCTTGTGGCGGAGTGCCTGACTT (SEQ ID NO: 4)
According to the selected CRISPR/Cas system and the crRNA design principles, crRNAs matching the standard specific target sequences were designed. The crRNAs corresponding to the standard specific target sequences selected from the chloroplast genomes of the three rhubarb species were named Ro_cp_crRNA1, Rp_cp_crRNA1 and Rt_cp_crRNA1, respectively. The crRNA corresponding to the standard specific target sequence selected from the nuclear genome of Rheum officinale was named Ro_wg_crRNA1. The sequences are as follows (5'→3'):
Ro_cp_crRNA1: UAAUUUCUACUAAGUGUAGAUUUAAUAUAACAUAACCAUAUU (SEQ ID NO: 5)
Rp_cp_crRNA1: UAAUUUCUACUAAGUGUAGAUUAUUGAUUGUUUUAUAUUGAU (SEQ ID NO: 6)
Rt_cp_crRNA1: UAAUUUCUACUAAGUGUAGAUGCCAGUAUCAUUAUUAUUUAAUUU (SEQ ID NO: 7)
Ro_wg_crRNA1: UAAUUUCUACUAAGUGUAGAUUUGUGGCGGAGUGCCUGACUU (SEQ ID NO: 8)
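The relationship between the listed targets and their crRNAs can be reproduced with a short Python sketch. It rests on an assumption inferred from the sequences themselves rather than stated in the text: each crRNA equals the repeat (SEQ ID NO: 22) followed by the target read 3' of its 4-nt TTTV PAM, transcribed to RNA, and a target whose PAM lies at its 3' end is first reverse-complemented. The function names are hypothetical.

```python
CRRNA_REPEAT = "UAAUUUCUACUAAGUGUAGAU"  # crRNA repeat, SEQ ID NO: 22

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def has_tttv_pam(dna: str) -> bool:
    """True if the sequence starts with a TTTV PAM (V = A, C or G)."""
    return len(dna) >= 4 and dna.startswith("TTT") and dna[3] in "ACG"


def design_crrna(target: str) -> str:
    """Build repeat + spacer for a target that carries a PAM at either end.

    Assumption: the spacer is the sequence following the PAM on the strand
    that starts with TTTV, written as RNA.
    """
    target = target.upper()
    if has_tttv_pam(target):
        oriented = target
    elif has_tttv_pam(reverse_complement(target)):
        oriented = reverse_complement(target)
    else:
        raise ValueError("no TTTV PAM found at either end of the target")
    spacer = oriented[4:].replace("T", "U")
    return CRRNA_REPEAT + spacer


targets = {
    "Ro_cp": "AATATGGTTATGTTATATTAATAAA",     # SEQ ID NO: 1
    "Rp_cp": "TTTATATTGATTGTTTTATATTGAT",     # SEQ ID NO: 2
    "Rt_cp": "TTTCGCCAGTATCATTATTATTTAATTT",  # SEQ ID NO: 3
    "Ro_wg": "TTTCTTGTGGCGGAGTGCCTGACTT",     # SEQ ID NO: 4
}
crrnas = {name: design_crrna(seq) for name, seq in targets.items()}
```

Under this assumption the four outputs coincide with SEQ ID NOs: 5 to 8; Ro_cp_target1 is the one case handled through its reverse complement.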
(4) Amplification and purification of the standard specific target sequences. Rheum officinale was collected from Sichuan crown (sample No. Y6), Rheum palmatum from Gansu downed (sample No. Z5), and Rheum tanguticum from Hongya, Sichuan (sample No. T1). The collected plant samples were dried and pulverized with a ball mill, and genomic DNA was extracted according to the instructions of the TIANGEN Plant Genomic DNA Kit. The integrity of the genomic DNA was checked by 0.8% agarose gel electrophoresis, and its purity and concentration were then measured with a Nanodrop 2000C spectrophotometer. Primers were designed from the regions where the standard specific target sequences are located in order to amplify those sequences; they were named Ro_cp_1F and Ro_cp_1R; Rp_cp_1F and Rp_cp_1R; Rt_cp_1F and Rt_cp_1R; and Ro_wg_1F and Ro_wg_1R, and their sequences are as follows (5'→3'):
Ro_cp_1F:CCAAATTGCCCGAAGCCTATG(SEQ ID NO:9)
Ro_cp_1R:ATCGCTTTCCGACCCACAAT(SEQ ID NO:10)
Rp_cp_1F:GTTTAGGCGGTACGTACATAGA(SEQ ID NO:11)
Rp_cp_1R:GATCTCAGTAAGAAGGGTTTACGA(SEQ ID NO:12)
Rt_cp_1F:CGCTTTCGCCAGTATCATTAT(SEQ ID NO:13)
Rt_cp_1R:CCATTCCACAAAGGGATCC(SEQ ID NO:14)
Ro_wg_1F:ATGGCGAGAGAGGTGTTCCTAAA(SEQ ID NO:15)
Ro_wg_1R:GTTGTGAATCCGACACGACCAATAT(SEQ ID NO:16)
The total volume of the PCR reaction was 50 μL: 2× Taq Mastermix, 2 μL of primers (F/R) (400 nM), 2 μL of total DNA sample, and nuclease-free water to 50 μL. The PCR conditions were: 95 °C for 30 sec; 35 cycles of 95 °C for 5 sec, Tm for 30 sec and 72 °C for 2 min; 72 °C for 10 min; hold at 10 °C. The PCR product was recovered and purified with the TIANGEN Universal DNA Purification Kit according to the instructions, the integrity of the standard specific target sequence was checked by 2% agarose gel electrophoresis, and the purity and concentration were then measured with a Nanodrop 2000C spectrophotometer. The recovered fragment of the region containing the standard specific target sequence was used as the DNA substrate in subsequent experiments.
(5) Identification of rhubarb plant samples by the time-base method. Ro_cp_crRNA1, Rp_cp_crRNA1, Rt_cp_crRNA1 and Ro_wg_crRNA1 were used as crRNAs, and the amplified fragments of the regions containing Ro_cp_target1, Rp_cp_target1, Rt_cp_target1 and Ro_wg_target1 were used as DNA substrates, giving the Ro_cp_plant, Rp_cp_plant, Rt_cp_plant and Ro_wg_plant groups, respectively. Experiments were performed with EnGen Lba Cas12a (Cpf1) from NEB in a total reaction volume of 100 μL: 10 μL of 10× NEBuffer 2.1, 2 μL of Lba Cas12a (20 nM), 3 μL of crRNA (300 nM), 10 μL of DNA substrate (1 ng/μL), 4 μL of Poly_C_FQ (400 nM) and 71 μL of nuclease-free water. NEBuffer 2.1, Lba Cas12a, crRNA and nuclease-free water were first added to the reaction and incubated at room temperature for 30 minutes; the DNA substrate and Poly_C_FQ were then added and the reaction was incubated at 37 °C. Fluorescence was measured with a microplate reader at λex 483 nm/λem 535 nm at 0, 3, 6, 9, 12, 15, 25, 35 and 45 minutes.
The results of the Ro_cp_plant, Rp_cp_plant, Rt_cp_plant and Ro_wg_plant groups are shown in FIGS. 2, 3, 4 and 5, respectively. In the Ro_cp_plant group, with Ro_cp_crRNA1 as the crRNA, only Rheum officinale produced a fluorescent signal, whereas Rheum palmatum and Rheum tanguticum did not. Likewise, in the other 3 groups a fluorescent signal was produced only when the crRNA was combined with its corresponding sample, and the signal differed significantly (P < 0.01) from the CK group (blank control: no DNA substrate added, otherwise identical to the other groups). The results demonstrate that the time-base method based on the CRISPR/Cas12a system can accurately and rapidly identify the three rhubarb species.
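The alignment criterion used in step (3) of this example (at least 3 differing nucleotides relative to related-species genomes) can be approximated by the sketch below. It is a simplification and not the Bowtie workflow of the disclosure: it compares ungapped windows on the forward strand only, and the names `min_mismatches` and `is_specific` as well as the toy sequences are hypothetical.

```python
def min_mismatches(candidate: str, reference: str) -> int:
    """Smallest number of mismatches between the candidate and any ungapped
    window of the same length in the reference (forward strand only)."""
    k = len(candidate)
    best = k  # value returned when the reference has no window of length k
    for start in range(len(reference) - k + 1):
        window = reference[start:start + k]
        mismatches = sum(a != b for a, b in zip(candidate, window))
        best = min(best, mismatches)
    return best


def is_specific(candidate: str, related_genomes, min_diff: int = 3) -> bool:
    """True if the candidate differs from every related genome by at least
    min_diff nucleotides at its best-matching position."""
    return all(min_mismatches(candidate, g) >= min_diff for g in related_genomes)


# Toy sequences for illustration only, not real genome data.
exact_hit = min_mismatches("ACGT", "TTACGTTT")
far_hit = min_mismatches("ACGT", "AAAA")
```

A fuller check would also scan the reverse complement of each reference and allow for gaps, which a read aligner handles.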
Example 2: Identification of rhubarb plants by the time-base method based on the TaqMan probe-based real-time PCR system
(1) The nuclear, chloroplast and mitochondrial genome sequences of the three rhubarb species were obtained by shallow sequencing.
(2) The nuclear, chloroplast and mitochondrial genome sequences of the three rhubarb species (L = genome sequence length) were each split with Jellyfish (v1.1.12) into (L-28+1) sequences of 28 bp (the length is optional; 18-30 bp is recommended) to construct a small-fragment genome library.
(3) Sequences containing no more than 3 consecutive repeated bases and 3 consecutive G bases were extracted from the small-fragment genome library of rhubarb, and standard specific target sequences for identifying rhubarb plants were selected using Bowtie (v1.1.0) software according to the following screening principles:
The G+C content of the primers that amplify the region where the candidate sequence is located is 30%-80%; the annealing temperatures of the upstream and downstream primers that amplify the region where the candidate sequence is located and of the fluorescent probe designed from the candidate sequence are 55 °C to 60 °C and 68 °C to 70 °C, respectively, and the annealing temperatures of the two primers of a pair differ by no more than 2 °C; the length of the primers is 15-30 bp; the upstream and downstream primers are 50-150 bp apart, and the forward primer is located close to the fluorescent probe designed from the candidate sequence; the fluorescent probe and the upstream and downstream primers contain no hairpin structures, and the primer pair cannot form a dimer; there are no more than two G/C bases among the 5 bases at the 3' end of each primer; the fluorescent probe contains more C bases than G bases; and the candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of related species.
In this example, a standard specific target sequence, designated Ro_wg_Target2, was selected from the nuclear genome of Rheum officinale. The sequence is as follows (5 '. Fwdarw.3'):
Ro_wg_target2:GCTTGAATGAAAGTCAGGCACTCCGCCA(SEQ ID NO:17)
According to the selected TaqMan probe-based real-time PCR system and the design principles for fluorescent probes, a fluorescent probe matching the standard specific target sequence was designed and named Ro_wg_probe2. Its sequence is as follows (5'→3'):
Ro_wg_probe2:FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ
Amplification primers were designed from the fragments upstream and downstream of the standard specific target sequence and named Ro_wg_2F and Ro_wg_2R, respectively. Their sequences are as follows (5'→3'):
Ro_wg_2F: GCTGTCATTCAGCTGTTCTCTGT (SEQ ID NO: 18)
Ro_wg_2R: GTTGTGAATCCGACACGACCAATAT (SEQ ID NO: 16)
(4) Extraction and detection of rhubarb plant samples and plant DNA were the same as in example 1.
(5) Identification of rhubarb plant samples by the time-base method: Ro_wg_Probe2 was used as the fluorescent probe, Ro_wg_2F and Ro_wg_2R as primers, and the genomic DNA of the three rhubarb species as DNA substrates (i.e. total DNA samples) to set up the Ro_wg_probe group. Experiments were performed using Probe qPCR Mix from TaKaRa in a total qPCR reaction volume of 20 μL: 10 μL of 2× Taq Mastermix, 0.4 μL of primer (F/R) (200 nM), 2 μL of total DNA sample, 0.8 μL of fluorescent probe, and nuclease-free water to make up to 20 μL. The PCR reaction conditions were: 95 ℃ for 30 sec; 35 cycles: 95 ℃ for 5 sec; 60 ℃ for 30 sec; 72 ℃ for 2 min; 72 ℃ for 10 min; hold at 10 ℃.
As shown in FIG. 6, in the Ro_wg_probe group only medicinal rhubarb (Rheum officinale) had a Ct value smaller than 37, which differed significantly (P < 0.01) from CK (the blank control group), whereas the Ct values of Rheum palmatum and Rheum tanguticum were the same as CK and were both larger than 37. The results demonstrate that the time-base method based on the TaqMan probe-based real-time PCR system can accurately and rapidly identify the three rhubarb plants.
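The sequence-composition rules listed for the TaqMan system can be checked with a short script. The following is a minimal sketch and not part of the described method: the function names are hypothetical, and only the composition rules are covered (G+C content of 30%-80%, primer length of 15-30 bp, no more than two G/C bases in the last 5 bases at the 3' end, more C than G in the probe). Annealing temperature, hairpins and dimers are not evaluated.

```python
# Illustrative check of the sequence-composition rules for the TaqMan system.
# Function names are hypothetical; Tm, hairpin and dimer rules are NOT covered.

def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def three_prime_gc(seq: str, window: int = 5) -> int:
    """Number of G/C bases in the last `window` bases at the 3' end."""
    tail = seq.upper()[-window:]
    return tail.count("G") + tail.count("C")

def primer_ok(seq: str) -> bool:
    """Length 15-30 bp, G+C 30%-80%, at most two G/C in the 3'-terminal 5 bases."""
    return (15 <= len(seq) <= 30
            and 30.0 <= gc_percent(seq) <= 80.0
            and three_prime_gc(seq) <= 2)

def probe_ok(seq: str) -> bool:
    """The probe should contain more C bases than G bases."""
    seq = seq.upper()
    return seq.count("C") > seq.count("G")

# Sequences as given in this example (5'->3').
RO_WG_2F = "GCTGTCATTCAGCTGTTCTCTGT"
RO_WG_2R = "GTTGTGAATCCGACACGACCAATAT"
RO_WG_PROBE2 = "GCTTGAATGAAAGTCAGGCACTCCGCCA"

if __name__ == "__main__":
    for name, seq in (("Ro_wg_2F", RO_WG_2F), ("Ro_wg_2R", RO_WG_2R)):
        print(name, len(seq), round(gc_percent(seq), 1), primer_ok(seq))
    print("Ro_wg_Probe2", probe_ok(RO_WG_PROBE2))
```

Such a check only narrows the candidate list; the thermodynamic rules would still need a dedicated primer-design tool.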
Example 3: Identification of rhubarb plants by the time-base method based on the PCR system
(1) Three rhubarb nuclear, chloroplast and mitochondrial genome sequences were obtained by shallow sequencing.
(2) The nuclear, chloroplast and mitochondrial genome sequences of the three rhubarb species (L = genome sequence length) were each divided into (L-20+1) sequences of 20 bp (the length is optional; 18-30 bp is recommended) with Jellyfish (v1.1.12) to construct a small fragment genome library.
(3) Constructing a rhubarb candidate target sequence library, and extracting sequences which are not rich in GC regions from a small fragment genome library of rhubarb; and standard specific target sequences for identification of rheum officinale plants were selected using Bowtie (v1.1.0) software according to the following screening principle:
the GC content of the amplification primers designed according to the candidate sequence is 40%-60%, and the four nucleotides are uniformly distributed (no polypurine or polypyrimidine tracts and no GC-rich region); the length of the amplification primers designed according to the candidate sequence is 18-30 bp, and the lengths within the primer pair differ by no more than 3 bases; the amplification primers designed according to the candidate sequence contain no inverted repeat sequence and no self-complementary sequence longer than 3 bp, and the primers cannot form a dimer; the annealing temperature of the amplification primers designed according to the candidate sequence is 55-60 ℃, and the annealing temperatures within the primer pair differ by no more than 2 ℃; there is more than one but no more than three G/C bases in the 5 bases at the 3' end of each amplification primer designed according to the candidate sequence; the candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species.
In this example, a standard specific target sequence, designated Ro_wg_Target3, was selected from the nuclear genome of Rheum officinale. The sequence is as follows (5'→3'):
Ro_wg_Target3: AAGCTGGCTGTCATTCAGCT (SEQ ID NO: 19)
According to the selected PCR system and the primer design principle, detection primers matching the standard specific target sequence were designed and named Ro_wg_3F and Ro_wg_3R, respectively. The sequences are as follows (5'→3'):
Ro_wg_3F: AAGCTGGCTGTCATTCAGCT (SEQ ID NO: 19)
Ro_wg_3R: GTTGTGAATCCGACACGACCAATAT (SEQ ID NO: 16)
(4) Extraction and detection of rhubarb plant samples and plant DNA were the same as in example 1.
(5) Identification of rhubarb plant samples by the time-base method: Ro_wg_3F and Ro_wg_3R were used as primers and the genomic DNA of the three rhubarb species as DNA substrates (i.e., total DNA samples) to set up the Ro_wg_PCR group. Experiments were performed using Taq Master mix from Edley in a total PCR reaction volume of 25 μL: 12.5 μL of 2× Taq Master mix, 0.5 μL of primer (F/R) (200 nM), 2 μL of total DNA sample, and nuclease-free water to make up to 25 μL. The PCR reaction conditions were: 95 ℃ for 30 sec; 40 cycles: 95 ℃ for 5 sec; 60 ℃ for 30 sec. The PCR products were electrophoresed on a 2% agarose gel at 120 V for 30 min.
As a result, in the Ro_wg_PCR group only medicinal rhubarb gave a bright single band, whereas Rheum palmatum and Rheum tanguticum, like CK, produced no bright single band. The results show that the time-base method based on the PCR system can accurately and rapidly identify the three rhubarb plants.
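Step (2) above divides a genome sequence of length L into L-K+1 overlapping fragments of length K. The example performs this with Jellyfish; the sketch below is only a pure-Python stand-in on a toy sequence to make the counting explicit. It does not reproduce Jellyfish's algorithm or interface, and the function name and the toy sequence are hypothetical.

```python
# Toy illustration of building a small fragment (k-mer) library.
# This is NOT Jellyfish; it only shows the L - K + 1 sliding-window idea.
from collections import Counter

def build_kmer_library(genome: str, k: int) -> Counter:
    """Count every overlapping fragment of length k (L - k + 1 windows in total)."""
    genome = genome.upper()
    if k <= 0 or k > len(genome):
        raise ValueError("k must be between 1 and the sequence length")
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

toy_genome = "ACGTACGTTA"            # hypothetical 10-bp sequence, L = 10
library = build_kmer_library(toy_genome, 4)
total_fragments = sum(library.values())  # L - K + 1 = 10 - 4 + 1 = 7

if __name__ == "__main__":
    print(total_fragments)   # number of windows
    print(library["ACGT"])   # copy number of one fragment
```

The per-fragment counts correspond to the copy number that is computed for each library fragment; for real genomes a dedicated k-mer counter is needed for memory reasons.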
Example 4: identification of rhubarb plants by time-base method based on Sanger sequencing system
(1) Three rhubarb nuclear, chloroplast and mitochondrial genome sequences were obtained by shallow sequencing.
(2) The nuclear, chloroplast and mitochondrial genome sequences of the three rhubarb species (L = genome sequence length) were each divided into (L-25+1) sequences of 25 bp (the length is optional; 20-750 bp is recommended) with Jellyfish (v1.1.12) to construct a small fragment genome library.
(3) Because of the low sequence requirements of Sanger sequencing systems, small fragment genomic libraries directly based on rhubarb were selected using Bowtie (v1.1.0) software to identify standard specific target sequences for rhubarb plants according to the following screening principles:
the region where the candidate sequence is located is homozygous; the GC content of the region where the candidate sequence is located is about 50%, for example 30%-80%, so that an excessively high GC content is avoided; the region where the candidate sequence is located cannot contain more than four consecutive repeated nucleotides; the region where the candidate sequence is located cannot contain a methylated sequence; the region where the candidate sequence is located cannot have a hairpin structure; the candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species.
In this example, a standard specific target sequence, designated Ro_wg_Target4, was selected from the nuclear genome of Rheum officinale. The sequence is as follows (5'→3'):
Ro_wg_Target4: ATAACCTGCATTCTATGGTTTGGTT (SEQ ID NO: 20)
According to the selected PCR system and the primer design principle, sequencing primers matching the standard specific target sequence were designed and named Ro_wg_4F and Ro_wg_4R, respectively. The sequences are as follows (5'→3'):
Ro_wg_4F: AAGCTGGCTGTCATTCAGCT (SEQ ID NO: 19)
Ro_wg_4R: ATATTGGTCGTGTCGGATTCACAAC (SEQ ID NO: 21)
(4) Extraction and detection of rhubarb plant samples and plant DNA were the same as in example 1.
(5) Identification of rhubarb plant samples by the time-base method: Ro_wg_4F and Ro_wg_4R were used as primers and the genomic DNA of the three rhubarb species as DNA substrates (i.e., total DNA samples) to set up the Ro_wg_sequencing group. Experiments were performed using Taq Master mix from Edley in a total PCR reaction volume of 25 μL: 12.5 μL of 2× Taq Master mix, 0.5 μL of primer (F/R) (200 nM), 2 μL of total DNA sample, and nuclease-free water to make up to 25 μL. The PCR reaction conditions were: 95 ℃ for 30 sec; 40 cycles: 95 ℃ for 5 sec; 60 ℃ for 30 sec. The PCR products were sequenced using a Sanger sequencing system, and the sequencing results were assembled and analyzed.
The results are shown in FIG. 8. In the Ro_wg_sequencing group, only medicinal rhubarb contained the standard specific target sequence, and neither Rheum palmatum nor Rheum tanguticum had a sequence identical to the standard specific target sequence. The results demonstrate that the time-base method based on the Sanger sequencing system can accurately and rapidly identify the three rhubarb plants.
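The screening principles in Examples 2 to 4 all require that a candidate sequence differ by at least 2 nucleotides from the genomes of other species. The example applies this rule with Bowtie; the brute-force sketch below is a simplified stand-in for toy data only. It assumes ungapped comparison on both strands, and all function names and toy sequences are hypothetical rather than taken from the source.

```python
# Brute-force illustration of the "at least N differing nucleotides" rule.
# NOT Bowtie: ungapped comparison only, suitable for toy sequences.

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def reverse_complement(seq: str) -> str:
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def min_mismatches(candidate: str, genome: str) -> int:
    """Fewest mismatches between the candidate and any same-length window
    of the genome, searching both strands."""
    candidate = candidate.upper()
    k = len(candidate)
    best = k  # upper bound: every position differs
    for strand in (genome.upper(), reverse_complement(genome)):
        for i in range(len(strand) - k + 1):
            best = min(best, hamming(candidate, strand[i:i + k]))
    return best

def is_specific(candidate: str, other_genomes, min_diff: int = 2) -> bool:
    """True if the candidate differs by at least min_diff nucleotides
    from every window of every non-target genome."""
    return all(min_mismatches(candidate, g) >= min_diff for g in other_genomes)

# Hypothetical toy data, not sequences from the source.
candidate = "GATTACA"
related_a = "CCGACCACACC"   # closest window differs at 2 positions
related_b = "CCGATCACACC"   # closest window differs at 1 position

if __name__ == "__main__":
    print(is_specific(candidate, [related_a]))
    print(is_specific(candidate, [related_a, related_b]))
```

For the CRISPR/Cas12a system the same check would be run with `min_diff=3`, matching the threshold of at least 3 differing nucleotides stated in condition (a) of claim 1.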
Example 5: application of time-delicacy method to identification result of rhubarb medicinal material
This example uses the CRISPR/Cas system as detection means, three rhubarb standard specific target sequences have been obtained in example 1, which directly uses ro_cp_crrna1, rp_cp_crrna1 and rt_cp_crrna1 as crrnas, using the following primer pairs as primers for the corresponding amplification standard specific target sequences: ro_cp_ F, ro _cp_1r; rp_cp_1f, rp_cp_1r; rt_cp_1f and rt_cp_1r.
The medicinal rhubarb to be detected is collected from Sichuan crown, the serial number is Y10, and the tangutot rhubarb is collected from Sichuan cover, the serial number is T12. The medicinal material sample to be detected is crushed by a ball mill, and then genomic DNA is extracted according to Plant Genomic DNA Kit application instructions provided by TIANGEN company. The integrity of the genomic DNA was checked by 0.8% agarose gel electrophoresis, and then the purity and concentration thereof were checked by a Nanodrop 2000C spectrophotometer. The obtained genomic DNAs of two rhubarb medicinal materials are used as substrates, and the three pairs of primers are respectively used for amplifying standard specific target sequences. The total volume of PCR reaction was 50. Mu.L: 25. Mu.L of 2 XTaq Mastermix, 2. Mu.L of primer (F/R) (200 nM), 2. Mu.L of total DNA sample, and nuclease-free water was made up to 50. Mu.L. The PCR reaction conditions were: 30sec at 95 ℃;35 cycles: 95 ℃ for 5sec; t (T) m 30sec;72 ℃ for 2min;72 ℃ for 10min; preserving at 10 ℃.
The PCR product was recovered and purified using the Universal DNA Purification Kit provided by TIANGEN according to the instructions. The integrity of the standard specific target sequence was checked by 2% agarose gel electrophoresis, its purity and concentration were then checked with a Nanodrop 2000C spectrophotometer, and the recovered fragment of the region where the standard specific target sequence is located was used as the DNA substrate for the subsequent experiments.
Ro_cp_crRNA1, Rp_cp_crRNA1 and Rt_cp_crRNA1 were used as crRNAs, and the Ro_cp_herb, Rp_cp_herb and Rt_cp_herb groups were set up correspondingly, with the recovered fragments of the regions where the standard specific target sequences are located as DNA substrates. Experiments were performed using EnGen Lba Cas12a (Cpf1) from NEB in a total reaction volume of 100 μL: 10 μL of 10× NEBuffer 2.1, 2 μL of Lba Cas12a (20 nM), 3 μL of crRNA (300 nM), 10 μL of DNA substrate (1 ng/μL), 4 μL of Poly_C_FQ (400 nM) and 71 μL of nuclease-free water. NEBuffer 2.1, Lba Cas12a, crRNA and nuclease-free water were added to the reaction system and incubated for 30 minutes at room temperature; the DNA substrate and Poly_C_FQ were then added, the reaction system was incubated at 37 ℃, and fluorescence was detected with a microplate reader at λex 483 nm/λem 535 nm at 0, 3, 6, 9, 12, 15, 25, 35 and 45 minutes.
The results of the Ro_cp_herb, Rp_cp_herb and Rt_cp_herb groups are shown in FIG. 9. In the Ro_cp_herb group, with Ro_cp_crRNA1 as crRNA, only medicinal rhubarb produced a fluorescent signal, while Rheum tanguticum produced none. Likewise, in the other two groups a fluorescent signal was produced only when the crRNA was combined with the corresponding sample, with a significant difference (P < 0.01) from the CK group. The results show that the time-base method based on the CRISPR/Cas12a system can accurately and rapidly identify the three rhubarb medicinal materials.
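The volumes in the Cas12a reaction above can be verified with simple arithmetic. The sketch below is purely illustrative: the helper name is hypothetical, the component volumes are those listed in this example, and only quantities that follow directly from the listed numbers are computed.

```python
# Volume bookkeeping for the 100 uL Cas12a reaction described above.
# The helper name is hypothetical; the volumes are those listed in the example.

def water_to_volume(total_ul: float, components_ul: dict) -> float:
    """Nuclease-free water needed to bring the listed components to total_ul."""
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return total_ul - used

cas12a_mix_ul = {
    "10x NEBuffer 2.1": 10,
    "Lba Cas12a": 2,
    "crRNA": 3,
    "DNA substrate (1 ng/uL)": 10,
    "Poly_C_FQ": 4,
}

water_ul = water_to_volume(100, cas12a_mix_ul)

# 10 uL of a 1 ng/uL substrate diluted into 100 uL total volume.
substrate_final_ng_per_ul = 10 * 1 / 100

if __name__ == "__main__":
    print(water_ul, substrate_final_ng_per_ul)
```

The computed water volume agrees with the 71 μL stated in the protocol, and the DNA substrate ends up at 0.1 ng/μL in the final reaction.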
Example 6: results of application of time-delicacy method to identification of rheum officinale decoction pieces
This example uses the CRISPR/Cas system as detection means, three rhubarb standard specific target sequences have been obtained in example 1, which uses directly ro_cp_crrna1, rp_cp_crrna1 and rt_cp_crrna1 as crrnas, using the following primer pairs as primers for the corresponding amplification standard specific target sequences: ro_cp_1f, ro_cp_1r; rp_cp_1f, rp_cp_1r; rt_cp_1f and rt_cp_1r.
Three packages of decoction pieces were purchased from Beijing, hebei Anguo and Anhui Bozhou, respectively. From each of which a piece of decoction was randomly extracted as a sample, no. YP1, YP2 and YP3, and the piece of decoction was pulverized with a ball mill, and then genomic DNA was extracted according to the instructions for use Plant Genomic DNA Kit provided by TIANGEN corporation. The integrity of the genomic DNA was checked by 0.8% agarose gel electrophoresis, and then the purity and concentration thereof were checked by a Nanodrop 2000C spectrophotometer. And (3) using the obtained three rhubarb decoction pieces genome DNA as a substrate, and respectively amplifying standard specific target sequences by using the three pairs of primers. The total volume of PCR reaction was 50. Mu.L: 25. Mu.L of 2 XTaq Mastermix, 2. Mu.L of primer (F/R) (400 nM), 2. Mu.L of total DNA sample, and nuclease-free water was made up to 50. Mu.L. The PCR reaction conditions were: 30sec at 95 ℃;35 cycles: 95 ℃ for 5sec; t (T) m 30sec;72 ℃ for 2min;72 ℃ for 10min; preserving at 10 ℃.
The PCR product was recovered and purified using the Universal DNA Purification Kit from TIANGEN according to the instructions. The integrity of the target sequence was checked by 2% agarose gel electrophoresis, its purity and concentration were then checked with a Nanodrop 2000C spectrophotometer, and the recovered fragment of the region where the standard specific target sequence is located was used as the DNA substrate for the subsequent experiments.
Ro_cp_crRNA1, Rp_cp_crRNA1 and Rt_cp_crRNA1 were used as crRNAs, and the Ro_cp_decotion, Rp_cp_decotion and Rt_cp_decotion groups were set up correspondingly, with the recovered fragments of the regions where the standard specific target sequences are located as DNA substrates. Experiments were performed using EnGen Lba Cas12a (Cpf1) from NEB in a total reaction volume of 100 μL: 10 μL of 10× NEBuffer 2.1, 2 μL of Lba Cas12a (20 nM), 3 μL of crRNA (300 nM), 10 μL of DNA substrate (1 ng/μL), 4 μL of Poly_C_FQ (400 nM) and 71 μL of nuclease-free water. NEBuffer 2.1, Lba Cas12a, crRNA and nuclease-free water were added to the reaction system and incubated for 30 minutes at room temperature; the DNA substrate and Poly_C_FQ were then added, the reaction system was incubated at 37 ℃, and fluorescence was detected with a microplate reader at λex 483 nm/λem 535 nm at 0, 3, 6, 9, 12, 15, 25, 35 and 45 minutes.
The results of the Ro_cp_decotion, Rp_cp_decotion and Rt_cp_decotion groups are shown in FIG. 10. No obvious fluorescent signal was generated in the Ro_cp_decotion group, indicating that none of the three decoction pieces is medicinal rhubarb. In the Rp_cp_decotion group, decoction pieces 1 and 2 (YP1 and YP2) generated fluorescent signals, indicating that decoction pieces 1 and 2 are Rheum palmatum and that decoction piece 3 (YP3) is not. In the Rt_cp_decotion group, only decoction piece 3 generated a fluorescent signal, indicating that decoction piece 3 is Rheum tanguticum, which is consistent with the results of the other two groups. The results demonstrate that the time-base method based on the CRISPR/Cas12a system can accurately and rapidly identify the three rhubarb decoction pieces.
Example 7: results of applying time-delicacy method to Chinese patent medicine containing rhubarb
This example uses the CRISPR/Cas system as detection means, three rhubarb standard specific target sequences have been obtained in example 1, which uses directly ro_cp_crrna1, rp_cp_crrna1 and rt_cp_crrna1 as crrnas, using the following primer pairs as primers for the corresponding amplification standard specific target sequences: ro_cp_1f, ro_cp_1r; rp_cp_1f, rp_cp_1r; rt_cp_1f and rt_cp_1r.
Selection ofThree Chinese patent medicines containing rhubarb: the red-inducing pill, the hemp seed intestine-moistening tablet and the Ruyi golden powder are purchased from Beijing, and the hemp seed intestine-moistening tablet and the Ruyi golden powder are purchased from Hebei Shijizhuang. Pulverizing radix et rhizoma Rhei pill, fructus Cannabis intestine-moistening tablet and Ruyi golden powder 3-5g with ball mill, and washing with nuclear separating liquid for three times. Genomic DNA was then extracted according to the instructions for use Plant Genomic DNA Kit provided by TIANGEN. The integrity of the genomic DNA was checked by 0.8% agarose gel electrophoresis, and then the purity and concentration thereof were checked by a Nanodrop 2000C spectrophotometer. The three obtained Chinese patent medicine genome DNA are used as substrates, and the three pairs of primers are respectively used for amplifying standard specific target sequences. The total volume of PCR reaction was 50. Mu.L: 25. Mu.L of 2 XTaq Mastermix, 2. Mu.L of primer (F/R) (400 nM), 2. Mu.L of total DNA sample, and nuclease-free water was made up to 50. Mu.L. The PCR reaction conditions were: 30sec at 95 ℃;35 cycles: 95 ℃ for 5sec; t (T) m 30sec;72 ℃ for 2min;72 ℃ for 10min; preserving at 10 ℃.
The PCR product was recovered and purified using the Universal DNA Purification Kit provided by TIANGEN according to the instructions. The integrity of the standard specific target sequence was checked by 2% agarose gel electrophoresis, its purity and concentration were then checked with a Nanodrop 2000C spectrophotometer, and the recovered fragment of the region where the standard specific target sequence is located was used as the DNA substrate for the subsequent experiments.
Ro_cp_crRNA1, Rp_cp_crRNA1 and Rt_cp_crRNA1 were used as crRNAs, and the Ro_cp_medium, Rp_cp_medium and Rt_cp_medium groups were set up correspondingly, with the recovered fragments of the regions where the standard specific target sequences are located as DNA substrates. Experiments were performed using EnGen Lba Cas12a (Cpf1) from NEB in a total reaction volume of 100 μL: 10 μL of 10× NEBuffer 2.1, 2 μL of Lba Cas12a (20 nM), 3 μL of crRNA (300 nM), 10 μL of DNA substrate (1 ng/μL), 4 μL of Poly_C_FQ (400 nM) and 71 μL of nuclease-free water. NEBuffer 2.1, Lba Cas12a, crRNA and nuclease-free water were added to the reaction system and incubated for 30 minutes at room temperature; the DNA substrate and Poly_C_FQ were then added, the reaction system was incubated at 37 ℃, and fluorescence was detected with a microplate reader at λex 483 nm/λem 535 nm at 0, 3, 6, 9, 12, 15, 25, 35 and 45 minutes.
The results of the Ro_cp_medium, Rp_cp_medium and Rt_cp_medium groups are shown in FIG. 11. In the Ro_cp_medium group, the red-inducing pill and the Ruyi golden powder generated fluorescent signals, indicating that they contain medicinal rhubarb and that the hemp seed intestine-moistening tablet does not. In the Rp_cp_medium group, the hemp seed intestine-moistening tablet generated a fluorescent signal, indicating that it contains Rheum palmatum and that the other two Chinese patent medicines do not. The Rt_cp_medium group showed no obvious fluorescent signal, indicating that none of the three Chinese patent medicines contains Rheum tanguticum, which is consistent with the results of the other two groups. The results show that the time-base method based on the CRISPR/Cas12a system can accurately and rapidly identify Chinese patent medicines containing rhubarb.
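The interpretation logic used in Examples 5 to 7 (one crRNA per species, a fluorescent signal meaning the species is present) can be written as a small lookup. This is an illustrative sketch only: the function name and data layout are hypothetical, and the true/false values simply restate the qualitative outcomes reported above for FIG. 11, not measured fluorescence values.

```python
# Mapping crRNA-specific fluorescence calls to rhubarb species.
# Function name and data layout are hypothetical; the booleans restate the
# qualitative outcomes described in the text, not raw fluorescence readings.

CRRNA_TO_SPECIES = {
    "Ro_cp_crRNA1": "Rheum officinale",
    "Rp_cp_crRNA1": "Rheum palmatum",
    "Rt_cp_crRNA1": "Rheum tanguticum",
}

def call_species(signals: dict) -> list:
    """Return the species whose crRNA reaction produced a fluorescent signal."""
    return sorted(CRRNA_TO_SPECIES[crrna]
                  for crrna, positive in signals.items() if positive)

example7_signals = {
    "red-inducing pill": {
        "Ro_cp_crRNA1": True, "Rp_cp_crRNA1": False, "Rt_cp_crRNA1": False},
    "hemp seed intestine-moistening tablet": {
        "Ro_cp_crRNA1": False, "Rp_cp_crRNA1": True, "Rt_cp_crRNA1": False},
    "Ruyi golden powder": {
        "Ro_cp_crRNA1": True, "Rp_cp_crRNA1": False, "Rt_cp_crRNA1": False},
}

if __name__ == "__main__":
    for sample, signals in example7_signals.items():
        print(sample, "->", call_species(signals))
```

In practice the true/false call would come from a statistical comparison with the blank control, which this sketch does not model.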

Claims (10)

1. A Chinese medicine identification method based on a time-base method comprises the following steps:
(1) Obtaining nuclear, chloroplast and mitochondrial genome sequences of Chinese medicinal primordial species, and constructing a small fragment genome library;
(2) Extracting candidate target sequences from the small fragment genome library and analyzing, and screening candidate sequences meeting at least one selected from screening conditions (a) to (d) from the candidate target sequences as standard specific target sequences, wherein the screening conditions comprise:
(a) The GC content of the candidate sequence is 40% -60%;
the candidate sequence cannot contain four or more repeats, consecutive trinucleotide repeats, discrete 3 or more trinucleotide repeats;
the candidate sequence is not complementary to the crRNA repeat sequence;
the content of G+C in 6 nucleotides at the 5' end of the candidate sequence is 30% -80%;
the candidate sequence contains at least 3 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genome sequences of adulterants and closely related species; and/or
The region of the candidate sequence is-50 bp to +300bp or-300 bp to +50bp, and more than 4 continuous A or GT repeats do not exist;
(b) The content of G+C in the primer of the region where the candidate sequence is amplified is 30% -80%;
the annealing temperatures of the fluorescent probe designed according to the candidate sequence and the primer of the region where the upstream and downstream amplification candidate sequences are located are 55 ℃ to 60 ℃ and 68 ℃ to 70 ℃ respectively, and the annealing temperatures among the primer pairs differ by not more than 2 ℃;
the length of the primer of the region where the candidate sequence is amplified is 15-30bp;
primers of the region where the upstream and downstream amplification candidate sequences are located are spaced by 50-150bp, and a forward primer is close to a fluorescent probe designed according to the candidate sequences;
the fluorescent probe designed according to the candidate sequence and the primer of the region where the upstream and downstream amplification candidate sequences are located do not contain hairpin structures and the primer pair cannot form a dimer;
The G/C base in 5bp of the 3' end of the primer of the region where the upstream and downstream amplification candidate sequences are located is not more than two;
more C bases than G bases in the fluorescent probe designed according to the candidate sequence; and/or
The candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species;
(c) The GC content of the amplification primer designed according to the candidate sequence is 40%-60%, and the four nucleotides are uniformly distributed, so that the amplification primer contains no polypurine or polypyrimidine tract and no GC-rich region;
the length of the amplification primer designed according to the candidate sequence is 18-30bp, and the difference between the primer pairs is not more than 3 bases;
the amplification primer designed according to the candidate sequence does not contain an inverted repeat sequence and a self-complementary sequence larger than 3bp, and a dimer cannot be formed between the primers;
the annealing temperature of the amplification primers designed according to the candidate sequence is 55-60 ℃, and the annealing temperature difference between the primer pairs is not more than 2 ℃;
more than one but not more than three G/C bases in 5 bases at the 3' -end of the amplification primer designed according to the candidate sequence; and/or
The candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species;
Alternatively, (d) the region in which the candidate sequence is located is homozygous;
the GC content of the region where the candidate sequence is located is 30% -80%;
the region of the candidate sequence cannot contain more than four consecutive repeated nucleotides;
the region in which the candidate sequence is located cannot contain a methylated sequence;
the area where the candidate sequence is located cannot have a hairpin structure; and/or
The candidate sequence contains at least 2 differing nucleotides when aligned with the nuclear, chloroplast and mitochondrial genomes of adulterants and closely related species;
(3) Extracting genomic DNA of a traditional Chinese medicine sample to be detected, optionally amplifying it, using the genomic DNA or its amplification product as a DNA substrate, and detecting whether the standard specific target sequence is present in the DNA substrate by using a target sequence detection system, wherein the target sequence detection system comprises a CRISPR/Cas12a system, a TaqMan probe-based real-time PCR system, a PCR system or a Sanger sequencing system; wherein, for the CRISPR/Cas12a system, if a significant fluorescent signal is generated in the detection, the traditional Chinese medicine sample to be detected is identical to the designated traditional Chinese medicine primordial species, and otherwise it is not; for the TaqMan probe-based real-time PCR system, if the Ct value obtained in the detection is smaller than 37, the traditional Chinese medicine sample to be detected is identical to the designated traditional Chinese medicine primordial species, and otherwise it is not; for the PCR system, if the electrophoresis result of the detection shows an obvious band, the traditional Chinese medicine sample to be detected is identical to the designated traditional Chinese medicine primordial species, and otherwise it is not; for the Sanger sequencing system, if the sequencing result is the same as the standard specific target sequence, the traditional Chinese medicine sample to be detected is identical to the designated traditional Chinese medicine primordial species, and otherwise it is not.
2. The method of claim 1, wherein the nuclear, chloroplast, and mitochondrial genomic sequences of the traditional Chinese medicine primordial species are obtained by constructing a genomic map or shallow sequencing;
preferably, the nuclear, chloroplast and mitochondrial genome sequences of the traditional Chinese medicine primordial species are divided into L-K+1 fragments with the length of K to form a small fragment genome library, the copy number of each fragment is calculated, and the genome position of each fragment is determined through comparison with a genome, wherein L represents the length of the genome sequence, K represents the length of the library fragment, and K is 15-750bp;
preferably, the traditional Chinese medicine primordial species is Rheum palmatum, Rheum tanguticum or Rheum officinale;
preferably, the traditional Chinese medicine primordial species is Rheum officinale, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequences shown in SEQ ID NOs: 1 and 4;
the traditional Chinese medicine primordial species is Rheum officinale, the target sequence detection system is a TaqMan probe-based real-time PCR system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 17;
the traditional Chinese medicine primordial species is Rheum officinale, the target sequence detection system is a PCR system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 19; or
the traditional Chinese medicine primordial species is Rheum officinale, the target sequence detection system is a Sanger sequencing system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 20;
or preferably, the traditional Chinese medicine primordial species is Rheum palmatum, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 2;
or preferably, the traditional Chinese medicine primordial species is Rheum tanguticum, the target sequence detection system is a CRISPR/Cas12a system, and the standard specific target sequence comprises the target nucleotide sequence shown in SEQ ID NO: 3.
3. The method of claim 1 or 2, wherein the CRISPR/Cas12a system employs the crRNA shown in SEQ ID NO: 5, 6, 7 or 8 to detect the presence or absence of the standard specific target sequence in the DNA substrate;
or preferably, the TaqMan probe-based real-time PCR system employs the primers shown in SEQ ID NOs: 18 and 16 and the TaqMan probe shown as FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ to detect the presence or absence of the standard specific target sequence in the DNA substrate;
or preferably, the PCR system employs the primers shown in SEQ ID NOs: 19 and 16 to detect the presence or absence of the standard specific target sequence in the DNA substrate;
or preferably, the Sanger sequencing system employs the primers shown in SEQ ID NOs: 19 and 21 to detect the presence or absence of the standard specific target sequence in the DNA substrate.
4. A method according to any one of claims 1 to 3, wherein in step (3) the genomic DNA of the traditional Chinese medicine to be detected is amplified using a primer pair that specifically amplifies the standard specific target sequence, and the amplified standard specific target sequence is recovered as the DNA substrate; or the genomic DNA to be detected is amplified using primers that specifically amplify a DNA sequence containing the standard specific target sequence, and the amplified DNA sequence containing the standard specific target sequence is recovered as the DNA substrate;
further preferably, the CRISPR/Cas12a system comprises: gene editing buffer, cas12a, crRNA, nuclease-free water, DNA substrate, and fluorescent signal molecule;
further preferably, the Taqman probe-based real-time PCR system comprises: buffer solution, primer, nuclease-free water, DNA substrate and Taqman probe;
Further preferably, the PCR system comprises: buffer solution, forward primer, reverse primer, nuclease-free water and DNA substrate;
further preferably, the Sanger sequencing system comprises: buffer, sequencing primer, nuclease-free water and DNA substrate.
5. A standard specific target nucleotide of rhubarb, wherein the target nucleotide is at least one selected from the group consisting of: (1) the target nucleotide shown in SEQ ID NO: 1; (2) the target nucleotide shown in SEQ ID NO: 2; (3) the target nucleotide shown in SEQ ID NO: 3; (4) the target nucleotide shown in SEQ ID NO: 4; (5) the target nucleotide shown in SEQ ID NO: 17; (6) the target nucleotide shown in SEQ ID NO: 19; or (7) the target nucleotide shown in SEQ ID NO: 20.
6. Use of a standard specific target nucleotide according to claim 5 for species identification of rhubarb or a material derived from rhubarb, or for distinguishing rhubarb from its closely related species;
preferably, the rhubarb is selected from Rheum palmatum, Rheum tanguticum or Rheum officinale.
7. A primer pair for species identification of rhubarb or a material derived from rhubarb, comprising:
(1) SEQ ID NO:9 and SEQ ID NO:10;
(2) SEQ ID NO:11 and SEQ ID NO:12;
(3) SEQ ID NO:13 and SEQ ID NO:14;
(4) SEQ ID NO:15 and SEQ ID NO:16;
(5) SEQ ID NO:18 and SEQ ID NO:16;
(6) SEQ ID NO:19 and SEQ ID NO:16; or alternatively
(7) SEQ ID NO:19 and SEQ ID NO:21;
preferably, the rhubarb is selected from Rheum palmatum, Rheum tanguticum or Rheum officinale.
8. A kit for species identification of rhubarb or a material derived from rhubarb, wherein the kit comprises the primer pair of claim 7.
9. The kit of claim 8, wherein the kit further comprises PCR reaction reagents;
preferably, the PCR reaction reagents comprise: PCR amplification buffer, dNTPs, Taq DNA polymerase, MgCl2 and sterile ultrapure water;
preferably, the kit further comprises at least one selected from the group consisting of: reagents of a TaqMan probe system and reagents of a CRISPR/Cas12a system;
preferably, the reagents of the TaqMan probe system comprise: DNA polymerase, dNTPs, buffer solution, MgCl2, nuclease-free water and a TaqMan probe; further preferably, the TaqMan probe is shown as FAM-GCTTGAATGAAAGTCAGGCACTCCGCCA-BHQ;
or preferably, the reagents of the CRISPR/Cas12a system comprise: gene editing buffer, Cas protein, crRNA, nuclease-free water and fluorescent signal molecules; further preferably, the crRNA is a nucleic acid sequence represented by SEQ ID NO: 5, 6, 7 or 8.
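As a quick sanity check on the probe named in claim 9, its basic sequence properties can be computed directly from the string given there. The Python sketch below is an illustration only: the helper names (`gc_count`, `gc_fraction`, `reverse_complement`) are hypothetical, the FAM and BHQ labels are left out, and no thermodynamic values such as melting temperature are estimated.

```python
# Probe sequence copied from claim 9, without the FAM (5') and BHQ (3') labels.
PROBE = "GCTTGAATGAAAGTCAGGCACTCCGCCA"

# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq: str) -> int:
    """Count the G and C bases in an upper-case DNA sequence."""
    return seq.count("G") + seq.count("C")


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(f"length: {len(PROBE)} nt")
    print(f"GC fraction: {gc_fraction(PROBE):.3f}")
    print(f"reverse complement: {reverse_complement(PROBE)}")
```

The reverse complement is included because a probe may be designed against either strand of the amplicon; which strand this probe targets is not stated in this section.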
10. Use of the primer pair of claim 7 or the kit of claim 8 or 9 for: identifying rhubarb ingredients in a sample to be detected, identifying the species of rhubarb or a material derived from rhubarb, distinguishing traditional Chinese medicinal materials derived from rhubarb from their adulterants and counterfeits, or performing safety testing on food, medicines or health care products;
preferably, the sample to be detected is rhubarb, a tissue or organ of rhubarb, a traditional Chinese medicine containing rhubarb, or an adulterant or counterfeit of rhubarb;
preferably, the rhubarb sample to be detected is a sample from Rheum officinale, Rheum palmatum or Rheum tanguticum;
preferably, the rhubarb sample to be detected is a sample of single species origin or a mixture of samples of multiple species origins.
CN202310217601.7A 2023-03-03 Traditional Chinese medicine identification method based on time-base method and application Active CN116426671B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310217601.7A CN116426671B (en) 2023-03-03 Traditional Chinese medicine identification method based on time-base method and application

Publications (2)

Publication Number Publication Date
CN116426671A (en) 2023-07-14
CN116426671B (en) 2024-05-10

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112779266A (en) * 2019-11-06 2021-05-11 青岛清原化合物有限公司 Method for creating new gene in organism and application
CN113897415A (en) * 2021-10-21 2022-01-07 中国医学科学院药用植物研究所 Method for identifying three primitive species of rhubarb medicinal material and application
CN115087750A (en) * 2022-03-30 2022-09-20 中国医学科学院药用植物研究所 Eukaryotic organism species identification method based on whole genome analysis and application

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
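Li Peiqing (李沛青): "ITS sequence analysis of rhubarb from different producing areas" (不同产地大黄ITS序列分析), Liaoning Journal of Traditional Chinese Medicine (辽宁中医杂志), vol. 37, no. 1 *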

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant