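WO2023108430A1 - Method and use for identifying plant species based on whole genome analysis and genome editing - Google Patents

Method and use for identifying plant species based on whole genome analysis and genome editing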

Publication number
WO2023108430A1
Authority
WO
WIPO (PCT)
Prior art keywords
plant
target sequence
genome
identified
library
Prior art date
Application number
PCT/CN2021/138005
Other languages
English (en)
French (fr)
Inventor
宋经元
郝利军
许文杰
辛天怡
齐桂红
Original Assignee
中国医学科学院药用植物研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国医学科学院药用植物研究所
Priority to PCT/CN2021/138005 (WO2023108430A1)
Priority to CN202180026003.0A (CN115843318B)
Priority to US17/687,928 (US20230193301A1)
Publication of WO2023108430A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201: Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8213: Targeted insertion of genes into the plant genome by homologous recombination
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00: ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00: ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G16B35/20: Screening of libraries
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B99/00: Subject matter not provided for in other groups of this subclass
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Definitions

  • This application relates to the technical field of plant species identification, in particular to a plant species identification method and application based on whole genome analysis and genome editing.
  • CRISPR Clustered regularly interspaced short palindromic repeats
  • CRISPR/Cas CRISPR-associated proteins
  • the discovery and application of the trans-cleavage activity of Cas12a and Cas13a brought the CRISPR/Cas system into the field of detection and identification. Jennifer A. Doudna, Feng Zhang, Jin Wang and others successively developed methods such as DETECTR, SHERLOCK and HOLMES and successfully applied them to the detection and identification of viruses and bacteria.
  • the crRNA specifically recognizes the target sequence and guides Cas12a to bind to it, forming a ternary complex; the trans-cleavage activity of Cas12a is then activated and cleaves the fluorescent signal molecule, generating detectable fluorescence.
  • the reaction is carried out at 37°C, and the operation is simple, requiring only constant temperature and fluorescence detection instruments.
  • the GAGE method combines whole genome analysis (Genome Analysis) with a genome editing (Genome Editing) strategy and, for the first time, achieves plant species identification at the whole-genome level.
  • the GAGE method screens target sequences with a PAM from the whole genome and thereby obtains all PAM-bearing target sequences of the plant to be identified that could be used for species identification, fully exploiting the potential of the whole genome for species identification. It provides ample candidate target sequences for the subsequent selection of PAM-bearing specific target sequences for determining whether a plant to be detected is identical to the plant to be identified.
  • in theory, the GAGE method can screen out PAM-bearing specific target sequences for determining the identity of any plant to be detected with the plant to be identified, eliminating the risk of errors such as off-target effects; that is, the GAGE method can accurately determine the identity of any plant to be detected with the plant to be identified.
  • this application proposes a plant species identification method based on genome-wide analysis and genome editing, including the following steps:
  • Step 1 According to the whole genome sequence of the plant to be identified, a small fragment genome library is constructed.
  • the whole genome of the plant to be identified is divided into (L-K+1) fragments of length K to form a small-fragment genome library, the copy number of each fragment is calculated, and the genomic position of each fragment is then determined by alignment to the genome, where L represents the genome length and K represents the library fragment length.
  • Step 2: Extract candidate target sequences with a PAM from the whole genome of the plant to be identified. The PAM (protospacer adjacent motif) is determined by the selected genome editing system; for example, for the CRISPR/Cas12a system, motifs with TTTV at the 5' end or VAAA at the 3' end may be selected. Such knowledge is well known to those skilled in the art and is not repeated here.
  • PAM: protospacer adjacent motif
  • a PAM motif is detected for each fragment in the small-fragment genome library, and candidate target sequences with a PAM are extracted to construct a candidate target sequence library.
  • Step 3: Screen the candidate target sequences by alignment against the whole genomes of adulterants and closely related species, and select as target sequences those that exist only in the plant to be identified, preferably candidate target sequences located in regions of high intraspecific conservation and high interspecific variation. To account for off-target effects, the genomes of the adulterants and closely related species preferably contain no sequence that differs from a selected target sequence by at most n bases, where n is greater than or equal to 3.
  • the specificity of the target sequence can be further improved by increasing the value of n, or can be screened to obtain target sequences within a predetermined number range by adjusting the value of n.
  • Step 4 design and synthesize CRISPR RNA (crRNA) according to the selected genome editing system.
  • crRNA CRISPR RNA
  • by repeating steps 3 and 4, a target sequence library of the plant to be identified relative to its adulterants and closely related species, together with a matching crRNA sequence library, can be constructed.
  • Step 5: Extract the genomic DNA of the plant to be detected, amplify it and recover the target sequence as the DNA substrate, or use the extracted genomic DNA of the plant to be detected directly as the DNA substrate.
  • the genomic DNA to be detected can be amplified with primers that specifically amplify the target sequence, and the target sequence recovered as the DNA substrate; alternatively, it can be amplified with primers that specifically amplify a DNA sequence containing the target sequence, and that DNA sequence recovered as the DNA substrate.
  • Step 6: Carry out the reaction using at least six components: buffer, Cas protein, crRNA, nuclease-free water, DNA substrate and a fluorescent signal molecule such as an ssDNA reporter (fluorescent reporter).
  • the buffer and Cas protein are determined by the selected genome editing system. Taking the CRISPR/Cas12a system as an example, NEBuffer 2.1 and Lba Cas12a (Cpf1) may be selected, with Poly_A_FQ (5'-FAM-AAAAAAAAAA-BHQ-3') as the fluorescent signal molecule. The reaction conditions are as follows:
  • Fig. 1 is the flowchart of the GAGE method of the present disclosure
  • Fig. 2 is the candidate target sequence library of saffron
  • Figure 3 is a target sequence specificity analysis diagram
  • Figure 4 is the target sequence and matching crRNA in the saffron ITS2 region
  • Fig. 5 is the fluorescence detection result that the GAGE method of the present disclosure is applied to saffron;
  • Fig. 6 is the fluorescence detection result of identifying saffron by using the genomic DNA of the plant to be detected as the DNA substrate.
  • FIG. 1 shows a flow chart of the GAGE method of the present application.
  • the GAGE method of the present disclosure will be further described below in conjunction with the identification process of saffron as a specific implementation example.
  • the experimental methods that do not indicate specific conditions in the following examples are all implemented according to conventional conditions.
  • Saffron is the dried stigma of Crocus sativus (Iridaceae). It is a valuable traditional Chinese medicinal material with the effects of promoting blood circulation and removing blood stasis, cooling the blood and detoxifying, and relieving depression and calming the mind.
  • besides its medicinal use, saffron is also used as a food colorant and spice and is known as "red gold".
  • adulterants of saffron mainly include safflower, lotus stamen and corn silk.
  • Sequences with PAM were extracted from the saffron small fragment genome library to construct a candidate target sequence library.
  • Results A total of 178,043,117 candidate target sequences were screened from the whole genome of saffron, and 59,282,259 remained after deduplication.
  • according to the genome annotation, about 85% of the candidate target sequences lie in annotated regions and 15% in unannotated regions.
  • a total of 26,771,965 target sequences were located in the coding region, 21,275 were located in the non-coding region, and 1,997,115 in the coding region were located in the protein coding region, as shown in Figure 2.
  • Example 2 Selection of target sequences for identification of saffron
  • (1) Target sequences are screened from regions of high intraspecific conservation and strong interspecific variation; (2) the adulterant genomes contain no sequence that differs from a selected target sequence by at most n bases, where n is greater than or equal to 3.
  • the specific screening steps are as follows: (1) Data preparation: all published saffron sequences, together with the whole genome sequences of its adulterants safflower (Carthamus tinctorius), lotus (Nelumbo nucifera) and maize (Zea mays), are downloaded from the NCBI database (https://www.ncbi.nlm.nih.gov). (2) Screening of candidate targets conserved within saffron: Bowtie (v1.1.0) is used to align the saffron candidate target sequences obtained in 1.2 against the saffron sequences downloaded from the database, and the perfectly matching sequences are retained as candidate target sequences conserved within the species. (3) Screening of candidate targets specific to saffron: Bowtie (v1.1.0) is used to align the candidate target sequences obtained in (2) against the genomes of the saffron adulterants, and the sequences with no match in the adulterants within 3 base mismatches are retained as the selected target sequence library.
  • Figure 3 shows an analysis of the selected target sequence library.
  • a target sequence is selected from the selected target sequence library and named Cs_target1; as shown in Figure 4, it is located in the ITS2 region of saffron.
  • a crRNA matching Cs_target1 was designed and named Cs_crRNA, as shown in Figure 4.
  • the saffron was collected from Dingzhou, Hebei, the safflower was collected from Urumqi, Xinjiang, the lotus was collected from the Botanical Garden of Beijing Institute of Medicinal Botany, and the corn was collected from Nanning, Guangxi.
  • the plant samples were pulverized with a ball mill, and then the total DNA was extracted according to the instructions of the Plant Genomic DNA Kit provided by TIANGEN. The integrity of the total DNA was detected by 0.8% agarose gel electrophoresis, and then its purity and concentration were detected by a Nanodrop 2000C spectrophotometer.
  • the primer sequences are as follows:
  • Reverse primer P2: 5'-CTAGGAGGTGTGTGTGGGGA-3'
  • the PCR product was recovered and purified following the instructions of the Universal DNA Purification Kit provided by TIANGEN. The integrity of the target sequence was checked by 2% agarose gel electrophoresis, and its purity and concentration were measured with a Nanodrop 2000C spectrophotometer. The recovered ITS2 fragment was used as the DNA substrate in subsequent experiments.
  • Example 4: Identification of saffron by GAGE
  • Cs_crRNA was used as the crRNA, and the ITS2 fragments of saffron, safflower, lotus and maize were used as DNA substrates to set up the Cs (saffron), Ct (safflower), Nn (lotus), Zm (maize) and CK (blank control) groups, respectively.
  • Cs_crRNA was used as the crRNA, and the genomic DNA of saffron, safflower, lotus and maize was used as the DNA substrate to set up the Cs*, Ct*, Nn*, Zm* and CK groups, respectively.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Biotechnology (AREA)
  • Organic Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Microbiology (AREA)
  • Cell Biology (AREA)
  • Plant Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Botany (AREA)
  • Mycology (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

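The present invention discloses a method and use for identifying plant species based on whole genome analysis and genome editing, namely the GAGE method (Genome Analysis and Genome Editing). Specifically, the method comprises: screening the sequences with a PAM in the whole genome of the plant to be identified; aligning them against the genomes of adulterants and closely related species and selecting as target sequences those present only in the genome of the plant to be identified; and introducing a genome editing system to detect them. A crRNA (CRISPR RNA) is designed and synthesized for the selected target sequence; the crRNA guides the Cas protein to bind the target sequence and form a complex, after which the trans-cleavage activity of the Cas protein is activated and cleaves single-stranded DNA carrying a fluorescent signal group, and the plant species is identified by detecting the fluorescence signal.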

Description

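Method and use for identifying plant species based on whole genome analysis and genome editing
Technical Field
The present application relates to the technical field of plant species identification, and in particular to a method and use for identifying plant species based on whole genome analysis and genome editing.
Background Art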
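Since the beginning of human society, the identification of plant species has been closely tied to the exploration of nature, social development and scientific research. Although the total number of plant species on Earth has not been settled, it is without doubt enormous, and identifying and classifying these diverse, widely distributed and hard-to-distinguish plants is a long-term and arduous task. Early studies identified species mainly from phenotypes such as morphological characteristics and chemical constituents, but because phenotypes are affected by factors such as the environment and growth stage, they often fail to reflect the essence of a plant, its genotype. Since the 1980s, the introduction of DNA sequencing data has made genotype-based identification of plant species possible. As the carrier of all of a plant's genetic information, the whole genome is an ideal database for plant identification, and whole-genome-based identification is the future direction of plant species identification. In the past, owing to the lack of whole-genome resources and limited bioinformatics analysis capacity, molecular identification methods such as DNA barcoding usually focused on only a few specific regions and did not fully exploit the identification potential of the whole genome. As sequencing technology has advanced, more and more plant whole genomes have been published, and developments in computer hardware and software have greatly strengthened genome analysis; together these provide strong support for plant identification at the whole-genome level.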
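The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated proteins (CRISPR/Cas) system has attracted wide attention since its advent. Beyond genome editing, the discovery and application of the trans-cleavage activity of Cas12a and Cas13a brought the CRISPR/Cas system into the field of detection and identification. Jennifer A. Doudna, Feng Zhang, Jin Wang and others successively developed methods such as DETECTR, SHERLOCK and HOLMES and successfully applied them to the detection and identification of viruses and bacteria. In the CRISPR/Cas12a system, the crRNA specifically recognizes the target sequence and guides Cas12a to bind to it, forming a ternary complex; the trans-cleavage activity of Cas12a is then activated and cleaves the fluorescent signal molecule, producing detectable fluorescence. The reaction is carried out at 37°C and is simple to perform, requiring only a constant-temperature device and a fluorescence detector.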
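However, the related prior art is technically limited to screening target sequences with a PAM (protospacer adjacent motif) from within specific gene regions. Because the gene database available for screening is small, very few target sequences can be obtained, so the target sequences are insufficiently specific, errors such as off-target effects occur easily, and the needs of identifying different species are not well met. The whole genome contains all the genetic information of an organism and is an ideal database for species identification; screening sequence differences by whole-genome comparison and identifying species on that basis is the future direction of species identification. The present invention (hereinafter the GAGE method) combines whole genome analysis (Genome Analysis) with a genome editing (Genome Editing) strategy and, for the first time, achieves plant species identification at the whole-genome level. Compared with the prior art, the GAGE method screens target sequences with a PAM from the whole genome and thereby obtains all PAM-bearing target sequences of the plant to be identified that could be used for species identification. This fully exploits the potential of the whole genome for species identification and provides ample candidate target sequences for the subsequent selection, by comparison with the genome of the plant to be identified, of PAM-bearing specific target sequences for determining whether a plant to be detected is identical to the plant to be identified. Given the vast information contained in a genome and the wide distribution of PAM-bearing target sequences, the GAGE method can in theory screen out PAM-bearing specific target sequences for determining the identity of any plant to be detected with the plant to be identified, eliminating the risk of errors such as off-target effects; that is, the GAGE method can accurately determine the identity of any plant to be detected with the plant to be identified.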
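Summary of the Invention
To this end, the present application proposes a method for identifying plant species based on whole genome analysis and genome editing, comprising the following steps: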
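Step 1. Construct a small-fragment genome library from the whole genome sequence of the plant to be identified. In some embodiments, the whole genome of the plant to be identified is divided into (L-K+1) fragments of length K to form the small-fragment genome library, the copy number of each fragment is calculated, and the genomic position of each fragment is then determined by alignment to the genome, where L denotes the genome length and K denotes the library fragment length.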
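The fragment library of Step 1 can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the function name `build_kmer_library` and the toy sequence are hypothetical, only the forward strand is counted, and fragment positions are not recorded.

```python
from collections import Counter


def build_kmer_library(genome: str, k: int = 25) -> Counter:
    """Count every length-k window of the genome (forward strand only)."""
    genome = genome.upper()
    if k <= 0 or len(genome) < k:
        return Counter()
    # A genome of length L yields exactly L - k + 1 overlapping windows.
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


# Hypothetical 17-nt toy sequence, not real saffron data.
toy_genome = "TTTAGCATGCATGCAAA"
library = build_kmer_library(toy_genome, k=5)
total_fragments = sum(library.values())   # L - K + 1 = 17 - 5 + 1
distinct_fragments = len(library)         # fragments left after deduplication
```

The counter values play the role of the copy number of each fragment; a genome-scale library would need a dedicated k-mer counter rather than an in-memory dictionary.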
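Step 2. Extract candidate target sequences with a PAM from the whole genome of the plant to be identified. The PAM (protospacer adjacent motif) is determined by the selected genome editing system; for example, for the CRISPR/Cas12a system, motifs with TTTV at the 5' end or VAAA at the 3' end may be selected. Such knowledge is well known to those skilled in the art and is not repeated here. Preferably, each fragment in the small-fragment genome library is checked for a PAM motif, and the candidate target sequences with a PAM are extracted to construct a candidate target sequence library.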
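The PAM check of Step 2 might look like the following sketch, assuming the Cas12a motifs exactly as written above, with V read as the IUPAC code for A, C or G. The helper names and the toy fragments are hypothetical.

```python
import re

# Motifs taken literally from the text: TTTV at the 5' end, VAAA at the 3' end.
# Note: the strict reverse complement of TTTV would be BAAA (B = C, G or T);
# this sketch keeps VAAA as written.
FIVE_PRIME_PAM = re.compile(r"^TTT[ACG]")
THREE_PRIME_PAM = re.compile(r"[ACG]AAA$")


def has_pam(fragment: str) -> bool:
    """Return True if the fragment starts with TTTV or ends with VAAA."""
    fragment = fragment.upper()
    return bool(FIVE_PRIME_PAM.search(fragment) or THREE_PRIME_PAM.search(fragment))


def extract_candidates(fragments):
    """Keep PAM-bearing fragments, deduplicated and sorted."""
    return sorted({f.upper() for f in fragments if has_pam(f)})


# Hypothetical 10-nt fragments for illustration only.
fragments = ["TTTAGCATGC", "GGCATGCAAA", "TTTTGCATGC", "ACGTACGTAC", "tttagcatgc"]
candidates = extract_candidates(fragments)
```

Keeping the two motifs as module-level patterns makes it straightforward to swap in the PAM of a different genome editing system.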
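Step 3. Screen the candidate target sequences by alignment against the whole genomes of adulterants and closely related species, and select as target sequences those that exist only in the plant to be identified, preferably candidate target sequences located in regions of high intraspecific conservation and high interspecific variation. To account for off-target effects, the genomes of the adulterants and closely related species preferably contain no sequence that differs from a selected target sequence by at most n bases, where n is greater than or equal to 3. Preferably, the specificity of the target sequences can be further improved by increasing n, or n can be adjusted so that the number of target sequences obtained falls within a predetermined range.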
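The mismatch criterion of Step 3 can be sketched as a brute-force scan. This is an illustration under stated assumptions, not the disclosed pipeline (the examples use Bowtie): only substitutions on the forward strand are considered, and the function names and toy genomes are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def min_mismatches(target: str, genome: str):
    """Smallest mismatch count between target and any same-length genome window."""
    k = len(target)
    if len(genome) < k:
        return None
    return min(hamming(target, genome[i:i + k]) for i in range(len(genome) - k + 1))


def is_specific(target: str, other_genomes, n: int = 3) -> bool:
    """True if no other genome holds a window within n mismatches of target."""
    for genome in other_genomes:
        d = min_mismatches(target, genome)
        if d is not None and d <= n:
            return False
    return True


# Hypothetical toy sequences for illustration only.
target = "TTTAGCATGC"
near_genome = "GGGTTTAGCTTGCAAA"   # contains a window one mismatch from target
far_genome = "CCCCCCCCCCCCCCC"
```

Raising `n` rejects more candidates, which mirrors the statement that a larger n gives higher specificity and a smaller target sequence library.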
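Step 4. Design and synthesize the CRISPR RNA (crRNA) for the selected target sequence according to the selected genome editing system. Preferably, steps 3 and 4 can be repeated to construct a library of target sequences of the plant to be identified relative to its adulterants and closely related species, together with a matching crRNA sequence library.
Step 5. Extract the genomic DNA of the plant to be detected, amplify it and recover the target sequence as the DNA substrate, or use the extracted genomic DNA of the plant to be detected directly as the DNA substrate. For example, the genomic DNA to be detected can be amplified with primers that specifically amplify the target sequence, and the target sequence recovered as the DNA substrate; alternatively, it can be amplified with primers that specifically amplify a DNA sequence containing the target sequence, and that DNA sequence recovered as the DNA substrate.
Step 6. According to the selected genome editing system, carry out the reaction using at least six components: buffer, Cas protein, crRNA, nuclease-free water, DNA substrate and a fluorescent signal molecule such as an ssDNA reporter (fluorescent reporter).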
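Specifically, the buffer and Cas protein are determined by the selected genome editing system. Taking the CRISPR/Cas12a system as an example, NEBuffer 2.1 and Lba Cas12a (Cpf1) may be selected, with Poly_A_FQ (5'-FAM-AAAAAAAAAA-BHQ-3') as the fluorescent signal molecule. The reaction conditions are as follows:
5.1 Prepare the following reaction system
Figure PCTCN2021138005-appb-000001
5.2 Incubate at room temperature for 30 minutes
5.3.1 Using the amplified and recovered target sequence as the DNA substrate
Add 10 μL of the amplified and recovered target sequence (1 ng/μL) and 4 μL of Poly_A_FQ (400 nM), incubate at 37°C, and measure the fluorescence with a microplate reader at λex 483 nm/λem 535 nm (determined by the selected fluorescent signal molecule) at 0, 3, 6, 9, 12, 15, 25, 35, 45 and 60 minutes.
5.3.2 Using genomic DNA as the DNA substrate
Add 10 μL of genomic DNA (10 ng/μL) and 4 μL of Poly_A_FQ (400 nM) and incubate at 37°C for 60 minutes; then continue incubating at 37°C and measure the fluorescence with a microplate reader at λex 483 nm/λem 535 nm (determined by the selected fluorescent signal molecule) at 0, 3, 6, 9, 12, 15, 25, 35, 45, 60, 75, 105, 135 and 165 minutes.
If the detection result differs significantly from the blank control (P<0.01), the plant to be detected is determined to be identical to the plant to be identified; otherwise it is not.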
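The decision rule above can be sketched in Python. The text does not name the statistical test, so the exact two-sided permutation test used here is an assumption, and the replicate fluorescence values are invented purely for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(sample, blank) -> float:
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(sample) + list(blank)
    n = len(sample)
    m = len(pooled) - n
    observed = abs(mean(sample) - mean(blank))
    pooled_sum = sum(pooled)
    hits = 0
    splits = 0
    # Enumerate every way of relabelling n of the pooled readings as "sample".
    for idx in combinations(range(len(pooled)), n):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / n - (pooled_sum - s) / m)
        if diff >= observed - 1e-12:
            hits += 1
        splits += 1
    return hits / splits


def same_species(sample, blank, alpha: float = 0.01) -> bool:
    """Apply the rule: significant difference from the blank means identity."""
    return permutation_p_value(sample, blank) < alpha


# Hypothetical endpoint fluorescence readings (arbitrary units), five replicates each.
blank_ck = [210, 190, 205, 200, 195]
positive = [5200, 5100, 5300, 5250, 5150]
negative = [205, 198, 201, 207, 193]
```

With five replicates per group the smallest attainable two-sided p-value is 2/252, so fewer replicates could never reach the P<0.01 threshold under this particular test.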
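The present invention (GAGE) is further described below with reference to the accompanying drawings, so as to fully explain its purpose, technical features and technical effects.
Brief Description of the Drawings
To explain the technical solutions of the present disclosure or the related art more clearly, the drawings used in describing the embodiments or the related art are briefly introduced below. The following drawings are merely embodiments of the present disclosure, and the scope of protection of the present invention is not limited to them.
Fig. 1 is a flowchart of the GAGE method of the present disclosure;
Fig. 2 shows the candidate target sequence library of saffron;
Fig. 3 is an analysis diagram of target sequence specificity;
Fig. 4 shows the target sequence in the saffron ITS2 region and the matching crRNA;
Fig. 5 shows the fluorescence detection results of applying the GAGE method of the present disclosure to saffron;
Fig. 6 shows the fluorescence detection results of identifying saffron using the genomic DNA of the plant to be detected as the DNA substrate.
Detailed Description
Fig. 1 shows a flowchart of the GAGE method of the present application. The GAGE method of the present disclosure is further described below, using the identification of saffron as a specific example. Experimental methods for which no specific conditions are given in the following examples were carried out under conventional conditions.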
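Example 1: Construction of the saffron small-fragment genome library and target sequence library
Saffron is the dried stigma of Crocus sativus (Iridaceae). It is a valuable traditional Chinese medicinal material with the effects of promoting blood circulation and removing blood stasis, cooling the blood and detoxifying, and relieving depression and calming the mind. Besides its medicinal use, saffron is also used as a food colorant and spice and is known as "red gold". Adulterants of saffron mainly include safflower, lotus stamen and corn silk.
1.1 Construction of the saffron small-fragment genome library
The whole genome of saffron (Crocus sativus) was selected and divided with Jellyfish (v1.1.12) into (L-25+1) sequences of 25 bp, where L is the genome length, to construct the small-fragment genome library.
1.2 Construction of the saffron candidate target sequence library
Sequences with a PAM (this example uses the CRISPR/Cas12a system, whose PAM has TTTV at the 5' end or VAAA at the 3' end) were extracted from the saffron small-fragment genome library to construct the candidate target sequence library. In total, 178,043,117 candidate target sequences were screened from the whole genome of saffron, of which 59,282,259 remained after deduplication. According to the genome annotation, about 85% of the candidate target sequences lie in annotated regions and 15% in unannotated regions. A total of 26,771,965 target sequences lie in coding regions and 21,275 in non-coding regions; of those in coding regions, 1,997,115 lie in protein-coding regions, as shown in Fig. 2.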
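Example 2: Selection of target sequences for identifying saffron
2.1 Screening of target sequences
Screening followed two principles: (1) target sequences are screened from regions of high intraspecific conservation and strong interspecific variation; (2) the adulterant genomes contain no sequence that differs from a selected target sequence by at most n bases, where n is greater than or equal to 3.
The specific screening steps were as follows: (1) Data preparation: all published saffron sequences, together with the whole genome sequences of its adulterants safflower (Carthamus tinctorius), lotus (Nelumbo nucifera) and maize (Zea mays), were downloaded from the NCBI database (https://www.ncbi.nlm.nih.gov). (2) Screening of candidate targets conserved within saffron: Bowtie (v1.1.0) was used to align the saffron candidate target sequences obtained in 1.2 against the saffron sequences downloaded from the database, and the perfectly matching sequences were retained as candidate target sequences conserved within the species. (3) Screening of candidate targets specific to saffron: Bowtie (v1.1.0) was used to align the candidate target sequences obtained in (2) against the genomes of the saffron adulterants, and the sequences with no match in the adulterants within 3 base mismatches were retained as the selected target sequence library. Fig. 3 shows an analysis of the selected target sequence library.
In this example, one target sequence was selected from the selected target sequence library and named Cs_target1; as shown in Fig. 4, it is located in the ITS2 region of saffron.
2.2 Design of the crRNA matching the target sequence
A crRNA matching Cs_target1 was designed according to the selected genome editing system and crRNA design principles and named Cs_crRNA, as shown in Fig. 4.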
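Example 3: Amplification and purification of the target sequence
3.1 Plant DNA extraction
Saffron was collected from Dingzhou, Hebei; safflower from Urumqi, Xinjiang; lotus from the botanical garden of the Institute of Medicinal Plant Development, Beijing; and maize from Nanning, Guangxi. Plant samples were ground with a ball mill, and total DNA was extracted following the instructions of the Plant Genomic DNA Kit provided by TIANGEN. The integrity of the total DNA was checked by 0.8% agarose gel electrophoresis, and its purity and concentration were measured with a Nanodrop 2000C spectrophotometer.
3.2 Amplification of the target sequence
Because universal primers are available for the ITS2 region in which the target sequence is located, these universal primers were used directly to amplify the target sequence for purification. The primer sequences are as follows:
Forward primer P1: 5'-ATGGCGTTTTGTGACGAAG-3'
Reverse primer P2: 5'-CTAGGAGGTGTGTGTGGGGA-3'
The total PCR volume was 50 μL: 25 μL of 2× Taq MasterMix, 2 μL of primer (F/R) (10 μM), 2 μL of total DNA sample, and nuclease-free H2O to 50 μL. PCR conditions: 95°C for 30 s; 35 cycles of 95°C for 5 s, 58°C for 30 s and 72°C for 2 min; 72°C for 10 min; hold at 10°C.
3.3 Purification and recovery of the PCR product
The PCR product was recovered and purified following the instructions of the Universal DNA Purification Kit provided by TIANGEN. The integrity of the target sequence was checked by 2% agarose gel electrophoresis, and its purity and concentration were measured with a Nanodrop 2000C spectrophotometer. The recovered ITS2 fragment was used as the DNA substrate in subsequent experiments.
Example 4: Identification of saffron by GAGE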
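Cs_crRNA was used as the crRNA, and the ITS2 fragments of saffron, safflower, lotus and maize were used as DNA substrates to set up the Cs (saffron), Ct (safflower), Nn (lotus), Zm (maize) and CK (blank control) groups, respectively. The experiment used EnGen Lba Cas12a (Cpf1) from NEB in a total reaction volume of 100 μL: 10 μL of 10× NEBuffer 2.1, 2 μL of Lba Cas12a (20 nM), 3 μL of Cs_crRNA (300 nM), 10 μL of DNA substrate (1 ng/μL), 4 μL of Poly_A_FQ (400 nM) and 71 μL of nuclease-free H2O. NEBuffer 2.1, Lba Cas12a, Cs_crRNA and nuclease-free H2O were first combined and incubated at room temperature for 30 minutes; the DNA substrate and Poly_A_FQ were then added, the reaction was incubated at 37°C, and fluorescence was measured with a microplate reader at λex 483 nm/λem 535 nm at 0, 3, 6, 9, 12, 15, 25, 35, 45 and 60 minutes.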
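The reaction composition of this example can be checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary keys are illustrative labels, and because the text does not say whether the concentrations in parentheses for Cas12a, crRNA and Poly_A_FQ are stock or final values, only the buffer and DNA substrate dilutions are computed.

```python
# Volumes in microliters, as listed for the 100 uL reaction of Example 4.
reaction_ul = {
    "10x NEBuffer 2.1": 10,
    "Lba Cas12a": 2,
    "Cs_crRNA": 3,
    "DNA substrate": 10,
    "Poly_A_FQ": 4,
    "nuclease-free H2O": 71,
}

# Components combined first and incubated at room temperature for 30 minutes.
PRE_INCUBATED = ("10x NEBuffer 2.1", "Lba Cas12a", "Cs_crRNA", "nuclease-free H2O")

total_ul = sum(reaction_ul.values())
pre_mix_ul = sum(reaction_ul[name] for name in PRE_INCUBATED)

# Dilution arithmetic: 10x buffer stock and 1 ng/uL amplified substrate.
buffer_final_x = 10 * reaction_ul["10x NEBuffer 2.1"] / total_ul
dna_final_ng_per_ul = 1 * reaction_ul["DNA substrate"] / total_ul

# Fluorescence read times in minutes after adding substrate and reporter.
read_times_min = [0, 3, 6, 9, 12, 15, 25, 35, 45, 60]
```

The component volumes sum to the stated 100 μL, which gives a 1× buffer and 0.1 ng/μL of amplified substrate in the final reaction.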
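The results are shown in Fig. 5. Only the Cs group produced a fluorescence signal; its fluorescence reached a maximum at about 25 minutes and then plateaued, differing significantly from the CK group (P<0.01). The Ct, Zm and Nn groups behaved like the CK group: none produced a fluorescence signal, and their fluorescence values did not differ significantly from the CK group (P>0.01). These results show that the GAGE method identifies saffron accurately and conveniently.
Example 5: Identification of saffron directly from genomic DNA
Cs_crRNA was used as the crRNA, and the genomic DNA of saffron, safflower, lotus and maize was used as the DNA substrate to set up the Cs*, Ct*, Nn*, Zm* and CK groups, respectively. The experiment used EnGen Lba Cas12a (Cpf1) from NEB in a total reaction volume of 100 μL: 10 μL of 10× NEBuffer 2.1, 2 μL of Lba Cas12a (20 nM), 3 μL of Cs_crRNA (300 nM), 10 μL of DNA substrate (10 ng/μL), 4 μL of Poly_A_FQ (400 nM) and 71 μL of nuclease-free H2O. NEBuffer 2.1, Lba Cas12a, Cs_crRNA and nuclease-free H2O were first combined and incubated at room temperature for 30 minutes; the DNA substrate and Poly_A_FQ were then added and the reaction was incubated at 37°C for 60 minutes, after which incubation continued at 37°C and fluorescence was measured with a microplate reader at λex 483 nm/λem 535 nm at 0, 3, 6, 9, 12, 15, 25, 35, 45, 60, 75, 105, 135 and 165 minutes.
The results are shown in Fig. 6. Only the Cs* group produced a fluorescence signal, and its fluorescence increased over time, differing significantly from the CK group (P<0.01). The Ct*, Zm* and Nn* groups behaved like the CK group: none produced a fluorescence signal, and their fluorescence values did not differ significantly from the CK group (P>0.01). These results show that, in the GAGE method, saffron can also be identified accurately and conveniently using genomic DNA directly, without amplification.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is merely exemplary and is not intended to imply that the scope of protection of the present disclosure is limited to these embodiments. Within the spirit of the present disclosure, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be carried out in any order, and many other variations of the different aspects of the embodiments described above exist, which are not given in detail for brevity. The embodiments of the present disclosure are intended to cover all such alternatives, modifications and variations that fall within the broad scope of the appended claims. Accordingly, any omission, modification, equivalent replacement or improvement made within the spirit and principles of the embodiments of the present disclosure shall fall within the scope of protection of the present disclosure.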

Claims (10)

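  1. A method for identifying plant species based on whole genome analysis and genome editing, characterized by comprising the following steps:
    Step 1. constructing a small-fragment genome library from the whole genome sequence of the plant to be identified;
    Step 2. extracting candidate target sequences with a PAM from the small-fragment genome library to construct a candidate target sequence library;
    Step 3. screening the candidate target sequences by alignment against the whole genomes of adulterants and closely related species, and selecting as target sequences those present only in the plant to be identified;
    Step 4. designing and synthesizing a crRNA according to the target sequence;
    Step 5. extracting the genomic DNA of the plant to be detected, amplifying it and recovering the target sequence as the DNA substrate, or using the extracted genomic DNA of the plant to be detected directly as the DNA substrate;
    Step 6. according to the selected genome editing system, carrying out a reaction using at least six components including buffer, Cas protein, crRNA, nuclease-free water, the DNA substrate of the plant to be detected and an ssDNA reporter, and performing fluorescence detection; if the detection result differs significantly from the blank control (P<0.01), the plant to be detected is determined to be identical to the plant to be identified, and otherwise it is not.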
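  2. The method according to claim 1, characterized in that constructing the small-fragment genome library in step 1 comprises: dividing the whole genome of the plant to be identified into (L-K+1) fragments of length K, the fragments forming the small-fragment genome library; calculating their copy numbers; and determining the genomic position of each fragment by alignment to the genome, where L denotes the genome length and K denotes the library fragment length.
  3. The method according to claim 1, characterized in that the candidate target sequences in step 2 are extracted from across the whole genome of the plant to be identified rather than being limited to specific regions.
  4. The method according to claim 2, characterized in that step 2 further comprises checking each fragment in the small-fragment genome library for a PAM motif and extracting the candidate target sequences with a PAM to construct the candidate target sequence library.
  5. The method according to claim 1, characterized in that step 3 further comprises: aligning the candidate target sequences obtained in step 2 against the whole genomes of the adulterants and closely related species of the plant to be identified, wherein the genomes of the adulterants and closely related species contain no sequence that differs from a selected target sequence by at most n bases, where n is greater than or equal to 3.
  6. The method according to claim 5, characterized in that n is adjusted so that the number of target sequences obtained falls within a predetermined range.
  7. The method according to claim 1, characterized in that step 5 further comprises: amplifying the genomic DNA of the plant to be detected with primers that specifically amplify the target sequence and recovering the target sequence as the DNA substrate; or amplifying the genomic DNA of the plant to be detected with primers that specifically amplify a DNA sequence containing the target sequence and recovering the DNA sequence containing the target sequence as the DNA substrate.
  8. The method according to claim 1, characterized in that the genome editing system used in step 6 to detect the target sequence comprises a system based on the CRISPR/Cas strategy, preferably the CRISPR/Cas12a system or the CRISPR/Cas13a system.
  9. The method according to claim 1, characterized in that the method further comprises: repeating steps 3 and 4 to construct a library of target sequences of the plant to be identified relative to its adulterants and closely related species, together with a matching crRNA sequence library.
  10. Use of the method according to claim 1 for identifying plant species, characterized by comprising: selecting the plant to be identified according to the traits of the plant to be detected, carrying out the method according to claim 1, and determining from the fluorescence detection result whether the plant to be detected is identical to the plant to be identified.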
PCT/CN2021/138005 2021-12-14 2021-12-14 Method and use for identifying plant species based on whole genome analysis and genome editing WO2023108430A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
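PCT/CN2021/138005 WO2023108430A1 (zh) 2021-12-14 2021-12-14 Method and use for identifying plant species based on whole genome analysis and genome editing
CN202180026003.0A CN115843318B (zh) 2021-12-14 2021-12-14 Method and use for identifying plant species based on whole genome analysis and genome editing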
US17/687,928 US20230193301A1 (en) 2021-12-14 2022-03-07 Method and use for identifying plant species based on whole genome analysis and genome editing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/138005 WO2023108430A1 (zh) 2021-12-14 2021-12-14 Method and use for identifying plant species based on whole genome analysis and genome editing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/687,928 Continuation US20230193301A1 (en) 2021-12-14 2022-03-07 Method and use for identifying plant species based on whole genome analysis and genome editing

Publications (1)

Publication Number Publication Date
WO2023108430A1 (zh) 2023-06-22

Family

ID=85575492

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138005 WO2023108430A1 (zh) 2021-12-14 2021-12-14 Method and use for identifying plant species based on whole genome analysis and genome editing

Country Status (3)

Country Link
US (1) US20230193301A1 (zh)
CN (1) CN115843318B (zh)
WO (1) WO2023108430A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116287416B (zh) * 2023-04-06 2024-03-15 中国医学科学院药用植物研究所 Fungal species identification method based on the Shizhen method, target nucleotide, primer pair, kit and use thereof

Citations (3)

Publication number Priority date Publication date Assignee Title
WO2021068086A1 (en) * 2019-10-09 2021-04-15 The Governing Council Of The University Of Toronto A molecular sensing platform and methods of use
US20210171973A1 (en) * 2018-07-04 2021-06-10 Guangdong Sanjie Forage Biotechnology Co., Ltd Method of Obtaining Multileaflet Medicago Sativa Materials by Means of MsPALM1 Artificial Site-Directed Mutants
CN113201586A (zh) * 2021-04-22 2021-08-03 中南大学 (Central South University) Detection method based on Cas protein

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
WO2015189693A1 (en) * 2014-06-12 2015-12-17 King Abdullah University Of Science And Technology Targeted viral-mediated plant genome editing using crispr/cas9
WO2018122248A1 (en) * 2016-12-29 2018-07-05 Johann Wolfgang Goethe-Universität Frankfurt am Main Method for generating higher order genome editing libraries
US11398294B2 (en) * 2017-06-28 2022-07-26 Institute Of Medicinal Plant Development, Chinese Academy Of Medical Science Method for controlling the quality of traditional Chinese patent medicines based on metagenomics
CN108205614A (zh) * 2017-12-29 2018-06-26 苏州金唯智生物科技有限公司 System for constructing a whole-genome sgRNA library and use thereof
EP3931313A2 (en) * 2019-01-04 2022-01-05 Mammoth Biosciences, Inc. Programmable nuclease improvements and compositions and methods for nucleic acid amplification and detection
CN110066852B (zh) * 2019-05-29 2022-07-22 复旦大学 (Fudan University) Method and system for detecting CRISPR/Cas PAM sequences in mammalian cells
WO2021046257A1 (en) * 2019-09-03 2021-03-11 The Broad Institute, Inc. Crispr effector system based multiplex cancer diagnostics
CN113174433B (zh) * 2021-04-22 2024-03-26 苏州淦江生物技术有限公司 Detection method based on Cas protein

Non-Patent Citations (3)

Title
KANITCHINDA SUTHASINEE, SRISALA JIRAPORN, SUEBSING RUNGKARN, PRACHUMWAT ANUPHAP, CHAIJARASPHONG THAWATCHAI: "CRISPR-Cas fluorescent cleavage assay coupled with recombinase polymerase amplification for sensitive and specific detection of Enterocytozoon hepatopenaei", BIOTECHNOLOGY REPORTS, ELSEVIER, vol. 27, 1 September 2020 (2020-09-01), pages e00485, XP093073801, ISSN: 2215-017X, DOI: 10.1016/j.btre.2020.e00485 *
LI NAN, LI XIAOMAN, ZHANG XUEFU : "Trend and Topic Analysis of Global Plant Science Research", CHINESE AGRICULTURAL SCIENCE BULLETIN, vol. 36, no. 34, 1 January 2020 (2020-01-01), pages 148 - 159, XP093073798, DOI: 10.11924/j.issn.1000-6850.casb20200300216 *
ZHOU XINCHENG, XIA ZHIQIANG, CHEN XIN, ZOU MEILING, LU CHENG, WANG HAIYAN, WANG WENQUAN: "Advances of Genomics and Its Utilization in Tropical Crops ", CHINESE JOURNAL OF TROPICAL CROPS, vol. 41, no. 10, 1 January 2020 (2020-01-01), pages 2130 - 2142, XP093073792, ISSN: 1000-2561, DOI: 10.3969/j.issn.1000-2561.2020.10.019 *

Also Published As

Publication number Publication date
CN115843318B (zh) 2023-07-18
US20230193301A1 (en) 2023-06-22
CN115843318A (zh) 2023-03-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21967570

Country of ref document: EP

Kind code of ref document: A1