WO2023034846A1 - Novel molecular markers for sex determination in cannabis - Google Patents

Novel molecular markers for sex determination in cannabis

Info

Publication number
WO2023034846A1
Authority
WO
WIPO (PCT)
Prior art keywords
label
seq
nucleotide
set forth
corresponds
Prior art date
Application number
PCT/US2022/075732
Other languages
French (fr)
Inventor
Stephen Ezra SCHAUER
Koreen RAMESSAR
Walter Edward NELSON
Rudie Gerardus Cornelia ANTONISE
Original Assignee
22nd Century Limited, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 22nd Century Limited, LLC
Publication of WO2023034846A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6879 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for sex determination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Definitions

  • the method further comprises: (a) selecting a female Cannabis plant when SNP 1 is homozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or (b) selecting a male Cannabis plant when SNP 1 is heterozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1.
  • the detecting comprises: (a) contacting the nucleic acid sample with sets of forward and reverse primers selected from the group consisting of: (i) two forward primers for identifying SNP 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; (ii) two forward primers for identifying SNP 2, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8; and (iii) primers having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about
  • the fluorescent label is selected from the group consisting of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar
  • the PCR is Kompetitive Allele Specific PCR (KASP).
  • all of the primer sets are contained together in an amplification mix further comprising DNA polymerase, dNTPs, and PCR buffer prior to contacting with the nucleic acid sample.
  • the Cannabis plant is a Cannabis sativa plant.
  • the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
  • the present disclosure provides a kit for identifying the sex of a Cannabis plant, the kit comprising at least one set of forward and reverse primers selected from the group consisting of: (a) a first primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 1 having an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; and (b) a second primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 2 having a C or an
  • the primer sets are present together in an amplification mix that further comprises DNA polymerase, dNTPs, and PCR buffer.
  • the fluorescent label is selected from one or more of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670
  • FAM fluorescein
  • the Cannabis plant is a Cannabis sativa plant.
  • the present disclosure provides a primer selected from the group consisting of: (a) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 2; (b) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 3; (c) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 6; (d) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 7; and (e) a nucleotide sequence having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to any one of the nucleotide sequences of (a)-(d), wherein
  • the present disclosure provides a method for producing a population of female Cannabis plants, the method comprising: (a) selecting a female Cannabis plant when the presence of one or more homozygous single nucleotide polymorphisms (SNPs) in a nucleic acid sample obtained from the Cannabis plant is detected, wherein the one or more SNPs are selected from: (i) a SNP identified as SNP 1, wherein SNP 1 is homozygous for an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or (ii) a SNP identified as SNP 2, wherein SNP 2 is homozygous for a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; and (b) planting or growing a crop of the female Cannabis plants.
  • SNPs single nucleotide polymorphisms
  • the homozygous SNPs are detected by a competitive allele-specific polymerase chain reaction (PCR) assay, comprising: (a) contacting the nucleic acid sample with sets of forward and reverse primers selected from the group consisting of: (i) two forward primers for identifying SNP 1 comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; (ii) two forward primers for identifying SNP 2 comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8; and (iii) primers having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%
  • the fluorescent label is selected from the group consisting of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 6
  • FAM fluorescein
  • the PCR assay is a Kompetitive Allele Specific PCR (KASP) assay.
  • KASP Kompetitive Allele Specific PCR
  • all of the primer sets are contained together in an amplification mix further comprising DNA polymerase, dNTPs, and PCR buffer prior to contacting with the nucleic acid sample.
  • the Cannabis plant is a Cannabis sativa (C. sativa), C. indica, or C. ruderalis plant.
  • the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
  • the present disclosure provides a method of detecting the presence of at least one single nucleotide polymorphism (SNP) on a plurality of polynucleotide analytes in a nucleic acid sample obtained from a Cannabis plant, comprising a) amplifying the plurality of polynucleotide analytes with at least a first oligonucleotide set that is specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 or a second oligonucleotide set that is specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 to generate amplified polynucleotide analytes; and b) detecting amplified polynucleotide analytes comprising a first oligonucleotide set that
  • the method further comprises determining whether the nucleic acid sample is homozygous for an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the method further comprises determining whether the nucleic acid sample is heterozygous at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the method further comprises determining whether the nucleic acid sample is homozygous for a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the method further comprises determining whether the nucleic acid sample is heterozygous at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5
  • homozygous for an A and/or C is indicative of a female Cannabis plant.
  • heterozygous at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a male Cannabis plant.
  • the present disclosure provides a method of detecting the presence of a single nucleotide polymorphism (SNP) on a polynucleotide analyte in a nucleic acid sample obtained from a Cannabis plant, comprising a) hybridizing a polynucleotide analyte with a set of detectably labeled probes, wherein the set comprises a first probe that is specific for an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and is labeled with a first fluorophore, and a second probe that is specific for a G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and is labeled with a second fluorophore, wherein the first fluorophore and the second fluorophore are different; and b) detecting one or more fluorescent signals to determine whether the polynucleotide analyte comprises an A or
  • the present disclosure provides a method of detecting the presence of a single nucleotide polymorphism (SNP) on a polynucleotide analyte in a nucleic acid sample obtained from a Cannabis plant, comprising a) hybridizing a polynucleotide analyte with a set of detectably labeled probes, wherein the set comprises a third probe that is specific for a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 and is labeled with a third fluorophore, and a fourth probe that is specific for an A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 and is labeled with a fourth fluorophore, wherein the third fluorophore and the fourth fluorophore are different; and b) detecting one or more fluorescent signals to determine whether the polynucleotide analyte comprises a C or A
  • the first fluorophore and the third fluorophore are the same.
  • the second fluorophore and the fourth fluorophore are the same.
  • each probe within the set is at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides in length.
  • the set further comprises two or more additional probes, wherein the additional probes are specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
  • the set further comprises two or more additional probes, wherein the additional probes are specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
  • the method further comprises subjecting the nucleic acid sample under an amplification condition to generate the polynucleotide analyte prior to hybridizing the polynucleotide analyte with the set of detectably labeled probes.
  • the presence of an A and the absence of a G is indicative that the nucleic acid sample is homozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
  • the presence of an A and a G is indicative that the nucleic acid sample is heterozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
  • homozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a female Cannabis plant.
  • heterozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a male Cannabis plant.
  • the Cannabis plant is a Cannabis sativa plant.
  • the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
  • Figure 3 is a Principal Coordinates Analysis (PCO) plot of the 34 Ch samples genotyped with 7,912 SNPs.
  • Figures 4A and 4B are charts for the two KASP assays identified.
  • Figure 4A depicts a BLAST analysis showing that the fragments map to a unique position in reference genomes derived from female plants, with no more than two positions in reference genomes derived from male plants.
  • Figure 4B shows corresponding results at a protein level.
  • Figure 5 shows allelic discrimination plots obtained for individual hemp plants for Csa_283848_115 using the KASP genotyping assay (96-sample format) on a QuantStudio 7 Pro Real-Time PCR System (Applied Biosystems). Allele 2/Allele 2 (X:X) represents homozygous genotypes indicative of females, and Allele 2/Allele 1 (X:Y) represents heterozygous genotypes indicative of males. Allele 1/Allele 1 (Y:Y) represents negative controls (left sample has no template DNA and right sample contains DNA from soybean Williams82), which showed no amplification as expected. Increased intensity observed for Allele 2 is due to the inherent characteristic of the FAM fluorophore, which naturally produces more fluorescence.
  • Figure 6 is a photograph showing sex determination by traditional PCR with MADC2 forward and reverse primers. Five microliters of PCR product per sample were separated on a 2% agarose gel. Male-specific amplicons are 391 bp; female amplicons are 560 bp. Lanes M: O’GeneRuler Express DNA ladder (ThermoScientific, #SM1551); N: negative control (no template DNA); PM: male positive control; PF: female positive control; 1-10: test samples 1-10.
  • Cannabis sativa L. (cannabis, hemp, marijuana) is an herbaceous plant belonging to the Cannabis genus, family of Cannabaceae.
  • Cannabis sativa L., an annual herb that has been cultivated for thousands of years, contains a unique set of secondary metabolites called cannabinoids, which constitute a group of terpenophenolics.
  • the most desirable cannabinoids of the Cannabis sativa L. plant are formed in the female flowers/buds. Achieving high flower/bud yield and potency is of critical importance to medical cannabis growers. Although male cannabis plants produce these desirable cannabinoids, they do so in much smaller quantities.
  • the present technology relates to the discovery of two novel genetic markers (labeled as Csa_283848_106 ID:7 and Csa_283848_115 ID: 11), carrying unique single-nucleotide polymorphisms (SNPs) able to accurately predict the plant sex at a genetic level. If the plant is homozygous (has 2 copies of the same allele) at the specific locus, it is predicted to be female; if it is heterozygous (1 copy of the respective allele) it is predicted to be male.
  • SNPs single-nucleotide polymorphisms
  • KASP genotyping technology is a homogeneous, fluorescence resonance energy transfer (FRET)-based assay that enables accurate bi-allelic discrimination of known SNPs.
  • allele is one or more alternative forms of a gene at a particular locus, all of which alleles relate to one trait or characteristic at a specific locus.
  • alleles of a gene are located at a specific location, or locus on a chromosome.
  • One allele is present on each chromosome of the pair of homologous chromosomes.
  • a diploid plant species may comprise a large number of different alleles at a particular locus. These may be identical alleles of the gene (homozygous) or two different alleles (heterozygous).
  • “Cannabis” or “cannabis plant” refers to any species in the Cannabis genus that produces cannabinoids, such as Cannabis sativa and interspecific hybrids thereof. As used herein, these terms include hemp varieties and medical cannabis varieties.
  • the term “cultivar” refers to a product of plant breeding that is released for access to producers and that is uniform, distinct, stable, and new.
  • genotype refers to the genetic constitution of an individual (or group of individuals) at one or more particular loci.
  • genotype of an individual or group of individuals is defined and described by the allele forms at the one or more loci that the individual has inherited from its parents.
  • genotype may also be used to refer to an individual’s genetic constitution at a single locus, at multiple loci, or at all the loci in its genome.
  • germplasm refers to genetic material of or from an individual plant, a group of plants (e.g., a plant line, variety, and family), and a clone derived from a plant or group of plants.
  • a germplasm may be part of an organism or cell, or it may be separate (e.g., isolated) from the organism or cell.
  • germplasm provides genetic material with a specific molecular makeup that is the basis for hereditary qualities of the plant.
  • germplasm refers to cells of a specific plant; seed; tissue of the specific plant; and non-seed parts of the specific plant (e.g., leaf, stem, pollen, and cells).
  • germplasm is synonymous with “genetic material,” and it may be used to refer to seed (or other plant material) from which a plant may be propagated.
  • a germplasm utilized in a method or plant as described herein is from a Cannabis sativa line or variety.
  • a germplasm is seed of the Cannabis sativa line or variety.
  • a germplasm is a nucleic acid sample from the Cannabis sativa line or variety.
  • Marker assay refers generally to a molecular marker assay, such as PCR or KASP, for example, to identify whether a certain DNA sequence or SNP, for example, is present in a sample of nucleic acid.
  • a marker assay can include a molecular marker assay, e.g., KASP assay, which can be used to test whether a Cannabis sativa plant has a SNP associated with the sex of the plant.
  • SNP 2 is characterized by a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5 (which corresponds to the 173 bp Csa_283848_115 marker).
  • SNP 1 and SNP 2 accurately predict Cannabis sativa plant sex at a genetic level. If the plant is homozygous (i.e., has two copies of the same allele, e.g., X:X) at the specific locus, it is predicted to be female. If it is heterozygous (i.e., has one copy of the respective allele, e.g., X:Y) at the specific locus, it is predicted to be male.
  • This data point is plotted close to the X axis, representing high FAM signal and no HEX signal.
  • a sample that is homozygous for the allele reported by HEX will only generate HEX fluorescence during the KASP reaction.
  • This data point is plotted close to the Y axis, representing high HEX signal and no FAM signal.
  • a sample that is heterozygous will contain both the allele reported by FAM and the allele reported by HEX. This sample will generate half as much FAM fluorescence and half as much HEX fluorescence as the samples that are homozygous for these alleles.
  • This data point is plotted in the center of the plot, representing half FAM signal and half HEX signal.
  • the KASP reaction without any template DNA is included as a negative control to ensure reliability. This is referred to as a no template control (NTC) and will not generate any fluorescence and the data point will therefore be plotted at the origin.
  • NTC no template control
  • the STARP method comprises a first oligonucleotide set comprising a first forward primer, a second forward primer, and a first reverse primer.
  • the first forward primer is allele-specific to the X chromosome and binds to an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
  • the second forward primer is allele-specific to the Y chromosome and binds to a G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
  • the first forward primer and the second forward primer are fluorescently labeled and the two fluorophores are different.
  • the first reverse primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 4.
  • Genetic distance analysis: dendrogram. A dissimilarity matrix was estimated using the function “dist” of the statistical software package R, and Euclidean distances were estimated. Clustering was performed using the function “hclust” with average distances as the agglomeration method. Subsequently, the genetic distances were visualized in a dendrogram (Figure 2). The analysis was done based on the total set of lines and per company.
  • Results for Ch population are visualized in Table 2. A high association was found for 62 SNPs (34 loci), and out of this set 55 SNPs (32 loci) had a 100% association between genotype and phenotype. The homozygous “B” score is associated with female and the heterozygous score “H” is associated with male.
  • Loci that mapped to multiple positions in the Ch genome were eliminated from further analysis, while loci that mapped to a unique location on the Ch genome were further mapped to the Finola reference genome (which was generated from a male hemp plant).
  • Loci that mapped to more than two positions in the Finola genome were eliminated from further analysis. Additional BLAST analysis showed that these fragments map to a unique position in reference genomes derived from female plants, and no more than two positions in reference genomes derived from male plants.
  • SBG Sequence-Based Genotyping
  • This example describes the identification of two novel genetic markers labeled as Csa_283848_106 (SEQ ID NO: 1) and Csa_283848_115 (SEQ ID NO: 5), which carry unique SNPs able to accurately predict cannabis plant sex at a genetic level.
  • Genomic DNA isolation: Hemp seeds were germinated on wet filter paper for 3 days in the dark at 25°C prior to transfer to soil. Germinated seedlings in soil were grown at 25°C under a 16 h light, 8 h dark photoperiod for 2 weeks prior to leaf sampling. Approximately 100 mg of leaf tissue was harvested, and total genomic DNA was isolated using the Qiagen DNeasy Plant Mini Kit according to the manufacturer’s protocol.
  • KASP assay primers were designed and synthetically manufactured by LGC, Biosearch Technologies (Hoddesdon, UK) for use in KASP genotyping assays (Table 5); fluorophores occur at the 5’ end of the primers: FAM (495 nm excitation, 520 nm emission) for allele 1 (Y) and HEX (535 nm excitation, 556 nm emission) for allele 2 (X).
  • each amplification reaction contained 10 ng template gDNA, KASP 2x Master mix and KASP-by-Design assay mix (LGC, Biosearch Technologies) in a final volume of 10 µL.
  • the PCR thermocycling conditions for both primers Csa_283848_106 and Csa_283848_115 were 15 min at 94°C followed by 10 cycles of 94°C for 20 sec and 61°C for 1 min (dropping 0.6°C per cycle to achieve a 55°C annealing temperature) followed by 26 cycles of 94°C for 20 sec and 30°C for 1 min.
  • PCR amplification and analysis were performed using the QuantStudio 7 Pro Real-Time PCR System (Applied Biosystems). Results
  • Table 7. Summary of screening results for the MADC2 primer PCR, and KASP assays Csa_283848_106 (ID: 7) and Csa_283848_115 (ID: 11).
  • a range includes each individual member.
  • a group having 1-3 cells refers to groups having 1, 2, or 3 cells.
  • a group having 1-5 cells refers to groups having 1, 2, 3, 4, or 5 cells, and so forth.
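The FAM/HEX read-out described in the KASP entries of this list (a homozygous sample plotted near one axis, a heterozygous sample near the center, and a no template control at the origin) can be illustrated with a simple cluster call. This is a minimal sketch only: the function name, the normalized signal scale, the 0.2 cutoff, and the 30/60 degree angle bands are hypothetical values chosen for illustration and are not taken from the disclosure; instrument software applies its own clustering.

```python
import math


def call_kasp_cluster(fam: float, hex_: float, ntc_cutoff: float = 0.2) -> str:
    """Classify one well from its FAM (x axis) and HEX (y axis) signals.

    Assumptions (illustrative only): signals are normalized to roughly 0..1,
    and the cutoff and angle bands below are arbitrary example thresholds.
    """
    if fam < 0 or hex_ < 0:
        raise ValueError("fluorescence signals must be non-negative")

    # Points near the origin behave like the no template control (NTC).
    if math.hypot(fam, hex_) < ntc_cutoff:
        return "no call"

    # Angle from the FAM axis: 0 degrees is pure FAM, 90 degrees is pure HEX.
    angle = math.degrees(math.atan2(hex_, fam))
    if angle < 30:
        return "homozygous (FAM-reported allele)"
    if angle > 60:
        return "homozygous (HEX-reported allele)"
    return "heterozygous"


if __name__ == "__main__":
    for well in [(1.0, 0.05), (0.05, 1.0), (0.5, 0.5), (0.02, 0.03)]:
        print(well, call_kasp_cluster(*well))
```

The sketch reports only the genotype cluster; which fluorophore reports which allele depends on the assay design and is left to the caller.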

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Mycology (AREA)
  • Botany (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present technology provides novel molecular markers for sex determination in Cannabis. In particular, the present technology provides single nucleotide polymorphism (SNP) markers and flanking sequences of certain SNPs associated with the sex of a Cannabis plant as well as methods of using the markers to identify the sex of Cannabis plants. The present technology further provides uses of the molecular markers in a kit for determining the sex of Cannabis plants.

Description

NOVEL MOLECULAR MARKERS FOR
SEX DETERMINATION IN CANNABIS
CROSS-REFERENCE TO RELATED APPLICATION
[0001] The present application claims priority to U.S. Provisional Patent Application No. 63/239,843, filed on September 1, 2021, the contents of which are hereby incorporated by reference in their entirety.
TECHNICAL FIELD
[0002] The present technology relates generally to novel molecular markers for sex determination in Cannabis. More specifically, the present technology relates to single nucleotide polymorphism (SNP) markers and flanking sequences of certain SNPs associated with the sex of a Cannabis plant as well as methods of using the markers to identify the sex of Cannabis plants.
BACKGROUND
[0003] The following description is provided to assist the understanding of the reader. None of the information provided or references cited is admitted to be prior art.
[0004] Cannabis sativa L. is a multiuse crop grown primarily for grain and fiber (hemp varieties), and cannabinoids (medical Cannabis varieties). It is a dioecious species (distinct male and female plants) with sexual dimorphism occurring in a late stage of plant development. Sex is determined by heteromorphic chromosomes (X and Y): male is the heterogametic sex (XY), and female is the homogametic one (XX). The sexual phenotype of Cannabis often shows some flexibility leading to the differentiation of hermaphrodite flowers or bisexual inflorescences (monoecious phenotype). Sex is considered an important trait for hemp genetic improvement; therefore, the study of the mechanism of sexual differentiation is of paramount interest in Cannabis research. Although the identification of male and female plants can be performed by phenotypic observation, this process is time consuming, costly, and subject to human errors in misidentifying plants or missing some plants in a field altogether. In addition, the phenotypic identification of male plants may be too late as they have already shed pollen putting the entire farming population at risk of being pollinated. Also, if a hermaphrodite plant is produced, one may not be able to visually identify if the plant is genetically male or female. Accordingly, there is a need to develop more reliable, efficient, and rapid methods for identifying the sex of Cannabis plants.
SUMMARY
[0005] Disclosed herein are novel molecular markers and uses of these markers for determining the sex of Cannabis plants and for producing Cannabis crops comprising either female or male plants.
[0006] In one aspect, the present disclosure provides a method for identifying the sex of a Cannabis plant, the method comprising detecting the presence of a single nucleotide polymorphism (SNP) in a nucleic acid sample obtained from the Cannabis plant, wherein the SNP is identified as SNP 1 having an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1.
[0007] In some embodiments, the method further comprises: (a) selecting a female Cannabis plant when SNP 1 is homozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or (b) selecting a male Cannabis plant when SNP 1 is heterozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1.
[0008] In one aspect, the present disclosure provides a method for identifying the sex of a Cannabis plant, the method comprising detecting the presence of a single nucleotide polymorphism (SNP) in a nucleic acid sample obtained from the Cannabis plant, wherein the SNP is identified as SNP 2 having a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
[0009] In some embodiments, the method further comprises: (a) selecting a female Cannabis plant when SNP 2 is homozygous at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; or (b) selecting a male Cannabis plant when SNP 2 is heterozygous at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
[0010] In some embodiments, the methods comprise: (a) selecting a female Cannabis plant when SNP 1 is homozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1 and when SNP 2 is homozygous at a position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; or (b) selecting a male Cannabis plant when SNP 1 is heterozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, and when SNP 2 is heterozygous at a position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
[0011] In some embodiments of the methods, the detecting comprises: (a) contacting the nucleic acid sample with sets of forward and reverse primers selected from the group consisting of: (i) two forward primers for identifying SNP 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; (ii) two forward primers for identifying SNP 2, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8; and (iii) primers having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to the nucleotide sequences of the forward and reverse primer sets of (i)-(ii), (b) to produce a reaction-sample mixture, wherein: (i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer, (ii) the probe sequence element comprises the fluorescent label, (iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum, and (iv) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum; (c) subjecting the reaction-sample mixture to polymerase chain reaction (PCR) conditions under which each of SNP 1 and SNP 2 present in the nucleic acid sample is amplified to produce a fluorescent signal; and (d) measuring the amount of fluorescent signal produced from each fluorescent label.
[0012] In some embodiments of the methods, the fluorescent label is selected from the group consisting of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, a Dabsyl label, and any combination thereof.
[0013] In some embodiments of the methods, the PCR is Kompetitive Allele Specific PCR (KASP).
[0014] In some embodiments of the methods, all of the primer sets are contained together in an amplification mix further comprising DNA polymerase, dNTPs, and PCR buffer prior to contacting with the nucleic acid sample.
[0015] In some embodiments, the Cannabis plant is a Cannabis sativa plant.
[0016] In some embodiments, the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
[0017] In one aspect, the present disclosure provides a kit for identifying the sex of a Cannabis plant, the kit comprising at least one set of forward and reverse primers selected from the group consisting of: (a) a first primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 1 having an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; and (b) a second primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 2 having a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8, wherein: (i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer, (ii) the probe sequence element comprises a fluorescent label, (iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum, and (iv) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum.
[0018] In some embodiments, the primer sets are present together in an amplification mix that further comprises DNA polymerase, dNTPs, and PCR buffer.
[0019] In some embodiments, the fluorescent label is selected from one or more of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, and a Dabsyl label.
[0020] In some embodiments, the Cannabis plant is a Cannabis sativa plant.
[0021] In some embodiments, the target nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
[0022] In some embodiments, the kit further comprises printed or electronic instructions for use.
[0023] In one aspect, the present disclosure provides a primer selected from the group consisting of: (a) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 2; (b) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 3; (c) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 6; (d) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 7; and (e) a nucleotide sequence having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to any one of the nucleotide sequences of (a)-(d), wherein the primer comprises a fluorescent label.
[0024] In some embodiments, the primer is a fluorescently labeled primer-probe comprising an oligonucleotide probe sequence element at the 5’ end of the primer, and the probe sequence element comprises the fluorescent label.
[0025] In one aspect, the present disclosure provides a method for producing a population of female Cannabis plants, the method comprising: (a) selecting a female Cannabis plant when the presence of one or more homozygous single nucleotide polymorphisms (SNPs) in a nucleic acid sample obtained from the Cannabis plant is detected, wherein the one or more SNPs are selected from: (i) a SNP identified as SNP 1, wherein SNP 1 is homozygous for an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or (ii) a SNP identified as SNP 2, wherein SNP 2 is homozygous for a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; and (b) planting or growing a crop of the female Cannabis plants.
[0026] In some embodiments, the homozygous SNPs are detected by a competitive allele-specific polymerase chain reaction (PCR) assay, comprising: (a) contacting the nucleic acid sample with sets of forward and reverse primers selected from the group consisting of: (i) two forward primers for identifying SNP 1 comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; (ii) two forward primers for identifying SNP 2 comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8; and (iii) primers having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to the corresponding nucleotide sequences of the forward and reverse primer sets of (i)-(ii), (b) to produce a reaction-sample mixture, wherein: (i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer, (ii) the probe sequence element comprises the fluorescent label, (iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum, (iv) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum; (c) subjecting the reaction-sample mixture to PCR conditions under which each of SNP 1 and SNP 2 present in the nucleic acid sample is amplified to produce a fluorescent signal; and (d) measuring the amount of fluorescent signal produced from each fluorescent label.
[0027] In some embodiments, the fluorescent label is selected from the group consisting of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, a Dabsyl label, and any combination thereof.
[0028] In some embodiments, the PCR assay is a Kompetitive Allele Specific PCR (KASP) assay.
[0029] In some embodiments, all of the primer sets are contained together in an amplification mix further comprising DNA polymerase, dNTPs, and PCR buffer prior to contacting with the nucleic acid sample.
[0030] In some embodiments, the Cannabis plant is a Cannabis sativa (C. sativa), C. indica, or C. ruderalis plant.
[0031] In some embodiments, the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
[0032] In one aspect, the present disclosure provides a method of detecting the presence of at least one single nucleotide polymorphism (SNP) on a plurality of polynucleotide analytes in a nucleic acid sample obtained from a Cannabis plant, comprising a) amplifying the plurality of polynucleotide analytes with at least a first oligonucleotide set that is specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 or a second oligonucleotide set that is specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 to generate amplified polynucleotide analytes; and b) detecting amplified polynucleotide analytes comprising an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 or a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
[0033] In some embodiments, the method further comprises determining whether the nucleic acid sample is homozygous for an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the method further comprises determining whether the nucleic acid sample is heterozygous at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the method further comprises determining whether the nucleic acid sample is homozygous for a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the method further comprises determining whether the nucleic acid sample is heterozygous at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
[0034] In some embodiments, homozygous for an A and/or C is indicative of a female Cannabis plant.
[0035] In some embodiments, heterozygous at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a male Cannabis plant.
[0036] In one aspect, the present disclosure provides a method of detecting the presence of a single nucleotide polymorphism (SNP) on a polynucleotide analyte in a nucleic acid sample obtained from a Cannabis plant, comprising a) hybridizing a polynucleotide analyte with a set of detectably labeled probes, wherein the set comprises a first probe that is specific for an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and is labeled with a first fluorophore, and a second probe that is specific for a G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and is labeled with a second fluorophore, wherein the first fluorophore and the second fluorophore are different; and b) detecting one or more fluorescent signals to determine whether the polynucleotide analyte comprises an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
[0037] In one aspect, the present disclosure provides a method of detecting the presence of a single nucleotide polymorphism (SNP) on a polynucleotide analyte in a nucleic acid sample obtained from a Cannabis plant, comprising a) hybridizing a polynucleotide analyte with a set of detectably labeled probes, wherein the set comprises a third probe that is specific for a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 and is labeled with a third fluorophore, and a fourth probe that is specific for an A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 and is labeled with a fourth fluorophore, wherein the third fluorophore and the fourth fluorophore are different; and b) detecting one or more fluorescent signals to determine whether the polynucleotide analyte comprises a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
[0038] In some embodiments, the first fluorophore and the third fluorophore are the same. In some embodiments, the second fluorophore and the fourth fluorophore are the same. In some embodiments, each probe within the set is at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides in length.
[0039] In some embodiments, the set further comprises two or more additional probes, wherein the additional probes are specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
[0040] In some embodiments, the set further comprises two or more additional probes, wherein the additional probes are specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
[0041] In some embodiments, the set of probes comprises molecular beacon probes.
[0042] In some embodiments, the method further comprises subjecting the nucleic acid sample to an amplification condition to generate the polynucleotide analyte prior to hybridizing the polynucleotide analyte with the set of detectably labeled probes.
[0043] In some embodiments, the presence of an A and the absence of a G is indicative that the nucleic acid sample is homozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the presence of an A and a G is indicative that the nucleic acid sample is heterozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the presence of a C and the absence of an A is indicative that the nucleic acid sample is homozygous at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the presence of a C and an A is indicative that the nucleic acid sample is heterozygous at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
[0044] In some embodiments, homozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a female Cannabis plant.
[0045] In some embodiments, heterozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a male Cannabis plant.
[0046] In some embodiments, the Cannabis plant is a Cannabis sativa plant.
[0047] In some embodiments, the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
[0048] The inventions described and claimed herein have many attributes and embodiments including, but not limited to, those set forth or described or referenced in this Brief Summary and the following Brief Description of the Drawings and Detailed Description. It is not intended to be all-inclusive and the inventions described and claimed herein are not limited to or by the features or embodiments identified in this Brief Summary, which is included for purposes of illustration only and not restriction. Additional embodiments may be disclosed in the Detailed Description below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0049] Figure 1 is a chart showing the number of sequence reads generated per specific Ch sample, with the number of reads that passed QC shown in the lower part of the bar graph and reads that did not pass QC shown in the upper part of the bar graph.
[0050] Figure 2 is a dendrogram plot of the 34 Ch samples genotyped with 7,912 SNPs with equal or less than 10% U-scores.
[0051] Figure 3 is a Principal Coordinates Analysis (PCO) plot of the 34 Ch samples genotyped with 7,912 SNPs.
[0052] Figures 4A and 4B are charts for the two KASP assays identified. Figure 4A depicts a BLAST analysis showing that the fragments map to a unique position in reference genomes derived from female plants, with no more than two positions in reference genomes derived from male plants. Figure 4B shows corresponding results at a protein level.
[0053] Figure 5 shows allelic discrimination plots obtained for individual hemp plants for Csa_283848_115 using a KASP genotyping assay (96-sample format) on a QuantStudio 7 Pro Real-Time PCR System (Applied Biosystems). Allele 2/Allele 2 (X:X) represents homozygous genotypes indicative of females, and Allele 2/Allele 1 (X:Y) represents heterozygous genotypes indicative of males. Allele 1/Allele 1 (Y:Y) represents negative controls (left sample has no template DNA and right sample contains DNA from soybean Williams82), which showed no amplification as expected. The increased intensity observed for Allele 2 is due to the inherent characteristic of the FAM fluorophore, which naturally produces more fluorescence.
[0054] Figure 6 is a photograph showing sex determination by traditional PCR with MADC2 forward and reverse primers. Five microliters of PCR product per sample were separated on a 2% agarose gel. Male-specific amplicons are 391 bp; female amplicons are 560 bp. Lanes M: O’GeneRuler Express DNA ladder (ThermoScientific, #SM1551); N: negative control (no template DNA); PM: male positive control; PF: female positive control; 1-10: test samples 1-10.
DETAILED DESCRIPTION
I. INTRODUCTION
[0055] Cannabis sativa L. (cannabis, hemp, marijuana) is an herbaceous plant belonging to the Cannabis genus, family of Cannabaceae. Cannabis sativa L., an annual herb that has been cultivated for thousands of years, contains a unique set of secondary metabolites called cannabinoids, which constitute a group of terpenophenolics. The most desirable cannabinoids of the Cannabis sativa L. plant are formed in the female flowers/buds. Achieving high flower/bud yield and potency is of critical importance to medical cannabis growers. Although male cannabis plants produce these desirable cannabinoids, they do so in much smaller quantities. When male plants pollinate female plants, this causes the plants to produce seeds, which can greatly reduce the amounts of cannabinoids in female plant buds. Once pollinated, female plants put much of their energy into producing seeds rather than potent buds. Furthermore, growers are interested in producing feminized seeds to ensure only female plants are produced. Feminized seeds are produced by inducing a normal female to grow male flowers with viable pollen, to ensure seeds produced are of female genetics only. Therefore, being able to identify male from female plants, and remove the males from the population at an early stage is integral to cannabis growing operations, and has significant growth space, time, and cost advantages.
[0056] Male cannabis plants have their purpose, too. Even if the buds are not harvested for sale or consumption, the ability to distinguish true male plants from hermaphrodite plants is imperative to the success of a breeding program.
[0057] The present technology relates to the discovery of two novel genetic markers (labeled as Csa_283848_106 ID:7 and Csa_283848_115 ID: 11), carrying unique single-nucleotide polymorphisms (SNPs) able to accurately predict the plant sex at a genetic level. If the plant is homozygous (has two copies of the same allele) at the specific locus, it is predicted to be female; if it is heterozygous (has one copy of each of two different alleles), it is predicted to be male.
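The decision rule stated in this paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not the applicants' software: the function names `zygosity` and `predict_sex`, the two-letter genotype strings, and the "inconclusive" result for discordant or unexpected calls are all hypothetical choices made for the example.

```python
# Alleles reported for each marker in the Summary.
SNP1_ALLELES = {"A", "G"}  # position 103 of SEQ ID NO: 1
SNP2_ALLELES = {"C", "A"}  # position 90 of SEQ ID NO: 5


def zygosity(genotype, alleles):
    """Classify a two-letter diploid call such as "AA" or "AG"."""
    call = genotype.upper()
    if len(call) != 2 or not set(call) <= alleles:
        return "invalid"
    return "homozygous" if call[0] == call[1] else "heterozygous"


def predict_sex(snp1, snp2):
    """Return "female", "male", or "inconclusive" from both marker calls.

    Assumption: the two markers must agree; a discordant or invalid
    call is reported as inconclusive rather than forced to a sex.
    """
    z1 = zygosity(snp1, SNP1_ALLELES)
    z2 = zygosity(snp2, SNP2_ALLELES)
    if z1 == z2 == "homozygous":
        return "female"
    if z1 == z2 == "heterozygous":
        return "male"
    return "inconclusive"
```

For example, `predict_sex("AA", "CC")` evaluates to "female" and `predict_sex("AG", "CA")` to "male", while a plant scored homozygous at one marker and heterozygous at the other is flagged for retesting.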
[0058] As described in the Examples, in some embodiments this identification is facilitated by using genomic DNA from a Cannabis plant (isolated early at the seedling stage) in Kompetitive allele specific PCR (KASP) assays. KASP genotyping technology is a homogeneous, fluorescence resonance energy transfer (FRET)-based assay that enables accurate bi-allelic discrimination of known SNPs.
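As an illustration of how two fluorescence readings can be turned into a bi-allelic call, the following Python sketch classifies a sample by the angle of its signal in the two-dye plane. It assumes the two allele-specific primers report through FAM and HEX, which is common in KASP chemistry; the source does not state which dye is paired with which allele, and the function name `call_genotype` and all threshold values are hypothetical. Instrument software typically clusters whole plates rather than applying fixed cut-offs.

```python
import math


def call_genotype(fam, hex_, min_signal=0.5, low_deg=30.0, high_deg=60.0):
    """Call a genotype from normalized FAM and HEX end-point signals.

    All thresholds are illustrative assumptions, not values from the source.
    """
    # Samples near the origin (e.g. no-template controls) are not called.
    if math.hypot(fam, hex_) < min_signal:
        return "no call"
    # Angle of the sample from the FAM axis, in degrees.
    angle = math.degrees(math.atan2(hex_, fam))
    if angle < low_deg:
        return "homozygous (FAM allele)"
    if angle > high_deg:
        return "homozygous (HEX allele)"
    return "heterozygous"
```

A sample with strong signal in only one channel falls near an axis and is called homozygous; a sample with comparable signal in both channels falls near the diagonal and is called heterozygous.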
[0059] All results from KASP assays with the above two novel identified markers were confirmed by phenotypic observations and/or by traditional PCR methodology using male-associated DNA from Cannabis sativa (MADC2) primers as described, for example, by Techen et al., Planta Med. 76: 1938-9 (2010). See also Toth et al., GCB Bioenergy 12: 213-22 (2020).
II. DEFINITIONS
[0060] All technical terms employed in this specification are commonly used in biochemistry, molecular biology and agriculture; hence, they are understood by those skilled in the field to which the present technology belongs. Those technical terms can be found, for example, in: Molecular Cloning: A Laboratory Manual 3rd ed., vol. 1-3, ed. Sambrook and Russell (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001); Current Protocols In Molecular Biology, ed. Ausubel et al. (Greene Publishing Associates and Wiley-Interscience, New York, 1988) (including periodic updates); Short Protocols In Molecular Biology: A Compendium Of Methods From Current Protocols In Molecular Biology 5th ed., vol. 1-2, ed. Ausubel et al. (John Wiley & Sons, Inc., 2002); Genome Analysis: A Laboratory Manual, vol. 1-2, ed. Green et al. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1997). Methodology involving plant biology techniques is described here and also is described in detail in treatises such as Methods In Plant Molecular Biology: A Laboratory Course Manual, ed. Maliga et al. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1995).
[0061] As used herein, the term “about” will be understood by persons of ordinary skill in the art and will vary to some extent depending upon the context in which it is used. If there are uses of the term which are not clear to persons of ordinary skill in the art given the context in which it is used, “about” will mean up to plus or minus 10% of the particular term.
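The fallback reading of "about" as plus or minus 10% can be expressed as a simple tolerance check. The sketch below is illustrative only; the function name `within_about` is hypothetical, and the check applies only where context does not otherwise make the meaning of "about" clear.

```python
def within_about(value, stated, tolerance=0.10):
    """Return True if value lies within +/- tolerance (default 10%) of stated."""
    return abs(value - stated) <= tolerance * abs(stated)
```

Under this reading, a measured value of 95 satisfies "about 100", whereas a value of 88 does not.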
[0062] As used herein, the term “allele” is one or more alternative forms of a gene at a particular locus, all of which alleles relate to one trait or characteristic at a specific locus. In a diploid cell of an organism, alleles of a gene are located at a specific location, or locus on a chromosome. One allele is present on each chromosome of the pair of homologous chromosomes. A diploid plant species may comprise a large number of different alleles at a particular locus. These may be identical alleles of the gene (homozygous) or two different alleles (heterozygous).
[0063] “Cannabis” or “cannabis plant” refers to any species in the Cannabis genus that produces cannabinoids, such as Cannabis sativa and interspecific hybrids thereof. As used herein, these terms include hemp varieties and medical cannabis varieties.
[0064] As used herein, the term “cultivar” refers to a product of plant breeding that is released for access to producers and that is uniform, distinct, stable, and new.
[0065] As used herein, the term “genotype” refers to the genetic constitution of an individual (or group of individuals) at one or more particular loci. The genotype of an individual or group of individuals is defined and described by the allele forms at the one or more loci that the individual has inherited from its parents. The term genotype may also be used to refer to an individual’s genetic constitution at a single locus, at multiple loci, or at all the loci in its genome.
[0066] As used herein, the term “germplasm” refers to genetic material of or from an individual plant, a group of plants (e.g., a plant line, variety, and family), and a clone derived from a plant or group of plants. A germplasm may be part of an organism or cell, or it may be separate (e.g., isolated) from the organism or cell. In general, germplasm provides genetic material with a specific molecular makeup that is the basis for hereditary qualities of the plant. As used herein, “germplasm” refers to cells of a specific plant; seed; tissue of the specific plant; and non-seed parts of the specific plant (e.g., leaf, stem, pollen, and cells). As used herein, “germplasm” is synonymous with “genetic material,” and it may be used to refer to seed (or other plant material) from which a plant may be propagated. In some embodiments, a germplasm utilized in a method or plant as described herein is from a Cannabis sativa line or variety. In some embodiments, a germplasm is seed of the Cannabis sativa line or variety. In some embodiments, a germplasm is a nucleic acid sample from the Cannabis sativa line or variety.
[0067] As used herein, “Kompetitive Allele Specific PCR” or “KASP” is a homogenous, fluorescence-based genotyping variant of Polymerase Chain Reaction (PCR). KASP is based on allele-specific oligo extension and fluorescence resonance energy transfer (FRET) for signal generation. KASP genotyping assays are based on competitive allele-specific PCR and enable bi-allelic scoring of single nucleotide polymorphisms (SNPs) and insertions and deletions at specific loci.
[0068] As used herein, “locus” or “loci” means a specific place or places, or a site on a chromosome where a gene or molecular marker, such as a SNP, is found.
[0069] As used herein, the terms “marker” and “molecular marker” refer to a nucleotide sequence used as a point of reference when identifying a linked locus. Thus, a marker may refer to a gene or nucleotide sequence that can be used to identify plants having a particular allele. A marker may be described as a variation at a given genomic locus. A genetic marker may be a short DNA sequence, such as a sequence surrounding a single base-pair change (single nucleotide polymorphism, or “SNP”), or a long one, for example, a microsatellite/simple sequence repeat (“SSR”). Markers can be derived from genomic nucleotide sequences or from expressed nucleotide sequences. The term can also refer to nucleic acid sequences complementary to or flanking a marker. The term can also refer to nucleic acid sequences used as a molecular marker probe, primer, primer pair, or a molecule that can be used to identify the presence of a marker locus, e.g., a nucleic acid probe that is complementary to a marker locus sequence, and is capable of amplifying sequence fragments using PCR and modified PCR reaction methods. Examples of markers associated with determination of the sex of Cannabis sativa plants include SNP 1 and SNP 2 and/or flanking sequences of the Cannabis sativa L. genome, as well as primers capable of identifying SNP 1 or SNP 2, or a fragment of such sequences. Markers of the present technology can include sequences having at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to any of the sequences provided in SEQ ID NOs: 1-8.
[0070] “Marker assay” refers generally to a molecular marker assay, such as PCR or KASP, for example, to identify whether a certain DNA sequence or SNP, for example, is present in a sample of nucleic acid. For example, a maker assay can include a molecular marker assay, e.g., KASP assay, which can be used to test whether a Cannabis sativa plant has a SNP associated with the sex of the plant.
[0071] “Marker assisted selection” or “MAS” is a process of identifying and using the presence (or absence) of one or more molecular markers (e.g, a SNP) associated with a particular locus or to a particular chromosomal region to select plants for the presence of the specific locus. For example, the methods of the present technology can be used to select a female Cannabis plant when the presence of one or more homozygous SNPs in a nucleic acid sample obtained from the Cannabis plant is detected, wherein the one or more SNPs are selected from: (i) a SNP identified as SNP 1, wherein SNP 1 is homozygous for an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or (ii) a SNP identified as SNP 2, wherein SNP 2 is homozygous for a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
[0072] “Plant” is a term that encompasses whole plants, plant organs (e.g., leaves, stems, roots, etc.), seeds, differentiated or undifferentiated plant cells, and progeny of the same. Plant material includes without limitation seeds, suspension cultures, embryos, meristematic regions, callus tissues, leaves, roots, shoots, stems, fruit, gametophytes, sporophytes, pollen, and microspores.
[0073] “Sequence identity” or “identity” in the context of two polynucleotide (nucleic acid) or polypeptide sequences includes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified region. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties, such as charge and hydrophobicity, and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences which differ by such conservative substitutions are said to have “sequence similarity” or “similarity.” Means for making this adjustment are well-known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, for example, according to the algorithm of Meyers & Miller, Computer Applic. Biol. Sci. 4: 11-17 (1988), as implemented in the program PC/GENE (Intelligenetics, Mountain View, California, USA).
[0074] Use in this description of a percentage of sequence identity denotes a value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
[0075] A “single-nucleotide polymorphism” or “SNP” is a variation in a single nucleotide that occurs at a specific position in a DNA sequence of a genome, where each variation is present to some appreciable degree within members of the same species or a paired chromosome. As described herein, the SNP identified as “SNP 1” has an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, and the SNP identified as “SNP 2” has a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
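By way of non-limiting illustration, the percentage-of-sequence-identity calculation described in paragraph [0074] can be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned by some other means and that gaps are written as "-"; the function name `percent_identity` and the gap symbol are illustrative assumptions and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Matched positions are divided by the total number of positions in the
    window (gap positions included) and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Six aligned positions, five identical, one gap in the first sequence.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```

Whether gap positions are counted in the denominator is a convention; here they are counted because paragraph [0074] divides by the total number of positions in the window of comparison.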
[0076] A “variant” is a nucleotide or amino acid sequence that deviates from the standard, or given, nucleotide or amino acid sequence of a particular gene or polypeptide. The terms “isoform,” “isotype,” and “analog” also refer to “variant” forms of a nucleotide or an amino acid sequence. An amino acid sequence that is altered by the addition, removal, or substitution of one or more amino acids, or a change in nucleotide sequence, may be considered a variant sequence. A polypeptide variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties, e.g., replacement of leucine with isoleucine. A polypeptide variant may have “nonconservative” changes, e.g., replacement of a glycine with a tryptophan. Analogous minor variations may also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues may be substituted, inserted, or deleted may be found using computer programs well known in the art such as Vector NTI Suite (InforMax, MD) software. Variant may also refer to a “shuffled gene” such as those described in Maxygen-assigned patents (see, e.g., U. S. Patent No. 6,602,986).
III. NOVEL MOLECULAR MARKERS FOR SEX DETERMINATION IN CANNABIS
A. SNP 1 and SNP 2 and Methods for Use
[0077] The disclosure of the present technology relates to the identification of two novel genetic markers labeled as Csa_283848_106 and Csa_283848_115 carrying SNP 1 and SNP 2, respectively (Table A). The SNP identified herein as SNP 1 is characterized by an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1 (which corresponds to the 195 bp Csa_283848_106 marker). The SNP identified herein as SNP 2 is characterized by a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5 (which corresponds to the 173 bp Csa_283848_115 marker).
Table A. Genetic Markers.
Figure imgf000020_0001
[0078] The unique SNPs (SNP 1 and SNP 2) accurately predict Cannabis sativa plant sex at a genetic level. If the plant is homozygous (i.e., has two copies of the same allele, e.g., X:X) at the specific locus, it is predicted to be female. If it is heterozygous (i.e., has one copy of the respective allele, e.g., X:Y) at the specific locus, it is predicted to be male. As demonstrated in the experimental examples, the two novel genetic markers (SNP 1 and SNP 2) show high accuracy for sex determination in cannabis, with SNP 1 demonstrating a 99.7% accuracy and SNP 2 demonstrating a 100% accuracy in distinguishing male from female plants (see Table 7; Example 3).
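The prediction rule of paragraph [0078] (homozygous at the marker predicts female, heterozygous predicts male) can be sketched, for illustration only, as a small lookup. The names `MARKER_ALLELES` and `predict_sex` and the "undetermined" outcome for unexpected alleles are assumptions introduced for this sketch; the allele pairs are those recited for SNP 1 and SNP 2.

```python
# Alleles recited for each marker: SNP 1 is A or G, SNP 2 is C or A.
MARKER_ALLELES = {
    "SNP1": {"A", "G"},
    "SNP2": {"C", "A"},
}


def predict_sex(marker: str, allele_1: str, allele_2: str) -> str:
    """Predict plant sex from the two alleles observed at one marker.

    Homozygous -> "female", heterozygous -> "male". Any allele not recited
    for the marker yields "undetermined" (an assumption of this sketch).
    """
    expected = MARKER_ALLELES[marker]
    a, b = allele_1.upper(), allele_2.upper()
    if a not in expected or b not in expected:
        return "undetermined"
    return "female" if a == b else "male"


if __name__ == "__main__":
    print(predict_sex("SNP1", "A", "A"))  # homozygous call
    print(predict_sex("SNP1", "A", "G"))  # heterozygous call
```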
[0079] In some embodiments, the disclosure of the present technology relates to methods for producing a population of female Cannabis plants, comprising selecting a female Cannabis plant when the presence of one or more homozygous single nucleotide polymorphisms (SNPs) (e.g., homozygous SNP 1 and/or SNP 2) in a nucleic acid sample obtained from the Cannabis plant is detected, and planting or growing a crop of the female Cannabis plants. In some embodiments, the disclosure of the present technology relates to methods for producing a population of male Cannabis plants, comprising selecting a male Cannabis plant when the presence of one or more heterozygous SNPs (e.g., heterozygous SNP 1 and/or SNP 2) in a nucleic acid sample obtained from the Cannabis plant is detected, and planting or growing a crop of the male Cannabis plants.
[0080] There are several advantages associated with the use of the genetic markers and methods of the present technology. For example, the methods of the present technology permit the rapid identification of male and female plants. In some embodiments, the methods described herein are capable of analyzing 96 samples in a 90-minute KASP assay run. In some embodiments, the instrument can be adapted to perform 384 samples within the same run time. In addition, a further advantage is that a low amount of sample DNA (~5 ng) is needed for the KASP assay. As such, the claimed methods can be performed at the seedling stage (2-3 days old). This is advantageous for at least two reasons. First, Cannabis growers can remove male plants from their population while the plants are in the nursery, and the growers can be confident that the population being transferred to the field is female only. This presents enormous time, field growth space, and cost savings. Second, the methods of the present technology can shorten breeding times. By allowing farmers to identify male and female plants early in development, the methods of the present technology will ensure that farmers have a good representation of plants for their breeding efforts, and will not have to wait until the plants flower to determine their sex.
B. KASP Assay
[0081] The identification of the SNPs was facilitated by using genomic DNA from Cannabis sativa plants (isolated at the seedling stage) in Kompetitive allele specific PCR (KASP) assays. KASP permits both SNP 1 and SNP 2 to be evaluated simultaneously, making use of competing allele-specific forward primers that each have a unique 5’ unlabeled tail sequence (see Table 6; Example 2). KASP genotyping assays are based on competitive allele-specific PCR and enable bi-allelic scoring of single nucleotide polymorphisms (SNPs). KASP assays use three components: (1) test/sample DNA (e.g., genomic DNA) containing, for example, the SNP of interest; (2) the SNP-specific KASP assay mix containing (a) two different, allele specific competing forward primers with unique unlabeled tail sequences at the 5’ end (i.e., one primer for each SNP allele), and (b) one common reverse primer; and (3) the universal KASP Master mix containing (a) FAM and HEX specific FRET cassette (fluorophores in quenched form), (b) Taq polymerase specially modified for allele-specific PCR, and (c) optimized buffer.
[0082] In some embodiments of the present technology, the competing allele-specific forward primers for the genetic marker comprising SNP 1 (labeled as Csa_283848_106) are oligonucleotides comprising the nucleotide sequences 5’-TACTTGTGGATTCCATTCGTCCAAA-3’ (SEQ ID NO: 2) or at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 2, and 5’-ACTTGTGGATTCCATTCGTCCAAG-3’ (SEQ ID NO: 3) or at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 3, and the common reverse primer comprises 5’-TAGGCTGCCCTSACACTAGGAATA-3’ (SEQ ID NO: 4) or at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 4 (see Table 6; Example 2). In some embodiments of the present technology, the competing allele-specific forward primers for the genetic marker comprising SNP 2 (labeled as Csa_283848_115) are oligonucleotides comprising the nucleotide sequences 5’-GATTCCATTCGTCCAARTAAAAGCAC-3’ (SEQ ID NO: 6) or at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 6 and 5’-CCATTCGTCCAARTAAAAGCAA-3’ (SEQ ID NO: 7) or at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 7, and the common reverse primer comprises 5’-TAGGCTGCCCTSACACTAGGAATA-3’ (SEQ ID NO: 8) or at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO: 8 (see Table 6; Example 2).
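For illustration only, the relationship between the competing forward primers of paragraph [0082] and the SNP alleles can be checked with a short sketch. It assumes, as is customary for allele-specific primers but not expressly stated above, that the discriminating nucleotide is the 3’-terminal base of each forward primer; the names `FORWARD_PRIMERS` and `allele_bases` are hypothetical.

```python
# Competing allele-specific forward primers recited in paragraph [0082],
# written 5'->3' (R is the IUPAC code for A or G).
FORWARD_PRIMERS = {
    "SNP1": ("TACTTGTGGATTCCATTCGTCCAAA",    # SEQ ID NO: 2
             "ACTTGTGGATTCCATTCGTCCAAG"),    # SEQ ID NO: 3
    "SNP2": ("GATTCCATTCGTCCAARTAAAAGCAC",   # SEQ ID NO: 6
             "CCATTCGTCCAARTAAAAGCAA"),      # SEQ ID NO: 7
}


def allele_bases(marker: str) -> tuple:
    """Return the 3'-terminal base of each competing forward primer.

    Assumption of this sketch: the 3' end is the allele-discriminating base.
    """
    first, second = FORWARD_PRIMERS[marker]
    return first[-1], second[-1]


if __name__ == "__main__":
    for name in FORWARD_PRIMERS:
        print(name, allele_bases(name))
```

Under that assumption, the terminal bases are A and G for the SNP 1 primer pair and C and A for the SNP 2 primer pair, which agrees with the alleles recited for SNP 1 and SNP 2.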
[0083] The methods of the present technology may use any suitable genomic nucleic acid samples from Cannabis. Genomic nucleic acid samples include samples taken from, but not limited to, seeds, seedlings, tissue cultures, or plants of any age. In some embodiments, the samples may comprise Cannabis leaves of seedlings or young plants pressed into a paper matrix, such as a paper matrix laced with a mixture of chemicals that lyse cells and stabilize nucleic acids on contact for long-term storage at room temperature. In some embodiments, the samples may comprise purified DNA isolated from fresh or dry Cannabis plant material. The purified DNA may be used directly, or may be stored by affixing the purified DNA to filter paper and then used.
[0084] If the genomic DNA (gDNA) sample is homozygous for the specific allele (i.e., it has two copies of the allele, one on each chromosome), then only the primer specific for that allele will bind to the DNA template. Through subsequent PCR cycling steps, the respective fluorophore from the Master mix binds to the specific complementary tail sequence of the newly synthesized DNA strand; the fluorophore is no longer quenched and thus emits fluorescence. KASP is an endpoint chemistry. Results are generated from a final read once the PCR amplification is completed. If the SNP is homozygous, only one of the two possible fluorescent signals will be generated. If the genotype is heterozygous, a mixed fluorescent signal will be generated.
[0085] The end-point fluorescent reading is made with any FRET capable instrument that can excite fluorophores between 485 nm and 575 nm and read light emissions between 520 nm and 610 nm. Such instruments may include, but are not limited to, the following makes and models: Biotek Synergy 2, ABI 7500, ABI 7300, ABI 7900, ABI ViiA7, Roche LC480, Agilent Mx3000P/3005P, Illumina EcoRT, and BIO-RAD CFX. A passive reference dye, 5-carboxy-X-rhodamine succinimidyl ester (ROX), is included in the master mix to allow for the normalization of the HEX and FAM signals due to slight variations in well volume. Upon completion of the KASP reactions, the resulting fluorescence is measured, the raw data is interpreted, and genotypes are assigned to the DNA samples by plotting fluorescence values for each sample on a cluster plot (Cartesian plot). The fluorescent signal from each individual DNA sample is represented as an independent data point on a cluster plot. For example, one axis is used to plot the FAM fluorescence value (typically the X axis) and the second axis is used to plot the HEX fluorescence value (typically the Y axis) for each sample. A sample that is homozygous for an allele reported by FAM will only generate FAM fluorescence during the KASP reaction. This data point is plotted close to the X axis, representing high FAM signal and no HEX signal. A sample that is homozygous for the allele reported by HEX will only generate HEX fluorescence during the KASP reaction. This data point is plotted close to the Y axis, representing high HEX signal and no FAM signal. A sample that is heterozygous will contain both the allele reported by FAM and the allele reported by HEX. This sample will generate half as much FAM fluorescence and half as much HEX fluorescence as the samples that are homozygous for these alleles. This data point is plotted in the center of the plot, representing half FAM signal and half HEX signal. The KASP reaction without any template DNA is included as a negative control to ensure reliability. This is referred to as a no template control (NTC) and will not generate any fluorescence and the data point will therefore be plotted at the origin.
[0086] A wide range of fluorophores can be used in the KASP assay to label oligos/primers and selection of the fluorophores will depend on the application and instrument available for analysis of the assay. The most commonly used fluorophores are fluorescein amidite fluorophore (FAM), hexachlorofluorescein (HEX), tetrachlorofluorescein (TET), TAMRA, 4,5-dichloro-dimethoxy-fluorescein (JOE), ROX™, DABCYL, and Dabsyl. However, any suitable combination of fluorophores spanning the visible spectrum may be used. Accordingly, in some embodiments, the fluorescent label includes, but is not limited to, a FAM label, a HEX label, a TET label, a JOE label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, a Dabsyl label, and any combination thereof.
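The cluster-plot interpretation of paragraph [0085] can be sketched, under stated assumptions, as a simple classifier on ROX-normalized end-point signals. The angular thresholds (30 and 60 degrees), the minimum-signal cutoff, and the name `call_genotype` are hypothetical values chosen for illustration; they are not taken from the disclosure, and instrument software may cluster the data differently.

```python
import math


def call_genotype(fam: float, hex_: float, rox: float,
                  min_signal: float = 0.2,
                  low_angle: float = 30.0,
                  high_angle: float = 60.0) -> str:
    """Assign a genotype from end-point FAM/HEX fluorescence.

    FAM is plotted on the X axis and HEX on the Y axis after dividing each
    by the passive reference (ROX). Points near the origin are treated as
    no-template-like; otherwise the angle from the X axis decides the call.
    All thresholds are illustrative assumptions.
    """
    if rox <= 0:
        raise ValueError("ROX reference signal must be positive")
    x = fam / rox
    y = hex_ / rox
    if math.hypot(x, y) < min_signal:
        return "no call"                      # near the origin, like an NTC
    angle = math.degrees(math.atan2(y, x))
    if angle < low_angle:
        return "homozygous FAM allele"        # close to the X axis
    if angle > high_angle:
        return "homozygous HEX allele"        # close to the Y axis
    return "heterozygous"                     # central cluster


if __name__ == "__main__":
    print(call_genotype(fam=1000, hex_=50, rox=1000))
    print(call_genotype(fam=500, hex_=500, rox=1000))
```

Using an angle rather than raw intensities makes the call less sensitive to overall signal strength, which is one plausible design choice among several.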
C. Additional Sequencing Methods
[0087] In some embodiments of the present technology, one or more additional methods are utilized to detect the presence of one or more of genetic markers labeled as Csa_283848_106 and Csa_283848_115 carrying SNP 1 and SNP 2, respectively. The one or more additional methods can include an amplification method which involves utilization of a primer pair to generate a progeny population with the same sequence as the parental target sequence or a hybridization method which can utilize differences in thermal stability of double-stranded DNA to separate between matched and mismatched target-probe pairs for allelic discrimination.
[0088] In some embodiments, the one or more additional methods comprise an amplification method. In one aspect, the amplification method comprises a semi-thermal asymmetric reverse PCR (STARP) method, a TaqMan PCR method, a nucleic acid sequence-based amplification (NASBA) method, or a rhAmp based on RNase H2-dependent PCR (rhPCR) method.
[0089] In some embodiments, the one or more additional methods comprise a semi-thermal asymmetric reverse PCR (STARP) method. The STARP method can comprise one or more sets of locus-specific primers and a pair of universal primers. Exemplary universal primers include, but are not limited to, M13 forward and reverse primers, SP6, T3, T7 (e.g., T7 EEV, T7 Reverse, T7 Term), pBluescript KS, pBluescript SK, GST-Tag, CMV-Forward, CMV-Reverse, EGFP-C, EGFP-N, BGH-Reverse, GAL1 Forward, pTRE 3’, pTRE 5’, SV40-pArev, SV40-Promoter, U6 Primer, Xpress Forward, EBV-Rev primer, hU6-01, and hU6-02.
[0090] In some embodiments, the STARP method comprises a first oligonucleotide set comprising a first forward primer, a second forward primer, and a first reverse primer. In some embodiments, the first forward primer is allele-specific to the X chromosome and binds to an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the second forward primer is allele-specific to the Y chromosome and binds to a G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the first forward primer and the second forward primer are fluorescently labeled and the two fluorophores are different. In some embodiments, the first forward primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 2. In some embodiments, the second forward primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 3. In some embodiments, the first reverse primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 4.
[0091] In some embodiments, the STARP method further comprises a second oligonucleotide set comprising a third forward primer, a fourth forward primer, and a second reverse primer. In some embodiments, the third forward primer is allele-specific to the X chromosome and binds to a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the fourth forward primer is allele-specific to the Y chromosome and binds to an A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the third forward primer and the fourth forward primer are fluorescently labeled and the two fluorophores are different. In some embodiments, the third forward primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 6. In some embodiments, the fourth forward primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 7. In some embodiments, the second reverse primer comprises at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or about 100% sequence identity to SEQ ID NO: 8.
[0092] In some embodiments, the one or more additional methods comprise a TaqMan PCR method. The TaqMan PCR method can comprise a primer pair and a labeled probe. The labeled probe can comprise a fluorophore at one terminus and a quencher at the opposite terminus. In particular, the labeled probe can hybridize to a target site prior to the start of an amplification process. Since the fluorophore and the quencher are conjugated at opposing termini, the quencher quenches the fluorescence of the fluorophore. During amplification, the labeled probe can be hydrolyzed by the polymerase, thereby releasing the fluorophore. As such, a fluorescence is detected, correlating to the generation of the progeny sequence.
[0093] In some embodiments, the TaqMan PCR method comprises a first oligonucleotide set comprising a forward primer, a reverse primer, and a first probe. In some embodiments, the first probe is specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the TaqMan PCR method further comprises a second oligonucleotide set comprising a forward primer, a reverse primer, and a second probe. In some embodiments, the second probe is specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the first probe and the second probe are each independently at least about 12, at least about 13, at least about 14, at least about 15, at least about 16, at least about 17, at least about 18, at least about 19, at least about 20, at least about 21, at least about 22, at least about 23, at least about 24, at least about 25, at least about 30, at least about 35, at least about 40, at least about 45, at least about 50, or more nucleotides in length.
[0094] In some embodiments, the one or more additional methods comprise a nucleic acid sequence-based amplification (NASBA) method. The NASBA method is a technique that produces copies of target segments of RNA/DNA. In particular, the NASBA method can comprise a two-step process that targets RNA during an annealing step and then utilizes an enzyme cocktail (e.g., avian myeloblastosis reverse transcriptase (AMV-RT), RNase H, and RNA polymerase) during the amplification step.
[0095] In some embodiments, the NASBA method comprises a first oligonucleotide set comprising a forward primer, a reverse primer, and a first molecular beacon probe. In some embodiments, the first molecular beacon probe is specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1. In some embodiments, the NASBA method further comprises a second oligonucleotide set comprising a forward primer, a reverse primer, and a second molecular beacon probe. In some embodiments, the second molecular beacon probe is specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5. In some embodiments, the first molecular beacon probe and the second molecular beacon probe are each independently at least about 12, at least about 13, at least about 14, at least about 15, at least about 16, at least about 17, at least about 18, at least about 19, at least about 20, at least about 21, at least about 22, at least about 23, at least about 24, at least about 25, at least about 30, at least about 35, at least about 40, at least about 45, at least about 50, or more nucleotides in length.
[0096] In some embodiments, the one or more additional methods comprise a rhAmp based on RNase H2-dependent PCR (rhPCR) method. The rhPCR method utilizes RNase H2 along with a polymerase and a blocked primer pair as part of the amplification process. In particular, each of the blocked primer pairs can contain a single RNA base and a 3’ blocking group that is removed by the RNase H2 prior to extension by a DNA polymerase. The added specificity enables a higher sensitivity and specificity of the amplified fragments and less off-target product.
[0097] In some embodiments, the one or more additional methods comprise a hybridization method. In some instances, the hybridization method utilizes an oligonucleotide array. The oligonucleotide array (also referred to as DNA microarray, DNA chip, or biochip) is a collection of oligonucleotides (or probes) that are spotted on a solid surface. In some embodiments, probe-target hybridization can be detected and quantified by detection of fluorophore-, silver-, or chemiluminescence-labeled targets. In other embodiments, the oligonucleotides are further fluorescently labeled and upon binding to one or more targets, a change in fluorescence can be detected. In some embodiments, the hybridization method is a GENECHIP® array system. In some instances, the hybridization method is a BEAD ARRAY™-based system.
D. Kits
[0098] The genetic markers described herein can be adapted for use in a detection kit system. Accordingly, in some embodiments, the disclosure of the present technology provides a kit for identifying the sex of a Cannabis plant. In some embodiments, the kit comprises at least one set of forward and reverse primers selected from the group consisting of: (a) a first primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 1 having an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; and (b) a second primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 2 having a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8, wherein: (i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer, (ii) the probe sequence element comprises a fluorescent label, (iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum, and (iv) the fluorescent label 
of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum. In some embodiments, the kit comprises printed or electronic instructions for use.
[0099] The kit may be used according to the methods of the present technology, and may comprise further components necessary for DNA extraction from any plant sample or seed. In some embodiments, the target nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
EXAMPLES
[0100] The following examples are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially the same or similar results. The examples should in no way be construed as limiting the scope of the present technology, as defined by the appended claims.
Example 1: Informatic Process Leading to the Identification of Candidate Sex-Linked Markers
[0101] This example demonstrates the process taken to identify sex-linked cannabis SNPs encompassed by the present disclosure.
Methods
[0102] Sequence data generation. Genomic DNA of 34 plants of the Ch hemp genotype was used to generate Sequence-Based Genotyping (SBG) libraries. For complexity reduction one AFLP-based PstI/MseI primer combination (+0/+2) was used. QC on the libraries was done prior to sequencing. The libraries were sequenced on an Illumina HiSeq 2500 sequencer (single read sequencing; 125 bp). Illumina reads were subsequently filtered on quality, presence of sample identification tags, and the PstI restriction site motif. The number of reads per sample is presented in Table 1 and Figure 1. Samples shown in Table 1 are ordered from high to low number of reads that passed QC. In total, 205,170,030 sequence reads with sample IDs were generated, of which 176,344,784 (86.0%) passed the specific filtering criteria, which is consistent with expectations.
Table 1. Overview of reads that were generated and passed quality control per sample including summary statistics.
Figure imgf000029_0001
Figure imgf000030_0001
[0103] Reference building and SNP mining. Reads from all samples that passed the preprocessing criteria were used to establish a reference set. In this case, a reference set of 25,191 sequences for SNP discovery was generated. The produced reference sequences were subsequently mined for SNPs using the SBG analysis pipeline version 2.0. This resulted in 41,165 putative variants (SNPs) in 11,301 references. The raw SNP set was subsequently filtered using default settings (genotyping depth of 7 reads), aiming at obtaining a data set with good quality SNPs, resulting in 28,276 SNPs in 8,260 references (loci), with on average 3.4 SNPs per locus. In this step, a genotyping file was also constructed containing “A,” “B,” “H” genotypes and unknown genotypes/missing data (“U”-score).
[0104] The A, B, and H genotypes were assigned in the following manner: (i) A for the alternative allele; (ii) B for the reference allele; and (iii) H for the heterozygous SNPs.
[0105] U-scores can be the result of absence of fragments (biological origin, e.g., restriction site polymorphisms, which can be the case when a diverse sample set is analyzed) or due to fewer reads than set as threshold when applying the filter settings (= 7 reads for this project), which is a technical reason.
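The A/B/H/U coding of paragraphs [0103] to [0105] can be sketched, for illustration only, as follows. The depth threshold of 7 reads is taken from the text; the rule that any sample with reads supporting both alleles is scored “H”, and the name `score_genotype`, are simplifying assumptions of this sketch, since the decision criteria of the SBG pipeline are not described here.

```python
def score_genotype(ref_reads: int, alt_reads: int, min_depth: int = 7) -> str:
    """Code one SNP in one sample as A, B, H, or U.

    A = alternative allele, B = reference allele, H = heterozygous,
    U = unknown/missing (fewer reads than the depth threshold, which also
    covers an absent fragment with zero reads).
    """
    if ref_reads < 0 or alt_reads < 0:
        raise ValueError("read counts must be non-negative")
    if ref_reads + alt_reads < min_depth:
        return "U"
    if alt_reads == 0:
        return "B"
    if ref_reads == 0:
        return "A"
    return "H"  # assumption: reads for both alleles imply heterozygous


if __name__ == "__main__":
    for counts in [(12, 0), (0, 9), (5, 6), (3, 2), (0, 0)]:
        print(counts, score_genotype(*counts))
```

A production pipeline would typically apply an allele-fraction threshold before calling a heterozygote, so that a single erroneous read does not convert a homozygous call into “H”.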
[0106] QC on the dataset. Before the SBG data was used in a genetic diversity analysis, a filtering for missing data per SNP and per individual was performed. This resulted in a dataset consisting of 7,912 SNPs (in 2,478 loci) scored in 34 individuals.
[0107] Genetic distance analysis: Dendrogram. A dissimilarity matrix of Euclidean distances was estimated using the function “dist” of the statistical software package R. Clustering was performed using the function “hclust” with average distances as the agglomeration method. Subsequently, the genetic distances were visualized in a dendrogram (Figure 2). The analysis was done based on the total set of lines and per company.
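As an illustration of the dissimilarity matrix, the following Python sketch computes pairwise Euclidean distances between samples. The analysis itself was run in R; the numeric coding of genotypes (A = 0, H = 1, B = 2), the function name, and the assumption that no missing scores remain are choices made for this example only.

```python
import math

# Assumed numeric coding of genotype scores; the text does not state the coding used.
GENOTYPE_CODES = {"A": 0.0, "H": 1.0, "B": 2.0}


def euclidean_distance_matrix(samples):
    """Return the symmetric matrix of Euclidean distances between samples.

    Each sample is a sequence of A/H/B scores of equal length (no U scores).
    """
    coded = [[GENOTYPE_CODES[g] for g in sample] for sample in samples]
    n = len(coded)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(coded[i], coded[j])))
            dist[i][j] = dist[j][i] = d
    return dist


matrix = euclidean_distance_matrix(["BBH", "BBB", "AAH"])
```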
[0108] To evaluate to what extent the dendrogram is a good representation of the similarity matrix, the cophenetic value matrix was calculated. A cophenetic value matrix is made from similarity values directly extracted from the dendrogram. The cophenetic value matrix was compared with the original similarity matrix used to construct the dendrogram, and a “cophenetic correlation” between the two matrices was computed. This cophenetic correlation is an indication of how well the dendrogram represents the similarity matrix. Ideally, there should be a “cophenetic correlation” of 1.
[0109] For the dendrogram based on the matrix generated with the “Euclidean” coefficient, a “cophenetic correlation” of 0.68 was found. This indicates that the dendrogram is a poor representation of the similarity between the lines. It should be noted that a dendrogram is a simplification of the similarity matrix (as is represented by the cophenetic correlation) and that caution should be taken in the interpretation.
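The cophenetic correlation can be sketched in plain Python as follows. This is an illustrative re-implementation under stated assumptions, not the R code used in the study: clustering uses average linkage (UPGMA), the cophenetic distance of two samples is the height at which their clusters merge, and the correlation is the Pearson coefficient over all sample pairs. All function names are hypothetical.

```python
import math
from itertools import combinations


def average_linkage_cophenetic(dist):
    """Cluster with average linkage and return the cophenetic distance matrix."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            d = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Every pair split across the two merged clusters gets the merge height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = d
        clusters[a] = clusters[a] + clusters.pop(b)
    return coph


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def cophenetic_correlation(dist):
    coph = average_linkage_cophenetic(dist)
    pairs = list(combinations(range(len(dist)), 2))
    return pearson([dist[i][j] for i, j in pairs], [coph[i][j] for i, j in pairs])
```

A distance matrix that is already tree-like (ultrametric) yields a value of 1; the further the value falls below 1, the more the dendrogram distorts the original distances.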
[0110] Genetic distance analysis: Principal coordinate analysis. The distance matrix calculated using the “Euclidean” coefficient was also analyzed using a principal coordinate analysis (PCO). The PCO allows a better separation of different clusters by using different coordinates. The first coordinate provides the best one-dimensional fit to the distance matrix; the first two coordinates provide the best two-dimensional fit, and so on. For this PCO, the function “pco” from the package “labdsv” was used (Figure 3).
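To show what the first principal coordinate represents, the sketch below applies classical scaling in plain Python: the squared distance matrix is double-centered and its leading eigenvector is found by power iteration. This is a substitute technique chosen for a dependency-free illustration, not the “pco” routine of labdsv; the function name, starting vector, and iteration count are assumptions.

```python
import math


def first_principal_coordinate(dist, iterations=500):
    """Return the first principal coordinate of each sample.

    Uses Gower double-centering followed by power iteration. The starting
    vector is arbitrary; an unlucky choice orthogonal to the leading
    eigenvector would not converge to it.
    """
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # B = -1/2 * J * D^2 * J, with J the centering matrix
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [x * scale for x in v]


# Three samples lying on a line at positions 0, 1 and 3.
coords = first_principal_coordinate([[0, 1, 3], [1, 0, 2], [3, 2, 0]])
```

For samples that truly lie on one line, as in this example, the first coordinate reproduces the pairwise distances; real genotype data need additional coordinates.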
[0111] Association analysis for gender. An association study was performed on all generated markers by testing their association with the recorded gender phenotypes. The data was analyzed using an R script that calculates the association between genotype and phenotype. For this correlation (cor), the linear model (lm) function from R was used.
[0112] Results for the Ch population are presented in Table 2. A high association was found for 62 SNPs (34 loci), and out of this set 55 SNPs (32 loci) had a 100% association between genotype and phenotype. The homozygous “B” score is associated with female and the heterozygous score “H” is associated with male.
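The association measure can be illustrated with a short Python sketch. For a linear model with a single numeric predictor, the explained variance equals the squared Pearson correlation, which is what the code computes. The numeric genotype coding (A = 0, H = 1, B = 2) and the function name are assumptions; the phenotype coding of 0 for female and 1 for male is the one used in this Example.

```python
# Assumed numeric coding of genotype scores for the regression.
GENOTYPE_CODES = {"A": 0.0, "H": 1.0, "B": 2.0}


def explained_variance(genotypes, phenotypes):
    """Return r^2 of phenotype (0 = female, 1 = male) regressed on genotype."""
    xs = [GENOTYPE_CODES[g] for g in genotypes]
    ys = [float(p) for p in phenotypes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic marker or a single-sex panel explains nothing
    return cov * cov / (var_x * var_y)


# B in every female and H in every male: a fully associated marker.
perfect = explained_variance(["B", "B", "H", "H"], [0, 0, 1, 1])
# Scores unrelated to sex.
unrelated = explained_variance(["B", "H", "B", "H"], [0, 0, 1, 1])
```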
[0113] Mapping to reference genomes. The 28,276 SNPs in 8,260 loci from 1.2.2 were mapped to the CBDRx reference genome (see, e.g., Grassa et al., bioRxiv 458083; doi: doi.org/10.1101/458083). The loci that mapped to the X chromosome of the CBDRx reference genome (which was generated from a female hemp plant) with a data correlation value of greater than 90% were further mapped to the Ch reference genome (which was generated from a female hemp plant) using BLAST with the following settings: word size = 25; reward = 1; penalty = -2; e-value threshold = 0.1; maximum number of target sequences returned = 30; maximum number of high-scoring pairs = 30; percent identity threshold = 90%. Loci that mapped to multiple positions in the Ch genome were eliminated from further analysis, while loci that mapped to a unique location on the Ch genome were further mapped to the Finola reference genome (which was generated from a male hemp plant). Loci that mapped to more than two positions in the Finola genome were eliminated from further analysis. Additional BLAST analysis showed that these fragments map to a unique position in reference genomes derived from female plants, and no more than two positions in reference genomes derived from male plants.
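The BLAST settings and the copy-number filter can be written down as a small Python sketch. It assumes the NCBI BLAST+ `blastn` program, whose options `-word_size`, `-reward`, `-penalty`, `-evalue`, `-max_target_seqs`, `-max_hsps`, and `-perc_identity` correspond to the listed settings; the file and database names, the tabular output format, and both function names are hypothetical. The command is only assembled here, not executed.

```python
def blastn_arguments(query_fasta, database):
    """Build a blastn argument list that mirrors the settings in the text."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", database,
        "-word_size", "25",
        "-reward", "1",
        "-penalty", "-2",
        "-evalue", "0.1",
        "-max_target_seqs", "30",
        "-max_hsps", "30",
        "-perc_identity", "90",
        "-outfmt", "6",  # tabular output; an assumption, not stated in the text
    ]


def keep_locus(female_genome_hits, male_genome_hits):
    """Keep loci with one position in the female-derived genome and at most
    two positions in the male-derived genome. How loci without any hit in
    the male-derived genome were treated is not stated in the text."""
    return female_genome_hits == 1 and male_genome_hits <= 2


# Hypothetical file and database names.
command = blastn_arguments("sbg_loci.fasta", "ch_genome_db")
```

The list returned by `blastn_arguments` could be passed to `subprocess.run` on a system where BLAST+ is installed and the database has been built.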
Results
[0114] For the identification of sex-linked SNPs in cannabis, Sequence-Based Genotyping (SBG) was carried out on 34 plants of the Ch hemp genotype using high quality genomic DNA (gDNA) for the construction of SBG libraries. Libraries were subsequently sequenced on the Illumina HiSeq2500 system. A reference set generated within the SBG process was used as the reference sequence to which the Illumina HiSeq2500 reads, generated within this project, were mapped. The resulting alignments were subsequently mined for SNPs. This genotypic data was subsequently used for a genetic diversity analysis.
[0115] When analyzing generated SBG data, 28,276 SNPs in 8,260 loci were generated using default settings (genotyping depth of 7 reads). Filtering of the data was applied to leave only markers with a maximum of 10% missing data and samples with a maximum of 50% missing data, resulting in a dataset consisting of 7,912 SNPs (in 2,478 loci) scored in 34 individuals.
Note that U-scores can be due to (1) fewer reads than set as threshold when applying the filtering settings (= 7 reads for this project), which is a technical reason or (2) absence of fragments which is the result of a biological origin, e.g., restriction site polymorphisms. In the case of large genetic variation within the analyzed germplasm set, more of such polymorphism is expected.
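The missing-data filter of paragraph [0115] can be sketched in Python as shown below. The 10% and 50% limits come from the text; the order of the two steps (markers first, then samples), the matrix layout with one row per sample, and the function name are assumptions made for this example.

```python
def filter_missing(matrix, max_marker_missing=0.10, max_sample_missing=0.50):
    """Return (kept_sample_indices, kept_marker_indices).

    matrix[i][j] is the A/B/H/U score of sample i at marker j.
    Markers are filtered first and samples second; this order is assumed.
    """
    n_samples = len(matrix)
    n_markers = len(matrix[0]) if matrix else 0
    kept_markers = [
        j for j in range(n_markers)
        if sum(row[j] == "U" for row in matrix) / n_samples <= max_marker_missing
    ]
    reduced = [[row[j] for j in kept_markers] for row in matrix]
    kept_samples = [
        i for i, row in enumerate(reduced)
        if row and row.count("U") / len(row) <= max_sample_missing
    ]
    return kept_samples, kept_markers


# 20 samples x 3 markers: sample 0 is missing at markers 0 and 1,
# and marker 2 is missing in three other samples (15%).
scores = [["B", "B", "B"] for _ in range(20)]
scores[0] = ["U", "U", "B"]
for i in (1, 2, 3):
    scores[i][2] = "U"
samples, markers = filter_missing(scores)
```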
[0116] Based on the dendrogram and the PCO plots (Figures 2-3), it was concluded that genetic variation is present within the analyzed Ch samples, based on the current data set. A high association was found with 62 SNPs (34 loci), and out of this set 55 SNPs (32 loci) had a 100% association between genotype and phenotype (Table 2). Explained variance (r²) and P-value are shown for association with phenotypic data for gender. A and B indicate homozygous marker scores (not corrected for parental scores); and H indicates a heterozygous marker score. The phenotypic data is coded as 0 (female) and 1 (male). This table presents the top 64 SNPs with the highest association (out of 7,912 SNPs). The homozygous “B” score is associated with female and the heterozygous score “H” with male.
Figure imgf000034_0001
Figure imgf000035_0001
[0117] In parallel, the 28,276 SNPs in 8,260 loci were mapped initially to the CBDRx reference genome, and further filtered and mapped to the Ch and Finola genomes. This resulted in 14 SNPs in 11 loci which were confirmed by KASP assays. Additional BLAST analysis showed that these fragments map to a unique position in reference genomes derived from female plants and to no more than two positions in reference genomes derived from male plants (Figures 4A-4B).
Example 2: Description of Kompetitive Allele Specific PCR (KASP) Assays
[0118] This example describes the identification of two novel genetic markers labeled as Csa_283848_106 (SEQ ID NO: 1) and Csa_283848_115 (SEQ ID NO: 5), which carry unique SNPs able to accurately predict cannabis plant sex at a genetic level.
Methods
[0119] Genomic DNA (gDNA) isolation. Hemp seeds were germinated on wet filter paper for 3 days in the dark at 25°C prior to transfer to soil. Germinated seedlings in soil were grown at 25°C under a 16 h light, 8 h dark photoperiod for 2 weeks prior to leaf sampling. Approximately 100 mg of leaf tissue was harvested, and total genomic DNA was isolated using the Qiagen DNeasy Plant Mini Kit according to the manufacturer’s protocol.
[0120] Kompetitive allele specific PCR (KASP) assay. KASP genotyping technology is a homogeneous, fluorescence (FRET) based assay that enables accurate bi-allelic discrimination of known SNPs. The two genetic markers (Csa_283848_106 and Csa_283848_115), carrying unique single-nucleotide polymorphisms (SNPs), were identified from an internal association study for sex-linked SNPs using SBG data of hemp line Ch, as outlined in Example 1. The SNPs for male and female cannabis plants identified in the 34 Ch plants (used for SBG) and incorporated in the respective KASP markers (Csa_283848_106 and Csa_283848_115) are shown in Table 3.
Figure imgf000036_0001
Figure imgf000037_0001
[0121] The respective SNP (for male or female) identified in each reference genome for the respective KASP marker is shown in Table 4.
Table 4. Reference genome SNP for KASP markers (Csa_283848_106 and Csa_283848_115)
Figure imgf000038_0001
[0122] These markers map as single copies in publicly known reference cannabis genomes and all map to the female (X) chromosome. KASP assay primers were designed and synthetically manufactured by LGC, Biosearch Technologies (Hoddesdon, UK) for use in KASP genotyping assays (Table 5); fluorophores occur at the 5’ end of the primers: FAM (495 nm excitation, 520 nm emission) for allele 1 (Y) and HEX (535 nm excitation, 556 nm emission) for allele 2 (X).
Table 5. Details of KASP assay design: DNA sequences for each marker with the respective SNPs indicated within “[ ]” together with fluorophores and GC content are listed.
Figure imgf000038_0002
M = A or C; Y = T or C; S = G or C; R = A or G
[0123] Table 6 provides the KASP assay forward and reverse primer sets for each SNP. For each SNP there are two different, allele-specific (X or Y) competing forward primers (i.e., one primer for each SNP allele) and one common reverse primer.
Figure imgf000039_0001
[0124] For all samples (carried out in a 96-well format), each amplification reaction contained 10 ng template gDNA, KASP 2x Master mix and KASP-by-Design assay mix (LGC, Biosearch Technologies) in a final volume of 10 µL. The PCR thermocycling conditions for both primer sets (Csa_283848_106 and Csa_283848_115) were 15 min at 94°C, followed by 10 cycles of 94°C for 20 sec and 61°C for 1 min (dropping 0.6°C per cycle to achieve an annealing temperature of 55°C), followed by 26 cycles of 94°C for 20 sec and 30°C for 1 min. PCR amplification and analysis were performed using the QuantStudio 7 Pro Real-Time PCR System (Applied Biosystems).
Results
[0125] Two novel genetic markers labeled as Csa_283848_106 (SEQ ID NO: 1) and Csa_283848_115 (SEQ ID NO: 5) carrying unique SNPs able to accurately predict the plant sex at a genetic level were identified. If the plant is homozygous (has 2 copies of the same allele, e.g., X:X) at the specific locus, then it is predicted to be female; if it is heterozygous (1 copy of each allele, e.g., X:Y), then it is predicted to be male. Figure 5 shows an example of a KASP assay allelic discrimination plot for Csa_283848_115 (SEQ ID NO: 5), which clearly distinguishes females (homozygous for the allele) from males (heterozygous for the allele).
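The prediction rule can be expressed as a short Python sketch. The rule itself (homozygous predicts female, heterozygous predicts male) is taken from the paragraph above, and the allele pairs are those recited in the claims (A or G for SNP 1, C or A for SNP 2). The function names and the "discordant" label for disagreeing markers are hypothetical.

```python
# Alleles recited in the claims for each marker.
MARKER_ALLELES = {
    "Csa_283848_106": {"A", "G"},  # SNP 1
    "Csa_283848_115": {"C", "A"},  # SNP 2
}


def predict_sex(marker, genotype):
    """Predict sex from a two-allele genotype call, e.g. ("A", "G")."""
    allowed = MARKER_ALLELES[marker]
    if len(genotype) != 2 or not set(genotype) <= allowed:
        raise ValueError(f"unexpected genotype {genotype!r} for {marker}")
    return "female" if genotype[0] == genotype[1] else "male"


def combined_call(calls):
    """Return the shared prediction, or 'discordant' if the markers disagree."""
    unique = set(calls)
    return unique.pop() if len(unique) == 1 else "discordant"


call = combined_call([
    predict_sex("Csa_283848_106", ("A", "G")),
    predict_sex("Csa_283848_115", ("C", "A")),
])
```

In practice the genotype call would come from the cluster assignment in the allelic discrimination plot rather than from typed-in alleles.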
[0126] The two novel genetic markers identified herein show high accuracy for sex determination in cannabis; thus far 426 plants representing 88 independent lines were evaluated. Csa_283848_106 (SEQ ID NO: 1) demonstrated a 99.7% accuracy in sex determination, while Csa_283848_115 (SEQ ID NO: 5) had 100% accuracy in distinguishing male from female plants. All results from KASP assays with the above two novel identified markers were confirmed with phenotypic observations.
[0127] Accordingly, these results demonstrate that the genetic markers of the present technology (Csa_283848_106 and Csa_283848_115 carrying SNP 1 and SNP 2, respectively) are useful in methods for identifying the sex of a Cannabis plant, and do so with high accuracy.
Example 3: Description of MADC2 Primer PCR Assays
[0128] This example demonstrates the use of publicly known sex-determination DNA markers for cannabis (e.g., MADC2).
Methods
[0129] Sex determination by traditional PCR. PCRs were carried out under standard conditions using 50 ng of genomic DNA, 1X DreamTaq green PCR master mix (ThermoScientific, #K1081) and 10 µM each of the MADC2 forward and reverse primers in a 50 µL reaction volume. The reactants were denatured at 95°C for 10 min, followed by 39 cycles of 95°C for 30 sec, 55°C for 30 sec, and 72°C for 60 sec, and a final extension at 72°C for 7 min in a thermocycler. Five microliters of PCR product was separated on a 2% agarose gel with 5 µL O’GeneRuler Express DNA ladder (ThermoScientific, #SM1551) for 1.5 h at 125 V. The MADC2 forward primer, 5’-GTGACGTAGGTAGAGTTGAA-3’ (SEQ ID NO: 9), and the MADC2 reverse primer, 5’-GTGACGTAGGCTATGAGAG-3’ (SEQ ID NO: 10), amplify male-specific amplicons (Mandolino et al., 1999; Kolenc & Cerenak, 2017; Sakamoto et al., 2005). A single PCR product of 391 bp indicated male plants. For female plants, a single PCR product of 560 bp occurred, and on occasion (line dependent) two PCR products of 560 bp and 870 bp were produced.
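The band-size interpretation can be sketched as a small Python function, taking the male-specific product as 391 bp and the female products as 560 bp, optionally accompanied by 870 bp. The function name, the 20 bp sizing tolerance, and the "undetermined" label are hypothetical; gel-based size estimates are approximate, so a real tolerance would have to be chosen for the ladder and gel in use.

```python
MALE_BAND_BP = 391
FEMALE_BANDS_BP = (560, 870)


def classify_madc2(band_sizes_bp, tolerance_bp=20):
    """Classify a lane from its estimated band sizes (in bp).

    tolerance_bp is an assumed allowance for gel sizing error.
    """
    def near(size, target):
        return abs(size - target) <= tolerance_bp

    if len(band_sizes_bp) == 1 and near(band_sizes_bp[0], MALE_BAND_BP):
        return "male"
    has_560 = any(near(b, 560) for b in band_sizes_bp)
    only_female_bands = all(
        any(near(b, target) for target in FEMALE_BANDS_BP) for b in band_sizes_bp
    )
    if has_560 and only_female_bands:
        return "female"
    return "undetermined"


lanes = [classify_madc2([391]), classify_madc2([560]), classify_madc2([560, 870])]
```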
Results
[0130] All samples (using the same DNA aliquot used in KASP assays) were analyzed by the traditional PCR method with known MADC2 primers and identified as male or female based on the differential size of the PCR product (Figure 6). Results were validated by phenotypic observations, and a 99.8% accuracy in distinguishing male from female was obtained. One sample was identified as a female by PCR but phenotypically was a male. This was not unexpected as it is known that the accuracy of publicly available sex-determination DNA markers for cannabis such as MADC2, SCAR323 and SCAR119 varies between genotypes.
Both KASP assays (Csa_283848_106 and Csa_283848_115) correctly identified it as a male. A summary of the screening results for the MADC2 primer PCR and KASP assays (Csa_283848_106 and Csa_283848_115) is provided in Table 7.
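As a check on the arithmetic, the reported MADC2 accuracy can be reproduced in a few lines of Python, assuming that all 426 plants from Example 2 were included in the MADC2 screen and that exactly one sample was misidentified. The function name is hypothetical.

```python
def accuracy_percent(correct, total):
    """Return the percentage of correct calls, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# One misidentified sample out of 426 (the total is assumed from Example 2).
madc2_accuracy = accuracy_percent(425, 426)
```

Under these assumptions the result rounds to 99.8%, which agrees with the value reported for the MADC2 primers.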
Table 7. Summary of screening results for the MADC2 primer PCR, and KASP assays Csa_283848_106 (ID: 7) and Csa_283848_115 (ID: 11).
Figure imgf000042_0001
[0131] Accordingly, these results demonstrate that the genetic markers of the present technology (Csa_283848_106 and Csa_283848_115 carrying SNP 1 and SNP 2, respectively) are useful in methods for identifying the sex of a Cannabis plant, and do so with high accuracy.
REFERENCES
Sakamoto K, Abe T, Matsuyama T, Yoshida S, Ohmido N, Fukui K, Satoh S. (2005) RAPD markers encoding retrotransposable elements are linked to the male sex in Cannabis sativa L. Genome 48(5):931-6.
Mandolino G, Carboni A, Forapani S, Faeti V, Ranalli P. (1999) Identification of DNA markers linked to the male sex in dioecious hemp (Cannabis sativa L.). Theor Appl Genet 98: 86-92.
Kolenc, Z, Cerenak A. (2017) Application of sex molecular markers in hemp plant (Cannabis sativa sp.). Hmeljarski Bilten 24: 121-128.
Techen et al. (2016) Genetic identification of female Cannabis sativa plants at early developmental stage. Planta Med 76(16): 1938-9.
Mendel et al. (2016) Progress in early sex determination of cannabis plant by DNA markers. MendelNet 2016: 731-735.
Törjék et al. (2002) Novel male-specific molecular markers (MADC5, MADC6) in hemp. Euphytica 127(2): 209-218.
Djivan et al. (2020) Development of genetic markers for sexing Cannabis sativa seedlings. DOI: 10.1101/2020.05.25.114355, Corpus ID: 219153952.
Heikrujam et al. (2014) Review on different mechanisms of sex determination and sex-linked molecular markers in dioecious crops: a current update. Euphytica 201: 161-194.
EQUIVALENTS
[0132] The present technology is not to be limited in terms of the particular embodiments described in this application, which are intended as single illustrations of individual aspects of the present technology. Many modifications and variations of this present technology can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. Functionally equivalent methods and apparatuses within the scope of the present technology, in addition to those enumerated herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the appended claims. The present technology is to be limited only by the terms of the appended claims, along with the full scope of equivalents to which such claims are entitled. It is to be understood that this present technology is not limited to particular methods, reagents, compounds compositions or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.
[0133] In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.
[0134] As will be understood by one skilled in the art, for any and all purposes, particularly in terms of providing a written description, all ranges disclosed herein also encompass any and all possible subranges and combinations of subranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as “up to,” “at least,” “greater than,” “less than,” and the like, include the number recited and refer to ranges which can be subsequently broken down into subranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 cells refers to groups having 1, 2, or 3 cells. Similarly, a group having 1-5 cells refers to groups having 1, 2, 3, 4, or 5 cells, and so forth.
[0135] All publicly available documents referenced or cited to herein, such as patents, patent applications, provisional applications, and publications, including GenBank Accession Numbers, are incorporated by reference in their entirety, including all figures and tables, to the extent they are not inconsistent with the explicit teachings of this specification.
[0136] Other embodiments are set forth within the following claims.
SEQUENCE LISTING
Figure imgf000045_0001

CLAIMS
What is claimed is:
1. A method for identifying the sex of a Cannabis plant, the method comprising detecting the presence of a single nucleotide polymorphism (SNP) in a nucleic acid sample obtained from the Cannabis plant, wherein the SNP is identified as SNP 1 having an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1.
2. The method of claim 1, further comprising:
(a) selecting a female Cannabis plant when SNP 1 is homozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or
(b) selecting a male Cannabis plant when SNP 1 is heterozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1.
3. A method for identifying the sex of a Cannabis plant, the method comprising detecting the presence of a single nucleotide polymorphism (SNP) in a nucleic acid sample obtained from the Cannabis plant, wherein the SNP is identified as SNP 2 having a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
4. The method of claim 3, further comprising:
(a) selecting a female Cannabis plant when SNP 2 is homozygous at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; or
(b) selecting a male Cannabis plant when SNP 2 is heterozygous at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
5. The method of any one of claims 1-4, comprising:
(a) selecting a female Cannabis plant when SNP 1 is homozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1 and when SNP 2 is homozygous at a position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; or
(b) selecting a male Cannabis plant when SNP 1 is heterozygous at a position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, and when SNP 2 is heterozygous at a position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5.
6. The method of any one of claims 1-4, wherein the detecting comprises:
(a) contacting the nucleic acid sample with sets of forward and reverse primers selected from the group consisting of:
(i) two forward primers for identifying SNP 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4;
(ii) two forward primers for identifying SNP 2, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8; and
(iii) primers having at least 90% sequence identity to the nucleotide sequences of the forward and reverse primer sets of (i)-(ii),
(b) to produce a reaction-sample mixture, wherein:
(i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer,
(ii) the probe sequence element comprises the fluorescent label,
(iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum, and
(iv) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum;
(c) subjecting the reaction-sample mixture to polymerase chain reaction (PCR) conditions under which each of SNP 1 and SNP 2 present in the nucleic acid sample is amplified to produce a fluorescent signal; and
(d) measuring the amount of fluorescent signal produced from each fluorescent label.
7. The method of claim 6, wherein the fluorescent label is selected from the group consisting of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, a Dabsyl label, and any combination thereof.
8. The method of claim 6 or 7, wherein the PCR is Kompetitive Allele Specific PCR (KASP).
9. The method of any one of claims 6-8, wherein all of the primer sets are contained together in an amplification mix further comprising DNA polymerase, dNTPs, and PCR buffer prior to contacting with the nucleic acid sample.
10. The method of any one of claims 1-9, wherein the Cannabis plant is a Cannabis sativa plant.
11. The method of any one of claims 1-10, wherein the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
12. A kit for identifying the sex of a Cannabis plant, the kit comprising at least one set of forward and reverse primers selected from the group consisting of:
(a) a first primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 1 having an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4; and
(b) a second primer set comprising two forward primers that specifically hybridize under stringent conditions to a segment of a target nucleic acid sample comprising a SNP identified as SNP 2 having a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5, the forward primers comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8, wherein:
(i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer,
(ii) the probe sequence element comprises a fluorescent label,
(iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum, and
(iv) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum.
13. The kit of claim 12, wherein:
(a) the primer sets are present together in an amplification mix that further comprises DNA polymerase, dNTPs, and PCR buffer; and/or
(b) the fluorescent label is selected from one or more of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, and a Dabsyl label; and/or
(c) the Cannabis plant is a Cannabis sativa plant; and/or
(d) the target nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age; and/or
(e) the kit further comprises printed or electronic instructions for use.
14. A primer selected from the group consisting of:
(a) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 2;
(b) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 3;
(c) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 6;
(d) an oligonucleotide comprising a nucleotide sequence as set forth in SEQ ID NO: 7; and
(e) a nucleotide sequence having at least 90% sequence identity to any one of the nucleotide sequences of (a)-(d), wherein the primer comprises a fluorescent label.
15. The primer of claim 14, wherein:
(a) the primer is a fluorescently labeled primer-probe comprising an oligonucleotide probe sequence element at the 5’ end of the primer, and
(b) the probe sequence element comprises the fluorescent label.
16. A method for producing a population of female Cannabis plants, the method comprising:
(a) selecting a female Cannabis plant when the presence of one or more homozygous single nucleotide polymorphisms (SNPs) in a nucleic acid sample obtained from the Cannabis plant is detected, wherein the one or more SNPs are selected from:
(i) a SNP identified as SNP 1, wherein SNP 1 is homozygous for an A or a G nucleotide at the position that corresponds to position 103 of the nucleotide sequence as set forth in SEQ ID NO: 1; or
(ii) a SNP identified as SNP 2, wherein SNP 2 is homozygous for a C or an A nucleotide at the position that corresponds to position 90 of the nucleotide sequence as set forth in SEQ ID NO: 5; and
(b) planting or growing a crop of the female Cannabis plants.
17. The method of claim 16, wherein the homozygous SNPs are detected by a competitive allele-specific polymerase chain reaction (PCR) assay, comprising:
(a) contacting the nucleic acid sample with sets of forward and reverse primers selected from the group consisting of:
(i) two forward primers for identifying SNP 1 comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the nucleotide sequence as set forth in SEQ ID NO: 3, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 4;
(ii) two forward primers for identifying SNP 2 comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the nucleotide sequence as set forth in SEQ ID NO: 7, and a reverse primer comprising the nucleotide sequence as set forth in SEQ ID NO: 8; and
(iii) primers having at least 90% sequence identity to the corresponding nucleotide sequences of the forward and reverse primer sets of (i)-(ii),
(b) to produce a reaction-sample mixture, wherein:
(i) the forward primers are fluorescently labeled primer-probes comprising an oligonucleotide probe sequence element at the 5’ end of the primer,
(ii) the probe sequence element comprises the fluorescent label,
(iii) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 2 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 3 fluoresce in different regions of the visible spectrum,
(iv) the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 6 and the fluorescent label of the forward primer comprising the nucleotide sequence as set forth in SEQ ID NO: 7 fluoresce in different regions of the visible spectrum;
(c) subjecting the reaction-sample mixture to PCR conditions under which each of SNP 1 and SNP 2 present in the nucleic acid sample is amplified to produce a fluorescent signal; and
(d) measuring the amount of fluorescent signal produced from each fluorescent label.
18. The method of claim 17, wherein the fluorescent label is selected from the group consisting of a fluorescein amidite fluorophore (FAM) label, a hexachlorofluorescein (HEX) label, a tetrachlorofluorescein (TET) label, a 4,5-dichloro-dimethoxy-fluorescein (JOE) label, a VIC® label, a Cy™3 label, a Cy 3.5 label, a NED label, a ROX™ label, a Texas Red® label, a Pulsar® 650 label, a Cy 5 label, a Cy 5.5 label, a Biosearch Blue™ label, a CAL Fluor® Gold 540 label, a CAL Fluor Orange 560 label, a Quasar® 570 label, a TAMRA label, a CAL Fluor Red 590 label, a CAL Fluor Red 610 label, a CAL Fluor Red 635 label, a Quasar 670 label, a Quasar 705 label, a DABCYL label, a Dabsyl label, and any combination thereof.
19. The method of claim 17 or 18, wherein:
(a) the PCR assay is a Kompetitive Allele Specific PCR (KASP) assay; and/or
(b) all of the primer sets are contained together in an amplification mix further comprising DNA polymerase, dNTPs, and PCR buffer prior to contacting with the nucleic acid sample; and/or
(c) the Cannabis plant is a Cannabis sativa (C. sativa), C. indica, or C. ruderalis plant; and/or
(d) the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
20. A method of detecting the presence of at least one single nucleotide polymorphism (SNP) on a plurality of polynucleotide analytes in a nucleic acid sample obtained from a Cannabis plant, comprising: a) amplifying the plurality of polynucleotide analytes with at least a first oligonucleotide set that is specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 or a second oligonucleotide set that is specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 to generate amplified polynucleotide analytes; and b) detecting amplified polynucleotide analytes comprising an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 or a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
21. The method of claim 20, wherein the first oligonucleotide set comprises a first forward primer, a second forward primer, and a first reverse primer.
22. The method of claim 21, wherein:
(a) the first forward primer is allele-specific to the X chromosome and binds to an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(b) the second forward primer is allele-specific to the Y chromosome and binds to a G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(c) the first forward primer and the second forward primer are fluorescently labeled and the two fluorophores are different; and/or
(d) the second oligonucleotide set comprises a third forward primer, a fourth forward primer, and a second reverse primer; and/or
(e) the third forward primer is allele-specific to the X chromosome and binds to a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5; and/or
(f) the fourth forward primer is allele-specific to the Y chromosome and binds to an A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5; and/or
(g) the third forward primer and the fourth forward primer are fluorescently labeled and the two fluorophores are different; and/or
(h) the first forward primer comprises at least 90%, 95%, or 100% sequence identity to SEQ ID NO: 2; and/or
(i) the second forward primer comprises at least 90%, 95%, or 100% sequence identity to SEQ ID NO: 3; and/or
(j) the first reverse primer comprises at least 90%, 95%, or 100% sequence identity to SEQ ID NO: 4; and/or
(k) the third forward primer comprises at least 90%, 95%, or 100% sequence identity to SEQ ID NO: 6; and/or
(l) the fourth forward primer comprises at least 90%, 95%, or 100% sequence identity to SEQ ID NO: 7; and/or
(m) the second reverse primer comprises at least 90%, 95%, or 100% sequence identity to SEQ ID NO: 8.
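Items (h) through (m) of claim 22 recite sequence-identity thresholds against reference primers. As a rough illustration only, the sketch below computes percent identity for two equal-length, already-aligned sequences. The function name `percent_identity` and both example sequences are hypothetical placeholders, not the sequences of SEQ ID NOs: 2-8, and the claims do not prescribe any particular alignment or scoring method.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    # Count positions where the bases agree, ignoring letter case.
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


# Hypothetical 20-nt sequences for illustration only (not taken from the patent).
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTACGTACGTACGA"  # one mismatch, at the last position

identity = percent_identity(candidate, reference)
print(f"{identity:.1f}% identity")  # 19 of 20 positions match -> 95.0% identity
```

Under this simple position-by-position measure, the hypothetical candidate would meet the 90% and 95% thresholds but not the 100% threshold.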
23. The method of any one of claims 20-22, wherein the amplifying comprises a Kompetitive Allele Specific PCR (KASP) method.
24. The method of any one of claims 20-23, wherein the first oligonucleotide set or the second oligonucleotide set further comprises two universal primers.
25. The method of claim 24, wherein the amplifying comprises a semi-thermal asymmetric reverse PCR (STARP) method.
26. The method of claim 20, wherein:
(a) the first oligonucleotide set comprises a forward primer, a reverse primer, and a first probe; and/or
(b) the first probe is specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(c) the second oligonucleotide set comprises a forward primer, a reverse primer, and a second probe; and/or
(d) the second probe is specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5; and/or
(e) the amplifying comprises a TaqMan PCR method; and/or
(f) the first probe or the second probe is a molecular beacon probe; and/or
(g) the amplifying comprises a nucleic acid sequence-based amplification (NASBA) method; and/or
(h) the first probe and the second probe are each independently at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides in length; and/or
(i) the amplifying comprises a rhAmp based on RNase H2-dependent PCR (rhPCR) method.
27. The method of any one of claims 20-26, further comprising:
(a) determining whether the nucleic acid sample is homozygous for an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(b) determining whether the nucleic acid sample is heterozygous at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(c) determining whether the nucleic acid sample is homozygous for a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5; and/or
(d) determining whether the nucleic acid sample is heterozygous at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
28. The method of any one of claims 20-27, wherein:
(a) homozygous for an A and/or C is indicative of a female Cannabis plant; and/or
(b) heterozygous at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a male Cannabis plant.
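The interpretation recited in claims 27 and 28 can be sketched as a small lookup. This is a minimal illustration, not the applicant's software: the marker labels, the function name `call_sex`, and the "undetermined" fallback are assumptions introduced here, while the allele assignments (A or C as the X-linked allele, G or A as the Y-linked allele) follow claims 22 and 28.

```python
from typing import Iterable

# (X-linked allele, Y-linked allele) per marker, as recited in claims 22 and 28.
# The dictionary keys are hypothetical labels chosen for this sketch.
MARKER_ALLELES = {
    "SEQ_ID_NO_1_pos103": ("A", "G"),
    "SEQ_ID_NO_5_pos90": ("C", "A"),
}


def call_sex(marker: str, observed: Iterable[str]) -> str:
    """Map the alleles observed at one marker to a sex call."""
    x_allele, y_allele = MARKER_ALLELES[marker]
    alleles = {base.upper() for base in observed}
    if alleles == {x_allele}:
        return "female"        # homozygous for the X-linked allele
    if alleles == {x_allele, y_allele}:
        return "male"          # heterozygous: X- and Y-linked alleles both present
    return "undetermined"      # combination not addressed by the claims


print(call_sex("SEQ_ID_NO_1_pos103", ["A", "A"]))  # prints "female"
print(call_sex("SEQ_ID_NO_5_pos90", ["C", "A"]))   # prints "male"
```

Any allele combination the claims do not address, such as a sample showing only the Y-linked allele, is reported as "undetermined" rather than guessed.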
29. A method of detecting the presence of a single nucleotide polymorphism (SNP) on a polynucleotide analyte in a nucleic acid sample obtained from a Cannabis plant, comprising: a) hybridizing a polynucleotide analyte with a set of detectably labeled probes, wherein the set comprises a first probe that is specific for an A at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and is labeled with a first fluorophore, and a second probe that is specific for a G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and is labeled with a second fluorophore, wherein the first fluorophore and the second fluorophore are different; and b) detecting one or more fluorescent signals to determine whether the polynucleotide analyte comprises an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1.
30. A method of detecting the presence of a single nucleotide polymorphism (SNP) on a polynucleotide analyte in a nucleic acid sample obtained from a Cannabis plant, comprising: a) hybridizing a polynucleotide analyte with a set of detectably labeled probes, wherein the set comprises a third probe that is specific for a C at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 and is labeled with a third fluorophore, and a fourth probe that is specific for an A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5 and is labeled with a fourth fluorophore, wherein the third fluorophore and the fourth fluorophore are different; and b) detecting one or more fluorescent signals to determine whether the polynucleotide analyte comprises a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
31. The method of claim 29 or 30, wherein the first fluorophore and the third fluorophore are the same.
32. The method of any one of claims 29-31, wherein:
(a) the second fluorophore and the fourth fluorophore are the same; and/or
(b) each probe within the set is at least 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides in length.
33. The method of any one of claims 29-32, wherein:
(a) the set of probes further comprises two or more additional probes, wherein the additional probes are specific for a C or A at a nucleotide position that corresponds to position 90 of SEQ ID NO: 5; and/or
(b) the set of probes further comprises two or more additional probes, wherein the additional probes are specific for an A or G at a nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(c) the set of probes comprises molecular beacon probes; and/or
(d) the set of probes comprises an array.
34. The method of any one of claims 29-33, wherein the method:
(a) utilizes a GeneChip® array system; and/or
(b) utilizes a BeadArray™-based system.
35. The method of any one of claims 29-34, further comprising subjecting the nucleic acid sample to an amplification condition to generate the polynucleotide analyte prior to hybridizing the polynucleotide analyte with the set of detectably labeled probes.
36. The method of any one of claims 29-35, wherein:
(a) the presence of an A and the absence of a G is indicative that the nucleic acid sample is homozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(b) the presence of an A and a G is indicative that the nucleic acid sample is heterozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1; and/or
(c) the presence of a C and the absence of an A is indicative that the nucleic acid sample is homozygous at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5; and/or
(d) the presence of a C and an A is indicative that the nucleic acid sample is heterozygous at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5.
37. The method of any one of claims 29-36, wherein:
(a) homozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a female Cannabis plant; and/or
(b) heterozygous at the nucleotide position that corresponds to position 103 of SEQ ID NO: 1 and/or at the nucleotide position that corresponds to position 90 of SEQ ID NO: 5 is indicative of a male Cannabis plant.
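Claims 36 and 37 chain two steps: presence or absence of each probe signal gives zygosity, and zygosity gives the sex call. The sketch below shows that chain for a single marker under stated assumptions. The function names and the "no call" outcome are hypothetical, and signal detection is reduced to a boolean per fluorophore; the claims do not specify how a signal is thresholded.

```python
def zygosity_from_signals(x_probe_signal: bool, y_probe_signal: bool) -> str:
    """Zygosity at one marker from two probe signals.

    x_probe_signal: signal from the probe for A (SEQ ID NO: 1, position 103)
                    or C (SEQ ID NO: 5, position 90).
    y_probe_signal: signal from the probe for G (SEQ ID NO: 1, position 103)
                    or A (SEQ ID NO: 5, position 90).
    """
    if x_probe_signal and y_probe_signal:
        return "heterozygous"   # both alleles detected (claim 36(b), (d))
    if x_probe_signal:
        return "homozygous"     # first allele present, second absent (claim 36(a), (c))
    return "no call"            # pattern not addressed by claim 36


def sex_from_zygosity(zygosity: str) -> str:
    """Sex call from zygosity, following claim 37."""
    return {"homozygous": "female", "heterozygous": "male"}.get(zygosity, "undetermined")


for signals in [(True, False), (True, True), (False, False)]:
    zygosity = zygosity_from_signals(*signals)
    print(signals, zygosity, sex_from_zygosity(zygosity))
```

Keeping the two steps separate means a failed or unexpected signal pattern surfaces as "no call" before any sex call is attempted.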
38. The method of any one of claims 20-37, wherein the Cannabis plant is a Cannabis sativa plant.
39. The method of any one of claims 20-38, wherein the nucleic acid sample is obtained from a seed, seedling, tissue culture, or plant of any age.
PCT/US2022/075732 2021-09-01 2022-08-31 Novel molecular markers for sex determination in cannabis WO2023034846A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163239843P 2021-09-01 2021-09-01
US63/239,843 2021-09-01

Publications (1)

Publication Number Publication Date
WO2023034846A1 (en) 2023-03-09

Family

ID=85413090

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/075732 WO2023034846A1 (en) 2021-09-01 2022-08-31 Novel molecular markers for sex determination in cannabis

Country Status (1)

Country Link
WO (1) WO2023034846A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016197258A1 (en) * 2015-06-12 2016-12-15 Anandia Laboratories Inc. Methods and compositions for cannabis characterization
CN107988417A (en) * 2018-01-16 2018-05-04 福建农林大学 A kind of molecular labeling of hemp gunther sex-linked and application
WO2019222835A1 (en) * 2018-05-22 2019-11-28 Anandia Laboratories Inc. Sex identification of cannabis plants
US20210025014A1 (en) * 2019-07-23 2021-01-28 University Of Kentucky Research Foundation Methods for gender identification and cultivation of cannabis seeds

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TOTH JACOB A., STACK GEORGE M., CALA ALI R., CARLSON CRAIG H., WILK REBECCA L., CRAWFORD JAMIE L., VIANDS DONALD R., PHILIPPE GLEN: "Development and validation of genetic markers for sex and cannabinoid chemotype in Cannabis sativa L.", GCB BIOENERGY, vol. 12, no. 3, 1 March 2020 (2020-03-01), pages 213 - 222, XP093042925, ISSN: 1757-1693, DOI: 10.1111/gcbb.12667 *

Similar Documents

Publication Publication Date Title
TWI721708B (en) A molecular marker related to papaya fruiting
US10337072B2 (en) Copy number detection and methods
RU2620973C2 (en) Markers linked with soybean plants resistance to scn
KR102442563B1 (en) A biomarker for predicting a head stage of wheat
CN111961750A (en) KASP primer for detecting tomato yellow leaf curl virus disease resistance gene Ty-1 and application thereof
US20140053294A1 (en) Pongamia Genetic Markers and Method of Use
RU2593958C2 (en) Use of specific markers of brown average fibre 3 (brown midrib-3) in corn for dispersal signs
EP1466973A1 (en) Characteristic base sequences occurring in plant genes and method of utilizing the same
EP2875155B1 (en) Endpoint zygosity assay to detect rf4 gene in maize
CN116904636A (en) Molecular marker for detecting wheat stem WSC content QTL QWSC.caas-7DS and application
CN110777216A (en) Method for identifying purity of Jingke waxy 2000 corn hybrid based on SNP marker
CN113278723B (en) Composition for analyzing genetic diversity of Chinese cabbage genome segment or genetic diversity introduced in synthetic mustard and application
WO2023034846A1 (en) Novel molecular markers for sex determination in cannabis
CN111201318A (en) Method for detecting variants of cabbage plants
KR102266905B1 (en) Composition for selecting variety tolerant to rice seedling cold stress containing qSCT12 gene comprising DNA marker and method for selecting variety tolerant to rice seedling cold stress using DNA marker
JP5849317B2 (en) Variety identification marker of vegetative propagation crop
CN116904638B (en) Kasp markers linked to early females of quinoa and uses thereof
CN112725515B (en) Iris florida ground color SNP molecular marker primer composition and application thereof
KR102575912B1 (en) Single nucleotide polymorphism marker for discriminating cabbage having low content of sinigrin and uses thereof
KR102261419B1 (en) CAPS marker for discriminating presence or absence of trichome in tomato plant and uses thereof
KR101845254B1 (en) SNP markers associated with drought tolerance of Populus davidiana Dode and its use
KR101845251B1 (en) SNP markers associated with drought tolerance of Populus davidiana Dode and its use
JP4574263B2 (en) Hop variety identification method using microsatellite DNA
CN117305496A (en) Corn sweet taste related SNP locus and application thereof
CN116179741A (en) Molecular marker of melon monoscopic flower A gene CmACS7 and application thereof

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22865759; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)