OA21074A - Heterozygous CENH3 monocots and methods of use thereof for haploid induction and simultaneous genome editing


Publication number
OA21074A
Authority
OA
OAPI
Prior art keywords
plant
monocot
cenh3
haploid
protein
Prior art date
Application number
OA1202200508
Inventor
R. Kelly DAWE
David Jackson
Original Assignee
University of Georgia Research Foundation, Inc.
Cold Spring Harbor Laboratory
Priority date
Filing date
Publication date
Application filed by University of Georgia Research Foundation, Inc. and Cold Spring Harbor Laboratory


Abstract

Monocot plants heterozygous for centromeric histone 3 (CenH3) and optionally expressing gene editing constructs, for use in inducing haploids of a monocot target plant and optionally passthrough gene editing are provided. The monocot haploid inducer plants are typically composed of diploid plant cells having only one allele encoding a functional CENH3 protein. The diploid plant cells can also include, for example, one CenH3 allele encoding non-functional CENH3 protein. In some embodiments, the allele encoding non-functional CENH3 protein is a frameshift mutation, a protein null allele, an RNA null allele, or a combination thereof. The monocot haploid inducer plant can also include gene editing machinery, such as a site-directed nuclease and optionally a guide RNA stably expressed by cells of the monocot plant. Methods of inducing formation of a target haploid monocot plant while optionally simultaneously modifying the target monocot plant's genome are also provided.

Description

HETEROZYGOUS CENH3 MONOCOTS AND
METHODS OF USE THEREOF FOR HAPLOID INDUCTION AND SIMULTANEOUS GENOME EDITING
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of and priority to U.S.S.N. 63/036,902 filed June 9, 2020, and U.S.S.N. 63/036,910 filed June 9, 2020, each of which is incorporated by reference in its entirety.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
This invention was made with government support under 1444514 awarded by the National Science Foundation. The Government has certain rights in the invention.
REFERENCE TO SEQUENCE LISTING
The Sequence Listing submitted as a text file named “UGA_2020_139_03_PCT_ST25.txt,” created on June 7, 2021, and having a size of 20,835 bytes is hereby incorporated by reference pursuant to 37 C.F.R. § 1.52(e)(5).
FIELD OF THE INVENTION
The field of the invention is generally related to haploid inducer plant lines, and methods of use thereof for creating genetically modified doubled haploid plants.
BACKGROUND OF THE INVENTION
Tens of thousands of maize haploid lines are generated by breeding companies around the world each year as a prerequisite for creating new inbreds, which are ultimately used to create hybrids for sale. The induced haploids are doubled by chemical treatment and immediately tested for agronomic performance. The enabling technology was the discovery of a maize inbred called Stock 6 that induces haploids when crossed as a male (Coe et al., The American Naturalist, 93 (873): 381-82 (1959)). About 3% of the progeny from this cross are haploid for the maternal genome. This line has since been improved and selected for increased haploid formation, which can range from 3-20% (Hu et al., Genetics, 202 (4): 1267-76 (2016); Prigge et al., Genetics, 190 (2): 781-93 (2012)). All maize haploid-inducing applications trace to this original discovery and breeding lines derived from it. The relevant literature on this topic has been heavily reviewed (Chaikam et al., TAG. Theoretical and Applied Genetics. Theoretische Und Angewandte Genetik, 132 (12): 3227-43 (2019); Comai and Tan, Trends in Genetics. doi: 10.1016/j.tig.2019.07.005 (2019); Kalinowska et al., TAG. Theoretical and Applied Genetics. Theoretische Und Angewandte Genetik, 132 (3): 593-605 (2019)). The gene responsible for Stock 6-based haploid induction (Matrilineal, or matl) is a patatin-like phospholipase expressed primarily in pollen (Kelliher et al., Nature, 542 (7639): 105-9 (2017); Liu et al., Molecular Plant, 10 (3): 520-22 (2017); Gilles et al., The EMBO Journal, 36 (6): 707-17 (2017)). Its mechanism of action is not understood, but may involve a change in membrane properties during fertilization that leads to sudden loss of the paternal genotype. Mutations in matl also induce haploids in rice (Yao et al., Nature Plants, 4 (8): 530-33 (2018)), indicating that this method of haploid induction may be broadly used for monocot crop species.
Haploid induction itself is of broad interest. Also important was the subsequent demonstration that matl can be used to “invisibly” pass a CRISPR/Cas9 cassette into any inbred background (Kelliher et al., Nature Biotechnology, 37 (3): 287-92 (2019)) and edit genes in more than 3% of the haploid progeny. The resulting haploids can then be doubled to create inbreds with homozygous mutations in a fast, GMO-free manner. Any other way of editing genes requires transformation and regeneration, a process that is heavily genotype-dependent. By using a genotype-independent haploid inducer, this bottleneck can be avoided. However, the method is cumbersome, as it requires creating a mutation in matl before commencing with gene editing at another site. Lines containing the matl mutation are also unhealthy and difficult to propagate. Further, since the mechanism of matl action is not known, it is not clear how Cas9 is able to gain access to the maternal genome, and it is difficult to improve the technology.
Another method of inducing haploids is centromere-mediated haploid induction. Ravi and Chan (Ravi et al., Genetics, 186 (2): 461-71 (2010); Ravi et al., PLoS Genet., 7(6): e1002121 (2011), doi: 10.1371/journal.pgen.1002121; Ravi et al., Nature, 464 (7288): 615-18 (2011)) found that a cenh3-/- Arabidopsis null mutant, when complemented with a modified version of CENH3 called “GFP-tailswap”, can induce paternal haploids at a high frequency (~25%). GFP-tailswap is a complex transgene with a substitution in the histone tail and a large GFP moiety added to the N terminus. Other forms of tailswap involving CENH3 genes from different species with or without GFP (Britt and Kuppu, Frontiers in Plant Science, 7 (April): 357 (2016)), and point mutations that confer single amino acid changes of CENH3, can also induce haploids at a lower frequency (Karimi-Ashtiyani et al., PNAS, 112 (36): 11211-16 (2015); Kuppu et al., PLoS Genetics, 11 (9): e1005494 (2015); Kalinowska et al., TAG. Theoretical and Applied Genetics. Theoretische Und Angewandte Genetik, 132 (3): 593-605 (2019)).
Centromere-mediated haploid induction has also been successful in several other dicot species (Kalinowska et al., TAG. Theoretical and Applied Genetics. Theoretische Und Angewandte Genetik, 132 (3): 593-605 (2019)). Results in monocots are limited and unreliable (Kelliher et al., Frontiers in Plant Science, 7 (March): 414 (2016)), leading to the general view that for monocots, the Matrilineal system will be used, and for dicots, centromere-mediated haploid induction (Kalinowska et al., TAG. Theoretical and Applied Genetics. Theoretische Und Angewandte Genetik, 132 (3): 593-605 (2019)).
Thus, there remains a need for improved compositions and methods for haploid induction in monocots and haploids formed therefrom, optionally in combination with simultaneous gene editing for induction of one or more mutations relative to the background genome.
SUMMARY OF THE INVENTION
Monocot plants heterozygous for centromeric histone 3 (CenH3) and use thereof in methods for efficient centromere-mediated haploid induction in a target plant are provided. The monocot haploid inducer plants are typically composed of diploid plant cells having only one CENH3 allele that is fully functional. The diploid plant cells can also include, for example, one CenH3 allele encoding non-functional CENH3 protein. In some embodiments, the allele encoding non-functional CENH3 protein is a protein null allele, an RNA null allele, or a combination thereof. In some embodiments, the endogenous CenH3 locus on a first diploid chromosome is mutated, or partially or completely deleted. In some embodiments, the mutation is a frameshift mutation that introduces a stop codon, causing the gene to express a truncated, non-functional protein. Typically, the endogenous CenH3 locus on a second (i.e., the other) diploid chromosome is intact. The functional CENH3 protein can be wildtype CENH3 protein.
Typically, the plant lacks a chromosomally integrated or extrachromosomal transgene encoding wildtype CenH3, CENH3 protein variants, and fusion proteins. Thus, typically, the plant lacks a construct encoding a CENH3-green fluorescent protein (GFP) fusion protein such as GFP-tailswap. In some embodiments, the cenh3 null is used alone or in combination with other technologies that make use of haploids, such as synthetic apomixis, or transferring engineered chromosomes from one line to another.
The monocot haploid inducer plant can also optionally include gene editing machinery, such as a site-directed nuclease and optionally a guide RNA stably expressed by cells of the monocot plant. Typically, constructs expressing the gene editing machinery (e.g., nuclease and optionally guide RNA) are stably expressed by the monocot plant. In some embodiments, the site-directed nuclease is a CRISPR-based system, a transcription-activator like effector nuclease (TALEN), or a zinc-finger nuclease (ZFN), which may be deployed as cytidine deaminase or adenine deaminase fusion proteins. In some embodiments, a heterologous nucleic acid construct encoding the nuclease is integrated into the haploid inducer plant’s genome.
In some embodiments, the monocot haploid inducer plant’s genome includes a donor nucleic acid sequence to be introduced into a target plant’s genome by homology-directed repair (HDR) following cleavage by the nuclease.
Also provided are egg and sperm cells formed by the haploid inducer plants, lacking the one allele encoding functional CENH3 protein, and expressing gene editing machinery. In some embodiments, the egg cells have no more than about 12.5% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant. In some embodiments, sperm cells have no more than about 25% functional CENH3 protein relative to a corresponding sperm cell formed by a CenH3 homozygous plant.
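By way of non-limiting illustration, the upper bounds of about 12.5% and about 25% can be reproduced with a simple dilution calculation. The sketch below is not part of the disclosed methods. It assumes that a gamete carrying the null allele makes no new CENH3, that the inherited protein is split evenly at each post-meiotic division, and that three such divisions precede the egg while two (pollen mitosis I and II) precede the sperm. The function and constant names are hypothetical.

```python
from fractions import Fraction


def residual_cenh3_fraction(divisions: int) -> Fraction:
    """Upper bound on inherited CENH3 after `divisions` rounds of halving.

    Assumption: no new CENH3 is made and each division splits the
    existing protein evenly between the two daughter nuclei.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return Fraction(1, 2) ** divisions


# Assumed numbers of post-meiotic divisions (illustrative constants).
EGG_DIVISIONS = 3    # female gametophyte divisions preceding the egg
SPERM_DIVISIONS = 2  # pollen mitosis I and pollen mitosis II

egg_fraction = residual_cenh3_fraction(EGG_DIVISIONS)
sperm_fraction = residual_cenh3_fraction(SPERM_DIVISIONS)

print(f"egg: {float(egg_fraction):.1%}, sperm: {float(sperm_fraction):.1%}")
```

Under these assumptions the calculation gives 1/8 for the egg and 1/4 for the sperm, consistent with the bounds stated above.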
Methods of inducing formation of a target haploid monocot plant are also provided. The methods typically include pollinating a parent monocot target plant with pollen from a monocot haploid inducer plant or pollinating the monocot haploid inducer plant with pollen from a parent monocot target plant. Haploids are induced when egg or pollen carrying the cenh3 (i.e., null) allele, with diluted quantities of CENH3, are fertilized by pollen from a wild type line. Next, haploid progeny produced by the pollination are selected.
Methods of modifying the genome of a monocot target plant are also provided. The methods typically include inducing formation of a target haploid monocot plant expressing gene editing machinery, selecting haploid progeny with the genome of the monocot target plant but not the monocot haploid inducer plant, and wherein the genome of the haploid progeny has been modified by a site-directed nuclease and optionally at least one guide RNA delivered by the monocot haploid inducer plant.
Any of the methods can further include additional steps, for example, chromosome doubling of the selected haploid progeny. Chromosome doubling can be spontaneous or induced by, for example, a chromosome doubling agent optionally selected from colchicine, pronamide, dithiopyr, trifluralin, or another anti-microtubule agent.
The monocot haploid inducer plant can be, for example, maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam. In a preferred embodiment, the haploid inducer plant is maize.
BRIEF DESCRIPTION OF THE DRAWINGS
Figures 1A-1C illustrate the generation of the maize cenh3 null mutation by CRISPR/Cas9. Figure 1A is a schematic showing construction of vectors for genome editing in maize. Ubi-Cas9 includes a codon-optimized Cas9 driven by the maize polyubiquitin promoter. gRNAImmuneCENH3 includes a gRNA targeting the fourth exon of Cenh3 and an uncleavable ImmuneCenH3 gene driven by 2.1 kb of the Cenh3 native promoter. TailswapCENH3 is based on ImmuneCENH3 but includes a modified N-terminal tail and a GFP tag. Figure 1B is a schematic representation of the genomic structure of the CENH3 gene. Exons are shown with boxes. The protospacer adjacent motif (PAM) and 20 bp target sequence of the sgRNA are also shown (SEQ ID NO: 16) aligned with the complementary sequence/hybridizing segment of the sgRNA (SEQ ID NO: 17). Figure 1C is a chromatogram of the sequence from a heterozygous line showing the frameshift in the cenh3 null mutation (SEQ ID NOS: 18-19). The PAM, the deletion, and the stop codon are illustrated.
Figures 2A-2F illustrate the results of assays confirming that plants are haploid. Figures 2A and 2B are plots showing flow cytometric analysis of diploids and haploids. Diploid plants (2A) show peaks at 2N and 4N, where 4N is the result of endoreduplication in differentiated tissues. Haploid plants have 1N and 2N peaks (2B). Figures 2C and 2D are images of chromosome spreads. Maize diploids have 20 chromosomes (2C), whereas haploids have 10 (2D). Figures 2E and 2F are images of plants: haploid plants have a shorter stature (2E), and are sterile without exerted anthers (2F).
Figures 3A and 3B are plots illustrating the molecular karyotypes of aneuploids. For both panels, the chromosomes are shown end to end across the top. Figure 3A shows aneuploids produced from gl8 crosses. Aneuploid_1 is trisomic for chromosome 3, and aneuploid_2 is monosomic for chromosomes 2 and 4 and trisomic for chromosome 10. Figure 3B shows aneuploids produced from gl1 crosses. Aneuploid_3 and aneuploid_4 are monosomic for chromosome 7, aneuploid_5 is monosomic for chromosomes 3, 6 and 7, aneuploid_6 is monosomic for all the chromosomes except chromosome 1, and aneuploid_7 is monosomic for chromosome 9. The coverage in each sample was normalized to the coverage in the relevant diploid from each cross.
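The normalization described for Figures 3A and 3B can be illustrated with a minimal sketch. It is not the analysis pipeline used to produce the figures. It assumes that per-chromosome coverage has already been scaled for sequencing depth, that a ratio near 0.5, 1.0, or 1.5 relative to the diploid indicates a monosomic, disomic, or trisomic chromosome respectively, and that a tolerance of 0.2 is acceptable. The function name, the tolerance, and the coverage values are hypothetical.

```python
def call_copy_state(sample_cov: float, diploid_cov: float,
                    tolerance: float = 0.2) -> str:
    """Classify one chromosome from coverage normalized to a diploid."""
    if diploid_cov <= 0:
        raise ValueError("diploid coverage must be positive")
    ratio = sample_cov / diploid_cov
    # Expected ratios for 1, 2, and 3 copies relative to a 2-copy diploid.
    for expected, label in ((0.5, "monosomic"),
                            (1.0, "disomic"),
                            (1.5, "trisomic")):
        if abs(ratio - expected) <= tolerance:
            return label
    return "ambiguous"


# Hypothetical mean read depths per chromosome (not data from the disclosure).
diploid = {1: 30.0, 2: 30.0, 3: 30.0}
sample = {1: 30.0, 2: 15.0, 3: 45.0}

karyotype = {chrom: call_copy_state(sample[chrom], diploid[chrom])
             for chrom in diploid}
print(karyotype)
```

A ratio that falls between the expected values is reported as ambiguous rather than forced into a class, which is a design choice of this sketch.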
Figure 4A is a flow diagram showing a failed strategy for creating a Tailswap-CENH3 inducer line in maize. ImmuneCenH3 was swapped with a maize version of the GFP-tailswap construct and several other variants of Cenh3 to see if the lines would induce haploids. None of these CenH3 variants complemented the null (“fail”). Figure 4B is a flow diagram showing the distribution of CENH3 during female gamete development. In the female gametophyte there are three cell divisions that precede the formation of an egg. An egg carrying the cenh3 null can have no more than 12.5% CENH3 protein.
Figures 5A-5D illustrate application of the cenh3 null in simultaneous haploid induction and gene editing. Figure 5A is a plasmid map of the CRISPR construct used for simultaneous haploid induction and gene editing. Construct components are indicated. Figures 5B and 5C are photographs of the young ear phenotype of wildtype (WT) (5B) and fea2 mutant (5C) plants (adapted from Taguchi-Shiobara, et al., Genes Dev. 15: 2755-2766 (2001), where the fea2 mutant phenotype was first described). Figure 5D is a photograph of the young ear phenotype in the fea2 edited haploid plant obtained using the cenh3 haploid inducer.
DETAILED DESCRIPTION OF THE INVENTION
I. Definitions
The term “about” is intended to describe values either above or below the stated value in a range of approx. +/- 10%. The ranges are intended to be made clear by context, and no further limitation is implied. The use of any and all examples, or exemplary language (e.g., such as) provided herein, is intended merely to better illuminate the description and does not pose a limitation on the scope of the description unless otherwise claimed.
The term “plant” is used in its broadest sense. It includes, but is not limited to, any species of woody, ornamental or decorative crop or cereal, and fruit or vegetable plant. It also refers to a plurality of plant cells that are largely differentiated into a structure that is present at any stage of a plant’s development. Such structures include, but are not limited to, a fruit, shoot, stem, leaf, flower petal, etc.
The term “plant tissue” includes differentiated and undifferentiated tissues of plants including those present in roots, shoots, leaves, pollen, seeds and tumors, as well as cells in culture (e.g., single cells, protoplasts, embryos, callus, etc.). Plant tissue may be in planta, in organ culture, tissue culture, or cell culture. The term “plant part” as used herein refers to a plant structure, a plant organ, or a plant tissue.
The term “plant material” refers to leaves, stems, roots, flowers or flower parts, fruits, pollen, egg cells, zygotes, seeds, cuttings, cell or tissue cultures, or any other part or product of a plant.
The term “plant organ” refers to a distinct and visibly structured and differentiated part of a plant such as a root, stem, leaf, flower bud, or embryo.
The term “plant cell” refers to a structural and physiological unit of a plant, comprising a protoplast and a cell wall. The plant cell may be in the form of an isolated single cell or a cultured cell, or as a part of a higher organized unit such as, for example, a plant tissue, a plant organ, or a whole plant.
The term “plant cell culture” refers to cultures of plant units such as, for example, protoplasts, cell culture cells, cells in plant tissues, pollen, pollen tubes, ovules, embryo sacs, zygotes and embryos at various stages of development.
The term “transgenic plant” refers to a plant or tree that contains recombinant genetic material not normally found in plants or trees of this type and which has been introduced into the plant in question (or into progenitors of the plant) by human manipulation. Thus, a plant that is grown from a plant cell into which recombinant DNA is introduced by transformation is a transgenic plant, as are all offspring of that plant that contain the introduced transgene (whether produced sexually or asexually). It is understood that the term transgenic plant encompasses the entire plant or tree and parts of the plant or tree, for instance grains, seeds, flowers, leaves, roots, fruit, pollen, stems etc.
The term “construct” refers to a recombinant genetic molecule having one or more isolated polynucleotide sequences. Genetic constructs used for transgene expression in a host organism include in the 5’-3’ direction, a promoter sequence; a sequence encoding a gene of interest; and a termination sequence. The construct may also include selectable marker gene(s) and other regulatory elements for expression.
The term “gene” refers to a DNA sequence that encodes through its template or messenger RNA a sequence of amino acids characteristic of a specific peptide, polypeptide, or protein. The term “gene” also refers to a DNA sequence that encodes an RNA product. The term gene as used herein with reference to genomic DNA includes intervening, non-coding regions as well as regulatory regions and can include 5’ and 3’ ends.
The term “orthologous genes” or “orthologs” refers to genes that have a similar nucleic acid sequence because they were separated by a speciation event.
The term “polypeptide” refers generally to peptides and proteins having more than about ten amino acids. The polypeptides can be “exogenous,” meaning that they are “heterologous,” i.e., foreign to the host cell being utilized, such as human polypeptide produced by a bacterial cell.
The term “isolated” is meant to describe a compound of interest (e.g., nucleic acids) that is in an environment different from that in which the compound naturally occurs, e.g., separated from its natural milieu such as by concentrating a peptide to a concentration at which it is not found in nature. “Isolated” is meant to include compounds that are within samples that are substantially enriched for the compound of interest and/or in which the compound of interest is partially or substantially purified. Isolated nucleic acids are at least 60% free, preferably 75% free, and most preferably 90% free from other associated components. An “isolated” nucleic acid molecule or polynucleotide is a nucleic acid molecule that is identified and separated from at least one contaminant nucleic acid molecule with which it is ordinarily associated in the natural source. The isolated nucleic acid can be, for example, free of association with all components with which it is naturally associated. An isolated nucleic acid molecule is other than in the form or setting in which it is found in nature.
The term “locus” refers to a specific position along a chromosome or DNA sequence. Depending upon context, a locus could be a gene, a marker, a chromosomal band or a specific sequence of one or more nucleotides.
The term “allele” refers to one of two or more alternative forms of a gene.
The term “vector” refers to a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment. The vectors can be expression vectors.
The term “expression vector” refers to a vector that includes one or more expression control sequences.
The term “expression control sequence” refers to a DNA sequence that controls and regulates the transcription and/or translation of another DNA sequence. Control sequences that are suitable for prokaryotes, for example, include a promoter, optionally an operator sequence, a ribosome binding site, and the like. Eukaryotic cells are known to utilize promoters, polyadenylation signals, and enhancers.
The term “promoter” refers to a regulatory nucleic acid sequence, typically located upstream (5’) of a gene or protein coding sequence that, in conjunction with various elements, is responsible for regulating the expression of the gene or protein coding sequence. The promoters suitable for use in the constructs of this disclosure are functional in plants and in host organisms used for expressing the disclosed polynucleotides. Many plant promoters are publicly known. These include constitutive promoters, inducible promoters, tissue- and cell-specific promoters and developmentally-regulated promoters. Exemplary promoters and fusion promoters are described, e.g., in U.S. Pat. No. 6,717,034, which is herein incorporated by reference in its entirety.
A nucleic acid sequence or polynucleotide is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence.
For example, DNA for a presequence or secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, “operably linked” means that the DNA sequences being linked are contiguous and, in the case of a secretory leader, contiguous and in reading frame. Linking can be accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
The terms “transformed,” “transgenic,” “transfected” and “recombinant” refer to a host organism such as a bacterium or a plant into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome of the host or the nucleic acid molecule can also be present as an extrachromosomal molecule. Such an extrachromosomal molecule can be auto-replicating. Transformed cells, tissues, or plants are understood to encompass not only the end product of a transformation process, but also transgenic progeny thereof. A “non-transformed,” “non-transgenic,” or “non-recombinant” host refers to a wildtype organism, e.g., a bacterium or plant, which does not contain the heterologous nucleic acid molecule.
The term “endogenous” with regard to a nucleic acid refers to nucleic acids normally present in the host.
The term “heterologous” refers to elements occurring where they are not normally found. For example, a promoter may be linked to a heterologous nucleic acid sequence, e.g., a sequence that is not normally found operably linked to the promoter. When used herein to describe a promoter element, heterologous means a promoter element that differs from that normally found in the native promoter, either in sequence, species, or number. For example, a heterologous control element in a promoter sequence may be a control/regulatory element of a different promoter added to enhance promoter control, or an additional control element of the same promoter. The term “heterologous” thus can also encompass “exogenous” and “non-native” elements.
As used herein, “homologous” means derived from the same species. For example, a homologous trait is any characteristic of organisms that is derived from a common ancestor. Homologous sequences can be orthologous or paralogous. Homologous sequences are orthologous if they were separated by a speciation event: when a species diverges into two separate species, the divergent copies of a single gene in the resulting species are said to be orthologous. Orthologs, or orthologous genes, are genes in different species that are similar to each other because they originated from a common ancestor. Homologous sequences are paralogous if they were separated by a gene duplication event: if a gene in an organism is duplicated to occupy two different positions in the same genome, then the two copies are paralogous.
As used herein, “polypeptide” refers generally to peptides and proteins having more than about ten amino acids. The polypeptides can be “exogenous,” meaning that they are “heterologous,” i.e., foreign to the host cell being utilized, such as human polypeptide produced by a bacterial cell.
As used herein, a “cultivar” refers to a cultivated variety.
As used herein, “germplasm” refers to one or more phenotypic characteristics, or one or more genes encoding said one or more phenotypic characteristics, capable of being transmitted between generations.
As used herein, the term “progenitor” refers to any of the species, varieties, cultivars, or germplasm, from which a plant is derived.
As used herein, the term “derivative species, germplasm or variety” refers to any plant species, germplasm or variety that is produced using a stated species, variety, cultivar, or germplasm, using standard procedures of sexual hybridization, recombinant DNA technology, tissue culture, mutagenesis, or a combination of any one or more said procedures.
As used herein, the terms “introgression”, “introgressed” and “introgressing” refer to both a natural and artificial process whereby genes of one species, variety or cultivar are moved into the genome of another species, variety or cultivar, by crossing those species. The process may optionally be completed by backcrossing to the recurrent parent.
As used herein, “plant part” or “part of a plant” can include, but is not limited to cuttings, cells, protoplasts, cell tissue cultures, callus (calli), cell clumps, embryos, stamens, pollen, anthers, pistils, ovules, flowers, seed, petals, leaves, stems, and roots.
As used herein, a “hybrid” is typically derived from one or more crosses between different varieties, germplasms, populations, breeds or cultivars within a single species, between different subspecies within a species, or between different species within a genus. Typically, hybrids between subspecies are referred to as “intra-specific hybrids” and hybrids between different species within a genus are referred to as “interspecific hybrids.”
Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein.
Use of the term “about” is intended to describe values either above or below the stated value in a range of approx. +/- 10%; in other forms the values may range in value either above or below the stated value in a range of approx. +/- 5%; in other forms the values may range in value either above or below the stated value in a range of approx. +/- 2%; in other forms the values may range in value either above or below the stated value in a range of approx. +/- 1%. The preceding ranges are intended to be made clear by context, and no further limitation is implied.
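For illustration only, the tolerance bands recited for the term “about” can be expressed as a small calculation. The sketch below is a hypothetical helper, not a claim construction; it assumes the tolerance is applied symmetrically as a fraction of the stated value, and the function name and default are assumptions.

```python
def about_range(value: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) bounds for "about `value`".

    Assumption: the tolerance is a symmetric fraction of the stated
    value, e.g. 0.10 for +/- 10%.
    """
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


# The four tolerance bands named in the definition above.
for tol in (0.10, 0.05, 0.02, 0.01):
    low, high = about_range(12.5, tol)
    print(f"about 12.5 at +/- {tol:.0%}: {low:.3f} to {high:.3f}")
```

The stated value of 12.5 is used here only because it appears elsewhere in the disclosure; any value can be substituted.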
Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed method and compositions. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed that while specific reference of each various individual and collective combinations and permutation of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a ligand is disclosed and discussed and a number of modifications that can be made to a number of molecules including the ligand are discussed, each and every combination and permutation of ligand and the modifications that are possible are specifically contemplated unless specifically indicated to the contrary. Thus, if a class of molecules A, B, and C are disclosed as well as a class of molecules D, E, and F and an example of a combination molecule, A-D is disclosed, then even if each is not individually recited, each is individually and collectively contemplated. Thus, in this example, each of the combinations A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Likewise, any subset or combination of these is also specifically contemplated and disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E are specifically contemplated and should be considered disclosed from disclosure of A, B, and C; D, E, and F; and the example combination A-D. Further, each of the materials, compositions, components, etc. contemplated and disclosed as above can also be specifically and independently included or excluded from any group, subgroup, list, set, etc. of such materials.
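The enumeration in the preceding example corresponds to a Cartesian product of the two classes. The following minimal sketch, offered only as an illustration, lists every pairing of the first class (A, B, C) with the second class (D, E, F); the variable names are hypothetical.

```python
from itertools import product

# The two example classes of molecules named in the text.
first_class = ["A", "B", "C"]
second_class = ["D", "E", "F"]

# Every pairing of one member from each class.
combinations = [f"{x}-{y}" for x, y in product(first_class, second_class)]
print(combinations)

# The pairings other than the explicitly exemplified A-D.
implied = [combo for combo in combinations if combo != "A-D"]
print(implied)
```

The second list contains the eight combinations the text identifies as contemplated in addition to the exemplified A-D.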
These concepts apply to all aspects of this application including, but not limited to, steps in methods of making and using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods, and that each such combination is specifically contemplated and should be considered disclosed.
All methods described herein can be performed in any suitable order unless otherwise indicated or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the embodiments and does not pose a limitation on the scope of the embodiments unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
II. Plants
Haploid plants are widely used to accelerate the production of new inbred lines. Haploid induction involves a temporary diploid state followed by loss of one of the contributing genomes. This mechanism also makes it possible to introduce gene editing machinery in a genotype-independent manner without stably transforming the target line.
The disclosed methods utilize a simple null mutation of the Centromeric Histone H3 gene. Results presented in the Examples below show that a heterozygous cenh3 null plant produces haploids at a frequency of 5% when crossed as a female and 0.5% when crossed as a male. The mechanism of haploid induction involves the sequential loss of chromosomes in the zygote. The disclosed plants and methods of use thereof also make it possible to introduce gene editing machinery to any line in a simple, rapid, and GMO-free way.
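To give a sense of scale for the frequencies stated above, the following minimal sketch estimates how many haploids to expect from a cross and how many progeny would need to be screened to recover at least one. It assumes, purely for illustration, that each progeny is an independent event with a fixed haploid probability; the function names and the 95% confidence level are hypothetical choices, not values from the Examples.

```python
import math

# Haploid frequencies stated in the text for a heterozygous cenh3 null inducer.
HAPLOID_RATE_FEMALE = 0.05   # inducer used as the female parent
HAPLOID_RATE_MALE = 0.005    # inducer used as the male parent


def expected_haploids(n_progeny, rate):
    """Expected number of haploids among n_progeny."""
    return n_progeny * rate


def prob_at_least_one(n_progeny, rate):
    """Probability of recovering one or more haploids (independence assumed)."""
    return 1.0 - (1.0 - rate) ** n_progeny


def progeny_needed(rate, confidence=0.95):
    """Smallest progeny count giving at least one haploid with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


for label, rate in (("female", HAPLOID_RATE_FEMALE), ("male", HAPLOID_RATE_MALE)):
    print(label, expected_haploids(1000, rate), progeny_needed(rate))
```

Under these assumptions, the tenfold difference in frequency translates into roughly a tenfold difference in the number of progeny that must be screened.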
Maize gene nomenclature provides that mutant alleles are expressed in normal or italicized small font where the first letter is lowercase (e.g., cenh3), wild type genes are expressed in normal or italicized font where the first letter is uppercase (e.g., CenH3), and the expressed protein product is written with all letters in uppercase (CENH3). As the experiments described below were conducted in maize, maize nomenclature is generally followed herein. It will be appreciated, however, that the compositions and methods described herein are also applicable in, and thus also disclosed for, the corresponding gene in other monocot plants; thus, use of this nomenclature should not be construed as limiting the disclosed compositions and methods to maize alone.
The disclosed haploid induction methods typically include a cross between a haploid inducer line and a target line to be induced to generate haploids.
The disclosed gene editing methods typically include a cross between a haploid inducer line including gene editing machinery, and a target line to be induced to generate haploids which is also the target of gene modification. The in vivo haploid induction process is co-opted to introduce editing machinery into a target germplasm by including it in the haploid inducer parent. Typically the editing machinery is stably integrated as a transgene.
Simultaneous editing plus haploid induction can be done in various monocots via wide cross or de novo haploid induction.
Typically, one or more of the plants utilized in the crosses disclosed herein, and/or the progeny generated therefrom, are non-naturally occurring plants. A “non-naturally occurring plant” refers to a plant that does not occur in nature without human intervention. Non-naturally occurring plants include transgenic plants as well as plants produced by non-transgenic means such as plant breeding.
A. Target Lines
The target line to be induced is typically a monocot. Monocots include one of the large divisions of Angiosperm plants (flowering plants with seeds protected within a vessel). They are herbaceous plants with parallel veined leaves and have an embryo with a single cotyledon, as opposed to dicot plants (dicotyledonous), which have an embryo with two cotyledons. In some embodiments, the target line is a monocot selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
The target line is typically one in which having a haploid plant, typically to accelerate production of new inbred lines, is desirable. In some embodiments, the target line is an elite inbred line where extensive breeding has already been performed, but genetic modifications are needed to improve the line so that, for example, it is resistant to disease or pest challenges or better adapted to different environments.
B. Inducer Lines
Haploid inducer plant lines optionally expressing gene editing machinery are provided. The inducer line is also a monocot, for example, maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
The target and inducer lines can be two plant lines that typically sexually reproduce with each other. In other embodiments, the cross is an interspecific and intergeneric hybrid cross between related species or genera that do not normally sexually reproduce with each other. These crosses can also be referred to as wide crosses. In wheat, rice, barley, brassica, and other crops, the route to haploid induction can be to use a pollen donor that induces haploids via wide cross. For example, one could use corn pollen on wheat, millet pollen on wheat, barley pollen on other barley species, or any other wide crossing method.
As discussed in more detail below, the inducer plant is a heterozygous cenh3 null and optionally includes gene editing machinery.
1. cenh3 nulls
The inducer plants are heterozygous cenh3 nulls.
Exemplary GenBank accession numbers providing gene locations and sequences for CenH3 in various monocot plants include: maize, AF519807.2; rice, AY438639.1; wheat, JF969287.1; barley, JF419329.1; and banana, KP878235.1, each of which is incorporated by reference in its entirety, and which provide the following amino acid and nucleic acid (mRNA/cDNA) sequences for CENH3:
MARTKHQAVRKTAEKPKKKLQFERSGGASTSATPERAAGTGGRAASGGDSVKKTKPRHRWRPGTVALR
EIRKYQKSTEPLIPFAPFVRVVRELTNFVTNGKVERYTAEALLALQEAAEFHLIELFEMM1LCAIHAK rvtimqkdiqlarriggrrwa (SEQ ID NO;20, maize, AF519807.2) ctcccgtccc gagagttctg aatcgaaacc gtcggccacg agagcagtgc gaggcgccca ccgcgatggc tcgaaccaag caccaggccg tgaggaagac ggcggagaag cccaagaaga 25 agctccagtt cgagcgctca ggtggtgcga gtacctcggc gacgccggaa agggctgctg ggaccggggg aagagcggcg tctggaggtg actcagttaa gaagacgaaa ccacgccacc gctggcggcc agggactgta gcgctgcggg agatcaggaa gtaccagaag tccactgaac cgctcatccc ctttgcgcct ttcgtccgtg tggtgaggga gttaaccaat ttcgtaacaa acgggaaagt agagcgctat accgcagaag ccctccttgc gctgcaagag gcagcagaat 30 tccacttgat agaactgttt gaaatggcga atctgtgtgc catccatgcc aagcgtgtca caatcatgca aaaggacata caacttgcaa ggcgtatcgg aggaaggcgt tgggcatgat atataatatc cattctgatt gcatcattct tgtgaatttg tttgtaggag ctagacatta gtgttgttga atgctgcatg gttcctaatc cttttcgcag tctaacatct gtggagttag tatgttacat ggcaacagct gaacatctgt ggactataac tatatggcaa cagccgaaga 35 ttgtgtctgt gggataactg gttgttttgg ttgctcttca gtagtttgtt tgcttcaggt aaccatgctg cgaactatga tgttttcatt ctcggtttgc ttcagctaac cgagatcgat tcagtctgca gtatatggac tatggagtaa actgcatgct gaaacccgaa ccactgctga aacggcagtt gccaggatag caggagggcc ctttatgcac agtggaattg agtagagaac
tgagtaaacc atggttcttt ctccttttga actggaacac aaacacagtt ggatcttgtt tctcttctta ggccattgtc atcgtgtttc ttaggggtgt aaatggtatc tgtccgtatt cgaatttgat ctatctaaca aggctgaaat ccgaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaa (SEQ ID NO:21, maize, AF519807.2)
MARTKHPAVRKSKAEPKKKLQEERSPRPSKAQRAGGGTGTSATTRSAAGTSASGTPRQQTKQRKPHRF RPGTVALREIRKFQKTTELLIPEAPFSRLVREITDFYSKDVSRWTLEALLALQEAAEYHLVDIFEVSN lcaihakrvtimqkdmqlarriggrrpw (SEQ ID NO:22, rice, AY438639.1) acgccgcttc agtttgaaaa cgccgagatg gctcgcacga gaagctccag ttcgaacgct gggtacctcg gcgaccacga gcaaacgaag cagaggaagc caggaaattt cagaaaacca cagggagatc actgatttct tgcattgcaa gaggcagcag cgccatccat gctaagcgtg cggtgggcgg aggccatggt gcaacatgtg tcgttgatta tcaacttacc cttcgttccc tttggattgg tcgaattcag ttgttgtgtt gattgcaata rice, AY438639.I) cccaccgcca cgtcgccgcc agcacccggc ggtgaggaag cccctcggcc gtcgaaggcg ggagcgcggc tggaacatcg cacaccgctt ccgtccaggc ccgaactgct gatcccgttt attcaaagga tgtgtcacgg aataccactt agtggacata ttaccatcat gcaaaaggac gaaaatttgt ttgcgagcca acattttaga aagtagtgta attctaattc agttgatgtt gatttcatca aacagtcgat atgggttcct ctcaaaaaaa gccgccgccg ccgccgccga tcgaaggcgg agcccaagaa cagcgcgctg gtggcggcac gcttcaggga cgcctaggca acagtggcac tgcgggagat gcaccatttt ctcggctggt tggacccttg aagctctcct tttgaagtgt caaatctctg atgcaacttg ccaggcgtat tgcagcatga tggacaagga gatgtatctt cacataggga agtatttacc ttttgctcca tgtgaaatgt gaaccaggaa aaaaaaaa (SEQ ID NO:23, martkhpavrktkappkkqlgprpaqrrqetdgagtsatprragraaapggaqgatgqpkqrkphrfr PGTVALREIRRYQKSVDFLIPFAPFVRLIKEVTDFFCPEISRWTPQALVAIQEAAEYHLVDVFERANH caihakrvtvmqkdiqlarriggrrlw (SEQ IDNO:24, wheat, JF969287.I) atggcccgca ccaagcaccc ggccgtcagg aagaccaagg cgccgcccaa gaagcagctc gggccccgcc ccgcgcagcg gcggcaggag acagatggcg cgggcacgtc ggcgacaccg aggcgagccg ggcgggcggc ggccccaggg ggggctcaag gggcaactgg gcaacccaag cagaggaaac cacaccggtt caggccaggc acggtggcac tgcgggagat caggaggtat cagaagtcgg tcgactttct catcccgttt gcaccatttg tccgtctgat caaggaggtc accgacttct tctgtcctga aatcagccgc tggactcccc aagcgctcgt cgcgattcaa gaggctgcag agtatcacct cgtcgacgta tttgaaaggg caaatcactg tgccatccat gcaaagcgtg ttaccgtcat gcaaaaggac atacagcttg caaggcgtat cggcgggagg aggctttggt ga (SEQ ID NO:25, wheat, JF969287.1)
MARTKKTVAAKEKRPPCSKSEPQSQPKKKEKRAYRFRPGTVALREIRKYRKSTNMLIpfapfvrlvrd iadnltplsnkkeskptpwtplallslqesaeyhlvdlfgkahlcaihshrvtimlkdmqlarrigtr slw (SEQ ID NO:26, barley, JF419329.1)
ΙΟ gaacgaactc tatctctctc tctgcctgct ttccccacct gcgatggctc gcacgaagaa aacggtggcg gcgaaggaga agcgcccccc ttgctccaag tcggagccgc agtcgcagcc gaagaagaag gagaagcggg cgtaccggtt ccggccgggc acggtggcgc tgcgggagat ccggaagtac cgcaagtcca ccaatatgct catccccttt gcgcccttcg tccgcctggt cagggacatc gccgacaact tgacgccatt gtcgaacaag aaggagagca agccgacgcc atggactcct ctcgcgctcc tctcgttgca agagtctgca gagtatcact tggtcgatct atttggaaag gcaaatctgt gtgccattca ttcgcaccgt gttaccatca tgctaaagga catgcagctt gcgaggcgta tcgggacgag aagcctttgg tgatactaat ggaggatgtt ttaggcatcg tagggagcaa acactggtga tgctaatggt agatgttttt tgtatgagtg caatgcgaag gagctgatgg tagtgatcca atatgtttgt gagtgtatca ttaggaagta atgtaggtgt gttttcatct aggtagcatg tcttcgaagt tgatccggtt ggttgcatgg ctgggatcat ctgacttgtc ggcttttgct ccatttgaat tgatctaatt cggacagagt ttctgcaaat gatccaataa tatgggttgc caaaaaaaaa aaaaaaat (SEQ ID NO:27, barley, JF419329.1)
MARTKHLSNRSSSRPRKRFHFGRSPGQRTPADANRPATPSGATPRTTATRSRDTPQGAPSQSKQQFRR RRFRPGVVALREIRNLQKTWNLLIPFAPFVRLVREITHFYSKEVNRWTPEALVAIQEAAETHMIEMFE DAYL CAIHAKRVTLMQKDIH LARRIGGRRHW (SEQ ID NO:28, banana, KP878235.1)
atggcgagaa cgaagcatct gtccaacagg tcctcctctc gccctcggaa gcgcttccat ttcggtcggt ctccagggca gcgaaccccc gctgatgcga atcggcctgc gacgccatct ggtgctactc ctagaaccac ggccaccaga tcgagggata cgcctcaagg ggcaccgagc 5 caatcaaaac agcagccgag gcggcgcagg tttaggccgg gggtggtggc gctacgcgag atcaggaatt tgcagaagac gtggaatcta ttgatccctt tcgctccgtt tgtcagactt gttcgggaga tcactcattt ctactcgaaa gaagtaaacc gatggacccc tgaagcttta gttgcgattc aagaggcagc ggaaactcat atgatagaaa tgtttgaaga tgcatatctc tgtgcaattc atgcaaaacg tgttaccctt atgcaaaaag atatccatct agcaaggcga ataggaggaa gaagacattg gtga (SEQ ID NO:29, banana, KP878235.1 )
The CenH3 gene is conserved across all plants, fungi, and animals, with few exceptions in some insect lineages. It serves the fundamentally important role of defining the boundaries of the functional centromere, and initiating and organizing the kinetochore (Cheeseman and Desai, Nature Reviews Molecular Cell Biology, 9:33-46 (2008)). The wildtype CENH3 can be the CENH3 of the monocot inducer plant, e.g., maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam. In some embodiments, the wildtype CENH3 has the amino acid sequence of any one of SEQ ID NOS:20, 22, 24, 26, or 28, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95% or more sequence identity thereto; is encoded by the nucleic acid of any one of SEQ ID NOS:21, 23, 25, 27, or 29, or a nucleic acid having at least 75%, 80%, 85%, 90%, 95%, or more sequence identity thereto; is a homologue such as an orthologue or paralogue of the foregoing sequences; or any combination thereof.
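As a non-limiting illustration of how a percent sequence identity threshold can be checked, the following minimal sketch counts identical positions between two sequences that have already been aligned to the same length. The alignment step itself is not shown and would in practice be carried out over the full-length sequences with an alignment program; the function name and the use of "-" as a gap character are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns in which both sequences carry a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# First 11 residues of the maize (SEQ ID NO:20) and rice (SEQ ID NO:22) proteins.
maize_n_term = "MARTKHQAVRK"
rice_n_term = "MARTKHPAVRK"

identity = percent_identity(maize_n_term, rice_n_term)
meets_threshold = identity >= 75.0
print(f"{identity:.1f}% identity; meets 75% threshold: {meets_threshold}")
```

The two fragments differ at a single position (Q versus P), so ten of eleven positions are identical. This short fragment is used only to keep the example readable and says nothing about the identity of the full-length proteins.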
A null allele is a nonfunctional allele caused by a genetic mutation. Such mutations can cause a complete lack of production of the associated gene product or a product that does not function properly; in either case, the allele is considered nonfunctional. For example, CENH3 protein binds to DNA and recruits all overlying proteins that form the kinetochore that mediates chromosome segregation in a plant cell. Non-functional CENH3 will not contribute to the formation of a centromere, kinetochore formation, and/or chromosome segregation in a plant cell. A null, which encodes no functional protein, is distinguished from CENH3 variants such as GFP-tailswap or variants that produce altered or partially deleted CENH3 proteins. See, e.g., Kuppu, et al., “A Variety of Changes, Including CRISPR/Cas9-mediated Deletions, in CENH3 Lead to Haploid Induction on Outcrossing”, Plant Biotechnol J, 2020, doi: 10.1111/pbi.13365, which is specifically incorporated by reference herein in its entirety. GFP-tailswap and variant forms can substitute for native CENH3 and retain sufficient function to organize kinetochores, even if imperfectly. Null alleles are a special category of mutation that cause a total loss of function.
A mutant allele that produces no RNA transcript is called an RNA null (shown by Northern blotting, total RNA sequencing, or by DNA sequencing of a deletion allele), and one that produces no protein is called a protein null (shown by Western blotting). Nulls are frequently caused by frameshift mutations. The genetic code is read in triplets of nucleotides, such that any sequence can be read in three frames, where only one is correct. Mutations that cause a small deletion or addition of nucleotides can shift the reading frame to a nonsensical protein and often cause protein translation to stop prematurely. Frameshift mutations that cause a premature stop, particularly when most of the predicted protein is absent, are generally interpreted as null alleles. For example, a frameshift mutation that causes a stop codon in the N-terminal tail of CENH3 encodes a severely truncated protein that lacks the capacity to bind to DNA or other histones (Figure 1C). A genetic null or amorphic allele has the same phenotype when homozygous as when heterozygous with a deficiency that disrupts the locus in question. A genetic null allele may be both a protein null and an RNA null, but may also express normal levels of a gene product that is nonfunctional due to mutation. The cenh3 null allele can be a deletion of the entire locus or mutation of one or more nucleotides therein leading to lack of production of cenh3 gene product or a product that does not function properly. The cenh3 null allele can be an RNA null, a protein null, or both.
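The effect of a frameshift on the reading frame can be illustrated with a minimal sketch that translates the first twelve codons of the maize CenH3 coding sequence (from SEQ ID NO:21) before and after removal of a single nucleotide. The one-nucleotide deletion used here is a hypothetical example chosen for illustration; it is not the allele described in the Examples. The function name `translate` is likewise an assumption of this sketch.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first nucleotide, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 36 nucleotides of the maize coding sequence, starting at the ATG.
wild_type = "atggctcgaaccaagcaccaggccgtgaggaagacg"

# Hypothetical frameshift: delete the single nucleotide after the start codon.
frameshift = wild_type[:3] + wild_type[4:]

print(translate(wild_type))   # wild-type N-terminal residues
print(translate(frameshift))  # shifted frame, ending at a premature stop
```

In this hypothetical case the shifted frame shares only the initial methionine with the wildtype protein and reaches a stop codon after eight residues, consistent with the general observation above that frameshifts often truncate translation.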
The null allele of the disclosed cenh3 nulls is typically distinguishable from GFP-tailswap constructs and several variants of CenH3 such as those described in U.S. Patent No. 8,618,354, U.S. Published Application No. 2018/0116141, U.S. Published Application No. 2019/0343060, and WO 2017/004375, which create CENH3 variant proteins that are functional, albeit reduced or altered in function or level relative to a wildtype CenH3 allele.
In some embodiments, a sperm carrying the cenh3 null has no more than 25% of the normal amount of a functional (e.g., wildtype) CENH3 protein and an egg carrying the cenh3 null has no more than 12.5% of the normal amount of a functional (e.g., wildtype) CENH3 protein.
In some embodiments, the heterozygous cenh3 null inducer line does not include a recombinant gene expressing mutant or variant CENH3. Thus, in some embodiments, the null is not complemented by non-endogenous CENH3 expression. In some embodiments, quantitative reductions in CENH3 alone induce centromere-mediated haploid induction.
In some embodiments, the only functional CENH3 in the heterozygous cenh3 null inducer line is expressed from the endogenous wildtype allele of the heterozygous cenh3 null.
A non-limiting method for making a cenh3 null is described below. Briefly, a transgenic line containing Cas9 was crossed to another transgenic line containing a guide RNA targeting the first exon of CenH3 as well as an intact full-length genomic clone of CenH3 that contains five silent nucleotide changes over the guide RNA site (Figures 1A-1C; the uncleavable gene is referred to as ImmuneCenH3). When lines containing these constructs were crossed together, Cas9 mutated native cenh3 while CenH3 function was covered by the ImmuneCenH3 gene. Thus, in some embodiments, the null allele has one or more mutation(s) in the first exon that abolishes expression of functional protein from the null allele. In some embodiments, the null is caused by a mutation in the sequence encoding the N-terminal tail that removes all CENH3 protein sequence that interacts with DNA or other histones.
At each cell division, CENH3 is naturally divided equally between the replicated DNA strands at S phase and replenished later in G2 (Lermontova et al., The Plant Journal: For Cell and Molecular Biology, 68(1): 40-50 (2006)). This replenishment cannot occur in a haploid cell that is null for cenh3, and the cell cycle must proceed with half as much CENH3 as would normally be present. In the male gametophyte (pollen), there are two cell divisions that precede the formation of sperm, and in the female gametophyte there are three cell divisions that precede the formation of an egg. A sperm carrying the cenh3 null can have no more than 25% of the normal amount of CENH3 and an egg carrying the cenh3 null can have no more than 12.5% (Fig. 4B). The results also show that 0.5% of the progeny from the male carrying the cenh3 null are haploid, and 5.0% of the progeny from the female are haploid. Thus, the inducer can be used as either the male or as the female of the cross. Female inducers generate a higher percentage of haploids; thus, in some embodiments, the haploid inducer is preferably a female, but alternatively, the methods can be conducted using the haploid inducer as the male.
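The 25% and 12.5% ceilings follow from halving the CENH3 pool at each gametophytic division without replenishment. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the model assumes an exactly equal split at every division and no new CENH3 synthesis.

```python
def max_cenh3_fraction(divisions_without_replenishment):
    """Upper bound on remaining CENH3, as a fraction of the normal amount."""
    return 0.5 ** divisions_without_replenishment


# Two divisions precede sperm formation; three precede egg formation.
sperm_fraction = max_cenh3_fraction(2)
egg_fraction = max_cenh3_fraction(3)

print(f"sperm: {sperm_fraction:.1%}, egg: {egg_fraction:.1%}")
```

The additional division in the female gametophyte halves the ceiling once more, which is consistent with the higher haploid frequency reported when the inducer is used as the female.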
2. Markers
The haploid inducer can have a marker that can assist in identifying seeds that are haploid. For example, the haploid inducer can have a dominant purple pigment gene (e.g., R1-nj). The seeds of haploid individuals have a purple aleurone, but lack purple pigment in the endosperm (scutellum), indicating that the germline does not contain the haploid inducer chromosomes. Seeds that have a yellow endosperm and a purple aleurone are planted out and grown up to be seedlings. These seedlings have their chromosome number doubled using colchicine or other methods as discussed in more detail below. The chromosome doubled haploids are grown in a greenhouse and/or transplanted to the field, and the chromosome doubled plants are self-pollinated to produce doubled haploid seed as discussed in more detail elsewhere herein.
3. Gene Editing Machinery
The inducer line also optionally includes gene editing machinery.
For example, the inducer plant can have encoded into its DNA the machinery necessary for accomplishing the editing in the target plant’s genome.
Targeted mutagenesis (also known as gene editing) is a very important technology for crop breeding. There are now numerous methods to edit specific gene targets, including CRISPR, TALEN, meganucleases, and zinc fingers. The endonuclease can be designed to target nearly any sequence. The endonuclease(s) can be constructed using methods such as, but not limited to, those described by Svitashev, et al., Plant Physiology, 169:931-945 (2015), Lee, et al., Plant Biotechnology, 17(2):362-372 (2019), Sander et al., Nature Met, 8(1):67-69 (2011), Cermak et al., Nucl Acids Res, 39(17):7879 (2011) (with correction at Nucl Acids Res, 39:e82, doi: 10.1093/nar/gkr218, 2011), and Liang et al., J Genet Genom, 41(2):63-68 (2014). The promoter used to drive expression of the endonuclease can be one expressed throughout development or specifically in egg cells or during early embryo development, and can be endogenous or exogenous. Examples of promoters include 35S (CaMV d35S) or derivatives (e.g., double 35S), ZmUb1 (maize), APX (rice), OsCc1 (rice), EIF5 (rice), R1G1B (rice), PGD1 (rice), Act1 (rice), and SCP1 (rice).
The gene editing machinery construct(s) may include a selectable marker (e.g., herbicide resistance) to assist with recovery of the transgene during whole plant transformation and subsequent backcrossing. In some cases, one or more (e.g., two or more, or three or more) endonucleases and/or CRISPR guide RNAs are combined into a single construct to target one or more sequences of DNA.
One method to introduce editing machinery into plants is to use an Agrobacterium-based method (such as the method described by Ishida et al., Nature Biotechnol, 14(6):745-750 (1996)) or particle bombardment (such as the method described by Gordon-Kamm et al., Plant Cell Online, 2(7):603-618 (1990)) on plant tissue. Newer methods that incorporate developmental regulator genes have been devised that make it possible to transform plants without extensive tissue culture. See, e.g., Lowe, et al., The Plant Cell, 28: 1998-2015 (2016). In transformation, DNA coding for the editing machinery (e.g., Cas9 and guide RNA) is introduced into plant callus, seed or embryonic tissue. Stably-transformed plants (events) are then recovered, optionally with the help of a selectable marker.
Alternatively, a line amenable to transformation is first transformed with the gene editing machinery, and the resulting line is then crossed to a haploid inducer line. The resulting F1 that is heterozygous for cenh3 itself becomes a haploid inducer. No additional backcrossing is needed. In this case, the F1 haploid inducer line contains the endonuclease transgene. Next, the inducer line carrying (e.g., encoding, expressing, etc.) the gene editing machinery can be pollinated by a second plant to be edited. From that pollination event, progeny (e.g., embryos or seeds) are produced, at least one of which will be a haploid seed. This haploid seed will only contain the chromosomes of the second plant; the inducer plant’s chromosomes have vanished (having been eliminated, lost or degraded), but before doing so, the inducer plant’s chromosomes permitted the gene-editing machinery to be expressed.
Alternately, and without wishing to be bound by theory, the inducer plant delivers the already-expressed editing machinery upon pollination via the pollen tube. Or, in the case that the haploid inducer line is the female in the cross, the haploid inducing plant’s egg cell contains the editing machinery that is present, and perhaps already being expressed, upon fertilization with the wild type or non-haploid inducing pollen grain. Through any of these routes, the haploid progeny obtained by the cross will also have had its genome edited. In the case of maize, where many crosses can be made with a single male plant, the F1 containing the cenh3 null and gene editing machinery can be used to edit multiple lines by making multiple separate crosses.
The gene editing machinery typically includes an element or elements that induce a single or a double strand break in the target cell’s genome. For example, the editing machinery can be any DNA modification enzyme, but is preferably a site-directed nuclease. The site-directed nuclease is preferably CRISPR-based, but could also be a meganuclease, a transcription-activator like effector nuclease (TALEN), or a zinc finger nuclease. The nuclease, for example, can be Cas9 or Cpf1/Cas12a. In one aspect, the nuclease is designed to cleave the DNA, with the intent of creating small deletions or duplications at the target site. The resulting small deletions and duplications can knock out gene function.
In another aspect, the DNA modification enzyme is a site-directed base editing enzyme such as Cas9-cytidine deaminase or Cas9-adenine deaminase, wherein the Cas9 can have one or both of its nuclease activities inactivated, i.e., dCas9.
In yet another embodiment, the gene editing machinery can be combined with an additional repair template, such that cleavage is followed by homology-directed repair (HDR), resulting in the modification or replacement of DNA at the target site. The purpose of the haploid inducer in this context is as a means to rapidly transfer the gene editing machinery from a transformable line into any other line without passing through a tissue culture phase or repeated backcrossing.
Gene editing machinery that can be used is discussed in more detail below.
a. Strand Break Inducing Elements
i. CRISPR/Cas
In preferred embodiments, the element that induces a single or a double strand break in the target cell’s genome is a CRISPR/Cas system. As in animal model systems, Cas9 and sgRNA expression within targeted cells is sufficient to modify plant genomes (Deepa, et al., Front. Plant Sci., 9:985 (2018), doi: 10.3389/fpls.2018.00985). While Cas9 is commonly used, any CRISPR/Cas-based system, for instance Cpf1/Cas12a, can be used in a similar manner (Tang, et al., Genome Biology, 19(84) (2018), doi: 10.1186/s13059-018-1458-5). Broadly useful RNA polymerase II promoters (such as 35S or ZmUb1) are often used to express Cas genes, but promoters expressing in egg cells may be more applicable in the current application. Plant-specific RNA polymerase III promoters [AtU6 (Arabidopsis); TaU6 (wheat); OsU6 or OsU3 (rice)] have been used to express sgRNA in plant systems. Other embodiments may involve multiplexed guide RNA systems driven by other promoters (Lowder, et al., Plant Physiol., 169(2): 971-985 (2015), He, et al., J Genet Genomics, 20; 44(9): 469-472 (2017)). The Cas genes may be fully functional and designed to create double stranded breaks that are repaired by nonhomologous end joining (NHEJ), resulting in mutations that knock out gene function. Alternatively, Cas genes may be partially inactivated so as to cause single stranded nicks (e.g., nCas9), or fully inactivated (e.g., dCas9) to bind and not cleave, but simply direct another enzyme such as an adenine or cytidine deaminase to the desired site (Eid, et al., Biochem J, 475(11): 1955-1964 (2018)). There are several commercially available vectors for expressing Cas9 or Cas9 variants and gRNAs in plant systems; these include empty gRNA backbones having a plant RNA polymerase III promoter and gRNA scaffolds into which a practitioner can insert the gRNA of interest. CRISPR-based systems can also be adapted to alter genes by homology directed repair, as described below.
One constraint is that CRISPR applications utilize sequences that include short Protospacer Adjacent Motifs (PAM sites).
The inducer plant’s genome can include one or more nucleic acids encoding a Cas enzyme and a guide RNA as components of a CRISPR system. The inducer plant’s genome can optionally include a donor polynucleotide sequence to be recombined into the target cell’s genome at or adjacent to the target site (e.g., the site of single or double strand break induced by the Cas9).
Methods of preparing compositions for use in genome editing using the CRISPR/Cas systems are described in detail in, for example, WO 2013/176772, WO 2014/018423, Cong, Science, 15:339(6121):819-823 (2013), and Jinek, et al., Science, 337(6096):816-21 (2012).
In general, “CRISPR system” refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including a Cas protein or sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or other sequences and transcripts from a CRISPR locus. One or more tracr mate sequences operably linked to a guide sequence (e.g., direct repeat-spacer-direct repeat) can also be referred to as pre-crRNA (pre-CRISPR RNA) before processing or crRNA after processing by a nuclease.
In some embodiments, a tracrRNA and crRNA are linked and form a chimeric crRNA-tracrRNA hybrid where a mature crRNA is fused to a partial tracrRNA via a synthetic stem loop to mimic the natural crRNA:tracrRNA duplex as described in Cong, Science, 15:339(6121):819-823 (2013) and Jinek, et al., Science, 337(6096):816-21 (2012). A single fused crRNA-tracrRNA construct is also referred to herein as a guide RNA or gRNA (or single-guide RNA (sgRNA)). Within a sgRNA, the crRNA portion can be identified as the ‘target sequence’ and the tracrRNA is often referred to as the ‘scaffold’.
In some embodiments, one or more elements of a CRISPR system is derived from a type I, type II, or type III CRISPR system. In some embodiments, one or more elements of a CRISPR system is derived from a particular organism including an endogenous CRISPR system, such as Streptococcus pyogenes.
In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence can be any polynucleotide, such as DNA or RNA polynucleotides. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
In the target nucleic acid, each protospacer is associated with a protospacer adjacent motif (PAM) whose recognition is specific to individual CRISPR systems. In the Streptococcus pyogenes CRISPR/Cas system, the PAM is the nucleotide sequence NGG. In the Streptococcus thermophilus CRISPR/Cas system, the PAM is the nucleotide sequence NNAGAAW. The tracrRNA duplex directs Cas to the DNA target consisting of the protospacer and the requisite PAM via heteroduplex formation between the spacer region of the crRNA and the protospacer DNA.
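As a non-limiting illustration of the PAM constraint, the following minimal sketch scans one strand of a DNA sequence for candidate protospacers that are immediately followed by a given PAM. The 20-nucleotide protospacer length, the function names, and the short test sequence are assumptions made for this example; the sketch examines only the strand supplied and performs no off-target or efficiency scoring.

```python
import re

# IUPAC nucleotide codes used in PAM definitions, expressed as regex classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "W": "[AT]", "R": "[AG]", "Y": "[CT]",
    "S": "[CG]", "K": "[GT]", "M": "[AC]",
}


def pam_to_regex(pam):
    """Convert a PAM written in IUPAC codes (e.g. 'NGG') to a regex fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_candidate_targets(sequence, pam="NGG", protospacer_len=20):
    """Return (start, protospacer, pam) tuples for the given strand.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(
        "(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_to_regex(pam))
    )
    return [
        (m.start(), m.group(1), m.group(2))
        for m in pattern.finditer(sequence.upper())
    ]


# Hypothetical test sequence: a 20-nt protospacer followed by an AGG PAM.
demo = "ACGT" * 5 + "AGG" + "TT"
spy_hits = find_candidate_targets(demo, pam="NGG")
print(spy_hits)
```

Passing `pam="NNAGAAW"` applies the same search with the S. thermophilus motif given above; a fuller search would also scan the reverse complement.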
There are many resources available for helping practitioners determine suitable target sites once a desired DNA target sequence is identified. See e.g., crispr.u-psud.fr/, a tool designed to help scientists find CRISPR targeting sites in a wide range of species and generate the appropriate crRNA sequence.
In some embodiments, one or more polynucleotides driving expression of one or more elements of a CRISPR system are introduced into the inducer plant’s genome such that expression of the elements of the CRISPR system directs formation of a CRISPR complex at one or more target sites. For example, a Cas enzyme, one or more guide sequences linked to a tracr-mate sequence, and a tracr sequence could each be operably linked to separate regulatory elements on separate expression constructs (e.g., sgRNAs). Alternatively, two or more of the elements expressed from the same or different regulatory elements may be combined in a single construct, with one or more additional constructs providing any components of the CRISPR system not included in the first construct. CRISPR system elements that are combined in a single construct may be arranged in any suitable orientation, such as one element located 5' with respect to (“upstream” of) or 3' with respect to (“downstream” of) a second element. The coding sequence of one element can be located on the same or opposite strand of the coding sequence of a second element, and oriented in the same or opposite direction. In some embodiments, a single promoter drives expression of a transcript encoding a CRISPR enzyme and one or more of the guide sequence, tracr mate sequence (optionally operably linked to the guide sequence), and a tracr sequence embedded within one or more intron sequences (e.g., each in a different intron, two or more in at least one intron, or all in a single intron). In some embodiments, the CRISPR enzyme, guide sequence, tracr mate sequence, and tracr sequence are operably linked to and expressed from the same promoter.
In some embodiments, a construct includes a regulatory element operably linked to an enzyme-coding sequence encoding a CRISPR enzyme, such as a Cas protein.
Non-limiting examples of Cas proteins include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4, homologues thereof, or modified versions thereof. In some embodiments, the unmodified CRISPR enzyme has DNA cleavage activity, such as Cas9. In some embodiments, the CRISPR enzyme directs cleavage of one or both strands at the location of a target sequence, such as within the target sequence and/or within the complement of the target sequence.
In some embodiments, a construct encodes a CRISPR enzyme that is mutated with respect to a corresponding wild-type enzyme such that the mutated CRISPR enzyme lacks the ability to cleave one or both strands of a target polynucleotide containing a target sequence. For example, an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 from S. pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (cleaves a single strand). Other examples of mutations that render Cas9 a nickase include, without limitation, H840A, N854A, and N863A. As a further example, two or more catalytic domains of Cas9 (RuvC I, RuvC II, and RuvC III) can be mutated to produce a mutated Cas9 substantially lacking all DNA cleavage activity. In some embodiments, a D10A mutation is combined with one or more of H840A, N854A, or N863A mutations to produce a Cas9 enzyme substantially lacking all DNA cleavage activity. In some embodiments, a CRISPR enzyme is considered to substantially lack all DNA cleavage activity when the DNA cleavage activity of the mutated enzyme is less than about 25%, 10%, 5%, 1%, 0.1%, 0.01%, or lower with respect to its non-mutated form.
In some embodiments, an enzyme coding sequence encoding a CRISPR enzyme is codon optimized for expression in particular cells. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules.
The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database”, and these tables can be adapted in a number of ways. See Nakamura, Y., et al., Nucl. Acids Res., 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell, for example Gene Forge (Aptagen; Jacobus, PA), are also available. In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a CRISPR enzyme correspond to the most frequently used codon for a particular amino acid.
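By way of illustration only, the "most frequently used codon" approach described above can be sketched in a few lines of Python. The usage counts in the table are invented placeholder values for four residues, not data from the Codon Usage Database, and the function names are hypothetical; a real workflow would load a complete usage table for the host species.

```python
# Toy codon-usage table. The counts are made-up illustrative values,
# NOT real codon-usage data for any organism.
CODON_USAGE = {
    "M": {"ATG": 100},
    "K": {"AAA": 30, "AAG": 70},
    "L": {"CTG": 50, "CTC": 30, "TTA": 5, "TTG": 15},
    "*": {"TGA": 60, "TAA": 25, "TAG": 15},
}


def most_frequent_codon(amino_acid, usage=CODON_USAGE):
    """Return the codon with the highest usage count for one amino acid."""
    codons = usage[amino_acid]
    return max(codons, key=codons.get)


def codon_optimize(protein, usage=CODON_USAGE):
    """Back-translate a protein string using the most frequent codon per residue."""
    return "".join(most_frequent_codon(aa, usage) for aa in protein)


if __name__ == "__main__":
    print(codon_optimize("MKL*"))
```

Replacing every codon with the single most frequent one is the simplest strategy consistent with the text; the passage also contemplates optimizing only some codons, which would amount to applying `most_frequent_codon` at selected positions.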
While the specifics can be varied in different engineered CRISPR systems, the overall methodology is similar. A practitioner interested in using CRISPR technology to target a DNA sequence (identified using one of the many available online tools) can insert a short DNA fragment containing the target sequence into a guide RNA expression construct. The sgRNA expression construct contains the target sequence (about 20 nucleotides), a form of the tracrRNA sequence (the scaffold) as well as a suitable promoter and necessary elements for proper processing in target cells. In some embodiments, multiple guide RNAs are expressed from one construct, either by chaining together multiple expression cassettes with RNA Polymerase III promoters, or by engineering long RNAs that include tRNAs or ribozyme self-cleavage sites that liberate multiple functional sgRNAs, either targeting one or several genes (Cermak, et al., Plant Cell, 29(6):1196-1217 (2017), He, et al., J Genet Genomics, 44(9):469-472 (2017)). Vectors are commercially available (see, for example, Addgene). Many of the systems rely on custom, complementary oligos that are annealed to form a double stranded DNA and then cloned into the sgRNA expression plasmid. Coexpression of the sgRNAs and the appropriate Cas enzyme from the same or separate constructs in cells results in a single or double strand break (depending on the activity of the Cas enzyme) at the desired target site. CRISPR/Cas gene editing approaches are fully compatible with the cenh3 haploid inducer system, and the two technologies can be usefully combined to facilitate genotype-independent gene editing.
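As a rough illustration of what target-site finding tools do, the following Python sketch scans one strand of a DNA sequence for candidate target sequences of about 20 nucleotides followed by a PAM. It assumes the NGG PAM commonly used with S. pyogenes Cas9, which the text above does not specify; the function name and the example sequence are hypothetical, and a practical tool would also scan the reverse complement and score off-target matches.

```python
import re


def find_cas9_targets(seq, protospacer_len=20):
    """List (start, protospacer, PAM) for every forward-strand site.

    Assumes an NGG PAM immediately 3' of the protospacer (an assumption
    made for this sketch). A lookahead is used so overlapping sites are
    all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence: 2 nt of context, a 20-nt target, a TGG PAM, 2 nt of context.
    example = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
    for start, protospacer, pam in find_cas9_targets(example):
        print(start, protospacer, pam)
```

Each reported protospacer is the roughly 20-nucleotide sequence that would be cloned into the sgRNA expression construct; the PAM itself is part of the genomic target, not the guide.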
ii. Zinc Finger Nucleases
In some embodiments, the element that induces a single or a double strand break in the target plant’s genome is a nucleic acid construct or constructs encoding a zinc finger nuclease (ZFN). ZFNs are typically fusion proteins that include a DNA-binding domain derived from a zinc-finger protein linked to a cleavage domain such as the Type IIS enzyme Fok I (Miller, et al., Nature Biotechnology, 25:778-785 (2007)). Fok I catalyzes double-stranded cleavage of DNA at 9 nucleotides from its recognition site on one strand and 13 nucleotides from its recognition site on the other. See, also, U.S. Pat. Nos. 5,356,802; 5,436,150 and 5,487,994; as well as Li et al. Proc. Natl. Acad. Sci. USA, 89:4275-4279 (1992); Li et al. Proc. Natl. Acad. Sci. USA, 90:2764-2768 (1993); Kim et al. Proc. Natl. Acad. Sci. USA, 91:883-887 (1994a); Kim et al. J. Biol. Chem. 269:31,978-31,982 (1994b). One or more of these enzymes (or enzymatically functional fragments thereof) can be used as a source of cleavage domains.
Exemplary Type IIS restriction enzymes are described in International Publication WO 07/014275. Additional restriction enzymes also contain separable binding and cleavage domains. See, for example, Roberts et al. Nucleic Acids Res., 31:418-420 (2003). In certain embodiments, the cleavage domain includes one or more engineered cleavage half-domains (also referred to as dimerization domain mutants) that minimize or prevent homodimerization, as described, for example, in U.S. Published Application Nos. 2005/0064474, 2006/0188987, and 2008/0131962. In certain embodiments the cleavage half domain is a mutant of the wild type Fok I cleavage half domain. In some embodiments the cleavage half domains are modified to include nuclear or other localization signals, peptide tags, or other binding domains.
The DNA-binding domain, which can, in principle, be designed to target any genomic location of interest, can be a tandem array of Cys2His2 zinc fingers, each of which generally recognizes three to four nucleotides in the target DNA sequence. By linking together multiple fingers (the number varies: three to six fingers have been used per monomer in published studies), ZFN pairs can be designed to bind to genomic sequences 18-36 nucleotides long.
Another type of zinc finger, called a Cys2Cys2 zinc finger, which binds zinc between two pairs of cysteines, has been found in a range of DNA binding proteins.
The DNA-binding domain of a ZFN can be composed of two to six zinc fingers. Each zinc finger motif is typically considered to recognize and bind to a three-base pair sequence and as such, a protein including more zinc fingers targets a longer sequence and therefore may have a greater specificity and affinity to the target site. Zinc finger binding domains can be “engineered” to bind to a predetermined nucleotide sequence. See, for example, Beerli et al. Nature Biotechnol. 20:135-141 (2002); Pabo et al., Ann. Rev. Biochem. 70:313-340 (2001); Isalan et al., Nature Biotechnol. 19:656-660 (2001); Segal et al. Curr. Opin. Biotechnol. 12:632-637 (2001); Choo et al., Curr. Opin. Struct. Biol. 10:411-416 (2000).
Standard ZFNs fuse the cleavage domain to the C-terminus of each zinc finger domain. In order to allow the two cleavage domains to dimerize and cleave DNA, the two individual ZFNs must bind opposite strands of DNA with their C-termini a certain distance apart, generally 5 to 7 bp. Both single-stranded cleavage and double-stranded cleavage are possible, and double-stranded cleavage can occur as a result of two distinct single-stranded cleavage events. Repair of zinc finger-induced double stranded breaks generally occurs by non-homologous end joining (NHEJ) and results in mutations that knock out gene function. Zinc finger nuclease systems can also be combined with a second construct containing a donor molecule to insert new DNA sequences by homology-directed repair as described below.
See also Shukla, et al., Nature, 459:437-441 (2009).
See, also, U.S. Pat. Nos. 6,140,081; 6,453,242; 6,534,261; 6,610,512; 6,746,838; 6,866,997; 7,067,617; U.S. Published Application Nos. 2002/0165356; 2004/0197892; 2007/0154989; 2007/0213269; and International Patent Application Publication Nos. WO 98/53059 and WO 2003/016496, for further design considerations.
A strength of the zinc-finger nucleases is that any site can be targeted, and it is not limited by the PAM sites that are needed for Cas9 targeting. Zinc finger nuclease approaches are fully compatible with the cenh3 haploid inducer system, and the two technologies can be usefully combined to facilitate genotype-independent gene editing.
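The relationship between finger number and target length stated in this section (about three base pairs per finger, three to six fingers per monomer, a 5 to 7 bp gap between the two bound monomers) can be checked with a short calculation. The Python sketch below is illustrative only; the function name and the default 6 bp spacer are assumptions chosen for the example.

```python
BP_PER_FINGER = 3  # each finger is typically considered to recognize ~3 bp


def zfn_pair_footprint(fingers_per_monomer, spacer_bp=6):
    """Return the sequence lengths covered by a ZFN pair.

    half_site_bp  : bases recognized by one monomer
    recognized_bp : bases recognized by both monomers together
    total_span_bp : recognized bases plus the spacer between half-sites
    """
    if fingers_per_monomer < 1:
        raise ValueError("a monomer needs at least one zinc finger")
    half_site = fingers_per_monomer * BP_PER_FINGER
    recognized = 2 * half_site
    return {
        "half_site_bp": half_site,
        "recognized_bp": recognized,
        "total_span_bp": recognized + spacer_bp,
    }


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, zfn_pair_footprint(n))
```

With three fingers per monomer the pair recognizes 18 bp, and with six fingers 36 bp, which reproduces the 18-36 nucleotide range given above.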
iii. Transcription Activator-Like Effector Nucleases
In some embodiments, the element that induces a single or a double strand break in the target plant’s genome is a nucleic acid construct or constructs encoding a transcription activator-like effector nuclease (TALEN). TALENs have an overall architecture similar to that of ZFNs, with the main difference that the DNA-binding domain comes from TAL effector proteins, transcription factors from plant pathogenic bacteria. The DNA-binding domain of a TALEN is a tandem array of amino acid repeats, each about 34 residues long. The repeats are very similar to each other; typically they differ principally at two positions (amino acids 12 and 13, called the repeat variable diresidue, or RVD). Each RVD specifies preferential binding to one of the four possible nucleotides, meaning that each TALEN repeat binds to a single base pair, though the NN RVD is known to bind adenines in addition to guanine. Like zinc finger systems, RVDs are linked together to confer specificity to unique target sites, and are fused to a cleavage domain such as FokI (Cermak, et al., Nucleic Acids Research, 39(12) (2011), Page e82, doi/10.1093/nar/gkr218).
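The one-repeat-per-base logic described above lends itself to a simple lookup. The Python sketch below is illustrative and rests on an assumption not spelled out in the text: it uses the commonly reported RVD code NI for A, HD for C, NG for T, and NN for G, with NN also tolerating A as the paragraph notes. The function names are hypothetical.

```python
# Commonly reported RVD-to-base code (an assumption for this sketch;
# the text above names only the NN RVD explicitly).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
BASES_FOR_RVD = {"NI": {"A"}, "HD": {"C"}, "NG": {"T"}, "NN": {"G", "A"}}


def design_rvd_array(target):
    """Return one RVD per base of the target, in 5'-to-3' order."""
    return [RVD_FOR_BASE[base] for base in target.upper()]


def rvd_array_matches(rvds, site):
    """True if every RVD in the array can bind the base at its position."""
    site = site.upper()
    if len(rvds) != len(site):
        return False
    return all(base in BASES_FOR_RVD[rvd] for rvd, base in zip(rvds, site))


if __name__ == "__main__":
    array = design_rvd_array("GATC")
    print(array)
    print(rvd_array_matches(array, "GATC"), rvd_array_matches(array, "AATC"))
```

The second check shows the practical consequence of NN binding both guanine and adenine: an array designed for GATC would, under this simplified model, also accept AATC.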
Repair of TALEN-induced double stranded breaks generally occurs by non-homologous end joining (NHEJ) and results in mutations that knock out gene function. TALENs can also be combined with a donor template for HDR as described below. A strength of the TALEN approach is that any site can be targeted, and it is not limited by the PAM sites that are needed for Cas targeting. TALEN approaches are fully compatible with the cenh3 haploid inducer system, and the two technologies can be usefully combined to facilitate genotype-independent gene editing.
See also, Cermak, et al., Nucl. Acids Res. 1-11 (2011), US Published Application No. 2011/0145940, Miller et al., Nature Biotechnol 29:143 (2011) for further TALEN design considerations.
b. Gene Altering DNA Sequences
The nuclease activity of the genome editing systems described herein cleaves target DNA to produce single or double strand breaks in the target DNA. Double strand breaks can be repaired by the cell in one of at least two ways: non-homologous end joining (NHEJ), and homology-directed repair (HDR). In non-homologous end joining, the double-strand breaks are repaired by direct ligation of the break ends to one another. As such, no new nucleic acid material is inserted into the site, although some nucleic acid material may be lost or gained during the repair process, resulting in small deletions or insertions that can knock out gene function. In homology-directed repair, a donor polynucleotide with homology to the cleaved target DNA sequence is used as a template for repair of the cleaved target DNA sequence, resulting in the transfer of genetic information from a donor polynucleotide to the target DNA. As such, new nucleic acid material can be inserted/copied into the site.
Therefore, in some embodiments, the inducer plant optionally includes a donor polynucleotide, for example as a segment of the inducer plant’s genome. The modifications of the target plant’s DNA due to homology-directed repair can be used to induce gene correction, gene replacement, gene tagging, transgene insertion, nucleotide deletion, gene disruption, gene mutation, etc.
In this application, the donor polynucleotide sequence generally includes regions of sequence homology to the target DNA sequence known as homology arms. The sequence between the homology arms can be natural or engineered sequence that introduces new features such as tags, promoter motifs, expressed protein motifs, or other sequences of interest. During HDR, the homology arms are used to guide the repair of the double stranded break, resulting in insertion of the new sequence into the target site.
In applications in which it is desirable to insert a polynucleotide sequence into a target DNA sequence, a donor sequence to be inserted is also provided by the inducer plant’s genome. The donor sequence with homology arms can be in the form of a second DNA construct or construct component (Shi, et al., Plant Biotechnology Journal, 15:207-216 (2017)), or a construct that expresses RNA that can be used as donor template for repair (Li, et al., Nature Biotechnology, 37:445-450 (2019)).
When the donor sequence is inserted for purposes of altering the genomic sequence of the target plant, the donor sequence is typically not identical to the genomic sequence that it replaces. Rather, the donor sequence may contain at least one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology is present to support homology directed repair. In some embodiments, the donor sequence includes a nonhomologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region.
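The donor layout described above (a non-homologous sequence flanked by two regions of homology) can be pictured with a deliberately simplified string model. In the Python sketch below, the function names, the exact-match requirement for the arms, and the very short example sequences are all assumptions made for illustration; real homology arms are far longer and repair tolerates imperfect homology.

```python
def build_donor(left_arm, payload, right_arm):
    """Assemble a donor: left homology arm + new sequence + right homology arm."""
    return left_arm + payload + right_arm


def hdr_repair(target, left_arm, payload, right_arm):
    """Model HDR as replacing whatever lies between the two arms with the payload.

    Simplification: both arms must match the target exactly, and the right
    arm must occur at or after the end of the left arm.
    """
    left_start = target.find(left_arm)
    if left_start == -1:
        raise ValueError("left homology arm not found in target")
    left_end = left_start + len(left_arm)
    right_start = target.find(right_arm, left_end)
    if right_start == -1:
        raise ValueError("right homology arm not found downstream of left arm")
    return target[:left_end] + payload + target[right_start:]


if __name__ == "__main__":
    # Pure insertion: the arms are adjacent in the target.
    print(hdr_repair("AAACCCGGGTTT", "AAACCC", "TAG", "GGGTTT"))
    # Replacement: the bases between the arms ("AT") are swapped for "CA".
    print(hdr_repair("AAACCCATGGGTTT", "AAACCC", "CA", "GGGTTT"))
```

The same function covers both cases mentioned in the text: insertion of a new sequence at the break, and replacement of genomic sequence with a donor that differs by a few bases.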
The donor sequence can include restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes etc.), etc., relative to the genomic sequence which can be used to assess for successful insertion of the donor sequence at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus). In some cases, if located in a coding region, such nucleotide sequence differences will not change the amino acid sequence, or will make silent amino acid changes (i.e., changes which do not affect the structure or function of the protein). Alternatively, these sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence.
It is believed that all of the currently employed gene editing methods, including CRISPR/Cas9, zinc fingers, TALENs, their combination with base editors, and their use to direct the insertion of altered DNA sequences by HDR, are compatible with the cenh3 haploid inducer system. The value of the cenh3 haploid inducer in this context is to facilitate the transfer of gene editing machinery from an easily transformable line into elite germplasm in a rapid and efficient manner.
III. Methods of Use
A. Haploid Induction
Methods for haploid induction are provided. The disclosed haploid induction methods typically include a cross between the line to be induced and a haploid inducer line. As discussed above, the haploid inducer is typically a heterozygous cenh3 null plant.
Haploid plants are produced by making crosses between the haploid inducer and virtually any inbred, hybrid, or other germplasm of interest. Haploids are produced when the chromosomes from the haploid inducer plant are not maintained through the first cell divisions of the embryos. The resulting phenotype is not fully penetrant, with some ovules containing haploid embryos, and others containing diploid embryos, aneuploid embryos, chimeric embryos, or aborted embryos. After haploid induction, haploid embryos or seed are typically segregated from diploid and aneuploid siblings using a phenotypic or genetic marker screen and grown into haploid plants. These plants are then converted via chemical manipulation (e.g., using an anti-microtubule agent such as colchicine) into doubled haploid (DH) plants which then produce inbred seed.
Plant breeding is greatly facilitated by the use of doubled haploid (DH) plants. The production of DH plants enables plant breeders to obtain inbred lines without multi-generational inbreeding, thus decreasing the time needed to produce homozygous plants.
The haploid inducer is crossed (either as the male or female) to a targeted line to generate haploid progeny. The result is a haploid embryo or plant or seed that contains the chromosome set only from the non-inducer parent.
If the inducer is used as the male, the recovered progeny will have the cytoplasm of the targeted line. If the cytoplasm of the inducer is desired (for example to obtain male sterile cytoplasm), the haploid inducer can be used as the female.
Additional steps can include haploid identification techniques, and subsequent chromosome doubling techniques such as, but not limited to, those described by Prigge and Melchinger (Production of Haploids and Doubled Haploids, in Maize Plant Cell Culture Protocols, Methods in Molecular Biology, Volume 877, pp. 161-172, 2012) and others, which include, for example, use of colchicine, pronamide, dithiopyr, trifluralin, or another known anti-microtubule agent or other mitotic inhibitor. This line can then be directly used in downstream breeding programs.
The ease of use of the haploid inducer should make it particularly versatile when combined with other technologies that are built upon haploids. One such application is synthetic apomixis, where plants are engineered to skip meiosis and produce diploid gametes; when these lines are crossed to a haploid inducer, the resulting progeny are identical in genotype to the parent (Marimuthu, et al., Science, 331(6019):876 (2011)). Another application is the field of chromosome engineering, where the haploid inducer is exploited to quickly move small, engineered, or fully synthetic chromosomes from one line to another (Birchler, et al., Current Opinion in Plant Biology, 19:76-80 (2014)).
B. Gene Editing in Monocots
1. Methods of Gene Editing
Methods of simultaneous haploid induction and gene editing are also provided. The disclosed methods typically include a cross between the line to be induced and a haploid inducer line. As introduced above, the haploid inducer is typically a heterozygous cenh3 null plant, and also encodes gene editing machinery of the CRISPR/Cas9, zinc finger, TALEN types, either alone or in combination with base editor enzymes and/or donor molecules that allow for HDR and gene replacement or modification.
In simultaneous editing plus haploid induction, rapid and cost-effective production of edited crops and elite lines is possible without tissue culture. The line that receives the edits can be elite germplasm, and the editing machinery itself would be eliminated during the haploid induction process. At the same time, edited doubled haploid lines are produced.
The gene editing machinery is delivered via the inducer line. The DNA, RNA, and proteins that make up the gene editing machinery are encoded by and are present in the inducer line. Typically, the gene editing machinery is stably inserted in the inducer, for example, via bombardment or Agrobacterium-mediated transformation. The transgenic haploid inducer lines expressing editing machinery can be used as either pollen donors or acceptors in interspecific or intergeneric wide crosses for haploid induction and simultaneous genome editing.
The haploid inducer is crossed (either as the male or female) to a targeted line to generate haploid progeny. After fertilization, edits are made by the editing machinery in the non-inducer target genes prior to or during elimination of the inducer chromosomes. The result is a haploid embryo or plant or seed that contains the chromosome set only from the non-inducer parent, where that chromosome set contains DNA sequences that have been edited.
The promoter(s) used in the endonuclease construct typically result in endonuclease expression before fertilization, during the first couple of cell divisions, or a combination thereof. By using the haploid inducer as the female, if the endonuclease is expressed in the egg before pollination and during the first stages of cell development, the endonuclease can immediately begin mutating the target sequence upon pollination and continue mutating the target sequence before the haploid inducer genome is lost from the cell. In the first stages of mitosis, before the haploid inducer genome is eliminated, the targeted endonuclease induces targeted DNA strand break(s) in the DNA of the target line. These breaks are repaired by NHEJ to create small insertions or deletions, or corrected by HDR, depending on the gene editing application.
The haploid progeny genomes can be doubled before or after the progeny are screened for the mutation(s). Once the genomes of these haploid individuals are doubled, the individuals can be grown out and self-pollinated to produce doubled haploid seed. Different mutations may be produced, and mutation events can be evaluated to determine if the mutation(s) obtained have the desired result. The disclosed methods may be conducted on all (or many) of the monocot lines that a breeder plans to use as parents for breeding. If a breeder develops populations using lines that have a desirable mutation(s) at all targeted loci, the populations do not segregate for the desirable mutation(s). Thus, the breeding efforts are simplified by not having to select for the presence of the desirable mutation(s).
The methods can produce doubled haploid individuals with the targeted mutation(s) without the time and expense of backcrossing in a desired targeted mutation into the targeted line. The cenh3 haploid inducer is genetically dominant. When the cenh3 null is crossed to any line, the resulting F1 itself becomes a haploid inducer. The F1 individual containing the cenh3 null and gene editing machinery can be crossed to hundreds or even thousands of elite lines, particularly where the inducer stock line is used as the male. In some embodiments, doubled haploid individuals from multiple elite lines, all with the targeted mutation(s), are produced in less than one year.
In some embodiments, the inducer line can target 2, 3, 4, 5 or more mutations by including, for example, multiple endonuclease transgenes, and/or gRNAs, and/or donor sequences, etc.
Recovered doubled haploid individuals may not have all of the desired mutations. In some embodiments, where multiple mutations are desired, doubled haploid progeny with single mutations can be crossed together, and the F2 progeny can be screened for individuals that are homozygous for all desired mutations.
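The size of the F2 screen implied above can be estimated with basic Mendelian arithmetic. The Python sketch below assumes that the desired mutations are at unlinked loci, that the F1 is heterozygous at every one of them, and that segregation is normal, so that a fraction (1/4)^n of F2 plants is homozygous for all n mutations. These assumptions and the function names are introduced here for illustration and are not stated in the text.

```python
import math
from fractions import Fraction


def homozygous_fraction(n_loci):
    """Expected fraction of F2 plants homozygous mutant at all n unlinked loci."""
    return Fraction(1, 4) ** n_loci


def plants_to_screen(n_loci, confidence=0.95):
    """Smallest F2 population giving at least `confidence` probability of
    recovering one or more plants homozygous for every desired mutation."""
    p = float(homozygous_fraction(n_loci))
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, homozygous_fraction(n), plants_to_screen(n))
```

Under these assumptions, combining two mutations means that about 1 in 16 F2 plants carries both in the homozygous state, and the required screen grows quickly as more loci are stacked; linkage between loci would change the numbers.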
If the inducer is used as the male, the recovered progeny will have the cytoplasm of the targeted line. If the cytoplasm of the inducer is desired (for example to obtain male sterile cytoplasm), the haploid inducer can be used as the female. If the cytoplasm of the targeted line is desired, crosses can be made between the non-mutated version of the targeted line (as the female) and the mutated version of the targeted line (as the male).
2. Types of Gene Editing
As introduced above, suitable gene editing systems include, but are not limited to, those that cause mutations or base edits, and targeted sequence insertion by HDR. It is believed that any form of stable gene editing process, whether it be through CRISPR/Cas9, zinc fingers, or TALENs can be combined with the cenh3 null to modify the genome of a target line. By using one Cas9 nuclease and multiple gRNAs, more than one site can be targeted and altered simultaneously.
When an active Cas protein is targeted to a locus and no donor template is provided, one possible outcome is small deletions or insertions that are created during the NHEJ repair process. If the targeted location is properly selected, the mutation can create a frameshift in the coding sequence and abolish the function of the gene, or alter the promoter to change the expression pattern. If the intent is to knock out expression, multiple gRNAs are frequently targeted to several regions of the coding sequence. If multiple gRNAs are targeted to the promoter, dramatic and useful changes in gene expression can result (Rodriguez, et al., Cell, 171(2):470-80 (2017)).
Targeted mutagenesis of DNA sequence can also be achieved through direct conversion of one DNA base to another without requiring double stranded breaks (DSBs). For example, cytidine deaminase APOBEC1, adenine deaminase, and other enhancing components like Uracil DNA glycosylase (UDG) can be fused to Cas9 (A840H) nickase or nuclease-inactivated dead Cas9 (dCas9) to direct editing of DNA sequence without introducing double strand DNA breaks (Komor et al., Nature, 533:420-424 (2016) doi:10.1038/nature17946; Gaudelli et al., Nature, 551:464-471 (2017) doi:10.1038/nature24644; Komor et al., Science Advances, 3(8) eaao4774 (2017), DOI: 10.1126/sciadv.aao4774). This kind of base editor machinery can also be delivered through a haploid induction line to induce base editing in target sequences directly in other varieties.
HDR, which is an alternative means of repairing DSBs in chromosomes, is a mechanism for engineering plant genomes that facilitates more subtle DNA sequence modifications, including DNA correction, targeted knock-in or replacement, or any type of desired mutation. HDR occurs during cell division (Ceccaldi, et al., Trends Cell Biol, 26(1):52-64 (2016). doi: 10.1016/j.tcb.2015.07.009) and may be particularly active during the rapid cell divisions of the young embryo. In support of this, data presented below indicate that HDR may be responsible for the mutation in the plant produced by simultaneous haploid induction and pass-through gene editing of Example 2.
Thus, it is believed that not only can the in vivo haploid induction system be used to introduce protein, RNA or DNA for cleavage or conversion of target sequence, it can also be used, along with an appropriate repair template, to introduce precise sequence changes to regions targeted for gene editing.
The template DNA can be inserted into the inducer line genome carrying genome editing machinery such as a CRISPR-Cas9 system, either in the same transgenic locus or a different locus. When both Cas9-sgRNA and template DNA are present in the induced haploid embryos, cleavage of the target sequence will result in repair of the chromosomal break with the homologous transgenic DNA sequence as template.
Transgenes can be introduced into a DSB if the provided template contains the transgene flanked by sequences that are homologous to the sequences on either side of the DSB (homology arms, see, Shukla et al., Nature, 459:437-441, (2009)). A transgenic event (e.g., to insert a gene of interest) is crossed into the haploid inducer line. An endonuclease gene can be used to target the relative position of the transgene in a non-transgenic line. The transgenic event to be inserted needs to be flanked on both sides by DNA sequences homologous to the DNA flanking the target site. When the haploid inducer is crossed to the targeted line, the endonuclease will cause a double strand break at the target site. If the targeted line's DNA is repaired by HDR using the haploid inducer stock line’s DNA (and transgene) as the template, the targeted line DNA “repairs” the double strand break by putting the transgene sequence in the double strand break site. Thus, the disclosed methods may be used to place transgenes into targeted lines without having to backcross.
The disclosed compositions and methods can be further understood through the following numbered paragraphs.
1. A monocot haploid inducer plant heterozygous for centromeric histone 3 (CenH3) comprising diploid plant cells comprising only one allele encoding functional CENH3 protein.
2. The monocot plant of paragraph 1, wherein the diploid plant cells comprise one CenH3 allele encoding non-functional CENH3 protein.
3. The monocot plant of paragraph 2, wherein the allele encoding non-functional CENH3 protein is a protein null allele.
4. The monocot plant of paragraphs 2 or 3, wherein the allele encoding non-functional CENH3 protein is an RNA null allele.
5. The monocot plant of any one of paragraphs 1-4, wherein the allele encoding non-functional CENH3 protein is caused by frameshift mutation that creates a stop codon that abolishes function.
6. The monocot plant of any one of paragraphs 1-5, wherein the endogenous CenH3 locus on a first diploid chromosome is partially or completely deleted.
7. The monocot plant of any one of paragraphs 1-6, wherein the endogenous CenH3 locus on a second diploid chromosome is intact.
8. The monocot plant of any one of paragraphs 1-7, wherein the functional CENH3 protein is wildtype CENH3 protein.
9. The monocot plant of any one of paragraphs 1-8, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding wildtype CenH3.
10. The monocot plant of any one of paragraphs 1-9, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding a CENH3 variant or fusion protein.
11. The monocot plant of paragraph 10, wherein the fusion protein that is lacking from the plant comprises green fluorescent protein.
12. The monocot plant of paragraphs 10 or 11, wherein the fusion protein that is lacking from the plant is GFP-tailswap.
13. The monocot plant of any one of paragraphs 1-12, wherein the sperm or eggs have less CENH3 than a CENH3 homozygous wild type plant’s sperm or eggs, optionally wherein the sperm or egg have less than 50% of the CENH3 of a wild type plant’s sperm or eggs, preferably wherein the sperm or egg have between about 50% and 10%, for example, 50%, 25%, or 12.5%, of the CENH3 of a wild type plant’s sperm or eggs.
14. The monocot plant of any one of paragraphs 1-13, wherein the plant is maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
15. The monocot plant of paragraph 14, wherein the plant is maize.
16. The monocot plant of any one of paragraphs 1-15, further comprising an exogenous site-directed nuclease expressed by cells of the monocot plant.
17. The monocot plant of paragraph 16, wherein the diploid plant cells comprise one CenH3 allele encoding non-functional CENH3 protein.
18. The monocot plant of paragraph 17, wherein the allele encoding non-functional CENH3 protein is a protein null allele.
19. The monocot plant of paragraphs 17 or 18, wherein the allele encoding non-functional CENH3 protein is an RNA null allele.
20. The monocot plant of any one of paragraphs 16-19, wherein the allele encoding non-functional CENH3 protein is caused by frameshift mutation that creates a stop codon that abolishes function.
21. The monocot plant of any one of paragraphs 16-20, wherein the endogenous CenH3 loci on a first diploid chromosome is partially or completely deleted.
22. The monocot plant of any one of paragraphs 16-21, wherein the endogenous CenH3 loci on a second diploid chromosome is intact.
23. The monocot plant of any one of paragraphs 16-22, wherein the functional CENH3 protein is wildtype CENH3 protein.
24. The monocot plant of any one of paragraphs 16-23, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding wildtype CenH3.
25. The monocot plant of any one of paragraphs 16-24, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding a CENH3 variant or fusion protein.
26. The monocot plant of paragraph 25, wherein the fusion protein that is lacking from the plant comprises green fluorescent protein.
27. The monocot plant of paragraphs 25 or 26, wherein the fusion protein that is lacking from the plant is GFP-tailswap.
28. The monocot plant of any one of paragraphs 16-27, wherein the sperm or eggs have less CENH3 than a CENH3 homozygous wild type plant’s sperm or eggs, optionally wherein the sperm or eggs have less than 50% of the CENH3 of a wild type plant’s sperm or eggs, preferably wherein the sperm or eggs have between about 50% and 10%, for example, 50%, 25%, or 12.5%, of the CENH3 of a wild type plant’s sperm or eggs.
29. The monocot plant of any one of paragraphs 26-28, wherein the site-directed nuclease is stably expressed by cells of the monocot plant.
30. The monocot plant of paragraph 29, wherein the site directed nuclease is a meganuclease (MN), zinc-finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), or a CRISPR-based nuclease, optionally wherein the nuclease is selected from Cas9 nuclease, Cpf1 nuclease, dCas9-FokI, dCpf1-FokI, chimeric Cas9-cytidine deaminase, chimeric Cas9-adenine deaminase, chimeric FEN1-FokI, and Mega-TALs, a nickase Cas9 (nCas9), chimeric dCas9 non-FokI nuclease, dCpf1 non-FokI nuclease, chimeric Cpf1-cytidine deaminase, and Cpf1-adenine deaminase.
31. The monocot plant of paragraphs 29 or 30, wherein the plant’s genome comprises a heterologous nucleic acid construct encoding the nuclease.
32. The monocot plant of any one of paragraphs 16-31, further comprising a guide RNA expressed by cells of the monocot plant.
33. The monocot plant of paragraphs 29 or 30, wherein the plant’s genome comprises a heterologous nucleic acid construct encoding the gRNA.
34. The monocot plant of any one of paragraphs 16-33, wherein the plant’s genome comprises a donor nucleic acid sequence to be introduced by recombination at a cleavage site induced by the nuclease.
35. The monocot plant of any one of paragraphs 16-34, wherein the plant is maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
36. An egg cell formed by the plant of any one of paragraphs 1-15, the egg cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 12.5% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
37. A sperm cell formed by the plant of any one of paragraphs 1-15, the sperm cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 25% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
38. An egg cell formed by the plant of any one of paragraphs 16-35, the egg cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 12.5% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
39. A sperm cell formed by the plant of any one of paragraphs 16-35, the sperm cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 25% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
40. A method of inducing formation of a target haploid monocot plant comprising pollinating a parent monocot target plant with pollen from the monocot haploid inducer plant of any one of paragraphs 1-15; and selecting at least one haploid progeny produced by the pollination.
41. A method of inducing formation of a target haploid monocot plant comprising pollinating the monocot haploid inducer plant of any one of paragraphs 1-15 with pollen from a parent monocot target plant; and selecting at least one haploid progeny produced by the pollination.
42. The method of paragraphs 40 or 41, further comprising chromosome doubling of the selected haploid progeny.
43. The method of paragraph 42, wherein chromosome doubling is spontaneous or induced by a chromosome doubling agent optionally selected from colchicine, pronamide, dithipyr, trifluralin, or another antimicrotubule agent.
44. The method of any one of paragraphs 40-43, wherein the monocot target plant is selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
45. A method of modifying the genome of a monocot target plant comprising inducing formation of a target haploid monocot plant comprising pollinating a parent monocot target plant with pollen from the monocot haploid inducer plant of any one of paragraphs 16-35, and selecting at least one haploid progeny produced by the pollination, wherein the haploid progeny comprises the genome of the monocot target plant but not the monocot haploid inducer plant, and the genome of the haploid progeny has been modified by the site directed nuclease and optionally at least one guide RNA delivered by the monocot haploid inducer plant.
46. A method of modifying the genome of a monocot target plant comprising inducing formation of a target haploid monocot plant comprising pollinating the monocot haploid inducer plant of any one of paragraphs 16-35 with pollen from a parent monocot target plant, and selecting at least one haploid progeny produced by the pollination, wherein the haploid progeny comprises the genome of the monocot target plant but not the monocot haploid inducer plant, and the genome of the haploid progeny has been modified by the site directed nuclease and optionally at least one guide RNA delivered by the monocot haploid inducer plant.
47. The method of paragraphs 45 or 46, further comprising chromosome doubling of the selected haploid progeny.
48. The method of paragraph 47, wherein chromosome doubling is spontaneous or induced by a chromosome doubling agent optionally selected from colchicine, pronamide, dithipyr, trifluralin, or another antimicrotubule agent.
49. The method of any one of paragraphs 45-48, wherein the monocot target plant is selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
Examples
Example 1: Preparation of maize cenh3 null mutant, and use thereof for haploid induction.
Materials and Methods
Plant materials
The gl1, gl8, and cenh3-mu1015598 transposon insertion lines were obtained from the Maize Genetics Cooperation Stock Center, Urbana, Illinois. The cenh3-mu1015598 allele is one of several mutations in the UFMu-01386 stock line. All plants were grown in the University of Georgia Plant Biology greenhouses.
Construct preparation and transformation
The Ubi-Cas9 construct contains 1991 bp of the maize polyubiquitin promoter (GenBank: S94464.1) driving a maize codon-optimized version of Cas9 terminated by the Nos terminator.
The gRNA-ImmuneCENH3 construct contains two components, a guide RNA module and the ImmuneCENH3 gene. The guide RNA portion contains the maize U6 promoter (Svitashev, et al., Plant Physiol. 169, 931-945 (2015)) driving a guide RNA (TCCCGCAGCGCTACAGTCCC) (SEQ ID NO:1) terminated by the PolIII terminator TTTTTTTT. The ImmuneCENH3 portion contains 6455 bp of the native CENH3 gene (coordinates Chr6:166705239-166711693 on Zm-B73-REFERENCE-NAM-5.0) but has five silent codon changes in the gRNA target area (CCAGGTACGGTCGCCCTGCGCGA) (SEQ ID NO:2). The promoter includes 2184 bp of sequence upstream of the ATG.
To create the gRNA-TailswapCENH3 construct, the natural 5’ UTR of CENH3 was retained and a codon-optimized GFP sequence was inserted at the ATG of ImmuneCENH3. This was followed by a linker sequence ATGGATGAACTATACAAGGGCGGAGGCGGTGGAGGCGTCGAC (SEQ ID NO:3) and the tail sequence of the maize H3.3 gene (GenBank NM_001294303.2) including its intron, fused to the native CENH3 gene 3 bp upstream of the guide RNA target area. The Arabidopsis GFP-tailswap transgene also includes the H3.3 portion. The construct was based on the sequence of Arabidopsis GFP-tailswap obtained from the Comai laboratory.
The three constructs were synthesized by GenScript (www.genscript.com) and cloned into the binary vector pTF101.1 (Paz, et al., Euphytica 136, 167-179 (2004)). To generate the cenh3 mutation, transgenic lines carrying Ubi-Cas9 were crossed with lines carrying gRNA-ImmuneCENH3.
DNA Extraction, genotyping and sequence analysis
For standard leaf genotyping, genomic DNA was prepared using a CTAB protocol (Clarke, Cold Spring Harbor Protocols vol. 2009 pdb.prot5177-pdb.prot5177 (2009)). Endosperm tissue was collected after the kernels had germinated and the glossy phenotype could be distinguished. Embryos and pericarps were removed with forceps, and the endosperm ground to a powder with a mortar and pestle. The endosperm DNA was extracted with the IBI Plant Genomic DNA Mini Kit (IBI Scientific IB47231).
To identify the presence of ImmuneCENH3 and Cas9 in transgenic lines, primers CENH3-F2 and CENH3-R3 were used to amplify ImmuneCENH3, and primers Cas9-F1 and Cas9-R1 were used to amplify Cas9 (Table 1). To identify the original cenh3 mutation in Cas9 plants, PCR was carried out using the Phusion High-Fidelity PCR Kit (New England Biolabs, Ipswich, MA) with primers CENH3-F1 and CENH3-R1 in Table 1 (SEQ ID NOS:4-14, in descending order as they appear in the table).
Table 1: Primers used.
Primer name | Primer sequence | Purpose | SEQ ID NO
CENH3-F1 | TGCAAGATGAGGGCGAATGTG | Genotyping on native CENH3 and ImmuneCENH3 | 4
CENH3-R1 | TACTTCCTGATCTCCCGCAGC | Genotyping on native CENH3 but not ImmuneCENH3 | 5
CENH3-F2 | GGCTGCTCTTACTTGCTTGC | Genotyping on native CENH3 and ImmuneCENH3 | 6
CENH3-R2 | CGCTCTACTTTGCCGTTTGTTAC | Genotyping on native CENH3 and ImmuneCENH3 | 7
CENH3-R3 | TACTTCCTGATCTCGCGCAGGGCG | Genotyping on ImmuneCENH3 but not native CENH3 | 8
Cas9-F1 | ACGAGAAGTACCCGACAATCTACC | Genotyping on Cas9 | 9
Cas9-R1 | TGATTTGAAGTTCGGCGTCAGG | Genotyping on Cas9 | 10
CENH3-F4 | CGCTCGAACTGGAGCTTCTT | Genotyping on cenh3::mu | 11
CENH3-R4 | AGGTTGGCAGGTAGCCGTTA | Genotyping on cenh3::mu | 12
Mu1 | GCCTCTATTTCGTCGAATCCG | Genotyping on cenh3::mu | 13
Mu2 | GCCTCCATTTCGTCGAATCCC | Genotyping on cenh3::mu | 14
The PCR products were either directly Sanger sequenced or cloned using a TOPO TA cloning kit (Thermo Fisher #K457501) and then Sanger sequenced.
In lines that lack ImmuneCENH3, the cenh3 null allele was differentiated from the native CENH3 allele by PCR and restriction enzyme digestion. PCR amplifies a 496 bp product using primers CENH3-F2 and CENH3-R2. When this product is digested with the restriction endonuclease AlwNI (New England Biolabs), the wild type allele is cleaved into two pieces of size 284 bp and 212 bp while the mutant cenh3 allele is not cleaved.
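The digest-based genotype call described above can be sketched in Python. This is an illustrative sketch only: the function name `call_cenh3_genotype`, the 10 bp sizing tolerance, and the idea of passing in a list of observed band sizes are assumptions, not part of the described method. Only the fragment sizes (496 bp uncut, 284 bp and 212 bp cut) come from the text.

```python
# Hypothetical helper: interpret an AlwNI digest of the CENH3-F2/CENH3-R2 amplicon.
# Fragment sizes are taken from the text; the tolerance value is an assumption.
UNCUT_BP = 496          # mutant cenh3 allele is not cleaved
CUT_BP = (284, 212)     # wild type allele is cleaved into these two pieces


def call_cenh3_genotype(fragment_sizes_bp, tolerance_bp=10):
    """Return a genotype call from the band sizes seen on a gel."""

    def has_band(expected):
        # A band counts as present if any observed size is within the tolerance.
        return any(abs(size - expected) <= tolerance_bp for size in fragment_sizes_bp)

    uncut_present = has_band(UNCUT_BP)
    cut_present = all(has_band(size) for size in CUT_BP)

    if uncut_present and cut_present:
        return "+/cenh3"        # both alleles detected
    if cut_present:
        return "+/+"            # only the cleaved wild type product
    if uncut_present:
        return "cenh3/cenh3"    # only the uncleaved mutant product
    return "no call"            # failed PCR or unexpected pattern


print(call_cenh3_genotype([284, 212]))        # wild type pattern
print(call_cenh3_genotype([496, 284, 212]))   # heterozygote pattern
```

In practice a pattern with only the uncut band would also be produced by an incomplete digest, so a real workflow would include a digested wild type control.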
The cenh3-mu1015598 allele was scored using the primers CENH3-F4, CENH3-R4 and Mumix (a 1:1 mix of the two primers Mu1 and Mu2 in Table 1). The wild type allele is amplified with CENH3-F4 and CENH3-R4 while the Mu allele is amplified with CENH3-F4 and Mumix.
Ploidy Evaluation
Progeny from +/cenh3 crosses were grown indoors under grow lights for 10-13 days and water was sprayed on the seedlings to identify the glossy phenotype. All glossy plants were subsequently assayed by flow cytometry. For each individual, about 1 g of flash-frozen leaves or roots was collected and chopped into 1.5 ml of pre-chilled nuclei extraction buffer (2 mM EDTA, 15 mM Tris-HCl pH 7.5, 20 mM NaCl, 80 mM KCl, 0.5 mM spermine, 15 mM 2-mercaptoethanol, 0.1 mM PMSF, 0.1% Triton X-100). After chopping, the mixture was filtered through a 40 μm cell strainer twice. The nuclei were stained with 4,6-diamidino-2-phenylindole and loaded into flow cytometers hosted by the CTEGD Cytometry Shared Resource Lab at the University of Georgia.
Chromosome spreads
Chromosome analysis was carried out as described in (Dawe, et al., Cell 173, 839-850.e18 (2018)). Briefly, root tips were collected from the haploid and diploid plants, incubated in a chamber with nitrous oxide for three hours, and fixed with 90% acetic acid. Root tips were cut with a razor blade and digested in an enzyme solution (1% pectolyase Y-23, 2% cellulase Onozuka R-10) at 37°C for 50 minutes. The root section was washed in ethanol then immersed in 90% acetic acid. A metal pick was used to crush the root tips and 10 μl of the cell suspension was dropped onto microscope slides. Slides were dried and mounted with a glass coverslip using ProLong Gold with DAPI (Thermo Fisher Cat# P36931). Slides were imaged on a Zeiss Axio Imager.M1 fluorescence microscope with a 63X Plan-APO Chromat oil objective, and SlideBook software (Intelligent Imaging Innovations, Denver, CO, USA) was used to analyze the data.
Skim sequencing of haploids and aneuploids
For each sample, 12 ng/μl DNA was sonicated in a 100 μl volume with a Diagenode Bioruptor for seven minutes on high setting with 30-second on-off intervals, yielding fragments averaging about 500 bp in length. DNA sequencing libraries were prepared using the KAPA Hyperprep Kit (KK8502) with KAPA single-indexed adapters (KK8700). 600 ng of sonicated DNA was used as input for each sample, and 3 cycles of PCR were used to amplify libraries. 150-nt Illumina sequencing reads were adapter-trimmed and quality-filtered using Cutadapt version 1.9.1 (Martin, et al., EMBnet.journal vol. 17 10 (2011)) with parameters as follows: “-q 20 -a AGATCGGAAGAGC -e .05 -O 1 -m 50” (SEQ ID NO:15). Reads were aligned to Zm-B73-REFERENCE-NAM-5.0 using BWA-mem version 0.7.15 in single-end mode with default parameters (Li, & Durbin, Bioinformatics vol. 25 1754-1760 (2009)). Read coverage was visualized using IGVTools version 2.3.98 (Thorvaldsdóttir, et al., Brief. Bioinform. 14, 178-192 (2013)) with coverage calculated on 25 Mb intervals.
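The trimming and alignment steps described above can be sketched as command lines assembled in Python. This is a minimal sketch, not the pipeline actually used: the Cutadapt options are the ones quoted in the text and `bwa mem` is invoked with default parameters on a single FASTQ file as stated, but the file names, the helper function names, and the choice to build argument lists rather than execute the tools are assumptions made for illustration.

```python
import shlex

# Adapter sequence quoted in the text (SEQ ID NO:15 parameter string).
ADAPTER = "AGATCGGAAGAGC"


def cutadapt_command(fastq_in, fastq_out):
    """Build the Cutadapt call with the quality, adapter, error-rate,
    minimum-overlap and minimum-length options given in the text."""
    return [
        "cutadapt",
        "-q", "20",        # trim low-quality 3' ends
        "-a", ADAPTER,     # 3' adapter to remove
        "-e", ".05",       # maximum error rate for adapter matches
        "-O", "1",         # minimum adapter overlap
        "-m", "50",        # discard reads shorter than 50 nt after trimming
        "-o", fastq_out,
        fastq_in,
    ]


def bwa_mem_command(reference_fasta, fastq_in):
    """Build a single-end `bwa mem` call with default parameters.
    bwa writes SAM to standard output, so the caller redirects it."""
    return ["bwa", "mem", reference_fasta, fastq_in]


# Hypothetical file names, for illustration only.
trim = cutadapt_command("sample.fastq.gz", "sample.trimmed.fastq.gz")
align = bwa_mem_command("Zm-B73-REFERENCE-NAM-5.0.fa", "sample.trimmed.fastq.gz")
print(" ".join(shlex.quote(arg) for arg in trim))
print(" ".join(shlex.quote(arg) for arg in align) + " > sample.sam")
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` without shell quoting problems, should the sketch be extended to actually run the tools.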
Results
Doubled haploid breeding is widely used to accelerate the production of new inbred lines (Kalinowska, et al., Theor. Appl. Genet. 132, 593-605 (2019)). One common approach, used throughout the maize breeding industry, involves creating haploids with mutants that interfere with fertilization (Kelliher, et al., Nature 542, 105-109 (2017), Liu, et al., Mol. Plant 10, 520-522 (2017), Gilles, et al., EMBO J. 36, 707-717 (2017), Yao, et al., Nat Plants 4, 530-533 (2018), Zhong, et al., Nat Plants 6, 466-472 (2020)). An entirely different method of inducing haploids was pioneered by Simon Chan and colleagues, who showed that crossing Arabidopsis lines with a structurally altered Centromeric Histone H3 (CENH3) protein yielded haploids and aneuploids at frequencies as high as 25-45% (Ravi and Chan, Nature 464, 615-618 (2010)). CENH3 is a histone variant that defines centromere location and recruits overlying kinetochore proteins (Cheeseman & Desai, Nat. Rev. Mol. Cell Biol. 9, 33-46 (2008), Black & Bassett, Curr. Opin. Cell Biol. 20, 91-100 (2008)). The original study involved a construct called GFP-tailswap where the N-terminal tail of CENH3 was modified with a GFP tag; however, point mutations and small deletions of CENH3 can also induce haploids at similar frequencies (Karimi-Ashtiyani, et al., Proc. Natl. Acad. Sci. U. S. A. 112, 11211-11216 (2015), Kuppu, et al., PLoS Genet. 11, e1005494 (2015), Kuppu, et al., Plant Biotechnol. J. (2020) doi:10.1111/pbi.13365). Outside of Arabidopsis, centromere-mediated haploid induction has proven to be less effective, generally producing <1% haploids (Kalinowska, et al., Theor. Appl. Genet. 132, 593-605 (2019)).
The current study was designed to investigate the mechanism of centromere-mediated haploid induction in maize, initially using the GFP-tailswap method. However, this approach is complicated by the fact that it needs both a mutant of native cenh3 and a functional GFP-tailswap transgene that complements the mutant. Another group had already shown some success using an existing maize mutant (cenh3-mu1015598) caused by a Robertson's Mutator (Mu) insertion in the 5’ UTR of the gene (Kelliher, et al., Front. Plant Sci. 7, 414 (2016), Feng, et al., Plant J. (2019) doi:10.1111/tpj.14606). They crossed GFP-tailswap into the cenh3-mu1015598 background and observed an average of 0.86% haploids when crossed as a male and no haploids when crossed as a female (Kelliher, et al., Front. Plant Sci. 7, 414 (2016)). cenh3-mu1015598 was obtained and three heterozygous plants were self-crossed. Genotyping revealed that two ears segregated a low frequency of homozygous mutants that grew to various states of maturity (Table 2).
Table 2: The cenh3-mu1015598 mutant is not a null *.
Cross | # seedlings | expected WT:het:hom | observed WT:het:hom
+/cenh3-mu1015598-1 ⊗ | 76 | 19:38:19 | 23:44:8
+/cenh3-mu1015598-2 ⊗ | 37 | 9:19:9 | 6:25:6
+/cenh3-mu1015598-3 ⊗ | 25 | 6:13:6 | 8:18:0
* WT indicates wild type, het indicates +/cenh3-mu1015598 heterozygote, and hom indicates cenh3-mu1015598/cenh3-mu1015598 homozygote. Note that the first two plants yielded homozygous cenh3-mu1015598 progeny.
The recovery of homozygous mutants indicates that cenh3-mu1015598 is not a null, and that the prior results may have been confounded by a low level of wild type CENH3 expression. The variable penetrance of the cenh3-mu1015598 allele can be explained by the fact that Mu elements can promote low levels of expression when inserted into 5’ UTR regions (Barkan & Martienssen, Proc. Natl. Acad. Sci. U. S. A. 88, 3502-3506 (1991)).
To overcome the selection against true null cenh3 alleles, a cenh3 null was created using a two-construct CRISPR/Cas9 approach. One line was transformed with a simple construct expressing Cas9 driven by a Ubiquitin promoter. A second was transformed with a construct expressing a gRNA targeting the fourth exon of the native CENH3 gene and an “ImmuneCENH3” gene that contains a full-length native CENH3 gene with five silent nucleotide changes in the gRNA target area (Fig. 1A, 1B). After the two lines were crossed together, Cas9 generated mutations in the native CENH3 gene but left the transgene unaffected. A cenh3 allele with a single nucleotide deletion was chosen that causes an immediate stop codon in the N-terminal tail of CENH3 (Fig. 1C). In the presence of ImmuneCENH3, the cenh3 mutation segregates as a simple Mendelian recessive trait (Table 3).
Table 3: Segregation of cenh3 in ImmuneCENH3 and TailswapCENH3 backgrounds *.
Cross | transgene genotypes | cenh3 genotypes
+/ImmuneCENH3, +/cenh3 | +/ImmuneCENH3 or ImmuneCENH3/ImmuneCENH3 (172) | +/+ (54), +/cenh3 (75), cenh3/cenh3 (43)
 | +/+ (33) |
+/TailswapCENH3, +/cenh3 | +/TailswapCENH3 or TailswapCENH3/TailswapCENH3 (56) | +/+ (47), +/cenh3 (6), cenh3/cenh3 (0)
 | +/+ (16) |
* Numbers in parentheses show the number of plants of each genotype.
Transgenics were then created with TailswapCENH3, a close replica of the Arabidopsis GFP-tailswap construct, and crossed to the cenh3 mutation. Plants that contained TailswapCENH3 and were homozygous for cenh3 were not obtained (Table 3), indicating that the transgene does not complement a true null (Fig. 4A).
During the course of these studies, it was discovered that cenh3 was occasionally transmitted in the absence of ImmuneCENH3. By crossing to wild type lines, a simple segregating cenh3 line was obtained that lacked both of the original transgenes. Among selfed progeny from a +/cenh3 line there were 163 +/+ wild type individuals, 55 +/cenh3 heterozygotes, and zero cenh3/cenh3 homozygotes, indicating that the mutant is homozygous lethal and poorly transmitted through gametophytes. Reciprocal crosses between +/cenh3 heterozygotes and wild type plants were also carried out. A Mendelian trait is normally transmitted to 50% of testcross progeny; however, it was observed that only 12.1% of the progeny received cenh3 when crossed through the male and 25% when crossed through the female (Table 4).
Table 4: Transmission of cenh3 through male and female crosses *
Cross | # seedlings | expected WT:het:hom | expected het frequency | observed WT:het:hom | observed het frequency
+/cenh3 ⊗ | 218 | 55:103:55 | 50% | 163:55:0 | 25.2%
+/cenh3 ♀ X B73 | 184 | 90:90 | 50% | 138:46:0 | 25.0%
B73 ♀ X +/cenh3 ♂ | 140 | 70:70 | 50% | 123:17:0 | 12.1%
* WT indicates wild type, het indicates +/cenh3 heterozygote, hom indicates cenh3/cenh3 homozygote.
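The departure of the testcross results in Table 4 from the 50% Mendelian expectation can be checked with a chi-square goodness-of-fit calculation. The sketch below is illustrative and was not part of the reported analysis; the function name `transmission_test` and the use of the 3.841 critical value (one degree of freedom, 5% level) are assumptions. The counts are the observed WT and het numbers from the two reciprocal crosses.

```python
# Chi-square goodness-of-fit against a 1:1 testcross expectation.
# Counts come from Table 4; the statistical framing is an illustrative assumption.
CHI2_CRITICAL_DF1_P05 = 3.841  # standard critical value, 1 degree of freedom, alpha = 0.05


def transmission_test(n_wild_type, n_het):
    """Return (observed het frequency, chi-square statistic) for a testcross."""
    total = n_wild_type + n_het
    expected = total / 2  # 1:1 expectation for a Mendelian trait
    chi2 = ((n_wild_type - expected) ** 2 + (n_het - expected) ** 2) / expected
    return n_het / total, chi2


crosses = {
    "+/cenh3 female x B73": (138, 46),
    "B73 female x +/cenh3 male": (123, 17),
}

for name, (wt, het) in crosses.items():
    freq, chi2 = transmission_test(wt, het)
    verdict = "deviates from 1:1" if chi2 > CHI2_CRITICAL_DF1_P05 else "consistent with 1:1"
    print(f"{name}: het frequency {freq:.1%}, chi-square {chi2:.1f}, {verdict}")
```

Both crosses give chi-square values far above the critical value, consistent with the reduced transmission described in the text.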
The reduction in transmission may be explained by the fact that sperm and eggs are carried within multicellular haploid gametophytes. Two haploid cell divisions precede the formation of sperm and three haploid cell divisions precede the formation of an egg. Those gametophytes with the cenh3 allele must use CENH3 carried over from the sporophytic phase while it is naturally diluted at each cell cycle (Lermontova, et al., Plant Cell 18, 2443-2451 (2006)). Under this model, cenh3 sperm would have about 1/4 of the normal amount of CENH3 and an egg carrying cenh3 would have about 1/8 relative to the cenh3 heterozygous parent (Fig. 4B). Assuming no dosage compensation, those values would be reduced by an additional 1/2 relative to a normal homozygous wild type parent. As a result, sperm and eggs carrying cenh3 may have smaller centromeres.
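The dilution model in the preceding paragraph amounts to halving the carried-over CENH3 at each haploid division. The short sketch below works through that arithmetic; the function name and the flag for expressing the result relative to a homozygous wild type parent are illustrative assumptions, and the model itself (no new CENH3 synthesis from the cenh3 allele, no dosage compensation) is the one stated in the text.

```python
from fractions import Fraction


def carried_over_cenh3(haploid_divisions, relative_to_wild_type=False):
    """Fraction of CENH3 remaining in a cenh3 gamete under simple dilution.

    Each haploid division halves the protein inherited from the sporophyte.
    With relative_to_wild_type=True, an additional factor of 1/2 is applied
    for the heterozygous parent (assuming no dosage compensation).
    """
    fraction = Fraction(1, 2) ** haploid_divisions
    if relative_to_wild_type:
        fraction *= Fraction(1, 2)
    return fraction


SPERM_DIVISIONS = 2  # two haploid divisions precede sperm formation
EGG_DIVISIONS = 3    # three haploid divisions precede egg formation

for label, divisions in (("sperm", SPERM_DIVISIONS), ("egg", EGG_DIVISIONS)):
    vs_het = carried_over_cenh3(divisions)
    vs_wt = carried_over_cenh3(divisions, relative_to_wild_type=True)
    print(f"{label}: {vs_het} of heterozygous parent level, {vs_wt} of wild type level")
```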
To test whether +/cenh3 heterozygous mutants are able to induce haploids, cenh3 heterozygotes were crossed with tester lines in both directions. In the first test, wild type and +/cenh3 plants were crossed to a line that is homozygous for a recessive glossy8 (gl8) mutation on chromosome 5 that causes seedling leaves to have a shiny appearance (Xu, et al., Plant Physiology vol. 115 501-510 (1997)). It was observed that 0.5% of the progeny were glossy when +/cenh3 heterozygotes were crossed as male, and 5.0% of the progeny were haploid when +/cenh3 plants were crossed as female (Table 5).
Table 5. Haploid and aneuploid induction by +/cenh3 heterozygotes.
Cross | # seedlings | # glossy plants | # haploid | haploid ratio | # aneuploid | aneuploid ratio
gl8 ♀ X +/cenh3 ♂ | 597 | 3 | 3 | 0.5% | 0 | 0
gl8 ♀ X WT ♂ | 826 | 0 | - | - | - | -
+/cenh3 ♀ X gl8 ♂ | 838 | 42 | 42 | 5.0% | 2 [1] | 0.2%
WT ♀ X gl8 ♂ | 1000 | 0 | - | - | - | -
+/cenh3 ♀ X gl1 ♂ | 844 | 75 | 44 | 5.2% | 28 | 3.3%
WT ♀ X gl1 ♂ | 1114 | 0 | - | - | - | -
[1] The two aneuploid plants are non-glossy plants with stunted phenotypes.
Flow cytometry analysis revealed that all of the glossy plants were haploids (Fig. 2A-2B), an interpretation that was confirmed by counting chromosomes in root tip cells of three plants (Fig. 2C-2D). When grown to maturity the haploid plants were short and sterile (Chase, Bot. Rev. 35, 117-168 (1969)) (Fig. 2E-2F). Also observed were two non-glossy plants with stunted phenotypes that were believed to be aneuploids. These two plants were skim sequenced along with six haploids. While the haploids showed uniform sequence coverage, the stunted plants did not; one was trisomic for chromosome 3, and the other was monosomic for chromosomes 2 and 4 and trisomic for chromosome 10 (Fig. 3A-3B).
A second set of tests was carried out using glossy1 (gl1), which has a similar phenotype but the mutation is on chromosome 7 (Sturaro, et al., Plant Physiol. 138, 478-489 (2005)). In these crosses the germination rate was also scored, which is an indirect measure of karyotypic abnormality commonly used to score the efficacy of Arabidopsis haploid inducers (Kuppu, et al., PLoS Genet. 11, e1005494 (2015), Ravi, et al., Nature Communications vol. 5 (2014), Maheshwari, et al., PLoS Genet. 11, e1004970 (2015)). In crosses where +/cenh3 heterozygotes were the female, 5.2% of the progeny showed the glossy phenotype and were haploid by flow cytometry measurements. Another 3.3% of the progeny showed the glossy phenotype but had a higher DNA content than expected for haploids, and were scored as aneuploids (Table 6).
Table 6: Results of individual crosses between +/cenh3 plants and the gl1 tester.
cross [1] | # seeds | # seedlings | germination ratio | # glossy plants | # haploid | haploid ratio | # aneuploid | aneuploid ratio
NW222 | 192 | 147 | 76.6% | 13 | 7 | 4.8% | 6 | 4.1%
NW223 | 192 | 174 | 90.6% | 7 | 2 | 1.2% | 5 | 2.9%
NW224 | 192 | 157 | 81.8% | 19 | 14 | 8.9% | 5 | 3.2%
NW225 | 192 | 140 | 72.9% | 14 | 11 | 7.9% | 3 | 2.1%
NW227 | 180 | 117 | 65.0% | 10 [2] | 3 | 2.6% | 6 | 5.1%
NW228 | 165 | 109 | 66.1% | 12 [2] | 7 | 6.4% | 3 | 2.8%
Total | 1113 | 844 | 75.8% | 75 [2] | 44 | 5.2% | 28 | 3.3%
WT X gl1 | 1125 | 1114 | 99.0% | 0 | 0 | 0 | 0 | 0
[1] NW222, NW223, NW224, NW225, NW227, and NW228 are different ears from the cross +/cenh3 ♀ X gl1.
[2] Unable to interpret the ploidy level in three glossy plants.
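The pooled ratios in the Total row of Table 6 follow directly from the per-ear counts. The sketch below recomputes them as a consistency check; it is an illustration rather than part of the reported analysis, and the data structure and function names are assumptions. Per-ear percentages recomputed this way may differ from the printed values in the last decimal place because of rounding.

```python
# Per-ear counts from Table 6: (seeds, seedlings, haploids, aneuploids).
EARS = {
    "NW222": (192, 147, 7, 6),
    "NW223": (192, 174, 2, 5),
    "NW224": (192, 157, 14, 5),
    "NW225": (192, 140, 11, 3),
    "NW227": (180, 117, 3, 6),
    "NW228": (165, 109, 7, 3),
}


def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def pooled_summary(ears):
    """Sum counts over all ears and return pooled percentages."""
    seeds = sum(row[0] for row in ears.values())
    seedlings = sum(row[1] for row in ears.values())
    haploids = sum(row[2] for row in ears.values())
    aneuploids = sum(row[3] for row in ears.values())
    return {
        "seeds": seeds,
        "seedlings": seedlings,
        "germination_pct": percent(seedlings, seeds),
        "haploid_pct": percent(haploids, seedlings),      # haploids per germinated seedling
        "aneuploid_pct": percent(aneuploids, seedlings),  # aneuploids per germinated seedling
    }


print(pooled_summary(EARS))
```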
Different crosses differed considerably in the germination rate (65-91%), frequencies of haploids (1.2-8.9%) and aneuploids (2.1-5.1%) (Table 6). Sequence data from five aneuploid plants confirmed that all except one were missing chromosome 7, sometimes in conjunction with the loss of other chromosomes. One glossy plant that appeared to have two complete copies of chromosome 7 may have had a small interstitial deletion that was not detectable by skim sequencing (segmental aneuploids are common in Arabidopsis GFP-tailswap crosses (Tan, et al., Elife 4, (2015))). The results from the gl1 tests are more in line with what has been observed in Arabidopsis, where any given cross with GFP-tailswap generally yields haploids and aneuploids in similar proportions (Ravi and Chan, Nature 464, 615-618 (2010), Ravi, et al., Nature Communications vol. 5 (2014), Maheshwari, et al., PLoS Genet. 11, e1004970 (2015)).
If CENH3 dilution is the underlying mechanism for haploid induction, then only gametes carrying the cenh3 mutation from the +/cenh3 parent should induce haploids. Unfortunately it is not possible to score seedlings for the presence of the cenh3 allele because the genome of the haploid inducer is lost. However, data from Arabidopsis GFP-tailswap crosses show that endosperm rarely displays complete uniparental genome elimination when the seedling is haploid (Ravi, et al., Nature Communications vol. 5 (2014)). If true in maize as well, the genotype of the endosperm could be used to determine the original genotype of the seedling. The remnant endosperm from a set of eleven haploid plants produced from a +/cenh3 X gl8 cross was genotyped. The results revealed that all eleven were heterozygous for the cenh3 allele, strongly supporting the interpretation that haploid induction is a result of the low CENH3 levels in cenh3 gametes.
One of the striking elements of centromere-mediated haploid induction is that it is effective in only a subset of progeny. In some individuals, all the chromosomes from the haploid inducer parent are lost, and in another much larger subset no chromosome loss occurs. The relatively small aneuploid class represents “partial haploid induction” events, where some chromosomes were lost but others survived. The fact that the gl8 crosses yielded more true haploids than the gl1 crosses may be related to the fact that the former were carried out in the summer while the latter were carried out in winter. It is also possible that the selection scheme played a role. Studies using the maize r-X1 deletion line, which generates monosomics at high frequency, have demonstrated that some chromosomes are recovered as monosomics at higher frequency than others. Monosomics for chromosome 5 (with gl8) are rarely recovered whereas monosomics for chromosome 7 (with gl1) are far more common (17 times more common (Weber, Use of Maize Monosomics for Gene Localization and Dosage Studies. in The Maize Handbook (eds. Freeling, M. & Walbot, V.) 350-358 (Springer New York, 1994))). Indeed, two of five sequenced aneuploids from gl1 crosses were monosomic for chromosome 7 only (Table 6). These data may indicate that the gl8 tester favors the recovery of haploids while the gl1 tester recovers a broader range of ploidies.
It is believed that all prior literature on centromere-mediated haploid induction describes the complementation of a null allele with a variant of CENH3 or alleles that produce altered or partially deleted forms of CENH3 (Ravi and Chan, Nature 464, 615-618 (2010), Karimi-Ashtiyani, et al., Proc. Natl. Acad. Sci. U. S. A. 112, 11211-11216 (2015), Kuppu, et al., Plant Biotechnol. J. (2020) doi:10.1111/pbi.13365, Maheshwari, et al., PLoS Genet. 11, e1004970 (2015), Ishii, et al., Annu. Rev. Plant Biol. 67, 421-438 (2016)). These data have served to sustain the original interpretation that haploid induction is caused by a competition between two structurally different forms of CENH3, and ultimate rejection of the altered centromeres by a surveillance mechanism for improper assembly (Ravi and Chan, Nature 464, 615-618 (2010), Britt and Kuppu, Front. Plant Sci. 7, 357 (2016), Kalinowska, et al., Theor. Appl. Genet. 132, 593-605 (2019), Kuppu, et al., Plant Biotechnol. J. (2020) doi:10.1111/pbi.13365, Maheshwari, et al., PLoS Genet. 11, e1004970 (2015), Copenhaver, & Preuss, Nat. Biotechnol. 28, 423-424 (2010)), compared to other possible mechanisms (Wang & Dawe, Molecular Plant vol. 11 398-406 (2018), Karimi-Ashtiyani, et al., Proc. Natl. Acad. Sci. U. S. A. 112, 11211-11216 (2015), Ravi, et al., PLoS Genet. 7, e1002121 (2011), Wang, et al., Plant Methods 15, 42 (2019), Sanei, et al., Proc. Natl. Acad. Sci. U. S. A. 108, E498-505 (2011), Tan, et al., Elife 4, (2015)).
In contrast, the data herein achieved high levels of haploid induction using a cenh3 mutation in the N-terminal tail that removes all sequence that interacts with DNA or other histones. Therefore, quantitative reductions in CENH3 alone can induce centromere-mediated haploid induction. The major advantage of the cenh3 null approach is that the plants are vigorous and the process is simple. Any line that is crossed to cenh3 becomes a haploid inducer. The ease of use should make it particularly versatile when combined with other technologies that are built upon haploids, such as synthetic apomixis (Marimuthu, et al., Science, 331(6019):876 (2011), Wang, et al., aBIOTECH, 1:15-20 (2020)), the transfer of engineered chromosomes from one line to another (Birchler, et al., Current Opinion in Plant Biology, 19:76-80 (2014)), and genotype-independent gene editing (Kelliher, et al., Nat. Biotechnol. 37, 287-292 (2019)).
Example 2: Haploid induction with simultaneous gene editing
Materials and Methods
A use of the cenh3 haploid inducer involves simultaneous haploid induction and gene editing. In an example of this use, the cenh3 null is first crossed to a line containing a CRISPR construct (expressing both Cas9 and one or more guide RNAs). This hybrid line, with both cenh3 and CRISPR components, is then crossed as a female to a male wild type maize line. It is believed that upon fertilization, roughly 5% of the progeny will be haploid, and among these, approximately half will have received the CRISPR construct expressing Cas9 and guide RNA(s). CRISPR components are expressed in the early zygotic divisions, where they can catalyze gene editing of the paternal genome. The genome of the female parent will be rapidly lost during haploid induction, removing the CRISPR components and leaving only the paternal genome, of which a fraction will have sustained gene editing.
To test haploid induction with simultaneous gene editing, experiments were designed to cross the cenh3 null to a CRISPR construct expressing Cas9 from the Ubiquitin promoter and eight guide RNAs targeting four genes that control plant development: fasciated ear-2 (fea2), fasciated ear-3 (fea3), compact plant-2 (ct2), and thick tassel dwarf-1 (td1). The CRISPR construct was first introduced by Agrobacterium-mediated transformation into an inbred called B104. The transgenic B104 line was then crossed to the heterozygous cenh3 null plant. From the progeny of this cross, plants with both the CRISPR construct and cenh3 were crossed to a maize line homozygous for a recessive mutation called luteus-1 to identify haploids.
Results
Of a total of 192 plants, 7 proved to be haploids. A total of 40 diploid plants and all 7 haploid plants were genotyped (by amplicon Sanger sequencing) for edits at each guide RNA site. The overall editing frequency was low, yielding a total of five edits among the forty diploid plants. Importantly, however, one of the haploid plants was edited at the fea2 gene. The edit in the haploid plant conferred a fasciated ear phenotype consistent with the known recessive phenotype of this mutant (Taguchi-Shiobara, et al., Genes Dev. 15: 2755-2766 (2001)). The editing frequency of 1/7 in haploids roughly matches the editing frequency of 5/40 in diploids, indicating, in this small early trial experiment, that editing is roughly as efficient in haploids as in diploids.
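By way of illustration only, the frequencies reported above can be compared with a two-sided Fisher's exact test. The following Python sketch is an assumption of this illustration and not an analysis performed in the experiments; the function name fisher_exact_two_sided is hypothetical, and the counts are the plant-level figures stated above (1 of 7 haploids and 5 of 40 diploids edited, 7 haploids among 192 plants).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. haploid, diploid); columns are edited, unedited.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(k):
        # Hypergeometric probability of k edited plants in the first row.
        return comb(col1, k) * comb(total - col1, row1 - k) / comb(total, row1)

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(k) for k in range(k_min, k_max + 1)
            if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    print(f"haploid induction rate: {7 / 192:.1%}")
    print(f"haploid editing frequency: {1 / 7:.1%}")
    print(f"diploid editing frequency: {5 / 40:.1%}")
    # Edited vs. unedited plants: haploids (1, 6), diploids (5, 35).
    print(f"Fisher exact p-value: {fisher_exact_two_sided(1, 6, 5, 35):.3f}")
```

With counts this small the test has little power, so a large p-value is consistent with, but does not establish, equal editing efficiency in haploids and diploids.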
Of further note is the fact that the single edit is of a non-standard type. While the edits in diploids were consistent with cleavage and repair by non-homologous end joining (NHEJ), the haploid plant contained an insertion flanked by a short region of homology on either side. This type of edit may indicate erroneous homology-directed repair (HDR) (Xue and Greene, Trends in Genetics, DOI: 10.1016/j.tig.2021.02.008 (2021)). These data provide reason to believe that HDR is active during the early zygotic divisions, and that cleavage and repair can occur during the brief time that both genomes are present, prior to the loss of the chromosomes from the haploid inducer line. HDR is an important form of repair for many CRISPR applications involving replacement of promoters or genes with new sequences (Zhang et al., Nat Plants, 5: 778-794 (2019)).
The results of these experiments are illustrated in Figures 5A-5D and Tables 7 and 8. Figure 5A is a plasmid map of the CRISPR construct used for simultaneous haploid induction and gene editing. Construct components are indicated.
Table 7: Frequency of edits from diploid and haploid progeny of cenh3 heterozygous plant carrying Cas9 and eight gRNAs (SEQ ID NOS:30-37, in descending order as they appear in the table)
Gene   gRNA   gRNA sequence          Haploid edits   Diploid edits   SEQ ID NO
FEA3   1      GCGCCCAGCTGTXGACCTG    0/7             0/40            30
FEA3   2      GCTCGTGGAGAACAACCTGA   0/7             0/40            31
FEA2   1      GGAAGGCGAGAAGCGCTGCG   0/7             1/40            32
FEA2   2      GTACGGCTGCGATACCGGCG   1/7*            2/40            33
TD1    1      GTCGCCAACTGCTATCTCCG   0/7             0/39            34
TD1    2      GGATACTACAACGAGTACAG   0/7             1/39            35
CT2    1      TGTAAGGTGCTGGAGAATCG   0/7             0/40            36
CT2    2      GCTTTGACGAGGCAGAGCn    0/7             1/[illegible]   37
Total edits                          1/7             5/40
Figures 5B-5D compare wild type (Figure 5B) and an example of the young ear phenotype of fea2 (Figure 5C), adapted from Taguchi-Shiobara, et al., Genes Dev. 15: 2755-2766 (2001), to the edited haploid plant obtained using the cenh3 haploid inducer (Figure 5D).
Table 8: Illustration/Alignment of the Edits (SEQ ID NOS:38-47, in descending order as they appear in the table)
SEQ ID NO   Sample            Sequence
38          FEA2 WT Guide 1   CGCGGAAGGCGAGAAGCGCTGCGCGGTCGGTGGAGGCAGGGACTGGAGGTTGGGTCGCCGA
39          FEA2 diploid      CGCGGAAGGCGAGAAG---TGCGCGGTCGGTGGAGGCAGGGACTGCAGGTTGGGTCGCCGA
40          FEA2 WT Guide 2   TCCGTACGGCTGCGATACCGGCGGGGATCTCGCCGGAGAAGCGGTTGTGGGAGAGGTCGA
41          FEA2 diploid      TCCGTACGGCTGCGATAC--GCGGGGATCTCGCCGGAGAAGCGGTTGTGGGAGAGGTCGA
42          FEA2 diploid      TCCGTACGGCTGCGATACCG-CGGGGATCTCGCCGGAGAAGCGGTTGTGGGAGAGGTCGA
43          FEA2 haploid      WMM^^MMcGCCGGAGAAGCGGTTGTGGGAGAGGTCGAGGAGGAGGAGAqC GGAGTTC^GGGGTCCMXGACGATCCGCGGGGGGACt^C^CGGAGATGGCGTTGCGGGAGAGATCAAGÛGCAGCGA GGCGCGCGGGGAAGGAGAGACGCGGGGAGAGCGGG
44          TD1 WT Guide 2    GTCGGATACTACAACCAGTACAGCGGCGGGGTCCCGCGCGAGTTCGGCGCGCTCCAGTCGC
45          TD1 diploid       GTCGGATACTACAACCAGT-CAGCGGCGGGGTCCCGCGCGAGTTCGGCGCGCTCCAGTCGC
46          CT2 WT Guide 2    CTGGCTTGACGAGGCAGAGCTTAGGAGCTACACATCAGTCATCCATGCTAATGTGTATCAG
47          CT2 diploid       CTGGCTTGACGAGGCAGA-CTTAGGAGCTACACATCAGTCATCCATGCTAATGTGTATCAG
The edits in diploid plants are within the gRNA regions, consistent with cleavage and repair by NHEJ. The edit in the haploid plant is an 81 bp insertion suggestive of repair by erroneous HDR, presumably mediated by the flanking region of microhomology. Annotations indicate the following features: guide RNA target regions; PAM sites; deleted bases (dashes); duplicated region; homology flanking duplication. WT = wild type.
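By way of illustration only, deleted bases marked by dashes in a gapped alignment such as Table 8 can be located programmatically. The following Python sketch is not part of the disclosed methods; the function name find_deletions is hypothetical, and the example strings are short excerpts transcribed from the CT2 and FEA2 Guide 1 rows, with the wild-type excerpts reconstructed from the corresponding diploid rows and therefore to be treated as illustrative.

```python
def find_deletions(wild_type, edited, gap="-"):
    """Return (0-based start, deleted bases) for each run of gaps in `edited`.

    Both strings must come from the same gapped alignment and therefore
    have equal length; gaps are expected only in the edited sequence.
    """
    if len(wild_type) != len(edited):
        raise ValueError("aligned sequences must have equal length")
    deletions = []
    start = None
    for i, base in enumerate(edited):
        if base == gap:
            if start is None:
                start = i  # a new run of deleted bases begins here
        elif start is not None:
            deletions.append((start, wild_type[start:i]))
            start = None
    if start is not None:
        # The deletion runs to the end of the aligned window.
        deletions.append((start, wild_type[start:]))
    return deletions


if __name__ == "__main__":
    # Excerpts (assumed transcriptions) from the CT2 rows of Table 8.
    ct2_wt = "CTGGCTTGACGAGGCAGAGCTTAGGAGC"
    ct2_diploid = "CTGGCTTGACGAGGCAGA-CTTAGGAGC"
    print(find_deletions(ct2_wt, ct2_diploid))
```

A small deletion inside the guide RNA target region, as in these diploid rows, is the pattern the text associates with NHEJ; the 81 bp insertion in the haploid row would need an insertion-aware comparison that this sketch does not attempt.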
A very small experiment with a different construct (directed to ZmGB1) yielded no detectable edits in haploids and only a few in diploids.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Claims (55)

  1. 1. A monocot haploid inducer plant heterozygous for centromeric histone 3 (CenH3) comprising diploid plant cells comprising only one allele encoding functional CENH3 protein.
  2. 2. The monocot plant of claim 1, wherein the diploid plant cells comprise one CenH3 allele encoding non-functional CENH3 protein.
  3. 3. The monocot plant of claim 2, wherein the allele encoding non-functional CENH3 protein is a protein null allele.
  4. 4. The monocot plant of claim 3, wherein the allele encoding non-functional CENH3 protein is an RNA null allele.
  5. 5. The monocot plant of claim 1, wherein the allele encoding non-functional CENH3 protein is caused by frameshift mutation that creates a stop codon that abolishes function.
  6. 6. The monocot plant of claim 1, wherein the endogenous CenH3 loci on a first diploid chromosome is partially or completely deleted.
  7. 7. The monocot plant of claim 3, wherein the endogenous CenH3 loci on a second diploid chromosome is intact.
  8. 8. The monocot plant of claim 7, wherein the functional CENH3 protein is wildtype CENH3 protein.
  9. 9. The monocot plant of claim 8, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding wildtype CenH3.
  10. 10. The monocot plant of claim 9, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding a CENH3 variant or fusion protein.
  11. 11. The monocot plant of claim 10, wherein the fusion protein that is lacking from the plant comprises green fluorescent protein.
  12. 12. The monocot plant of claim 11, wherein the fusion protein that is lacking from the plant is GFP-tailswap.
  13. 13. The monocot plant of claim 12, wherein the sperm or eggs have less CENH3 than a CENH3 homozygous wild type plant’s sperm or eggs, optionally wherein the sperm or egg have less than 50% CENH3 than a wild type plant’s sperm or eggs, preferably, wherein the sperm or egg have between about 50% and 10%, for example, 50%, 25%, or 12.5% of the CENH3 than a wild type plant’s sperm or eggs.
  14. 14. The monocot plant of claim 13, wherein the plant is maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
  15. 15. The monocot plant of claim 14, wherein the plant is maize.
  16. 16. The monocot plant of claim 1, further comprising an exogenous site-directed nuclease expressed by cells of the monocot plant.
  17. 17. The monocot plant of claim 16, wherein the diploid plant cells comprise one CenH3 allele encoding non-functional CENH3 protein.
  18. 18. The monocot plant of claim 17, wherein the allele encoding non-functional CENH3 protein is a protein null allele.
  19. 19. The monocot plant of claim 18, wherein the allele encoding non-functional CENH3 protein is an RNA null allele.
  20. 20. The monocot plant of claim 16, wherein the allele encoding non-functional CENH3 protein is caused by frameshift mutation that creates a stop codon that abolishes function.
  21. 21. The monocot plant of claim 18, wherein the endogenous CenH3 loci on a first diploid chromosome is partially or completely deleted.
  22. 22. The monocot plant of claim 18, wherein the endogenous CenH3 loci on a second diploid chromosome is intact.
  23. 23. The monocot plant of claim 22, wherein the functional CENH3 protein is wildtype CENH3 protein.
  24. 24. The monocot plant of claim 23, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding wildtype CenH3.
  25. 25. The monocot plant of claim 24, wherein the plant lacks a chromosomally integrated or extrachromosomal transgene encoding a CENH3 variant or fusion protein.
  26. 26. The monocot plant of claim 25, wherein the fusion protein that is lacking from the plant comprises green fluorescent protein.
  27. 27. The monocot plant of claim 26, wherein the fusion protein that is lacking from the plant is GFP-tailswap.
  28. 28. The monocot plant of claim 27, wherein the sperm or eggs have less CENH3 than a CENH3 homozygous wild type plant’s sperm or eggs, optionally wherein the sperm or egg have less than 50% CENH3 than a wild type plant’s sperm or eggs, preferably, wherein the sperm or egg have between about 50% and 10%, for example, 50%, 25%, or 12.5% of the CENH3 than a wild type plant’s sperm or eggs.
  29. 29. The monocot plant of claim 28, wherein the site-directed nuclease is stably expressed by cells of the monocot plant.
  30. 30. The monocot plant of claim 29, wherein the site directed nuclease is a meganuclease (MN), zinc-finger nuclease (ZFN), transcription activator-like effector nuclease (TALEN), or a CRISPR-based nuclease optionally wherein the nuclease is selected from Cas9 nuclease, Cpf1 nuclease, dCas9-FokI, dCpf1-FokI, chimeric Cas9-cytidine deaminase, chimeric Cas9-adenine deaminase, chimeric FEN1-FokI, and Mega-TALs, a nickase Cas9 (nCas9), chimeric dCas9 non-FokI nuclease, dCpf1 non-FokI nuclease, chimeric Cpf1-cytidine deaminase, and Cpf1-adenine deaminase.
  31. 31. The monocot plant of claim 30, wherein the plant’s genome comprises a heterologous nucleic acid construct encoding the nuclease.
  32. 32. The monocot plant of claim 30, further comprising a guide RNA expressed by cells of the monocot plant.
  33. 33. The monocot plant of claim 32, wherein the plant’s genome comprises a heterologous nucleic acid construct encoding the gRNA.
  34. 34. The monocot plant of claim 33, wherein the plant’s genome comprises a donor nucleic acid sequence to be introduced by recombination at a cleavage site induced by the nuclease.
  35. 35. The monocot plant of claim 34, wherein the plant is maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
  36. 36. An egg cell formed by the plant of any one of claims 1-15, the egg cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 12.5% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
  37. 37. A sperm cell formed by the plant of any one of claims 1-15, the sperm cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 25% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
  38. 38. An egg cell formed by the plant of any one of claims 16-35, the egg cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 12.5% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
  39. 39. A sperm cell formed by the plant of any one of claims 16-35, the sperm cell lacking the one allele encoding functional CENH3 protein and comprising no more than about 25% functional CENH3 protein relative to a corresponding egg cell formed by a CenH3 homozygous plant.
  40. 40. A method of inducing formation of a target haploid monocot plant comprising pollinating a parent monocot target plant with pollen from the monocot haploid inducer plant of any one of claims 1-15; and selecting at least one haploid progeny produced by the pollination.
  41. 41. The method of claim 40, further comprising chromosome doubling of the selected haploid progeny.
  42. 42. The method of claim 41, wherein chromosome doubling is spontaneous or induced by a chromosome doubling agent optionally selected from colchicine, pronamide, dithiopyr, trifluralin, or another anti-microtubule agent.
  43. 43. The method of claim 40, wherein the monocot target plant is selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
  44. 44. A method of inducing formation of a target haploid monocot plant comprising pollinating the monocot haploid inducer plant of any one of claims 1-15 with pollen from a parent monocot target plant; and selecting at least one haploid progeny produced by the pollination.
  45. 45. The method of claim 44, further comprising chromosome doubling of the selected haploid progeny.
  46. 46. The method of claim 45, wherein chromosome doubling is spontaneous or induced by a chromosome doubling agent optionally selected from colchicine, pronamide, dithiopyr, trifluralin, or another anti-microtubule agent.
  47. 47. The method of claim 44, wherein the monocot target plant is selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
  48. 48. A method of modifying the genome of a monocot target plant comprising inducing formation of a target haploid monocot plant comprising pollinating a parent monocot target plant with pollen from the monocot haploid inducer plant of any one of claims 16-35, and selecting at least one haploid progeny produced by the pollination, wherein the haploid progeny comprises the genome of the monocot target plant but not the monocot haploid inducer plant, and the genome of the haploid progeny has been modified by the site directed nuclease and optionally at least one guide RNA delivered by the monocot haploid inducer plant.
  49. 49. The method of claim 48, further comprising chromosome doubling of the selected haploid progeny.
  50. 50. The method of claim 49, wherein chromosome doubling is spontaneous or induced by a chromosome doubling agent optionally selected from colchicine, pronamide, dithiopyr, trifluralin, or another anti-microtubule agent.
  51. 51. The method of claim 48, wherein the monocot target plant is selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
  52. 52. A method of modifying the genome of a monocot target plant comprising inducing formation of a target haploid monocot plant comprising pollinating the monocot haploid inducer plant of any one of claims 16-35 with pollen from a parent monocot target plant, and selecting at least one haploid progeny produced by the pollination, wherein the haploid progeny comprises the genome of the monocot target plant but not the monocot haploid inducer plant, and the genome of the haploid progeny has been modified by the site directed nuclease and optionally at least one guide RNA delivered by the monocot haploid inducer plant.
  53. 53. The method of claim 52, further comprising chromosome doubling of the selected haploid progeny.
  54. 54. The method of claim 53, wherein chromosome doubling is spontaneous or induced by a chromosome doubling agent optionally selected from colchicine, pronamide, dithiopyr, trifluralin, or another anti-microtubule agent.
  55. 55. The method of claim 52, wherein the monocot target plant is selected from maize, wheat, rice, sorghum, barley, oats, triticale, rye, pearl millet, finger millet, proso millet, foxtail millet, banana, bamboo, sugar cane, switchgrass, Miscanthus, asparagus, onion, garlic, chives, or yam.
OA1202200508 2020-06-09 2021-06-09 Heterozygous CENH3 monocots and methods of use thereof for haploid induction and simultaneous genome editing. OA21074A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US63/036,902 2020-06-09
US63/036,910 2020-06-09

Publications (1)

Publication Number Publication Date
OA21074A true OA21074A (en) 2023-10-09
