WO2023222908A1 - Generation of haploid plants based on novel knl2 - Google Patents
Generation of haploid plants based on novel knl2
- Publication number
- WO2023222908A1 (PCT/EP2023/063525)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- plant
- domain
- santa
- protein
- conserved
Classifications
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H1/00—Processes for modifying genotypes ; Plants characterised by associated natural traits
- A01H1/02—Methods or apparatus for hybridisation; Artificial pollination ; Fertility
- A01H1/022—Genic fertility modification, e.g. apomixis
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H1/00—Processes for modifying genotypes ; Plants characterised by associated natural traits
- A01H1/04—Processes of selection involving genotypic or phenotypic markers; Methods of using phenotypic markers for selection
- A01H1/045—Processes of selection involving genotypic or phenotypic markers; Methods of using phenotypic markers for selection using molecular markers
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H1/00—Processes for modifying genotypes ; Plants characterised by associated natural traits
- A01H1/06—Processes for producing mutations, e.g. treatment with chemicals or with radiation
- A01H1/08—Methods for producing changes in chromosome number
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/415—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8201—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
- C12N15/8202—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation by biological means, e.g. cell mediated or natural vector
- C12N15/8205—Agrobacterium mediated transformation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8262—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield involving plant development
- C12N15/8267—Seed dormancy, germination or sprouting
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8241—Phenotypically and genetically modified plants via recombinant DNA technology
- C12N15/8261—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield
- C12N15/8287—Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield for fertility modification, e.g. apomixis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A40/00—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
- Y02A40/10—Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
- Y02A40/146—Genetically Modified [GMO] plants, e.g. transgenic plants
Definitions
- the present invention relates to non-transgenic and transgenic plants, preferably crop plants, comprising at least one mutation of a novel homologue of the KINETOCHORE NULL2 (KNL2) protein, especially a mutation causing a substitution of an amino acid within the KNL2 protein, preferably within the N-terminal region of the KNL2 protein, preferably in the SANTA domain and/or in the conserved N-terminal motif located upstream of the SANTA domain, which preferably have the biological activity of a haploid inducer.
- the present invention provides methods of generating the plants of the present invention and haploid and double haploid plants obtainable by crossing the plants of the present invention with wildtype plants as well as methods of facilitating cytoplasm exchange.
- haploids are one of the most powerful biotechnological means to improve cultivated plants.
- the advantage of haploids for breeders is that homozygosity can be achieved already in the first generation after dihaploidization, creating doubled haploid plants, without the need of several backcrossing generations required to obtain a high degree of homozygosity.
- the value of haploids in plant research and breeding lies in the fact that the founder cells of doubled haploids are products of meiosis, so that resultant populations constitute pools of diverse recombinant and at the same time genetically fixed individuals.
- the generation of doubled haploids thus provides not only perfectly useful genetic variability to select from with regard to crop improvement, but is also a valuable means to produce mapping populations, recombinant inbreeds as well as instantly homozygous mutants and transgenic lines.
- a plant wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
- KNL2 KINETOCHORE NULL2
- a plant wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a conserved N-terminal motif, preferably upstream of a SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
- a plant wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a conserved C-terminal motif, preferably downstream of a SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
- a plant wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and a conserved N-terminal motif, preferably upstream of the SANTA domain, and a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
- the mutation is in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
- the invention also refers to a haploid plant obtainable by crossing in a first step a plant according to any of claims 1 to 6 with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein.
- CENH3 centromere histone H3
- the invention refers also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, and b) identifying the haploid progeny plant generated from the crossing step.
- the invention refers also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved
- the invention refers also to a method of generating a plant according to the invention, comprising the steps of: i) subjecting seeds of a plant to a sufficient amount of the mutagen ethylmethane sulfonate to obtain M1 plants, ii) allowing sufficient production of fertile M2 plants, iii) isolating genomic DNA of M2 plants and iv) selecting individuals possessing at least one amino acid substitution, deletion or addition in KNL2.
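Selection step iv) can be pictured as comparing a sequenced M2 allele against the reference coding sequence. The following is only a minimal sketch under stated assumptions: the function names `translate` and `find_substitutions` are hypothetical, the toy sequences are invented and not taken from any KNL2 gene, and only point mutations of equal-length sequences are handled (deletions and additions would need an alignment step first).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def find_substitutions(reference_cds: str, mutant_cds: str) -> list[str]:
    """List amino acid substitutions such as 'A2T' (1-based positions).

    Assumption: both sequences have equal length (point mutations only).
    """
    if len(reference_cds) != len(mutant_cds):
        raise ValueError("sequences differ in length; align them first")
    ref_protein = translate(reference_cds)
    mut_protein = translate(mutant_cds)
    return [
        f"{r}{pos}{m}"
        for pos, (r, m) in enumerate(zip(ref_protein, mut_protein), start=1)
        if r != m
    ]


# Toy sequences (hypothetical): a single G-to-A change turns GCT (Ala) into ACT (Thr).
print(find_substitutions("ATGGCTGAA", "ATGACTGAA"))  # ['A2T']
```

A silent change (for example GCT to GCC) yields an empty list, so only individuals with an altered protein sequence would be retained by such a filter.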
- the invention refers also to a plant cell or host cell comprising the nucleotide sequence according to the present invention or an according vector as a transgene.
- the invention refers also to a method of generating a plant according to the present invention, comprising the steps of: yy) transforming a plant cell with the nucleotide sequence according to the present invention, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
- the invention refers also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer and/or with a plant comprising a nucleotide sequence encoding a KNL2 protein comprising a CENPC-k motive wherein the nucleotide sequence comprises at least one mutation causing in the CENPC-k domain an amino acid substitution which confers the biological activity of a haploid inducer and/or with any plant which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif,
- not comprising a CENPC-k motive is to be understood as not comprising a wildtype CENPC-k motive, not comprising a mutated CENPC-k motive and not comprising a truncated CENPC-k motive.
- not comprising a CENPC-k motive is to be understood as not comprising a wildtype CENPC-k motive and not comprising a CENPC-k motive as disclosed in WO 2017/067714 A1.
- the mutation of the β and/or δKNL2 protein can be at least one amino acid substitution, a deletion of at least one amino acid and/or the addition, i.e. insertion, of at least one amino acid.
- the expression of the β and/or δKNL2 protein is diminished or even suppressed in the plant.
- the β and/or δKNL2 protein comprises a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, wherein the nucleotide sequence comprises at least one mutation, preferably causing an amino acid deletion, addition, i.e. insertion, or substitution which confers the biological activity of a haploid inducer.
- a CENP-C like motive is a motive which has a significant homology to the conserved CENP-C motive of the protein CENP-C, as described in (Kato et al, Science, 340 (2013), 1110-1113).
- the CENP-C like motive is a CENPC-k motive.
- the CENPC-k motive is in the C-terminal part of the KNL2 protein of plants comprising a CENPC-k motive.
- KNL2 protein sequence is disclosed in Lermontova, I., et al. (2013); Plant Cell, 25, 3389-3404.
- a SANTA domain is disclosed in Lermontova, I., et al. (2013); Plant Cell, 25, 3389-3404 and in Zuo, S., et al (2022); Mol Biol Evol, 39.
- an N-terminal motif and a C-terminal motif are disclosed in Zuo, S., et al (2022); Mol Biol Evol, 39.
- the inventors found also the presence of conserved domains in addition to the SANTA domain in α and β KNL2 proteins.
- a CATD domain is disclosed in Ravi, M et al. (2010); Genetics, 186, 461-471.
- the invention refers especially to a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence.
- the at least one mutation is a deletion, addition or substitution of at least one nucleotide in the nucleotide sequence, in the SANTA domain encoding sequence and/or the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence.
- the plant has biological activity of a haploid inducer.
- the at least one mutation is in the C-terminal part of the β and/or δKNL2 protein.
- the β and/or δKNL2 protein comprises a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENP-C like motive, wherein the nucleotide sequence comprises a point mutation causing in the SANTA domain and/or the conserved N-terminal motif located preferably upstream of the SANTA domain an amino acid substitution which confers the biological activity of a haploid inducer.
- crossing between the plant and a wildtype plant or plant expressing wildtype βKNL2 protein yields at least 0.1 % haploid progeny.
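The 0.1 % figure above is a simple proportion of haploid plants among the progeny scored after a cross. As a minimal sketch (the function names and the example counts are hypothetical and not taken from the application), the rate and the threshold check could be computed like this:

```python
def haploid_induction_rate(haploid_count: int, progeny_count: int) -> float:
    """Return the percentage of haploid plants among scored progeny."""
    if progeny_count <= 0:
        raise ValueError("progeny_count must be positive")
    if not 0 <= haploid_count <= progeny_count:
        raise ValueError("haploid_count must be between 0 and progeny_count")
    return 100 * haploid_count / progeny_count


def meets_threshold(haploid_count: int, progeny_count: int,
                    threshold_percent: float = 0.1) -> bool:
    """Check the rate against a minimum; 0.1 % is the figure stated in the text."""
    return haploid_induction_rate(haploid_count, progeny_count) >= threshold_percent


# Hypothetical counts for illustration only.
print(haploid_induction_rate(3, 1000))  # 0.3
print(meets_threshold(3, 1000))         # True
```

At a rate near the 0.1 % minimum, roughly one haploid is expected per thousand progeny, so the number of seedlings scored has to be large enough for the estimate to be meaningful.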
- the nucleotide sequence comprising the at least one mutation is an endogenous gene or a transgene, especially an artificial transgene.
- the invention relates also to a haploid plant obtainable by crossing a plant according to the invention with a plant expressing wildtype β and/or δKNL2 protein.
- the invention relates also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a mutated protein, which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype β and/or δKNL2 protein and the wildtype form of the other protein.
- the invention relates also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a nucleotide sequence encoding a centromere assembly factor or a spindle assembly checkpoint protein, wherein the nucleotide sequence comprises at least one mutation which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype β and/or δKNL2 protein and preferably wildtype of the centromere assembly factor or the spindle assembly checkpoint protein.
- the invention relates also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype β and/or δKNL2 protein and wildtype CENH3 protein.
- the invention relates also to a double haploid plant obtainable by converting the haploid plant according to the invention into a double haploid plant, preferably via colchicine treatment.
- the invention relates also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype β and/or δKNL2 protein, b) identifying a haploid progeny plant generated from the crossing step, and c) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
- the invention relates also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype β and/or δKNL2 protein but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype β and/or δKNL2 protein and wildtype CENH3 protein, and c) identifying the haploid progeny plant generated from step b).
- a method of generating a double haploid plant comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype β and/or δKNL2 protein but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype β and/or δKNL2 protein and wildtype CENH3 protein, c) identifying a haploid progeny plant generated from step b), and d) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
- the invention relates also to a haploid progeny plant generated in a method according to the invention.
- the invention relates also to a double haploid progeny plant generated in a method according to the invention.
- the invention relates also to a method of facilitating a cytoplasm exchange, comprising the steps of: x) crossing a plant according to claims 1 to 15 as ovule parent with a plant expressing wildtype β and/or δKNL2 protein as pollen parent, and y) obtaining a haploid progeny plant comprising the chromosomes of the pollen parent and the cytoplasm of ovule parent.
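The outcome of steps x) and y) can be summarised as a small data model: the progeny keeps the cytoplasm of the ovule parent but carries only the chromosomes of the pollen parent. This is purely an illustrative sketch; the `Plant` class, the `cytoplasm_exchange` function and the line names are hypothetical and not part of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    name: str
    nuclear_genome: str  # label for the chromosome set
    cytoplasm: str       # label for the cytoplasmic background
    ploidy: int = 2      # 2 = diploid, 1 = haploid


def cytoplasm_exchange(ovule_parent: Plant, pollen_parent: Plant) -> Plant:
    """Return the haploid progeny described in step y).

    The ovule parent (the inducer line) contributes the cytoplasm only;
    the pollen parent contributes the chromosomes.
    """
    return Plant(
        name=f"{pollen_parent.name} in {ovule_parent.name} cytoplasm",
        nuclear_genome=pollen_parent.nuclear_genome,
        cytoplasm=ovule_parent.cytoplasm,
        ploidy=1,
    )


# Hypothetical line names for illustration only.
inducer = Plant("inducer", nuclear_genome="A", cytoplasm="cyto-A")
elite = Plant("elite", nuclear_genome="B", cytoplasm="cyto-B")
progeny = cytoplasm_exchange(ovule_parent=inducer, pollen_parent=elite)
print(progeny.nuclear_genome, progeny.cytoplasm, progeny.ploidy)  # B cyto-A 1
```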
- the invention relates also to a haploid progeny plant generated in this method.
- the invention relates also to a method of generating a plant according to the invention, comprising the steps of: i) subjecting seeds of a plant to a sufficient amount of the mutagen ethylmethane sulfonate to obtain M1 plants, ii) allowing sufficient production of fertile M2 plants, iii) isolating genomic DNA of M2 plants and iv) selecting individuals possessing at least one amino acid mutation in β and/or δKNL2, preferably in the C-terminal part of β and/or δKNL2.
- the invention relates also to a nucleotide sequence encoding β and/or δKNL2 or at least the C-terminal part of β and/or δKNL2 protein comprising at least one mutation. Preferably the mutation causes in the C-terminal part an amino acid substitution.
- the invention relates also to a vector comprising this nucleotide sequence.
- the invention relates also to a plant cell or host cell comprising this nucleotide sequence or this vector as a transgene.
- the invention relates also to a method of generating a plant according to the invention, comprising the steps of: yy) transforming a plant cell with the nucleotide sequence or the vector according to the invention, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
- the Arabidopsis thaliana sequences in this application serve only as references and do not limit the invention to the particular A. thaliana sequences. Due to the high level of conservation, one skilled in the art is able to find the nucleotide sequence and amino acid sequence corresponding to the A. thaliana sequences in any other plant material or plant species. This is shown for example for a number of other plants in the sequence listing and in figure 1b. In plants the length of the amino acid sequence for KNL2 is in the same range, i.e. between 550 and 650 amino acids long.
- the CENP-C like motive, especially the CENPC-k motive, is always at the C-terminal part. Accordingly, a skilled person can easily obtain a mutated β and/or δKNL2 protein in any plant species of interest, e.g. crop plants. Interestingly, the human KNL2 protein has no CENP-C like motive.
- the present invention using mutants of β and/or δKNL2 for the production of haploid and double haploid plants has inter alia the following advantages:
- the β and/or δknl2 mutant can be crossed directly with the wild type.
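The two properties mentioned here, a protein length of roughly 550 to 650 amino acids and the presence or absence of a motive in the C-terminal part, lend themselves to a simple first-pass screen of candidate homologues. The sketch below is an assumption-laden illustration only: the function names are hypothetical, the pattern is a placeholder and NOT a real CENPC-k consensus, and treating the second half of the protein as "the C-terminal part" is an arbitrary choice that the text does not define.

```python
import re


def in_expected_length_range(protein: str, low: int = 550, high: int = 650) -> bool:
    """Check the 550 to 650 amino acid range quoted in the text."""
    return low <= len(protein) <= high


def lacks_c_terminal_motif(protein: str, motif_pattern: str,
                           c_terminal_fraction: float = 0.5) -> bool:
    """Return True if the pattern has no match in the C-terminal part.

    motif_pattern and c_terminal_fraction are assumptions supplied by the
    caller; the text does not define either of them.
    """
    start = int(len(protein) * (1 - c_terminal_fraction))
    return re.search(motif_pattern, protein[start:]) is None


# Placeholder pattern and toy sequences, NOT a real CENPC-k consensus.
PLACEHOLDER_MOTIF = r"RR.S.R"
with_motif = "M" + "A" * 592 + "RRKSLR" + "G"  # 600 residues
without_motif = "M" + "A" * 599                # 600 residues

print(in_expected_length_range(with_motif))                      # True
print(lacks_c_terminal_motif(with_motif, PLACEHOLDER_MOTIF))     # False
print(lacks_c_terminal_motif(without_motif, PLACEHOLDER_MOTIF))  # True
```

In practice a profile-based search or a curated alignment would be more reliable than a regular expression; the sketch only shows where such a check would sit.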
- the inducer lines can be non-GMO.
- the "β and/or δKNL2 approach" can also be applied to a broad number of genotypes.
- the haploid induction efficiency can be up to around 10% or even more.
- β and/or δKNL2 has a SANTA domain at the N-terminus and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but has no CENP-C like motive.
- the mutation of the β and/or δKNL2 protein can be achieved by transgenic as well as non-transgenic methods.
- Non-transgenic methods are preferred because of enormous costs for deregulation of genetically modified organisms (GMO) as well as increasing public rejection of genetically modified organisms (GMO) or plants generated by means of GMO, in particular crops for human consumption, and extensive market authorisation processes including rigorous safety assessments of such GMOs.
- plant refers to any plant, but particularly seed plants.
- plant according to the present invention includes whole plants or parts of such a whole plant.
- the plant of the present invention comprises at least one cell comprising a nucleotide sequence encoding a β and/or δKNL2 protein, wherein the nucleotide sequence comprises at least one mutation, preferably causing in the β and/or δKNL2 protein an amino acid substitution, deletion or addition which can confer the biological activity of a haploid inducer to the plant, preferably as specified herein in more detail.
- the nucleotide sequence comprises at least one mutation, preferably causing in the β and/or δKNL2 protein an amino acid substitution, deletion or addition which can confer the biological activity of a haploid inducer to the plant, preferably as specified herein in more detail.
- Most preferably, most or in particular all cells of the plant of the present invention comprise the mutation as described herein.
- plant cell describes the structural and physiological unit of the plant, and comprises a protoplast and a cell wall.
- the plant cell may be in the form of an isolated single cell, such as a stomatal guard cell or a cultured cell, or as a part of a higher organized unit such as, for example, a plant tissue, or a plant organ.
- ornamental plants such as ornamental flowers and ornamental crops, for instance Begonia, Carnation, Chrysanthemum, Dahlia, Gardenia, Asparagus, Geranium, Daisy, Gladiolus, Petunia, Gypsophila, Lilium, Hyacinth, Orchid, Rose, Tulip, Aphelandra, Aspidistra, Aralia, Clivia, Coleus, Cordyline, Cyclamen, Dracaena, Dieffenbachia, Ficus, Philodendron, Poinsettia, Fern, Ivy, Hydrangea, Limonium, Monstera, Palm, Date-palm, Pothos, Singonio, Violet, Daffodil, Lavender, Lily, Narcissus, Crocus, Iris, Peonies, Zephyranthes, Anthurium, Gloxinia, Azalea, Ageratum, Bamboo, Camellia, Dianthus, Impatiens, Lobelia, Pelargonium
- the plant according to the present invention is selected from the group consisting of barley (Hordeum vulgare), sorghum (Sorghum bicolor), rye (Secale cereale), Triticale, sugar cane (Saccharum officinarum), maize (Zea mays), foxtail millet (Setaria italica), rice (Oryza sativa), Oryza minuta, Oryza australiensis, Oryza alta, wheat (Triticum aestivum), Triticum durum, Hordeum bulbosum, purple false brome (Brachypodium distachyon), sea barley (Hordeum marinum), goat grass (Aegilops tauschii), apple (Malus domestica), Beta vulgaris, sunflower (Helianthus annuus), Australian carrot (Daucus glochidiatus), American wild carrot (Daucus pusillus), Daucus muricatus, carrot (Daucus carota), eucaly
- the plant is selected from the group consisting of barley (Hordeum vulgare), sorghum (Sorghum bicolor), rye (Secale cereale), Triticale, sugar cane (Saccharum officinarum), maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), Triticum durum, Avena sativa, Hordeum bulbosum, Beta vulgaris, sunflower (Helianthus annuus), carrot (Daucus carota), tobacco (Nicotiana tabacum), tomato (Solanum lycopersicum), potato (Solanum tuberosum), coffee (Coffea canephora), grape vine (Vitis vinifera), cucumber (Cucumis sativus), thale cress (Arabidopsis thaliana), rape (Brassica napus), broccoli (Brassica oleracea), Brassica rapa, Brassica juncea, black mustard (Bras.
- the plant is selected from the group consisting of Amborella, Solanum, Camelina, Brassica, Arabidopsis, Alyrata, Capsella, Vigna, Phaseolus, Medicago, Cicer, Glycine, Arachis, Daucus, Fragaria, Ziziphus, Coffea, Malus, Pyrus, Populus, Vitis, Citrus, Ricinus, Nicotiana, Theobroma, Gossypium, Prunus, Cucumis, Brachypodium, Oryza, Setaria, Sorghum, Musa, Elaeis and Phoenix.
- the plant is Arabidopsis thaliana.
- the term ‘at least one mutation’ refers to preferably one mutation, in particular solely one mutation. In a further preferred embodiment, the term ‘at least one mutation’ refers to two mutations, in particular solely two mutations. In a further preferred embodiment, the term ‘at least one mutation’ refers to three mutations, in particular solely three mutations. In a further preferred embodiment, the term ‘at least one mutation’ refers to four mutations, in particular solely four mutations. In a further preferred embodiment, the term ‘at least one mutation’ refers to five mutations, in particular solely five mutations.
- the at least one mutation is at least one, at least two, at least three, at least four or at least five mutations.
- the maximum number of mutations is two, three, four, five, six, seven, eight, nine or, most preferably, ten.
- one amino acid substitution, in particular solely one amino acid substitution, is present.
- in the CENP-C like motif of the β and/or δKNL2 protein, two amino acid substitutions, in particular solely two amino acid substitutions, are present.
- in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, three amino acid substitutions, in particular solely three amino acid substitutions, are present.
- in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the βKNL2 protein, four amino acid substitutions, in particular solely four amino acid substitutions, are present.
- in the KNL2 protein, preferably in the N-terminal region of the βKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, five amino acid substitutions, in particular solely five amino acid substitutions, are present.
- in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, 1, 1 or 2, 1 to 3, 1 to 4, 1 to 5, preferably 1 to 6, and more preferably 1 to 7 amino acid substitutions are present.
- the present invention is concerned with mutations that cause or lead to an amino acid deletion, substitution or addition within the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein.
- a mutation preferably is a non-synonymous point mutation or substitution in the DNA sequence encoding the β and/or δKNL2 protein resulting in a change in amino acid. This is also called a missense mutation.
- the change in amino acid or the amino acid substitution may be conservative, i.e. a change to an amino acid with similar physicochemical properties, semi-conservative, e.g. negative to positively charged amino acid, or radical, i.e. a change to a vastly different amino acid.
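As a rough illustration of the three categories named above, the following Python sketch sorts a single amino acid substitution into conservative, semi-conservative or radical. The five residue classes and the rule that only an opposite-charge swap counts as semi-conservative are assumptions made for this example; the text gives one example per category but does not define a full classification scheme, and `classify_substitution` is a hypothetical helper name.

```python
# Illustrative physicochemical classes (an assumption, not taken from the patent text).
AA_CLASS = {
    **dict.fromkeys("GAVLIMP", "nonpolar"),
    **dict.fromkeys("FWY", "aromatic"),
    **dict.fromkeys("STCNQ", "polar"),
    **dict.fromkeys("KRH", "positive"),
    **dict.fromkeys("DE", "negative"),
}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a one-letter amino acid change under the assumed class table."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "synonymous"  # no change at the protein level
    class_ref, class_alt = AA_CLASS[ref], AA_CLASS[alt]
    if class_ref == class_alt:
        return "conservative"  # similar physicochemical properties
    if {class_ref, class_alt} == {"positive", "negative"}:
        return "semi-conservative"  # charge reversal, as in the text's example
    return "radical"  # change to a very different residue


print(classify_substitution("L", "I"))
print(classify_substitution("D", "K"))
print(classify_substitution("G", "W"))
```

A finer-grained substitution matrix could replace the class table without changing the calling code.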
- the present plant having biological activity of a haploid inducer is homozygous with respect to the at least one mutation. In a further embodiment of the present invention, the present plant having biological activity of a haploid inducer is heterozygous with respect to the at least one mutation.
- the plant according to the present invention has the biological activity of a haploid inducer. This means that crossing between the plant according to the present invention and a wildtype plant or a plant expressing wildtype β and/or δKNL2 protein yields at least 0.1 %, 0.2 %, 0.3 %, 0.4 %, 0.5 %, 0.6 %, 0.7 %, 0.8 %, 0.9 %, preferably at least 1 %, preferably at least 2 %, preferably at least 3 %, preferably at least 4 %, preferably at least 5 %, preferably at least 6 %, preferably at least 7 %, preferably at least 8 %, preferably at least 9 %, most preferred at least 10 %, at least 15 %, at least 20 % or more haploid progeny.
- the present invention most advantageously provides means and methods to generate haploid inducer lines in a wide range of eudicot, dicot and monocot species.
- the present invention also allows the exchange of maternal cytoplasm and the creation of, for instance, cytoplasmic male sterile plants with a desired genotype in a single process step.
- the present invention is advantageous insofar as a single amino acid mutation can be generated by mutagenesis or any other non-GMO-based approach.
- an "endogenous" gene, allele or protein refers to a non-recombinant sequence of a plant as the sequence occurs in the respective plant, in particular wildtype plant.
- the term “mutated” refers to a human-altered sequence.
- human-induced non-transgenic mutations include exposure of a plant to a high dose of chemical, radiological, or other mutagen for the purposes of selecting mutants.
- human-induced transgenic mutations, i.e. recombinant alterations or genomic engineering, for example by means of TALE nucleases, zinc-finger nucleases or a CRISPR/Cas system, include fusions, insertions, deletions, and/or changes to the DNA or amino acid sequence.
- a ‘promoter’ is a DNA sequence initiating transcription of an associated DNA sequence, in particular being located upstream (5’) from the start of transcription and being involved in recognition and binding of the RNA polymerase. Depending on the specific promoter region it may also include elements that act as regulators of gene expression such as activators, enhancers, and/or repressors.
- a ‘3' regulatory element’ refers to that portion of a gene comprising a DNA segment, excluding the 5' sequence which drives the initiation of transcription and the structural portion of the gene, that determines the correct termination site and contains a polyadenylation signal and any other regulatory signals capable of effecting messenger RNA (mRNA) processing or gene expression.
- the polyadenylation signal is usually characterised by effecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. Polyadenylation signals are often recognised by the presence of homology to the canonical form 5'-AATAAA-3'.
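The canonical form mentioned above lends itself to a simple sequence scan. The sketch below reports every position at which the hexamer 5'-AATAAA-3' occurs in a DNA string; it checks exact matches only, whereas the text speaks of homology to the canonical form, so near-matches are deliberately out of scope. The function name and the example sequence are hypothetical.

```python
def find_polya_signals(seq: str, motif: str = "AATAAA") -> list[int]:
    """Return 0-based start positions of exact motif matches, overlaps included."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        # Restart one base later so overlapping occurrences are not skipped.
        start = seq.find(motif, start + 1)
    return hits


# Made-up 3' region used purely for illustration.
print(find_polya_signals("ggcAATAAAcctaataaag"))
```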
- coding sequence refers to that portion of a gene encoding a protein, polypeptide, or a portion thereof, and excluding the regulatory sequences which drive the initiation or termination of transcription.
- the gene, coding sequence or the regulatory element may be one normally found in the cell, in which case it is called ‘autologous’ or ‘endogenous’, or it may be one not normally found in a cellular location, in which case it is termed ‘heterologous’, ‘transgenic’ or ‘transgene’.
- a ‘heterologous’ gene, coding sequence or regulatory element may also be autologous to the cell but is, however, arranged in an order and/or orientation or in a genomic position or environment not normally found or occurring in the cell in which it is transferred.
- expression refers to the transcription and/or translation of an endogenous gene or a transgene in plants.
- Transformation refers to methods to transfer nucleic acid molecules, in particular DNA, into cells including, but not limited to, biolistic approaches such as particle bombardment; microinjection; permeabilising the cell membrane with various physical treatments, for instance electroporation, or chemical treatments, for instance polyethylene glycol (PEG); the fusion of protoplasts; or Agrobacterium tumefaciens or rhizogenes mediated transformation.
- Plasmids such as pUC derivatives can be used. If whole plants are to be regenerated from such transformed cells, the use of a selectable marker is preferred.
- ‘biological activity of a haploid inducer’ or ‘haploid inducer’ or ‘haploid inducer line’ refers to a plant or plant line having the capability to produce haploid progeny or offspring in at least 0.1 %, at least 0.2 %, 0.3 %, 0.4 %, 0.5 %, 0.6 %, 0.7 %, 0.8 %, 0.9 %, preferably at least 1 %, preferably at least 2 %, preferably at least 3 %, preferably at least 4 %, preferably at least 5 %, preferably at least 6 %, preferably at least 7 %, preferably at least 8 %, preferably at least 9 %, most preferred at least 10 %, most preferred at least 15 %, most preferred at least 20 % of cases when crossed to a wildtype plant or a plant at least expressing wildtype βKNL2 protein.
- the resulting haploid progeny only comprises the chromosomes of the wildtype parent.
- the haploid inducer was the ovule parent of the cross, the haploid progeny possesses the cytoplasm of the inducer and the chromosomes of the wildtype parent.
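The percentage thresholds in the definition above amount to a simple ratio of haploid progeny to all progeny scored from a cross. The following sketch computes that ratio and compares it with a chosen cut-off; the default of 0.1 % is the lowest value named in the text, while the function names and the example counts are hypothetical.

```python
def haploid_induction_rate(n_haploid: int, n_progeny: int) -> float:
    """Percentage of haploid plants among all progeny scored from a cross."""
    if n_progeny <= 0:
        raise ValueError("n_progeny must be positive")
    if not 0 <= n_haploid <= n_progeny:
        raise ValueError("n_haploid must lie between 0 and n_progeny")
    return 100.0 * n_haploid / n_progeny


def is_haploid_inducer(n_haploid: int, n_progeny: int, min_percent: float = 0.1) -> bool:
    """True if the observed rate reaches the chosen minimum percentage."""
    return haploid_induction_rate(n_haploid, n_progeny) >= min_percent


# Made-up counts: 3 haploids among 1000 progeny.
print(haploid_induction_rate(3, 1000), is_haploid_inducer(3, 1000))
```

With small progeny numbers the observed rate is a noisy estimate, so a confidence interval would be a sensible addition in practice.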
- the invention relates in a preferred embodiment to a plant according to the present teaching, wherein the at least one amino acid substitution is introduced into the nucleotide sequence encoding β and/or δKNL2 non-transgenically or transgenically.
- the present invention relates to a plant, wherein the non-transgenic introduction of the at least one mutation causing in β and/or δKNL2, especially in the N-terminal region of β and/or δKNL2, an amino acid substitution, deletion or addition which confers the biological activity of a haploid inducer is effected via chemical mutagenesis, in particular via a CRISPR/Cas method, especially the CRISPR/Cas9 technology.
- TILLING as well as a CRISPR/Cas method has the advantage that not only the haploid plant but also the inducer plants are non-GMO.
- the at least one mutation is introduced into the plant in form of a transgene.
- this is done by transforming a vector comprising a nucleotide sequence encoding at least the N-terminal region of β and/or δKNL2, preferably the complete β and/or δKNL2, comprising at least one amino acid substitution, preferably such as described herein.
- Methods for transformation of a plant and introducing a transgene into the genome of a plant are well-known in the prior art.
- the Agrobacterium mediated transformation, floral dip method or particle bombardment are used for transformation.
- the nucleotide sequence encoding the mutated β and/or δKNL2 protein according to the present invention is transformed into the plant in form of a transgene and one or two alleles of the endogenous β and/or δKNL2 gene are preferably inactivated or knocked out.
- the nucleotide sequence encoding the mutated β and/or δKNL2 protein according to the present invention is transformed into the plant in form of a transgene and the transgene is overexpressed in order to be more competitive than the endogenous β and/or δKNL2 protein.
- the method of producing the plant having biological activity of a haploid inducer according to the present invention is not an essentially biological method.
- the present invention relates to a haploid plant, obtainable, in particular obtained, by: a) a cross of a plant having the biological activity of a haploid inducer according to the present invention with a plant expressing wildtype β and/or δKNL2 protein and optionally b) identifying haploid progeny generated from the crossing step.
- the present invention provides also a method of generating a haploid plant, comprising the steps of: a) crossing a plant having the biological activity of a haploid inducer according to the present invention to a plant expressing wildtype β and/or δKNL2 protein and b) identifying haploid progeny generated from the crossing step.
- the selected haploid plant is preferably converted into a double haploid plant, preferably via colchicine treatment.
- the invention relates also to a method of generating a double haploid plant.
- the plant having the biological activity of a haploid inducer according to the present invention and/or the plant expressing wildtype β and/or δKNL2 protein are grown in a method according to the present invention before step a) under a stress condition, especially under a slight stress condition.
- a suitable stress condition can be an altered temperature or an altered light regime.
- the plant is grown at a temperature above or below 21 °C, for example at a temperature of at least 23 °C and at most 29 °C, preferably of around 26 °C, or at a temperature of at least 15 °C and at most 20 °C, preferably of around 18 °C.
- a plant with a mutated β and/or δKNL2 protein is crossed with a plant with a mutated CENH3 protein and haploid progeny generated from the crossing step are identified.
- the identified haploid plants can then be crossed with a wild type plant having neither a mutated β and/or δKNL2 protein nor a mutated CENH3 protein.
- the efficiency of haploid induction can increase after combination of β and/or δknl2 and cenh3 mutations.
- the combination of several haploid-causing mutations can help to increase the efficiency of haploid generation. Therefore, in an alternative embodiment transformation of β and/or δknl2 mutant with altered CENH3 variants, e.
- said crossing step does not provide - such as a crossing usually does - heterozygous progeny but in fact homozygous progeny.
- the haploidy of progeny is not the result of the mixing of genes of the plants used for sexual crossing.
- the presently claimed process of generating a double haploid plant cannot be found in nature.
- the plant according to the present invention can also be used in a method to restore male fertility by providing a normal cytoplasm to a crossing partner that is CMS.
- the chromosomes of the CMS plant are introduced into the normal cytoplasm of the haploid inducer of the present invention which is not CMS.
- pollen production of the CMS plant has to be induced via temperature, light, length of day etc.
- a plant cell comprising said nucleotide sequence or a vector comprising it as a transgene is provided by the present invention.
- βKNL2 contains SANTA & unique conserved motifs in N & C termini, but not CENPC-k.
- αKNL2 & βKNL2 localize with CenH3 at chromocenters during different phases.
- βKNL2 localizes to centromeres without CENPC-k motif and without N terminus.
- βKNL2 can localize to centromeres without SANTA, but not in meristematic tissues.
- βKNL2 requires C terminus part for nuclear localization.
- There are conserved motifs of βKNL2 in all plants. A GVKTRxM motif is preferred for proper localization of βKNL2.
- βKNL2 localizes to centromeres without CENPC-k motif. βKNL2 requires SANTA domain and C terminus part for proper localization. βKNL2 depends on αKNL2 for proper localization to centromeres in non-meristematic tissues. βKNL2 interacts with centromeric proteins like CENPC, CenH3, αKNL2(C), Mis12, NASP etc. βKNL2, e.g. in Arabidopsis, may localize to the centromeres by interacting with different centromeric proteins to initiate CenH3 loading and kinetochore establishment in dividing cells. βKNL2 may be regulated through SUMOylation.
- A Protein structure of KNL2 in Arabidopsis. The SANTA domain and the CENPC-k motif are indicated by differently colored boxes.
- B Number of KNL2 homologs in 90 representative plant species. Phylogenetic tree adapted from the NCBI common tree.
- C Phylogenetic relationships of the analyzed species were adapted from (Banks et al., 2011).
- D Number of KNL2 homologs identified in analyzed crucifer (Brassicaceae) genomes.
- KNL2 proteins in Brassicales can be classified into two major groups (αKNL2 and βKNL2). Bootstrap values obtained after 1000 ultrafast bootstrap replicates (bb) are shown in the tree. Scale bar indicates number of substitutions per site.
- FIG. 19 Identification and primary analysis of βknl2 mutant. βknl2 mutant developmental phenotype.
- A Schematic representation of the T-DNA insertion position in the genomic fragment and protein with the position of the SANTA domain.
- B Representative siliques with red arrowheads showing abnormal whitish glossy seed phenotype from heterozygous βknl2-1 and βknl2-2 plants.
- C-D Boxplots depicting the number of abnormal seeds per silique data from reciprocal crossing of WT and heterozygous βknl2-1 and βknl2-2.
- Figure 22 A model for the diversification and evolution of the KNL2 gene. Cladogram was constructed using IQtree software based on the multiple sequence alignment of KNL2 proteins. Two evolutionary lineages are indicated by Tetrapoda and Viridiplantae. The yellow stars indicate the ancient plant-specific gene duplication of KNL2, while the red star represents two isoforms of M18BP1 in allotetraploid Xenopus laevis.
- Figure 23 Boxplot analysis of abnormal seed phenotype from heterozygous βknl2 mutants.
- Figure 24 Representative siliques with arrowheads showing early ovule abortion phenotype in heterozygous βknl2-1 plants due to other T-DNA insertions.
- Figure 25 Flow cytometric histograms of nuclei from white seeds of βknl2-2.
- Figure 26 Flow cytometric histograms of nuclei isolated from single leaf of βknl2 mutant and WT.
- Figure 30 Reciprocal crossing of mutants with WT to confirm zygosity of mutation causing phenotype.
- Figure 31 Single silique genotyping of heterozygous βknl2 mutants.
- the sequence listing comprises the sequences SEQ ID No. 1 to SEQ ID No. 88.
- SEQ ID No. 1 to SEQ ID No. 9 show the original and updated annotation of βKNL2 in Arabidopsis.
- One gene, represented by the original first exon of 842 bp plus additional 4 bp, encodes a protein of 281 amino acids, which contains the SANTA domain and was designated as βKNL2 (At1g58210, Figure 2, see sequences below).
- the second gene is represented by a new first exon of 94 bp containing slightly more than half of the original second exon of 164 bp with a new transcript start site, one intron, and a new second exon consisting of the original third exon of 2736 bp plus 14 bp of upstream region (see sequences below).
- SEQ ID No. 10 to SEQ ID No. 88 show conceptual protein sequences of different plants which show conceptual homology to the Arabidopsis αKNL2 (At5g02520) amino acid sequence. Reference is made to Figures 1B and 1D.
- nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 1 to 88.
- nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 1 to 88.
- nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 10 to 88.
- nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in SEQ ID NO: 3.
- nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in SEQ ID NO: 5.
- nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in SEQ ID NO: 6.
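The sequence identity thresholds listed above can be checked on a pairwise alignment with a few lines of Python. The text does not state how identity is to be computed, so this sketch assumes one common convention: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function names and the toy alignment are hypothetical and do not come from the sequence listing.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue (assumed convention).
    pairs = [
        (a, b)
        for a, b in zip(aln_a.upper(), aln_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def meets_threshold(aln_a: str, aln_b: str, min_identity: float = 90.0) -> bool:
    """True if the pair reaches the chosen minimum identity, e.g. 75 or 90 percent."""
    return percent_identity(aln_a, aln_b) >= min_identity


# Toy alignment: 4 identical residues over 5 gap-free columns.
print(percent_identity("MKV-LA", "MKIALA"))
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence, give different values for the same pair, so the chosen definition should be stated alongside any reported figure.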
- Keywords: adaptive evolution, CENH3, centromere, gene duplication, kinetochore, KNL2, plant
- the truncated putative CENPC-k motif does not include the Trp which, similar to the 7th Arg, is needed for the targeting of αKNL2 to centromeres (Sandmann, et al. 2017). Moreover, it remains to be elucidated whether KNL2 variants with the truncated CENPC-k motif can target CENH3 nucleosomes directly, without an additional interacting partner. Among all grass species with sequenced genomes, maize represents an exception, since it has only one KNL2 gene, which belongs to the KNL2 clade with the truncated CENPC-k, and has no γKNL2 protein variant with the complete CENPC-k motif.
- δKNL2 retains the hydrophobic residues in the SANTA domain that are important for CENP-C binding in Xenopus.
- two CENP-C proteins were identified (Talbert, et al. 2004).
- the mechanism of localization and function of KNL2 in maize is similar to that of mammals, which represent an exception among vertebrates in lacking CENPC-k, and relies on CENP-C binding similar to Xenopus.
- In contrast to αKNL2, βKNL2 not only lacks the CENPC-k domain but also the part necessary for interaction with DNA. Thus, its association with Mis18 proteins, which have the ability to bind to DNA, is plausible. We also cannot exclude that centromere targeting of βKNL2 depends on αKNL2.
- We showed previously that manipulation of αKNL2 can be used for the production of haploids in Arabidopsis (Lermontova 2017). Here we demonstrate that KNL2 genes exist in two variants in eudicots (α, βKNL2) and monocots (γ, δKNL2).
- CENP-C and KNL2 protein sequences were aligned using MAFFT software (Yamada, et al. 2016) and alignments were further slightly manually refined, including removal of gaps and poorly aligned regions. Evolutionary relationships among CENP-C and KNL2 gene family members were determined using IQ-TREE software (Nguyen, et al. 2015), which implements maximum likelihood methods, based on 1000 bootstrap alignments and single-branch tests. The phylogenetic trees were visualized and modified using the FigTree v1.4.4 software (http://tree.bio.ed.ac.uk/software/figtree/). Sequence logos were generated using WebLogo3 (http://weblogo.berkeley.edu/) (Crooks, et al. 2004).
- The entire open reading frame of βKNL2 (At1g58210) was amplified by RT-PCR with RNA isolated from flower buds of Arabidopsis wild-type and cloned into the pDONR221 vector (Invitrogen) via the Gateway BP reaction. From pDONR221 clones, the open reading frame was recombined via Gateway LR reaction (Invitrogen) into the two attR recombination sites of the Gateway-compatible vectors pGWB641 and pGWB642 (http://shimane-u.org/nakagawa/gbv.htm), respectively, to study the localization of βKNL2 protein in vivo.
- Nicotiana benthamiana leaves were infiltrated with Agrobacterium tumefaciens transformed with βKNL2-EYFP fusion constructs according to (Walter, et al. 2004).
- T-DNA insertion lines were obtained from the European Arabidopsis stock center (http://arabidopsis.info/). To confirm the presence of, and to identify heterozygous versus homozygous T-DNA insertions, we performed PCR with pairs of gene-specific primers flanking the putative positions of T-DNA (Supplementary Table S4) and with a pair of gene-specific and T-DNA end-specific primers (LBb3.1, Supplementary Table S4). DNA isolation was performed as described in Edwards et al. 1991. For the germination and segregation experiments, seeds from individual siliques were germinated in vitro on MS medium as described above.
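The two PCR reactions described above are usually read together to call the zygosity of a T-DNA line. The sketch below encodes that conventional reading: a product from the flanking gene-specific pair indicates an intact wild-type allele, and a product from the gene-specific plus T-DNA border pair indicates an insertion allele. This interpretation is an assumption, since the text does not spell out the band scoring, and `call_genotype` is a hypothetical helper.

```python
def call_genotype(wt_band: bool, tdna_band: bool) -> str:
    """Call T-DNA zygosity from the presence of two PCR products.

    wt_band:   product from the two gene-specific primers flanking the insertion site
    tdna_band: product from one gene-specific primer plus the T-DNA border primer
    """
    if wt_band and tdna_band:
        return "heterozygous"  # one wild-type and one insertion allele
    if tdna_band:
        return "homozygous insertion"  # no intact wild-type allele amplified
    if wt_band:
        return "wild type"  # no insertion detected
    return "no call"  # both reactions failed; repeat the PCR


# Hypothetical gel readings for three plants.
for plant, bands in {"p1": (True, True), "p2": (False, True), "p3": (True, False)}.items():
    print(plant, call_genotype(*bands))
```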
- the Arabidopsis genome assembly and gene annotation were downloaded from Araport11 (https://bar.utoronto.ca/thalemine/dataCategories.do) with integrative reannotation (Cheng, et al. 2017).
- the CENP-C and KNL2 gene models were manually re-examined.
- the Arabidopsis RNA-seq data were downloaded from previous studies (Klepikova, et al. 2016).
- RNA-seq data were selected from 10 tissue types in Arabidopsis, including germinating seeds, stigmatic tissue, ovules from 6th and 7th flowers, young seeds, internode, axis of the inflorescence, flower, anthers of the young flower, opened anthers, and root (NCBI SRA: SRR3581356, SRR3581684, SRR3581691, SRR3581693, SRR3581704, SRR3581705, SRR3581719, SRR3581727, SRR3581728, SRR3581732). Transcriptome analysis utilized a standard TopHat-Cufflinks pipeline with minor modification (Trapnell, et al. 2012).
- CENP-C and KNL2 normalized XoMONl in different tissues from microarray experiments were obtained from the Arabidopsis eFP Browser website (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi).
- the corresponding gene IDs are: CENP-C (At1g15660), αKNL2 (At5g02520), βKNL2 (At1g58210), and CENH3 (At1g01370).
- PAML 4.8 software (Yang, 2007) was used to test for positive selection on KNL2 homologs from Brassicaceae species.
- the KNL2 gene alignments and gene trees were used as input into the CodeML NSsites models of PAML. Alignments were manually refined as described in phylogenetic analyses.
- To determine whether KNL2 homologs evolve under positive selection, random-site models were selected. Random-site models allow ω to vary among sites but not across lineages. We compared two models that do not allow ω to exceed 1 (M1 and M7) with two models that allow ω > 1 (M2 and M8). Positively selected sites were classified as those sites with a Bayes Empirical Bayes posterior probability > 95%.
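Nested site-model pairs such as M1 versus M2 and M7 versus M8 are commonly compared with a likelihood ratio test, using twice the log-likelihood difference and a chi-square distribution with two degrees of freedom. The text does not state which test was applied, so the sketch below is an illustration under that assumption, not the authors' procedure. The log-likelihoods and posterior probabilities are made-up numbers; in practice they would be read from the CodeML output.

```python
import math


def lrt_pvalue(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test with 2 degrees of freedom.

    For df = 2 the chi-square survival function has the closed form exp(-x / 2).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.exp(-stat / 2.0)


def positively_selected_sites(beb: dict[int, float], cutoff: float = 0.95) -> list[int]:
    """Sites whose Bayes Empirical Bayes posterior probability exceeds the cutoff."""
    return sorted(site for site, prob in beb.items() if prob > cutoff)


# Hypothetical values for one gene alignment.
p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-995.0)
sites = positively_selected_sites({12: 0.97, 40: 0.95, 7: 0.99})
print(round(p, 5), sites)
```

The cutoff of 0.95 mirrors the 95 % posterior probability stated in the text; the strict inequality means a site at exactly 0.95 is not reported.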
- Example 4 Shown is an alignment of protein sequences of typical deltaKNL2 proteins, having a SANTA domain and conserved hydrophobic motifs
- centromeres: types, causes and consequences of structural abnormalities implicating centromeric DNA. Nat Commun 9, 4340.
- Araport11: a complete reannotation of the Arabidopsis thaliana reference genome. Plant J 89, 789-804.
- Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant Journal 16, 735-743.
- CENP-C facilitates the recruitment of M18BP1 to centromeric chromatin. Nucleus-Austin 3, 101-110.
- A Maize Homolog of Mammalian CENPC Is a Constitutive Component of the Inner Kinetochore.
- Xenopus laevis M18BP1 Directly Binds Existing CENP-A Nucleosomes to Promote Centromeric Chromatin Assembly. Dev Cell 42, 190-199.e10.
- CENP-C is a blueprint for constitutive centromere-associated network assembly within human kinetochores. J Cell Biol 210, 11-22.
- CENP-C is involved in chromosome segregation, mitotic checkpoint function, and kinetochore assembly. Mol Biol Cell 18, 2155-2168.
- Arabidopsis KINETOCHORE NULL2 Is an Upstream Component for Centromeric Histone H3 Variant cenH3 Deposition at Centromeres. Plant Cell 25, 3389-3404.
- CENP-C recruits M18BP1 to centromeres to promote CENP-A chromatin assembly. J Cell Biol 194, 855-871.
- IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Molecular Biology and Evolution 32, 268-274.
- SANTA domain: a novel conserved protein module in Eukaryota with potential involvement in chromatin regulation. Bioinformatics 22, 2459-2462.
Abstract
The present invention relates to non-transgenic and transgenic plants, preferably crop plants, comprising at least one mutation of a novel homologue of the KINETOCHORE NULL2 (KNL2) protein, especially a mutation causing a substitution of an amino acid within the KNL2 protein, preferably within the N-terminal region of the KNL2 protein, preferably in the SANTA domain and/or in the conserved N-terminal motif located upstream of the SANTA domain, which preferably have the biological activity of a haploid inducer. Further, the present invention provides methods of generating the plants of the present invention and haploid and double haploid plants obtainable by crossing the plants of the present invention with wildtype plants as well as methods of facilitating cytoplasm exchange.
Description
Generation of haploid plants based on novel KNL2
The present invention relates to non-transgenic and transgenic plants, preferably crop plants, comprising at least one mutation of a novel homologue of the KINETOCHORE NULL2 (KNL2) protein, especially a mutation causing a substitution of an amino acid within the KNL2 protein, preferably within the N-terminal region of the KNL2 protein, preferably in the SANTA domain and/or in the conserved N-terminal motif located upstream of the SANTA domain, which preferably have the biological activity of a haploid inducer. Further, the present invention provides methods of generating the plants of the present invention and haploid and double haploid plants obtainable by crossing the plants of the present invention with wildtype plants as well as methods of facilitating cytoplasm exchange.
The generation and use of haploids is one of the most powerful biotechnological means to improve cultivated plants. The advantage of haploids for breeders is that homozygosity can be achieved already in the first generation after dihaploidization, creating doubled haploid plants, without the need of several backcrossing generations required to obtain a high degree of homozygosity. Further, the value of haploids in plant research and breeding lies in the fact that the founder cells of doubled haploids are products of meiosis, so that resultant populations constitute pools of diverse recombinant and at the same time genetically fixed individuals. The generation of doubled haploids thus provides not only perfectly useful genetic variability to select from with regard to crop improvement, but is also a valuable means to produce mapping populations, recombinant inbreeds as well as instantly homozygous mutants and transgenic lines.
Haploids can be obtained by in vitro or in vivo approaches. However, many species and genotypes are recalcitrant to these processes. Alternatively, substantial changes of the centromere-specific histone H3 variant (CENH3, also called CENP-A), by swapping its N-terminal regions and fusing it to GFP (“GFP-tailswap” CENH3), create haploid inducer lines in the model plant Arabidopsis thaliana (Ravi and Chan, Nature, 464 (2010), 615-618 and US 2011/0083202 A1). Haploid induction methods based on the CENH3-mediated approach require the generation of a cenh3 mutant with its subsequent complementation by altered CENH3 (“GFP-tailswap” CENH3) variants. CENH3 proteins are variants of H3 histone proteins that are members of the kinetochore complex of active centromeres. With these “GFP-tailswap” haploid inducer lines, haploidization occurred in the progeny when a haploid inducer plant was crossed with a wildtype plant. Interestingly, the haploid inducer line was stable upon selfing, suggesting that a competition between modified and wild type centromere in the developing hybrid embryo results in centromere inactivation of the inducer parent and consequently in uniparental chromosome elimination. As a result, the chromosomes containing the altered CENH3 protein are lost during early embryo development, producing haploid progeny containing only the chromosomes of the wildtype parent.
Thus, haploid plants can be obtained by crossing “GFP-tailswap” transgenic plants as haploid inducers to wildtype plants. However, as described above, this technique requires the generation of a cenh3 mutant and the substitution of endogenous CENH3 by a substantially changed CENH3 protein, and the plants comprise a heterologous transgene, which is economically problematic because of increasing public reluctance toward genetically engineered crops.
In addition, using CENH3 has the disadvantage that the cenh3 mutant is viable only in the heterozygous state. Furthermore, CENH3 is present in a relatively high number of isoforms, for example six isoforms in wheat and two isoforms in barley.
It is therefore an object of the present invention to overcome the aforementioned problems and in particular to provide alternative haploid inducer plants which do not necessarily comprise modifications of their CENH3 protein and/or which are not genetically engineered.
This problem is solved by the subject matter of the independent claims, in particular by a plant having preferably biological activity of a haploid inducer and comprising a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation. Preferably the mutation causes an amino acid addition, deletion or substitution which confers the biological activity of a haploid inducer. The KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENPC-k motive will be named βKNL2 in the following. In a preferred embodiment δKNL2 also refers to a KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENPC-k motive. In a preferred embodiment δKNL2 does not refer to a KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENPC-k motive.
This problem is solved by a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
This problem is solved by a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
This problem is solved by a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a conserved N-terminal motif, preferably upstream of a SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
This problem is solved by a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a conserved C-terminal motif, preferably downstream of a SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
This problem is solved by a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain or a conserved N-terminal motif, preferably upstream of the SANTA domain, or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
This problem is solved by a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and a conserved N-terminal motif, preferably upstream of the SANTA domain, and a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
In a preferred embodiment the mutation is in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
In a preferred embodiment the at least one mutation is a deletion, addition or substitution of at least one nucleotide in the nucleotide sequence for the SANTA domain and/or the conserved N-terminal motif located preferably upstream of the SANTA domain and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
In a preferred embodiment the plant expresses a KNL2 protein not comprising a CENPC-k motif having at least one amino acid addition, amino acid deletion and/or amino acid substitution in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
The invention refers also to a haploid plant obtainable by crossing a plant according to the invention with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive.
The invention also refers to a haploid plant obtainable by crossing in a first step a plant according to any of claims 1 to 6 with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein.
The invention refers also to a double haploid plant obtainable by converting the haploid plant according to the invention into a double haploid plant, preferably via colchicine treatment.
The invention refers also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, and b) identifying the haploid progeny plant generated from the crossing step.
The invention refers also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, b) identifying a haploid progeny plant generated from the crossing step, and c) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
The invention refers also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein, and c) identifying the haploid progeny plant generated from step b).
The invention refers also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein, c) identifying a haploid progeny plant generated from step b), and d) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
The invention refers also to a haploid progeny plant generated in a method according to the invention or double haploid progeny plant generated in a method according to the invention.
The invention refers also to a method of generating a plant according to the invention, comprising the steps of: i) subjecting seeds of a plant to a sufficient amount of the mutagen ethylmethane sulfonate to obtain M1 plants, ii) allowing sufficient production of fertile M2 plants, iii) isolating genomic DNA of M2 plants and iv) selecting individuals possessing at least one amino acid substitution, deletion or addition in KNL2.
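Step iv) amounts to comparing the KNL2 protein sequence deduced from each M2 individual with the wildtype sequence. The following is a minimal Python sketch of that comparison, not the procedure used by the inventors. It assumes that the protein sequences are already aligned and of equal length, so it detects substitutions only; deletions or additions would require a proper sequence alignment. The function names, individual names and toy sequences are hypothetical and are not real KNL2 data.

```python
def amino_acid_substitutions(wildtype, mutant):
    """Return substitutions as strings such as 'A3V' (1-based positions).

    Assumes both protein sequences are already aligned and of equal length.
    """
    if len(wildtype) != len(mutant):
        raise ValueError("sequences must be aligned to equal length")
    return [
        f"{wt}{pos}{mt}"
        for pos, (wt, mt) in enumerate(zip(wildtype, mutant), start=1)
        if wt != mt
    ]


def select_individuals(wildtype, m2_sequences):
    """Keep M2 individuals carrying at least one amino acid substitution."""
    selected = {}
    for name, seq in m2_sequences.items():
        subs = amino_acid_substitutions(wildtype, seq)
        if subs:
            selected[name] = subs
    return selected


# Toy data for illustration only (hypothetical, not real KNL2 sequences).
WILDTYPE = "MSAEPLKR"
M2_SEQUENCES = {
    "M2_001": "MSAEPLKR",  # identical to wildtype, not selected
    "M2_002": "MSVEPLKR",  # alanine to valine at position 3
    "M2_003": "MSAEPLKK",  # arginine to lysine at position 8
}
selected = select_individuals(WILDTYPE, M2_SEQUENCES)
```

In practice the comparison would typically be restricted to the region of interest, for example the SANTA domain, by slicing both sequences before calling the function.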
The invention refers also to the generation of transgenic lines of the plants, e.g. by T-DNA, RNAi or CRISPR-Cas9.
The invention refers also to a nucleotide sequence, especially an artificial nucleotide sequence, encoding a KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, comprising at least one mutation causing an amino acid substitution, deletion or addition, preferably in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
The invention refers also to a plant cell or host cell comprising the nucleotide sequence according to the present invention or an according vector as a transgene.
The invention refers also to a method of generating a plant according to the present invention, comprising the steps of: yy) transforming a plant cell with the nucleotide sequence according to the present invention, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
The invention refers also to plants with deregulated expression of KNL2, e.g. T-DNA insertion mutants, CRISPR-Cas9 mutants or plants expressing RNAi constructs.
The invention refers also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer and/or with a plant comprising a nucleotide sequence encoding a KNL2 protein comprising a CENPC-k motive wherein the nucleotide sequence comprises at least one mutation causing in the CENPC-k domain an amino acid substitution which confers the biological activity of a haploid inducer and/or with any plant which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or conserved C-terminal motifs, located preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein.
The invention refers also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer and/or with a plant comprising a nucleotide sequence encoding a KNL2 protein comprising a CENPC-k motive wherein the nucleotide sequence comprises at least one mutation causing in the CENPC-k domain an amino acid substitution which confers the biological activity of a haploid inducer and/or with any plant which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein, and c) identifying the haploid progeny plant generated from step b).
The invention refers also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer or to a plant which confers the biological activity of a haploid inducer or to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or conserved C-terminal motifs, located preferably downstream of the SANTA domain, and not comprising a CENPC-k motive but comprising a nucleotide sequence encoding a KNL2 protein comprising a CENPC-k motive wherein the nucleotide sequence comprises at least one mutation causing in the CENPC-k domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein, c) identifying a haploid progeny plant generated from step b), and d) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
In a preferred embodiment “not comprising a CENPC-k motive” is to be understood as not comprising a wildtype CENPC-k motive. In a preferred embodiment “not comprising a CENPC-k motive” is to be understood as not comprising a wildtype CENPC-k motive and not comprising a mutated CENPC-k motive, in particular a CENPC-k motive differing from the wildtype CENPC-k motive in an amino acid substitution, deletion or addition, in particular only an amino acid substitution. In a preferred embodiment “not comprising a CENPC-k motive” is to be understood as not comprising a wildtype CENPC-k motive, not comprising a mutated CENPC-k motive and not comprising a truncated CENPC-k motive. In a particularly preferred embodiment “not comprising a CENPC-k motive” is to be understood as not comprising a wildtype CENPC-k motive and not comprising a CENPC-k motive as disclosed in WO 2017/067714 A1.
The mutation of the β and/or δKNL2 protein can be at least one amino acid substitution, a deletion of at least one amino acid and/or the addition, i.e. insertion, of at least one amino acid. In a further embodiment the expression of the β and/or δKNL2 protein is diminished or even suppressed in the plant.
According to the invention the β and/or δKNL2 protein does not comprise a CENP-C like motive. The mutation can be in the C-terminal or the N-terminal part of the protein. The invention also relates to the downregulation of the β and/or δKNL2 protein in a plant to produce haploid plants.
In a preferred embodiment the β and/or δKNL2 protein comprises a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, wherein the nucleotide sequence comprises at least one mutation, preferably causing an amino acid deletion, addition, i.e. insertion, or substitution which confers the biological activity of a haploid inducer.
A CENP-C like motive is a motive which has a significant homology to the conserved CENP-C motive of the protein CENP-C, as described in (Kato et al, Science, 340 (2013), 1110-1113).
Preferably, the CENP-C like motive is a CENPC-k motive. The CENPC-k motive is in the C-terminal part of the KNL2 protein of plants comprising a CENPC-k motive.
A CENPC-k motive is disclosed in WO 2017/067714 A1 and in Sandmann, M., et al. (2017); Plant Cell, 29, 144-155.
The subject matter of WO 2017/067714 A1 is incorporated into the present invention by reference.
A KNL2 protein sequence is disclosed in Lermontova, I., et al. (2013); Plant Cell, 25, 3389-3404.
The definition of αKNL2, βKNL2, γKNL2 and δKNL2 is disclosed in Zuo, S., et al. (2022); Mol Biol Evol, 39.
A SANTA domain is disclosed in Lermontova, I., et al. (2013); Plant Cell, 25, 3389-3404 and in Zuo, S., et al (2022); Mol Biol Evol, 39.
An N-terminal motif and a C-terminal motif are disclosed in Zuo, S., et al. (2022); Mol Biol Evol, 39. The inventors also found conserved domains in addition to the SANTA domain in α and β KNL2 proteins.
A CATD domain is disclosed in Ravi, M et al. (2010); Genetics, 186, 461-471.
The corresponding definitions are applicable to the present invention.
The invention refers especially to a plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENPC-k motive, wherein the nucleotide sequence comprises at least one mutation in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence.
Preferably the at least one mutation is a deletion, addition or substitution of at least one nucleotide in the SANTA domain encoding sequence and/or the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence. Preferably the plant has biological activity of a haploid inducer.
In a preferred embodiment the at least one mutation is in the C-terminal part of the β and/or δKNL2 protein.
Accordingly, the invention relates especially to a plant comprising a non-natural DNA sequence expressing a mutated, i.e. non-natural protein, especially a mutated, i.e. non-natural β and/or δKNL2 protein. The corresponding DNA and β and/or δKNL2 protein are accordingly artificial.
In a preferred embodiment the at least one mutation is a point mutation. Preferred are especially one or two point mutations in the SANTA domain and/or the conserved N-terminal motif located preferably upstream of the SANTA domain.
In a preferred embodiment the β and/or δKNL2 protein comprises a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but not a CENP-C like motive wherein the nucleotide sequence comprises a point mutation causing in the SANTA domain and/or the conserved N-terminal motif located preferably upstream of the SANTA domain an amino acid substitution which confers the biological activity of a haploid inducer.
In a preferred embodiment the plant comprises also a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer.
In a preferred embodiment crossing between the plant and a wildtype plant or plant expressing wildtype βKNL2 protein yields at least 0.1 % haploid progeny.
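The percentage criterion can be made concrete with a short calculation. The following Python sketch is an illustration only; it assumes that the haploid induction rate is computed as the number of haploid progeny divided by the total number of progeny scored, which the text does not define explicitly. The function names and the example counts are hypothetical.

```python
def haploid_induction_rate(haploid_count, total_progeny):
    """Return the percentage of haploid progeny among all scored progeny."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= haploid_count <= total_progeny:
        raise ValueError("haploid_count must lie between 0 and total_progeny")
    return 100.0 * haploid_count / total_progeny


def meets_threshold(haploid_count, total_progeny, threshold_percent=0.1):
    """Check whether a cross yields at least the given percentage of haploids."""
    return haploid_induction_rate(haploid_count, total_progeny) >= threshold_percent


# Hypothetical counts: 3 haploids among 1000 scored progeny.
rate = haploid_induction_rate(3, 1000)
passes = meets_threshold(3, 1000)
```

With small progeny numbers a single haploid changes the percentage considerably, so the number of progeny scored should be reported alongside the rate.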
In a preferred embodiment the nucleotide sequence comprising the at least one mutation is an endogenous gene or a transgene, especially an artificial transgene.
In a preferred embodiment the nucleotide sequence comprising the at least one mutation is a transgene and at least one endogenous gene encoding a β and/or δKNL2 protein is inactivated or knocked out.
In a preferred embodiment the plant has one isoform of β and/or δKNL2.
The invention relates also to a part of the plant according to the invention, which is preferably a shoot vegetative organ, root, flower or floral organ, seed, fruit, ovule, embryo, plant tissue or cell. Preferably the part of the plant expresses the mutated form of the β and/or δKNL2 protein.
The invention relates also to a haploid plant obtainable by crossing a plant according to the invention with a plant expressing wildtype β and/or δKNL2 protein.
The invention relates also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a mutated protein, which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype β and/or δKNL2 protein and the wildtype form of the other protein.
The invention relates also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a nucleotide sequence encoding a centromere assembly factor or a spindle assembly checkpoint protein, wherein the nucleotide sequence comprises at least one mutation which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype β and/or δKNL2 protein and preferably wildtype of the centromere assembly factor or the spindle assembly checkpoint protein.
The invention relates also to a haploid plant obtainable by crossing in a first step a plant according to the invention with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype β and/or δKNL2 protein and wildtype CENH3 protein.
The invention relates also to a double haploid plant obtainable by converting the haploid plant according to the invention into a double haploid plant, preferably via colchicine treatment.
The invention relates also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing the wildtype β and/or δKNL2 protein, and b) identifying the haploid progeny plant generated from the crossing step.
The invention relates also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype β and/or δKNL2 protein, b) identifying a haploid progeny plant generated from the crossing step, and c) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
The invention relates also to a method of generating a haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype β and/or δKNL2 protein but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype β and/or δKNL2 protein and wildtype CENH3 protein, and c) identifying the haploid progeny plant generated from step b).
The invention relates also to a method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to the invention to a plant expressing wildtype β and/or δKNL2 protein but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype β and/or δKNL2 protein and wildtype CENH3 protein, c) identifying a haploid progeny plant generated from step b), and d) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
In a preferred embodiment the β and/or δknl2 mutant is transformed with GFP-tailswap CENH3.
The invention relates also to a haploid progeny plant generated in a method according to the invention.
The invention relates also to a double haploid progeny plant generated in a method according to the invention.
The invention relates also to a method of facilitating a cytoplasm exchange, comprising the steps of: x) crossing a plant according to claims 1 to 15 as ovule parent with a plant expressing wildtype β and/or δKNL2 protein as pollen parent, and y) obtaining a haploid progeny plant comprising the chromosomes of the pollen parent and the cytoplasm of the ovule parent. The invention relates also to a haploid progeny plant generated in this method.
The invention relates also to a method of generating a plant according to the invention, comprising the steps of: i) subjecting seeds of a plant to a sufficient amount of the mutagen ethylmethane sulfonate to obtain M1 plants, ii) allowing sufficient production of fertile M2 plants, iii) isolating genomic DNA of M2 plants and iv) selecting individuals possessing at least one amino acid mutation in β and/or δKNL2, preferably in the C-terminal part of β and/or δKNL2.
The invention relates also to a nucleotide sequence encoding β and/or δKNL2 or at least the C-terminal part of the β and/or δKNL2 protein comprising at least one mutation. Preferably the mutation causes in the C-terminal part an amino acid substitution. The invention relates also to a vector comprising this nucleotide sequence. The invention relates also to a plant cell or host cell comprising this nucleotide sequence or this vector as a transgene.
The invention relates also to a method of generating a plant according to the invention, comprising the steps of: yy) transforming a plant cell with the nucleotide sequence or the vector according to the invention, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
The Arabidopsis thaliana sequences in this application serve only as references and do not limit the invention to the particular A. thaliana sequences. Due to the high level of conservation, one skilled in the art is able to find the nucleotide sequence and amino acid sequence corresponding to the A. thaliana sequences in any other plant material or plant species. This is shown for example for a number of other plants in the sequence listing and in figure 1b. In plants the length of the amino acid sequence for KNL2 is in the same range, i.e. between 550 and 650 amino acids long. The CENP-C like motive, especially the CENPC-k motive, is always at the C-terminal part. Accordingly, a skilled person can easily obtain a mutated β and/or δKNL2 protein in any plant species of interest, e.g. crop plants. Interestingly the human KNL2 protein has no CENP-C like motive.
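The stated length range can serve as a coarse first filter when screening candidate KNL2 sequences from another species. The following Python sketch is an illustration only: the bounds of 550 and 650 amino acids come from the paragraph above, while the function names, candidate names and placeholder sequences are hypothetical. Length alone does not establish that a sequence corresponds to KNL2; domain and motif comparison would still be needed.

```python
# Length bounds taken from the description; everything else is hypothetical.
MIN_LEN, MAX_LEN = 550, 650


def within_expected_length(protein_sequence, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return True if the sequence length lies in the expected range (inclusive)."""
    return min_len <= len(protein_sequence) <= max_len


def filter_candidates(candidates):
    """Return the names of candidates whose length fits the expected range."""
    return [
        name for name, seq in candidates.items() if within_expected_length(seq)
    ]


# Placeholder sequences of a given length, not real proteins.
candidates = {
    "cand_short": "M" * 300,
    "cand_ok": "M" * 600,
    "cand_long": "M" * 900,
}
kept = filter_candidates(candidates)
```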
The present invention, using mutants of β and/or δKNL2 for the production of haploid and double haploid plants, has inter alia the following advantages: The β and/or δknl2 mutant can be crossed directly with the wild type. Thus, not only the final product, but also the inducer lines can be non-GMO. The "β and/or δKNL2 approach" can also be applied to a broad number of genotypes. The haploid induction efficiency can be up to around 10% or even more.
The present inventors surprisingly found that β and/or δKNL2 has a SANTA domain at the N-terminus and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, but has no CENP-C like motive.
Advantageously, the mutation of the β and/or δKNL2 protein can be achieved by transgenic as well as non-transgenic methods. Non-transgenic methods are preferred because of enormous costs for deregulation of genetically modified organisms (GMO) as well as increasing public rejection of genetically modified organisms (GMO) or plants generated by means of GMO, in particular crops for human consumption, and extensive market authorisation processes including rigorous safety assessments of such GMOs.
The term "plant" refers to any plant, but particularly seed plants. The term ‘plant’ according to the present invention includes whole plants or parts of such a whole plant.
Whole plants preferably are seed plants, or a crop. Parts of a plant are e.g. shoot vegetative organs/structures, e.g., leaves, stems and tubers; roots, flowers and floral organs/structures, e.g. bracts, sepals, petals, stamens, carpels, anthers and ovules; seed, including embryo, endosperm, and seed coat; fruit and the mature ovary; plant tissue, e.g. vascular tissue, ground tissue, and the like; and cells, e.g. guard cells, egg cells, trichomes and the like; and progeny of the same.
In any case, the plant of the present invention comprises at least one cell comprising a nucleotide sequence encoding a β and/or δKNL2 protein, wherein the nucleotide sequence comprises at least one mutation, preferably causing in the β and/or δKNL2 protein an amino acid substitution, deletion or addition which can confer the biological activity of a haploid inducer to the plant, preferably as specified herein in more detail. Most preferably, most or in particular all cells of the plant of the present invention comprise the mutation as described herein.
The species of plants that can be used in the method of the invention are preferably eudicot, dicot and monocot plants.
The term ‘plant’ in a preferred embodiment relates solely to a whole plant, i.e. a plant exhibiting the full phenotype of a developed plant and capable of reproduction, a developmental earlier stage thereof, e.g. a plant embryo, or to both.
In an embodiment of the present invention the term ‘plant’ refers to a part of a whole plant, in particular plant material, plant cells or plant cell cultures.
The term ‘plant cell’ describes the structural and physiological unit of the plant, and comprises a protoplast and a cell wall. The plant cell may be in form of an isolated single cell, such as a stomatai guard cells or a cultured cell, or as a part of a higher organized unit such as, for example, a plant tissue, or a plant organ.
The term ‘plant material’ includes plant parts, in particular plant cells, plant tissue, in particular plant propagation material, preferably leaves, stems, roots, emerged radicles, flowers or flower parts, petals, fruits, pollen, pollen tubes, anther filaments, ovules, embryo sacs, egg cells, ovaries, zygotes, embryos, zygotic embryos per se, somatic embryos, hypocotyl sections, apical meristems, vascular bundles, pericycles, seeds, roots, cuttings, cell or tissue cultures, or any other part or product of a plant.
Thus, the present invention also provides plant propagation material of the plants of the present invention. Said “plant propagation material” is understood to be any plant material that may be propagated sexually or asexually in vivo or in vitro. Particularly preferred within the scope of the present invention are protoplasts, cells, calli, tissues, organs, seeds, embryos, pollen, egg cells, zygotes, together with any other propagating material obtained from transgenic plants. Parts of plants, such as for example flowers, stems, fruits, leaves, roots originating in mutated plants or their progeny previously mutated, preferably transformed, by means of the methods of the present invention and therefore consisting at least in part of mutated cells, are also an object of the present invention.
The term “transgenic plant” or “transgenic plant cell” or “transgenic plant material” refers to a plant, plant cell or plant material which is characterised by the presence of a polynucleotide or polynucleotide variant of the present invention, which may - in case it is autologous to the plant - either be located at another place or in another orientation than usually found in the plant, plant cell or plant material or which is heterologous to the plant, plant cell or plant material. Preferably, the transgenic plant, plant cell or plant material expresses the polynucleotide or its variants such as to induce apomixis.
Thus, the present invention also provides plant propagation material of the transgenic plants of the present invention. Said “plant propagation material” is understood to be any plant material that may be propagated sexually or asexually in vivo or in vitro. Particularly preferred within the scope of the present invention are protoplasts, cells, calli, tissues, organs, seeds, embryos, pollen, egg cells, zygotes, together with any other propagating material obtained from transgenic plants. Parts of plants, such as for example flowers, stems, fruits, leaves, roots originating in transgenic plants or their progeny previously transformed by means of the methods of the present invention and therefore consisting at least in part of transgenic cells, are also an object of the present invention. Especially preferred plant materials, in particular plant propagation materials, are apomictic seeds.
Particularly preferred plants are monocotyledonous or dicotyledonous plants. Particularly preferred are crop or agricultural plants, such as sunflower, peanut, corn, potato, sweet potato, bean, pea, chicory, lettuce, endive, cabbage, cauliflower, broccoli, turnip, radish, spinach, onion, garlic, eggplant, celery, carrot, squash, pumpkin, zucchini, cucumber, apple, pear, melon, strawberry, grape, raspberry, pineapple, soybean, Cannabis, Humulus (hop), tomato, sorghum, sugar cane, and non-fruit bearing trees such as poplar, rubber, Paulownia, pine, elm, Lolium, Festuca, Dactylis, alfalfa, safflower, tobacco, cassava, coffee, coconut, pineapple, citrus trees, cocoa, tea, banana, avocado, fig, guava, mango, olive, papaya, cashew, macadamia, almond, green beans, lima beans, peas, fir, hemlock, spruce, redwood, in particular maize, wheat, barley, sorghum, rye, oats, turf and forage grasses, millet, rice and sugar cane. Especially preferred are maize, wheat, sorghum, rye, oats, turf grasses and rice.
Particularly preferred are also ornamental plants such as ornamental flowers and ornamental crops, for instance Begonia, Carnation, Chrysanthemum, Dahlia, Gardenia, Asparagus, Geranium, Daisy, Gladiolus, Petunia, Gypsophila, Lilium, Hyacinth, Orchid, Rose, Tulip, Aphelandra, Aspidistra, Aralia, Clivia, Coleus, Cordyline, Cyclamen, Dracaena, Dieffenbachia, Ficus, Philodendron, Poinsettia, Fern, Ivy, Hydrangea, Limonium, Monstera, Palm, Date-palm, Pothos, Syngonium, Violet, Daffodil, Lavender, Lily, Narcissus, Crocus, Iris, Peonies, Zephyranthes, Anthurium, Gloxinia, Azalea, Ageratum, Bamboo, Camellia, Dianthus, Impatiens, Lobelia, Pelargonium, Lilac, Lily of the Valley, Stephanotis, Hydrangea, Sunflower, Gerbera daisy, Oxalis, Marigold and Hibiscus.
Among the dicotyledonous plants Arabidopsis, Boechera, soybean, cotton, sugar beet, oilseed rape, tobacco, pepper, melon, lettuce, Brassica vegetables, in particular Brassica napus, sugar beet, oilseed rape and sunflower are more preferred herein.
In a preferred embodiment the plant is a species selected from the group consisting of Hordeum vulgare, Hordeum bulbosum, Sorghum bicolor, Saccharum officinarum, Zea mays, Setaria italica, Oryza minuta, Oryza sativa, Oryza australiensis, Oryza alta, Triticum aestivum, Secale cereale, Malus domestica, Brachypodium distachyon, Hordeum marinum, Aegilops tauschii, Daucus glochidiatus, Beta vulgaris, Daucus pusillus, Daucus muricatus, Daucus carota, Eucalyptus grandis, Nicotiana sylvestris, Nicotiana tomentosiformis, Nicotiana tabacum, Solanum lycopersicum, Solanum tuberosum, Coffea canephora, Vitis vinifera, Erythranthe guttata, Genlisea aurea, Cucumis sativus, Morus notabilis, Arabidopsis arenosa, Arabidopsis lyrata, Arabidopsis thaliana, Crucihimalaya himalaica, Crucihimalaya wallichii, Cardamine flexuosa, Lepidium virginicum, Capsella bursa-pastoris, Olimarabidopsis pumila, Arabis hirsuta, Brassica napus, Brassica oleracea, Brassica rapa, Raphanus sativus, Brassica juncea, Brassica nigra, Eruca vesicaria subsp. sativa, Citrus sinensis, Jatropha curcas, Populus trichocarpa, Medicago truncatula, Cicer yamashitae, Cicer bijugum, Cicer arietinum, Cicer reticulatum, Cicer judaicum, Cajanus cajanifolius, Cajanus scarabaeoides, Phaseolus vulgaris, Glycine max, Astragalus sinicus, Lotus japonicus, Torenia fournieri, Allium cepa, Allium fistulosum, Allium sativum, and Allium tuberosum.
Preferably, the plant according to the present invention is selected from the group consisting of barley (Hordeum vulgare), sorghum (Sorghum bicolor), rye (Secale cereale), Triticale, sugar cane (Saccharum officinarum), maize (Zea mays), foxtail millet (Setaria italica), rice (Oryza sativa), Oryza minuta, Oryza australiensis, Oryza alta, wheat (Triticum aestivum), Triticum durum, Hordeum bulbosum, purple false brome (Brachypodium distachyon), sea barley (Hordeum marinum), goat grass (Aegilops tauschii), apple (Malus domestica), Beta vulgaris, sunflower (Helianthus annuus), Australian carrot (Daucus glochidiatus), American wild carrot (Daucus pusillus), Daucus muricatus, carrot (Daucus carota), eucalyptus (Eucalyptus grandis), Erythranthe guttata, Genlisea aurea, woodland tobacco (Nicotiana sylvestris), tobacco (Nicotiana tabacum), Nicotiana tomentosiformis, tomato (Solanum lycopersicum), potato (Solanum tuberosum), coffee (Coffea canephora), grape vine (Vitis vinifera), cucumber (Cucumis sativus), mulberry (Morus notabilis), thale cress (Arabidopsis thaliana), Arabidopsis lyrata, sand rock-cress (Arabidopsis arenosa), Crucihimalaya himalaica, Crucihimalaya wallichii, wavy bittercress (Cardamine flexuosa), peppergrass (Lepidium virginicum), shepherd’s-purse (Capsella bursa-pastoris), Olimarabidopsis pumila, hairy rockcress (Arabis hirsuta), rape (Brassica napus), broccoli (Brassica oleracea), Brassica rapa, Brassica juncea, black mustard (Brassica nigra), radish (Raphanus sativus), Eruca vesicaria sativa, orange (Citrus sinensis), Jatropha curcas, Glycine max, and black cottonwood (Populus trichocarpa).
Particularly preferably, the plant is selected from the group consisting of barley (Hordeum vulgare), sorghum (Sorghum bicolor), rye (Secale cereale), Triticale, sugar cane (Saccharum officinarum), maize (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), Triticum durum, Avena sativa, Hordeum bulbosum, Beta vulgaris, sunflower (Helianthus annuus), carrot (Daucus carota), tobacco (Nicotiana tabacum), tomato (Solanum lycopersicum), potato (Solanum tuberosum), coffee (Coffea canephora), grape vine (Vitis vinifera), cucumber (Cucumis sativus), thale cress (Arabidopsis thaliana), rape (Brassica napus), broccoli (Brassica oleracea), Brassica rapa, Brassica juncea, black mustard (Brassica nigra), radish (Raphanus sativus), and Glycine max.
Particularly preferably, the plant is selected from the group consisting of Amborella, Solanum, Camelina, Brassica, Arabidopsis, Alyrata, Capsella, Vigna, Phaseolus, Medicago, Cicer, Glycine, Arachis, Daucus, Fragaria, Ziziphus, Coffea, Malus, Pyrus, Populus, Vitis, Citrus, Ricinus, Nicotiana, Theobroma, Gossypium, Prunus, Cucumis, Brachypodium, Oryza, Setaria, Sorghum, Musa, Elaeis and Phoenix.
In a preferred embodiment the plant is Arabidopsis thaliana.
In the context of the present invention the term ‘at least one mutation’ refers to preferably one mutation, in particular solely one mutation. In a further preferred embodiment, the term ‘at least one mutation’ refers to two mutations, in particular solely two mutations. In a further preferred embodiment, the term ‘at least one mutation’ refers to three mutations, in particular solely three mutations. In a further preferred embodiment, the term ‘at least one mutation’ refers to four mutations, in particular solely four mutations. In a further preferred embodiment, the term ‘at least one mutation’ refers to five mutations, in particular solely five mutations.
In a preferred embodiment of the present invention, the at least one mutation is at least one mutation, is at least two mutations, is at least three mutations, is at least four mutations or is at least five mutations.
In a preferred embodiment of the present invention, the maximum number of mutations is two, three, four, five, six, seven, eight, nine or, most preferably, ten.
In a furthermore preferred embodiment, in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, one amino acid substitution, in particular solely one amino acid substitution, is present.
In a furthermore preferred embodiment, in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, the CENP-C like motif of the β and/or δKNL2 protein, two amino acid substitutions, in particular solely two amino acid substitutions, are present.
In a furthermore preferred embodiment, in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, three amino acid substitutions, in particular solely three amino acid substitutions, are present.
In a furthermore preferred embodiment, in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, four amino acid substitutions, in particular solely four amino acid substitutions, are present.
In a furthermore preferred embodiment, in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, five amino acid substitutions, in particular solely five amino acid substitutions, are present.
In a preferred embodiment of the present invention, in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, 1, 1 or 2, 1 to 3, 1 to 4, 1 to 5, preferably 1 to 6, and more preferably 1 to 7 amino acid substitutions are present.
In particular, the present invention is concerned with mutations that cause or lead to an amino acid deletion, substitution or addition within the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein. Thus, in the context of the present invention, a mutation preferably is a non-synonymous point mutation or substitution in the DNA sequence encoding the β and/or δKNL2 protein resulting in a change in amino acid. This is also called a missense mutation. Further, the change in amino acid or the amino acid substitution may be conservative, i.e. a change to an amino acid with similar physicochemical properties, semi-conservative, e.g. negative to positively charged amino acid, or radical, i.e. a change to a vastly different amino acid.
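The distinction drawn above between synonymous and missense changes, and between conservative, semi-conservative and radical substitutions, can be illustrated with a minimal Python sketch. It uses the standard genetic code. The amino acid groupings in `GROUPS` and the function name `classify_substitution` are assumptions chosen for illustration only and are not defined in this specification; other physicochemical groupings are equally possible.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Illustrative grouping by side-chain properties (an assumption, not from the text).
GROUPS = {
    "nonpolar": "GAVLIMPFW",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}


def group_of(amino_acid):
    """Return the name of the illustrative property group of an amino acid."""
    for name, members in GROUPS.items():
        if amino_acid in members:
            return name
    raise ValueError(f"unknown amino acid: {amino_acid}")


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-lost"
    ref_group, alt_group = group_of(ref_aa), group_of(alt_aa)
    if ref_group == alt_group:
        return "conservative"
    if {ref_group, alt_group} == {"positive", "negative"}:
        return "semi-conservative"  # charge reversal, as in the example above
    return "radical"
```

For instance, under these assumed groupings a GAA to AAA change (glutamate to lysine) is reported as semi-conservative, whereas a GGA to AGA change (glycine to arginine) is reported as radical.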
In a preferred embodiment of the present invention, the present plant having biological activity of a haploid inducer is homozygous with respect to the at least one mutation. In a further embodiment of the present invention, the present plant having biological activity of a haploid inducer is heterozygous with respect to the at least one mutation.
The plant according to the present invention has the biological activity of a haploid inducer. This means that crossing between the plant according to the present invention and a wildtype plant or a plant expressing wildtype β and/or δKNL2 protein yields at least 0.1 %, 0.2 %, 0.3 %, 0.4 %, 0.5 %, 0.6 %, 0.7 %, 0.8 %, 0.9 %, preferably at least 1 %, preferably at least 2 %, preferably at least 3 %, preferably at least 4 %, preferably at least 5 %, preferably at least 6 %, preferably at least 7 %, preferably at least 8 %, preferably at least 9 %, most preferred at least 10 %, at least 15 %, at least 20 % or more haploid progeny. Thereby, a wildtype plant is preferably a plant of the same species which does not comprise the at least one mutation of the plant according to the present invention within the corresponding endogenous β and/or δKNL2 gene, i.e. the plant is able to express the native β and/or δKNL2 protein, and a plant expressing wildtype β and/or δKNL2 is preferably a plant of the same species which comprises i) a nucleotide sequence encoding the β and/or δKNL2 protein without the at least one mutation of the plant according to the present invention and is able to express said native β and/or δKNL2 protein or ii) a nucleotide sequence encoding a β and/or δKNL2 protein from another plant species that shows a comparable functionality to the native β and/or δKNL2; for instance, such a β and/or δKNL2 protein derived from another plant species can be introduced as a transgene.
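The percentage thresholds given above can be read as a simple ratio of haploid progeny to all progeny scored from a cross. The following Python sketch shows that arithmetic. The function names and the default threshold of 0.1 % (the lowest value listed above) are illustrative choices, and the counts in the usage note are invented for the example, not experimental data.

```python
def haploid_induction_rate(haploid_count, total_progeny):
    """Return the percentage of haploid plants among all scored progeny."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= haploid_count <= total_progeny:
        raise ValueError("haploid_count must lie between 0 and total_progeny")
    return 100.0 * haploid_count / total_progeny


def meets_inducer_threshold(haploid_count, total_progeny, min_percent=0.1):
    """Check a cross against a chosen minimum haploid induction rate in percent."""
    return haploid_induction_rate(haploid_count, total_progeny) >= min_percent
```

With hypothetical counts of 5 haploids among 500 scored progeny, the rate is 1 %, which satisfies both the 0.1 % and the 1 % thresholds but not the 2 % threshold.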
Thus, the present invention most advantageously provides means and methods to generate haploid inducer lines in a wide range of eudicot, dicot and monocot species. The present invention also allows the exchange of maternal cytoplasm and the creation of, for instance, cytoplasmic male sterile plants with a desired genotype in a single process step. The present invention is advantageous insofar as a single amino acid mutation can be generated by mutagenesis or any other non-GMO-based approach.
Thus, the entire process of haploidization via application of a haploid inducer line characterized by a point mutated endogenous β and/or δKNL2 gene encoding a β and/or δKNL2 protein with amino acid substitutions at at least one of the positions provided by the present invention is non-transgenic in a preferred embodiment.
In the context of the present invention, an "endogenous" gene, allele or protein refers to a non-recombinant sequence of a plant as the sequence occurs in the respective plant, in particular wildtype plant. The term "mutated" refers to a human-altered sequence. Examples of human-induced non-transgenic mutation include exposure of a plant to a high dose of chemical, radiological, or other mutagen for the purposes of selecting mutants. Alternatively, human-induced transgenic mutations, i.e. recombinant alterations or genomic engineering for example by means of TALE nucleases, zinc-finger nucleases or a CRISPR/Cas system, include fusions, insertions, deletions, and/or changes to the DNA or amino acid sequence.
A polynucleotide or polypeptide sequence is "heterologous or exogenous to" an organism if it originates from a foreign species, or, if from the same species, is modified from its original form. "Recombinant" refers to a human-altered, i.e. transgenic polynucleotide or polypeptide sequence. A "transgene" is used as the term is understood in the art and refers to a, preferably heterologous, nucleic acid introduced into a cell by human molecular manipulation of the cell's genome, e.g. by molecular transformation. Thus, a "transgenic plant" is a plant comprising a transgene, i.e. is a genetically-modified plant. The transgenic plant can be the initial plant into which the transgene was introduced as well as progeny thereof whose genome contains the transgene as well.
The term ‘nucleotide sequence encoding’ refers to a nucleic acid which directs the expression of a specific protein, in particular the β and/or δKNL2 protein or parts thereof. The nucleotide sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into the protein. The nucleotide sequences include both the full length nucleic acid sequences as well as non-full length sequences derived from the full length sequences.
The term ‘gene’ refers to a coding nucleotide sequence and associated regulatory nucleotide sequences.
The term ‘regulatory element’ refers to a sequence, preferably a nucleotide sequence, located upstream (5'), within and/or downstream (3') to a nucleotide sequence, preferably a coding sequence, whose transcription and expression is controlled by the regulatory element, potentially in conjunction with the protein biosynthetic apparatus of the cell. ‘Regulation’ or ‘regulate’ refer to the modulation of the gene expression induced by DNA sequence elements located primarily, but not exclusively upstream (5') from the transcription start of the gene of interest. Regulation may result in an all or none response to a stimulation, or it may result in variations in the level of gene expression.
A regulatory element, in particular DNA sequence, such as a promoter is said to be "operably linked to" or "associated with" a DNA sequence that codes for a RNA or a protein, if the two sequences are situated and orientated such that the regulatory DNA sequence effects expression of the coding DNA sequence.
A ‘promoter’ is a DNA sequence initiating transcription of an associated DNA sequence, in particular being located upstream (5’) from the start of transcription and being involved in recognition and binding of the RNA polymerase. Depending on the specific promoter region it may also include elements that act as regulators of gene expression such as activators, enhancers, and/or repressors.
A ‘3' regulatory element’ (or ‘3' end’) refers to that portion of a gene comprising a DNA segment, excluding the 5' sequence which drives the initiation of transcription and the structural portion of the gene, that determines the correct termination site and contains a polyadenylation signal and any other regulatory signals capable of effecting messenger RNA (mRNA) processing or gene expression. The polyadenylation signal is usually characterised by effecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. Polyadenylation signals are often recognised by the presence of homology to the canonical form 5'-AATAAA-3'.
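As an illustration of how the canonical polyadenylation signal mentioned above might be located in a 3' region, the following Python sketch performs a plain exact-match search for 5'-AATAAA-3'. The function name `find_polya_signals` and the test sequence are hypothetical. Real signals often deviate from the canonical form, so an exact search of this kind is only a first approximation.

```python
def find_polya_signals(sequence, motif="AATAAA"):
    """Return 0-based start positions of every exact occurrence of the motif.

    RNA input is accepted: U is converted to T before searching.
    """
    seq = sequence.upper().replace("U", "T")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping occurrences are also found.
        start = seq.find(motif, start + 1)
    return positions
```

Called on the made-up sequence "GGCAATAAAGCTTTAATAAACC", the function reports the two occurrences of the motif at positions 3 and 14.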
The term ‘coding sequence’ refers to that portion of a gene encoding a protein, polypeptide, or a portion thereof, and excluding the regulatory sequences which drive the initiation or termination of transcription.
The gene, coding sequence or the regulatory element may be one normally found in the cell, in which case it is called ‘autologous’ or ‘endogenous’, or it may be one not normally found in a cellular location, in which case it is termed ‘heterologous’, ‘transgenic’ or ‘transgene’.
A ‘heterologous’ gene, coding sequence or regulatory element may also be autologous to the cell but is, however, arranged in an order and/or orientation or in a genomic position or environment not normally found or occurring in the cell in which it is transferred.
The term ‘vector’ refers to a recombinant DNA construct which may be a plasmid, virus, autonomously replicating sequence, an artificial chromosome, such as the bacterial artificial chromosome BAC, phage or other nucleotide sequence, in which at least two nucleotide sequences, at least one of which is a nucleic acid molecule of the present invention, have been joined or recombined. A vector may be linear or circular. A vector may be composed of a single or double stranded DNA or RNA.
The term ‘expression’ refers to the transcription and/or translation of an endogenous gene or a transgene in plants.
‘Transformation’, ‘transforming’ and ‘transferring’ refer to methods of transferring nucleic acid molecules, in particular DNA, into cells, including, but not limited to, biolistic approaches such as particle bombardment; microinjection; permeabilising the cell membrane with various physical treatments, for instance electroporation, or chemical treatments, for instance polyethylene glycol (PEG); the fusion of protoplasts; or Agrobacterium tumefaciens- or Agrobacterium rhizogenes-mediated transformation. For the injection and electroporation of DNA in plant cells there are no specific requirements for the plasmids used. Plasmids such as pUC derivatives can be used. If whole plants are to be regenerated from such transformed cells, the use of a selectable marker is preferred. Depending upon the method for the introduction of desired genes into the plant cell, further DNA sequences may be necessary; if, for example, the Ti or Ri plasmid is used for the transformation of the plant cell, at least the right border, often, however, the right and left border of the Ti and Ri plasmid T-DNA have to be linked as flanking region to the genes to be introduced. Preferably, the transferred nucleic acid molecules are stably integrated in the genome or plastome of the recipient plant.
In the context of the present invention the term ‘biological activity of a haploid inducer’ or ‘haploid inducer’ or ‘haploid inducer line’ refers to a plant or plant line having the capability to produce haploid progeny or offspring in at least 0.1 %, at least 0.2 %, 0.3 %, 0.4 %, 0.5 %, 0.6 %, 0.7 %, 0.8 %, 0.9 %, preferably at least 1 %, preferably at least 2 %, preferably at least 3 %, preferably at least 4 %, preferably at least 5 %, preferably at least 6 %, preferably at least 7 %, preferably at least 8 %, preferably at least 9 %, most preferred at least 10 %, most preferred at least 15 %, most preferred at least 20 % of cases when crossed to a wildtype plant or a plant at least expressing wildtype β and/or δKNL2 protein. Since the chromosomes of the haploid inducer are eliminated during meiosis the resulting haploid progeny only comprises the chromosomes of the wildtype parent. However, in case the haploid inducer was the ovule parent of the cross, the haploid progeny possesses the cytoplasm of the inducer and the chromosomes of the wildtype parent.
The plant according to the present invention contains in a preferred embodiment the nucleotide sequence encoding the β and/or δKNL2 either as an endogenous gene or a transgene.
The invention relates in a preferred embodiment to a plant according to the present teaching, wherein the at least one amino acid substitution is introduced into the nucleotide sequence encoding β and/or δKNL2 non-transgenically or transgenically.
Thus, preferably in an embodiment, wherein the at least one mutation is effected in the endogenous β and/or δKNL2 gene, the obtained plant is non-transgenic. Preferably, the mutation is effected via non-transgenic mutagenesis, in particular chemical mutagenesis, preferably via EMS (ethylmethane sulfonate)-induced TILLING.
Thus, the present invention relates to a plant, wherein the non-transgenic introduction of the at least one mutation causing in β and/or δKNL2, especially in the N-terminal region of β and/or δKNL2, an amino acid substitution, deletion or addition which confers the biological activity of a haploid inducer is effected via chemical mutagenesis, in particular via TILLING.
Alternatively, the present invention relates to a plant, wherein the non-transgenic introduction of the at least one mutation causing in β and/or δKNL2, especially in the N-terminal region of β and/or δKNL2, an amino acid substitution, deletion or addition which confers the biological activity of a haploid inducer is effected via a CRISPR/Cas method, especially the CRISPR/Cas9 technology.
TILLING as well as a CRISPR/Cas method has the advantage that not only the haploid plant but also the inducer plants are non-GMO.
In another preferred embodiment, the at least one mutation is introduced into the plant in the form of a transgene. Preferably, this is done by transforming a vector comprising a nucleotide sequence encoding at least the N-terminal region of β and/or δKNL2, preferably the complete β and/or δKNL2, comprising at least one amino acid substitution, preferably such as described herein. Methods for transformation of a plant and introducing a transgene into the genome of a plant are well-known in the prior art.
Preferably, Agrobacterium-mediated transformation, the floral dip method or particle bombardment is used for transformation.
In a preferred embodiment, the nucleotide sequence encoding the mutated β and/or δKNL2 protein according to the present invention is transformed into the plant in the form of a transgene, and one or two alleles of the endogenous β and/or δKNL2 gene are preferably inactivated or knocked out. In another preferred embodiment, the nucleotide sequence encoding the mutated β and/or δKNL2 protein according to the present invention is transformed into the plant in the form of a transgene, and the transgene is overexpressed in order to be more competitive than the endogenous β and/or δKNL2 protein.
The present invention also provides a plant obtainable, in particular obtained, by a method according to the present invention and which is characterized by having the biological activity of a haploid inducer.
In a preferred embodiment of the present invention, the method of producing the plant having biological activity of a haploid inducer according to the present invention is not an essentially biological method.
Further, the present invention also provides a method of generating the plant having biological activity of a haploid inducer according to the present invention, comprising the steps of: i) subjecting seeds of a plant to a sufficient amount of the mutagen ethylmethane sulfonate (EMS) to obtain M1 plants, ii) allowing sufficient production of fertile M2 plants, iii) isolating genomic DNA of M2 plants and iv) selecting individuals possessing at least one amino acid substitution, deletion or addition in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein.
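The selection in step iv) can be pictured as comparing the protein sequence deduced for each M2 individual with the wildtype reference and keeping those individuals that carry a change inside a region of interest. The Python sketch below is a simplified illustration under stated assumptions: it handles substitutions only (sequences of equal length), the region boundaries are supplied by the user, and the plant identifiers and sequences in the usage note are invented. It does not reproduce the screening procedure of the present specification.

```python
def substitutions(reference, variant):
    """List (position, reference_residue, variant_residue) for each mismatch.

    Positions are 1-based. Only substitutions are handled, so both
    sequences must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences differ in length; indels are not handled here")
    return [
        (index + 1, ref_aa, var_aa)
        for index, (ref_aa, var_aa) in enumerate(zip(reference, variant))
        if ref_aa != var_aa
    ]


def select_candidates(reference, m2_proteins, region):
    """Return M2 individuals with at least one substitution inside the region.

    `region` is an inclusive (start, end) pair of 1-based positions, e.g. the
    boundaries of a domain of interest (hypothetical values in this sketch).
    """
    start, end = region
    selected = {}
    for plant_id, protein in m2_proteins.items():
        hits = [s for s in substitutions(reference, protein) if start <= s[0] <= end]
        if hits:
            selected[plant_id] = hits
    return selected
```

For example, with the made-up reference "MADKLSTEV", the region (1, 5) and three invented M2 sequences "MADKLSTEV", "MAEKLSTEV" and "MADKLSTEI", only the second individual is selected, because its D3E substitution lies inside the region while the V9I substitution of the third lies outside it.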
The present invention further relates in a preferred embodiment to a method of generating a plant having biological activity of a haploid inducer according to the present invention, comprising the steps of: xx) providing a vector comprising a nucleotide sequence encoding at least the β and/or δKNL2 protein, preferably the N-terminal region of the β and/or δKNL2 protein, most preferably the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein comprising at least one mutation causing in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein an amino acid substitution, yy) transforming a plant cell with the vector, wherein preferably the plant cell comprises one or two endogenous alleles of a β and/or δKNL2 gene inactivated or knocked out, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
The present invention further relates in a preferred embodiment to a method of generating a plant having biological activity of a haploid inducer according to the present invention, comprising the steps of: yy) transforming a plant cell with a nucleotide sequence encoding at least the β and/or δKNL2 protein, preferably the N-terminal region of the β and/or δKNL2 protein, most preferably the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein comprising at least one mutation causing in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein an amino acid substitution, or with a vector comprising a nucleotide sequence encoding at least the β and/or δKNL2 protein, preferably the N-terminal region of the β and/or δKNL2 protein, most preferably the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein comprising at least one mutation causing in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein an amino acid substitution, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
In particular, the present invention relates to a haploid plant, obtainable, in particular obtained, by: a) a cross of a plant having the biological activity of a haploid inducer according to the present invention with a plant expressing wildtype β and/or δKNL2 protein and optionally
b) identifying haploid progeny generated from the crossing step.
Preferably, the identified haploid plant can be converted into a double haploid plant, preferably via colchicine treatment, which is also part of the present invention. Thus, the present invention also relates to a double haploid plant, obtainable, in particular obtained, by converting the haploid plant according to the present invention into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
Thus, the present invention provides also a method of generating a haploid plant, comprising the steps of: a) crossing a plant having the biological activity of a haploid inducer according to the present invention to a plant expressing wildtype P and/or 5KNL2 protein and b) identifying haploid progeny generated from the crossing step.
In a further step c) the selected haploid plant is preferably converted into a double haploid plant, preferably via colchicine treatment. Thus, the invention relates also to a method of generating a double haploid plant.
In a preferred embodiment of the present invention, the method provided is not an essentially biological method.
The inventors also observed that the efficiency of haploid induction by crosses of the β and/or δknl2 mutant with the wild type varies depending on growth conditions. Therefore, β and/or δknl2 mutant and wild type plants can be grown under different light and temperature conditions.
In a preferred embodiment the plant having the biological activity of a haploid inducer according to the present invention and/or the plant expressing wildtype β and/or δKNL2 protein are grown in a method according to the present invention before step a) under a stress condition, especially under a slight stress condition. A suitable stress condition can be an altered temperature or an altered light regime. Preferably the plant is grown at a temperature above or below 21°C, for example at a temperature of at least 23°C and at most 29°C, preferably of around 26°C, or at a temperature of at least 15°C and at most 20°C, preferably of around 18°C.
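The temperature windows named in the preceding paragraph can be written down as a small lookup. The following Python sketch is illustrative only: the function name and the category labels are assumptions made for this example and are not part of the disclosure; only the numeric bounds are taken from the text.

```python
def classify_growth_temperature(temp_c: float) -> str:
    """Label a growth temperature using the ranges stated in the text.

    The labels are hypothetical names chosen for this sketch.
    """
    if 23.0 <= temp_c <= 29.0:
        return "warm stress"   # text: at least 23 and at most 29, preferably around 26
    if 15.0 <= temp_c <= 20.0:
        return "cool stress"   # text: at least 15 and at most 20, preferably around 18
    if temp_c == 21.0:
        return "reference"     # text: stress means above or below 21
    return "outside preferred ranges"


for t in (18.0, 21.0, 26.0, 22.0):
    print(t, classify_growth_temperature(t))
```

Temperatures such as 22°C are above 21°C but fall outside both example windows, so the sketch reports them separately rather than guessing at a classification.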
In a further method according to the present invention a plant with a mutated β and/or δKNL2 protein is crossed with a plant with a mutated CENH3 protein and haploid progeny generated from the crossing step are identified. The identified haploid plants can then be crossed with a wild type plant having neither a mutated β and/or δKNL2 protein nor a mutated CENH3 protein.
Without wishing to be bound by this theory, the efficiency of haploid induction can increase after combination of β and/or δknl2 and cenh3 mutations. The combination of several haploid-causing mutations can help to increase the efficiency of haploid generation. Therefore, in an alternative embodiment, transformation of the β and/or δknl2 mutant with altered CENH3 variants, e.g. GFP-tailswap, can be done to increase its ability to induce haploids. β and/or δknl2 with a mutation within the CENP-C motif can for example be crossed with cenh3. These double mutants can have an increased efficiency to induce haploid formation.
In particular, the present methods do not rely solely on, in particular do not consist of, natural phenomena such as crossing or selection, but in fact are essentially based on the technical teaching so as to provide a specifically mutated nucleotide sequence prepared by mankind’s contribution. Thus, the present invention introduces a specific structural feature, namely a mutation, into a nucleotide sequence and a plant of the present invention, which mutation is not caused by or associated with any natural phenomena such as crossing or selection.
In a particular embodiment of the present invention, which provides a method including a crossing step, said crossing step does not provide - such as a crossing usually does - heterozygous progeny but in fact homozygous progeny. Furthermore, the haploidy of progeny is not the result of the mixing of genes of the plants used for sexual crossing. Furthermore, the presently claimed process of generating a double haploid plant cannot be found in nature.
Further, the present invention also provides a method of facilitating a cytoplasm exchange, comprising the steps of: x) crossing a plant according to the present invention as ovule parent to a plant expressing wildtype β and/or δKNL2 protein as pollen parent, and y) obtaining a haploid progeny plant comprising the chromosomes of the pollen parent and the cytoplasm of the ovule parent.
In a preferred embodiment of the present invention, the method provided is not an essentially biological method. Said method is not a biological method essentially for the same reasons as indicated above, in particular since it is not entirely made up of natural phenomena such as crossing and selection, but involves as an essential feature a significant technical teaching so as to provide a particular mutation in a nucleotide sequence and a plant of the present invention. Furthermore, the haploidy of the progeny is not the result of the mixing of genes of the plants used for sexual crossing.
The method can advantageously be used to create cytoplasmic male sterility (CMS). CMS is caused by the extranuclear genome (mitochondria or chloroplasts) and shows maternal inheritance. Thus, the plant according to the present invention has to exhibit CMS and be the ovule parent of the cross. In this way CMS can be introduced into the crossing partner, preferably being an elite line of a crop.
In a preferred embodiment, the plant according to the present invention can also be used in a method to restore male fertility by providing a normal cytoplasm to a crossing partner that is CMS. Through such a cross the chromosomes of the CMS plant are introduced into the normal cytoplasm of the haploid inducer of the present invention which is not CMS. However, pollen production of the CMS plant has to be induced via temperature, light, length of day etc.
The present invention also relates to a nucleotide sequence encoding at least the β and/or δKNL2 protein, preferably the N-terminal region of the β and/or δKNL2 protein, most preferably the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein, comprising at least one mutation causing in the β and/or δKNL2 protein, preferably in the N-terminal region of the β and/or δKNL2 protein, most preferably in the SANTA domain and/or conserved N-terminal motif, preferably upstream of the SANTA domain, of the β and/or δKNL2 protein an amino acid substitution.
The present invention also relates to a vector, in particular viral vector, construct or plasmid comprising said nucleotide sequence and, if present, associated sequences, preferably as indicated herein.
In a furthermore preferred embodiment of the present invention, the coding sequence of the β and/or δKNL2 may be associated with regulatory elements, such as 5’- and/or 3’- regulatory elements, most preferably with a promoter, preferably a constitutive or inducible promoter.
Further, a plant cell comprising said nucleotide sequence or a vector comprising it as a transgene is provided by the present invention.
In the context of the present invention, the term ‘comprising’ as used herein is understood as to have the meaning of ‘including’ or ‘containing’, which means that in addition to the explicitly mentioned element further elements are possibly present.
In a preferred embodiment of the present invention, the term ‘comprising’ as used herein is also understood to mean ‘consisting of’, thereby excluding the presence of other elements besides the explicitly mentioned element.
In a furthermore preferred embodiment, the term ‘comprising’ as used herein is also understood to mean ‘consisting essentially of’, thereby excluding the presence of other elements providing a significant contribution to the disclosed teaching besides the explicitly mentioned element.
In summary, the inventors found that plants like Arabidopsis have two SANTA genes. βKNL2 contains the SANTA domain and unique conserved motifs at its N and C termini, but no CENPC-k motif. αKNL2 and βKNL2 colocalize with CenH3 at chromocenters during different phases. βKNL2 localizes to centromeres without a CENPC-k motif and without the N terminus. βKNL2 can localize to centromeres without the SANTA domain, but not in meristematic tissues. βKNL2 requires the C-terminal part for nuclear localization. Conserved motifs of βKNL2 are found in all plants. A GVKTRxM motif is preferred for proper localization of βKNL2. βKNL2 requires the SANTA domain and the C-terminal part for proper localization. βKNL2 depends on αKNL2 for proper localization to centromeres in non-meristematic tissues. βKNL2 interacts with centromeric proteins like CENPC, CenH3, αKNL2(C), Mis12, NASP etc. βKNL2, e.g. in Arabidopsis, may localize to the centromeres by interacting with different centromeric proteins to initiate CenH3 loading and kinetochore establishment in dividing cells. βKNL2 may be regulated through SUMOylation.
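A consensus such as GVKTRxM, where x is read here as any one amino acid, can be located in a protein sequence with a regular expression. The Python sketch below is a minimal illustration under that reading; the function name and the test sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# "x" in GVKTRxM is interpreted as any one of the 20 standard amino acids.
# This interpretation is an assumption of the sketch.
GVKTRXM = re.compile(r"GVKTR[ACDEFGHIKLMNPQRSTVWY]M")


def find_gvktrxm(protein_seq: str) -> list[int]:
    """Return 0-based start positions of non-overlapping GVKTRxM matches."""
    return [m.start() for m in GVKTRXM.finditer(protein_seq.upper())]


# Hypothetical toy sequence, used only to exercise the function.
toy = "MSAGVKTRSMLLK"
print(find_gvktrxm(toy))
```

A dedicated motif tool such as the MEME suite mentioned in the figure legends works from position-specific scoring rather than an exact pattern, so a regular expression of this kind is only a quick first screen.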
Further preferred embodiments of the present invention are the subject-matter of the subclaims and the further independent claims.
The invention will now be described in some more detail by way of a non-limiting example and the figures.
The figures show:
Figure 1. Identification of the KNL2 and CENP-C gene homologs across major plant lineages.
(A) Protein structure of KNL2 in Arabidopsis. The SANTA domain and CENPC-k motif are indicated by differently colored boxes. (B) Number of KNL2 homologs in 90 representative plant species. Phylogenetic tree adapted from the NCBI common tree. (C) Phylogenetic relationships of the analyzed species were adapted from (Banks et al., 2011). (D) Number of KNL2 homologs identified in analyzed crucifer (Brassicaceae) genomes.
Figure 2. The alignment of αKNL2 and βKNL2 in Arabidopsis.
Amino acid alignment of αKNL2 and βKNL2 from Arabidopsis. SANTA domain and CENPC-k motif are underlined in blue and pink bars.
Figure 3. Evolutionary relationship among all KNL2 paralogs in land plants.
We performed maximum likelihood phylogenetic analyses using IQ-tree with a protein alignment of all KNL2 paralogs. KNL2 proteins cluster into two branches in three plant clades - heterosporous water ferns (Salviniaceae), eudicots, and grasses (Poaceae) - indicating ancient gene duplications (arrows). The KNL2 in eudicots and grasses can be classified into two major groups (αKNL2 and βKNL2, and γKNL2 and δKNL2, respectively). Bootstrap values obtained after 1 000 ultrafast bootstrap replicates (bb) are shown in the tree. Scale bar indicates number of substitutions per site. The tree is arbitrarily rooted between bryophytes and tracheophytes.
Figure 4. Evolutionary relationship among all KNL2 paralogs in Brassicales species.
We performed maximum likelihood phylogenetic analyses using IQ-tree with a protein alignment of all KNL2 paralogs from Brassicales. KNL2 proteins in Brassicales can be classified into two major groups (αKNL2 and βKNL2). Bootstrap values obtained after 1 000 ultrafast bootstrap replicates (bb) are shown in the tree. Scale bar indicates number of substitutions per site.
Figure 5. KNL2 protein motif evolution across the Brassicales.
(A) Variation map of the SANTA domain in the KNL2 homologs. The WebLogo program was used to present SANTA domain alignments. The upper panel aligns SANTA domains of all KNL2 homologs, whereas the middle and bottom panels represent SANTA domain alignments of αKNL2 and βKNL2 homologs, respectively. (B) Conserved motif evolution of βKNL2 homologs across the Brassicales. The unaligned amino acid sequences of KNL2 from Brassicales species were used to search for additional conserved motifs of KNL2 using MEME suite v5.1.0. The dataset was submitted to the MEME server (http://meme-suite.org/) and the conserved domains and motifs were marked. The scale relates to amino acid residues. The SANTA domain comprised by several domain was underlined in pink bars. (C) Logos generated by MEME for consensus motifs.
Figure 6. Subcellular localization of βKNL2 in tobacco and Arabidopsis.
(A) Live imaging of root tip cells of Arabidopsis transformed with the βKNL2-EYFP fusion construct. Fluorescent signals showed distinct centromeric and diffused nucleoplasmic distribution.
(B) Nucleus isolated from seedlings of the βKNL2-EYFP transformants after immunostaining with anti-GFP (left panel) and anti-CENH3 (middle panel) antibodies. Centromeric immunosignals showed colocalization with bright DAPI-stained chromocenters (right panel).
(C) Live imaging of root tip cells of Arabidopsis transformed with the βKNL2-EYFP fusion construct. A cell undergoing mitosis is encircled.
Figure 7. The KNL2, CENP-C and CENH3 gene expression profiles in Arabidopsis and evolutionary pressures on the KNL2 paralogs.
(A) Column charts showing different expression levels of the KNL2, CENP-C and CENH3 genes in tissues enriched for dividing cells. Expression levels of KNL2, CENH3 and CENP-C were normalized to the reference gene MON1 (At2g28390) in RNA-seq datasets. The corresponding gene id numbers are: αKNL2 (At5g02520), βKNL2 (At1g58210), CENP-C (At1g15660) and CENH3 (At1g01370). (B) Summary of tests for positive selection performed on KNL2 paralogs from Brassicaceae species. Tests that were statistically significant (P < 0.05) are indicated with an asterisk. (C) A schematic of a representative KNL2 protein, showing sites evolving under positive selection identified by Bayes Empirical Bayes analyses (posterior probability > 0.95).
Figure 8 Pipeline for the data mining phylogenetic analysis and motif identification.
Figure 9 Phylogenetic relationship of CENP-C genes in green plant lineages.
Figure 10 Phylogenetic classification of CENP-C genes in Brassicales species.
Figure 11 Alignment of SANTA domain, CENPC-k motif and CENPC motif. (A) Variation map of the SANTA domain in the KNL2 homologs. The WebLogo program (http://weblogo.berkeley.edu/logo.cgi) was used to present SANTA domain alignments. The upper panel aligns SANTA domains of all KNL2 homologs from Brassicales, whereas the middle and bottom panels represent SANTA domain alignments of αKNL2 and βKNL2 homologs, respectively. The putative Aurora kinase phosphorylation consensus is underlined in red bars. (B) Alignment of CENPC-k motif of KNL2 homologs from land plants.
Figure 12 Conserved motif evolution of αKNL2 homologs across the Brassicales.
Figure 13 Motif evolution of KNL2 across the Brassicales.
Figure 14 Maize KNL2 alignment from different genome annotations.
Figure 15 Expression profile of KNL2, CENH3, CENP-C, and NET2A.
Expression of NET2A normalized to the reference gene MON1 (At2g28390) in RNA-seq datasets and expression of KNL2, CENH3 and CENP-C normalized to the reference gene MON1 in micro-array expression studies (source eFP-Browser).
Figure 16 Positive selection site of KNL2 orthologs from Brassicaceae species.
Figure 17 (A) The original and updated annotation of the βKNL2 gene in Arabidopsis. The upper panel is the original βKNL2 gene (At1g58210) annotation containing three exons that encode a 1 246 amino-acid protein, which has a SANTA domain (PF09133), a kinase interacting protein 1 domain (KIP1, PF07765), and a C-terminal structural maintenance of chromosomes, bacterial type domain (SMC_Prok_B). The bottom panel shows the updated gene annotation of two newly predicted proteins. The βKNL2 (At1g58210) gene encodes a protein of 281 amino acids containing the SANTA domain. NET2A (At1g58215) encodes a 947 amino-acid protein containing conserved NET actin-binding (NAB) and SMC_Prok_B domains. (B) The alignment of putative KIP1 and NAB domains.
Figure 18 Flow cytometry analysis of F1 seeds obtained in crosses of βknl2 mutant with wild type.
Figure 19 Identification and primary analysis of βknl2 mutant. βknl2 mutant developmental phenotype. (A) Schematic representation of the T-DNA insertion position in the genomic fragment and protein with the position of the SANTA domain. (B) Representative siliques with red arrowheads showing abnormal whitish glossy seed phenotype from heterozygous βknl2-1 and βknl2-2 plants. (C-D) Boxplots depicting the number of abnormal seeds per silique data from reciprocal crossing of WT and heterozygous βknl2-1 and βknl2-2. (E) Two weeks old in vitro germinated seedlings from Col-0 wild type, βknl2-1 and βknl2-2 heterozygous (+/-) and null mutants (-/-). (G-I) Representative dry seeds of Col-0, βknl2-1 and βknl2-2. (F) βknl2 null (-/-) and heterozygous (+/-) mutants on soil, homozygous mutants turning yellow in red circle. (J) Boxplot depicting the significant increase of abnormal dry seeds per silique of heterozygous βknl2-1 and βknl2-2 compared to WT as control.
Figure 20 Analysis of single siliques for seed germination and presence of abnormal seedlings. (A) Two weeks old in vitro germinated seeds collected from single siliques of WT as control and heterozygous self-pollinated βknl2-1 and βknl2-2 plants. βknl2 homozygous seedlings are indicated by red circles. Bars: 1 cm. (B) Boxplot depicting the significant decrease of germination percentage per silique of heterozygous βknl2-1 and βknl2-2 compared to WT as control. (C) Boxplot depicting the significant increase of abnormal seedlings (red color circled seedlings in fig. 8 A) per silique of heterozygous βknl2-1 and βknl2-2 compared to WT as control. (D) RT-PCR amplification of βKNL2 from βknl2-1 and βknl2-2 homozygous null mutants and WT as the positive control with βKNL2 (EMB1674) gene specific primers and EF1B primers as housekeeping gene.
Figure 21 Reduced cenH3 levels in βknl2 null mutants leading to endoreduplication. (A) Ploidy analysis of white abnormal seeds from βknl2 heterozygous mutants and WT as control. (B) Ploidy analysis of abnormal seedlings representing βknl2 null mutants and WT as control. (C) Boxplot showing a significant decrease in the number of centromeric CENH3 signals in βknl2-1 and βknl2-2 compared to WT as a control. (D) Super resolution microscopic images showing immunostained nuclei of wild-type and βknl2 null mutants with anti-CENH3 antibodies.
Figure 22 A model for the diversification and evolution of the KNL2 gene. Cladogram was constructed using IQtree software based on the multiple sequence alignment of KNL2 proteins. Two evolutionary lineages are indicated by Tetrapoda and Viridiplantae. The yellow stars indicate the ancient plant-specific gene duplication of KNL2, while the red star represents two isoforms of M18BP1 in allotetraploid Xenopus laevis.
Figure 23 Boxplot analysis of abnormal seed phenotype from heterozygous βknl2 mutants.
Figure 24 Representative siliques with arrowheads showing early ovule abortion phenotype in heterozygous βknl2-1 plants due to other T-DNA insertions.
Figure 25 Flow cytometric histograms of nuclei from white seeds of βknl2-2.
Figure 26 Flow cytometric histograms of nuclei isolated from single leaf of βknl2 mutant and WT.
Figure 27 Immunostaining showing complete reduction of CENH3 signals in βknl2 mutant.
Figure 28 Alignment of SANTA domain in KNL2 homologs.
Figure 29 Characterization and identification of the KNL2s in plants.
Figure 30 Reciprocal crossing of mutants with WT to confirm zygosity of mutation causing phenotype.
Figure 31 Single silique genotyping of heterozygous βknl2 mutants.
Figure 32 List of primers used in this study.
The sequence listing comprises the sequences SEQ ID No. 1 to SEQ ID No. 88.
SEQ ID No. 1 to SEQ ID No. 9 show the original and updated annotation of βKNL2 in Arabidopsis.
Annotation of the βKNL2 gene: In the original version of annotation (TAIR and JGI Phytozome database) for the other SANTA domain-encoding paralog, the transcript included three exons (see Figure below, Panel A) encoding respectively the SANTA domain (PF09133), the kinase interacting protein 1 domain (KIP1, PF07765), and the C-terminal structural maintenance of chromosomes, bacterial type domain (SMC_Prok_B), totaling 1 246 amino acids (see sequences below).
One gene, represented by the original first exon of 842 bp plus additional 4 bp, encodes a protein of 281 amino acids, which contains the SANTA domain and was designated as βKNL2 (At1g58210, Figure 2, see sequences below). The second gene is represented by a new first exon of 94 bp containing slightly more than half of the original second exon of 164 bp with a new transcript start site, one intron, and a new second exon consisting of the original third exon of 2 736 bp plus 14 bp of upstream region (see sequences below). It encodes a 947 amino-acid protein containing a conserved NET actin-binding (NAB) domain and a SMC_Prok_B domain (see Figure below, Panel A, At1g58215). Interestingly, the second protein has been studied as a plant-specific member of the Networked (NET) actin-binding protein superfamily, and the putative KIP1 domain was identified as the NAB domain that binds directly to actin filaments (Deeks et al., 2012). This NAB domain and the putative KIP1 domain share the consensus region (see Figure below, Panel B). Deeks et al. hypothesized that this highly conserved region represents an actin-association motif and named this protein NET2A. Because most plant genome sequencing projects used the Arabidopsis genome as a reference, the original annotation of βKNL2 could lead to inaccuracies in annotation of KNL2 orthologs in other plant genomes. This is also shown in figure 17.
SEQ ID No. 10 to SEQ ID No. 88 show conceptual protein sequences of different plants which show conceptual homology to the Arabidopsis αKNL2 (At5g02520) amino acid sequence. Reference is made to figures 1B and 1D.
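The exon lengths and protein lengths quoted for the two re-annotated genes can be cross-checked with simple codon arithmetic. The Python sketch below assumes that the stated exon lengths are entirely coding and include the stop codon; the text does not say this explicitly, so the check is a plausibility test rather than a statement about the actual gene models. The function name is hypothetical.

```python
def residues_from_coding_bp(total_bp: int) -> int:
    """Convert a coding length in bp (stop codon included) to a residue count."""
    if total_bp % 3 != 0:
        raise ValueError(f"{total_bp} bp is not a whole number of codons")
    return total_bp // 3 - 1  # subtract the stop codon


# First gene: original first exon (842 bp) plus 4 additional bp.
first_gene_bp = 842 + 4
# Second gene: new first exon (94 bp) plus new second exon
# (original third exon of 2 736 bp plus 14 bp of upstream region).
second_gene_bp = 94 + (2736 + 14)

print(first_gene_bp, residues_from_coding_bp(first_gene_bp))    # 846 bp -> 281 residues
print(second_gene_bp, residues_from_coding_bp(second_gene_bp))  # 2844 bp -> 947 residues
```

Under that assumption both sums are whole numbers of codons and reproduce the stated lengths of 281 and 947 amino acids.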
The sequences of the sequence listing and shown in the examples and figures are part of preferred embodiments of the present invention.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 1 to 88.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 1 to 88.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 1 to 9.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 1 to 9.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 10 to 88. In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NOs: 10 to 88.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NO. 3.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NO. 3.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NO. 5.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NO. 5.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NO. 6.
In a preferred embodiment the nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein of the plant of the present invention comprises or encodes a sequence having at least 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the sequence set forth in any one of SEQ ID NO. 6.
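The sequence-identity thresholds recited above presuppose a way of computing percent identity between two sequences. The text does not fix a particular definition, so the following Python sketch adopts one common convention as an assumption: identical, non-gap positions divided by the total number of columns of a pre-computed pairwise alignment. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment ('-' marks a gap).

    Assumed convention: matches / alignment length * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches the given threshold, e.g. 75.0 or 90.0."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue alignment with a single mismatch at the last position.
ref = "MKTAYIAKQR"
var = "MKTAYIAKQK"
print(percent_identity(ref, var), meets_identity(ref, var, 90.0))
```

Other conventions (for example, dividing by the length of the shorter sequence or excluding gap columns) give different values for gapped alignments, so the chosen definition should be stated whenever a threshold is applied.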
Further embodiments and examples of the present invention can be found in the post-published document Zuo, S., et al. (2022), Mol. Biol. Evol. 39(6):msac123 (https://doi.org/10.1093/molbev/msac123) and the supplemental data of this document (https://oup.silverchair- cdn.com/oup/backfile/Content public/Joumal/mbe/39/6/10,1093 molbev msacl23/3/msacl23 suppleme ntary_data.zip?Expires=1687363999&Signature=vU25bncFISZvLm83FaSSwa4skMu7OIJRtX5ZSSd- k7DceDx2~vXSy6A2ZcbQTSUorIuGV4v-2B0ivHA~FWvuXSlE- KDbGFIVI07wnK4QMNkDs6ZCDSegsGaNHj - obfRF6HzJ2StOOhOXJSHn~PyOxCJGfkLDEFWRluECpNao50ncvb~s~N6b5olLl~SOARKM7CORkrS oyrgw9dO81Kdl5RvVElBM6gb7wleeaNOkOgmJeC8EzikrRMvJUOp3UOoRcw56~HDK4Dp61apgNb OuBvUnRwRfhbKeWw4~E7bKOcR~z5NjPUmWUEdLZLlGpQlYHJvUtW81NOa9vNTqilvZ2w &Ke y-Pair-Id=APKAIE5G5CRDK6RD3PGA). The document and the supplemental data are incorporated by reference.
The present application claims the priority of European application 22174323.0, which is incorporated by reference.
Examples
EXAMPLE 1
ABSTRACT
KINETOCHORE NULL2 (KNL2) plays a key role in the recognition of centromeres and new CENH3 deposition. To gain insight into the origin and diversification of the KNL2 gene, we reconstructed its evolutionary history in the plant kingdom. Our results indicate that the KNL2 gene in plants underwent three independent ancient duplications in ferns, grasses and eudicots, after which the KNL2 paralogues experienced sub-functionalization. Additionally, we demonstrated that previously unclassified KNL2 genes could be divided into two clades, αKNL2 and βKNL2 in eudicots and γKNL2 and δKNL2 in grasses, respectively. KNL2s of all clades encode the conserved SANTA domain, but only the αKNL2 and γKNL2 groups additionally encode the CENPC-k motif. Signatures of positive selection were found in all clades. The confirmed centromeric localization of βKNL2 and mutant analysis suggest that it participates in new CENH3 loading similarly to αKNL2. A high rate of seed abortion was found in heterozygous βknl2 mutants, and the germinated homozygous mutants did not develop beyond the cotyledon stage. Taken together, our study provides a new understanding of the evolutionary origin and function of the plant kinetochore assembly gene KNL2 and suggests that the plant-specific duplicated KNL2 genes are involved in centromere and/or kinetochore assembly for preserving genome stability.
Keywords: adaptive evolution, CENH3, centromere, gene duplication, kinetochore, KNL2, plant
INTRODUCTION
Centromeres are specific chromosomal regions where kinetochore protein complexes assemble in mitosis and meiosis to attach chromosomes to the spindle microtubules, and thus, are responsible for accurate segregation of chromosomes. Loss of centromere and kinetochore function causes chromosome missegregation, aneuploidy and cell death (Fachinetti, et al. 2013; McKinley and Cheeseman 2016; Barra and Fachinetti 2018). Centromere identity is specified epigenetically by the presence of the histone H3 variant termed CENH3 (also named CENP-A in mammals) which triggers the assembly of a functional kinetochore (Talbert, et al. 2002). The kinetochore complexes are formed by more than 100 proteins including the constitutive centromere associated network (CCAN) complexes and outer kinetochore modules (Cheeseman and Desai 2008; Musacchio and Desai 2017; Hara and Fukagawa 2018).
KINETOCHORE NULL2 (KNL2, also termed M18BP1) (Moree, et al. 2011; Lermontova, et al. 2013) plays a key role in new CENH3 deposition after replication. In vertebrates, M18BP1 (KNL2) is a part of the Mis18 complex, which additionally includes the Mis18α and Mis18β proteins. However, Mis18α and Mis18β have not yet been identified in plants. The human Mis18 complex is transiently present at centromeres prior to new CENH3 incorporation (Fujita, et al. 2007); in chicken and Xenopus the M18BP1 protein is present at centromeres throughout the cell cycle (French, et al. 2017; Hori, et al. 2017). In plants, KNL2 localizes at centromeres throughout the cell cycle, except from metaphase to late anaphase (Lermontova, et al. 2013). The KNL2 proteins identified so far contain the characteristic SANTA domain (Zhang, et al. 2006), a protein module of ~90 amino acids which in some organisms is accompanied by a SANT/Myb-like putative DNA-binding domain. The functional roles of the SANTA and SANT domains have remained obscure for a long time. For instance, an interaction of KNL2 homologues containing the SANT/Myb domain with DNA has not yet been demonstrated, while Arabidopsis thaliana KNL2, lacking this domain, showed DNA-binding capability in vitro and an association with the centromeric repeat PAL1 in vivo (Sandmann, et al. 2017). Deletion of the SANTA domain in Arabidopsis KNL2 neither impaired its targeting to centromeres (Lermontova et al., 2013) nor disrupted its interaction with DNA (Sandmann, et al. 2017). In Xenopus, a direct interaction of M18BP1 with CENH3 nucleosomes also did not require the SANTA domain (French, et al. 2017). However, M18BP1 localizes at centromeres during metaphase - prior to CENH3 loading - by binding to CENP-C using the SANTA domain (French and Straight 2019).
A conserved CENPC-k motif, which is highly similar to the previously described CENPC motif of the CENP-C protein (Sugimoto, et al. 1994; Talbert, et al. 2004; Kato, et al. 2013), was identified in the C-terminal part of the KNL2 homologues in a wide spectrum of eukaryotes (Kral 2016; Sandmann, et al. 2017). The importance of this domain for the centromeric targeting of KNL2 was demonstrated in Arabidopsis (Sandmann, et al. 2017), Xenopus (French, et al. 2017) and chicken (Hori, et al. 2017). Moreover, direct binding of CENPC-k to CENH3 nucleosomes was shown (French, et al. 2017; Hori, et al. 2017). In Xenopus, KNL2, similar to CENP-C, recruits the CENH3 chaperone HJURP to centromeres for new CENH3 assembly, and CENP-C competes with KNL2 for binding new CENH3 at centromeres (French, et al. 2017). KNL2 in eutherian mammals lacks a CENPC-k motif (Kral 2016; Sandmann, et al. 2017), and centromeric localization of human KNL2 may be achieved by direct binding of the SANTA domain to CENP-C (French and Straight 2019). Depletion of KNL2 in different organisms causes defects in CENH3 assembly (Fujita, et al. 2007; Lermontova, et al. 2013; French, et al. 2017). For instance, knockout of M18BP1 as well as other components of the Mis18 complex in human HeLa cells with RNAi abolished centromeric recruitment of newly synthesized CENP-A, leading to chromosome missegregation and interphase micronuclei (Fujita, et al. 2007). Embryos of the homozygous mis18α mutant of mouse showed decreased DNA methylation, increased centromeric transcription, misaligned chromosomes, anaphase bridges and lagging chromosomes, which was accompanied by embryo lethality (Kim et al., 2012). Unlike in mammals, the homozygous knl2 mutant of Arabidopsis is viable despite reduced cenH3 levels and mitotic and meiotic abnormalities resulting in reduced growth rate and fertility (Lermontova, et al. 2013). The fact that in the knl2 mutant cenH3 is still localized at the centromeres suggests that this is not the only mechanism responsible for centromeric loading of cenH3 in plants.
Although the functions of KNL2 are gradually being uncovered, research is still limited to a few model species, and in particular, the precise molecular mechanism of KNL2 interaction remains to be clarified. Up to now, robust phylogenetic analyses of the KNL2 gene across large evolutionary time scales have not been reported. A better understanding of KNL2 evolution may yield important insights into its role in CENH3 deposition and kinetochore assembly. To reconstruct the evolutionary history of the KNL2 gene in plants, we compiled a dataset of the proteins encoded by the KNL2 gene across major plant lineages from available genomic resources. Our phylogenetic analyses indicate that the KNL2 gene in plants underwent three independent ancient duplications in ferns, grasses and eudicots, after which the KNL2 paralogues experienced sub-functionalization. We show that previously unclassified KNL2 genes in eudicots can be divided into two clades (αKNL2 and βKNL2). Both clades encode the conserved SANTA domain, but only the αKNL2 group additionally encodes the conserved CENPC-k motif. Two additional KNL2 clades (γKNL2 and δKNL2) were identified in the grasses. Similar to the divergence of αKNL2 and βKNL2 proteins, γKNL2 proteins retain the CENPC-k motif, while δKNL2 proteins have a shortened motif that resembles part of CENPC-k. Signatures of positive selection were found in both clades. In addition, analysis of RNA-seq data in Arabidopsis shows that βKNL2 gene expression in nearly all tissues is considerably higher than the expression of αKNL2. Moreover, we provide the first evidence that βKNL2 localizes to centromeric regions in Arabidopsis. Mutant analysis of βKNL2 suggests that it participates in new CENH3 loading similarly to αKNL2. Taken together, our study provides a new understanding of the evolutionary origin and function of plant-specific duplicated KNL2 as a kinetochore assembly factor.
RESULTS
Search for KNL2 genes in plants led to the finding and re-annotation of a new KNL2 variant in Arabidopsis
The KNL2 protein contains a conserved module designated as SANTA due to its association with the SANT domain. Although most metazoans have only one gene coding for a SANTA domain-containing protein, two genes (At5g02520 and At1g58210) were identified in Arabidopsis (Zhang, et al. 2006). Because the protein encoded by the At1g58210 gene was annotated as containing, in addition to the SANTA domain, a protein interaction kinase domain 1 (KIP1) and the C-terminal chromosome maintenance structural domain (SMC Prok B), both completely atypical for previously described KNL2 proteins, we had previously excluded it from our research (Lermontova et al., 2013).
However, based on the updated Araport-11 annotation and our in silico analysis, we found that the At1g58210 gene encodes a protein of 281 amino acids that includes only the SANTA domain. We designated it as βKNL2 and the previously characterized KNL2 as αKNL2 (Fig. 1A).
To investigate the origin and evolution of KNL2 genes, we constructed a comprehensive proteome dataset across major plant lineages including 90 representative species (Fig. 1B, C). We performed a genome-wide search using the Arabidopsis αKNL2 (At5g02520) amino acid sequence and its conserved domains as the query for a local BLASTP search against the dataset (see Supplementary Fig. 8 for the data-mining scheme). Using the αKNL2 protein sequence as the query, 148 homologous conceptual protein sequences encoded by KNL2 genes were identified in plant lineages including bryophytes (3 species: 3 sequences), lycophytes (1: 1), ferns (3: 5), gymnosperms (7: 7) and angiosperm species (67: 132) (Fig. 1B, D, Supplementary Table S1 and Supplementary File S1). Two KNL2 protein sequences from Colocasia esculenta and Phoenix dactylifera were retrieved from the GenBank database. For lycophytes, the KNL2 gene was retrieved by a TBLASTN search of the Selaginella moellendorffii genome. Comparison of genomic and cDNA sequences in S. moellendorffii shows that there is an intron right in the CENPC-k motif and that two cDNAs do not encode the CENPC-k motif, which probably reflects mis-splicing events (Supplementary File S2). Alignment of the KNL2 proteins suggests that the two variants in S. moellendorffii are probably alleles. While the KNL2 gene was detected in all investigated angiosperm species and ferns, it was not identified in four out of eleven gymnosperm species investigated (Cycas micholitzii, Ginkgo biloba, Gnetum montanum and Taxus baccata). The failure to find KNL2 in these species is likely due to the incompletely assembled proteomes of gymnosperms, not to its absence from their genomes. Additionally, the KNL2 gene was not retrieved in any algal species, suggesting that it may have been lost from algal lineages.
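The per-lineage counts quoted above can be cross-checked against the reported total of 148 sequences. The following is a minimal bookkeeping sketch in Python; the dictionary layout and function names are illustrative assumptions, and only the numbers come from the text.

```python
# Counts per lineage as reported in the text: (species with a KNL2 hit, sequences).
knl2_hits = {
    "bryophytes": (3, 3),
    "lycophytes": (1, 1),
    "ferns": (3, 5),
    "gymnosperms": (7, 7),
    "angiosperms": (67, 132),
}


def totals(hits):
    """Return (total species with a hit, total sequences) over all lineages."""
    species = sum(n_species for n_species, _ in hits.values())
    sequences = sum(n_seq for _, n_seq in hits.values())
    return species, sequences


def lineages_with_extra_copies(hits):
    """Lineages in which sequences outnumber species, i.e. some species carry >1 copy."""
    return [name for name, (n_species, n_seq) in hits.items() if n_seq > n_species]


total_species, total_sequences = totals(knl2_hits)
print(total_species, total_sequences)         # 81 species with hits, 148 sequences
print(lineages_with_extra_copies(knl2_hits))  # ferns and angiosperms
```

The sequence total matches the 148 stated in the text, and only ferns and angiosperms show more sequences than species, which is consistent with the duplications described in the following section.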
KNL2 gene in plants underwent independent duplications in ferns, grasses and eudicots
To better understand the KNL2 gene diversification and evolution across the plant kingdom, we made a multiple sequence alignment of KNL2 proteins using MAFFT. Because of the high divergence among KNL2 homologs and potentially inaccurate annotation of βKNL2, the alignment was further manually refined (Supplementary File S3), and used to construct a phylogenetic tree. The topology of the Maximum Likelihood (ML) tree (Fig. 3) shows that KNL2 proteins cluster into two branches in three plant clades, namely heterosporous water ferns (Salviniaceae), eudicots, and grasses (Poaceae), indicating ancient gene duplications. Despite the deep divergence of the duplicated paralogs in ferns, their CENPC-k motifs are 83% identical. The grouping of a KNL2 protein of Ceratopteris, a member of the Polypodiales encompassing ~80% of fern species, with one of the two KNL2 proteins of water ferns suggests that the duplication of KNL2 in ferns occurred prior to the divergence of Salviniales and Polypodiales, more than 120 million years ago (Mya) (Qi et al., 2018). In angiosperms, gene duplication occurred after the divergence of Amborella trichopoda and monocots, but prior to the divergence of the basal eudicot Nelumbo nucifera, estimated at ~100 Mya (Angiosperm Phylogeny website: http://www.mobot.org/MOBOT/research/APweb/; Friis, et al. 2016). This duplication gave rise to the αKNL2 and βKNL2 genes of Arabidopsis and their orthologs in other eudicots. Monocots except for grasses (Poaceae) appear to have only one KNL2 gene copy, while two paralogs in grasses indicate another gene duplication in the grass ancestor ~100 Mya (Wu, et al. 2018). Based on their separate origin from αKNL2 and βKNL2 in eudicots, these two paralogous copies in grasses were named γKNL2 and δKNL2.
The αKNL2 and βKNL2 paralogs contain the SANTA domain, but only αKNL2 is characterized by the presence of the C-terminal CENPC-k motif
Next, we focused on the αKNL2 and βKNL2 genes and their proteins mainly in Brassicales due to the extensive availability of genomic resources (Fig. 4, Supplementary File S4). Except for a few neopolyploid species, the αKNL2 and βKNL2 gene numbers are conserved at one copy each across Brassicales families. These KNL2 proteins present several conserved features: the N-terminus contains the conserved SANTA domain in all KNL2 proteins, whereas only the αKNL2 type C-terminus possesses the CENPC-k motif. We produced an alignment of all SANTA domains in KNL2 homologs in Brassicales species to show the conservation and variation within the SANTA domain, and also made separate alignments of the SANTA domains in αKNL2 and βKNL2 paralogs (Fig. 5A). Many residues in the SANTA domains are conserved between αKNL2 and βKNL2 paralogs. However, there are also amino acids specific to αKNL2 or βKNL2, suggesting that they might have different functions or interact with different proteins. For instance, one putative Aurora kinase phosphorylation consensus ((R/K)X1-3(S/T)) can be detected in αKNL2 (Fig. 5A, middle panel, aa 37-41) and three in βKNL2 (Fig. 5A, lower panel, aa 37-41, 47-50, 69-72). In addition, we aligned SANTA domains from angiosperm species (minus Brassicales) and early-diverging land plants (Fig. 11A). As expected, SANTA domain variation increased with phylogenetic divergence through evolutionary time. However, SANTA domains from nearly all paralogs maintain the previously identified conserved hydrophobic residues at the N- and C-termini, including the VxLxDW motif at the N-terminus of the SANTA domain and the GFxxxxxxxFxxGFPxxW motif at the C-terminus.
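The Aurora kinase consensus mentioned above, (R/K)X1-3(S/T), is simple enough to scan for with a regular expression. The sketch below is one possible way to do this in Python and is not the procedure used in the study; the function name is hypothetical and the test peptide is an invented toy string, not a KNL2 sequence.

```python
import re

# (R/K)X1-3(S/T): R or K, then one to three arbitrary residues, then S or T.
# The lookahead allows overlapping hits; the lazy quantifier reports the
# shortest match for each start position.
AURORA_CONSENSUS = re.compile(r"(?=([RK][A-Z]{1,3}?[ST]))")


def find_aurora_sites(protein):
    """Return (start, end, match) tuples with 1-based, inclusive coordinates."""
    sites = []
    for m in AURORA_CONSENSUS.finditer(protein.upper()):
        hit = m.group(1)
        start = m.start() + 1
        sites.append((start, start + len(hit) - 1, hit))
    return sites


# Hypothetical toy peptide (NOT a real KNL2 sequence).
toy_peptide = "MAVRLASDWGKPPTAAE"
print(find_aurora_sites(toy_peptide))  # [(4, 7, 'RLAS'), (11, 14, 'KPPT')]
```

A hit spans four to six residues, which is compatible with the intervals quoted in the text (for example aa 37-41 and aa 47-50). A pattern match only flags a candidate site and says nothing about whether it is phosphorylated in vivo.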
In contrast to the SANTA domain, the CENPC-k motif is highly conserved throughout the plant kingdom where it is present (Supplementary Fig. 11B); however, the CENPC-k motif is missing from the βKNL2 and δKNL2 clades. Given that αKNL2 and βKNL2 paralogs may have been retained to perform distinct functions, we looked for additional conserved motifs in both variants from Brassicales species using the Multiple Em for Motif Elicitation (MEME) tool. Besides the motifs preserved in SANTA and CENPC-k regions (Fig. 5A and Fig. 11B), we also identified several additional conserved motifs that are unique to one or the other paralog (Fig. 5B-C, Fig. 12-13). For example, the N-termini of βKNL2 paralogs have a conserved motif 7 (21 aa), which is located upstream of the SANTA domain (Fig. 5B, C), but is absent in αKNL2 paralogs (Fig. 12).
The KNL2 of maize is represented only by the δKNL2 variant with a truncated CENPC-k motif
We also examined the γKNL2 and δKNL2 genes in grasses. γKNL2 encodes a SANTA domain and CENPC-k motif (Supplementary File S5), while δKNL2 encodes a SANTA domain and the motif RRLRSGKV/I, which resembles a truncated version of the CENPC-k motif (Supplementary File S6). Other monocot species have only one KNL2 gene copy (Fig. 3 and Supplementary Table S1), and these single-copy KNL2 genes more closely resemble the γ clade, encoding the SANTA domain and CENPC-k motif, which is the ancestral state of KNL2 before the grass-specific gene duplication (Supplementary File S5). Interestingly, in eight reference proteomes of maize, we found only one copy of the KNL2 gene, though with several splicing variants (Fig. 14A/14B). We also checked maize transcriptome data from different tissues and developmental stages; however, only δKNL2 was identified (Maize RNA-seq Database: http://ipf.sustech.edu.cn/pub/zmma/). We propose that, unlike other grass species, the maize genome contains only one copy of the KNL2 gene, δKNL2, and has lost γKNL2.
Different evolutionary forces act on KNL2 paralogs
We considered the possibility that selection may act differently on KNL2 paralogs. We used maximum likelihood methods implemented in the PAML suite (Yang 2007) to test for positive selection on each of the KNL2 paralogs in Brassicaceae species (Fig. 16, Supplementary File S7). The branch site model was used to test two KNL2 groups by using Codeml (Yang 2007). Our PAML analyses revealed positive selection on both αKNL2 (Fig. 7B, M1 vs M2, P = 2.104e-04 and M7 vs M8, P = 3.518e-05) and βKNL2 paralogs (M7 vs M8, P = 4.863e-04). Bayes Empirical Bayes analyses identified two amino acids in αKNL2 paralogs and one amino acid in βKNL2 paralogs as having evolved under positive selection with a high posterior probability (>0.95) (Fig. 7C). In αKNL2, the two positively selected sites are located in and slightly C-terminal to the SANTA domain. In βKNL2, the positively selected site is also located slightly C-terminal to the SANTA domain (Fig. 7C, Fig. 16).
βKNL2 of Arabidopsis showed the centromeric localization during interphase
To elucidate whether the βKNL2 paralogs were sub-functionalized or acquired a new function, we assessed the subcellular localization and putative biological function of the Arabidopsis βKNL2 variant in vivo. To this end, the βKNL2 cDNA was cloned into the pDONR221 vector and subcloned into the pGWB641 (35Spro, C-EYFP) and pGWB642 (35Spro, N-EYFP) vectors, respectively. In Arabidopsis seedlings stably transformed with βKNL2 fused to EYFP, fluorescent signals were detected at centromeres and in the nucleoplasm of the root tip nuclei (Fig. 6A-C). An immunostaining experiment with anti-GFP and anti-CENH3 antibodies revealed the colocalization of βKNL2-EYFP with CENH3 at centromeres (Fig. 6B). Live cell imaging of mitotic cells showed that βKNL2 is present at centromeres during interphase, is almost undetectable shortly prior to mitosis, but appears again during the M phase (Fig. 6C). In contrast, αKNL2 was not detectable during metaphase and early anaphase in Arabidopsis root tip cells (Lermontova, et al. 2013).
Expression profile of the KNL2 genes in Arabidopsis
In all selected meristematic tissues, the expression level of βKNL2 is higher than that of αKNL2
To investigate the expression profiles of the KNL2 genes in different tissues and developmental stages and to compare them with CENH3 and CENP-C, we downloaded the available RNA-seq data in Arabidopsis from a public database (Klepikova, et al. 2016) and additionally performed expression analysis using the eFP genome browser. However, βKNL2 was excluded from the eFP genome browser analysis due to the mis-annotation and consequent lack of correct gene expression data. The expression value of selected genes was normalized to the reference gene MONENSIN SENSITIVITY 1 (MON1, At2g28390), which shows stable transcription during plant development (Czechowski, et al. 2005). The data showed that the KNL2, CENH3 and CENP-C genes have high transcriptional activity in tissues enriched for meristematically active cells (Fig. 7A, Fig. 15A), indicating the involvement of these genes in cell division processes. In contrast, a low expression level of the selected genes was observed in the rosette and senescent leaves (Fig. 15A). In general, the CENP-C and CENH3 genes show higher expression than KNL2. Interestingly, βKNL2 has higher expression than αKNL2 in nearly all tissues.
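Normalization to a reference gene, as described above, amounts to dividing each gene's expression value by the reference gene's value in the same tissue. A minimal Python sketch follows; the function name, tissue labels and all numbers are hypothetical and are not data from the study.

```python
def normalize_to_reference(expression, reference_gene):
    """Divide each gene's value by the reference gene's value, tissue by tissue."""
    ref = expression[reference_gene]
    normalized = {}
    for gene, values in expression.items():
        if gene == reference_gene:
            continue
        normalized[gene] = {tissue: values[tissue] / ref[tissue] for tissue in ref}
    return normalized


# Hypothetical expression values (illustrative numbers only).
counts = {
    "MON1": {"shoot_apex": 20.0, "senescent_leaf": 20.0},
    "alphaKNL2": {"shoot_apex": 10.0, "senescent_leaf": 1.0},
    "betaKNL2": {"shoot_apex": 30.0, "senescent_leaf": 2.0},
}

relative = normalize_to_reference(counts, "MON1")
print(relative["alphaKNL2"]["shoot_apex"])  # 0.5
print(relative["betaKNL2"]["shoot_apex"])   # 1.5
```

Because every gene is divided by the same reference value within a tissue, the ranking of genes within that tissue is unchanged; the normalization matters mainly for comparisons across tissues and samples.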
βKNL2 knockout resulted in an abnormal seed phenotype and semilethal mutant phenotype
To characterize and understand the βKNL2 function, two T-DNA insertion lines, SALK_135778 and SALK_091054, were identified and defined as βknl2-1 and βknl2-2, respectively (Fig. 19A). Both T-DNA insertions are present in the single exon of βKNL2, 270 and 335 nucleotides downstream from the transcription start. Thus, in βknl2-1 the T-DNA insertion is located upstream of, and in βknl2-2 directly in, the region encoding the SANTA domain (Fig. 19A). PCR-based genotyping of soil-grown plants revealed no homozygous mutants either in the populations obtained from the ABRC seed stock (n=26, n=38, respectively) or in the next generation (n=195, n=220, respectively). This suggested that the βKNL2 knockout might be lethal.
Therefore, siliques of both mutants were tested for the seed phenotype. Heterozygous βknl2 mutant lines show 11±1% (Fig. 23) of abnormal seeds (P < 0.01), which look larger and whitish with a glossy surface compared to normal green seeds (Fig. 19B), whereas no such seeds were found in WT plants. However, unlike βknl2-2, the βknl2-1 mutant exhibited an ovule abortion phenotype (Fig. 24). The SALK_135778 line carries two additional T-DNA insertions in the AT1G76850 and AT3G13920 genes according to the ABRC database (https://abrc.osu.edu/stocks/618439). Furthermore, these two genes affect ovule development and pollen acceptance. The corresponding mutations cause an ovule lethal phenotype (Bush, et al. 2015; Safavian, et al. 2015). Therefore, we speculated that the ovule lethality found in SALK_135778 might be due to these off-target mutations. Using primers specific to these additional T-DNA insertions, we selected clean βknl2-1 plants carrying a single T-DNA. Indeed, the resulting βknl2-1 lines did not show the aborted ovule phenotype and were selected for further analysis (Fig. 19B). To assess whether the heterozygous or homozygous state of the mutation causes the abnormal seed phenotype, and to test for maternal or paternal effects during embryogenesis, reciprocal crosses between WT and heterozygous βknl2-1 and βknl2-2 mutants were performed. All these crosses produced < 3% of abnormal seeds (Fig. 19C-D, Supplementary Table S2), which is similar to the proportion observed in WT self-pollinated siliques. These findings indicate that the appearance of abnormal seeds in the siliques of heterozygous mutants is not the result of defective female gamete formation, but is rather due to defects during postzygotic development. The fact that the abnormal seeds were increased only in self-pollinated heterozygous mutants (Fig. 19C-D, Supplementary Table S2) suggests the recessive nature of this phenotype.
As mentioned above, homozygous βknl2 mutants cannot be selected among the progeny population of heterozygous lines grown on soil. Therefore, we tested whether the abnormal seeds, possibly carrying homozygous mutant plants, could germinate on a sterile medium. For both mutants, we found abnormal seedlings, with reduced growth rate and root development (Fig. 19E). According to the genotyping results, abnormal seedlings were represented by homozygous mutants, which occur at a frequency of 2-6% of the total number of sown seeds. Unfortunately, our repeated attempts to transfer homozygous seedlings into the soil resulted in their death (Fig. 19F). At the same time, heterozygous mutant seedlings were not distinguishable from the wild-type ones (Fig. 19E).
In heterozygous self- or manually pollinated mutants containing single T-DNA insertions, the siliques show less than 25% of abnormal seeds, which does not correspond to the Mendelian monohybrid phenotypic ratio (Fig. 19C). We hypothesized that this might be due to inaccuracy in the visual phenotyping of immature seeds. Therefore, as the next step, the dry seed phenotype was analyzed in single siliques (Fig. 19G-J). In addition to normal seeds, the heterozygous mutants contain small, dark-colored and shriveled ones (Fig. 19H-I), in contrast to the wild-type (Fig. 19G) with uniform seed size and color.
We observed that the abnormal dry seed phenotype is significantly more frequent in the siliques of both heterozygous mutants compared to WT (Fig. 19J, P < 0.001) and occurs at a frequency similar to that in the fresh siliques (Fig. 23). Thus, it can be assumed that a large part of the whitish seeds with a glossy surface became dark and small or shriveled on drying.
Additionally, we analyzed the germination rate of seeds obtained from single siliques of both βknl2 mutants and wild-type (Fig. 20A-B). Compared to wild-type, mutants showed a significantly decreased germination rate (Fig. 20B, P<0.01) and an increased number of abnormal seedlings per single silique (Fig. 20C, P<0.01). To test the Mendelian segregation of the phenotype-genotype ratio, we also performed single silique genotyping. Homozygous mutants represent ~16% per silique in the case of βknl2-1 and ~25% in the case of βknl2-2 (Supplementary Table S3). The variation between the two mutants may be due to the different quality of the seeds harvested at two different time points and, as a result, the lower germination of the homozygous lines of one of the mutants.
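Whether an observed share of homozygous seedlings departs from the 25% expected under Mendelian monohybrid segregation can be assessed with a chi-square goodness-of-fit test. The following Python sketch illustrates the calculation for a two-class split (homozygous versus all other genotypes, one degree of freedom); the counts are hypothetical, chosen only to mirror the percentages mentioned above, and this is not necessarily the test applied in the study.

```python
import math


def chi_square_two_class(observed_hom, total, expected_fraction=0.25):
    """Goodness-of-fit for a two-class split (homozygous vs. other), df = 1."""
    expected_hom = total * expected_fraction
    expected_other = total - expected_hom
    observed_other = total - observed_hom
    stat = ((observed_hom - expected_hom) ** 2 / expected_hom
            + (observed_other - expected_other) ** 2 / expected_other)
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 16 or 25 homozygous seedlings out of 100 genotyped.
for n_hom in (16, 25):
    stat, p = chi_square_two_class(n_hom, 100)
    print(n_hom, round(stat, 2), round(p, 3))
```

With these assumed counts, 25 out of 100 gives a statistic of zero, whereas 16 out of 100 gives a statistic of 4.32, just above the 5% critical value of 3.84 for one degree of freedom. The outcome depends strongly on the number of seedlings genotyped, which is assumed here.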
To test whether abnormal seedlings (reduced seedling size and reduced root length) of both βknl2 mutants possess βKNL2 transcripts, RT-PCR analysis with gene-specific primers for βKNL2 was performed on RNA isolated from 3-5 seedlings pooled together. The results showed an absence of full-length βKNL2 transcript in both mutant lines SALK_135778 and SALK_091054, suggesting that seedlings for further analysis can be selected based on their phenotype without additional genotyping (Fig. 20D).
βKNL2 is required for proper CENH3 loading and correct mitotic division
We showed that βKNL2 colocalizes at centromeres with CENH3 (Fig. 6B) and has a localization pattern similar to that of αKNL2 (Lermontova, et al. 2013). To analyze whether βKNL2, similar to αKNL2, is involved in the regulation of cell divisions and CENH3 loading, we used homozygous seedlings of both mutants for flow cytometry (FC) analysis and for nuclei isolation for immunostaining. The seedlings were selected based on their abnormal phenotype. Thus, the leaves of the abnormal seedlings and additionally the abnormal white seeds were checked by FC for ploidy levels. Comparison of the green seeds of mutants with WT showed similar histogram profiles with a high 2C count (Fig. 21A top), whereas the white seeds showed a higher 4C count and in general shifts toward increased ploidy levels (Fig. 21A bottom). In some cases, no clear nuclear populations were identified at all (Fig. 25). To analyze ploidy levels of seedlings, we chopped a single leaf from six 14-day-old seedlings of WT and homozygous βknl2. In contrast to WT leaves with distinct peaks of 2C and 4C nuclei, in mutant leaves high ploidy nuclei such as 8C and 16C were predominant (Fig. 21B, Fig. 26).
To find whether the βKNL2 knockout results in reduced loading of CENH3 at centromeres, similar to αKNL2 deregulation, we performed an immunostaining experiment with anti-CENH3 antibodies on nuclei isolated from 14-day-old seedlings of WT and βknl2 mutants. In WT plants, 8-10 distinct dots of CENH3 signals were identified, while in βknl2-1 and βknl2-2 the round-shaped nuclei of similar size showed only 3-6 signals. In some cases, mutant nuclei did not contain any CENH3 signals (Fig. 27). The CENH3 signals were counted in 50 nuclei of WT, βknl2-1 and βknl2-2. In contrast to WT with 8-9 signals, both mutants have only 4 signals on average (Fig. 21C). We performed Student's t-test and found that the mutants have significantly fewer CENH3 signals compared to WT (Fig. 21C, n<6, P<0.1). To check heterochromatin structure in detail, images of representative nuclei from the same slides were captured with a super-resolution microscope (Fig. 21D-I). We observed that in the nuclei with reduced CENH3 levels heterochromatin remains normal and condensed, similar to WT. These data suggest that reduced CENH3 levels in homozygous βknl2-1 and βknl2-2 mutants lead to inhibition of mitosis and switching of cells to endocycles.
DISCUSSION
Origin and Evolution of the KNL2
Duplication of KNL2
Most metazoan genomes have only one KNL2 gene with the SANTA domain, except for the allotetraploid Xenopus laevis, where two KNL2 genes were identified, both with identical CENPC-k motifs, nearly identical SANTA and Myb (SANT) domains, and 74% sequence similarity (Figure 27) (Moree, et al. 2011; French, et al. 2017). In contrast, two genes containing the SANTA domain were identified in the water fern genomes and most angiosperms, whereas only one KNL2 copy was found in bryophytes and gymnosperms. Though Brassicaceae species experienced multiple WGD events such as the At-α and At-β WGDs, which occurred about 31.8-42.8 Mya and 85-92.2 Mya, respectively (Edger, et al. 2018), most species exhibit two KNL2 gene copies, except for a few neopolyploid species which have experienced one or more extra recent WGD events. For instance, Brassica juncea and B. napus were formed through recent allopolyploidizations, consistent with roughly twice as many KNL2 gene copies relative to other Brassica species (Figure 1), and Camelina sativa is a neo-hexaploid (Kagale, et al. 2014) with six copies of the KNL2 gene. The KNL2 gene was not retrieved in any algal species. Based on the quality of the assembled algal proteomes, the KNL2 gene is probably absent in these genomes. For example, the genome of the red alga Chondrus crispus was sequenced using the Sanger technology and gene prediction was manually checked (Collen, et al. 2013), and for the green algae Chlamydomonas reinhardtii (Merchant, et al. 2007) and Coccomyxa subellipsoidea (Blanc, et al. 2012), the draft assembly is 95% and 97% complete based on the alignments of expressed sequence tags (ESTs) to the genome. However, we cannot exclude the possibility that KNL2 has diverged beyond recognition by BLASTP and TBLASTN in algal genomes. In summary, the KNL2 genes likely experienced plant-specific duplication events (Supplementary Fig. 22).
In land plants, we identified recurrent duplication of KNL2 in water ferns, eudicots and grasses. We found strong conservation of KNL2, notably in the VxLxDW motif at the N-terminus of the SANTA domain and the GFxxxxxxxFxxGFPxxW motif at the C-terminus (Figure 7), where the bolded residues impaired CENP-C binding when mutated in Xenopus M18BP (French and Straight 2019), suggesting that plant KNL2s may also bind CENP-C through the SANTA domain. In plants, the CENPC-k motif that binds CENH3 is strongly conserved (Fig. 11); however, in both eudicots and grasses, we found that one of the duplicate KNL2 proteins (βKNL2 and δKNL2) has lost or partially lost the CENPC-k motif. In addition, analysis of αKNL2 and βKNL2 protein sequences identified numerous motifs, strongly suggesting that the paralogs have subfunctionalized. A study in Drosophila has shown that Cid (CENH3) paralogs evolved new motifs following Cid duplication (Kursel and Malik 2017). Loss of ancestral motifs in Drosophila Cids was proposed as direct evidence of subfunctionalization (Kursel and Malik 2017; Kursel, et al. 2020).
We identified positive selection in and near the SANTA domain of KNL2 in the analyzed Brassicaceae species, similar to what has been previously reported for CENH3 (Talbert, et al. 2002) and CENP-C (Talbert, et al. 2004). Thus, KNL2 might be responding to centromere drive through interaction with rapidly evolving CENH3 and the CENH3 chaperone NASPSIM3, which was recently identified in Arabidopsis (Le Goff, et al. 2020), or with CENP-C. However, the mechanisms acting on the adaptively evolving regions remain to be elucidated.
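The model comparisons that underlie such positive-selection results (M1 vs M2 and M7 vs M8 in the Results) are likelihood-ratio tests between nested codon models. As a sketch of the arithmetic, and assuming the usual convention that each of these comparisons is evaluated against a chi-square distribution with two degrees of freedom, the P value follows directly from the two log-likelihoods. The log-likelihood values below are hypothetical and are not taken from the study.

```python
import math


def lrt_p_value_df2(lnl_null, lnl_alt):
    """Likelihood-ratio test for nested models differing by two free parameters."""
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        return stat, 1.0
    # For two degrees of freedom the chi-square survival function is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)


# Hypothetical log-likelihoods of a null (e.g. M7) and an alternative (e.g. M8) model.
stat, p = lrt_p_value_df2(lnl_null=-5120.40, lnl_alt=-5111.90)
print(round(stat, 2), p)
```

Under the same assumption, a reported P value can be converted back to the test statistic as -2 ln(P); for example, P = 2.104e-04 corresponds to a statistic of roughly 16.9.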
Partial or complete loss of the CENPC-k motif in KNL2 in different clades of plants
The CENPC-k motif is found in KNL2 of diverse eukaryotes including non-mammalian vertebrates, many invertebrates, chytrid fungi, cryptomonads, and plants (Kral 2015; Sandmann, et al. 2017). In eudicots, the conserved CENPC-k motif is present in the αKNL2 clade, but is absent from βKNL2. Similarly, in most grass species the CENPC-k motif is conserved in the γKNL2 clade, while the δKNL2 clade does not have the motif. However, we found a RRLRSGKV/I motif possibly related to the beginning of the CENPC-k motif (KRSRSGRV/I) in the δKNL2 clade (Supplementary File S6). We showed previously that the substitution of the 7th Arg by Ala in the CENPC-k motif abolishes centromere targeting of αKNL2 (Sandmann, et al. 2017). In the truncated putative CENPC-k motif, Lys is present instead of Arg. Since these two amino acids have similar features, we assume that Lys might be required for the targeting of δKNL2 to centromeres. However, the truncated putative CENPC-k motif does not include the Trp which, like the 7th Arg, is needed for the targeting of αKNL2 to centromeres (Sandmann, et al. 2017). Moreover, it remains to be elucidated whether KNL2 variants with the truncated CENPC-k motif can target CENH3 nucleosomes directly, without an additional interacting partner. Among all grass species with sequenced genomes, maize represents an exception, since it has only one KNL2 gene, which belongs to the δKNL2 clade with the truncated CENPC-k motif, and has no γKNL2 protein variant with the complete CENPC-k motif. Interestingly, in sorghum, which is closely related to maize, the γKNL2 protein can be identified (Supplementary File S5). While for other species it can be postulated that centromeric targeting of βKNL2 and δKNL2 depends on αKNL2 and γKNL2, respectively, this assumption cannot be applied to maize. Indeed, the pathways and molecular components regulating CENH3 deposition show remarkable variations across organisms (Zasadzinska and Foltz 2017). This suggests that maize has evolved a different mechanism for CENH3 deposition compared to other grasses. It is also possible that the targeting of δKNL2 to centromeres depends on CENP-C. Notably, δKNL2 retains the hydrophobic residues in the SANTA domain that are important for CENP-C binding in Xenopus. In maize, in contrast to other species, two CENP-C proteins were identified (Talbert, et al. 2004). Perhaps the mechanism of localization and function of KNL2 in maize is similar to that of mammals, which represent an exception among vertebrates in lacking CENPC-k, and relies on CENP-C binding similar to Xenopus.
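The resemblance between the truncated motif (RRLRSGKV/I) and the beginning of the CENPC-k motif (KRSRSGRV/I) quoted above can be quantified as position-wise identity. The Python sketch below assumes the two eight-residue strings are already aligned without gaps and takes the V alternative at the last position in both; the function name is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two equal-length, pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Motif strings as quoted in the text, with V chosen for the final "V/I" position.
truncated_motif = "RRLRSGKV"
cenpc_k_start = "KRSRSGRV"
print(percent_identity(truncated_motif, cenpc_k_start))  # 62.5
```

Five of the eight positions are identical. The mismatch at position 7 is the Lys-for-Arg substitution discussed above, which is conservative in charge even though it counts against strict identity.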
The function of KNL2 in plants
Although KNL2 protein homologues have been identified in different organisms as components of the CENH3 loading machinery, they differ considerably in the composition of their functional domains, interacting partners, and localization timing in the mitotic cell cycle. The mammalian M18BP1, composed of the conserved N-terminal (Mis18α-binding) region, SANTA domain, CENP-C-binding domain, SANT (Myb-like) domain and the C-terminus, lacks the CENPC-k motif. The N-terminal (Mis18α-binding) region and the CENP-C-binding domain are required for centromere targeting (Stellfox et al., 2016). Deletion of the SANTA domain in mammalian and chicken M18BP1/KNL2 does not abolish its centromeric localization (Stellfox, et al. 2016; Hori, et al. 2017). In contrast, mutation of the SANTA domain in Xenopus reduced centromeric localization of M18BP1/KNL2 by 90% (French, et al. 2017). Later, the same authors demonstrated that the SANTA domain is required for the interaction of M18BP1/KNL2 with CENP-C during metaphase (French and Straight 2019). We showed previously that in Arabidopsis the centromeric localization of αKNL2 depends on the CENPC-k motif (Sandmann, et al. 2017), while it was not abolished in the complete absence of the N-terminal part of KNL2 with the SANTA domain (Lermontova, et al. 2013). The C-terminal half of Arabidopsis KNL2 was sufficient not only for its targeting to centromeres, but also for the interaction with DNA (Sandmann, et al. 2017). In the present study, we demonstrated that the newly identified βKNL2 of Arabidopsis is characterized by a conserved SANTA domain, but lacks the CENPC-k motif. Despite that, βKNL2 co-localizes with CENH3 at centromeres. In general, both variants of Arabidopsis KNL2 showed a similar localization pattern during interphase. However, in contrast to αKNL2, βKNL2 can be detected on chromosomes during metaphase and early anaphase.
The centromeric location of βKNL2 suggests that βKNL2 may partially compensate for the loss of αKNL2 in the corresponding Arabidopsis mutant, which showed only reduced CENH3 loading rather than the complete loss of loading that would be lethal (Lermontova, et al. 2013). In contrast to αKNL2, homozygous T-DNA insertions in βKNL2 resulted in plant death at the seedling stage. However, it should be considered that in the analyzed αKNL2 mutants, the T-DNA was inserted after the SANTA domain coding region, whereas in the case of βKNL2 mutants both T-DNAs were inserted before the SANTA domain. Therefore, it cannot be excluded that truncated αKNL2 with the full SANTA domain may retain some function in the mutant.
As reciprocal crosses of βknl2 mutants with the wild type resulted in normal seed development in both directions, we hypothesized that the βKNL2 null mutations do not affect gametes or fertilization processes, but rather postzygotic cell divisions. In support of this hypothesis, FC ploidy analysis of young seedlings revealed that, in contrast to the wild-type with distinct 2C and 4C peaks, homozygous mutants showed a shift towards endopolyploidisation (Figure 21B), confirming disruption of cell division. Impaired mitotic divisions in mutant seedlings can be explained by the reduced levels of CENH3 on the centromeres of both mutants (Figure 9D). Thus, our data strongly suggest the involvement of βKNL2 protein in CENH3 loading. The ability of cells in homozygous seedlings to undergo some mitotic divisions can be explained by residual amounts of CENH3 from parental plants, and when CENH3 levels are highly diluted, cells switch from the mitotic cycle to endocycles. We observed that the development of homozygous seedlings can be inhibited at different stages (Figure 19E).
Taken together, our results suggest that the KNL2 gene in eudicots underwent an early duplication, after which the KNL2 proteins experienced sub-functionalization while retaining the core function of CENH3 deposition to define the centromere region. Due to the lack of the CENPC-k motif in βKNL2, we propose that in Arabidopsis βKNL2 might localize to centromeres by binding to CENP-C through the conserved N-terminal motif located upstream of the SANTA domain, as was previously described in humans (Stellfox, et al. 2016), or through the SANTA domain, as was shown for Xenopus (French and Straight 2019), or through both of these regions.
Although three putative Aurora kinase phosphorylation sites can be identified in the SANTA domain of βKNL2, there is only one in αKNL2. This might suggest that the two KNL2 variants are involved in the formation of different protein complexes. We also cannot rule out the possibility that βKNL2 assembles into a Mis18 complex to ensure centromeric localization and subsequent CENH3 deposition. So far, Mis18α and β proteins have not been identified and characterized in Arabidopsis. However, in silico analysis (https://bioinformatics.psb.ugent.be/plaza/) revealed a family of seven genes (At2g40110, At3g08990, At3g11230, At3g55890, At4g27740, At4g27745, At5g53940) encoding proteins with the Yippee-Mis18 domain specific to Mis18 proteins (Stellfox, et al. 2016). Recently it was demonstrated that the direct binding of Schizosaccharomyces pombe Mis18 to nucleosomal DNA is important for the recruitment of spMis18 and Cnp1 (CENH3) to the centromere in fission yeast (Zhang, et al. 2020). In contrast to αKNL2, βKNL2 lacks not only the CENPC-k domain but also the part necessary for interaction with DNA. Thus, its association with Mis18 proteins, which are able to bind DNA, is plausible. We also cannot exclude that centromere targeting of βKNL2 depends on αKNL2.
We showed previously that manipulation of αKNL2 can be used for the production of haploids in Arabidopsis (Lermontova 2017). Here we demonstrate that KNL2 genes exist in two variants in eudicots (α, βKNL2) and monocots (γ, δKNL2). The conserved gene structure and expression patterns of α/γKNL2 in both eudicots and monocots suggest that α/γKNL2 mutations could be used to develop in vivo haploid induction systems in different crop plants. Similarly, the newly identified βKNL2 may become the subject of manipulations to obtain haploids both in Arabidopsis and in crops. As homozygous βknl2 mutants die at the seedling stage, we assume that heterozygous mutant plants can also induce haploids, as was described for the heterozygous cenh3 mutants of maize and wheat (Lv, et al. 2020; Wang, et al. 2021).
Materials and Methods
Data sources and sequences retrieval
The KNL2 protein sequences of Arabidopsis thaliana were identified by screening the Arabidopsis Information Resource (TAIR10) using the specific gene number. To obtain and annotate KNL2 members in plants, we downloaded the reference genomes of 88 representative species, including red and green algae, bryophytes, lycophytes, ferns, gymnosperms, and angiosperms, from the Phytozome database (Goodstein, et al. 2012) (https://phytozome.jgi.doe.gov/), the NCBI genome database, the Ensembl Plants database and other single-genome websites (Supplementary Table S1). We used the homology search tool BLASTP to scan the reference proteomes with a cut-off e-value of 0.01, using whole sequences and conserved domains from Arabidopsis αKNL2 as the query. TBLASTN was used as an additional method in cases where identification failed. Then, we combined the BLAST results and deleted splice variants in multiple sequence alignments. The protein data are summarized in Supplementary Table S1 and File 1.
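The e-value filtering step described above can be illustrated with a minimal sketch. It assumes the BLAST results were saved in the standard 12-column tabular format (e-value in the eleventh column) and that hits at or below the cut-off are kept; the function name and the sequence identifiers in the usage note are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable, List, Tuple

E_VALUE_CUTOFF = 0.01  # cut-off stated in the text


def filter_hits(lines: Iterable[str],
                cutoff: float = E_VALUE_CUTOFF) -> List[Tuple[str, str, float]]:
    """Keep the best (lowest) e-value per query/subject pair at or below the cut-off.

    Assumes standard 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best: Dict[Tuple[str, str], float] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue > cutoff:
            continue  # hit is weaker than the cut-off
        key = (query, subject)
        if key not in best or evalue < best[key]:
            best[key] = evalue
    return sorted((q, s, e) for (q, s), e in best.items())
```

For example, given two hits of a hypothetical query `alphaKNL2` against `species1_gene1` (e-values 1e-30 and 1e-5) and one hit against `species1_gene2` (e-value 0.5), only the `species1_gene1` pair is returned, with its lower e-value.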
Alignments and phylogenetic analysis
To explore the phylogenetic relationships of the CENP-C and KNL2 genes in plant lineages, CENP-C and KNL2 protein sequences were aligned using MAFFT software (Yamada, et al. 2016), and the alignments were then manually refined, including removal of gaps and poorly aligned regions. Evolutionary relationships among CENP-C and KNL2 gene family members were determined with IQ-TREE software (Nguyen, et al. 2015), which implements maximum likelihood methods, based on 1000 bootstrap alignments and single-branch tests. The phylogenetic trees were visualized and modified using FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/). Sequence logos were generated using WebLogo3 (http://weblogo.berkeley.edu/) (Crooks, et al. 2004).
Sequence motif analysis
The unaligned amino acid sequences of KNL2 were used to search for additional conserved motifs with the MEME suite v5.1.0 (Bailey, et al. 2009), a motif discovery tool. Because of misleading annotation of the βKNL2 gene, we manually removed the NET2A regions in some species. The dataset was submitted to the MEME server (http://meme-suite.org/) and the conserved domains and motifs were marked. We used the motif search algorithm MAST (Bailey and Gribskov 1998) to identify motifs.
Plasmid construction, plant transformation and cultivation
The entire open reading frame of βKNL2 (At1g58210) was amplified by RT-PCR with RNA isolated from flower buds of Arabidopsis wild-type and cloned into the pDONR221 vector (Invitrogen) via the Gateway BP reaction. From pDONR221 clones, the open reading frame was recombined via Gateway LR reaction (Invitrogen) into the two attR recombination sites of the Gateway-compatible vectors pGWB641 and pGWB642 (http://shimane-u.org/nakagawa/gbv.htm), respectively, to study the localization of βKNL2 protein in vivo.
Nicotiana benthamiana leaves were infiltrated with Agrobacterium tumefaciens transformed with βKNL2-EYFP fusion constructs according to Walter, et al. (2004).
Plants of Arabidopsis accession Columbia-0 were transformed according to the flower dip method (Clough and Bent 1998). T1 transformants were selected on Murashige and Skoog (MS) medium (Murashige and Skoog 1962) containing 50 mg/L of kanamycin and 50 mg/L hygromycin. Growth conditions in a cultivation room were 21°C 8 h light/18°C 16 h dark or 21°C 16 h light/18°C 8 h dark.
Analysis of T-DNA insertion mutants
Seeds of T-DNA insertion lines were obtained from the European Arabidopsis stock center (http://arabidopsis.info/). To confirm the presence of T-DNA insertions and to distinguish heterozygous from homozygous insertions, we performed PCR with pairs of gene-specific primers flanking the putative positions of the T-DNA (Supplementary Table S4) and with a pair of gene-specific and T-DNA end-specific primers (LBb3.1, Supplementary Table S4). DNA isolation was performed as described in Edwards et al. 1991. For the germination and segregation experiments, seeds from individual siliques were germinated in vitro on MS medium as described above.
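The logic of the two-reaction PCR genotyping can be sketched as follows. The sketch assumes, as is usual for this kind of assay but not stated explicitly in the text, that the gene-specific primer pair yields a product only from the wild-type allele (the inserted T-DNA being too large to amplify) and that the gene-specific/LBb3.1 pair yields a product only from the insertion allele. The function name is hypothetical.

```python
def call_genotype(gene_specific_band: bool, tdna_band: bool) -> str:
    """Infer the genotype of a plant from the presence of two PCR products.

    gene_specific_band: product from the gene-specific primer pair
                        (assumed to amplify only the wild-type allele)
    tdna_band:          product from the gene-specific + T-DNA border primer pair
                        (assumed to amplify only the insertion allele)
    """
    if gene_specific_band and tdna_band:
        return "heterozygous"
    if tdna_band:
        return "homozygous"
    if gene_specific_band:
        return "wild type"
    return "no call"  # neither reaction worked; repeat the PCR
```

A plant showing both products would be scored as heterozygous, and a plant showing only the T-DNA-specific product as homozygous for the insertion.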
Flow cytometry
For the analysis of (endopoly)ploidy of immature seeds, white and green seeds were selected from the same silique of the heterozygous mutant and compared with the green seeds of the wild-type. For the analysis of (endopoly)ploidy levels in seedlings, one leaf from 2-week-old heterozygous mutant and WT seedlings was used. Seeds and leaf tissue were chopped with a razor blade in 300 µl of nuclei extraction buffer (CyStain UV Ploidy, Sysmex-Partec). The resulting nuclei suspension was filtered through a 50 µm disposable CellTrics filter (Sysmex-Partec), incubated for 10 min on ice and measured on a BD Influx cell sorter (BD Biosciences).
Immunostaining and microscopy analysis of fluorescent signals
For analysis of CENH3 loading in homozygous mutants and wild-type, 2-week-old seedlings were used. Slides were prepared using a cytospin and used for immunostaining as described by Ahmadli, et al. (2022). To determine the colocalization of βKNL2-EYFP protein with CENH3, immunostaining of nuclei/chromosomes with anti-CENH3 and anti-GFP antibodies and microscopic analysis of fluorescent signals were performed as previously described (Lermontova, et al. 2013).
To investigate the interphase nucleus and centromeric chromatin ultrastructures at an optical lateral resolution of ~100 nm (super-resolution achieved with 405 nm laser excitation), we applied spatial structured illumination microscopy (3D-SIM) using a 63×/1.40 objective of an Elyra PS.1 super-resolution microscope system (Carl Zeiss GmbH) (Weisshart et al. 2016; Kubalova, et al. 2021). DAPI (whole chromatin) and rhodamine (CENH3 signals) were excited by 405 and 561 nm lasers, respectively.
Expression profile analyses
The Arabidopsis genome assembly and gene annotation were downloaded from Araport11 (https://bar.utoronto.ca/thalemine/dataCategories.do) with integrative reannotation (Cheng, et al. 2017). The CENP-C and KNL2 gene models were manually re-examined. The Arabidopsis RNA-seq data were downloaded from previous studies (Klepikova, et al. 2016). RNA-seq data were selected from 10 tissue types in Arabidopsis, including germinating seeds, stigmatic tissue, ovules from 6th and 7th flowers, young seeds, internode, axis of the inflorescence, flower, anthers of the young flower, opened anthers, and root (NCBI SRA: SRR3581356, SRR3581684, SRR3581691, SRR3581693, SRR3581704, SRR3581705, SRR3581719, SRR3581727, SRR3581728, SRR3581732). Transcriptome analysis used a standard TopHat-Cufflinks pipeline with minor modifications (Trapnell, et al. 2012). Transcription levels were normalized to MON1 and expressed in reads per kilobase of exon model per million mapped reads (RPKM). Expression levels of CENP-C and KNL2 normalized to MON1 in different tissues from microarray experiments were obtained from the Arabidopsis eFP Browser website (http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi). The corresponding gene IDs are: CENP-C (At1g15660), αKNL2 (At5g02520), βKNL2 (At1g58210), and CENH3 (At1g01370).
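The RPKM unit and the normalization to a reference gene can be made concrete with a short sketch. It uses the standard definition of RPKM (reads mapped to the gene, divided by exon length in kilobases and by total mapped reads in millions) and assumes that normalization to MON1 means dividing each value by the MON1 value of the same sample. The function names and the numbers in the usage note are illustrative assumptions, not data or code from the study.

```python
from typing import Dict


def rpkm(read_count: float, exon_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def normalize_to_reference(values: Dict[str, float],
                           reference: str = "MON1") -> Dict[str, float]:
    """Express every gene relative to the reference gene of the same sample."""
    ref_value = values[reference]
    if ref_value <= 0:
        raise ValueError("reference gene expression must be positive")
    return {gene: value / ref_value for gene, value in values.items()}
```

With made-up numbers, 500 reads on a 2000 bp exon model in a library of 10 million mapped reads gives `rpkm(500, 2000, 10_000_000) == 25.0`; if MON1 had an RPKM of 50.0 in the same sample, the normalized value would be 0.5.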
Positive selection analyses
PAML 4.8 software (Yang, 2007) was used to test for positive selection on KNL2 homologs from Brassicaceae species. The KNL2 gene alignments and gene trees were used as input for the CodeML NSsites models of PAML. Alignments were manually refined as described for the phylogenetic analyses. To determine whether the KNL2 homologs evolve under positive selection, random-site models were selected. Random-site models allow ω to vary among sites but not across lineages. We compared two models that do not allow ω to exceed 1 (M1 and M7) with two models that allow ω > 1 (M2 and M8). Positively selected sites were classified as those sites with a Bayes Empirical Bayes posterior probability > 95%.
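The text does not state which statistic was used to compare the model pairs. A common choice with CodeML site models is a likelihood ratio test, in which twice the log-likelihood difference between the nested models (M1 versus M2, M7 versus M8) is compared with a chi-square distribution with two degrees of freedom. The sketch below illustrates that assumed procedure only; the function name and the log-likelihood values in the usage note are hypothetical.

```python
import math
from typing import Tuple


def lrt_two_df(lnl_null: float, lnl_alternative: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models differing by two parameters.

    lnl_null:        log-likelihood of the model without omega > 1 (e.g. M1 or M7)
    lnl_alternative: log-likelihood of the model allowing omega > 1 (e.g. M2 or M8)
    Returns (test statistic, p-value).
    """
    statistic = 2.0 * (lnl_alternative - lnl_null)
    if statistic <= 0.0:
        return 0.0, 1.0  # the alternative model does not fit better
    # For a chi-square distribution with 2 degrees of freedom the
    # survival function has the closed form exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value
```

For hypothetical log-likelihoods of -1000.0 (null) and -995.0 (alternative), the statistic is 10.0 and the p-value is exp(-5), about 0.0067, which would favor the model allowing ω > 1 at the 5% level.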
Statistical data analysis
All statistical analyses were performed in Microsoft Excel using the FTEST and two-tailed TTEST functions (Supplementary File S8). Box plots were generated using the online tool BoxPlotR (http://shiny.chemgrid.org/boxplotr/, Team RC., 2013).
Data availability
Data have been deposited to Figshare (https://figshare.com/s/c3fFd36add8f2704e4ab). All data are available from the corresponding authors upon reasonable request.
Example 2
Flow Cytometry with βKNL2 mutant
Flow cytometry analysis of F1 seeds obtained from crossing the heterozygous βknl2 mutant with wild-type Arabidopsis thaliana under temperature stress revealed 1% haploid seeds (Figure 18).
Example 3
Shown is a nucleotide sequence of a typical KNL2 protein, having a SANTA domain and conserved hydrophobic motifs
Example 4
Shown is an alignment of protein sequences of typical δKNL2 proteins, having a SANTA domain and conserved hydrophobic motifs
REFERENCES
Bailey, T.L., and Gribskov, M. (1998). Combining evidence using p-values: application to sequence homology searches. Bioinformatics 14, 48-54.
Bailey, T.L., Boden, M., Buske, F.A., Frith, M., Grant, C.E., Clementi, L., Ren, J.Y., Li, W.W., and Noble, W.S. (2009). MEME SUITE: tools for motif discovery and searching. Nucleic Acids Research 37, W202-W208.
Banks, J.A., Nishiyama, T., Hasebe, M., Bowman, J.L., Gribskov, M., dePamphilis, C., Albert, V.A., Aono, N., Aoyama, T., Ambrose, B.A., Ashton, N.W., Axtell, M.J., Barker, E., Barker, M.S., Bennetzen, J.L., Bonawitz, N.D., Chapple, C., Cheng, C.Y., Correa, L.G.G., Dacre, M., DeBarry, J., Dreyer, I., Elias, M., Engstrom, E.M., Estelle, M., Feng, L., Finet, C., Floyd, S.K., Frommer, W.B., Fujita, T., Gramzow, L., Gutensohn, M., Harholt, J., Hattori, M., Heyl, A., Hirai, T., Hiwatashi, Y., Ishikawa, M., Iwata, M., Karol, K.G., Koehler, B., Kolukisaoglu, U., Kubo, M., Kurata, T., Lalonde, S., Li, K.J., Li, Y., Litt, A., Lyons, E., Manning, G., Maruyama, T., Michael, T.P., Mikami, K., Miyazaki, S., Morinaga, S., Murata, T., Mueller-Roeber, B., Nelson, D.R., Obara, M., Oguri, Y., Olmstead, R.G., Onodera, N., Petersen, B.L., Pils, B., Prigge, M., Rensing, S.A., Riano-Pachon, D.M., Roberts, A.W., Sato, Y., Scheller, H.V., Schulz, B., Schulz, C., Shakirov, E.V., Shibagaki, N., Shinohara, N., Shippen, D.E., Sorensen, I., Sotooka, R., Sugimoto, N., Sugita, M., Sumikawa, N., Tanurdzic, M., Theissen, G., Ulvskov, P., Wakazuki, S., Weng, J.K., Willats, W.W.G.T., Wipf, D., Wolf, P.G., Yang, L.X., Zimmer, A.D., Zhu, Q.H., Mitros, T., Hellsten, U., Loque, D., Otillar, R., Salamov, A., Schmutz, J., Shapiro, H., Lindquist, E., Lucas, S., Rokhsar, D., and Grigoriev, I.V. (2011). The Selaginella Genome Identifies Genetic Changes Associated with the Evolution of Vascular Plants. Science 332, 960-963.
Barra, V., and Fachinetti, D. (2018). The dark side of centromeres: types, causes and consequences of structural abnormalities implicating centromeric DNA. Nat Commun 9, 4340.
Blanc, G., Agarkova, I., Grimwood, J., Kuo, A., Brueggeman, A., Dunigan, D.D., Gurnon, J., Ladunga, I., Lindquist, E., Lucas, S., Pangilinan, J., Proschold, T., Salamov, A., Schmutz, J., Weeks, D., Yamada, T., Lomsadze, A., Borodovsky, M., Claverie, J.M., Grigoriev, I.V., and Van Etten, J.L. (2012). The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation. Genome Biol 13, R39.
Cheeseman, I.M., and Desai, A. (2008). Molecular architecture of the kinetochore-microtubule interface. Nat Rev Mol Cell Bio 9, 33-46.
Cheng, C.Y., Krishnakumar, V., Chan, A.P., Thibaud-Nissen, F., Schobel, S., and Town, C.D. (2017). Araport11: a complete reannotation of the Arabidopsis thaliana reference genome. Plant J 89, 789-804.
Clough, S.J., and Bent, A.F. (1998). Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant Journal 16, 735-743.
Collen, J., Porcel, B., Carre, W., Ball, S.G., Chaparro, C., Tonon, T., Barbeyron, T., Michel, G., Noel, B., Valentin, K., Elias, M., Artiguenave, F., Arun, A., Aury, J.M., Barbosa-Neto, J.F., Bothwell, J.H., Bouget, F.Y., Brillet, L., Cabello-Hurtado, F., Capella-Gutierrez, S., Charrier, B., Cladiere, L., Cock, J.M., Coelho, S.M., Colleoni, C., Czjzek, M., Da Silva, C., Delage, L., Denoeud, F., Deschamps, P., Dittami, S.M., Gabaldon, T., Gachon, C.M., Groisillier, A., Herve, C., Jabbari, K., Katinka, M., Kloareg, B., Kowalczyk, N., Labadie, K., Leblanc, C., Lopez, P.J., McLachlan, D.H., Meslet-Cladiere, L., Moustafa, A., Nehr, Z., Nyvall Collen, P., Panaud, O., Partensky, F., Poulain, J., Rensing, S.A., Rousvoal, S., Samson, G., Symeonidi, A., Weissenbach, J., Zambounis, A., Wincker, P., and Boyen, C. (2013). Genome structure and metabolic features in the red seaweed Chondrus crispus shed light on evolution of the Archaeplastida. Proc Natl Acad Sci U S A 110, 5247-5252.
Crooks, G.E., Hon, G., Chandonia, J.M., and Brenner, S.E. (2004). WebLogo: A sequence logo generator. Genome Research 14, 1188-1190.
Czechowski, T., Stitt, M., Altmann, T., Udvardi, M.K., and Scheible, W.R. (2005). Genome-wide identification and testing of superior reference genes for transcript normalization in Arabidopsis. Plant Physiol 139, 5-17.
Dambacher, S., Deng, W., Hahn, M., Sadic, D., Frohlich, J.J., Nuber, A., Hoischen, C., Diekmann, S., Leonhardt, H., and Schotta, G. (2012). CENP-C facilitates the recruitment of M18BP1 to centromeric chromatin. Nucleus-Austin 3, 101-110.
Dawe, R.K., Reed, L.M., Yu, H.-G., Muszynski, M.G., and Hiatt, E.N. (1999). A Maize Homolog of Mammalian CENPC Is a Constitutive Component of the Inner Kinetochore. The Plant Cell 11, 1227-1238.
Deeks, M.J., Calcutt, J.R., Ingle, E.K., Hawkins, T.J., Chapman, S., Richardson, A.C., Mentlak, D.A., Dixon, M.R., Cartwright, F., Smertenko, A.P., Oparka, K., and Hussey, P.J. (2012). A superfamily of actin-binding proteins at the actin-membrane nexus of higher plants. Curr Biol 22, 1595-1600.
Duckney, P., Deeks, M.J., Dixon, M.R., Kroon, J., Hawkins, T.J., and Hussey, P.J. (2017). Actin-membrane interactions mediated by NETWORKED2 in Arabidopsis pollen tubes through associations with Pollen Receptor-Like Kinase 4 and 5. New Phytol 216, 1170-1180.
Edger, P.P., Hall, J.C., Harkess, A., Tang, M., Coombs, J., Mohammadin, S., Schranz, M.E., Xiong, Z., Leebens-Mack, J., Meyers, B.C., Sytsma, K.J., Koch, M.A., Al-Shehbaz, I.A., and Pires, J.C. (2018). Brassicales phylogeny inferred from 72 plastid genes: A reanalysis of the phylogenetic localization of two paleopolyploid events and origin of novel chemical defenses. Am J Bot 105, 463-469.
Fachinetti, D., Folco, H.D., Nechemia-Arbely, Y., Valente, L.P., Nguyen, K., Wong, A.J., Zhu, Q., Holland, A.J., Desai, A., Jansen, L.E., and Cleveland, D.W. (2013). A two-step mechanism for epigenetic specification of centromere identity and function. Nat Cell Biol 15, 1056-1066.
French, B.T., and Straight, A.F. (2019). CDK phosphorylation of Xenopus laevis M18BP1 promotes its metaphase centromere localization. Embo J 38.
French, B.T., Westhorpe, F.G., Limouse, C., and Straight, A.F. (2017). Xenopus laevis M18BP1 Directly Binds Existing CENP-A Nucleosomes to Promote Centromeric Chromatin Assembly. Dev Cell 42, 190-199.e10.
Friis, E.M., Pedersen, K.R., and Crane, P.R. (2016). The emergence of core eudicots: new floral evidence from the earliest Late Cretaceous. P Roy Soc B-Biol Sci 283.
Fujita, Y., Hayashi, T., Kiyomitsu, T., Toyoda, Y., Kokubu, A., Obuse, C., and Yanagida, M. (2007). Priming of centromere for CENP-A recruitment by human hMis18 alpha, hMis18 beta, and M18BP1. Developmental Cell 12, 17-30.
Goodstein, D.M., Shu, S.Q., Howson, R., Neupane, R., Hayes, R.D., Fazo, J., Mitros, T., Dirks, W., Hellsten, U., Putnam, N., and Rokhsar, D.S. (2012). Phytozome: a comparative platform for green plant genomics. Nucleic Acids Research 40, D1178-D1186.
Hara, M., and Fukagawa, T. (2018). Kinetochore assembly and disassembly during mitotic entry and exit. Curr Opin Cell Biol 52, 73-81.
Heeger, S., Leismann, O., Schittenhelm, R., Schraidt, O., Heidmann, S., and Lehner, C.F. (2005). Genetic interactions of separase regulatory subunits reveal the diverged Drosophila Cenp-C homolog. Genes Dev 19, 2041-2053.
Hori, T., Shang, W.H., Hara, M., Ariyoshi, M., Arimura, Y., Fujita, R., Kurumizaka, H., and Fukagawa, T. (2017). Association of M18BP1/KNL2 with CENP-A Nucleosome Is Essential for Centromere Formation in Non-mammalian Vertebrates. Dev Cell 42, 181-189 el83.
Kato, H., Jiang, J.S., Zhou, B.R., Rozendaal, M., Feng, H.Q., Ghirlando, R., Xiao, T.S., Straight, A.F., and Bai, Y.W. (2013). A Conserved Mechanism for Centromeric Nucleosome Recognition by Centromere Protein CENP-C. Science 340, 1110-1113.
Klare, K., Weir, J.R., Basilico, F., Zimniak, T., Massimiliano, L., Ludwigs, N., Herzog, F., and Musacchio, A. (2015). CENP-C is a blueprint for constitutive centromere-associated network assembly within human kinetochores. J Cell Biol 210, 11-22.
Klepikova, A.V., Kasianov, A.S., Gerasimov, E.S., Logacheva, M.D., and Penin, A.A. (2016). A high resolution map of the Arabidopsis thaliana developmental transcriptome based on RNA-seq profiling. Plant J 88, 1058-1070.
Kral, L. (2015). Possible identification of CENP-C in fish and the presence of the CENP-C motif in M18BP1 of vertebrates. F1000Res 4, 474.
Kursel, L.E., and Malik, H.S. (2017). Recurrent Gene Duplication Leads to Diverse Repertoires of Centromeric Histones in Drosophila Species. Mol Biol Evol 34, 1445-1462.
Kursel, L.E., Welsh, F.C., and Malik, H.S. (2020). Ancient Coretention of Paralogs of Cid Centromeric Histones and Cal1 Chaperones in Mosquito Species. Mol Biol Evol 37, 1949-1963.
Kwon, M.S., Hori, T., Okada, M., and Fukagawa, T. (2007). CENP-C is involved in chromosome segregation, mitotic checkpoint function, and kinetochore assembly. Mol Biol Cell 18, 2155-2168.
Le Goff, S., Keceli, B.N., Jerabkova, H., Heckmann, S., Rutten, T., Cotterell, S., Schubert, V., Roitinger, E., Mechtler, K., Franklin, F.C.H., Tatout, C., Houben, A., Geelen, D., Probst, A.V., and Lermontova, I. (2020). The H3 histone chaperone NASP(SIM3) escorts CenH3 in Arabidopsis. Plant Journal 101, 71-86.
Lermontova, I. (2017). Generation of haploid plants based on KNL2. (IPK application), publication: 27.04.2017, IPK no. 2015/01, WO/2017/067714.
Lermontova, I., Kuhlmann, M., Friedel, S., Rutten, T., Heckmann, S., Sandmann, M., Demidov, D., Schubert, V., and Schubert, I. (2013). Arabidopsis KINETOCHORE NULL2 Is an Upstream Component for Centromeric Histone H3 Variant cenH3 Deposition at Centromeres. Plant Cell 25, 3389-3404.
Lu, Y., Ran, J.H., Guo, D.M., Yang, Z.Y., and Wang, X.Q. (2014). Phylogeny and Divergence Times of Gymnosperms Inferred from Single-Copy Nuclear Genes. PLoS One 9.
Mandakova, T., Pouch, M., Brock, J.R., Al-Shehbaz, I.A., and Lysak, M.A. (2019). Origin and Evolution of Diploid and Allopolyploid Camelina Genomes Were Accompanied by Chromosome Shattering. Plant Cell 31, 2596-2612.
McKinley, K.L., and Cheeseman, I.M. (2016). The molecular basis for centromere identity and function. Nat Rev Mol Cell Bio 17.
Merchant, S.S., Prochnik, S.E., Vallon, O., Harris, E.H., Karpowicz, S.J., Witman, G.B., Terry, A., Salamov, A., Fritz-Laylin, L.K., Marechal-Drouard, L., Marshall, W.F., Qu, L.H., Nelson, D.R., Sanderfoot, A.A., Spalding, M.H., Kapitonov, V.V., Ren, Q., Ferris, P., Lindquist, E., Shapiro, H., Lucas, S.M., Grimwood, J., Schmutz, J., Cardol, P., Cerutti, H., Chanfreau, G., Chen, C.L., Cognat, V., Croft, M.T., Dent, R., Dutcher, S., Fernandez, E., Fukuzawa, H., Gonzalez-Ballester, D., Gonzalez-Halphen, D., Hallmann, A., Hanikenne, M., Hippler, M., Inwood, W., Jabbari, K., Kalanon, M., Kuras, R., Lefebvre, P.A., Lemaire, S.D., Lobanov, A.V., Lohr, M., Manuell, A., Meier, I., Mets, L., Mittag, M., Mittelmeier, T., Moroney, J.V., Moseley, J., Napoli, C., Nedelcu, A.M., Niyogi, K., Novoselov, S.V., Paulsen, I.T., Pazour, G., Purton, S., Rai, J.P., Riano-Pachon, D.M., Riekhof, W., Rymarquis, L., Schroda, M., Stern, D., Umen, J., Willows, R., Wilson, N., Zimmer, S.L., Allmer, J., Balk, J., Bisova, K., Chen, C.J., Elias, M., Gendler, K., Hauser, C., Lamb, M.R., Ledford, H., Long, J.C., Minagawa, J., Page, M.D., Pan, J., Pootakham, W., Roje, S., Rose, A., Stahlberg, E., Terauchi, A.M., Yang, P., Ball, S., Bowler, C., Dieckmann, C.L., Gladyshev, V.N., Green, P., Jorgensen, R., Mayfield, S., Mueller-Roeber, B., Rajamani, S., Sayre, R.T., Brokstein, P., Dubchak, I., Goodstein, D., Hornick, L., Huang, Y.W., Jhaveri, J., Luo, Y., Martinez, D., Ngau, W.C., Otillar, B., Poliakov, A., Porter, A., Szajkowski, L., Werner, G., Zhou, K., Grigoriev, I.V., Rokhsar, D.S., and Grossman, A.R. (2007). The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318, 245-250.
Moree, B., Meyer, C.B., Fuller, C.J., and Straight, A.F. (2011). CENP-C recruits M18BP1 to centromeres to promote CENP-A chromatin assembly. J Cell Biol 194, 855-871.
Murashige, T., and Skoog, F. (1962). A Revised Medium for Rapid Growth and Bio Assays with Tobacco Tissue Cultures. Physiol Plantarum 15, 473-497.
Musacchio, A., and Desai, A. (2017). A Molecular View of Kinetochore Assembly and Function. Biology (Basel) 6.
Nguyen, L.T., Schmidt, H.A., von Haeseler, A., and Minh, B.Q. (2015). IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Molecular Biology and Evolution 32, 268-274.
Qi, X.P., Kuo, L.Y., Guo, C.C., Li, H., Li, Z.Y., Qi, J., Wang, L.B., Hu, Y., Xiang, J.Y., Zhang, C.F., Guo, J., Huang, C.H., and Ma, H. (2018). A well-resolved fern nuclear phylogeny reveals the evolution history of numerous transcription factor families. Molecular Phylogenetics and Evolution 127, 961-977.
Sandmann, M., Talbert, P., Demidov, D., Kuhlmann, M., Rutten, T., Conrad, U., and Lermontova, I. (2017). Targeting of Arabidopsis KNL2 to Centromeres Depends on the Conserved CENPC-k Motif in Its C Terminus. Plant Cell 29, 144-155.
Stellfox, M.E., Nardi, I.K., Knippler, C.M., and Foltz, D.R. (2016). Differential Binding Partners of the Mis18 alpha/beta YIPPEE Domains Regulate Mis18 Complex Recruitment to Centromeres. Cell Rep 15, 2127-2135.
Sugimoto, K., Yata, H., Muro, Y., and Himeno, M. (1994). Human Centromere Protein-C (Cenp-C) Is a DNA-Binding Protein Which Possesses a Novel DNA-Binding Motif. J Biochem-Tokyo 116, 877-881.
Tachiwana, H., Muller, S., Blumer, J., Klare, K., Musacchio, A., and Almouzni, G. (2015). HJURP involvement in de novo CenH3(CENP-A) and CENP-C recruitment. Cell Rep 11, 22-32.
Talbert, P.B., Bryson, T.D., and Henikoff, S. (2004). Adaptive evolution of centromere proteins in plants and animals. J Biol 3, 18.
Talbert, P.B., Masuelli, R., Tyagi, A.P., Comai, L., and Henikoff, S. (2002). Centromeric localization and adaptive evolution of an Arabidopsis histone H3 variant. Plant Cell 14, 1053-1066.
Teixeira, J.R., Dias, G.B., Svartman, M., Ruiz, A., and Kuhn, G.C.S. (2018). Concurrent Duplication of Drosophila Cid and Cenp-C Genes Resulted in Accelerated Evolution and Male Germline-Biased Expression of the New Copies. J Mol Evol 86, 353-364.
Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D.R., Pimentel, H., Salzberg, S.L., Rinn, J.L., and Pachter, L. (2012). Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7, 562-578.
van Hooff, J.J.E., Tromer, E., van Wijk, L.M., Snel, B., and Kops, G.J.P.L. (2017). Evolutionary dynamics of the kinetochore network in eukaryotes as revealed by comparative genomics. Embo Reports 18, 1559-1571.
Walter, M., Chaban, C., Schutze, K., Batistic, O., Weckermann, K., Nake, C., Blazevic, D., Grefen, C., Schumacher, K., Oecking, C., Harter, K., and Kudla, J. (2004). Visualization of protein interactions in living plant cells using bimolecular fluorescence complementation. Plant Journal 40, 428-438.
Wu, Y., You, H.L., and Li, X.Q. (2018). Dinosaur-associated Poaceae epidermis and phytoliths from the Early Cretaceous of China. Natl Sci Rev 5, 721-727.
Yamada, K.D., Tomii, K., and Katoh, K. (2016). Application of the MAFFT sequence alignment program to large data-reexamination of the usefulness of chained guide trees. Bioinformatics 32, 3246-3251.
Yan, K., Yang, J., Zhang, Z., McLaughlin, S.H., Chang, L., Fasci, D., Ehrenhofer-Murray, A.E., Heck, A.J.R., and Barford, D. (2019). Structure of the inner kinetochore CCAN complex assembled onto a centromeric nucleosome. Nature.
Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 24, 1586-1591.
Zasadzinska, E., and Foltz, D.R. (2017). Orchestrating the Specific Assembly of Centromeric Nucleosomes. Prog Mol Subcell Bio 56, 165-192.
Zhang, D., Martyniuk, C.J., and Trudeau, V.L. (2006). SANTA domain: a novel conserved protein module in Eukaryota with potential involvement in chromatin regulation. Bioinformatics 22, 2459-2462.
Zhang, M., Zheng, F., Xiong, Y.J., Shao, C., Wang, C.L., Wu, M.H., Niu, X.J., Dong, F.F., Zhang, X., Fu, C.H., and Zang, J.Y. (2020). Centromere targeting of Mis18 requires the interaction with DNA and H2A-H2B in fission yeast. Cell Mol Life Sci.
Lermontova I. 2017. Generation of haploid plants based on knl2. Available from: https://patents.google.com/patent/WO2017067714A1/en
Weisshart K, Fuchs J, Schubert V. 2016. Structured Illumination Microscopy (SIM) and Photoactivated Localization Microscopy (PALM) to Analyze the Abundance and Distribution of RNA Polymerase II Molecules on Flow-sorted Arabidopsis Nuclei. Bio-protocol 6:e1725-e1725.
Claims
1. Plant, wherein the plant comprises a nucleotide sequence encoding a KINETOCHORE NULL2 (KNL2) protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain, and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motif, wherein the nucleotide sequence comprises at least one mutation, preferably in any one of the alleles.
2. Plant according to claim 1, wherein the mutation is in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
3. Plant according to claim 1 or 2, wherein the at least one mutation is a deletion, addition or substitution of at least one nucleotide in the nucleotide sequence for the SANTA domain and/or the conserved N-terminal motif located preferably upstream of the SANTA domain and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
4. Plant according to any of the preceding claims, wherein the plant has biological activity of a haploid inducer.
5. Plant according to any of the preceding claims, wherein the plant expresses a KNL2 protein not comprising a CENPC-k motif having at least one amino acid addition, amino acid deletion and/or amino acid substitution in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
6. Part of the plant according to any of the preceding claims, which is preferably a shoot vegetative organ, root, flower or floral organ, seed, fruit, ovule, embryo, plant tissue or cell.
7. Haploid plant obtainable by crossing a plant according to any of claims 1 to 6 with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motif.
8. Haploid plant obtainable by crossing in a first step a plant according to any of claims 1 to 6 with a plant comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, and crossing in a second step a plant obtained in the first step with a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motif and wildtype CENH3 protein.
9. Double haploid plant obtainable by converting the haploid plant according to claim 7 or 8 into a double haploid plant, preferably via colchicine treatment.
10. A method of generating a haploid plant, comprising the steps of: a) crossing a plant according to claims 1 to 6 to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N- terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, and b) identifying the haploid progeny plant generated from the crossing step.
11. A method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to claims 1 to 6 to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N- terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive, b) identifying a haploid progeny plant generated from the crossing step, and
c) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
12. A method of generating a haploid plant, comprising the steps of: a) crossing a plant according to claims 1 to 6 to a plant expressing wildtype KNL2 protein but comprising a nucleotide sequence encoding a centromer histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motive and wildtype CENH3 protein, and c) identifying the haploid progeny plant generated from step b).
13. A method of generating a double haploid plant, comprising the steps of: a) crossing a plant according to claims 1 to 6 to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motif but comprising a nucleotide sequence encoding a centromere histone H3 (CENH3) protein comprising a CATD domain, wherein the nucleotide sequence comprises at least one mutation causing in the CATD domain an amino acid substitution which confers the biological activity of a haploid inducer, b) crossing a plant obtained in step a) to a plant expressing wildtype KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motif and wildtype CENH3 protein, c) identifying a haploid progeny plant generated from step b), and d) converting the haploid progeny plant into a double haploid plant, preferably via colchicine treatment or via spontaneous chromosome doubling.
14. Haploid progeny plant generated in a method according to any of claims 10 or 12 or double haploid progeny plant generated in a method according to any of claims 11 or 13.
15. A method of generating a plant according to claims 1 to 6, comprising the steps of: i) subjecting seeds of a plant to a sufficient amount of the mutagen ethylmethane sulfonate to obtain M1 plants, ii) allowing sufficient production of fertile M2 plants, iii) isolating genomic DNA of M2 plants and iv) selecting individuals possessing at least one amino acid substitution, deletion or addition in KNL2.
16. Nucleotide sequence encoding a KNL2 protein comprising a SANTA domain and/or a conserved N-terminal motif, preferably upstream of the SANTA domain and/or a conserved C-terminal motif, preferably downstream of the SANTA domain, and not comprising a CENPC-k motif, comprising at least one mutation causing an amino acid substitution, deletion or addition, preferably in the SANTA domain encoding sequence and/or in the conserved N-terminal motif located preferably upstream of the SANTA domain encoding sequence and/or in the conserved C-terminal motif, preferably downstream of the SANTA domain.
17. Plant cell or host cell comprising the nucleotide sequence of claim 16 or an according vector as a transgene.
18. A method of generating a plant according to claims 1 to 6, comprising the steps of: yy) transforming a plant cell with the nucleotide sequence of claim 16, and zz) regenerating a plant having the biological activity of a haploid inducer from the plant cell.
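The selection in step iv) of claim 15 (individuals possessing at least one amino acid substitution, deletion or addition in KNL2) can be illustrated with a minimal sketch. It is not taken from the application: the function names `classify_changes` and `select_individuals`, the toy sequences, and the use of Python's `difflib` for the comparison are all assumptions made for illustration. An actual screen would typically work from sequenced genomic DNA and a proper alignment tool.

```python
from difflib import SequenceMatcher

# Map difflib edit operations onto the wording used in the claims.
KIND = {"replace": "substitution", "delete": "deletion", "insert": "addition"}


def classify_changes(wild_type, candidate):
    """Return a list of (kind, position, wild_type_residues, candidate_residues).

    position is the 1-based index in the wild-type protein where the change
    starts. A 'replace' block of unequal length is still reported as a
    substitution, which is a simplification of this sketch.
    """
    matcher = SequenceMatcher(None, wild_type, candidate, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        changes.append((KIND[tag], i1 + 1, wild_type[i1:i2], candidate[j1:j2]))
    return changes


def select_individuals(wild_type, m2_proteins):
    """Return IDs of M2 individuals whose protein differs from wild type."""
    return [
        plant_id
        for plant_id, sequence in m2_proteins.items()
        if classify_changes(wild_type, sequence)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequences, not real KNL2 protein sequences.
    wild_type = "MKTAYL"
    m2_proteins = {"M2-001": "MKTAYL", "M2-002": "MKTGYL", "M2-003": "MKTYL"}
    for plant_id in select_individuals(wild_type, m2_proteins):
        print(plant_id, classify_changes(wild_type, m2_proteins[plant_id]))
```

The reported position could additionally be compared against the coordinates of the SANTA domain or the conserved N- or C-terminal motif to prioritise the mutations that claim 16 names as preferred. Those coordinates are species-specific and are not assumed here.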
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP23723986.8A EP4525602A1 (en) | 2022-05-19 | 2023-05-19 | Generation of haploid plants based on novel knl2 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22174323 | 2022-05-19 | | |
EP22174323.0 | 2022-05-19 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023222908A1 (en) | 2023-11-23 |
Family
ID=81749324
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2023/063525 WO2023222908A1 (en) | 2022-05-19 | 2023-05-19 | Generation of haploid plants based on novel knl2 |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP4525602A1 (en) |
WO (1) | WO2023222908A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110083202A1 (en) | 2009-10-06 | 2011-04-07 | Regents Of The University Of California | Generation of haploid plants and improved plant breeding |
EP3159413A1 (en) * | 2015-10-22 | 2017-04-26 | Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK); OT Gatersleben | Generation of haploid plants based on knl2 |
WO2020157197A1 (en) * | 2019-01-30 | 2020-08-06 | KWS SAAT SE & Co. KGaA | Haploid inducers |
2023
- 2023-05-19 EP EP23723986.8A patent/EP4525602A1/en active Pending
- 2023-05-19 WO PCT/EP2023/063525 patent/WO2023222908A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110083202A1 (en) | 2009-10-06 | 2011-04-07 | Regents Of The University Of California | Generation of haploid plants and improved plant breeding |
EP3159413A1 (en) * | 2015-10-22 | 2017-04-26 | Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK); OT Gatersleben | Generation of haploid plants based on knl2 |
WO2017067714A1 (en) | 2015-10-22 | 2017-04-27 | Leibniz-Institut für Pflanzengenetik Und Kulturpflanzenforschung (IPK) | Generation of haploid plants based on knl2 |
WO2020157197A1 (en) * | 2019-01-30 | 2020-08-06 | KWS SAAT SE & Co. KGaA | Haploid inducers |
Also Published As
Publication number | Publication date |
---|---|
EP4525602A1 (en) | 2025-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23723986 Country of ref document: EP Kind code of ref document: A1 |
 | WWE | Wipo information: entry into national phase | Ref document number: 2023723986 Country of ref document: EP |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 2023723986 Country of ref document: EP Effective date: 20241219 |