WO2025019762A1 - Method of detecting maize plants with altered aleurone anthocyanin content and associated marker alleles and haplotypes

Method of detecting maize plants with altered aleurone anthocyanin content and associated marker alleles and haplotypes

Info

Publication number
WO2025019762A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
nucleic acid
agpv4
referenced
reference genome
Prior art date
Application number
PCT/US2024/038717
Other languages
French (fr)
Inventor
Brett BURDO
John Gill
Marie Helene TIXIER
Johan ROBIN
Sebastien Praud
Monika KLOIBER-MAITZ
Thomas PRESTERL
Original Assignee
Agreliant Genetics, Llc
Limagrain Europe S.A.
KWS SAAT SE & Co. KGaA
Priority date
Filing date
Publication date
Application filed by Agreliant Genetics, Llc, Limagrain Europe S.A., KWS SAAT SE & Co. KGaA

Classifications

    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H: NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H1/00: Processes for modifying genotypes; Plants characterised by associated natural traits
    • A01H1/04: Processes of selection involving genotypic or phenotypic markers; Methods of using phenotypic markers for selection
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H: NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H6/00: Angiosperms, i.e. flowering plants, characterised by their botanic taxonomy
    • A01H6/46: Gramineae or Poaceae, e.g. ryegrass, rice, wheat or maize
    • A01H6/4684: Zea mays [maize]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/13: Plant traits
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having an altered aleurone anthocyanin content, comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes.
  • An embodiment of the present disclosure provides isolated nucleic acid sequences for identifying and selecting plants of the genus Zea having an altered aleurone anthocyanin content.
  • the present disclosure provides one or more nucleic acid sequences for the identification of Zea plants having altered aleurone anthocyanin content wherein the one or more nucleic acid sequences share at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% identity to the anthocyanin-regulating genes identified and presented herein.
  • An embodiment of the present disclosure provides an R1 gene that is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39;
  • the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and
  • An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes, wherein the R1 gene is selected from the group consisting of SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is selected from the group consisting of SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66; or either or both of the R1 gene or the C1 gene comprises a sequence having at least 90%, 91%, 92%, 93%, 94%,
  • An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes, wherein said marker alleles comprise at least one allele being selected from M1, M2, M3, M4, M5, M6, M7, M8, M9, M10, M11, M12, M13, M14 and M15, wherein: M1 is a SNP which is guanine (G) at the position 139770766 referenced to the B73 reference genome AGPv4; M2 is a SNP which is thymine (T) at the position 139771365 referenced to the B73 reference genome AGPv4; M3 is a SNP which is thymine (T) at the position 139771617 referenced to the B73 reference genome AGPv4;
  • An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes, wherein said marker alleles comprise at least one allele being selected from M1’, M2’, M3’, M4’, M5’, M6’, M7’, M8’, M9’, M10’.
  • M1’ is a SNP which is cytosine (C) at the position 139770766 referenced to the B73 reference genome AGPv4
  • M2’ is a SNP which is cytosine (C) at the position 139771365 referenced to the B73 reference genome AGPv4
  • M3’ is a SNP which is cytosine (C) at the position 139771617 referenced to the B73 reference genome AGPv4
  • M4’ is a SNP which is guanine (G) at the position 139772481 referenced to the B73 reference genome AGPv4
  • M5’ is a SNP which is adenine (A) at the position 139772862 referenced to the B73 reference genome AGPv4
  • M6’ is a SNP which is adenine (A) at the position 139789400 referenced to the B73 reference genome AGPv4
  • M7’ is an indel
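The SNP marker alleles listed above can be read as a lookup from a B73 AGPv4 position to a diagnostic base. The following is a minimal sketch, not the applicants' method: it covers only a subset of the listed markers, the function name `detect_marker_alleles` and the input format (a mapping from position to the set of observed bases) are assumptions made for illustration, and the chromosome is left out because this excerpt does not state it.

```python
# Subset of the marker alleles listed in the text: name -> (AGPv4 position, base).
# The chromosome is not stated in this excerpt, so positions are used on their own.
MARKER_ALLELES = {
    "M1": (139770766, "G"),
    "M2": (139771365, "T"),
    "M3": (139771617, "T"),
    "M1'": (139770766, "C"),
    "M2'": (139771365, "C"),
    "M3'": (139771617, "C"),
    "M4'": (139772481, "G"),
    "M6'": (139789400, "A"),
}


def detect_marker_alleles(calls):
    """Return the sorted names of marker alleles present in `calls`.

    `calls` is a hypothetical input format: it maps an AGPv4 position to the
    set of bases observed there (one base for a homozygote, two for a
    heterozygote). Positions that were not genotyped are simply absent.
    """
    found = []
    for name, (position, base) in MARKER_ALLELES.items():
        if base in calls.get(position, set()):
            found.append(name)
    return sorted(found)


# Illustrative sample, not real data: heterozygous at the M2/M2' position.
sample = {139770766: {"G"}, 139771365: {"T", "C"}, 139772481: {"G"}}
print(detect_marker_alleles(sample))
```

Keeping the table as plain data makes it easy to extend with the remaining markers once their positions and alleles are known.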
  • An embodiment of the present disclosure provides a method for identifying a maize plant or plant part, the method comprising screening for the presence of one or more marker alleles selected from M1 to M15 or M1’ to M15’.
  • An embodiment of the present disclosure provides a maize plant, identified by screening for the presence of one or more marker alleles selected from M1 to M15 or M1’ to M15’.
  • An embodiment of the present disclosure provides a method for producing maize plants having altered aleurone anthocyanin content, comprising the steps of: identifying one or more maize lines comprising one or more molecular marker alleles identified by the wherein said marker alleles comprise at least one allele selected from the alleles M1 to M15; selfing the identified maize line with itself or crossing the identified maize line with a second identified maize line; and producing the maize plant.
  • An embodiment of the present disclosure provides a method for producing maize plants having no or substantially no smoky kernels, comprising the steps of: identifying one or more maize lines comprising one or more molecular marker alleles selected from the alleles M1 to M15; selfing the identified maize line with itself or crossing the identified maize line with a second identified maize line; and producing the maize plant.
  • An embodiment of the present disclosure provides a method for detecting anthocyanin-regulating genes in maize, comprising using one or more discriminant markers selected from the group consisting of SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; and SEQ ID NO: 55
  • An embodiment of the present disclosure provides an isolated polynucleic acid comprising a coding sequence selected from: SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; and SEQ ID NO: 55, or a sequence having at least 90%, 91%, 92%,
  • An embodiment of the present disclosure provides a method for screening for a maize plant with one or more anthocyanin-regulating genes, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes selected from the group consisting of an R1 gene and a C1 gene, wherein the R1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO
  • An embodiment of the present disclosure provides a method for screening for a maize plant with no or substantially no smoky kernels, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes selected from the group consisting of an R1 gene and a C1 gene, wherein the R1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO
  • An embodiment of the present disclosure provides a method for identification of maize plants having no or substantially no smoky kernels, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes, homologs of such genes, orthologs of such genes, paralogs of such genes and fragments and variations thereof.
  • the present disclosure provides nucleic acid sequences for identifying maize plants having no or substantially no smoky kernels, comprising one or more nucleic acid sequences that share at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% identity to the genes identified and presented herein.
  • a method as described herein comprises that said at least one plant cell, tissue, organ, plant, or seed is not obtained by an essentially biological process. Instead, said at least one plant cell, tissue, organ, plant, or seed is obtained by at least one step of artificial human intervention as such not occurring in nature and influencing the plant cell by modifying and/or introducing a step of technical nature influencing sexually crossing and selecting.
  • a step may include a step of genome editing, e.g., to exchange a base or nucleotide of interest, a chemical treatment, e.g.
  • the term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other.
  • a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence.
  • each of the expressions “at least one of A, B and C,” “at least one of A, B, or C,” “one or more of A, B, and C,” “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
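The seven readings spelled out in that definition are exactly the non-empty combinations of A, B and C. A small sketch that enumerates them (the function name `at_least_one_of` is chosen here for illustration only):

```python
from itertools import combinations


def at_least_one_of(items):
    """List every non-empty combination of `items`, smallest combinations first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


for combo in at_least_one_of(["A", "B", "C"]):
    print(" and ".join(combo))
```

For three items this yields three singles, three pairs and one triple, matching the enumeration in the text.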
  • “sometime” means at some indefinite or indeterminate point of time.
  • Figure 1(a) provides the association mapping results of the male germplasm testcrosses.
  • Figure 1(b) provides the association mapping results of the female germplasm testcrosses.
  • Figure 2 provides the sequence listings for SEQ ID NO: 1 to SEQ ID NO: 66 and all SEQ ID NOs in between.
  • Figure 3 provides an alignment of the sequences of B73_R1 (genotype carrying smoky alleles), PHG39 (known smoky donor) and the non_smokey2 allele producing the non-smoky genotype.
  • Figure 4 provides an alignment of multiple sequences of B73_R1 (genotype carrying smoky alleles), PHG39 (known smoky donor) and the non_smokey2 allele producing the non-smoky genotype, with exon information.
  • the verb “comprise”, as used in this description and in the claims, and its conjugations are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded.
  • the term “plant” refers to any living organism belonging to the kingdom Plantae (i.e., any genus/species in the Plant Kingdom).
  • the term “plant part” refers to any part of a plant including but not limited to the shoot, root, stem, seeds, fruits, stipules, leaves, petals, flowers, ovules, bracts, branches, petioles, internodes, bark, pubescence, tillers, rhizomes, fronds, blades, pollen, stamen, rootstock, scion and the like.
  • the two main parts of plants grown in some sort of media, such as soil, are often referred to as the “above-ground” part, also often referred to as the “shoots”, and the “below-ground” part, also often referred to as the “roots”.
  • the term “a” or “an” refers to one or more of that entity; for example, “a gene” refers to one or more genes or at least one gene.
  • the terms “a” (or “an”), “one or more” and “at least one” are used interchangeably herein.
  • the term “nucleic acid” refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides, or analogs thereof. This term refers to the primary structure of the molecule, and thus includes double- and single-stranded DNA, as well as double- and single-stranded RNA.
  • the terms “nucleic acid” and “nucleotide sequence” are used interchangeably.
  • as used herein, the terms “polypeptide,” “peptide,” and “protein” are used interchangeably to refer to polymers of amino acids of any length. These terms also include proteins that are post-translationally modified through reactions that include glycosylation, acetylation and phosphorylation.
  • the term “derived from” refers to the origin or source, and may include naturally occurring, recombinant, unpurified, or purified molecules.
  • a nucleic acid or an amino acid derived from an origin or source may have all kinds of nucleotide changes or protein modification as defined elsewhere herein.
  • the term “primer” as used herein refers to an oligonucleotide which is capable of annealing to the amplification target allowing a DNA polymerase to attach, thereby serving as a point of initiation of DNA synthesis when placed under conditions in which synthesis of primer extension product is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase and at a suitable temperature and pH.
  • the (amplification) primer is preferably single stranded for maximum efficiency in amplification.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact lengths of the primers will depend on many factors, including temperature and composition (A/T and G/C content) of the primer.
  • a pair of bi-directional primers consists of one forward and one reverse primer as commonly used in the art of DNA amplification such as in PCR amplification.
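To illustrate why primer composition matters, the sketch below computes the G/C fraction of a primer and a rough melting temperature using the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is commonly applied to short oligonucleotides. The rule is not taken from this document and is only a crude approximation; the primer sequence and the function names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature in degrees Celsius: 2*(A+T) + 4*(G+C).

    A rule of thumb for short oligonucleotides only; an assumption of this
    sketch, not a method described in the source text.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


primer = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer, not a sequence from the disclosure
print(gc_fraction(primer), wallace_tm(primer))
```

Two primers of equal length can therefore differ noticeably in annealing behaviour if their G/C content differs, which is one reason the exact length depends on composition.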
  • a probe comprises an identifiable, isolated nucleic acid that recognizes a target nucleic acid sequence.
  • a probe includes a nucleic acid that is attached to an addressable location, a detectable label or other reporter molecule and that hybridizes to a target sequence.
  • Typical labels include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, haptens, and enzymes.
  • Methods for labelling and guidance in the choice of labels appropriate for various purposes are discussed, for example, in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989 and Ausubel et al., Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999. Methods for preparing and using nucleic acid probes and primers are described, for example, in Sambrook et al.
  • Amplification primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as PRIMER (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge, MA).
  • probes and primers can be selected that comprise at least 20, 25, 30, 35, 40, 45, 50 or more consecutive nucleotides of a target nucleotide sequence.
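Whether a candidate probe or primer comprises a given number of consecutive nucleotides of a target can be checked with a simple window comparison. This is a minimal sketch under stated assumptions: the function name `shares_consecutive_run` is hypothetical, only the given strand is compared (no reverse complement), and the sequences in the example are made up.

```python
def shares_consecutive_run(probe, target, min_len=20):
    """True if `probe` contains at least `min_len` consecutive nucleotides of `target`.

    Only the strand as written is compared; reverse-complement matching is
    deliberately left out of this sketch.
    """
    probe, target = probe.upper(), target.upper()
    if min_len <= 0 or min_len > len(probe) or min_len > len(target):
        return False
    # Every window of length min_len in the probe.
    windows = {probe[i:i + min_len] for i in range(len(probe) - min_len + 1)}
    # Any window of the same length in the target that also occurs in the probe.
    return any(
        target[j:j + min_len] in windows
        for j in range(len(target) - min_len + 1)
    )


# Short made-up sequences with a small threshold, for illustration only.
print(shares_consecutive_run("GGACGTACCC", "TTACGTACGG", min_len=6))
```

A run of `min_len` or more shared nucleotides always contains a shared window of exactly `min_len`, so checking fixed-length windows is sufficient.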
  • oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any organism of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual (3rd ed., Cold Spring Harbor Laboratory Press, Plainview, New York).
  • PCR Protocols A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York).
  • Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially mismatched primers, and the like. Methods of alignment of sequences for comparison are well known in the art.
  • the degree of sequence identity may vary, but in one embodiment, is at least 50% (when using standard sequence alignment programs known in the art), at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least 98.5%, or at least about 99%, or at least 99.5%, or at least 99.8%, or at least 99.9%.
  • Homology can be determined using software programs readily available in the art, such as those discussed in Current Protocols in Molecular Biology (F.M.
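The identity thresholds quoted throughout (for example "at least 90%") presuppose some way of computing percent identity from an alignment. The sketch below shows one simple convention, assuming the two sequences have already been aligned by an external program and padded with "-" for gaps. The function name and the choice of denominator (all aligned columns except those that are gaps in both sequences) are assumptions; alignment programs differ in how they define the denominator, and the source does not say which convention applies.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned, equal-length sequences.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Made-up aligned fragments: 9 of 10 columns match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

A sequence would meet an "at least 90%" threshold under this convention when the returned value is 90.0 or higher.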
  • the term “offspring” refers to any plant resulting as progeny from a vegetative or sexual reproduction from one or more parent plants or descendants thereof.
  • an offspring plant may be obtained by cloning or selfing of a parent plant or by crossing two parent plants, and includes selfings as well as the F1 or F2 or still further generations.
  • An F1 is a first-generation offspring produced from parents at least one of which is used for the first time as donor of a trait, while offspring of second generation (F2) or subsequent generations (F3, F4, etc.) are specimens produced from selfings of F1's, F2's etc.
  • An F1 may thus be (and usually is) a hybrid resulting from a cross between two true breeding parents (true-breeding is homozygous for a trait), while an F2 may be (and usually is) an offspring resulting from self-pollination of said F1 hybrids.
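The F1 and F2 generations described above can be illustrated for a single locus by enumerating the gametes of each parent. This is a sketch under simple Mendelian assumptions (one locus, two alleles per diploid parent, equal gamete frequencies); the function name `cross` and the allele labels "A" and "a" are placeholders.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Expected genotype frequencies among offspring at one locus.

    Each parent is a two-character string such as "AA" or "Aa"; every
    gamete combination is treated as equally likely.
    """
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent1, parent2)
    )
    total = sum(offspring.values())
    return {genotype: count / total for genotype, count in offspring.items()}


f1 = cross("AA", "aa")   # two true-breeding parents give a uniform hybrid F1
f2 = cross("Aa", "Aa")   # selfing the F1 hybrid gives a segregating F2
print(f1)
print(f2)
```

Selfing the heterozygous F1 gives the familiar 1:2:1 genotype ratio in the F2, which is why marker screening is typically applied to segregating generations.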
  • the term “cross” refers to the process by which the pollen of one flower on one plant is applied (artificially or naturally) to the ovule (stigma) of a flower on another plant.
  • the term “cultivar” refers to a variety, strain or race of plant that has been produced by horticultural or agronomic techniques and is not normally found in wild populations.
  • the terms “dicotyledon” and “dicot” refer to a flowering plant having an embryo containing two seed halves or cotyledons.
  • the term “gene” refers to any segment of DNA associated with a biological function.
  • genes include, but are not limited to, coding sequences and/or the regulatory sequences required for their expression. Genes can also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins.
  • Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
  • the term “genotype” refers to the genetic makeup of an individual cell, cell culture, tissue, organism (e.g., a plant), or group of organisms.
  • the term “hemizygous” refers to a cell, tissue or organism in which a gene is present only once in a genotype, as a gene in a haploid cell or organism, a sex-linked gene in the heterogametic sex, or a gene in a segment of chromosome in a diploid cell or organism where its partner segment has been deleted.
  • the term “heterozygote” refers to a diploid or polyploid individual cell or plant having different alleles (forms of a given gene) present at least at one locus.
  • the term “heterozygous” refers to the presence of different alleles (forms of a given gene) at a particular gene locus.
  • the terms “homolog” or “homologue” refer to a nucleic acid or peptide sequence which has a common origin and functions similarly to a nucleic acid or peptide sequence from another species.
  • the term “homozygote” refers to an individual cell or plant having the same alleles at one or more loci.
  • the term “homozygous” refers to the presence of identical alleles at one or more loci in homologous chromosomal segments.
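The hemizygous, heterozygous and homozygous definitions above can be summarised as a small classification over the alleles observed at one locus. A minimal sketch; the function name `zygosity` and the list-of-alleles input are assumptions for illustration.

```python
def zygosity(alleles):
    """Classify a locus from the list of alleles observed in one individual.

    One copy  -> "hemizygous"
    All equal -> "homozygous"
    Otherwise -> "heterozygous"
    """
    if not alleles:
        raise ValueError("at least one allele is required")
    if len(alleles) == 1:
        return "hemizygous"
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


print(zygosity(["G", "G"]))
print(zygosity(["G", "C"]))
print(zygosity(["G"]))
```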
  • the term “hybrid” refers to any individual cell, tissue or plant resulting from a cross between parents that differ in one or more genes.
  • the term “inbred” or “inbred line” refers to a relatively true-breeding strain.
  • the term “single allele converted plant” refers to those plants which are developed by a plant breeding technique called backcrossing wherein essentially all of the desired morphological and physiological characteristics of an inbred are recovered in addition to the single allele transferred into the inbred via the backcrossing technique.
  • the term “line” is used broadly to include, but is not limited to, a group of plants vegetatively propagated from a single parent plant via tissue culture techniques, or a group of inbred plants which are genetically very similar due to descent from a common parent(s).
  • a plant is said to “belong” to a particular line if it (a) is a primary transformant (T0) plant regenerated from material of that line; (b) has a pedigree comprised of a T0 plant of that line; or (c) is genetically very similar due to common ancestry (e.g., via inbreeding or selfing).
  • T0: primary transformant
  • the term “pedigree” denotes the lineage of a plant, e.g. in terms of the sexual crosses effected such that a gene or a combination of genes, in heterozygous (hemizygous) or homozygous condition, imparts a desired trait to the plant.
  • the term “late plant greenness” refers to a visual assessment given at around the dent stage but typically a few weeks before harvest to characterize the degree of greenness left in the leaves. Plants are rated from 1 (poorest) to 9 (best) with poorer scores given for plants that have more non-green leaf tissue typically due to early senescence or from disease.
  • the term “locus” refers to a defined segment of DNA. This segment is often associated with an allele position on a chromosome.
  • MN RM: Minnesota Relative Maturity Rating
  • the term “moisture” refers to the actual percentage moisture of the grain at harvest.
  • the term “introgression” refers to the process whereby genes of one species, variety or cultivar are moved into the genome of another species, variety or cultivar, by crossing those species.
  • the crossing may be natural or artificial.
  • the process may optionally be completed by backcrossing to the recurrent parent, in which case introgression refers to infiltration of the genes of one species into the gene pool of another through repeated backcrossing of an interspecific hybrid with one of its parents.
  • An introgression may also be described as a heterologous genetic material stably integrated in the genome of a recipient plant.
  • the term “population” means a genetically homogeneous or heterogeneous collection of plants sharing a common genetic derivation.
  • the term “variety” or “cultivar” means a group of similar plants that by structural features and performance can be identified from other varieties within the same species.
  • the term “variety” as used herein has identical meaning to the corresponding definition in the International Convention for the Protection of New Varieties of Plants (UPOV treaty), of Dec. 2, 1961, as Revised at Geneva on Nov. 10, 1972, on Oct. 23, 1978, and on Mar. 19, 1991.
  • “variety” means a plant grouping within a single botanical taxon of the lowest known rank, which grouping, irrespective of whether the conditions for the grant of a breeder's right are fully met, can be i) defined by the expression of the characteristics resulting from a given genotype or combination of genotypes, ii) distinguished from any other plant grouping by the expression of at least one of the said characteristics and iii) considered as a unit with regard to its suitability for being propagated unchanged.
  • the term “allele(s)” means any of one or more alternative forms of a gene, all of which alleles relate to at least one trait or characteristic.
  • the two alleles of a given gene occupy corresponding loci on a pair of homologous chromosomes.
  • QTLs: genomic regions that may comprise one or more genes or regulatory sequences
  • haplotype: an allele of a chromosomal segment
  • the term “allele” should be understood to comprise the term “haplotype”.
  • Alleles are considered identical when they express a similar phenotype. Differences in sequence are possible but not important as long as they do not influence phenotype.
  • the term “mass selection” refers to a form of selection in which individual plants are selected and the next generation propagated from the aggregate of their seeds. More details of mass selection are described herein in the specification.
  • the terms “monocotyledon” and “monocot” refer to any of a subclass (Monocotyledoneae) of flowering plants having an embryo containing only one seed leaf and usually having parallel-veined leaves, flower parts in multiples of three, and no secondary growth in stems and roots.
  • the term “open pollination” refers to a plant population that is freely exposed to some gene flow, as opposed to a closed one in which there is an effective barrier to gene flow.
  • the terms “open-pollinated population” and “open-pollinated variety” refer to plants normally capable of at least some cross-fertilization, selected to a standard, that may show variation but that also have one or more genotypic or phenotypic characteristics by which the population or the variety can be differentiated from others.
  • a hybrid that has no barriers to cross-pollination is an open-pollinated population or an open-pollinated variety.
  • the term “ovule” refers to the female gametophyte.
  • the term “pollen” means the male gametophyte.
  • the term “phenotype” refers to the observable characters of an individual cell, cell culture, organism (e.g., a plant), or group of organisms which results from the interaction between that individual's genetic makeup (i.e., genotype) and the environment.
  • the term “plant tissue” refers to any part of a plant.
  • plant organs include, but are not limited to the leaf, stem, root, tuber, seed, branch, pubescence, nodule, leaf axil, flower, pollen, stamen, pistil, petal, peduncle, stalk, stigma, style, bract, fruit, trunk, carpel, sepal, anther, ovule, pedicel, needle, cone, rhizome, stolon, shoot, pericarp, endosperm, placenta, berry, stamen, and leaf sheath.
  • the terms “self-crossing”, “self-pollinated” and “self-pollination” mean that the pollen of one flower on one plant is applied (artificially or naturally) to the ovule (stigma) of the same or a different flower on the same plant.
  • the terms “Quantitative Trait Loci” and “QTL” are used herein in their art-recognized meaning.
  • a QTL may for instance comprise one or more genes of which the products confer the genetic resistance.
  • a QTL may for instance comprise regulatory genes or sequences of which the products influence the expression of genes on other loci in the genome of the plant thereby conferring the resistance.
  • the QTLs of the present invention may be defined by indicating their genetic location in the genome of the respective pathogen-resistant accession using one or more molecular genomic markers.
  • One or more markers indicate a specific locus. Distances between loci are usually measured by frequency of crossing-over between loci on the same chromosome. The farther apart two loci are, the more likely that a crossover will occur between them. Conversely, if two loci are close together, a crossover is less likely to occur between them. As a rule, one centimorgan (cM) is equal to 1% recombination between loci (markers).
  • the term “regeneration” refers to the development of a plant from tissue culture.
  • the term “single locus converted (conversion)” refers to plants which are developed by a plant breeding technique called backcrossing wherein essentially all of the desired morphological and physiological characteristics of a variety are recovered in addition to the single locus transferred into the variety via the backcrossing technique or via genetic engineering.
  • a single locus converted plant can also refer to a plant obtained through mutagenesis taught in the present disclosure or through the use of some new breeding techniques.
  • the single locus converted plant has essentially all of the desired morphological and physiological characteristics of the original variety in addition to a single locus converted by spontaneous and/or artificially induced mutations, which is introduced and/or transferred into the plant by the plant breeding techniques such as backcrossing.
  • the single locus converted plant has essentially all of the desired morphological and physiological characteristics of the original variety in addition to a single locus, gene or nucleotide sequence(s) converted, mutated, modified or engineered through the New Breeding Techniques taught herein.
  • single locus converted (conversion) can be interchangeably referred to as single gene converted (conversion).
  • transgene refers to any nucleotide sequence used in the transformation of a plant (e.g., maize), animal, or other organism.
  • a transgene can be a coding sequence, a non-coding sequence, a cDNA, a gene or fragment or portion thereof, a genomic sequence, a regulatory element and the like.
  • transgenic refers to an organism, such as a transgenic plant, into which a transgene has been delivered or introduced and the transgene can be expressed in the transgenic organism to produce a product, the presence of which can impart an effect and/or a phenotype in the organism.
  • Variety and Cultivar refer to a group of similar plants that by structural or genetic features and/or performance can be distinguished from other varieties within the same species.
  • a plant variety as used by one skilled in the art of plant breeding means a plant grouping within a single botanical taxon of the lowest known rank which can be defined by the expression of the characteristics resulting from a given genotype or combination of phenotypes, distinguished from any other plant grouping by the expression of at least one of the said characteristics and considered as a unit with regard to its suitability for being propagated unchanged (International Convention for the Protection of New Varieties of Plants).
  • Yield (Bushels/Acre) refers to the actual yield of the grain at harvest adjusted to 15.5% moisture.
  • the term “molecular marker” or “genetic marker” refers to an indicator that is used in methods for visualizing differences in characteristics of nucleic acid sequences.
  • indicators are restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), insertion mutations, microsatellite markers (SSRs), sequence-characterized amplified regions (SCARs), cleaved amplified polymorphic sequence (CAPS) markers or isozyme markers or combinations of the markers described herein which define a specific genetic and chromosomal location.
  • RFLP restriction fragment length polymorphism
  • AFLP amplified fragment length polymorphism
  • SNPs single nucleotide polymorphisms
  • SSRs simple sequence repeats (microsatellite markers)
  • SCARs sequence-characterized amplified regions
  • CAPS cleaved amplified polymorphic sequence
  • the present disclosure provides systems and methods for screening for and identifying one or more plants of the genus Zea having an altered aleurone anthocyanin content or substantially no smoky kernels. As will be discussed in further detail herein, the present disclosure provides methods for producing maize plants having altered aleurone anthocyanin content or substantially no smoky kernels as well as methods for detecting anthocyanin-regulating genes in maize.
  • Anthocyanin coloration of maize kernels is typically an undesired phenotype in maize breeding because occurrence of colored or speckled kernels (also referred to as smoky kernels) reduces the salability of maize kernels. Therefore, farmers prefer maize varieties or maize hybrids with solely yellow kernels. However, even after decades of maize breeding the phenomenon of colored or speckled kernels is still there, and its appearance during a breeding program is difficult to predict. The reason for that is because the mechanism of the anthocyanin coloration of maize kernels is complex and not fully understood.
  • Aleurone color in maize is determined by a group of regulatory and structural genes in the anthocyanin biosynthetic pathway. Regulatory genes identified so far, are the b1/r1 gene family, which codes for myc-like transcriptional activators (Ludwig, Steven R., et al.
  • Lc a member of the maize R gene family responsible for tissue-specific anthocyanin production, encodes a protein similar to transcriptional activators and contains the myc-homology region. Proceedings of the National Academy of Sciences 86.18 (1989): 7092-7096.; Radicella, J. Pablo, et al. "Allelic diversity of the maize B regulatory gene: different leader and promoter sequences of two B alleles determine distinct tissue specificities of anthocyanin production.” Genes & development 6.11 (1992): 2152-2164.), and the c1/pl1 gene family, which codes for myb-like transcriptional activators (Cone, Karen C., et al.
  • Molecular analysis of the c1 locus indicates that functional alleles code for a myb-related transcriptional activator that has a DNA binding domain and a transcription activating domain (Paz-Ares, Javier, et al. “The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators.” The EMBO Journal 6.12 (1987): 3553-3558; Paz-Ares, Javier, Debabrota Ghosal, and Heinz Saedler. “Molecular analysis of the C1-I allele from Zea mays: a dominant mutant of the regulatory C1 locus.” The EMBO Journal 9.2 (1990): 315-321.).
  • haplotype rather than allele is used in reference to variations at the r1 locus because it is a complex locus composed of one to several genes that affect anthocyanin pigmentation of various plant parts (Panavas, Tadas, Jessica Weir, and Elsbeth L. Walker. "The structure and paramutagenicity of the R-marbled haplotype of Zea mays.” Genetics 153.2 (1999): 979-991.). While there is anecdotal information as to the existence of r1 haplotype-specific inhibitors of aleurone color in maize, only very few have been characterized in detail (Stinard, P. S., and Martin M. Sachs.
  • an embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels through the use of QTLs to identify the presence of alleles linked to anthocyanin-regulating genes.
  • the QTLs related to alleles linked to anthocyanin-regulating genes may be discovered through QTL mapping.
  • Inheritance of quantitative traits or polygenic inheritance refers to the inheritance of a phenotypic characteristic that varies in degree and can be attributed to the interactions between two or more genes and their environment.
  • quantitative trait loci are stretches of DNA that are closely linked to the genes that underlie the trait in question.
  • QTLs can be molecularly identified to help map regions of the genome that contain genes involved in specifying a quantitative trait. This can be an early step in identifying and sequencing these genes.
  • QTLs underlie continuous traits (those traits that vary continuously, e.g. level of resistance to pathogen) as opposed to discrete traits (traits that have two or several character values, e.g. smooth vs.
  • a QTL is a region of DNA that is associated with a particular phenotypic trait—these QTLs are often found on different chromosomes. Knowing the number of QTLs that explains variation in a particular phenotypic trait informs about the genetic architecture of the trait. It may tell that plant resistance to a specific pathogen is controlled by many genes of small effect, or by a few genes of large effect. [0095] Another use of QTLs is to identify candidate genes underlying a trait.
  • QTL mapping is the statistical study of the alleles that occur in a locus and the phenotypes (physical forms or traits) that they produce (see, Meksem and Kahl, The handbook of plant genome mapping: genetic and physical mapping, 2005, Wiley-VCH, ISBN 3527311165, 9783527311163). Because most traits of interest are governed by more than one gene, defining and studying the entire locus of genes related to a trait gives hope of understanding what effect the genotype of an individual might have in the real world. [0098] Statistical analysis is required to demonstrate that different genes interact with one another and to determine whether they produce a significant effect on the phenotype.
  • QTLs identify a particular region of the genome as containing a gene that is associated with the trait being assayed or measured. They are shown as intervals across a chromosome, where the probability of association is plotted for each marker used in the mapping experiment.
  • a marker is an identifiable region of variable DNA.
  • Biologists are interested in understanding the genetic basis of phenotypes (physical traits). The aim is to find a marker that is significantly more likely to co-occur with the trait than expected by chance, that is, a marker that has a statistical association with the trait. Ideally, they would be able to find the specific gene or genes in question, but this is a long and difficult undertaking.
  • Another interest of statistical geneticists using QTL mapping is to determine the complexity of the genetic architecture underlying a phenotypic trait. For example, they may be interested in knowing whether a phenotype is shaped by many independent loci, or by a few loci, and do those loci interact. This can provide information on how the phenotype may be evolving.
  • Molecular markers are used for the visualization of differences in nucleic acid sequences. This visualization is possible due to DNA-DNA hybridization techniques (RFLP) and/or due to techniques using the polymerase chain reaction (e.g. STS, microsatellites, AFLP). All differences between two parental genotypes will segregate in a mapping population based on the cross of these parental genotypes.
  • the segregation of the different markers may be compared and recombination frequencies can be calculated.
  • the recombination frequencies of molecular markers on different chromosomes are generally 50%. Between molecular markers located on the same chromosome the recombination frequency depends on the distance between the markers. A low recombination frequency corresponds to a low distance between markers on a chromosome. Comparing all recombination frequencies will result in the most logical order of the molecular markers on the chromosomes. This most logical order can be depicted in a linkage map (Paterson, 1996).
  • a group of adjacent or contiguous markers on the linkage map that is associated to a reduced disease incidence and/or a reduced lesion growth rate pinpoints the position of a QTL.
  • the nucleic acid sequence of a QTL may be determined by methods known to the skilled person. For instance, a nucleic acid sequence comprising said QTL may be isolated from a donor maize plant having no or substantially no smoky kernels by fragmenting the genome of said plant and selecting those fragments harboring one or more markers indicative of said QTL.
  • the marker sequences (or parts thereof) indicative of said QTL may be used as (PCR) amplification primers, in order to amplify a nucleic acid sequence comprising said QTL from a genomic nucleic acid sample or a genome fragment obtained from said plant.
  • the amplified sequence may then be purified in order to obtain the isolated QTL.
  • the nucleotide sequence of the QTL, and/or of any additional markers comprised therein, may then be obtained by standard sequencing methods.
  • One or more such QTLs associated with maize plants having no or substantially no smoky kernels can be transferred to a recipient plant so that it produces no or substantially no smoky kernels.
  • an advanced backcross QTL analysis is used to discover the nucleotide sequence or the QTLs responsible for the expression of no or substantially no smoky kernels in a plant.
  • AB-QTL advanced backcross QTL analysis
  • Such method was proposed by Tanksley and Nelson in 1996 (Tanksley and Nelson, 1996, Advanced backcross QTL analysis: a method for simultaneous discovery and transfer of valuable QTL from un-adapted germplasm into elite breeding lines.
  • Theor Appl Genet 92:191-203) as a new breeding method that integrates the process of QTL discovery with variety development, by simultaneously identifying and transferring useful QTL alleles from un-adapted (e.g., land races, wild species) to elite germplasm, thus broadening the genetic diversity available for breeding.
  • NILs near isogenic lines
  • ILs introgression lines
  • BILs backcross inbred lines
  • BCRIL backcross recombinant inbred lines
  • RCSLs recombinant chromosome substitution lines
  • CSSLs chromosome segment substitution lines
  • STAIRSs stepped aligned inbred recombinant strains
  • An introgression line in plant molecular biology is a line of a crop species that contains genetic material derived from a similar species.
  • ILs represent NILs with relatively large average introgression length
  • BILs and BCRILs are backcross populations generally containing multiple donor introgressions per line.
  • introgression lines or ILs refer to plant lines containing a single marker-defined homozygous donor segment
  • pre-ILs refers to lines which still contain multiple homozygous and/or heterozygous donor segments.
  • a genetic infrastructure of exotic libraries can be developed. Such an exotic library comprises of a set of introgression lines, each of which has a single, possibly homozygous, marker-defined chromosomal segment that originates from a donor exotic parent, in an otherwise homogenous elite genetic background, so that the entire donor genome would be represented in a set of introgression lines.
  • a collection of such introgression lines is referred to as a library of introgression lines or IL library (ILL).
  • the lines of an ILL usually cover the complete genome of the donor, or the part of interest. Introgression lines allow the study of quantitative trait loci, but also the creation of new varieties by introducing exotic traits. High resolution mapping of QTL using ILLs enables breeders to assess whether the effect on the phenotype is due to a single QTL or to several tightly linked QTL affecting the same trait. In addition, sub-ILs can be developed to discover molecular markers which are more tightly linked to the QTL of interest, which can be used for marker-assisted breeding (MAB).
  • MAB marker-assisted breeding
  • the present disclosure provides molecular markers that are linked to maize plants with no to substantially no smoky kernels. These molecular markers and their defining primers are described herein.
  • the term “linked” refers to the situation wherein the molecular marker and at least one of the QTLs and/or agronomic QTLs of the present invention are segregating together over one or more generations.
  • the molecular markers of the present invention are linked to at least one trait loci of the present invention.
  • the molecular marker can be any kind of marker described herein.
  • the molecular markers of the present invention are closely linked to at least one trait loci of the present invention.
  • the phrase “closely linked” or “tightly linked” refers to the situation wherein the genetic distance between the molecular marker and at least one of the QTLs and/or agronomic QTLs is less than 2 centimorgan (cM).
  • the genetic distance between the marker and the QTL is about 2.0 cM, about 1.9 cM, about 1.8 cM, about 1.7 cM, about 1.6 cM, about 1.5 cM, about 1.4 cM, about 1.3 cM, about 1.2 cM, about 1.1 cM, about 1.0 cM, about 0.9 cM, about 0.8 cM, about 0.7 cM, about 0.6 cM, about 0.5 cM, about 0.4 cM, about 0.3 cM, about 0.2 cM, about 0.1 cM, or less than 0.1 cM. [00110] Molecular markers have proven to be of great value for increasing the speed and efficiency of plant breeding.
  • Map distance is simply a function of recombination frequency between two markers, genes, QTLs, genes and QTLS, or markers and genes or QTLs. Consequently, if a marker and a gene or a QTL map too far apart, too much recombination will occur during a series of crosses or self-pollinations such that the marker becomes no longer associated with the gene or the QTL.
  • More molecular markers can be developed by using the plants having an altered aleurone anthocyanin content that express no to substantially no smoky kernels of the present invention.
  • the marker and the gene are more closely localized to each other, and more likely to be inherited simultaneously; thus such markers are more useful.
  • Methods of developing molecular markers are well known to one of ordinary skill in the art.
  • the markers can be bi-allelic dominant, bi-allelic co-dominant, and/or multi-allelic co-dominant.
  • RFLPs restriction fragment length polymorphisms
  • ASH allele specific hybridization
  • amplified variable sequences of plant genome self-sustained sequence replication
  • simple sequence repeat SSR
  • single base-pair change single nucleotide polymorphism, SNP
  • random amplification of polymorphic DNA RAPDs
  • SSCPs single stranded conformation polymorphisms
  • amplified fragment length polymorphisms AFLPs
  • AFLPs amplified fragment length polymorphisms
  • microsatellites. RAPD methods generally refer to methods of detecting DNA polymorphisms using differences in the length of DNAs amplified using appropriate primers.
  • AFLP methods are essentially a combination of the above RFLP and RAPD methods, and refer to methods of selectively amplifying DNA restriction fragments using PCR to detect differences in their length, or their presence or absence.
  • Methods of developing molecular markers and their applications are described by Avise (Molecular markers, natural history, and evolution, Publisher: Sinauer Associates, 2004, ISBN 0878930418, 9780878930418), Srivastava et al. (Plant biotechnology and molecular markers, Publisher: Springer, 2004, ISBN1402019114, 9781402019111), and Vienne (Molecular markers in plant genetics and biotechnology, Publisher: Science Publishers, 2003), each of which is incorporated by reference in its entirety.
  • Detection of AFLP fragments is commonly carried out by electrophoresis on slab-gels (Vos et al., AFLP: a new technique for DNA fingerprinting, Nucleic Acids Res. 1995 Nov. 11; 23(21): 4407-4414) or capillary electrophoresis (van der Meulen et al., 2002).
  • the majority of AFLP markers scored in this way represent polymorphisms occurring either in the restriction enzyme recognition sites used for AFLP template preparation or their flanking nucleotides covered by selective AFLP primers.
  • the remainder of the AFLP markers are insertion/deletion polymorphisms occurring in the internal sequences of the restriction fragments and a very small fraction of single nucleotide substitutions occurring in small restriction fragments (less than approximately 100 bp), which for these fragments cause reproducible mobility variations between both alleles which can be observed upon electrophoresis; these AFLP markers can be scored co-dominantly without having to rely on band intensities.
  • Methods of developing AFLP markers are described in EP 534858, U.S. Pat. No. 6,045,994, WO2007114693 and Vos et al., each of which is hereby incorporated by reference in its entirety.
  • the molecular markers of the present invention are genetically linked to the QTLs associated with maize plants having an altered aleurone anthocyanin content that expresses no to substantially no smoky kernels. It should be understood that these molecular markers merely indicate nucleic acid sequence polymorphisms between the genome of a maize plant having said QTLs and the genome of a maize plant not having said QTLs. The polymorphisms can be detected by PCR amplification, or any other suitable methods well known to one skilled in the art.
  • Genomic selection also known as genome wide selection (GWS) is a form of MAS that estimates all locus, haplotype, and/or marker effects across the entire genome to calculate genomic estimated breeding values (GEBVs). See Nakaya and Isobe, Will genomic selection be a practical method for plant breeding?
  • GS utilizes a training phase and a breeding phase. In the training phase, genotypes and phenotypes are analyzed in a subset of a population to generate a GS prediction model that incorporates significant relationships between phenotypes and genotypes.
  • a GS training population must be representative of selection candidates in the breeding program to which GS will be applied. In the breeding phase, genotype data are obtained in a breeding population, then favorable individuals are selected based on GEBVs obtained using the GS prediction model generated during the training phase, without the need for phenotypic data.
  • Larger training populations typically increase the accuracy of GEBV predictions. Increasing the training population to breeding population ratio is helpful for obtaining accurate GEBVs when working with populations having high genetic diversity, small breeding populations, low heritability of traits, or large numbers of QTLs.
  • the number of markers required for GS modeling is determined based on the rate of LD decay across the genome, which must be calculated for each specific population to which GS will be applied.
  • GS comprises at least one marker in LD with each QTL, but in practical terms one of ordinary skill in the art would recognize that this is not necessary.
  • GEBVs are the sum of the estimate of genetic deviation and the weighted sum of estimates of breed effects, which are predicted using phenotypic data.
  • commonly used statistical models for prediction of GEBVs include best linear unbiased prediction (Henderson, Best linear unbiased estimation and prediction under a selection model.
  • compositions and methods of the present disclosure can be utilized for GS or breeding corn varieties with a desired complement (set) of allelic forms of chromosome intervals associated with superior agronomic performance (e.g., no smoky kernels).
  • a corn plant, seed, or cell provided herein can be selected using genomic selection.
  • a genomic selection method provided herein comprises phenotyping a population of corn plants for no smoky kernels.
  • a genomic selection method provided herein comprises genotyping a population of corn plants, seeds, or cells with at least one of marker loci SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 36; SEQ ID NO: 37; SEQ ID NO: 38; SEQ ID NO: 39; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO
  • One known smoky donor (line PHG39) and two smoky and two non-smoky lines, forming pairs within the same DH (doubled haploid) population, were sequenced by Illumina short read technology. The genomes were assembled to scaffold level, covering the gene space very well.
  • the proprietary dataset is composed of publicly sequenced genomes PH207 (SEQ ID NO:66), PHG29 (SEQ ID NO:65), (SEQ ID NO:64), MBS847 (SEQ ID NO:62), B73 (SEQ ID NO:56), OH43 (SEQ ID NO:63) complemented with sequences from C1 alleles reported in the literature (C1 (SEQ ID NO:57), C1-m (SEQ ID NO:58), C1-p (SEQ ID NO:61) and C1-n (SEQ ID NO:59 and SEQ ID NO:60)).
  • the smoky phenotype corresponds to a functional C1 allele.
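The two-phase genomic selection workflow described above (a training phase that fits a prediction model, then a breeding phase that scores candidates from genotype data alone) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: plain ridge regression stands in for the prediction model, marker genotypes are assumed to be coded -1/0/1, and the function names, penalty value, and phenotype scores are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def train_marker_effects(genotypes, phenotypes, ridge=1.0):
    """Training phase: ridge regression of centered phenotypes on marker codes.

    genotypes: one list of -1/0/1 marker codes per plant (assumed coding).
    phenotypes: one hypothetical trait score per plant.
    ridge: hypothetical shrinkage penalty; it also keeps the system solvable.
    """
    n_markers = len(genotypes[0])
    mean_y = sum(phenotypes) / len(phenotypes)
    y = [p - mean_y for p in phenotypes]
    # Normal equations: (X'X + ridge * I) effects = X'y
    xtx = [[sum(row[i] * row[j] for row in genotypes) + (ridge if i == j else 0.0)
            for j in range(n_markers)] for i in range(n_markers)]
    xty = [sum(row[i] * y[k] for k, row in enumerate(genotypes))
           for i in range(n_markers)]
    return mean_y, solve_linear(xtx, xty)


def gebv(genotype, mean_y, effects):
    """Breeding phase: estimated breeding value from genotype data only."""
    return mean_y + sum(g * e for g, e in zip(genotype, effects))


# Hypothetical training population: 4 plants, 2 markers.
train_genotypes = [[1, -1], [1, 1], [-1, 1], [-1, -1]]
train_phenotypes = [12.0, 12.0, 8.0, 8.0]
mean_y, effects = train_marker_effects(train_genotypes, train_phenotypes)

# Hypothetical selection candidates, ranked by GEBV without phenotyping them.
candidates = {"cand_A": [1, 1], "cand_B": [-1, 1]}
ranking = sorted(candidates, key=lambda k: gebv(candidates[k], mean_y, effects), reverse=True)
```

The penalty shrinks every marker effect toward zero, which is what lets such a model use more markers than plants; in practice the penalty and the statistical model would be chosen for the specific population.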


Abstract

Methods for identifying a plant of the genus Zea having an altered aleurone anthocyanin content or substantially no smoky kernels are provided herein. The present disclosure also provides methods for producing maize plants having altered aleurone anthocyanin content or substantially no smoky kernels. Also provided herein are methods for detecting anthocyanin-regulating genes in maize.

Description

METHOD OF DETECTING MAIZE PLANTS WITH ALTERED ALEURONE
ANTHOCYANIN CONTENT AND ASSOCIATED MARKER ALLELES AND
HAPLOTYPES
CROSS REFERENCE TO RELATED MATTER
[0001] The present application claims priority to U.S. Application No. 63/514,701, as filed on July 20, 2023. The entire contents of both applications are incorporated herein by reference for all purposes.
[0002] All references, articles, publications, patents, patent publications, and patent applications cited herein within the above text and/or cited below are incorporated by reference in their entireties for all purposes. However, mention of any reference, article, publication, patent, patent publication, and patent application cited herein is not, and should not be taken as, acknowledgment or any form of suggestion that they constitute valid prior art or form part of the common general knowledge in any country in the world.
BACKGROUND
[0003] The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification.
DESCRIPTION OF THE TEXT FILE SUBMITTED ELECTRONICALLY
[0004] The Sequence Listing associated with this application is provided in an xml format in lieu of a paper copy. The contents of the xml file submitted electronically herewith are incorporated herein by reference in their entirety: A computer readable format copy of the Sequence Listing (filename: AGRT38WOUlSequencelisting.xml, with a size of 153 kb and dated: July 19, 2024).
SUMMARY
[0005] It is to be understood that the embodiments include a variety of different versions or embodiments, and this Summary is not meant to be limiting or all-inclusive. This Summary provides some general descriptions of some of the embodiments, but may also include some more specific descriptions of other embodiments.
[0006] An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having an altered aleurone anthocyanin content, comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes.
[0007] An embodiment of the present disclosure provides isolated nucleic acid sequences for identifying and selecting plants of the genus Zea having an altered aleurone anthocyanin content. In one embodiment, the present disclosure provides one or more nucleic acid sequences for the identification of Zea plants having altered aleurone anthocyanin content wherein the one or more nucleic acid sequences share at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% identity to the anthocyanin-regulating genes identified and presented herein.
[0008] An embodiment of the present disclosure provides an R1 gene that is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66.
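The percent-identity thresholds recited above can be illustrated with a short sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts every alignment column except gap-versus-gap columns in the denominator; the alignment method and the identity convention are not specified in this passage, so both are assumptions, as are the function names and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pairwise alignment (hypothetical convention).

    Both inputs must be pre-aligned strings of equal length using '-' for gaps.
    Columns where both sequences have a gap are ignored; a gap opposite a base
    counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-column alignment with one mismatch in the last column.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")
```

Different tools normalize identity differently (for example by the shorter sequence length or by aligned columns only), so a threshold such as "at least 90%" is only meaningful once the convention is fixed.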
[0009] An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes, wherein the R1 gene is selected from the group consisting of SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is selected from the group consisting of SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65 and SEQ ID NO: 66; or either or both of the R1 gene or the C1 gene comprises a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to the foregoing respective sequences.
[0010] An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes, wherein said marker alleles comprise at least one allele being selected from M1, M2, M3, M4, M5, M6, M7, M8, M9, M10, M11, M12, M13, M14 and M15, wherein: M1 is a SNP which is guanine (G) at the position 139770766 referenced to the B73 reference genome AGPv4; M2 is a SNP which is thymine (T) at the position 139771365 referenced to the B73 reference genome AGPv4; M3 is a SNP which is thymine (T) at the position 139771617 referenced to the B73 reference genome AGPv4; M4 is a SNP which is adenine (A) at the position 139772481 referenced to the B73 reference genome AGPv4; M5 is a SNP which is guanine (G) at the position 139772862 referenced to the B73 reference genome AGPv4; M6 is a SNP which is guanine (G) at the position 139789400 referenced to the B73 reference genome AGPv4; M7 is an indel which is insertion of nucleotides at the position 139781170 referenced to the B73 reference genome AGPv4; M8 is an indel which is insertion of nucleotides at the position 8983319 referenced to the B73 reference genome AGPv4; M9 is a SNP which is thymine (T) at the position 8983500 referenced to the B73 reference genome AGPv4; M10 is a SNP which is thymine (T) at the position 8983838 referenced to the B73 reference genome AGPv4; M11 is a SNP which is guanine (G) at the position 8983897 referenced to the B73 reference genome AGPv4; M12 is an indel which is insertion of nucleotides at the position 8984653 referenced to the B73 reference genome AGPv4; M13 is a SNP which is guanine (G) at the position 8983862 referenced to the B73 reference genome AGPv4; M14 is a SNP which is thymine (T) or cytosine (C) at the position 8983983 referenced to the B73 reference genome AGPv4; and M15 is a SNP which is cytosine (C) at the position 8984469 referenced to the B73 reference genome AGPv4.
[0011] An embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one, two, three or more marker alleles linked to anthocyanin-regulating genes selected from the group consisting of R1 and C1 genes, wherein said marker alleles comprise at least one allele being selected from M1’, M2’, M3’, M4’, M5’, M6’, M7’, M8’, M9’, M10’, M11’, M12’, M13’, M14’ and M15’, wherein: M1’ is a SNP which is cytosine (C) at the position 139770766 referenced to the B73 reference genome AGPv4; M2’ is a SNP which is cytosine (C) at the position 139771365 referenced to the B73 reference genome AGPv4; M3’ is a SNP which is cytosine (C) at the position 139771617 referenced to the B73 reference genome AGPv4; M4’ is a SNP which is guanine (G) at the position 139772481 referenced to the B73 reference genome AGPv4; M5’ is a SNP which is adenine (A) at the position 139772862 referenced to the B73 reference genome AGPv4; M6’ is a SNP which is adenine (A) at the position 139789400 referenced to the B73 reference genome AGPv4; M7’ is an indel which is deletion of nucleotides at the position 139781170 referenced to the B73 reference genome AGPv4; M8’ is an indel which is deletion of nucleotides at the position 8983319 referenced to the B73 reference genome AGPv4; M9’ is a SNP which is cytosine (C) at the position 8983500 referenced to the B73 reference genome AGPv4; M10’ is a SNP which is guanine (G) at the position 8983838 referenced to the B73 reference genome AGPv4; M11’ is a SNP which is adenine (A) at the position 8983897 referenced to the B73 reference genome AGPv4; M12’ is an indel which is deletion of nucleotides at the position 8984653 referenced to the B73 reference genome AGPv4; M13’ is a SNP which is guanine (G) or adenine (A) at the position 8983862 referenced to the B73 reference genome AGPv4; M14’ is a SNP which is thymine (T) at the position 8983983 referenced to the B73 reference genome AGPv4; and M15’ is a SNP which is cytosine (C) or adenine (A) at the position 8984469 referenced to the B73 reference genome AGPv4.
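As an illustration only, the marker definitions in the two preceding paragraphs can be held in a small lookup table and used to interpret an observed call at one position. The sketch below transcribes the AGPv4 positions and allele states as listed in the text; the `Marker` type, the `classify` function, and the labels "ambiguous" and "unlisted" are hypothetical names introduced here. Note that, as listed, the allele states for M13/M13’, M14/M14’ and M15/M15’ overlap, so some calls at those positions cannot be assigned to one set, which the sketch reports rather than resolves.

```python
from typing import NamedTuple


class Marker(NamedTuple):
    position: int          # position referenced to B73 AGPv4, as given in the text
    kind: str              # "SNP" or "indel"
    allele: frozenset      # state(s) listed for the unprimed allele (e.g. M1)
    alt_allele: frozenset  # state(s) listed for the primed allele (e.g. M1')


def _m(position, kind, allele, alt_allele):
    return Marker(position, kind, frozenset(allele), frozenset(alt_allele))


MARKERS = {
    "M1": _m(139770766, "SNP", ["G"], ["C"]),
    "M2": _m(139771365, "SNP", ["T"], ["C"]),
    "M3": _m(139771617, "SNP", ["T"], ["C"]),
    "M4": _m(139772481, "SNP", ["A"], ["G"]),
    "M5": _m(139772862, "SNP", ["G"], ["A"]),
    "M6": _m(139789400, "SNP", ["G"], ["A"]),
    "M7": _m(139781170, "indel", ["insertion"], ["deletion"]),
    "M8": _m(8983319, "indel", ["insertion"], ["deletion"]),
    "M9": _m(8983500, "SNP", ["T"], ["C"]),
    "M10": _m(8983838, "SNP", ["T"], ["G"]),
    "M11": _m(8983897, "SNP", ["G"], ["A"]),
    "M12": _m(8984653, "indel", ["insertion"], ["deletion"]),
    "M13": _m(8983862, "SNP", ["G"], ["G", "A"]),
    "M14": _m(8983983, "SNP", ["T", "C"], ["T"]),
    "M15": _m(8984469, "SNP", ["C"], ["C", "A"]),
}


def classify(name: str, observed: str) -> str:
    """Label an observed call as the unprimed allele, the primed allele,
    'ambiguous' (listed for both), or 'unlisted' (listed for neither)."""
    marker = MARKERS[name]
    in_allele = observed in marker.allele
    in_alt = observed in marker.alt_allele
    if in_allele and in_alt:
        return "ambiguous"
    if in_allele:
        return name
    if in_alt:
        return name + "'"
    return "unlisted"
```

For example, `classify("M1", "G")` returns `"M1"`, while `classify("M13", "G")` returns `"ambiguous"` because guanine is listed for both M13 and M13’.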
[0012] An embodiment of the present disclosure provides a method for identifying a maize plant or plant part, the method comprising screening for the presence of one or more marker alleles selected from M1 to M15 or M1’ to M15’.
[0013] An embodiment of the present disclosure provides a maize plant, identified by screening for the presence of one or more marker alleles selected from M1 to M15 or M1’ to M15’.
[0014] An embodiment of the present disclosure provides a method for producing maize plants having altered aleurone anthocyanin content, comprising the steps of: identifying one or more maize lines comprising one or more molecular marker alleles, wherein said marker alleles comprise at least one allele selected from the alleles M1 to M15; selfing the identified maize line or crossing the identified maize line with a second identified maize line; and producing the maize plant.
[0015] An embodiment of the present disclosure provides a method for producing maize plants having no or substantially no smoky kernels, comprising the steps of: identifying one or more maize lines comprising one or more molecular marker alleles selected from the alleles M1 to M15; selfing the identified maize line or crossing the identified maize line with a second identified maize line; and producing the maize plant.
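The identify-then-self-or-cross steps of the two production methods above can be sketched as a simple filter over genotyping results. This is a hypothetical illustration, not the applicants' workflow: the line names, the `calls` mapping of each line to its detected marker alleles, and the `min_alleles` parameter are assumptions made for the example.

```python
from itertools import combinations

# Unprimed marker alleles M1 to M15, as named in the text
TARGET_ALLELES = {f"M{i}" for i in range(1, 16)}


def identify_lines(calls: dict, min_alleles: int = 1) -> list:
    """Return the lines carrying at least min_alleles of the alleles M1 to M15."""
    return sorted(
        line for line, alleles in calls.items()
        if len(TARGET_ALLELES & set(alleles)) >= min_alleles
    )


def breeding_options(identified: list) -> list:
    """List each identified line selfed, then each pairwise cross of identified lines."""
    selfs = [(line, line) for line in identified]
    crosses = list(combinations(identified, 2))
    return selfs + crosses


# Hypothetical genotyping results for three lines
calls = {
    "line_A": {"M1", "M9"},
    "line_B": {"M2'"},   # carries only a primed allele, so it is not identified
    "line_C": {"M15"},
}
identified = identify_lines(calls)
options = breeding_options(identified)
```

With the hypothetical data above, `identified` contains `line_A` and `line_C`, and `options` lists both selfings followed by the single cross between them.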
[0016] An embodiment of the present disclosure provides a method for detecting anthocyanin-regulating genes in maize, comprising using one or more discriminant markers selected from the group consisting of SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; and SEQ ID NO: 55, or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity with any of the foregoing sequences.
[0017] An embodiment of the present disclosure provides an isolated polynucleic acid comprising a coding sequence selected from: SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; and SEQ ID NO: 55, or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity with any of the foregoing sequences.
[0018] An embodiment of the present disclosure provides a method for screening for a maize plant with one or more anthocyanin-regulating genes, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes selected from the group consisting of an R1 gene and a C1 gene, wherein the R1 gene is a nucleic acid sequence selected from the group consisting of nucleic acid sequences having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39, and the C1 gene is a nucleic acid sequence selected from the group consisting of nucleic acid sequences having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65 and SEQ ID NO: 66; and selecting a maize plant having said one or more anthocyanin-regulating genes.
[0019] An embodiment of the present disclosure provides a method for screening for a maize plant with no or substantially no smoky kernels, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes selected from the group consisting of an R1 gene and a C1 gene, wherein the R1 gene is a nucleic acid sequence selected from the group consisting of nucleic acid sequences having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39, and the C1 gene is a nucleic acid sequence selected from the group consisting of nucleic acid sequences having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 65 and SEQ ID NO: 66; and selecting a maize plant having said one or more anthocyanin-regulating genes.
[0020] An embodiment of the present disclosure provides a method for identification of maize plants having no or substantially no smoky kernels, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes, homologs of such genes, orthologs of such genes, paralogs of such genes and fragments and variations thereof.
[0021] In an embodiment, the present disclosure provides nucleic acid sequences for identifying maize plants having no or substantially no smoky kernels, comprising one or more nucleic acid sequences that share at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% identity to the genes identified and presented herein.
[0022] Preferably, in one embodiment of a method as described herein, said at least one plant cell, tissue, organ, plant, or seed is not obtained by an essentially biological process. Instead, said at least one plant cell, tissue, organ, plant, or seed is obtained by at least one step of artificial human intervention that as such does not occur in nature and that influences the plant cell by modifying and/or introducing a step of technical nature influencing sexual crossing and selecting. Such a step may include a step of genome editing, e.g., to exchange a base or nucleotide of interest, a chemical treatment, e.g., for chromosome doubling an agent or gene or gene product including chromosome elimination, the introduction of an exogenous gene or genetic material into a plant genome (nuclear, mitochondrial or plastid genome) and the like, or any combination thereof.
[0023] Various components are referred to herein as “operably linked”, “linked” or “operably associated.” As used herein, “operably linked”, “linked” or “operably associated” refers to nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence.
[0024] As used herein, “at least one,” “one or more,” and “and/or” are open-ended expressions that are both conjunctive and disjunctive in operation. For example, each of the expressions “at least one of A, B and C,” “at least one of A, B, or C,” “one or more of A, B, and C,” “one or more of A, B, or C” and “A, B, and/or C” means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
[0025] As used herein, “sometime” means at some indefinite or indeterminate point of time. So, for example, as used herein, “sometime after” means following, whether immediately following or at some indefinite or indeterminate point of time following the prior act.
[0026] Various embodiments of the present invention are set forth in the Detailed Description as provided herein and as embodied by the claims. It should be understood, however, that this Summary does not contain all of the aspects and embodiments of the present invention, is not meant to be limiting or restrictive in any manner, and that the invention(s) as disclosed herein is/are understood by those of ordinary skill in the art to encompass obvious improvements and modifications thereto.
[0027] Additional advantages of the present invention will become readily apparent from the following discussion, particularly when taken together with the accompanying drawings and sequence listings.
BRIEF DESCRIPTION OF THE FIGURES
[0028] The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate some, but not the only or exclusive, example embodiments and/or features. It is intended that the embodiments and figures disclosed herein are to be considered illustrative rather than limiting.
[0029] Figure 1(a) provides the association mapping results of the male germplasm testcrosses.
[0030] Figure 1(b) provides the association mapping results of the female germplasm testcrosses.
[0031] Figure 2 provides the sequence listings for SEQ ID NO: 1 to SEQ ID NO: 66 and all SEQ ID NOs in between.
[0032] Figure 3 provides an alignment of the sequences of B73_R1 (genotype carrying smoky alleles), PHG39 (known smoky donor) and the non_smokey2 allele producing the non-smoky genotype.
[0033] Figure 4 provides an alignment of multiple sequences of B73_R1 (genotype carrying smoky alleles), PHG39 (known smoky donor) and the non_smokey2 allele producing the non-smoky genotype, with exon information.
DEFINITIONS
[0034] As used herein, the verb “comprise” and its conjugations, as used in this description and in the claims, are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded.
[0035] As used herein, the term “plant” refers to any living organism belonging to the kingdom Plantae (i.e., any genus/species in the Plant Kingdom).
[0036] As used herein, the term “plant part” refers to any part of a plant including but not limited to the shoot, root, stem, seeds, fruits, stipules, leaves, petals, flowers, ovules, bracts, branches, petioles, internodes, bark, pubescence, tillers, rhizomes, fronds, blades, pollen, stamen, rootstock, scion and the like. The two main parts of plants grown in some sort of media, such as soil, are often referred to as the “above-ground” part, also often referred to as the “shoots”, and the “below-ground” part, also often referred to as the “roots”.
[0037] The term “a” or “an” refers to one or more of that entity; for example, “a gene” refers to one or more genes or at least one gene. As such, the terms “a” (or “an”), “one or more” and “at least one” are used interchangeably herein. In addition, reference to “an element” by the indefinite article “a” or “an” does not exclude the possibility that more than one of the elements are present, unless the context clearly requires that there is one and only one of the elements.
[0038] As used herein, the term “nucleic acid” refers to a polymeric form of nucleotides of any length, either ribonucleotides or deoxyribonucleotides, or analogs thereof. This term refers to the primary structure of the molecule, and thus includes double- and single-stranded DNA, as well as double- and single-stranded RNA. It also includes modified nucleic acids such as methylated and/or capped nucleic acids, nucleic acids containing modified bases, backbone modifications, and the like. The terms “nucleic acid” and “nucleotide sequence” are used interchangeably. [0039] As used herein, the terms “polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to polymers of amino acids of any length. These terms also include proteins that are post-translationally modified through reactions that include glycosylation, acetylation and phosphorylation. [0040] As used herein, the term “derived from” refers to the origin or source, and may include naturally occurring, recombinant, unpurified, or purified molecules. A nucleic acid or an amino acid derived from an origin or source may have all kinds of nucleotide changes or protein modification as defined elsewhere herein. [0041] The term “primer” as used herein refers to an oligonucleotide which is capable of annealing to the amplification target allowing a DNA polymerase to attach, thereby serving as a point of initiation of DNA synthesis when placed under conditions in which synthesis of primer extension product is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase and at a suitable temperature and pH. The (amplification) primer is preferably single stranded for maximum efficiency in amplification. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. 
The exact lengths of the primers will depend on many factors, including temperature and composition (A/T and G/C content) of the primer. A pair of bi-directional primers consists of one forward and one reverse primer as commonly used in the art of DNA amplification such as in PCR amplification.
[0042] A probe comprises an identifiable, isolated nucleic acid that recognizes a target nucleic acid sequence. A probe includes a nucleic acid that is attached to an addressable location, a detectable label or other reporter molecule and that hybridizes to a target sequence. Typical labels include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, haptens, and enzymes. Methods for labelling and guidance in the choice of labels appropriate for various purposes are discussed, for example, in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989 and Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999.
[0043] Methods for preparing and using nucleic acid probes and primers are described, for example, in Sambrook et al. (ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999; and Innis et al. PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990. Amplification primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as PRIMER (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge, MA). One of ordinary skill in the art will appreciate that the specificity of a particular probe or primer increases with its length.
Thus, in order to obtain greater specificity, probes and primers can be selected that comprise at least 20, 25, 30, 35, 40, 45, 50 or more consecutive nucleotides of a target nucleotide sequence.
[0044] For PCR amplifications of the polynucleotides disclosed herein, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any organism of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (2001) Molecular Cloning: A Laboratory Manual (3rd ed., Cold Spring Harbor Laboratory Press, Plainview, New York). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially mismatched primers, and the like.
[0045] Methods of alignment of sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith and Waterman (Adv. Appl. Math., 2:482, 1981); Needleman and Wunsch (J. Mol. Biol., 48:443, 1970); Pearson and Lipman (Proc. Natl. Acad. Sci., 85:2444, 1988); Higgins and Sharp (Gene, 73:237-44, 1988); Higgins and Sharp (CABIOS, 5:151-53, 1989); Corpet et al. (Nuc. Acids Res., 16:10881-90, 1988); Huang et al. (Comp. Appls Biosci., 8:155-65, 1992); and Pearson et al. (Meth. Mol. Biol., 24:307-31, 1994). Altschul et al. (Nature Genet., 6:119-29, 1994) presents a detailed consideration of sequence alignment methods and homology calculations.
[0001] “Homologous sequences” or “homologs” or “orthologs” are thought, believed, or known to be functionally related.
A functional relationship may be indicated in any one of several ways, including, but not limited to: (a) degree of sequence identity and/or (b) the same or similar biological function. Preferably, both (a) and (b) are indicated. The degree of sequence identity may vary, but in one embodiment, is at least 50% (when using standard sequence alignment programs known in the art), at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least 98.5%, or at least about 99%, or at least 99.5%, or at least 99.8%, or at least 99.9%. Homology can be determined using software programs readily available in the art, such as those discussed in Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987) Supplement 30, section 7.718, Table 7.71. Some alignment programs are MacVector (Oxford Molecular Ltd, Oxford, U.K.) and ALIGN Plus (Scientific and Educational Software, Pennsylvania). Other non-limiting alignment programs include Sequencher (Gene Codes, Ann Arbor, Michigan), AlignX, and Vector NTI (Invitrogen, Carlsbad, CA).
[0046] As used herein, the term “offspring” refers to any plant resulting as progeny from a vegetative or sexual reproduction from one or more parent plants or descendants thereof. For instance, an offspring plant may be obtained by cloning or selfing of a parent plant or by crossing two parent plants, and offspring include selfings as well as the F1 or F2 or still further generations. An F1 is a first-generation offspring produced from parents at least one of which is used for the first time as donor of a trait, while offspring of second generation (F2) or subsequent generations (F3, F4, etc.) are specimens produced from selfings of F1's, F2's etc.
An F1 may thus be (and usually is) a hybrid resulting from a cross between two true-breeding parents (true-breeding means homozygous for a trait), while an F2 may be (and usually is) an offspring resulting from self-pollination of said F1 hybrids.
[0047] As used herein, the terms “cross”, “crossing”, “cross pollination” and “cross-breeding” refer to the process by which the pollen of one flower on one plant is applied (artificially or naturally) to the ovule (stigma) of a flower on another plant.
[0048] As used herein, the term “cultivar” refers to a variety, strain or race of plant that has been produced by horticultural or agronomic techniques and is not normally found in wild populations.
[0049] As used herein, the terms “dicotyledon” and “dicot” refer to a flowering plant having an embryo containing two seed halves or cotyledons. Examples include tobacco; tomato; the legumes, including peas, alfalfa, clover and soybeans; oaks; maples; roses; mints; squashes; daisies; walnuts; cacti; violets and buttercups.
[0050] As used herein, the term “chromosome” is used interchangeably with the phrase “linkage group”.
[0051] As used herein, the term “gene” refers to any segment of DNA associated with a biological function. Thus, genes include, but are not limited to, coding sequences and/or the regulatory sequences required for their expression. Genes can also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
[0052] As used herein, the term “genotype” refers to the genetic makeup of an individual cell, cell culture, tissue, organism (e.g., a plant), or group of organisms.
[0053] As used herein, the term “hemizygous” refers to a cell, tissue or organism in which a gene is present only once in a genotype, as a gene in a haploid cell or organism, a sex-linked gene in the heterogametic sex, or a gene in a segment of chromosome in a diploid cell or organism where its partner segment has been deleted. [0054] As used herein, the term “heterozygote” refers to a diploid or polyploid individual cell or plant having different alleles (forms of a given gene) present at least at one locus. As used herein, the term “heterozygous” refers to the presence of different alleles (forms of a given gene) at a particular gene locus. [0055] As used herein, the terms “homolog” or “homologue” refer to a nucleic acid or peptide sequence which has a common origin and functions similarly to a nucleic acid or peptide sequence from another species. [0056] As used herein, the term “homozygote” refers to an individual cell or plant having the same alleles at one or more loci. [0057] As used herein, the term “homozygous” refers to the presence of identical alleles at one or more loci in homologous chromosomal segments. [0058] As used herein, the term “hybrid” refers to any individual cell, tissue or plant resulting from a cross between parents that differ in one or more genes. [0059] As used herein, the term “inbred” or “inbred line” refers to a relatively true-breeding strain. [0060] The term “single allele converted plant” as used herein refers to those plants which are developed by a plant breeding technique called backcrossing wherein essentially all of the desired morphological and physiological characteristics of an inbred are recovered in addition to the single allele transferred into the inbred via the backcrossing technique. 
[0061] As used herein, the term “line” is used broadly to include, but is not limited to, a group of plants vegetatively propagated from a single parent plant via tissue culture techniques, or a group of inbred plants which are genetically very similar due to descent from a common parent(s). A plant is said to “belong” to a particular line if it (a) is a primary transformant (T0) plant regenerated from material of that line; (b) has a pedigree comprised of a T0 plant of that line; or (c) is genetically very similar due to common ancestry (e.g., via inbreeding or selfing). In this context, the term “pedigree” denotes the lineage of a plant, e.g. in terms of the sexual crosses effected such that a gene or a combination of genes, in heterozygous (hemizygous) or homozygous condition, imparts a desired trait to the plant.
[0062] As used herein the term “late plant greenness” refers to a visual assessment given at around the dent stage but typically a few weeks before harvest to characterize the degree of greenness left in the leaves. Plants are rated from 1 (poorest) to 9 (best) with poorer scores given for plants that have more non-green leaf tissue typically due to early senescence or from disease.
[0063] As used herein the term “locus” refers to a defined segment of DNA. This segment is often associated with an allele position on a chromosome.
[0064] As used herein the term “MN RM” refers to the Minnesota Relative Maturity Rating (MN RM) for the hybrid and is based on the harvest moisture of the grain relative to a standard set of checks of previously determined MN RM rating. Regression analysis is used to compute this rating.
[0065] As used herein the term “moisture” refers to the actual percentage moisture of the grain at harvest.
[0066] As used herein, the terms “introgression”, “introgressed” and “introgressing” refer to the process whereby genes of one species, variety or cultivar are moved into the genome of another species, variety or cultivar, by crossing those species. The crossing may be natural or artificial. The process may optionally be completed by backcrossing to the recurrent parent, in which case introgression refers to infiltration of the genes of one species into the gene pool of another through repeated backcrossing of an interspecific hybrid with one of its parents. An introgression may also be described as a heterologous genetic material stably integrated in the genome of a recipient plant. [0067] As used herein, the term “population” means a genetically homogeneous or heterogeneous collection of plants sharing a common genetic derivation. [0068] As used herein, the term “variety” or “cultivar” means a group of similar plants that by structural features and performance can be identified from other varieties within the same species. The term “variety” as used herein has identical meaning to the corresponding definition in the International Convention for the Protection of New Varieties of Plants (UPOV treaty), of Dec.2, 1961, as Revised at Geneva on Nov.10, 1972, on Oct.23, 1978, and on Mar.19, 1991. Thus, “variety” means a plant grouping within a single botanical taxon of the lowest known rank, which grouping, irrespective of whether the conditions for the grant of a breeder's right are fully met, can be i) defined by the expression of the characteristics resulting from a given genotype or combination of genotypes, ii) distinguished from any other plant grouping by the expression of at least one of the said characteristics and iii) considered as a unit with regard to its suitability for being propagated unchanged. 
[0069] As used herein, the term “allele(s)” means any of one or more alternative forms of a gene, all of which alleles relate to at least one trait or characteristic. In a diploid cell, the two alleles of a given gene occupy corresponding loci on a pair of homologous chromosomes. Since the present invention relates to QTLs, i.e. genomic regions that may comprise one or more genes or regulatory sequences, it is in some instances more accurate to refer to “haplotype” (i.e. an allele of a chromosomal segment) instead of “allele”; however, in those instances, the term “allele” should be understood to comprise the term “haplotype”. Alleles are considered identical when they express a similar phenotype. Differences in sequence are possible but not important as long as they do not influence phenotype.
[0070] As used herein, the term “mass selection” refers to a form of selection in which individual plants are selected and the next generation propagated from the aggregate of their seeds. More details of mass selection are described herein in the specification.
[0071] As used herein, the terms “monocotyledon” and “monocot” refer to any of a subclass (Monocotyledoneae) of flowering plants having an embryo containing only one seed leaf and usually having parallel-veined leaves, flower parts in multiples of three, and no secondary growth in stems and roots. Examples include lilies; orchids; rice; corn; grasses, such as tall fescue, goat grass, and Kentucky bluegrass; grains, such as wheat, oats and barley; irises; onions and palms.
[0072] As used herein, the term “open pollination” refers to a plant population that is freely exposed to some gene flow, as opposed to a closed one in which there is an effective barrier to gene flow.
[0073] As used herein, the terms “open-pollinated population” or “open-pollinated variety” refer to plants normally capable of at least some cross-fertilization, selected to a standard, that may show variation but that also have one or more genotypic or phenotypic characteristics by which the population or the variety can be differentiated from others. A hybrid, which has no barriers to cross-pollination, is an open-pollinated population or an open-pollinated variety. [0074] As used herein when discussing plants, the term “ovule” refers to the female gametophyte, whereas the term “pollen” means the male gametophyte. [0075] As used herein, the term “phenotype” refers to the observable characters of an individual cell, cell culture, organism (e.g., a plant), or group of organisms which results from the interaction between that individual's genetic makeup (i.e., genotype) and the environment. [0076] As used herein, the term “plant tissue” refers to any part of a plant. Examples of plant organs include, but are not limited to the leaf, stem, root, tuber, seed, branch, pubescence, nodule, leaf axil, flower, pollen, stamen, pistil, petal, peduncle, stalk, stigma, style, bract, fruit, trunk, carpel, sepal, anther, ovule, pedicel, needle, cone, rhizome, stolon, shoot, pericarp, endosperm, placenta, berry, stamen, and leaf sheath. [0077] As used herein, the term “self-crossing”, “self pollinated” or “self-pollination” means the pollen of one flower on one plant is applied (artificially or naturally) to the ovule (stigma) of the same or a different flower on the same plant. [0078] As used herein, the terms “Quantitative Trait Loci” and “QTL” are used herein in their art-recognized meaning. A QTL may for instance comprise one or more genes of which the products confer the genetic resistance. 
Alternatively, a QTL may for instance comprise regulatory genes or sequences of which the products influence the expression of genes on other loci in the genome of the plant thereby conferring the resistance. The QTLs of the present invention may be defined by indicating their genetic location in the genome of the respective pathogen-resistant accession using one or more molecular genomic markers. One or more markers, in turn, indicate a specific locus. Distances between loci are usually measured by frequency of crossing-over between loci on the same chromosome. The farther apart two loci are, the more likely that a crossover will occur between them. Conversely, if two loci are close together, a crossover is less likely to occur between them. As a rule, one centimorgan (cM) is equal to 1% recombination between loci (markers). When a QTL can be indicated by multiple markers, the genetic distance between the end-point markers is indicative of the size of the QTL.
[0079] As used herein, the term “regeneration” refers to the development of a plant from tissue culture.
[0080] As used herein, the term “single locus converted (conversion)” refers to plants which are developed by a plant breeding technique called backcrossing wherein essentially all of the desired morphological and physiological characteristics of a variety are recovered in addition to the single locus transferred into the variety via the backcrossing technique or via genetic engineering. A single locus converted plant can also refer to a plant obtained through mutagenesis taught in the present disclosure or through the use of some new breeding techniques.
In some embodiments, the single locus converted plant has essentially all of the desired morphological and physiological characteristics of the original variety in addition to a single locus converted by spontaneous and/or artificially induced mutations, which is introduced and/or transferred into the plant by plant breeding techniques such as backcrossing. In other embodiments, the single locus converted plant has essentially all of the desired morphological and physiological characteristics of the original variety in addition to a single locus, gene or nucleotide sequence(s) converted, mutated, modified or engineered through the New Breeding Techniques taught herein. In the present disclosure, single locus converted (conversion) can be interchangeably referred to as single gene converted (conversion).
[0081] As used herein, the term “transgene” refers to any nucleotide sequence used in the transformation of a plant (e.g., maize), animal, or other organism. Thus, a transgene can be a coding sequence, a non-coding sequence, a cDNA, a gene or fragment or portion thereof, a genomic sequence, a regulatory element and the like.
[0082] As used herein, the term “transgenic” refers to an organism, such as a transgenic plant, into which a transgene has been delivered or introduced and the transgene can be expressed in the transgenic organism to produce a product, the presence of which can impart an effect and/or a phenotype in the organism. This includes the case where an inbred line has been converted to contain one or more transgenes by single locus conversion or by direct transformation.
[0083] As used herein, the terms “variety” and “cultivar”, which can be used interchangeably in the present disclosure, refer to a group of similar plants that by structural or genetic features and/or performance can be distinguished from other varieties within the same species.
A plant variety as used by one skilled in the art of plant breeding means a plant grouping within a single botanical taxon of the lowest known rank which can be defined by the expression of the characteristics resulting from a given genotype or combination of genotypes, distinguished from any other plant grouping by the expression of at least one of the said characteristics and considered as a unit with regard to its suitability for being propagated unchanged (International Convention for the Protection of New Varieties of Plants).
[0084] As used herein, the term “Yield (Bushels/Acre)” refers to the actual yield of the grain at harvest adjusted to 15.5% moisture.
[0085] As used herein, the term “molecular marker” or “genetic marker” refers to an indicator that is used in methods for visualizing differences in characteristics of nucleic acid sequences. Examples of such indicators are restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), insertion mutations, microsatellite markers (SSRs), sequence-characterized amplified regions (SCARs), cleaved amplified polymorphic sequence (CAPS) markers or isozyme markers, or combinations of the markers described herein which define a specific genetic and chromosomal location. Mapping of molecular markers in the vicinity of an allele is a procedure which can be performed quite easily by the average person skilled in molecular-biological techniques, which techniques are for instance described in Lefebvre and Chevre, 1995; Lorez and Wenzel, 2007; Srivastava and Narula, 2004; Meksem and Kahl, 2005; Phillips and Vasil, 2001. General information concerning AFLP technology can be found in Vos et al. (1995, AFLP: a new technique for DNA fingerprinting, Nucleic Acids Res. 1995 Nov. 11; 23(21): 4407-4414).
DETAILED DESCRIPTION OF THE DISCLOSURE
[0086] The present disclosure provides systems and methods for screening for and identifying one or more plants of the genus Zea having an altered aleurone anthocyanin content or substantially no smoky kernels. As will be discussed in further detail herein, the present disclosure provides methods for producing maize plants having altered aleurone anthocyanin content or substantially no smoky kernels as well as methods for detecting anthocyanin-regulating genes in maize.
[0087] Anthocyanin coloration of maize kernels is typically an undesired phenotype in maize breeding because the occurrence of colored or speckled kernels (also referred to as smoky kernels) reduces the salability of maize kernels. Therefore, farmers prefer maize varieties or maize hybrids with solely yellow kernels. However, even after decades of maize breeding the phenomenon of colored or speckled kernels persists, and its appearance during a breeding program is difficult to predict. The reason is that the mechanism of the anthocyanin coloration of maize kernels is complex and not fully understood.
[0088] Even more importantly, there are no suitable marker sets available today which would allow maize breeders to detect maize germplasm with a strong tendency for the occurrence of undesired aleurone coloration or to reliably select maize germplasm with a very limited tendency for the occurrence of undesired aleurone coloration.
[0089] Aleurone color in maize is determined by a group of regulatory and structural genes in the anthocyanin biosynthetic pathway. Regulatory genes identified so far are the b1/r1 gene family, which codes for myc-like transcriptional activators (Ludwig, Steven R., et al. "Lc, a member of the maize R gene family responsible for tissue-specific anthocyanin production, encodes a protein similar to transcriptional activators and contains the myc-homology region."
Proceedings of the National Academy of Sciences 86.18 (1989): 7092-7096.; Radicella, J. Pablo, et al. "Allelic diversity of the maize B regulatory gene: different leader and promoter sequences of two B alleles determine distinct tissue specificities of anthocyanin production." Genes & development 6.11 (1992): 2152-2164.), and the c1/pl1 gene family, which codes for myb-like transcriptional activators (Cone, Karen C., et al. "Maize anthocyanin regulatory gene pl is a duplicate of c1 that functions in the plant." The Plant Cell 5.12 (1993): 1795-1805.; Paz‐Ares, Javier, et al. "The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto‐oncogene products and with structural similarities to transcriptional activators." The EMBO journal 6.12 (1987): 3553-3558.). These genes appear to be capable of up-regulating all of the structural genes in the anthocyanin biosynthetic pathway (Bruce, Wesley, et al. "Expression profiling of the maize flavonoid pathway genes controlled by estradiol-inducible transcription factors CRC and P." The Plant Cell 12.1 (2000): 65-79.). [0090] Typically, b1 and pl1 control anthocyanin pigmentation of plant parts, while r1 and c1 control anthocyanin pigmentation in the aleurone and other seed parts. However, there are exceptions, and there is a certain amount of overlap in the tissue specificity of certain alleles. [0091] Even though there are several studies on inhibition of c1 and r1, a practical and efficient application in maize breeding is still missing. Molecular analysis of the c1 locus indicates that functional alleles code for a myb-related transcriptional activator that has a DNA binding domain and a transcription activating domain (Paz‐Ares, Javier, et al. "The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto‐oncogene products and with structural similarities to transcriptional activators." The EMBO journal 6.12 (1987): 3553-3558., Paz‐Ares, Javier, Debabrota Ghosal, and Heinz Saedler. 
"Molecular analysis of the C1‐I allele from Zea mays: a dominant mutant of the regulatory C1 locus." The EMBO Journal 9.2 (1990): 315-321.). Studies on mutant C1 alleles demonstrated that mutations in both the DNA binding domain and the transcription activating domain of the C1 protein product are necessary for the greatest amount of inhibition (Goff, Stephen A., Karen C. Cone, and Michael E. Fromm. "Identification of functional domains in the maize transcriptional activator C1: comparison of wild-type and dominant inhibitor proteins." Genes & development 5.2 (1991): 298-309.). The exact mechanism of inhibition is not known and specific c1 haplotypes have not been reported yet. Some modifiers have been reported to enhance anthocyanin pigmentation of vegetative parts in plants carrying specific r1 haplotypes. The term haplotype rather than allele is used in reference to variations at the r1 locus because it is a complex locus composed of one to several genes that affect anthocyanin pigmentation of various plant parts (Panavas, Tadas, Jessica Weir, and Elsbeth L. Walker. "The structure and paramutagenicity of the R-marbled haplotype of Zea mays." Genetics 153.2 (1999): 979-991.). While there is anecdotal information as to the existence of r1 haplotype-specific inhibitors of aleurone color in maize, only very few have been characterized in detail (Stinard, P. S., and Martin M. Sachs. "The identification and characterization of two dominant r1 haplotype-specific inhibitors of aleurone color in Zea mays." Journal of Heredity 93.6 (2002): 421-428.). Identification of QTLs linked to anthocyanin-regulating genes [0092] As discussed above, an embodiment of the present disclosure provides a method for identifying a plant of the genus Zea having no or substantially no smoky kernels through the use of QTLs to identify the presence of alleles linked to anthocyanin-regulating genes. 
The QTLs related to alleles linked to anthocyanin-regulating genes may be discovered through QTL mapping. Inheritance of quantitative traits or polygenic inheritance refers to the inheritance of a phenotypic characteristic that varies in degree and can be attributed to the interactions between two or more genes and their environment. Though not necessarily genes themselves, quantitative trait loci (QTLs) are stretches of DNA that are closely linked to the genes that underlie the trait in question. QTLs can be molecularly identified to help map regions of the genome that contain genes involved in specifying a quantitative trait. This can be an early step in identifying and sequencing these genes. [0093] Typically, QTLs underlie continuous traits (those traits that vary continuously, e.g. level of resistance to pathogen) as opposed to discrete traits (traits that have two or several character values, e.g. smooth vs. wrinkled peas used by Mendel in his experiments). Moreover, a single phenotypic trait is usually determined by many genes. Consequently, many QTLs are associated with a single trait. [0094] A QTL is a region of DNA that is associated with a particular phenotypic trait—these QTLs are often found on different chromosomes. Knowing the number of QTLs that explains variation in a particular phenotypic trait informs about the genetic architecture of the trait. It may tell that plant resistance to a specific pathogen is controlled by many genes of small effect, or by a few genes of large effect. [0095] Another use of QTLs is to identify candidate genes underlying a trait. Once a region of DNA is identified as contributing to a phenotype, it can be sequenced. The DNA sequence of any genes in this region can then be compared to a database of DNA for genes whose function is already known. [0096] In a recent development, classical QTL analyses are combined with gene expression profiling i.e. by DNA microarrays. 
Such expression QTLs (e-QTLs) describes cis- and trans- controlling elements for the expression of often disease-associated genes. Observed epistatic effects have been found beneficial to identify the gene responsible by a cross-validation of genes within the interacting loci with metabolic pathway- and scientific literature databases. [0097] QTL mapping is the statistical study of the alleles that occur in a locus and the phenotypes (physical forms or traits) that they produce (see, Meksem and Kahl, The handbook of plant genome mapping: genetic and physical mapping, 2005, Wiley-VCH, ISBN 3527311165, 9783527311163). Because most traits of interest are governed by more than one gene, defining and studying the entire locus of genes related to a trait gives hope of understanding what effect the genotype of an individual might have in the real world. [0098] Statistical analysis is required to demonstrate that different genes interact with one another and to determine whether they produce a significant effect on the phenotype. QTLs identify a particular region of the genome as containing a gene that is associated with the trait being assayed or measured. They are shown as intervals across a chromosome, where the probability of association is plotted for each marker used in the mapping experiment. [0099] To begin, a set of genetic markers must be developed for the species in question. A marker is an identifiable region of variable DNA. Biologists are interested in understanding the genetic basis of phenotypes (physical traits). The aim is to find a marker that is significantly more likely to co-occur with the trait than expected by chance, that is, a marker that has a statistical association with the trait. Ideally, they would be able to find the specific gene or genes in question, but this is a long and difficult undertaking. Instead, they can more readily find regions of DNA that are very close to the genes in question. 
When a QTL is found, it is often not the actual gene underlying the phenotypic trait, but rather a region of DNA that is closely linked with the gene. [00100] For organisms whose genomes are known, one might now try to exclude genes in the identified region whose function is known with some certainty not to be connected with the trait in question. If the genome is not available, it may be an option to sequence the identified region and determine the putative functions of genes by their similarity to genes with known function, usually in other genomes. This can be done using BLAST, an online tool that allows users to enter a primary sequence and search for similar sequences within the BLAST database of genes from various organisms. [00101] Another interest of statistical geneticists using QTL mapping is to determine the complexity of the genetic architecture underlying a phenotypic trait. For example, they may be interested in knowing whether a phenotype is shaped by many independent loci, or by a few loci, and do those loci interact. This can provide information on how the phenotype may be evolving. [00102] Molecular markers are used for the visualization of differences in nucleic acid sequences. This visualization is possible due to DNA-DNA hybridization techniques (RFLP) and/or due to techniques using the polymerase chain reaction (e.g. STS, microsatellites, AFLP). All differences between two parental genotypes will segregate in a mapping population based on the cross of these parental genotypes. The segregation of the different markers may be compared and recombination frequencies can be calculated. The recombination frequencies of molecular markers on different chromosomes are generally 50%. Between molecular markers located on the same chromosome the recombination frequency depends on the distance between the markers. A low recombination frequency corresponds to a low distance between markers on a chromosome. 
Comparing all recombination frequencies will result in the most logical order of the molecular markers on the chromosomes. This most logical order can be depicted in a linkage map (Paterson, 1996). A group of adjacent or contiguous markers on the linkage map that is associated with a reduced disease incidence and/or a reduced lesion growth rate pinpoints the position of a QTL.
[00103] The nucleic acid sequence of a QTL may be determined by methods known to the skilled person. For instance, a nucleic acid sequence comprising said QTL may be isolated from a donor maize plant having no or substantially no smoky kernels by fragmenting the genome of said plant and selecting those fragments harboring one or more markers indicative of said QTL. Subsequently, or alternatively, the marker sequences (or parts thereof) indicative of said QTL may be used as (PCR) amplification primers, in order to amplify a nucleic acid sequence comprising said QTL from a genomic nucleic acid sample or a genome fragment obtained from said plant. The amplified sequence may then be purified in order to obtain the isolated QTL. The nucleotide sequence of the QTL, and/or of any additional markers comprised therein, may then be obtained by standard sequencing methods.
[00104] One or more such QTLs associated with maize plants having no or substantially no smoky kernels can be transferred to a recipient plant to make it produce no or substantially no smoky kernels.
[00105] In one embodiment, an advanced backcross QTL analysis (AB-QTL) is used to discover the nucleotide sequence or the QTLs responsible for the expression of no or substantially no smoky kernels in a plant. Such a method was proposed by Tanksley and Nelson in 1996 (Tanksley and Nelson, 1996, Advanced backcross QTL analysis: a method for simultaneous discovery and transfer of valuable QTL from un-adapted germplasm into elite breeding lines.
Theor Appl Genet 92:191-203) as a new breeding method that integrates the process of QTL discovery with variety development, by simultaneously identifying and transferring useful QTL alleles from un-adapted (e.g., land races, wild species) to elite germplasm, thus broadening the genetic diversity available for breeding. AB-QTL strategy was initially developed and tested in tomato, and has been adapted for use in other crops including rice, maize, wheat, pepper, barley, and bean. Once favorable QTL alleles are detected, only a few additional marker-assisted generations are required to generate near isogenic lines (NILs) or introgression lines (ILs) that can be field tested in order to confirm the QTL effect and subsequently used for variety development. [00106] Isogenic lines in which favorable QTL alleles have been fixed can be generated by systematic backcrossing and introgressing of marker-defined donor segments in the recurrent parent background. These isogenic lines are referred as near isogenic lines (NILs), introgression lines (ILs), backcross inbred lines (BILs), backcross recombinant inbred lines (BCRIL), recombinant chromosome substitution lines (RCSLs), chromosome segment substitution lines (CSSLs), and stepped aligned inbred recombinant strains (STAIRSs). An introgression line in plant molecular biology is a line of a crop species that contains genetic material derived from a similar species. ILs represent NILs with relatively large average introgression length, while BILs and BCRILs are backcross populations generally containing multiple donor introgressions per line. As used herein, the term “introgression lines or ILs” refers to plant lines containing a single marker defined homozygous donor segment, and the term “pre-ILs” refers to lines which still contain multiple homozygous and/or heterozygous donor segments. [00107] To enhance the rate of progress of introgression breeding, a genetic infrastructure of exotic libraries can be developed. 
Such an exotic library comprises a set of introgression lines, each of which has a single, possibly homozygous, marker-defined chromosomal segment that originates from a donor exotic parent, in an otherwise homogenous elite genetic background, so that the entire donor genome would be represented in a set of introgression lines. A collection of such introgression lines is referred to as a library of introgression lines or IL library (ILL). The lines of an ILL usually cover the complete genome of the donor, or the part of interest. Introgression lines allow the study of quantitative trait loci, but also the creation of new varieties by introducing exotic traits. High resolution mapping of QTL using ILLs enables breeders to assess whether the effect on the phenotype is due to a single QTL or to several tightly linked QTL affecting the same trait. In addition, sub-ILs can be developed to discover molecular markers which are more tightly linked to the QTL of interest, which can be used for marker-assisted breeding (MAB). Multiple introgression lines can be developed when the introgression of a single QTL is not sufficient to result in a substantial improvement in agriculturally important traits (Gur and Zamir, Unused natural variation can lift yield barriers in plant breeding, 2004, PLoS Biol.; 2(10):e245).
[00108] The present disclosure provides molecular markers that are linked to maize plants with no to substantially no smoky kernels. These molecular markers and their defining primers are described herein. As used herein, the term “linked” refers to the situation wherein the molecular marker and at least one of the QTLs and/or agronomic QTLs of the present invention are segregating together over one or more generations. In one embodiment, the molecular markers of the present invention are linked to at least one trait locus of the present invention. In one embodiment, the molecular marker can be any kind of marker described herein.
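As an illustration of how linkage between a marker and a QTL can be quantified, the following minimal sketch estimates a recombination frequency from progeny counts and converts it to a map distance in cM. The Haldane mapping function is an assumption chosen for illustration (the disclosure does not name a mapping function), and the function names and the default 2 cM threshold parameter are hypothetical helpers, not part of the disclosed method.

```python
import math


def recombination_frequency(n_recombinant: int, n_total: int) -> float:
    """Fraction of progeny that are recombinant between two loci."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_recombinant <= n_total:
        raise ValueError("n_recombinant must lie between 0 and n_total")
    return n_recombinant / n_total


def haldane_cm(r: float) -> float:
    """Map distance in cM under the Haldane function (assumed here).

    Valid for 0 <= r < 0.5; for small r the result is close to 100 * r,
    matching the rule of thumb that 1 cM is about 1% recombination.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def is_closely_linked(r: float, threshold_cm: float = 2.0) -> bool:
    """True when the estimated distance is below the chosen cM threshold."""
    return haldane_cm(r) < threshold_cm
```

For example, 3 recombinants among 200 progeny give a recombination frequency of 0.015, which the Haldane function maps to roughly 1.5 cM. For such short distances the choice of mapping function makes little practical difference.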
[00109] In one embodiment, the molecular markers of the present invention are closely linked to at least one trait locus of the present invention. As used herein, the phrase “closely linked” or “tightly linked” refers to the situation wherein the genetic distance between the molecular marker and at least one of the QTLs and/or agronomic QTLs is less than 2 centimorgans (cM). For example, the genetic distance between the marker and the QTL is about 2.0 cM, about 1.9 cM, about 1.8 cM, about 1.7 cM, about 1.6 cM, about 1.5 cM, about 1.4 cM, about 1.3 cM, about 1.2 cM, about 1.1 cM, about 1.0 cM, about 0.9 cM, about 0.8 cM, about 0.7 cM, about 0.6 cM, about 0.5 cM, about 0.4 cM, about 0.3 cM, about 0.2 cM, about 0.1 cM, or less than 0.1 cM.
[00110] Molecular markers have proven to be of great value for increasing the speed and efficiency of plant breeding. Most traits of agronomic value, e.g. pest resistance, yield and the like, are difficult to measure, often requiring a full growth season and statistical analysis of field trial results. Interpretation of the data can be obscured or confused by environmental variables. Occasionally it has been possible for breeders to make use of conventional markers such as flower color which could be readily followed through the breeding process. If the desired gene or QTL is linked closely enough to a conventional marker, the likelihood of recombination occurring between them is sufficiently low that the gene or the QTL and the marker co-segregate throughout a series of crosses. The marker becomes, in effect, a surrogate for the gene or the QTL itself. Prior to the advent of molecular markers, the opportunities for carrying out marker-linked breeding were severely limited by the lack of suitable markers mapping sufficiently close to the desired trait. Map distance is simply a function of recombination frequency between two markers, genes, QTLs, genes and QTLs, or markers and genes or QTLs.
Consequently, if a marker and a gene or a QTL map too far apart, too much recombination will occur during a series of crosses or self-pollinations such that the marker becomes no longer associated with the gene or the QTL. Having a wide selection of molecular markers available throughout the genetic map provides breeders the means to follow almost any desired trait through a series of crosses, by measuring the presence or absence of a marker linked to the gene or the QTL which affects that trait. The primary obstacle is the initial step of identifying a linkage between a marker and a gene or a QTL affecting the desired trait.
[00111] More molecular markers can be developed by using the plants having an altered aleurone anthocyanin content that express no to substantially no smoky kernels of the present invention. In general, as the map distance (expressed in cM) between a molecular marker and a gene of interest becomes shorter, the marker and the gene are more closely localized to each other, and more likely to be inherited together; thus such markers are more useful. Methods of developing molecular markers are well known to one of ordinary skill in the art. The markers can be bi-allelic dominant, bi-allelic co-dominant, and/or multi-allelic co-dominant. The types of molecular markers that can be developed include, but are not limited to, restriction fragment length polymorphisms (RFLPs), isozyme markers, allele specific hybridization (ASH), amplified variable sequences of plant genome, self-sustained sequence replication, simple sequence repeat (SSR), single base-pair change (single nucleotide polymorphism, SNP), random amplification of polymorphic DNA (RAPDs), single stranded conformation polymorphisms (SSCPs), amplified fragment length polymorphisms (AFLPs) and microsatellite DNA. RAPD methods generally refer to methods of detecting DNA polymorphisms using differences in the length of DNAs amplified using appropriate primers.
AFLP methods are essentially a combination of the above RFLP and RAPD methods, and refer to methods of selectively amplifying DNA restriction fragments using PCR to detect differences in their length, or their presence or absence.
[00112] Methods of developing molecular markers and their applications are described by Avise (Molecular markers, natural history, and evolution, Publisher: Sinauer Associates, 2004, ISBN 0878930418, 9780878930418), Srivastava et al. (Plant biotechnology and molecular markers, Publisher: Springer, 2004, ISBN 1402019114, 9781402019111), and Vienne (Molecular markers in plant genetics and biotechnology, Publisher: Science Publishers, 2003), each of which is incorporated by reference in its entirety.
[00113] The AFLP technology (Zabeau & Vos, 1993; Vos et al., 1995) has found widespread use in plant breeding and other fields since its invention in the early nineties. This is due to several characteristics of AFLP, of which the most important is that no prior sequence information is needed to generate large numbers of genetic markers in a reproducible fashion. In addition, the principle of selective amplification, a cornerstone of AFLP, ensures that the number of amplified fragments can be brought in line with the resolution of the detection system, irrespective of genome size or origin.
[00114] Detection of AFLP fragments is commonly carried out by electrophoresis on slab-gels (Vos et al., AFLP: a new technique for DNA fingerprinting, Nucleic Acids Res. 1995 Nov. 11; 23(21): 4407-4414, 1995) or capillary electrophoresis (van der Meulen et al., 2002). The majority of AFLP markers scored in this way represent polymorphisms occurring either in the restriction enzyme recognition sites used for AFLP template preparation or their flanking nucleotides covered by selective AFLP primers.
The remainder of the AFLP markers are insertion/deletion polymorphisms occurring in the internal sequences of the restriction fragments and, in a very small fraction, single nucleotide substitutions occurring in small restriction fragments (less than approximately 100 bp), which for these fragments cause reproducible mobility variations between both alleles which can be observed upon electrophoresis; these AFLP markers can be scored co-dominantly without having to rely on band intensities. Methods of developing AFLP markers are described in EP 534858, U.S. Pat. No. 6,045,994, WO2007114693 and Vos et al., each of which is hereby incorporated by reference in its entirety.
[00115] The molecular markers of the present invention are genetically linked to the QTLs associated with maize plants having an altered aleurone anthocyanin content that expresses no to substantially no smoky kernels. It should be understood that these molecular markers merely indicate nucleic acid sequence polymorphisms between the genome of a maize plant having said QTLs and the genome of a maize plant not having said QTLs. The polymorphisms can be detected by PCR amplification, or any other suitable methods well known to one skilled in the art. The exact size of an amplification product using the primer pairs provided herein may vary between two plants, for example, due to natural variation, even when said two plants have essentially the same QTL that is associated with an altered aleurone anthocyanin content that expresses no to substantially no smoky kernels.
Genomic Selection
[00116] Genomic selection (GS), also known as genome wide selection (GWS), is a form of MAS that estimates all locus, haplotype, and/or marker effects across the entire genome to calculate genomic estimated breeding values (GEBVs). See Nakaya and Isobe, Will genomic selection be a practical method for plant breeding?
Annals of Botany 110: 1303-1316 (2012); Van Vleck et al., Estimated breeding values for meat characteristics of cross-bred cattle with an animal model. Journal of Animal Science 70: 363-371 (1992); and Heffner et al., Genomic selection for crop improvement. Crop Science 49: 1-12 (2009). GS utilizes a training phase and a breeding phase. In the training phase, genotypes and phenotypes are analyzed in a subset of a population to generate a GS prediction model that incorporates significant relationships between phenotypes and genotypes. A GS training population must be representative of selection candidates in the breeding program to which GS will be applied. In the breeding phase, genotype data are obtained in a breeding population, then favorable individuals are selected based on GEBVs obtained using the GS prediction model generated during the training phase without the need for phenotypic data.
[00117] Larger training populations typically increase the accuracy of GEBV predictions. Increasing the training population to breeding population ratio is helpful for obtaining accurate GEBVs when working with populations having high genetic diversity, small breeding populations, low heritability of traits, or large numbers of QTLs. The number of markers required for GS modeling is determined based on the rate of LD decay across the genome, which must be calculated for each specific population to which GS will be applied. In general, more markers will be necessary with faster rates of LD decay. Ideally, GS comprises at least one marker in LD with each QTL, but in practical terms one of ordinary skill in the art would recognize that this is not necessary.
[00118] With genotyping data, favorable individuals from a population can be selected based only on GEBVs. GEBVs are the sum of the estimate of genetic deviation and the weighted sum of estimates of breed effects, which are predicted using phenotypic data.
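The training and breeding phases described above can be sketched as follows. This is a minimal illustration, not the model used in the disclosure: it assumes marker genotypes coded numerically (for example -1/1 for doubled haploid lines), uses plain ridge regression with a fixed, user-supplied shrinkage parameter `lam` in place of a variance-component estimate, and relies only on NumPy. The function names are hypothetical.

```python
import numpy as np


def fit_marker_effects(Z, y, lam):
    """Training phase: estimate marker effects by ridge regression.

    Z   : (n_lines, n_markers) numeric genotype matrix of the training set
    y   : (n_lines,) phenotypes of the training set
    lam : positive shrinkage parameter (assumed fixed here, not estimated)
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means  # center markers so the intercept equals mean(y)
    A = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    u = np.linalg.solve(A, Zc.T @ (y - mu))
    return mu, col_means, u


def predict_gebv(Z_new, mu, col_means, u):
    """Breeding phase: GEBVs for genotyped, unphenotyped candidates."""
    Z_new = np.asarray(Z_new, dtype=float)
    return mu + (Z_new - col_means) @ u
```

In use, `fit_marker_effects` would be called once on the phenotyped training population, and `predict_gebv` would then rank selection candidates from genotype data alone. A production analysis would normally estimate the shrinkage from the data, as mixed-model packages do.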
Without being limiting, commonly used statistical models for prediction of GEBVs include best linear unbiased prediction (Henderson, Best linear unbiased estimation and prediction under a selection model. Biometrics 31: 423 (1975)) and a Bayesian framework (Gianola and Fernando, Bayesian methods in animal breeding theory. Journal of Animal Science 63: 217-244 (1986)). [00119] The compositions and methods of the present disclosure can be utilized for GS or breeding corn varieties with a desired complement (set) of allelic forms of chromosome intervals associated with superior agronomic performance (e.g., no smoky kernels). In an aspect, a corn plant, seed, or cell provided herein can be selected using genomic selection. In another aspect, SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 36; SEQ ID NO: 37; SEQ ID NO: 38; SEQ ID NO: 39; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; SEQ ID NO: 55; SEQ ID NO: 56; SEQ ID NO: 57; SEQ ID NO: 58; SEQ ID NO: 59; SEQ ID NO: 60; SEQ ID NO: 61; SEQ ID NO: 62; SEQ ID NO: 63; SEQ ID NO: 64; SEQ ID NO: 65; and SEQ ID NO: 66 can be used in a method comprising genomic selection. In another aspect, a genomic selection method provided herein comprises phenotyping a population of corn plants for no smoky kernels. 
In another aspect, a genomic selection method provided herein comprises genotyping a population of corn plants, seeds, or cells with at least one of marker loci SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 36; SEQ ID NO: 37; SEQ ID NO: 38; SEQ ID NO: 39; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; SEQ ID NO: 55; SEQ ID NO: 56; SEQ ID NO: 57; SEQ ID NO: 58; SEQ ID NO: 59; SEQ ID NO: 60; SEQ ID NO: 61; SEQ ID NO: 62; SEQ ID NO: 63; SEQ ID NO: 64; SEQ ID NO: 65; and SEQ ID NO: 66.
Genetic Architecture - QTL/Linkage Mapping, GWAS [00120] To identify the regions associated with anthocyanin content within the germplasm used as males, 899 doubled haploid maize plants from 24 doubled haploid (DH) populations were testcrossed to CB64, a female maize line proven to produce anthocyanin coloration in prior field trials. [00121] To identify regions associated with anthocyanin content within the female germplasm, 88 doubled haploids from a population known to be segregating for anthocyanin content were crossed to KW7M1352, a male maize line proven to produce anthocyanin when crossed to specific lines in field trials (shown in Table 3 below). These testcrosses were visually rated for anthocyanin content on a scale from 0-9, with 9 being highest color and 0 being none. [00122] All DH lines were genotyped at about 1500 markers. All parents of the DH populations were genotyped at a density of 17567 markers, and doubled haploids were imputed from low density to high density. Association mapping was run using rrBLUP (Endelman, Jeffrey B. "Ridge regression and other kernels for genomic selection with R package rrBLUP." 
The Plant Genome 4.3 (2011).). All imputation and association analyses were run in the R environment (R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.) [00123] Within the male maize population, two peaks were found to be significant, one on chromosome 9, associated with the classical maize aleurone color gene, C1 (Figure 1a). Within the female DH population, a peak colocalized with the classical maize anthocyanin gene, R1 (Figure 1b) on chromosome 10. Both genes are known transcriptional regulators of anthocyanin production in maize plants, and both are necessary for production of anthocyanin in the maize aleurone (Cone, Karen C., Frances A. Burr, and Benjamin Burr. "Molecular analysis of the maize anthocyanin regulatory locus C1." Proceedings of the National Academy of Sciences 83.24 (1986): 9631-9635.; Walker EL, Robbins TP, Bureau TE, Kermicle J, Dellaporta SL. Transposon-mediated chromosomal rearrangements and gene duplications in the formation of the maize R-r complex. EMBO J. 1995 May 15;14(10):2350-63. PMID: 7774593; PMCID: PMC398344.). Given their established importance for producing anthocyanin in the maize kernel, these two genes were identified for marker development.
Table 1: Numbers of DH lines from each population and testers used in the two mapping studies. Columns: Doubled Haploid (DH) population; DH Heterotic Pattern; Tester; Number of Lines.
Figure imgf000030_0001
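The training and breeding phases of genomic selection described above can be illustrated with a minimal ridge-regression sketch. This is not the rrBLUP package used in the study and not the applicants' pipeline: the function names, the -1/+1 genotype coding for homozygous DH lines, the simulated data and the fixed `ridge_lambda` value are all assumptions made for illustration (mixed-model software typically estimates the shrinkage parameter from variance components rather than fixing it).

```python
import numpy as np

def fit_marker_effects(genotypes, phenotypes, ridge_lambda=1.0):
    """Training phase: estimate per-marker effects by ridge regression."""
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    marker_means = X.mean(axis=0)
    Xc = X - marker_means                      # center each marker column
    intercept = y.mean()
    n_markers = X.shape[1]
    # Solve (Xc'Xc + lambda * I) u = Xc'(y - mean(y)) for the marker effects u
    lhs = Xc.T @ Xc + ridge_lambda * np.eye(n_markers)
    effects = np.linalg.solve(lhs, Xc.T @ (y - intercept))
    return intercept, marker_means, effects

def predict_gebv(genotypes, intercept, marker_means, effects):
    """Breeding phase: predict GEBVs from genotypes alone."""
    X = np.asarray(genotypes, dtype=float)
    return intercept + (X - marker_means) @ effects

# Simulated toy data (hypothetical): 60 training lines, 20 markers coded -1/+1.
rng = np.random.default_rng(0)
train_geno = rng.choice([-1.0, 1.0], size=(60, 20))
true_effects = np.zeros(20)
true_effects[[3, 11]] = [2.0, -1.5]            # two simulated QTL
train_pheno = 4.5 + train_geno @ true_effects + rng.normal(0.0, 0.5, 60)

model = fit_marker_effects(train_geno, train_pheno, ridge_lambda=1.0)

# Selection candidates are genotyped but not phenotyped.
candidates = rng.choice([-1.0, 1.0], size=(10, 20))
gebv = predict_gebv(candidates, *model)
ranking = np.argsort(gebv)[::-1]               # candidate indices, best first
```

In this toy setting the two largest estimated effects fall on the simulated QTL markers, and candidates are ranked without any phenotypic data, which mirrors the breeding phase described in paragraph [00116].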
Marker development on the R1 locus: [00124] As the GWAS data confirmed that the R1 locus on the female side was responsible for the smoky phenotype, no further fine-mapping was carried out. One known smoky donor (line PHG39) and two smoky and two non-smoky lines, forming pairs within the same DH (doubled haploid) population, were sequenced by Illumina short read technology. The genomes were assembled to scaffold level, covering the gene space very well. Sequences of smoky (PHG39_smokey; SEQ ID NO: 37) and non-smoky lines (B73_non_smokey (SEQ ID NO: 38), non_smokey2 (SEQ ID NO: 39)) were compared. All non-smoky lines showed the same sequence as B73. Smoky lines showed many SNPs and InDels (insertions or deletions) and 5 structural variations within and close to (within 5 kb of) the gene R1. By comparing these structural polymorphisms to other standard references of non-smoky genotypes, three of them could be excluded as non-functional. The most prominent remaining polymorphism corresponds to a transposon insertion of the type doppia4 in the 5’ UTR. Such a doppia4 insertion was reported by Walker et al. 1995 (Walker, Elsbeth L., et al. "Transposon‐mediated chromosomal rearrangements and gene duplications in the formation of the maize R‐r complex." The EMBO journal 14.10 (1995): 2350-2363.) as having aleurone-specific promoter activity. Therefore, this insertion is presumed to be the functional polymorphism. [00125] Genotyping of a large panel of inbred lines confirmed that doppia is essential for the smoky phenotype, and an additional allele containing doppia but not causing the smoky genotype could be identified. 
By sequencing two of these genotypes it was shown that this R1 allele was lacking a large part of its genomic sequence, including all of exon 2 (see Figure 3, an alignment of sequences of B73_R1 (genotype carrying smoky alleles), PHG39 (known smoky donor) and the non_smokey2 allele producing the non-smoky genotype provided in SEQ ID NO:39) and Figure 4, which provides an alignment of multiple sequences of B73_R1 (genotype carrying smoky alleles), PHG39 (known smoky donor) and the non_smokey2 allele producing the non-smoky genotype with exon information. [00126] KASP markers were developed on the doppia insertion and on a series of closely linked SNPs selected by technical suitability for this technology. Six SNP markers are highly linked to the doppia insertion and can be used. However, some additional rare alleles were identified. The final scoring system is summarized in the following Table 2.
Table 2: Smoky vs non-Smoky variants. NA = the marker has no call at this position.
Marker; Position on chromosome 10 (AGPv4); Location
ma0023ek76; 139770766; upstream
ma0023ek80; 139771365; upstream
ma0023ek81; 139771617; upstream
ma0023ek85; 139772481; upstream
ma0023ek88; 139772862; upstream
ma0023em08; 139789400; R1
ma0004qp6e; 139781170; doppia
Figure imgf000032_0001
characterize breeding material, depending on the haplotypes present in the material. Markers and primer sequences are summarized in the following Table 3.
Table 3: Primer Sequences. Columns: Marker; Allele B73; Allele Smoky; Location R1; Primer Allele X; Primer Allele Y; Primer Common; Allele X; Allele Y; Target sequence.
Figure imgf000032_0002
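As an illustration of how KASP calls at the R1 markers could be compared against the alleles recited in the claims, the following sketch builds a simple lookup. It is not the scoring system of Table 2, which is not reproduced here. The pairing of marker names with positions follows the column order of Table 2, the alleles are transcribed from claims 5 and 6 (M1 to M7 and M1’ to M7’), and the `INS`/`DEL` codes, the category labels and the function names are hypothetical.

```python
from collections import Counter

# marker name: (AGPv4 position on chromosome 10, claim 5 allele, claim 6 allele)
R1_MARKERS = {
    "ma0023ek76": (139770766, "G", "C"),      # M1 / M1'
    "ma0023ek80": (139771365, "T", "C"),      # M2 / M2'
    "ma0023ek81": (139771617, "T", "C"),      # M3 / M3'
    "ma0023ek85": (139772481, "A", "G"),      # M4 / M4'
    "ma0023ek88": (139772862, "G", "A"),      # M5 / M5'
    "ma0023em08": (139789400, "G", "A"),      # M6 / M6'
    "ma0004qp6e": (139781170, "INS", "DEL"),  # M7 / M7' (hypothetical indel codes)
}

def classify_call(marker, call):
    """Label one genotype call relative to the alleles recited in claims 5 and 6."""
    _position, claim5_allele, claim6_allele = R1_MARKERS[marker]
    if call is None or call == "NA":
        return "no_call"
    if call == claim5_allele:
        return "claim5_allele"
    if call == claim6_allele:
        return "claim6_allele"
    return "other_allele"   # e.g. one of the additional rare alleles

def summarize_line(calls):
    """Count call categories across all R1 markers for one line."""
    return Counter(classify_call(marker, call) for marker, call in calls.items())

# Hypothetical calls for a single inbred line.
example_calls = {
    "ma0023ek76": "C", "ma0023ek80": "C", "ma0023ek81": "C",
    "ma0023ek85": "G", "ma0023ek88": "A", "ma0023em08": "NA",
    "ma0004qp6e": "DEL",
}
summary = summarize_line(example_calls)
```

A per-marker tally like this only reports agreement with the claimed alleles; deciding how to treat no-calls or rare alleles would still follow the scoring system of Table 2.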
Marker development on the C1 locus: [00128] Genetic studies described above identified two genomic regions controlling the smoky phenotype on chromosome 9 and chromosome 10. These regions co-localize with known aleurone coloration genes (C1 and R1). [00129] To develop predictive markers for the C1 locus, public sequence data on the C1 gene were assembled. The proprietary dataset is comprised of public sequenced genomes PH207 (SEQ ID NO:66), PHG29 (SEQ ID NO:65), (SEQ ID NO:64), MBS847 (SEQ ID NO:62), B73 (SEQ ID NO:56), OH43 (SEQ ID NO:63) complemented with sequences from C1 alleles reported in the literature (C1 (SEQ ID NO:57), C1-m (SEQ ID NO:58), C1-p (SEQ ID NO:61) and C1-n (SEQ ID NO:59 and SEQ ID NO:60)). The smoky phenotype corresponds to a functional C1 allele. Gene sequence comparison of genotypes carrying smoky alleles (W22, MBS847, PH207, PHG29) and genotypes carrying non-smoky alleles (B73, OH43) allowed the identification of variants for C1 Smoky vs non-Smoky discrimination and the haplotype constitution overview (described in Table 4).
Table 4: Smoky vs non-Smoky variants [00130] Haplotypes are represented in single nucleotide base format showing the corresponding alleles in sequential order. NA = the marker has no call at this position. Columns (one per marker, in order): Smoky allele discrimination; Smoky allele discrimination; Smoky allele discrimination; C1-n allele discrimination; Smoky allele discrimination; Smoky allele discrimination; C1-m1 allele discrimination; C1-p allele discrimination. Rows: Haplotype.
Figure imgf000033_0001
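The haplotype notation described for Table 4 (alleles listed in sequential order, with NA for a missing call) can be sketched as follows. The positions are those recited for markers M8 to M15 in claims 5 and 6; ordering the markers by position, the `-` separator and the `INS`/`DEL` indel codes are assumptions made for illustration, since the table itself is not reproduced here.

```python
# AGPv4 positions of the C1 markers recited in claims 5 and 6 (M8 to M15).
C1_POSITIONS = [8983319, 8983500, 8983838, 8983862,
                8983897, 8983983, 8984469, 8984653]

def haplotype_string(calls, positions=C1_POSITIONS, sep="-"):
    """Join allele calls in positional order; a missing or empty call becomes 'NA'."""
    return sep.join(calls.get(pos) or "NA" for pos in sorted(positions))

# Hypothetical calls for one line; no call was obtained at position 8984653.
example_calls = {
    8983319: "INS", 8983500: "T", 8983838: "T", 8983862: "G",
    8983897: "G", 8983983: "C", 8984469: "C",
}
haplotype = haplotype_string(example_calls)   # "INS-T-T-G-G-C-C-NA"
```

Keeping the marker order fixed means two lines can be compared by comparing their haplotype strings directly.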
[00131] The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and other modifications and variations may be possible in light of the above teachings. The embodiment was chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention in various embodiments and various modifications as are suited to the particular use contemplated. It is intended that the appended claims be construed to include other alternative embodiments of the invention except insofar as limited by the prior art.

Claims

1. A method for identifying a plant of the genus Zea having an altered aleurone anthocyanin content, the method comprising detecting a presence or absence of one or more marker alleles in a maize plant, wherein said one or more marker alleles are operably linked to one or more anthocyanin-regulating genes, wherein said one or more anthocyanin-regulating genes are selected from the group consisting of R1 and C1 genes.
2. A method for identifying a plant of the genus Zea having no or substantially no smoky kernels, the method comprising detecting a presence or absence of one or more marker alleles associated with no or substantially no smoky kernels in a maize plant, wherein said one or more marker alleles are operably linked to one or more anthocyanin-regulating genes, wherein said one or more anthocyanin-regulating genes are selected from the group consisting of R1 and C1 genes.
3. The method of claim 1, wherein the R1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66.
4. The method of claim 3, wherein the R1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66.
5. The method of claim 1, wherein said one or more marker alleles comprise at least one allele selected from M1, M2, M3, M4, M5, M6, M7, M8, M9, M10, M11, M12, M13, M14 and M15, wherein: M1 is a SNP which is guanine (G) at the position 139770766 referenced to the B73 reference genome AGPv4; M2 is a SNP which is thymine (T) at the position 139771365 referenced to the B73 reference genome AGPv4; M3 is a SNP which is thymine (T) at the position 139771617 referenced to the B73 reference genome AGPv4; M4 is a SNP which is adenine (A) at the position 139772481 referenced to the B73 reference genome AGPv4; M5 is a SNP which is guanine (G) at the position 139772862 referenced to the B73 reference genome AGPv4; M6 is a SNP which is guanine (G) at the position 139789400 referenced to the B73 reference genome AGPv4; M7 is an indel which is insertion of nucleotides at the position 139781170 referenced to the B73 reference genome AGPv4; M8 is an indel which is insertion of nucleotides at the position 8983319 referenced to the B73 reference genome AGPv4; M9 is a SNP which is thymine (T) at the position 8983500 referenced to the B73 reference genome AGPv4; M10 is a SNP which is thymine (T) at the position 8983838 referenced to the B73 reference genome AGPv4; M11 is a SNP which is guanine (G) at the position 8983897 referenced to the B73 reference genome AGPv4; M12 is an indel which is insertion of nucleotides at the position 8984653 referenced to the B73 reference genome AGPv4; M13 is a SNP which is guanine (G) at the position 8983862 referenced to the B73 reference genome AGPv4; M14 is a SNP which is thymine (T) or cytosine (C) at the position 8983983 referenced to the B73 reference genome AGPv4; and M15 is a SNP which is cytosine (C) at the position 8984469 referenced to the B73 reference genome AGPv4.
6. The method of claim 2, wherein said one or more marker alleles comprise at least one allele selected from M1’, M2’, M3’, M4’, M5’, M6’, M7’, M8’, M9’, M10’, M11’, M12’, M13’, M14’ and M15’, wherein: M1’ is a SNP which is cytosine (C) at the position 139770766 referenced to the B73 reference genome AGPv4; M2’ is a SNP which is cytosine (C) at the position 139771365 referenced to the B73 reference genome AGPv4; M3’ is a SNP which is cytosine (C) at the position 139771617 referenced to the B73 reference genome AGPv4; M4’ is a SNP which is guanine (G) at the position 139772481 referenced to the B73 reference genome AGPv4; M5’ is a SNP which is adenine (A) at the position 139772862 referenced to the B73 reference genome AGPv4; M6’ is a SNP which is adenine (A) at the position 139789400 referenced to the B73 reference genome AGPv4; M7’ is an indel which is deletion of nucleotides at the position 139781170 referenced to the B73 reference genome AGPv4; M8’ is an indel which is deletion of nucleotides at the position 8983319 referenced to the B73 reference genome AGPv4; M9’ is a SNP which is cytosine (C) at the position 8983500 referenced to the B73 reference genome AGPv4; M10’ is a SNP which is guanine (G) at the position 8983838 referenced to the B73 reference genome AGPv4; M11’ is a SNP which is adenine (A) at the position 8983897 referenced to the B73 reference genome AGPv4; M12’ is an indel which is deletion of nucleotides at the position 8984653 referenced to the B73 reference genome AGPv4; M13’ is a SNP which is guanine (G) or adenine (A) at the position 8983862 referenced to the B73 reference genome AGPv4; M14’ is a SNP which is thymine (T) at the position 8983983 referenced to the B73 reference genome AGPv4; and M15’ is a SNP which is cytosine (C) or adenine (A) at the position 8984469 referenced to the B73 reference genome AGPv4.
7. A method for identifying a maize plant or plant part, comprising screening for the presence of one or more marker alleles, wherein said one or more marker alleles are selected from M1, M2, M3, M4, M5, M6, M7, M8, M9, M10, M11, M12, M13, M14 and M15.
8. An identified maize plant resulting from the method of claim 7.
9. A method for identifying a maize plant or plant part, comprising screening for the presence of one or more marker alleles, wherein said one or more marker alleles are selected from M1’, M2’, M3’, M4’, M5’, M6’, M7’, M8’, M9’, M10’, M11’, M12’, M13’, M14’ and M15’.
10. An identified maize plant resulting from the method of claim 9.
11. A method for producing maize plants having altered aleurone anthocyanin content, comprising the steps of: a. identifying one or more maize lines comprising one or more molecular marker alleles identified by the method of claim 3; b. selfing the identified maize line with itself or crossing the identified maize line with a second identified maize line; and c. producing the maize plant.
12. A method for producing maize plants having no or substantially no smoky kernels, comprising the steps of: a. identifying one or more maize lines comprising one or more molecular marker alleles identified by the method of claim 4; b. selfing the identified maize line with itself or crossing the identified maize line with a second identified maize line; and c. producing the maize plant.
13. A method for detecting one or more anthocyanin-regulating genes in maize, the method comprising using one or more discriminant markers, wherein said one or more discriminant markers are nucleic acid sequences, wherein said nucleic acid sequences are selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; and SEQ ID NO: 55.
14. An isolated polynucleic acid comprising a coding sequence, wherein said coding sequence is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 14; SEQ ID NO: 15; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 24; SEQ ID NO: 25; SEQ ID NO: 29; SEQ ID NO: 30; SEQ ID NO: 34; SEQ ID NO: 35; SEQ ID NO: 40; SEQ ID NO: 41; SEQ ID NO: 42; SEQ ID NO: 43; SEQ ID NO: 44; SEQ ID NO: 45; SEQ ID NO: 46; SEQ ID NO: 47; SEQ ID NO: 48; SEQ ID NO: 49; SEQ ID NO: 50; SEQ ID NO: 51; SEQ ID NO: 52; SEQ ID NO: 53; SEQ ID NO: 54; and SEQ ID NO: 55, or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity with any of the foregoing sequences, wherein said isolated polynucleic acid is capable of being used to identify anthocyanin-regulating genes in maize.
15. A method for screening for a maize plant with one or more anthocyanin-regulating genes, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes selected from the group consisting of an R1 gene and a C1 gene, wherein the R1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66.SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is selected from the group consisting of SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66; and selecting a maize plant having said one or more anthocyanin-regulating genes.
16. A method for screening for a maize plant with no or substantially no smoky kernels, the method comprising: screening a population of maize plants for one or more anthocyanin-regulating genes selected from the group consisting of an R1 gene and a C1 gene, wherein the R1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is a nucleic acid sequence, wherein said nucleic acid sequence is selected from the group consisting of a nucleic acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% sequence identity to SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66.SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, and SEQ ID NO: 39; the C1 gene is selected from the group consisting of SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65 and SEQ ID NO:66; and selecting a maize plant having said one or more anthocyanin-regulating genes.
PCT/US2024/038717 2023-07-20 2024-07-19 Method of detecting maize plants with altered aleurone anthocyanin content and associated marker alleles and haplotypes WO2025019762A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363514701P 2023-07-20 2023-07-20
US63/514,701 2023-07-20

Publications (1)

Publication Number Publication Date
WO2025019762A1 true WO2025019762A1 (en) 2025-01-23

Family

ID=94282690

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/038717 WO2025019762A1 (en) 2023-07-20 2024-07-19 Method of detecting maize plants with altered aleurone anthocyanin content and associated marker alleles and haplotypes

Country Status (1)

Country Link
WO (1) WO2025019762A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230026144A1 (en) * 2020-03-24 2023-01-26 Insignum Agtech, Llc. Modified plants and methods of detecting pathogenic disease

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
OWENS BRENDA F, MATHEW DEEPU, DIEPENBROCK CHRISTINE H, TIEDE TYLER, WU DI, MATEOS-HERNANDEZ MARIA, GORE MICHAEL A, ROCHEFORD TORBE: "Genome-Wide Association Study and Pathway-Level Analysis of Kernel Color in Maize", G3 GENES - GENOMES - GENETICS, OXFORD UNIVERSITY PRESS, vol. 9, no. 6, 1 June 2019 (2019-06-01), pages 1945 - 1955, XP093267637, ISSN: 2160-1836, DOI: 10.1534/g3.119.400040 *
PAULSMEYER MICHAEL N, BROWN PATRICK J, JUVIK JOHN A: "Discovery of Anthocyanin Acyltransferase1 (AAT1) in Maize Using Genotyping-by-Sequencing (GBS)", G3 GENES - GENOMES - GENETICS, OXFORD UNIVERSITY PRESS, vol. 8, no. 11, 1 November 2018 (2018-11-01), pages 3669 - 3678, XP093267633, ISSN: 2160-1836, DOI: 10.1534/g3.118.200630 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24844008

Country of ref document: EP

Kind code of ref document: A1