EP3830287A1 - Method for the quality control of seed lots - Google Patents
- Publication number
- EP3830287A1
- Authority
- EP
- European Patent Office
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6888—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
- C12Q1/6895—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/142—Toxicological screening, e.g. expression profiles which identify toxicity
Definitions
- the invention relates to a quality control process in the field of seeds and varietal purity.
- the marketing of seeds is subject to control of their purity rate. This rate is specific to each species but must be 98% by weight or more (Directive 66/402/EEC on the marketing of cereal seeds). This standard also applies to seeds marketed for the production of basic seed, pre-basic seed, certified seed or hybrids. Varietal purity is mainly controlled by field inspection. In the case of production of hybrid seeds with a male-sterile parent, the purity rate of this parent must be even higher (99.9% for corn).
- varietal purity rate is defined as the percentage of plants originating from a batch and which conform to the description of the variety. This percentage is expressed by weight of seeds.
- the contaminants are seeds of the same species, but showing genetic variations at certain loci in their genome, compared to the genotype expected for the seeds of the batch considered.
- the presence of contaminants is reduced, due to vigilance in the upstream production stages, cultural practices, purification, isolation, and the controls carried out throughout the process.
- the contaminants are generally present at a low percentage; indeed, the level tolerated in a batch for it to be marketed must be less than 2%.
- by trait is meant the allelic form of a locus linked to a phenotypic character.
- a similar problem relates to the fortuitous presence of GMOs or any other alteration in the genome.
- the marketing of non-GMO plants requires proof of the absence of GMOs or the presence of a rate below a percentage determined by regulations.
- the regulations in certain countries provide, for certain GMO traits (resistance against insects in particular), that seeds containing GMOs are sold with a certain rate of seeds not having the GMO trait, so as to provide refuge areas for the insect.
- Genotyping is conventionally carried out using different technologies, by PCR (KASP - LGC Genomics, TaqMan - Life Technologies) or hybridization on DNA chips (Axiom - Life Technologies, Infinium - Illumina).
- the TaqMan quantitative PCR technology is today considered the benchmark for detecting the fortuitous presence of GMO plants in a mixture of non-GMO plants. It is based on the detection of a presence/absence polymorphism for a given sequence, not on a polymorphism between different allelic forms of an SNP.
- the polymorphism relates to the presence of a trait which can be amplified (amplicon) and therefore easily identifiable.
- Application WO 2015/110472 proposes to analyze batches of seeds by manual or semi-automatic sampling of a determined sample volume from one or more seeds, this volume being determined to allow the analysis of at least one constituent of the seed or seeds.
- the tissue taken from several seeds is placed in an identified and traceable well, then the said constituent is analyzed on the content of the well(s).
- This bulk constitution method makes it possible to assess varietal purity (example 6). This purity is evaluated by the KASPar method (KBioscience) from bulks of 5 and 10 seeds. The presence of a contaminant in these bulks is characterized by the presence of a heterozygous cluster; however, the authors indicate that this cluster is close to the homozygous cluster and that it is easier to identify for a bulk of 5 seeds than for a bulk of 10 seeds.
- NGS: Next Generation Sequencing
- the depth of sequencing makes it possible to identify an allele that is poorly represented when identifying allelic forms for a group of individuals in a pool. It can also make it possible to identify a number of allelic forms greater than two for the same locus.
- the sequencing of amplicons makes it possible to study in a targeted manner loci of interest, to identify SNPs and to characterize the allelic composition of an individual or a mixture of individuals.
- a research application is the detection of rare mutations in a mutagenized population (TILLING, Targeting Induced Local Lesions in Genomes).
- the objective is to detect the presence of a contaminant, to accurately estimate the rate within the seed lot from which the analyzed sample comes, and preferably to determine its genetic profile to better understand its origin. Detection can be carried out by analyzing the loci of interest, chosen by a person skilled in the art, based on their knowledge of the genetic material to be qualified and the genetic material likely to contaminate it.
- Chen et al. 2016, PLOS ONE 11(6), have developed, for corn, two series of SNPs for quality control: a set of markers for rapid control, using a reduced number of SNPs (50-100) to identify potential labeling errors in seed packets or plots, and a wider set of markers used for further characterization and discrimination of genetic material.
- the sampling of 192 individuals analyzed individually would make it possible to have a probability close to 100% of detecting a contamination of 5% in a batch, but this probability becomes lower than 90% if one is interested in a 1% contamination.
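The detection probabilities quoted above can be checked with a short calculation. The sketch below assumes that seeds are drawn independently from a large lot (simple binomial sampling); the function name is purely illustrative.

```python
def detection_probability(contamination_rate: float, n_individuals: int) -> float:
    """Probability that at least one contaminant is among n individually
    tested seeds, assuming independent draws from a large lot."""
    return 1.0 - (1.0 - contamination_rate) ** n_individuals


# 192 individuals, contamination rates of 5% and 1%
p_at_5_percent = detection_probability(0.05, 192)
p_at_1_percent = detection_probability(0.01, 192)
print(f"5% contamination: {p_at_5_percent:.4f}")
print(f"1% contamination: {p_at_1_percent:.4f}")
```

Under this assumption the probability is close to 100% for a 5% contamination and falls below 90% for a 1% contamination, consistent with the figures given in the text.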
- the expected genetic purity is high, as is the precision of estimation sought, which depends on both the number of seeds sampled (tested) and the number of seeds in the batch of basic seeds. For example, if 200 grains are analyzed and the impurity rate is 0%, the confidence interval for this value ranges from 0% to 1.49%. The sample size is therefore too small to guarantee a sufficient level of purity by analyzing only 200 grains. In contrast, when analyzing 2000 grains, a 0% impurity rate has a 0% confidence interval at
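The upper limit of 1.49% for 200 grains can be reproduced with an exact binomial bound. The sketch below assumes a one-sided 95% confidence level and independent sampling from a large lot. The text does not state the confidence level, so this is an assumption, although it does reproduce the 1.49% figure. The function name is illustrative.

```python
def upper_bound_zero_found(n_tested: int, confidence: float = 0.95) -> float:
    """Upper confidence limit on the impurity rate when no contaminant is
    found among n_tested seeds (exact one-sided binomial bound).

    Solves (1 - p) ** n_tested = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_tested)


bound_200 = upper_bound_zero_found(200)
print(f"200 grains, 0 contaminants: upper limit = {bound_200 * 100:.2f}%")
```

The same function can be evaluated for larger samples to see how the upper limit tightens as more grains are analyzed.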
- Genia (Montevideo, Uruguay) offers a method of determining genetic purity on batches of lines, and identifying contaminants, by analyzing a single mixture of 10,000 seeds and sequencing amplicons targeting approximately 350 SNPs. This company claims to determine varietal purity with a sensitivity of 0.8% and a confidence interval of 99%. This approach is similar to that developed by Gautier et al., in that it is based on a statistical model for estimating allelic frequencies on a large number (350) of SNPs, from which an estimate is made of the frequency of the different genetic profiles present in the mixture. However, such an approach does not allow reliable detection of a rare allele for a given SNP, which is necessary in the search for contamination for a given trait.
- the method presented here is based on the estimation of the purity of a seed lot from binary qualitative analysis (presence / absence of a contaminant) of several sub-lots of samples.
- the analysis on each sub-lot consists of detecting the presence of an alternative allele at one or more loci of interest, by sequencing of amplicons.
- the number of sublots, as well as the size of each sublot, are defined according to the expected purity rate (estimated by the operator) and the precision sought, and so that, statistically, there is preferably at most one contaminant in a given sublot. This means that, from a given number of seeds used for the test, at least as many sublots are formed as the estimated number of contaminants, preferably exactly as many.
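The sizing rule above can be sketched as follows. This is a minimal illustration, not the procedure of the invention: the function name, the equal-size split and the rounding choices are assumptions, and the default minimum of 10 sublots is taken from the claim wording elsewhere in this text.

```python
import math


def plan_sublots(n_seeds: int, expected_impurity: float, min_sublots: int = 10):
    """Return (number of sublots, seeds per sublot) so that there are at
    least as many sublots as expected contaminants.

    expected_impurity is the operator's estimate (e.g. 0.01 for 1%).
    """
    # Expected number of contaminants among the seeds used for the test;
    # rounding guards against floating-point noise before taking the ceiling.
    expected_contaminants = round(n_seeds * expected_impurity, 9)
    n_sublots = max(min_sublots, math.ceil(expected_contaminants))
    seeds_per_sublot = n_seeds // n_sublots
    if seeds_per_sublot < 1:
        raise ValueError("not enough seeds for the required number of sublots")
    return n_sublots, seeds_per_sublot


# 2000 seeds with an expected impurity of 1%: about 20 contaminants expected,
# hence 20 sublots of 100 seeds (one expected contaminant per sublot).
print(plan_sublots(2000, 0.01))
```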
- the method makes it possible to distinguish contamination by a hybrid (segregation) from contamination by a line (no segregation), by comparing the contaminant profiles of the different sublots.
- this method is not limited to this binary approach. The use of sequencing means that the method is not limited to the identification of two allelic forms; in this context, the method also makes it possible to identify contaminants in batches of seeds that are heterozygous for the allele considered, the contaminant being heterologous to the allelic forms of this individual.
- the invention thus relates to a method for determining the quantity of contaminants at at least one locus of interest, present in a batch of seeds of a variety of interest, characterized in that:
- a) the seeds of a seed lot are grouped into sublots of at least 10 seeds, the number of sublots thus obtained being greater than or equal to 10,
- b) a targeted sequencing of at least the genome region of the seeds containing the locus of interest is carried out for each sub-lot,
- c) the presence of a contaminant is determined qualitatively for each sub-lot (presence / absence of an alternative allele), upon detection of an alternative allele to the expected allele(s) for each genomic region sequenced (there may be several expected alleles at a single locus, especially if the seeds are seeds of a hybrid plant),
- d) the quantity of contaminants in the overall batch is determined by the compilation of the qualitative results obtained for all of the sublots.
- the region corresponding to the locus of interest is amplified by PCR between step a) and step b).
- This amplification step is carried out directly on all the seeds in each sublot.
- the sequencing of step b) is carried out on the DNA extracted from the seeds present in a sublot, the region of the genome of the seeds containing the locus of interest being optionally amplified.
- the RNA present in the batch of seeds is also extracted, a reverse transcription is carried out to obtain complementary DNA (cDNA), optionally followed by an amplification of loci of interest of this cDNA, and the sequencing of loci of interest (preferably amplified) is also carried out on the cDNA obtained.
- the estimate of the impurity P of the batch is obtained according to the formula:
- This formula is the formula proposed by Remund (2001, op. cit.), which makes it possible in particular to take into account the fact that the searches for contaminants are carried out only on a sample of the seed lot and therefore to take into account the biases potentially induced by this sampling.
- This process therefore makes it possible to calculate the percentage of contaminants in the seed lot (and therefore the purity of the seed lot: 1- P).
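The formula itself did not survive in this text and is not reconstructed here. As an illustration only, the sketch below uses the standard maximum-likelihood estimator for qualitative (presence / absence) testing of equal-size pools, P = 1 - (1 - d/n)^(1/m), where n is the number of sublots, m the number of seeds per sublot and d the number of sublots in which a contaminant was detected. Whether this is exactly the expression given in the original is an assumption, and the function and parameter names are hypothetical.

```python
def estimate_impurity(n_sublots: int, seeds_per_sublot: int, n_positive: int) -> float:
    """Estimate the impurity rate P of a lot from pooled qualitative results.

    Assumed estimator (standard for equal-size pools, not confirmed by the
    source): P = 1 - (1 - d/n) ** (1/m).
    """
    if not 0 <= n_positive <= n_sublots:
        raise ValueError("n_positive must lie between 0 and n_sublots")
    fraction_negative = 1.0 - n_positive / n_sublots
    return 1.0 - fraction_negative ** (1.0 / seeds_per_sublot)


# 20 sublots of 100 seeds, contaminant detected in 2 of them
impurity = estimate_impurity(20, 100, 2)
print(f"estimated impurity P = {impurity:.5f}, purity = {1 - impurity:.5f}")
```

The estimate relies only on how many sublots are positive, which is why no quantification is required within a sublot.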
- a contaminant is a seed with an allele different from the expected allele at the locus of interest given in this seed lot.
- a maximum number of seeds is used, calculated so that, from a statistical point of view, at most one contaminant is present in each sample (sublot) of seeds.
- a purity level higher than 99% is generally observed.
- the methods described above are in fact used for homogeneous seed lots, that is to say lots in which at least 95%, preferably at least 96%, more preferably at least 97%, even more preferably at least 98%, most preferably at least 99% of the seeds have the same genotype.
- the sublots contain a maximum of 20, or a maximum of 50, or a maximum of 80, or a maximum of 100, even a maximum of 200, or 2,000 seeds.
- the quantity of seeds in each sublot prepared in step a) is then of the order of 10, respectively 20, or between 15 and 25.
- Step b) of the process consists of the targeted sequencing of at least one genomic region containing the locus of interest for which the presence of a contaminant is sought.
- the DNA of the batches is prepared, for example by crushing the seeds and using the flour or isolating the DNA from this flour. These methods are known in the art. As seen above, one can also prepare cDNA.
- This sequencing step is preferably carried out by high throughput sequencing (NGS).
- NGS: high-throughput sequencing
- Different technologies can be used (Illumina®, Roche 454, Ion Torrent: Proton / PGM (ThermoFisher) or SOLiD (Applied Biosystems)).
- this step is carried out by different approaches depending on the technology used.
- Illumina® technology uses clonal amplification and sequencing by synthesis (SBS).
- SBS: clonal amplification and sequencing by synthesis
- a double-stranded DNA library is generated from the sample to be analyzed by PCR amplification and addition of specific adapters at the ends. The DNA is then denatured into single strands, and the ends of the single strands are attached randomly to the “flowcell” surface, on which a solid-phase “bridge” PCR is carried out (creation of dense clusters in which the fragments are amplified).
- the sequencing is carried out by adding the 4 labeled reversible terminators, the primers and the DNA polymerase, then the fluorescence emitted by each cluster is read, making it possible to determine the first base. We then perform several cycles in order to read the entire sequence.
- These beads are then combined with the amplification products in a water-in-oil emulsion, in order to create "microreactors" (each droplet of water in oil) containing a single bead.
- the PCR is carried out in this emulsion, the entire library being amplified in parallel, making it possible to obtain several million copies per bead.
- the beads are purified and the fragments are loaded onto plates whose well diameter allows the entry of only one bead at a time.
- the sequencing enzymes are added and the individual labeled nucleotides are sent one after the other.
- the sequence is detected by a CCD camera according to the luminescent signal.
- the libraries are prepared, the adapters are added and a PCR is carried out in an emulsion, as in the 454 method. Then an enrichment of the amplified beads is carried out, the 3' end of the DNAs is modified to allow covalent attachment to a slide, and the beads are deposited on the slide.
- the sequencing is carried out by ligation: primers hybridize to the adapters present on the template. A set of 4 fluorescently labeled 2-base probes is associated with the primers. The specificity of the 2-base probes is determined by the 1st and 2nd bases of each ligation reaction. Several ligation, detection and cleavage cycles are carried out.
- each base is detected by two independent ligation reactions by two different primers.
- the coding system for reading on two bases allows very high fidelity in reading the results. This method makes it possible to differentiate between sequencing errors and real variants (SNP, insertions and deletions).
- CMOS complementary metal-oxide-semiconductor
- step c) consists in determining the absence or the presence, for a sample, of an unexpected sequence in the sequencing products. In the presence of such an unexpected sequence (corresponding to the presence of a contaminant), there is no need to quantify the quantity of unexpected sequence compared to the quantity of expected sequence (corresponding to the sequence of correct seeds from the seed lot).
- the detection is therefore only qualitative, that is to say binary: presence / absence of a sequence of an alternative allele to the expected allele(s).
- the use of sublots of seeds also makes it possible to increase the number of seeds studied in each sequencing reaction, and thus to analyze a sufficient sample of seeds while controlling costs.
- This analysis is carried out for each genomic region analyzed, that is to say for each locus of interest determined beforehand by a person skilled in the art, and making it possible to characterize the batch of seeds.
- the next step in the process is to calculate the effective percentage of contaminants in the seed lot. This is done by compiling the qualitative results obtained for all of the sublots.
- the purity rate of the seed lot is then estimated by considering the number of contaminated sublots, the total number of sublots analyzed, and the size (number of seeds) of each of the sublots. The estimate of the impurity P of the lot is obtained according to the formula:
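As an illustration of this kind of estimate, the sketch below assumes the usual pooled-sample estimator generally attributed to Remund et al. (2001): with n sublots of m seeds each, of which d are contaminated, the impurity is estimated as P = 1 - (1 - d/n)^(1/m). Both the choice of estimator and the sublot size of 100 seeds used in the calls are assumptions; they reproduce the purity figures reported in the examples (99.22%, 99.79% and 99.87%).

```python
def estimate_purity(n_sublots: int, n_contaminated: int, sublot_size: int) -> float:
    """Estimate lot purity from qualitative (presence / absence) sublot results.

    Assumed estimator: a sublot is clean only if all of its seeds are clean,
    so the fraction of clean sublots estimates purity ** sublot_size.
    """
    if n_sublots <= 0 or sublot_size <= 0:
        raise ValueError("n_sublots and sublot_size must be positive")
    if not 0 <= n_contaminated <= n_sublots:
        raise ValueError("n_contaminated must be between 0 and n_sublots")
    clean_fraction = (n_sublots - n_contaminated) / n_sublots
    return clean_fraction ** (1 / sublot_size)


# Sublot counts taken from the examples; sublot size of 100 is an assumption.
for n, d in [(24, 13), (16, 3), (16, 2)]:
    purity = estimate_purity(n, d, 100)
    print(f"{d}/{n} contaminated sublots -> purity {100 * purity:.2f}%")
```

If every sublot is contaminated, the estimate collapses to 0, which is one reason the sublot size is chosen so that most sublots are expected to be clean.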
- in step b), the targeted sequencing of several regions of the genome containing several loci of interest is carried out. This makes it possible to better guarantee the identity of the seeds present in each sample and to detect, more precisely, the presence of contaminants.
- At least 2, preferably at least 5, preferably at least 10, more preferably at least 100, 50, 40 or 15 loci of interest, or even at least 20 loci of interest, are assessed. Even if there is no upper limit to the number of loci of interest that can be assessed, it is preferred to limit it. Indeed, it is possible to characterize a variety with a limited number of markers (specific for loci), between 15 and 20, and to use this set of markers to discriminate plants of this variety from other plants.
- a variety is understood as a set of plants with the same genetic background, the variety can be a commercial variety, but also a line not yet listed in the catalog, basic line, pre-base line or line undergoing propagation.
- the optimal number of loci of interest is defined by a person skilled in the art, as a function of the plant material considered, but also by fixing the minimum number of loci discriminating any pair of given varieties.
- the minimum number of loci discriminating any pair of varieties can be fixed at three, limiting the risk of confusing a real contamination and an experimental false positive.
- Different algorithms are described by Rosenberg et al. (Journal of Computational Biology 12 (9), 2005, 1183-1201) to select a set of discriminating markers.
- markers can be improved or modified to take into account other criteria such as the quality of the markers chosen (quality meaning their ability to be amplified and unequivocally identified).
- Groups or categories of markers can be identified and define a subgroup of markers which will preferably contain markers from a given group or from different groups. We can thus define a set of markers that we want to use.
- the algorithm can also take into account the statistical quality of these markers, defined as the minimum number of discriminating markers needed to declare a pair of individuals as different. From this criterion, the quality of discrimination of a set of markers can be assessed by the number of pairs of individuals that this set is capable of discriminating, ideally all of the individuals managed by the producer.
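The discrimination criterion described above can be checked directly on a table of reference genotypes. A minimal sketch, assuming each variety is described by one genotype call per marker (the variety names, genotypes and function names are hypothetical, and this is not the selection algorithm of Rosenberg et al.):

```python
from itertools import combinations


def pairwise_discrimination(profiles: dict[str, tuple[str, ...]]) -> dict[tuple[str, str], int]:
    """Number of markers that differ for every pair of varieties."""
    return {
        (a, b): sum(x != y for x, y in zip(profiles[a], profiles[b]))
        for a, b in combinations(sorted(profiles), 2)
    }


def undiscriminated_pairs(profiles: dict[str, tuple[str, ...]], min_loci: int = 3) -> list[tuple[str, str]]:
    """Pairs of varieties separated by fewer than min_loci markers."""
    return [pair for pair, n in pairwise_discrimination(profiles).items() if n < min_loci]


# Hypothetical genotypes at four markers for three varieties.
profiles = {
    "V1": ("AA", "CC", "GG", "TT"),
    "V2": ("AA", "TT", "AA", "CC"),
    "V3": ("GG", "CC", "AA", "CC"),
}
print(pairwise_discrimination(profiles))
print(undiscriminated_pairs(profiles, min_loci=3))
```

A marker set is acceptable under this criterion when the second call returns an empty list; otherwise the listed pairs need additional discriminating markers.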
- the method will preferably be implemented on loci of interest making it possible both to discriminate the variety of interest (ensuring the consistency and the concordance of the genetic background between plants) and to identify the presence or absence of other loci of interest (notably linked to traits of interest).
- the method described here therefore makes it possible to determine the presence of contaminants in a batch of seeds, in particular to control varietal purity during an industrial production process.
- This method can also be implemented in order to check the purity level of a trait which is sought in the homozygous state in the batch of seeds.
- preferably, only the region of the genome containing the particular trait that one wishes to follow is evaluated. Several traits can be followed simultaneously, using specific markers for each trait.
- allelic form specific to a given locus: in this context, this allelic form can be native, linked to a mutation identified by TILLING or EcoTILLING, linked to the footprint of a transposable element, obtained by gene editing, or obtained by any other method. In this context the mutation, whether it is a point mutation, an insertion or a deletion, involves a limited number of bases.
- This method can also be applied to a trait desired in the heterozygous state; the contaminant will then correspond to an alternative form to the allelic forms expected in this individual.
- a trait (which can be linked to a single allele or to several alleles) provides the plant with a phenotypic character of interest (such as drought resistance, resistance to biotic stress, resistance to lack of nitrogen, or increased yield).
- the method can be implemented by searching for the presence of the allelic form not containing the insertion or the mutation considered.
- the presence of this allelic form indicating that the presence of the trait linked to the mutation in a homozygous form in the seed lot is not fully guaranteed.
- This method could be used, for example, when the mutation corresponds to the introgression of a DNA fragment from another species. This particular case will be encountered, for example, when checking the purity of fertility-restoring lines in rapeseed.
- This method also makes it possible to search for the fortuitous presence of a trait. The trait in question could be a GMO, a mutation linked to gene editing or the introgression of a fragment coming from a heterologous species. This search will be done by amplification, then sequencing, of a specific region of the T-DNA or of the insertion.
- this method can be applied to traits linked to small mutations if primers allowing specific amplification of the region when one is in the presence of the mutated allelic form can be defined.
- the protocol can be extended to identify the presence of traits at frequencies ranging, for example, up to 10%. In this context, one can check, for example, the presence of 10% wild seeds in a batch of GMO seeds (legislation on refuge areas).
- these applications are not limited to GMOs: the trait followed by this method can be the introgression, in a line, of a fragment from another species, for example the presence of a fertility-restoring locus from radish in rapeseed. In the same way, the method can make it possible to verify that this introgression is indeed in the homozygous state.
- the method can be used to detect the fortuitous (unwanted) presence of GMOs or of another mutation linked to the insertion of a fragment of substantial size, in a batch of seeds.
- This mutation can be linked to the presence of a transposable element or to an insertion obtained in particular by Gene Editing.
- primers specific to a transgene or of the particular insertion will be used (if a particular contamination is suspected) or different generic primers making it possible to detect different transgenes without a priori.
- steps b), c) and d) are carried out for several regions of the genome containing several loci of interest.
- this embodiment is preferred when a subset of several loci makes it possible to discriminate or identify a variety of interest.
- this number of loci is variable and these loci can be determined by the skilled person in particular according to the teachings of Rosenberg (cited above).
- it may integrate information concerning the production plan, involving specific controls and measures: isolation distances, border areas, castration, which implies that the risk of contamination will be limited and the seed lot will be a priori uncontaminated or slightly contaminated.
- contamination will most likely come from a known contaminant, in particular from a parental line, including the parental lines involved in the production of basic and pre-basic seeds.
- the number of markers making it possible to identify the purity of a line can be very reduced, it can in particular be 20 or less.
- a batch is declared as containing a contaminant if an alternative allele to the expected allele is observed for a single locus of interest.
- a batch is declared as containing a contaminant if an alternative allele to the expected allele is observed for more than one locus of interest (in particular 2 or 3 loci).
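The two declaration rules above differ only in the number of loci required. A minimal sketch, assuming the presence / absence call for each locus has already been made (the locus names and the function name are hypothetical):

```python
def sublot_is_contaminated(alt_detected: dict[str, bool], min_loci: int = 1) -> bool:
    """Declare a sublot contaminated when an alternative allele is seen at
    no fewer than min_loci loci of interest.

    alt_detected maps each locus to True when an alternative allele was
    detected there. min_loci=1 is the single-locus rule; 2 or 3 gives the
    stricter rule that limits experimental false positives.
    """
    if min_loci < 1:
        raise ValueError("min_loci must be at least 1")
    return sum(alt_detected.values()) >= min_loci


calls = {"SNP1": True, "SNP2": False, "SNP3": True, "SNP4": False}
print(sublot_is_contaminated(calls, min_loci=1))
print(sublot_is_contaminated(calls, min_loci=3))
```

Requiring several loci trades sensitivity for robustness: a contaminant that differs from the expected variety at fewer than `min_loci` markers would go undetected, which is why the marker set is chosen so that any pair of varieties differs at a sufficient number of loci.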
- At least or exactly one locus of interest is linked to a character of interest (trait). In another embodiment, it is a combination of loci which is related to a character of interest (trait).
- At least one locus of interest is linked to a specific trait a priori not present in the seeds of the batch.
- the method is essentially qualitative. The integration of these markers in the claimed protocol makes it possible to carry out, in a single experiment, the additional controls necessary elsewhere.
- a lot is considered non-compliant if the frequency of the unwanted trait (s) is more than 10% in the seed lot.
- the quantity of seeds in each sublot prepared in step a) is between 80 and 120.
- the method described here can also be used to determine the intrinsic agronomic characteristics of the seeds present in the lot. Thus, one can determine the expression of genes that will lead to unwanted properties of seeds (for example dormancy marker genes which, if expressed, are a marker of seed non-germination). In order to determine the expression of these genes in the seeds of the batch, the RNA is extracted and reverse transcribed. Thus, the method described above can also include the following steps:
- i) RNA extraction is carried out from the seeds of the sublot, followed by a reverse transcription of this RNA into cDNA, before step b)
- ii) sequencing of this cDNA is carried out using primers specific for dormancy genes, at the same time as the sequencing of step b)
- iii) the presence of non-germinating seeds is qualitatively determined for each sublot, in the event of detection of cDNAs relating to dormancy genes during the sequencing step ii) (presence / absence of cDNA)
- iv) the quantity of dormant seeds in the overall lot is determined by the compilation of the qualitative results obtained for all of the sublots in iii).
- Steps iii) and iv) are carried out in the same way as described above.
- the seeds in the batch do not generally have the dormancy character and, by choosing the number of seeds in the sublots adequately, the qualitative information in iii) can be used to obtain quantitative information.
- Thus, if at most 5% of the seeds have the dormancy character (case generally observed in commercial seed lots, for which at least 95% of the seeds germinate satisfactorily), sublots containing around 20 seeds (between 15 and 25 seeds) are used.
- This dormancy problem is particularly important for sunflower, wheat and rice seeds.
- the dormancy marker genes whose expression is evaluated by sequencing of cDNA obtained from the seed RNA are preferably chosen from the genes known in the art and some of which are described below.
- a trait may correspond to an expression level of a marker gene.
- the germination quality of a seed lot is an essential characteristic, and this quality can change during the conservation of seeds.
- a state in which a seed does not germinate when it is in a favorable germination condition (temperature and humidity) is called a dormant state.
- Dormancy reflects an adaptation of plant species to environmental conditions (ability to put itself in a latent state in the absence of favorable conditions for the development of the plant).
- sunflower, rice or sorghum have a dormancy whose release is accompanied by an improvement in germination at low temperature, while in the case of wheat, barley or oats, it acts to improve germination at higher temperatures (Baskin and Baskin, Seed Science Research (2004) 14, 1-16).
- This property is particularly important in the case of cultivated species, the objective being to produce and market batches of seeds capable of germinating quickly and evenly after sowing. It is therefore important to be able to characterize the dormancy level of a batch of seeds. Such analyses are carried out routinely in factories through germination tests, which use in particular Ethrel, which has the ability to break dormancy.
- these analyzes are long and require a large workforce, hence the advantage of being able to replace them with molecular analyzes.
- DOG1 (Delay Of Germination 1)
- the role of this gene appears to be conserved between species, for example in lettuce (Huo et al., PNAS April 12, 2016, 113 (15) E2199-E2206) or wheat (Ashikawa et al., Transgenic Res (2014) 23: 621).
- in sunflower, Layat et al. analyzed the abundance of RNA associated with the polysomal fraction in dormant or non-dormant embryos, and identified genes associated with the dormant state, such as HSP genes (HSP70, HSP101), as well as stress response genes or genes involved in the signaling pathways of abscisic acid (ABA), a hormone associated with maintaining dormancy.
- genes such as tubulin alpha are specifically expressed in non-dormant seeds (Layat et al., op. cit.).
- the analysis of the expression of a gene specific to the dormant state makes it possible to characterize the germinative quality of a batch of seeds.
- the objective being to qualify batches for their germinative capacity
- the analysis of the expression of a gene specific to the dormant state makes it possible to determine the percentage of dormant seeds in a non-dormant batch, by semi-quantitative analysis.
- the joint analysis of a gene specific to the dormant state and a gene specific to the non-dormant state would make it possible, by calculating the relative abundances of these two genes, to express a dormancy rate.
- the appropriate marker gene can be chosen based on the timing of this sequencing test phase. These tests may be carried out, for example, shortly before the seeds are packaged for marketing. This evaluation will concern in particular the quality of the priming, the aptitude for germination, the vigor and the viability of the seeds. The aptitude for germination is described in particular in application WO 2018/015495.
- the method described above can also be used to determine the specific purity of the seed lot, i.e. the presence or not (and the quantification) of seeds from a species other than the species of seeds from the seed lot. Such an analysis is currently carried out systematically by operators, who visually determine the presence or not of seeds of unwanted species (ISTA (International Seed Testing Association) rules chapter 4).
- the DNA of the sublots is also sequenced using primers specific for one or more species different from those of the seeds present in the sublot, at the same time as the sequencing of step b)
- the quantity of exogenous seeds in the overall lot is determined by the compilation of the qualitative results obtained for all of the sublots in ii).
- Steps ii) and iii) are carried out in the same manner as described above. The lot generally does not contain many seeds from other species and, by choosing the number of seeds in the sublots adequately, the qualitative information in iii) can be used to obtain quantitative information. Thus, if it is known that at most 1% of the seeds present come from a species other than the species of interest (case generally observed in commercial seed lots, for which at least 99% of the seeds are of the species of interest), sublots containing about 100 seeds (between 80 and 120 seeds) are used.
- The method described above can also be used to detect the presence of pathogens in the seed lot (contamination) (see ISTA (International Seed Testing Association) rules, chapter 7). For example, the quantity of sunflower seeds contaminated with Botrytis tolerated for the marketing of a batch of sunflower seed is 5%.
- step b) sequencing the DNA or cDNA included in the sublots using primers specific for pathogenic species, at the same time as the sequencing of step b)
- This method is particularly suitable for detecting the presence of Xanthomonas campestris pv. campestris in seeds of Brassica (ISTA rule 7-019a: Detection of Xanthomonas campestris pv. campestris in Brassica spp. seed; Berg, Plant Pathology (2005) 54, 416-427).
- a PCR test exists for the identification of downy mildew on sunflower seed (Ioos et al., Plant Pathology (2007) 56, 209-218).
- RNA is extracted from each seed sublot and a reverse transcription of this RNA into cDNA is carried out.
- steps i) and ii) can be carried out simultaneously, the extraction of DNAs and RNAs being able to be carried out in particular by means of the total DNA, RNA and protein isolation kit NucleoSpin® TriPrep from Macherey-Nagel.
- step iv) is carried out by amplifying specific sequences of the genes (in particular other organisms) of which it is desired to verify the absence or the presence. We are therefore trying to determine if these other organisms are present in quantities lower than the tolerated rates for marketing. It is thus possible to detect the presence in particular of viral sequences. We can also make a non-specific amplification of the entire DNA of the genome.
- step iv) can also be carried out by amplifying specific sequences making it possible to determine certain agronomic properties of the seeds of the sublot, at least one agronomic property of the seeds being notably chosen from the state of dormancy , in particular the quality of the priming, the aptitude for germination, the vigor and the viability of the seeds.
- the method contains the steps:
- i) RNA extraction is also carried out from the seeds of the sublot, followed by a reverse transcription of this RNA into cDNA, before step b)
- ii) sequencing of this cDNA is carried out using primers specific for genes linked to an agronomic property of the seeds, at the same time as the sequencing of step b)
- iii) the presence of seeds having the agronomic property is determined qualitatively, in the event of detection of cDNAs relating to the genes specific to the agronomic property of the seeds during the sequencing step ii) (presence / absence of cDNA)
- iv) the quantity of seeds exhibiting this agronomic character in the overall lot is determined by the compilation of the qualitative results obtained for all of the sublots in iii).
- the agronomic property of the seeds is chosen from the dormant state, in particular the quality of the priming, the aptitude for germination, the vigor and the viability of the seeds.
- Several agronomic properties can also be sought by sequencing suitable genes.
- the gene which marks the physiological state and the agronomic property of the seeds is chosen from the genes which are expressed, in the seeds, at the same time as the undesired agronomic character (dormancy, lack of vigor, etc.). Thus, we want an absence of expression of this gene and we generally wish that the expression of this gene is not present in more than 10% of the seeds in the seed lot.
- varietal purity analysis can identify the contaminant (s) present in the seed lot.
- for each subsample, it is possible to define a molecular profile corresponding to the compilation of data from each locus of interest.
- the profile of each subsample can then be compared to the expected molecular profile, and a contaminating molecular profile can be deduced by subtraction.
- a locus of interest with no alternative allele will be considered identical between the expected variety and the contaminant, while a locus with an alternative allele will be defined as potentially homozygous for the alternative allele, or heterozygous (expected allele / alternative allele).
- contaminant molecular profiles can then be compared to a reference database in order to identify the nature of the contaminant, and possibly when it entered the production cycle.
- ii) compare the profile obtained in i) with those of a reference database.
- a method of determining the degree of purity is considered, as defined above, characterized in that the contaminant is further identified for each sublot contaminated in
- One or more contaminant profiles are therefore obtained for the starting seed lot, corresponding to the sum of the contaminants of each contaminated sublot.
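The deduction of a contaminant profile by subtraction, followed by comparison with a reference database, can be sketched as follows. This is an illustrative outline only: it assumes a homozygous expected variety, a single contaminant and at most one alternative allele per locus in each sublot, and all names (loci, accessions, functions) are hypothetical.

```python
def contaminant_candidates(expected: dict[str, str], observed: dict[str, set[str]]) -> dict[str, set[tuple[str, str]]]:
    """Possible contaminant genotypes at each locus of a contaminated sublot.

    expected: expected allele per locus (homozygous variety assumed).
    observed: set of alleles detected per locus in the sublot.
    """
    candidates = {}
    for locus, exp in expected.items():
        alternatives = sorted(observed[locus] - {exp})
        if not alternatives:
            # No alternative allele: contaminant taken as identical here.
            candidates[locus] = {(exp, exp)}
        else:
            alt = alternatives[0]  # assumes one alternative allele per locus
            # Homozygous alternative, or heterozygous expected / alternative.
            candidates[locus] = {(alt, alt), tuple(sorted((exp, alt)))}
    return candidates


def matching_accessions(candidates, reference_db: dict[str, dict[str, tuple[str, str]]]) -> list[str]:
    """Accessions whose genotype is compatible with the candidates at every locus."""
    return [
        name
        for name, profile in reference_db.items()
        if all(tuple(sorted(profile[locus])) in options for locus, options in candidates.items())
    ]


expected = {"SNP1": "A", "SNP2": "C", "SNP3": "G"}
observed = {"SNP1": {"A", "G"}, "SNP2": {"C"}, "SNP3": {"G", "T"}}
reference_db = {
    "line_B": {"SNP1": ("G", "G"), "SNP2": ("C", "C"), "SNP3": ("T", "T")},
    "line_C": {"SNP1": ("A", "A"), "SNP2": ("T", "T"), "SNP3": ("G", "G")},
}
print(matching_accessions(contaminant_candidates(expected, observed), reference_db))
```

An empty result corresponds to a contaminant that cannot be identified from the database.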
- the methods described above therefore make it possible to carry out a quality control of seed lots, on several different traits (varietal purity, specific purity, agronomic characteristics, contamination by pathogens), in a single step, and by quantifying the presence of some of the unwanted traits or contaminants. Furthermore, these methods allow the precise determination of the nature of the contaminants present, due to the use of sequencing which gives precise information which can be easily used, as well as the determination of the presence of SNP (Single Nucleotide Polymorphism, polymorphism relating to a single nucleotide) which could not be detected by other methods (probes, amplifications, DNA chips). These methods therefore provide high precision with regard to the characterization of the batch of seeds tested.
- the methods described make it possible to improve the precision of the control of seed batches, in particular when they are combined.
- These same methods can also be transposed and used to study the conformity of plants marketed as plants (species with vegetative multiplication). The material evaluated will then consist of samples of plant tissue, in an amount equivalent from one plant to the next; this plant tissue could be, among other things, a leaf disc.
- Figure 1 result of the Taqman analysis for a SNP, comprising two allelic forms detected respectively by the FAM and VIC fluorochromes, in samples of maize homozygous (A, B) or heterozygous for SNP (C).
- A homozygous sample for the allelic form detected in FAM.
- B homozygous sample for the allelic form detected in VIC.
- C heterozygous sample for the allelic forms detected in FAM and VIC.
- Figure 2 Relative frequency, in each sub-lot, of the alternative allele for SNP10. Sub-lots 3, 14 and 16 show a significant frequency of the alternative allele.
- Figure 3 Qualitative profile (presence / absence of a contaminating allele) Profile of the presence of an alternative allele for the 17 markers (line) (16 discriminating markers and one marker associated with a trait) within the 16 sublots ( column). The presence of an alternative allele is detected for at least 3 SNPs in sub-lots 3, 14 and 16. These sub-lots are declared contaminated. The other 13 sublots are declared uncontaminated.
- Figure 4 molecular profiles obtained on the 17 SNPs (16 discriminating markers and one marker associated with a trait) obtained on the 16 sublots analyzed.
- the profile of the first line corresponds to the majority profile, the following profiles to the contaminated profiles observed for lots 3, 14 and 16 respectively.
- This example evaluates the possibility of detecting a contaminating seed in a sub-batch of corn seeds, by genotyping using Taqman technology (Applied Biosystems).
- FIG. 1 shows the result of the Taqman analysis for an SNP, comprising two allelic forms detected respectively by the fluorochromes FAM and VIC, in samples of corn homozygous or heterozygous at the SNP, and highlights the presence of signal with the FAM probe in a sample homozygous for the VIC allele (B), that is to say a non-specific signal, which does not allow a false positive signal to be distinguished from a signal linked to actual contamination in a sample.
- lots of 200 seeds from a line A containing 10%, 20%, 30%, 40%, and up to 90% of contaminants by a line B were produced and a sample of 15 seeds from this batch was analyzed by genotyping on an Infinium chip (Illumina), in order to assess the feasibility of identifying a contamination.
- Example 3 implementation of the method according to the invention on a set of markers
- SNPs discriminating markers
- an amplicon of 70 to 120bp was defined, and the 16 markers co-amplified by multiplex PCR.
- a unique index (TAG) is used for each DNA sample, allowing sequencing of all the amplicons and assigning the sequences obtained to their original batch.
- the amplicons were sequenced by Illumina technology on a MiniSeq sequencer. Paired reads of 75 bases were generated and assigned to the original DNAs by a demultiplexing step. After removal of adapter sequences and poor-quality bases (threshold Q30), each pair of reads is assembled into a single sequence, then aligned with the reference corn genome (RefGenV4). For each SNP, the relative allelic frequencies of the majority allele and the alternative allele were calculated; each corresponds to the number of reads containing the allele of interest divided by the sum of the reads of each allele.
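The Q30 threshold mentioned above corresponds to a Phred quality score of 30, that is, an estimated base-calling error probability of 1 in 1,000. A minimal sketch of quality trimming at that threshold, assuming Phred+33 encoded quality strings and a simple trimming of the 3' end (both are assumptions for illustration; the trimming tool and strategy actually used are not stated):

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Base-calling error probability for a Phred score: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def trim_3prime(sequence: str, quality: str, min_q: int = 30) -> str:
    """Remove bases from the 3' end until one reaches the quality threshold."""
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return sequence[:end]


# 'I' encodes Q40, '?' encodes Q30, '#' encodes Q2 and '5' encodes Q20.
print(trim_3prime("ACGTAC", "III?#5"))
print(error_probability(30))
```

Only the trailing low-quality bases are removed here; a base below Q30 in the middle of a read would be kept, which is a simplification.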
- a sample is declared contaminated when it contains at least 3 SNPs for which an alternative allele is detected. Thus, it is concluded that, among these 24 sublots, 13 are considered to be contaminated and 11 to be pure.
- the number of contaminated sublots makes it possible to estimate the varietal purity of the analyzed batch. This calculation is carried out using the SeedCalc software, which uses the formulas of Remund (2001). In this example, the estimated purity is 99.22% (98.64% - 99.6%), for an actual controlled purity of 99%.
- the estimate of the impurity P of the batch is obtained according to the formula:
- each sublot was crushed and the DNA extracted.
- a set of 17 markers including 16 discriminating SNPs (allowing unambiguous identification of the presence of a variety other than that expected) and a marker associated with a trait, has been identified.
- an amplicon of 70-120bp was defined, and the 17 markers were co-amplified by multiplex PCR.
- a unique index (Tag) is used for each DNA sample, allowing sequencing of all the amplicons and assigning the sequences obtained to their original batch.
- the amplicons were sequenced by Illumina technology on a MiniSeq sequencer. Paired reads of 75 bases were generated and assigned to the original DNAs by a demultiplexing step. After removal of adapter sequences and poor-quality bases (threshold Q30), each pair of reads is assembled into a single sequence, then aligned with the reference corn genome (RefGenV4). For each SNP, the relative allelic frequencies of the majority allele and the alternative allele were calculated; each corresponds to the number of reads containing the allele of interest divided by the sum of the reads of each allele.
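The relative allelic frequency defined above, and the presence / absence call derived from it, can be sketched as follows. The read counts and the background-noise threshold are hypothetical values chosen for illustration; no threshold value is given in the text.

```python
def alt_allele_frequency(expected_reads: int, alt_reads: int) -> float:
    """Reads carrying the alternative allele divided by the reads of both alleles."""
    total = expected_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return alt_reads / total


def alt_allele_present(expected_reads: int, alt_reads: int, noise_threshold: float) -> bool:
    """Qualitative call: True when the frequency exceeds the background noise."""
    return alt_allele_frequency(expected_reads, alt_reads) > noise_threshold


# Hypothetical read counts per sublot for one SNP: (expected allele, alternative allele).
counts = {"sublot_1": (9990, 10), "sublot_3": (9900, 100)}
NOISE = 0.005  # hypothetical background-noise level
for sublot, (exp_reads, alt_reads) in counts.items():
    freq = alt_allele_frequency(exp_reads, alt_reads)
    print(sublot, f"{freq:.4f}", alt_allele_present(exp_reads, alt_reads, NOISE))
```

Because only the presence or absence of the alternative allele is used downstream, the frequency itself does not need to be interpreted quantitatively.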
- Figure 2 shows, for an SNP (SNP10), the frequency of the alternative allele in each of the sub-lots (i.e. the frequency of appearance of the sequence of the alternative allele).
- sublots 3, 14 and 16 show a significant presence of the alternative allele (above the background noise represented by the horizontal line).
- Figure 3 shows the qualitative profile (presence / absence of the alternative allele) obtained for each SNP in each sublot. It confirms the presence of an alternative allele for at least 3 SNPs in sub-lots 3, 14 and 16.
- These 3 sub-lots are declared contaminated.
- the other 13 sublots are declared uncontaminated.
- the varietal purity rate estimated with SeedCalc is 99.79% (95% confidence interval: 99.39% - 99.96%).
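A confidence interval of this kind can be approximated with an exact (Clopper-Pearson) binomial interval on the fraction of contaminated sublots, transformed to the seed level. The sketch below is an assumption about how such an interval can be obtained, not a description of SeedCalc internals; the sublot size of 100 seeds is also assumed. With 3 contaminated sublots out of 16, it gives bounds close to the reported 99.39% and 99.96%.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float) -> float:
    """Bisection for f(p) = target on [0, 1], where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def contaminated_fraction_interval(d: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Clopper-Pearson interval for the fraction of contaminated sublots."""
    lower = 0.0 if d == 0 else _solve_decreasing(lambda p: binom_cdf(d - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if d == n else _solve_decreasing(lambda p: binom_cdf(d, n, p), alpha / 2)
    return lower, upper


def purity_interval(n_sublots: int, n_contaminated: int, sublot_size: int, alpha: float = 0.05) -> tuple[float, float]:
    """Seed-level purity interval, using purity = (1 - fraction) ** (1 / sublot_size)."""
    q_low, q_high = contaminated_fraction_interval(n_contaminated, n_sublots, alpha)
    return (1 - q_high) ** (1 / sublot_size), (1 - q_low) ** (1 / sublot_size)


low, high = purity_interval(16, 3, 100)
print(f"{100 * low:.2f}% - {100 * high:.2f}%")
```

The interval is wide relative to the point estimate because only 16 sublots are analyzed; analyzing more sublots narrows it.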
- the SNP17 marker was analyzed separately and used to estimate the purity of the associated trait.
- Figure 3 shows that sublots 3 and 16 have a significant frequency of the alternative allele. These 2 sublots are declared contaminated, leading to an estimate of the trait purity of 99.87% (95% confidence interval: 99.52% - 99.98%).
- the molecular profile identified on the uncontaminated sublots is first used to check its compliance with the expected profile for the variety analyzed (the previous step checks the varietal purity of the batch; this step checks that the variety identified is the one expected). Then, on sub-lots 3, 14 and 16 showing contamination, a contaminating molecular profile is deduced from the observed molecular profile, by subtraction from the expected profile. For each SNP marker showing contamination, the 2 alleles observed are reported (Figure 4). The contaminant can thus be homozygous for the minority allele, or heterozygous.
- Each contaminating molecular profile is then compared to a reference database in order to identify it. If this genotype corresponds to a known accession, this is proposed as a potential contaminant, otherwise the contaminating genotype is declared unidentifiable.
- This reference database can be refined, in particular according to the production plan: it will then contain, as a priority, all of the varieties grown in the line production sector. In this context, a contaminant that does not appear in this reference base will be qualified as a contaminant linked to the post-harvest process.
- Example 5 Implementation of the method for the simultaneous evaluation of the varietal purity and the germinative quality of a batch of seeds
- 16 sublots of 100 seeds are formed, so as to evaluate the seed lot on a sample of 1600 seeds. DNA and RNA are co-extracted from each sublot.
- Each sublot is mechanically ground in a tube by the addition of stainless steel balls, the tubes and the grinding support being previously cooled in liquid nitrogen in order to preserve the integrity of the nucleic acids, in particular RNA.
- Co-extraction of DNA and RNA is carried out using the total DNA, RNA and protein isolation NucleoSpin® TriPrep kit from Macherey-Nagel.
- A lysis buffer is added to the ground material, destroying the cellular structures and simultaneously inactivating enzymes such as RNases.
- The lysates are then deposited on columns containing a silica membrane to which the DNA and RNA molecules bind.
- A first elution in a specific buffer elutes the DNA while keeping the RNA bound to the silica membrane. After treatment with DNase to degrade the residual DNA, the RNA is washed and then eluted in RNase-free water.
- Reverse transcription is carried out, primed with oligo-dT oligonucleotides, to synthesize the double-stranded DNA complementary to the messenger RNA present in each sample (cDNA).
- A DNA mixture is then constituted for each sublot, composed of the genomic DNA extracted and the cDNA synthesized from the RNA fraction.
- A multiplex PCR is carried out on each DNA sample in order to specifically amplify the targets of interest in the form of amplicons of 70 to 120 bp. These amplicons correspond, on the one hand, to the genomic regions of interest for determining the molecular profile for varietal identification (set of discriminating SNPs) and, on the other hand, to the DOG1 gene, a marker of the dormant state of the seeds.
- A unique index (TAG) is used for each DNA sample, making it possible to sequence all the amplicons together and to attribute the sequences obtained to their original sublot.
- The amplicons are sequenced by Illumina technology, generating paired reads of 75 bases each.
- The sequences are then assigned to the original DNA samples by a demultiplexing step, then undergo several treatments consisting in the removal of adapter sequences and of poor-quality bases (threshold Q30). Each pair of sequences is finally assembled into a single sequence, then aligned with the sequence of the reference genome.
- The relative allelic frequencies of the majority allele and the alternative allele are calculated; each corresponds to the number of reads containing the allele of interest divided by the sum of the reads of both alleles. Contamination is considered to exist for an SNP marker if, in a sublot, the sequence of an allelic form other than the allele expected for the variety tested is detected at a level above the background noise.
- A sample is declared contaminated when it contains at least 3 SNPs for which an alternative allele is detected. The number of contaminated sublots makes it possible to estimate the varietal purity of the batch analyzed. This calculation is carried out using the SeedCalc software, which uses the formulas of Remund (2001).
- A sublot is considered to contain a dormant seed if specific sequences of the DOG1 transcript are detected in quantities significantly different from the background noise, the expression of this gene being negligible in non-dormant seeds.
- This significance threshold is determined beforehand using a standard range.
- The dormancy rate is then estimated by counting the number of sublots for which expression of the DOG1 gene is detected, using the same calculation method as for varietal purity.
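The decision rules of this example can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the read counts, the `NOISE` threshold and all names are hypothetical (the text only says that the threshold is set above background noise), and the pooled-testing formula stands in for the SeedCalc computation as an assumption.

```python
from dataclasses import dataclass


@dataclass
class SnpCounts:
    expected_reads: int     # reads carrying the allele expected for the variety
    alternative_reads: int  # reads carrying the other allele


def alternative_frequency(c: SnpCounts) -> float:
    """Reads with the alternative allele over the sum of reads of both alleles."""
    total = c.expected_reads + c.alternative_reads
    return c.alternative_reads / total if total else 0.0


def is_contaminated(snps: dict[str, SnpCounts], noise: float, min_snps: int = 3) -> bool:
    """Sublot rule from the text: at least `min_snps` SNPs above background noise."""
    hits = [name for name, c in snps.items() if alternative_frequency(c) > noise]
    return len(hits) >= min_snps


def rate_from_pools(positive: int, pools: int, pool_size: int) -> float:
    """Per-seed rate implied by the share of positive pools (assumed estimator)."""
    return 1 - (1 - positive / pools) ** (1 / pool_size)


# Hypothetical read counts for two sublots; NOISE is an assumed threshold.
NOISE = 0.005
sublots = {
    "sublot_A": {"SNP1": SnpCounts(990, 10), "SNP2": SnpCounts(985, 15),
                 "SNP3": SnpCounts(992, 8), "SNP4": SnpCounts(1000, 0)},
    "sublot_B": {"SNP1": SnpCounts(999, 1), "SNP2": SnpCounts(1000, 0),
                 "SNP3": SnpCounts(988, 12), "SNP4": SnpCounts(1000, 0)},
}
calls = {name: is_contaminated(snps, NOISE) for name, snps in sublots.items()}
print(calls)

# The same pool-level counting applies to DOG1: a sublot counts as positive when
# DOG1 transcript reads exceed the threshold set with the standard range.
dormant_sublots = 1  # hypothetical count out of 16 sublots of 100 seeds
print(f"dormancy rate: {rate_from_pools(dormant_sublots, 16, 100):.3%}")
```

Keeping the per-SNP frequency, the sublot call and the pool-to-seed conversion as separate functions lets the same conversion serve both the varietal purity and the dormancy estimates, as the text describes.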
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1857115A FR3084374B1 (en) | 2018-07-30 | 2018-07-30 | PROCESS FOR QUALITY CONTROL OF SEED LOTS |
PCT/EP2019/070386 WO2020025554A1 (en) | 2018-07-30 | 2019-07-29 | Method for the quality control of seed lots |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3830287A1 (en) | 2021-06-09 |
Family
ID=63722623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19749675.5A Pending EP3830287A1 (en) | 2018-07-30 | 2019-07-29 | Method for the quality control of seed lots |
Country Status (7)
Country | Link |
---|---|
US (1) | US20210317539A1 (en) |
EP (1) | EP3830287A1 (en) |
JP (1) | JP2021532834A (en) |
AU (1) | AU2019312799A1 (en) |
CA (1) | CA3107562A1 (en) |
FR (1) | FR3084374B1 (en) |
WO (1) | WO2020025554A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA3212294A1 (en) * | 2021-03-02 | 2022-09-09 | Indiana Crop Improvement Association | Genetic purity estimate method by sequencing |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2417476A1 (en) * | 2002-01-29 | 2003-07-29 | Third Wave Technologies, Inc. | Systems and methods for analysis of agricultural products |
US20040241662A1 (en) * | 2003-05-30 | 2004-12-02 | Robey W. Wade | Detecting microbial contamination in grain and related products |
NL1034267C2 (en) * | 2007-08-17 | 2009-02-18 | Stichting Tech Wetenschapp | Method for measuring seed quality. |
US8716550B2 (en) | 2007-09-24 | 2014-05-06 | Keygene N.V. | Method for the selection of plants with specific mutations |
US10172305B2 (en) * | 2011-04-29 | 2019-01-08 | Monsanto Technology Llc | Diagnostic molecular markers for seed lot purity traits in soybeans |
US20160047003A1 (en) | 2013-03-08 | 2016-02-18 | Vineland Research And Innovation Centre | High throughput method of screening a population for members comprising mutation(s) in a target sequence |
FR3016698B1 (en) * | 2014-01-21 | 2020-10-30 | Limagrain Europe | SEED TISSUE SAMPLING PROCESS |
AU2016220556B2 (en) * | 2015-02-19 | 2017-09-21 | Yeditepe Universitesi | Coating formulation for seed and surface sterilization |
WO2018015495A1 (en) | 2016-07-20 | 2018-01-25 | Vilmorin & Cie | Method for predicting the germination ability of maize seed using nuclear magnetic resonance |
2018
- 2018-07-30 FR FR1857115A patent/FR3084374B1/en active Active
2019
- 2019-07-29 JP JP2021529517A patent/JP2021532834A/en active Pending
- 2019-07-29 AU AU2019312799A patent/AU2019312799A1/en active Pending
- 2019-07-29 WO PCT/EP2019/070386 patent/WO2020025554A1/en unknown
- 2019-07-29 CA CA3107562A patent/CA3107562A1/en active Pending
- 2019-07-29 EP EP19749675.5A patent/EP3830287A1/en active Pending
- 2019-07-29 US US17/264,427 patent/US20210317539A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CA3107562A1 (en) | 2020-02-06 |
FR3084374B1 (en) | 2024-04-26 |
WO2020025554A1 (en) | 2020-02-06 |
AU2019312799A1 (en) | 2021-02-25 |
FR3084374A1 (en) | 2020-01-31 |
US20210317539A1 (en) | 2021-10-14 |
JP2021532834A (en) | 2021-12-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20210121 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
17Q | First examination report despatched | Effective date: 20240306 |