EP2813141A1 - Directed strategies for improving phenotypic traits - Google Patents
- Publication number
- EP2813141A1 (EP20140172374, EP14172374A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- individuals
- population
- breeding
- allele
- loci
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 claims abstract description 127
- 241000283690 Bos taurus Species 0.000 claims abstract description 10
- 108700028369 Alleles Proteins 0.000 claims description 160
- 230000001488 breeding effect Effects 0.000 claims description 131
- 238000009395 breeding Methods 0.000 claims description 124
- 230000000694 effects Effects 0.000 claims description 97
- 238000006467 substitution reaction Methods 0.000 claims description 59
- 238000012549 training Methods 0.000 claims description 39
- 230000006798 recombination Effects 0.000 claims description 37
- 238000005215 recombination Methods 0.000 claims description 37
- 230000002068 genetic effect Effects 0.000 claims description 26
- 241000196324 Embryophyta Species 0.000 claims description 23
- 238000003205 genotyping method Methods 0.000 claims description 15
- 238000001514 detection method Methods 0.000 claims description 13
- 241000894007 species Species 0.000 claims description 12
- 240000002791 Brassica napus Species 0.000 claims description 10
- 240000008042 Zea mays Species 0.000 claims description 9
- 239000003147 molecular marker Substances 0.000 claims description 9
- 240000005020 Acaciella glauca Species 0.000 claims description 8
- 241000219195 Arabidopsis thaliana Species 0.000 claims description 7
- 240000008067 Cucumis sativus Species 0.000 claims description 7
- 241001465754 Metazoa Species 0.000 claims description 7
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 claims description 7
- 235000002017 Zea mays subsp mays Nutrition 0.000 claims description 7
- 235000009973 maize Nutrition 0.000 claims description 7
- 235000011293 Brassica napus Nutrition 0.000 claims description 6
- 235000000540 Brassica rapa subsp rapa Nutrition 0.000 claims description 6
- 244000070406 Malus silvestris Species 0.000 claims description 6
- 241000219000 Populus Species 0.000 claims description 6
- 235000013399 edible fruits Nutrition 0.000 claims description 6
- 235000010799 Cucumis sativus var sativus Nutrition 0.000 claims description 5
- 235000010328 Acer nigrum Nutrition 0.000 claims description 4
- 240000002245 Acer pensylvanicum Species 0.000 claims description 4
- 235000006760 Acer pensylvanicum Nutrition 0.000 claims description 4
- 244000291564 Allium cepa Species 0.000 claims description 4
- 241000272517 Anseriformes Species 0.000 claims description 4
- 235000010921 Betula lenta Nutrition 0.000 claims description 4
- 240000001746 Betula lenta Species 0.000 claims description 4
- 241000219430 Betula pendula Species 0.000 claims description 4
- 235000009109 Betula pendula Nutrition 0.000 claims description 4
- 244000178993 Brassica juncea Species 0.000 claims description 4
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 claims description 4
- 235000011297 Brassica napobrassica Nutrition 0.000 claims description 4
- 235000006008 Brassica napus var napus Nutrition 0.000 claims description 4
- 240000007124 Brassica oleracea Species 0.000 claims description 4
- 235000003899 Brassica oleracea var acephala Nutrition 0.000 claims description 4
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 claims description 4
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 claims description 4
- 240000003259 Brassica oleracea var. botrytis Species 0.000 claims description 4
- 240000008100 Brassica rapa Species 0.000 claims description 4
- 235000010149 Brassica rapa subsp chinensis Nutrition 0.000 claims description 4
- 235000000536 Brassica rapa subsp pekinensis Nutrition 0.000 claims description 4
- 241000030939 Bubalus bubalis Species 0.000 claims description 4
- 241000282472 Canis lupus familiaris Species 0.000 claims description 4
- 240000003562 Carya cordiformis Species 0.000 claims description 4
- 235000005662 Carya cordiformis Nutrition 0.000 claims description 4
- 235000012939 Caryocar nuciferum Nutrition 0.000 claims description 4
- 235000014037 Castanea sativa Nutrition 0.000 claims description 4
- 240000007857 Castanea sativa Species 0.000 claims description 4
- 241000700199 Cavia porcellus Species 0.000 claims description 4
- 244000298479 Cichorium intybus Species 0.000 claims description 4
- 240000007582 Corylus avellana Species 0.000 claims description 4
- 235000007466 Corylus avellana Nutrition 0.000 claims description 4
- 235000009917 Crataegus X brevipes Nutrition 0.000 claims description 4
- 235000013204 Crataegus X haemacarpa Nutrition 0.000 claims description 4
- 235000009685 Crataegus X maligna Nutrition 0.000 claims description 4
- 235000009444 Crataegus X rubrocarnea Nutrition 0.000 claims description 4
- 235000009486 Crataegus bullatus Nutrition 0.000 claims description 4
- 235000017181 Crataegus chrysocarpa Nutrition 0.000 claims description 4
- 235000009682 Crataegus limnophila Nutrition 0.000 claims description 4
- 240000000171 Crataegus monogyna Species 0.000 claims description 4
- 235000004423 Crataegus monogyna Nutrition 0.000 claims description 4
- 235000002313 Crataegus paludosa Nutrition 0.000 claims description 4
- 235000009840 Crataegus x incaedua Nutrition 0.000 claims description 4
- 235000009854 Cucurbita moschata Nutrition 0.000 claims description 4
- 241000252233 Cyprinus carpio Species 0.000 claims description 4
- 241000252212 Danio rerio Species 0.000 claims description 4
- 241000283073 Equus caballus Species 0.000 claims description 4
- 240000000731 Fagus sylvatica Species 0.000 claims description 4
- 241000287828 Gallus gallus Species 0.000 claims description 4
- 244000068988 Glycine max Species 0.000 claims description 4
- 235000010469 Glycine max Nutrition 0.000 claims description 4
- 235000014056 Juglans cinerea Nutrition 0.000 claims description 4
- 240000004929 Juglans cinerea Species 0.000 claims description 4
- 235000005590 Larix decidua Nutrition 0.000 claims description 4
- 235000005087 Malus prunifolia Nutrition 0.000 claims description 4
- 240000005561 Musa balbisiana Species 0.000 claims description 4
- 241000276703 Oreochromis niloticus Species 0.000 claims description 4
- 241000283973 Oryctolagus cuniculus Species 0.000 claims description 4
- 241001494479 Pecora Species 0.000 claims description 4
- 244000046052 Phaseolus vulgaris Species 0.000 claims description 4
- 241000218657 Picea Species 0.000 claims description 4
- 235000008582 Pinus sylvestris Nutrition 0.000 claims description 4
- 241000183024 Populus tremula Species 0.000 claims description 4
- 240000001416 Pseudotsuga menziesii Species 0.000 claims description 4
- 235000011471 Quercus robur Nutrition 0.000 claims description 4
- 240000009089 Quercus robur Species 0.000 claims description 4
- 244000088415 Raphanus sativus Species 0.000 claims description 4
- 235000006140 Raphanus sativus var sativus Nutrition 0.000 claims description 4
- 241000700159 Rattus Species 0.000 claims description 4
- 244000061458 Solanum melongena Species 0.000 claims description 4
- 244000061456 Solanum tuberosum Species 0.000 claims description 4
- 235000002595 Solanum tuberosum Nutrition 0.000 claims description 4
- 244000019194 Sorbus aucuparia Species 0.000 claims description 4
- 241000282898 Sus scrofa Species 0.000 claims description 4
- 102000054765 polymorphisms of proteins Human genes 0.000 claims description 4
- 235000006414 serbal de cazadores Nutrition 0.000 claims description 4
- 244000241257 Cucumis melo Species 0.000 claims description 3
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 claims description 3
- 238000012070 whole genome sequencing analysis Methods 0.000 claims description 3
- 241000191291 Abies alba Species 0.000 claims description 2
- 235000004507 Abies alba Nutrition 0.000 claims description 2
- 235000014081 Abies amabilis Nutrition 0.000 claims description 2
- 244000101408 Abies amabilis Species 0.000 claims description 2
- 244000283070 Abies balsamea Species 0.000 claims description 2
- 235000007173 Abies balsamea Nutrition 0.000 claims description 2
- 241000208140 Acer Species 0.000 claims description 2
- 240000002767 Acer grandidentatum Species 0.000 claims description 2
- 235000010319 Acer grandidentatum Nutrition 0.000 claims description 2
- 244000205124 Acer nigrum Species 0.000 claims description 2
- 240000004731 Acer pseudoplatanus Species 0.000 claims description 2
- 235000002754 Acer pseudoplatanus Nutrition 0.000 claims description 2
- 240000004144 Acer rubrum Species 0.000 claims description 2
- 235000011772 Acer rubrum var tomentosum Nutrition 0.000 claims description 2
- 235000009057 Acer rubrum var tridens Nutrition 0.000 claims description 2
- 235000002629 Acer saccharinum Nutrition 0.000 claims description 2
- 235000010157 Acer saccharum subsp saccharum Nutrition 0.000 claims description 2
- 241000251468 Actinopterygii Species 0.000 claims description 2
- 241000157282 Aesculus Species 0.000 claims description 2
- 244000198134 Agave sisalana Species 0.000 claims description 2
- 235000006667 Aleurites moluccana Nutrition 0.000 claims description 2
- 235000005254 Allium ampeloprasum Nutrition 0.000 claims description 2
- 240000006108 Allium ampeloprasum Species 0.000 claims description 2
- 235000010167 Allium cepa var aggregatum Nutrition 0.000 claims description 2
- 235000002732 Allium cepa var. cepa Nutrition 0.000 claims description 2
- 240000002234 Allium sativum Species 0.000 claims description 2
- 241001564395 Alnus rubra Species 0.000 claims description 2
- 244000304226 Amelanchier arborea Species 0.000 claims description 2
- 235000007084 Amelanchier arborea Nutrition 0.000 claims description 2
- 235000007087 Amelanchier canadensis Nutrition 0.000 claims description 2
- 244000144725 Amygdalus communis Species 0.000 claims description 2
- 235000011437 Amygdalus communis Nutrition 0.000 claims description 2
- 244000144730 Amygdalus persica Species 0.000 claims description 2
- 244000226021 Anacardium occidentale Species 0.000 claims description 2
- 244000099147 Ananas comosus Species 0.000 claims description 2
- 235000007119 Ananas comosus Nutrition 0.000 claims description 2
- 241000272525 Anas platyrhynchos Species 0.000 claims description 2
- 241000272826 Anser anser domesticus Species 0.000 claims description 2
- 241000272816 Anser cygnoides Species 0.000 claims description 2
- 240000007087 Apium graveolens Species 0.000 claims description 2
- 235000015849 Apium graveolens Dulce Group Nutrition 0.000 claims description 2
- 235000010591 Appio Nutrition 0.000 claims description 2
- 235000006549 Arenga pinnata Nutrition 0.000 claims description 2
- 241001444063 Aronia Species 0.000 claims description 2
- 235000002672 Artocarpus altilis Nutrition 0.000 claims description 2
- 240000004161 Artocarpus altilis Species 0.000 claims description 2
- 235000008725 Artocarpus heterophyllus Nutrition 0.000 claims description 2
- 244000025352 Artocarpus heterophyllus Species 0.000 claims description 2
- 244000003416 Asparagus officinalis Species 0.000 claims description 2
- 235000005340 Asparagus officinalis Nutrition 0.000 claims description 2
- 244000075850 Avena orientalis Species 0.000 claims description 2
- 235000000832 Ayote Nutrition 0.000 claims description 2
- 235000021537 Beetroot Nutrition 0.000 claims description 2
- 235000016068 Berberis vulgaris Nutrition 0.000 claims description 2
- 241000335053 Beta vulgaris Species 0.000 claims description 2
- 241000219310 Beta vulgaris subsp. vulgaris Species 0.000 claims description 2
- 235000018185 Betula X alpestris Nutrition 0.000 claims description 2
- 235000018212 Betula X uliginosa Nutrition 0.000 claims description 2
- 244000189108 Betula alleghaniensis Species 0.000 claims description 2
- 235000018199 Betula alleghaniensis var. alleghaniensis Nutrition 0.000 claims description 2
- 235000018198 Betula alleghaniensis var. macrolepis Nutrition 0.000 claims description 2
- 235000009131 Betula nigra Nutrition 0.000 claims description 2
- 241000219495 Betulaceae Species 0.000 claims description 2
- 244000208235 Borassus flabellifer Species 0.000 claims description 2
- 241000283699 Bos indicus Species 0.000 claims description 2
- 241000219198 Brassica Species 0.000 claims description 2
- 235000006463 Brassica alba Nutrition 0.000 claims description 2
- 235000005637 Brassica campestris Nutrition 0.000 claims description 2
- 235000005156 Brassica carinata Nutrition 0.000 claims description 2
- 244000257790 Brassica carinata Species 0.000 claims description 2
- 235000003351 Brassica cretica Nutrition 0.000 claims description 2
- 235000011371 Brassica hirta Nutrition 0.000 claims description 2
- 235000014750 Brassica kaber Nutrition 0.000 claims description 2
- 244000178924 Brassica napobrassica Species 0.000 claims description 2
- 240000000385 Brassica napus var. napus Species 0.000 claims description 2
- 235000011291 Brassica nigra Nutrition 0.000 claims description 2
- 244000180419 Brassica nigra Species 0.000 claims description 2
- 244000026811 Brassica nipposinica Species 0.000 claims description 2
- 235000007294 Brassica nipposinica Nutrition 0.000 claims description 2
- 235000004221 Brassica oleracea var gemmifera Nutrition 0.000 claims description 2
- 235000017647 Brassica oleracea var italica Nutrition 0.000 claims description 2
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 claims description 2
- 235000012905 Brassica oleracea var viridis Nutrition 0.000 claims description 2
- 244000178937 Brassica oleracea var. capitata Species 0.000 claims description 2
- 244000308368 Brassica oleracea var. gemmifera Species 0.000 claims description 2
- 244000304217 Brassica oleracea var. gongylodes Species 0.000 claims description 2
- 235000004214 Brassica oleracea var. sabauda Nutrition 0.000 claims description 2
- 241001332183 Brassica oleracea var. sabauda Species 0.000 claims description 2
- 244000240551 Brassica parachinensis Species 0.000 claims description 2
- 235000000981 Brassica parachinensis Nutrition 0.000 claims description 2
- 235000011292 Brassica rapa Nutrition 0.000 claims description 2
- 244000221633 Brassica rapa subsp chinensis Species 0.000 claims description 2
- 235000006618 Brassica rapa subsp oleifera Nutrition 0.000 claims description 2
- 241000499436 Brassica rapa subsp. pekinensis Species 0.000 claims description 2
- 241000499439 Brassica rapa subsp. rapa Species 0.000 claims description 2
- 235000010570 Brassica rapa var. rapa Nutrition 0.000 claims description 2
- 235000003343 Brassica rupestris Nutrition 0.000 claims description 2
- 235000004977 Brassica sinapistrum Nutrition 0.000 claims description 2
- 235000004936 Bromus mango Nutrition 0.000 claims description 2
- 241000272834 Cairina moschata Species 0.000 claims description 2
- 244000045232 Canavalia ensiformis Species 0.000 claims description 2
- 244000025254 Cannabis sativa Species 0.000 claims description 2
- 235000012766 Cannabis sativa ssp. sativa var. sativa Nutrition 0.000 claims description 2
- 235000012765 Cannabis sativa ssp. sativa var. spontanea Nutrition 0.000 claims description 2
- 241000283707 Capra Species 0.000 claims description 2
- 241000283705 Capra hircus Species 0.000 claims description 2
- 235000002566 Capsicum Nutrition 0.000 claims description 2
- 235000008534 Capsicum annuum var annuum Nutrition 0.000 claims description 2
- 235000002568 Capsicum frutescens Nutrition 0.000 claims description 2
- 241001609213 Carassius carassius Species 0.000 claims description 2
- 235000009467 Carica papaya Nutrition 0.000 claims description 2
- 240000006432 Carica papaya Species 0.000 claims description 2
- 241000726768 Carpinus Species 0.000 claims description 2
- 241000726818 Carpinus caroliniana Species 0.000 claims description 2
- 235000003255 Carthamus tinctorius Nutrition 0.000 claims description 2
- 244000020518 Carthamus tinctorius Species 0.000 claims description 2
- 235000007890 Carya glabra var. glabra Nutrition 0.000 claims description 2
- 235000014669 Carya glabra var. hirsuta Nutrition 0.000 claims description 2
- 235000014667 Carya glabra var. megacarpa Nutrition 0.000 claims description 2
- 235000009025 Carya illinoensis Nutrition 0.000 claims description 2
- 244000068645 Carya illinoensis Species 0.000 claims description 2
- 244000143780 Carya laciniosa Species 0.000 claims description 2
- 235000018242 Carya ovata Nutrition 0.000 claims description 2
- 244000264616 Carya tomentosa Species 0.000 claims description 2
- 235000014076 Carya tomentosa Nutrition 0.000 claims description 2
- 241001070941 Castanea Species 0.000 claims description 2
- 235000014036 Castanea Nutrition 0.000 claims description 2
- 235000003801 Castanea crenata Nutrition 0.000 claims description 2
- 244000209117 Castanea crenata Species 0.000 claims description 2
- 244000242134 Castanea dentata Species 0.000 claims description 2
- 235000000908 Castanea dentata Nutrition 0.000 claims description 2
- 240000004957 Castanea mollissima Species 0.000 claims description 2
- 235000018244 Castanea mollissima Nutrition 0.000 claims description 2
- 235000007763 Castanea pumila Nutrition 0.000 claims description 2
- 244000025797 Castanea pumila Species 0.000 claims description 2
- 241001107116 Castanospermum australe Species 0.000 claims description 2
- 241001249588 Catla catla Species 0.000 claims description 2
- 241000218645 Cedrus Species 0.000 claims description 2
- 235000021538 Chard Nutrition 0.000 claims description 2
- 240000006162 Chenopodium quinoa Species 0.000 claims description 2
- 240000006740 Cichorium endivia Species 0.000 claims description 2
- 235000007542 Cichorium intybus Nutrition 0.000 claims description 2
- 244000241235 Citrullus lanatus Species 0.000 claims description 2
- 235000012828 Citrullus lanatus var citroides Nutrition 0.000 claims description 2
- 244000060011 Cocos nucifera Species 0.000 claims description 2
- 235000013162 Cocos nucifera Nutrition 0.000 claims description 2
- 240000007154 Coffea arabica Species 0.000 claims description 2
- 241000272205 Columba livia Species 0.000 claims description 2
- 241000272201 Columbiformes Species 0.000 claims description 2
- 241000209022 Cornus florida Species 0.000 claims description 2
- 240000006766 Cornus mas Species 0.000 claims description 2
- 235000009035 Corsican pine Nutrition 0.000 claims description 2
- 235000001543 Corylus americana Nutrition 0.000 claims description 2
- 229920000742 Cotton Polymers 0.000 claims description 2
- 235000002361 Crambe hispanica Nutrition 0.000 claims description 2
- 241000252230 Ctenopharyngodon idella Species 0.000 claims description 2
- 235000015510 Cucumis melo subsp melo Nutrition 0.000 claims description 2
- 235000009847 Cucumis melo var cantalupensis Nutrition 0.000 claims description 2
- 240000004244 Cucurbita moschata Species 0.000 claims description 2
- 240000001980 Cucurbita pepo Species 0.000 claims description 2
- 235000009852 Cucurbita pepo Nutrition 0.000 claims description 2
- 235000009804 Cucurbita pepo subsp pepo Nutrition 0.000 claims description 2
- 244000301850 Cupressus sempervirens Species 0.000 claims description 2
- 235000017788 Cydonia oblonga Nutrition 0.000 claims description 2
- 244000019459 Cynara cardunculus Species 0.000 claims description 2
- 235000019106 Cynara scolymus Nutrition 0.000 claims description 2
- 102100028717 Cytosolic 5'-nucleotidase 3A Human genes 0.000 claims description 2
- 235000018783 Dacrycarpus dacrydioides Nutrition 0.000 claims description 2
- 235000018782 Dacrydium cupressinum Nutrition 0.000 claims description 2
- 235000002767 Daucus carota Nutrition 0.000 claims description 2
- 244000000626 Daucus carota Species 0.000 claims description 2
- 241000238557 Decapoda Species 0.000 claims description 2
- 235000014466 Douglas bleu Nutrition 0.000 claims description 2
- 241000255601 Drosophila melanogaster Species 0.000 claims description 2
- 235000001950 Elaeis guineensis Nutrition 0.000 claims description 2
- 244000127993 Elaeis melanococca Species 0.000 claims description 2
- 235000009008 Eriobotrya japonica Nutrition 0.000 claims description 2
- 244000061508 Eriobotrya japonica Species 0.000 claims description 2
- 235000014755 Eruca sativa Nutrition 0.000 claims description 2
- 244000024675 Eruca sativa Species 0.000 claims description 2
- 241001588281 Eucalyptus fraxinoides Species 0.000 claims description 2
- 244000004281 Eucalyptus maculata Species 0.000 claims description 2
- 235000009419 Fagopyrum esculentum Nutrition 0.000 claims description 2
- 240000008620 Fagopyrum esculentum Species 0.000 claims description 2
- 244000222296 Fagus americana Species 0.000 claims description 2
- 235000018241 Fagus americana Nutrition 0.000 claims description 2
- 235000010099 Fagus sylvatica Nutrition 0.000 claims description 2
- 235000004994 Fagus sylvatica subsp sylvatica Nutrition 0.000 claims description 2
- 241000282326 Felis catus Species 0.000 claims description 2
- 235000004204 Foeniculum vulgare Nutrition 0.000 claims description 2
- 240000006927 Foeniculum vulgare Species 0.000 claims description 2
- 244000299507 Gossypium hirsutum Species 0.000 claims description 2
- 244000020551 Helianthus annuus Species 0.000 claims description 2
- 235000003222 Helianthus annuus Nutrition 0.000 claims description 2
- 235000003230 Helianthus tuberosus Nutrition 0.000 claims description 2
- 240000008892 Helianthus tuberosus Species 0.000 claims description 2
- 244000043261 Hevea brasiliensis Species 0.000 claims description 2
- 240000005979 Hordeum vulgare Species 0.000 claims description 2
- 235000007340 Hordeum vulgare Nutrition 0.000 claims description 2
- 244000025221 Humulus lupulus Species 0.000 claims description 2
- 241000720946 Hypophthalmichthys molitrix Species 0.000 claims description 2
- 241000252234 Hypophthalmichthys nobilis Species 0.000 claims description 2
- 235000003338 Ilex verticillata Nutrition 0.000 claims description 2
- 244000188413 Ilex verticillata Species 0.000 claims description 2
- 241000758789 Juglans Species 0.000 claims description 2
- 235000000383 Juglans ailantifolia var cordiformis Nutrition 0.000 claims description 2
- 244000049364 Juglans ailantifolia var. cordiformis Species 0.000 claims description 2
- 235000013740 Juglans nigra Nutrition 0.000 claims description 2
- 244000184861 Juglans nigra Species 0.000 claims description 2
- 235000009496 Juglans regia Nutrition 0.000 claims description 2
- 241000721662 Juniperus Species 0.000 claims description 2
- 235000014556 Juniperus scopulorum Nutrition 0.000 claims description 2
- 235000014560 Juniperus virginiana var silicicola Nutrition 0.000 claims description 2
- 241001660766 Labeo rohita Species 0.000 claims description 2
- 244000153390 Lactuca indica Species 0.000 claims description 2
- 235000007017 Lactuca indica var laciniata Nutrition 0.000 claims description 2
- 240000008415 Lactuca sativa Species 0.000 claims description 2
- 235000003228 Lactuca sativa Nutrition 0.000 claims description 2
- 241000218652 Larix Species 0.000 claims description 2
- 241001235216 Larix decidua Species 0.000 claims description 2
- 241000534018 Larix kaempferi Species 0.000 claims description 2
- 241000218653 Larix laricina Species 0.000 claims description 2
- 235000008119 Larix laricina Nutrition 0.000 claims description 2
- 235000014647 Lens culinaris subsp culinaris Nutrition 0.000 claims description 2
- 244000043158 Lens esculenta Species 0.000 claims description 2
- 240000007472 Leucaena leucocephala Species 0.000 claims description 2
- 235000010643 Leucaena leucocephala Nutrition 0.000 claims description 2
- 235000004431 Linum usitatissimum Nutrition 0.000 claims description 2
- 240000006240 Linum usitatissimum Species 0.000 claims description 2
- 241000238553 Litopenaeus vannamei Species 0.000 claims description 2
- 241000219745 Lupinus Species 0.000 claims description 2
- 235000007688 Lycopersicon esculentum Nutrition 0.000 claims description 2
- 241001327110 Macrobrachium rosenbergii Species 0.000 claims description 2
- 235000008512 Magnolia grandiflora Nutrition 0.000 claims description 2
- 240000003293 Magnolia grandiflora Species 0.000 claims description 2
- 235000011430 Malus pumila Nutrition 0.000 claims description 2
- 235000015103 Malus silvestris Nutrition 0.000 claims description 2
- 241000219071 Malvaceae Species 0.000 claims description 2
- 235000014826 Mangifera indica Nutrition 0.000 claims description 2
- 240000007228 Mangifera indica Species 0.000 claims description 2
- 240000004658 Medicago sativa Species 0.000 claims description 2
- 235000017587 Medicago sativa ssp. sativa Nutrition 0.000 claims description 2
- 241000219828 Medicago truncatula Species 0.000 claims description 2
- 240000002624 Mespilus germanica Species 0.000 claims description 2
- 235000017784 Mespilus germanica Nutrition 0.000 claims description 2
- 235000000560 Mimusops elengi Nutrition 0.000 claims description 2
- 241000699666 Mus <mouse, genus> Species 0.000 claims description 2
- 241000699660 Mus musculus Species 0.000 claims description 2
- 235000003805 Musa ABB Group Nutrition 0.000 claims description 2
- 235000018290 Musa x paradisiaca Nutrition 0.000 claims description 2
- 235000017879 Nasturtium officinale Nutrition 0.000 claims description 2
- 240000005407 Nasturtium officinale Species 0.000 claims description 2
- 235000002637 Nicotiana tabacum Nutrition 0.000 claims description 2
- 244000061176 Nicotiana tabacum Species 0.000 claims description 2
- 241000267354 Nothofagus procera Species 0.000 claims description 2
- 235000003339 Nyssa sylvatica Nutrition 0.000 claims description 2
- 244000018764 Nyssa sylvatica Species 0.000 claims description 2
- 235000004263 Ocotea pretiosa Nutrition 0.000 claims description 2
- 240000007817 Olea europaea Species 0.000 claims description 2
- 241000277275 Oncorhynchus mykiss Species 0.000 claims description 2
- 241000277323 Pangasius pangasius Species 0.000 claims description 2
- 240000001090 Papaver somniferum Species 0.000 claims description 2
- 240000004370 Pastinaca sativa Species 0.000 claims description 2
- 235000017769 Pastinaca sativa subsp sativa Nutrition 0.000 claims description 2
- 241000238552 Penaeus monodon Species 0.000 claims description 2
- 239000006002 Pepper Substances 0.000 claims description 2
- 235000010632 Phaseolus coccineus Nutrition 0.000 claims description 2
- 235000010617 Phaseolus lunatus Nutrition 0.000 claims description 2
- 244000042209 Phaseolus multiflorus Species 0.000 claims description 2
- 235000010627 Phaseolus vulgaris Nutrition 0.000 claims description 2
- 235000010659 Phoenix dactylifera Nutrition 0.000 claims description 2
- 244000104275 Phoenix dactylifera Species 0.000 claims description 2
- 235000002489 Physalis philadelphica Nutrition 0.000 claims description 2
- 240000009134 Physalis philadelphica Species 0.000 claims description 2
- 235000008124 Picea excelsa Nutrition 0.000 claims description 2
- 241000218601 Picea omorika Species 0.000 claims description 2
- 241000218596 Picea rubens Species 0.000 claims description 2
- 241000218595 Picea sitchensis Species 0.000 claims description 2
- 235000008331 Pinus X rigitaeda Nutrition 0.000 claims description 2
- 241000018646 Pinus brutia Species 0.000 claims description 2
- 235000011613 Pinus brutia Nutrition 0.000 claims description 2
- 235000009324 Pinus caribaea Nutrition 0.000 claims description 2
- 241000218606 Pinus contorta Species 0.000 claims description 2
- 244000083281 Pinus coulteri Species 0.000 claims description 2
- 235000008568 Pinus coulteri Nutrition 0.000 claims description 2
- 241000263478 Pinus nigra subsp. laricio Species 0.000 claims description 2
- 235000017339 Pinus palustris Nutrition 0.000 claims description 2
- 235000005105 Pinus pinaster Nutrition 0.000 claims description 2
- 241001236212 Pinus pinaster Species 0.000 claims description 2
- 235000013697 Pinus resinosa Nutrition 0.000 claims description 2
- 235000007738 Pinus rigida Nutrition 0.000 claims description 2
- 240000007320 Pinus strobus Species 0.000 claims description 2
- 235000008578 Pinus strobus Nutrition 0.000 claims description 2
- 235000016761 Piper aduncum Nutrition 0.000 claims description 2
- 240000003889 Piper guineense Species 0.000 claims description 2
- 235000017804 Piper guineense Nutrition 0.000 claims description 2
- 235000008184 Piper nigrum Nutrition 0.000 claims description 2
- 240000004713 Pisum sativum Species 0.000 claims description 2
- 235000010582 Pisum sativum Nutrition 0.000 claims description 2
- 235000015266 Plantago major Nutrition 0.000 claims description 2
- 235000006485 Platanus occidentalis Nutrition 0.000 claims description 2
- 241000218982 Populus nigra Species 0.000 claims description 2
- 244000018633 Prunus armeniaca Species 0.000 claims description 2
- 235000009827 Prunus armeniaca Nutrition 0.000 claims description 2
- 244000007021 Prunus avium Species 0.000 claims description 2
- 235000010401 Prunus avium Nutrition 0.000 claims description 2
- 241001290151 Prunus avium subsp. avium Species 0.000 claims description 2
- 244000141353 Prunus domestica Species 0.000 claims description 2
- 235000011230 Prunus domestica subsp. italica Nutrition 0.000 claims description 2
- 241000198945 Prunus domestica subsp. syriaca Species 0.000 claims description 2
- 235000013992 Prunus padus Nutrition 0.000 claims description 2
- 235000013647 Prunus pensylvanica Nutrition 0.000 claims description 2
- 235000006029 Prunus persica var nucipersica Nutrition 0.000 claims description 2
- 235000006040 Prunus persica var persica Nutrition 0.000 claims description 2
- 244000017714 Prunus persica var. nucipersica Species 0.000 claims description 2
- 235000014441 Prunus serotina Nutrition 0.000 claims description 2
- 235000008572 Pseudotsuga menziesii Nutrition 0.000 claims description 2
- 235000005386 Pseudotsuga menziesii var menziesii Nutrition 0.000 claims description 2
- 235000010580 Psophocarpus tetragonolobus Nutrition 0.000 claims description 2
- 244000046095 Psophocarpus tetragonolobus Species 0.000 claims description 2
- 235000014443 Pyrus communis Nutrition 0.000 claims description 2
- 241000219492 Quercus Species 0.000 claims description 2
- 235000009137 Quercus alba Nutrition 0.000 claims description 2
- 244000274906 Quercus alba Species 0.000 claims description 2
- 241000395651 Quercus kelloggii Species 0.000 claims description 2
- 235000016976 Quercus macrolepis Nutrition 0.000 claims description 2
- 235000007748 Quercus prinus Nutrition 0.000 claims description 2
- 244000025767 Quercus prinus Species 0.000 claims description 2
- 235000009135 Quercus rubra Nutrition 0.000 claims description 2
- 240000004885 Quercus rubra Species 0.000 claims description 2
- 235000005733 Raphanus sativus var niger Nutrition 0.000 claims description 2
- 244000155437 Raphanus sativus var. niger Species 0.000 claims description 2
- 244000249693 Reneklode Species 0.000 claims description 2
- 244000299790 Rheum rhabarbarum Species 0.000 claims description 2
- 235000009411 Rheum rhabarbarum Nutrition 0.000 claims description 2
- 241001495449 Robinia pseudoacacia Species 0.000 claims description 2
- 241001412173 Rubus canescens Species 0.000 claims description 2
- 235000008691 Sabina virginiana Nutrition 0.000 claims description 2
- 240000000111 Saccharum officinarum Species 0.000 claims description 2
- 235000007201 Saccharum officinarum Nutrition 0.000 claims description 2
- 241000119565 Salix gooddingii Species 0.000 claims description 2
- 241000277289 Salmo salar Species 0.000 claims description 2
- 244000009660 Sassafras variifolium Species 0.000 claims description 2
- 235000018704 Scorzonera hispanica Nutrition 0.000 claims description 2
- 244000292071 Scorzonera hispanica Species 0.000 claims description 2
- 241001247145 Sebastes goodei Species 0.000 claims description 2
- 244000082988 Secale cereale Species 0.000 claims description 2
- 235000007238 Secale cereale Nutrition 0.000 claims description 2
- 240000003768 Solanum lycopersicum Species 0.000 claims description 2
- 235000002597 Solanum melongena Nutrition 0.000 claims description 2
- 244000062793 Sorghum vulgare Species 0.000 claims description 2
- 240000007641 Spergula rubra Species 0.000 claims description 2
- 235000009337 Spinacia oleracea Nutrition 0.000 claims description 2
- 244000300264 Spinacia oleracea Species 0.000 claims description 2
- 235000009184 Spondias indica Nutrition 0.000 claims description 2
- 235000021536 Sugar beet Nutrition 0.000 claims description 2
- 241000255588 Tephritidae Species 0.000 claims description 2
- 244000269722 Thea sinensis Species 0.000 claims description 2
- 244000299492 Thespesia populnea Species 0.000 claims description 2
- 235000009430 Thespesia populnea Nutrition 0.000 claims description 2
- 241000219793 Trifolium Species 0.000 claims description 2
- 244000126298 Trifolium alexandrinum Species 0.000 claims description 2
- 235000015724 Trifolium pratense Nutrition 0.000 claims description 2
- 244000042324 Trifolium repens Species 0.000 claims description 2
- 235000013540 Trifolium repens var repens Nutrition 0.000 claims description 2
- 241000379576 Trifolium resupinatum Species 0.000 claims description 2
- 235000019714 Triticale Nutrition 0.000 claims description 2
- 235000021307 Triticum Nutrition 0.000 claims description 2
- 244000098338 Triticum aestivum Species 0.000 claims description 2
- 235000008554 Tsuga heterophylla Nutrition 0.000 claims description 2
- 240000003021 Tsuga heterophylla Species 0.000 claims description 2
- 241001106462 Ulmus Species 0.000 claims description 2
- 240000004668 Valerianella locusta Species 0.000 claims description 2
- 235000003560 Valerianella locusta Nutrition 0.000 claims description 2
- 235000007837 Vangueria infausta Nutrition 0.000 claims description 2
- 235000010749 Vicia faba Nutrition 0.000 claims description 2
- 240000006677 Vicia faba Species 0.000 claims description 2
- 235000002098 Vicia faba var. major Nutrition 0.000 claims description 2
- 244000105017 Vicia sativa Species 0.000 claims description 2
- 241000219975 Vicia villosa Species 0.000 claims description 2
- 235000005072 Vigna sesquipedalis Nutrition 0.000 claims description 2
- 235000005755 Vigna unguiculata ssp. sesquipedalis Nutrition 0.000 claims description 2
- 241000746966 Zizania Species 0.000 claims description 2
- 235000002636 Zizania aquatica Nutrition 0.000 claims description 2
- FJJCIZWZNKZHII-UHFFFAOYSA-N [4,6-bis(cyanoamino)-1,3,5-triazin-2-yl]cyanamide Chemical compound N#CNC1=NC(NC#N)=NC(NC#N)=N1 FJJCIZWZNKZHII-UHFFFAOYSA-N 0.000 claims description 2
- 235000020224 almond Nutrition 0.000 claims description 2
- 235000016520 artichoke thistle Nutrition 0.000 claims description 2
- QKSKPIVNLNLAAV-UHFFFAOYSA-N bis(2-chloroethyl) sulfide Chemical compound ClCCSCCCl QKSKPIVNLNLAAV-UHFFFAOYSA-N 0.000 claims description 2
- 235000021279 black bean Nutrition 0.000 claims description 2
- 235000009120 camo Nutrition 0.000 claims description 2
- 235000020226 cashew nut Nutrition 0.000 claims description 2
- 235000005607 chanvre indien Nutrition 0.000 claims description 2
- 235000019693 cherries Nutrition 0.000 claims description 2
- 235000003733 chicria Nutrition 0.000 claims description 2
- 244000013123 dwarf bean Species 0.000 claims description 2
- 235000004611 garlic Nutrition 0.000 claims description 2
- 239000011487 hemp Substances 0.000 claims description 2
- 235000010181 horse chestnut Nutrition 0.000 claims description 2
- 235000014684 lodgepole pine Nutrition 0.000 claims description 2
- 235000019713 millet Nutrition 0.000 claims description 2
- 235000010460 mustard Nutrition 0.000 claims description 2
- 235000006502 papoula Nutrition 0.000 claims description 2
- 235000015136 pumpkin Nutrition 0.000 claims description 2
- 235000013526 red clover Nutrition 0.000 claims description 2
- 235000003499 redwood Nutrition 0.000 claims description 2
- 235000001520 savin Nutrition 0.000 claims description 2
- 235000000673 shore pine Nutrition 0.000 claims description 2
- 235000020354 squash Nutrition 0.000 claims description 2
- 235000012069 sugar maple Nutrition 0.000 claims description 2
- 235000013616 tea Nutrition 0.000 claims description 2
- 235000013311 vegetables Nutrition 0.000 claims description 2
- 235000020234 walnut Nutrition 0.000 claims description 2
- 241000228158 x Triticosecale Species 0.000 claims description 2
- 208000005652 acute fatty liver of pregnancy Diseases 0.000 claims 1
- 244000038559 crop plants Species 0.000 abstract description 2
- 210000000349 chromosome Anatomy 0.000 description 22
- 239000003550 marker Substances 0.000 description 17
- 238000012163 sequencing technique Methods 0.000 description 17
- 102000054766 genetic haplotypes Human genes 0.000 description 16
- 239000012071 phase Substances 0.000 description 13
- 108090000623 proteins and genes Proteins 0.000 description 10
- 238000004088 simulation Methods 0.000 description 10
- 239000011159 matrix material Substances 0.000 description 9
- 238000012360 testing method Methods 0.000 description 9
- 238000009826 distribution Methods 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 7
- 239000002773 nucleotide Substances 0.000 description 7
- 125000003729 nucleotide group Chemical group 0.000 description 7
- 230000002349 favourable effect Effects 0.000 description 6
- 238000013459 approach Methods 0.000 description 5
- 230000007613 environmental effect Effects 0.000 description 5
- 230000007614 genetic variation Effects 0.000 description 5
- 238000004519 manufacturing process Methods 0.000 description 5
- 238000010187 selection method Methods 0.000 description 5
- 230000000295 complement effect Effects 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 238000012165 high-throughput sequencing Methods 0.000 description 4
- 235000013336 milk Nutrition 0.000 description 4
- 239000008267 milk Substances 0.000 description 4
- 210000004080 milk Anatomy 0.000 description 4
- 239000000523 sample Substances 0.000 description 4
- 108020004414 DNA Proteins 0.000 description 3
- 108091092878 Microsatellite Proteins 0.000 description 3
- 230000003321 amplification Effects 0.000 description 3
- 238000003975 animal breeding Methods 0.000 description 3
- 210000004027 cell Anatomy 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 239000012634 fragment Substances 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 230000014509 gene expression Effects 0.000 description 3
- 238000003199 nucleic acid amplification method Methods 0.000 description 3
- GNFTZDOKVXKIBK-UHFFFAOYSA-N 3-(2-methoxyethoxy)benzohydrazide Chemical compound COCCOC1=CC=CC(C(=O)NN)=C1 GNFTZDOKVXKIBK-UHFFFAOYSA-N 0.000 description 2
- 241000219194 Arabidopsis Species 0.000 description 2
- 235000009849 Cucumis sativus Nutrition 0.000 description 2
- 244000182264 Lucuma nervosa Species 0.000 description 2
- 230000000996 additive effect Effects 0.000 description 2
- 238000004422 calculation algorithm Methods 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000012937 correction Methods 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 230000013011 mating Effects 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 238000001823 molecular biology technique Methods 0.000 description 2
- 238000007481 next generation sequencing Methods 0.000 description 2
- 238000007480 sanger sequencing Methods 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 238000007619 statistical method Methods 0.000 description 2
- 238000002948 stochastic simulation Methods 0.000 description 2
- 238000010998 test method Methods 0.000 description 2
- 238000010200 validation analysis Methods 0.000 description 2
- 108091093088 Amplicon Proteins 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 241000219112 Cucumis Species 0.000 description 1
- 208000035240 Disease Resistance Diseases 0.000 description 1
- 102000014171 Milk Proteins Human genes 0.000 description 1
- 108010011756 Milk Proteins Proteins 0.000 description 1
- 102100030569 Nuclear receptor corepressor 2 Human genes 0.000 description 1
- 101710153660 Nuclear receptor corepressor 2 Proteins 0.000 description 1
- 208000020584 Polyploidy Diseases 0.000 description 1
- 239000000654 additive Substances 0.000 description 1
- 210000000577 adipose tissue Anatomy 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 230000010165 autogamy Effects 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 230000003542 behavioural effect Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 239000002131 composite material Substances 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 238000002790 cross-validation Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 230000035784 germination Effects 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 238000009399 inbreeding Methods 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 235000021239 milk protein Nutrition 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 238000003976 plant breeding Methods 0.000 description 1
- 230000008092 positive effect Effects 0.000 description 1
- 244000144977 poultry Species 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 102000004169 proteins and genes Human genes 0.000 description 1
- 238000012175 pyrosequencing Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 239000010421 standard material Substances 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 230000017260 vegetative to reproductive phase transition of meristem Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01H—NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
- A01H1/00—Processes for modifying genotypes ; Plants characterised by associated natural traits
- A01H1/04—Processes of selection involving genotypic or phenotypic markers; Methods of using phenotypic markers for selection
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K67/00—Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
- A01K67/02—Breeding vertebrates
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/124—Animal traits, i.e. production traits, including athletic performance or the like
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/13—Plant traits
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/40—Population genetics; Linkage disequilibrium
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Definitions
- the present invention relates to the field of methods for improving at least one phenotypic trait of interest in subsequent generation(s) of a population of individuals, preferably crop plants or cattle. Particularly, the present invention relates to methods for identifying the combination of at least three individuals that gives, upon subsequent intercrossing, the highest estimated probability of improving the at least one phenotypic trait of interest in the subsequent generation(s).
- MAS marker assisted selection
- Cost effective genotyping techniques enable breeding companies to generate information about genetic variation on thousands of loci across the genome on all breeding materials a plant breeder has at his or her disposal to create new varieties.
- GS Genomic Selection
- EBV estimated breeding values
- the EBV is calculated from performance data and it is an approximation to an individual's genetic merit.
- EBVs are predicted based on genome wide information.
- GS is performed with a class of statistical methods called ridge regression. Ridge regression was first introduced by E.A. Hoerl (1959). Ridge regression is used when ill-posed problems are observed, for instance when more variables than the total number of observations are used to make a prediction. The most important parameter in ridge regression corresponds to the distribution assumed for the model parameters (GS model). This distribution is used to capture the genotype/phenotype relationship.
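As an illustration of why ridge regression remains usable when markers outnumber records, the following minimal sketch estimates marker effects from the penalised normal equations. The toy genotypes, the phenotypes, the marker coding (-1/0/1) and the penalty `lam` are all hypothetical and are not taken from the patent; the small linear solver is included only to keep the example self-contained.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_effects(X, y, lam):
    """Marker effects b minimising ||y - X b||^2 + lam * ||b||^2."""
    p = len(X[0])
    # X'X + lam*I is invertible for lam > 0, even with more markers than records.
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical toy data: 3 phenotyped individuals, 5 markers.
X = [[1, 0, -1, 1, 0],
     [0, 1, 1, -1, 0],
     [-1, 1, 0, 0, 1]]
y = [2.0, -1.0, 0.5]

effects = ridge_effects(X, y, lam=1.0)
fitted = [sum(b * x for b, x in zip(effects, row)) for row in X]
```

A larger `lam` shrinks the estimated effects more strongly towards zero, which is one way to read the role of the distribution assumed for the model parameters.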
- genomic selection has been under investigation by plant breeders as an alternative to marker assisted selection (MAS) and phenotype selection.
- GS as a strategy has the potential to improve highly complex traits or combinations of multiple traits without the requirement to identify significantly linked/associated loci or candidate genes by simply constructing quantitative genotype/phenotype models over a large amount of genome-wide distributed markers.
- Use of genome-wide estimated breeding values (GEBVs) rather than actual phenotypic values provides breeders the opportunity to select individual animals or plants for trait performance without doing actual phenotyping, thus potentially saving costs and time. This can be applied both to single, complex traits but also to multiple traits combined in an index.
- the possibility to estimate traits at an earlier stage is particularly advantageous in crops with a long breeding cycle (e.g. tree species), where multiple years can easily be gained in this way.
- One application of GS, or of methods that capture whole genotype/phenotype relationships, in breeding practice is the selection of parents for the next breeding cycle. This is done by predicting the GEBVs for a trait or an index of traits for all members of a panel of candidate parents, after which the parents with the highest values are selected for further breeding, a practice not unlike the traditional selection practice based on actual phenotypes [Haley and Visscher, 1998].
- the investment is made in setting up the phenotypic prediction model.
- Required is accurate phenotyping of the members of a test population or germplasm panel and high-density genotyping (10^4 to 10^5 markers) of the same individuals.
- a model is constructed by one of the multiple methods available for this in the public domain [e.g. Meuwissen et al., 2001].
- the prediction quality of the constructed model is tested on a second population from which both genotypes and phenotypes have been measured.
- all members of a breeding population are genotyped, using the same marker set of the training phase.
- breeding values are predicted for each individual without doing any actual phenotypic measurement. Selections are made from populations for breeding purposes, based on the predicted values.
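The selection phase described above can be sketched as follows: genotypes are scored with marker effects from the training phase and candidates are ranked by predicted value. The marker effects, the 0/1/2 allele-dosage coding and the individual names are hypothetical placeholders, not values from the patent.

```python
def gebv(genotype, effects):
    """Predicted breeding value: marker effects weighted by allele dosage."""
    return sum(dosage * effect for dosage, effect in zip(genotype, effects))


# Hypothetical marker effects obtained in the training phase.
effects = [0.4, -0.2, 0.1, 0.3]

# Hypothetical breeding population, genotyped with the same marker set.
candidates = {
    "ind1": [2, 0, 1, 2],
    "ind2": [1, 2, 0, 1],
    "ind3": [0, 1, 2, 2],
}

# Rank candidates by predicted value; no phenotypic measurement is involved.
ranked = sorted(candidates, key=lambda name: gebv(candidates[name], effects),
                reverse=True)
```

This is the regular, per-individual ranking; it does not look at how well the genotypes of different candidates complement each other.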
- the present inventors came to the realization that the prior art selection strategy does not consider the long-term impact on the breeding process of the selection made in the current population. This is because in prior art methods, the best per se performing parents will be selected, assuming that, under an additive model, crossing these will result in the best performing genotypes in the next generations.
- the future improvement of the genotype will be limited to obtaining fixed homozygous allele states for these loci (a state that may also be reached by inbreeding from a single parent), thereby missing the opportunity to gather additional favorable alleles on alternative loci by introduction via other parents.
- genomic selection strategy that is based on the selection of a subset of at least 3 parents as compared to strategies that are based on the selection of single parent pairs (regular genomic selection) and crossing of said pairs.
- each subset of at least 3 marker genotypes can be considered as a library of haplotypes, from which multiple combinations have a predictable likelihood to produce genotypes with the predicted highest achievable phenotypic value. In some embodiments, this will be reached by recombination of the existing haplotypes within a genotype prior to transmission to the offspring in the next breeding cycle or plurality of breeding cycles.
- genotypes can only be recombined by crossing of two parents or self-fertilization.
- mixing of subsets of at least 3 genomes can be achieved via several parallel and/or subsequent crosses, which can be performed after the selection method according to the present disclosure.
- the present disclosure thus extends the use of the GS model as developed in the training phase (in which no changes are proposed) by improving the efficiency in the selection phase.
- This procedure enables the selection of groups of at least 3 individuals that have the highest probability to produce the best performing offspring in the next generations, rather than to select the best performing pairs. This approach was shown to achieve unexpectedly better breeding results in simulation models.
- Figure 1 shows a graphical representation of the selection process in a breeding population consisting of 3 diploid individuals represented by their genotypes G1-3.
- the individuals have been genotyped for 5 loci L1-5 and the phenotype for an individual with a particular genotype can be predicted using a mathematical genome-wide prediction model that assigns positive or negative effects to each allele occurring on the loci.
- the concept of the current disclosure involves the construction of the putative future genotype that predicts the highest phenotypic performance from haplotypes (H1.1-H3.2) occurring in the current population, or recombinants of those.
- the best genotype obtainable with a single cross combines haplotypes H1.1 and H3.1 (indicated in bold), which complement each other in locus L5 versus the others.
- the putative genotype can be constructed analogously from haplotypes of more than two individuals.
- the breeding value is calculated from the recombinant haplotype, multiplied with the probability that this recombination occurs between the loci (for example P[r_12], with r_12 denoting the estimated frequency of recombination between loci 1 and 2).
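A minimal sketch of this weighting, assuming a single crossover between two parental haplotypes and hypothetical per-allele effects: the value of the recombinant haplotype is multiplied by the recombination frequency of the interval in which the crossover occurs. The function names, the effect table and the frequencies are illustrative assumptions only.

```python
def haplotype_value(haplotype, effects):
    """Sum of the allele effects along one haplotype."""
    return sum(effects[locus][allele] for locus, allele in enumerate(haplotype))


def weighted_recombinant_value(hap_a, hap_b, breakpoint, effects, recomb_freq):
    """Value of the recombinant that follows hap_a before `breakpoint` and
    hap_b from `breakpoint` onwards, multiplied by the recombination
    frequency of the interval between loci breakpoint-1 and breakpoint."""
    recombinant = hap_a[:breakpoint] + hap_b[breakpoint:]
    return haplotype_value(recombinant, effects) * recomb_freq[breakpoint - 1]


# Hypothetical effects: allele "A" adds 1, allele "B" adds 0, at each of 5 loci.
effects = [{"A": 1.0, "B": 0.0} for _ in range(5)]
# Hypothetical recombination frequencies between adjacent loci.
recomb_freq = [0.05, 0.1, 0.2, 0.1]

hap_a = ["A", "A", "B", "B", "B"]
hap_b = ["B", "B", "A", "A", "A"]

# A crossover between the second and third locus yields the haplotype AAAAA.
value = weighted_recombinant_value(hap_a, hap_b, 2, effects, recomb_freq)
```

With these numbers the recombinant carries five favourable alleles and the relevant interval has frequency 0.1, so the weighted value is 0.5.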
- Table 1 shows a selection strategy from a population using Combined Genomic Estimated Breeding Values. From a very small population of 5 diploid homozygous individuals the genotypes have been established (represented by 12 loci; top side of the table). If each allele marked "A" contributes positively to a desired trait value according to a genomic prediction model, and "B" indicates all other alleles, the best achievable genotype contains the highest fraction of "A" alleles.
- the genomic estimated breeding values (GEBVs) of each individual are shown in the boxed cells on the diagonal of the lower part of the table, the off-diagonal cells contain CGEBVs.
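The layout of such a table can be sketched with a toy population. The genotypes below are hypothetical (the actual Table 1 values are not reproduced here), and the scoring rule is an assumption made for illustration: a cell holds the fraction of loci at which at least one of the homozygous individuals involved carries an "A" allele.

```python
from itertools import combinations

# Hypothetical population: 5 homozygous individuals, 12 loci, one letter per locus.
population = {
    1: "AAAABBBBBBBB",
    2: "BBBBAAAABBBB",
    3: "AAABBBBBBBBA",
    4: "BBBBBBBBAAAA",
    5: "AABBBBBBBBBB",
}


def fraction_favourable(*genotypes):
    """Fraction of loci at which at least one given genotype carries "A"."""
    n_loci = len(genotypes[0])
    return sum("A" in alleles for alleles in zip(*genotypes)) / n_loci


# Diagonal cells: each individual's own value; off-diagonal cells: pair values.
table = {(i, i): fraction_favourable(g) for i, g in population.items()}
for i, j in combinations(population, 2):
    table[(i, j)] = fraction_favourable(population[i], population[j])

# One of the pairs with the highest combined value.
best_pair = max((k for k in table if k[0] != k[1]), key=table.get)
```

In this toy data individuals 1, 2 and 3 have the same own value, yet the pair (1, 3) scores lower than (1, 2) because individuals 1 and 3 largely share the same favourable loci.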
- Table 2 shows a selection strategy as in Table 1, but now applied to a heterozygous population and taking recombination events into account.
- individuals 1 and 2 are the top ranking parents (0.33 and 0.21, respectively), but the CGEBV ranking for the different parent pairs shows that actually the combinations (1, 3), (1, 4) and (1, 5) are able to produce the best offspring genotypes.
- the genotypes of individuals 3, 4 and 5 are identical; however, the allele phases of the haplotypes differ.
- the first haplotype of individual 3 includes all favorable alleles in linkage phase and requires no recombination event to pass on all favorable alleles to the offspring.
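The effect of allele phase can be made concrete with a small sketch that computes the probability that a heterozygous parent transmits a given gamete. It assumes that crossovers in adjacent intervals occur independently (no interference), which is a simplification not stated in the text; the haplotypes and recombination frequencies are hypothetical.

```python
from itertools import product


def gamete_probability(hap1, hap2, target, recomb):
    """Probability that a parent with haplotypes hap1/hap2 produces `target`.

    recomb[i] is the recombination frequency between locus i and locus i + 1.
    Every way of copying each locus from one of the two haplotypes is
    enumerated, and the probabilities of the ways that yield `target` are summed.
    """
    haps = (hap1, hap2)
    total = 0.0
    for origin in product((0, 1), repeat=len(target)):
        if any(haps[o][i] != target[i] for i, o in enumerate(origin)):
            continue
        p = 0.5  # either haplotype is equally likely at the first locus
        for i in range(1, len(origin)):
            switched = origin[i] != origin[i - 1]
            p *= recomb[i - 1] if switched else 1 - recomb[i - 1]
        total += p
    return total


recomb = [0.1, 0.2]
# Same genotype (heterozygous at all three loci), different phase.
in_phase = gamete_probability("AAA", "BBB", "AAA", recomb)
out_of_phase = gamete_probability("ABA", "BAB", "AAA", recomb)
```

With all favourable alleles on one haplotype, the all-"A" gamete needs no crossover (probability 0.5 x 0.9 x 0.8 = 0.36); with alternating phase it needs two crossovers (0.5 x 0.1 x 0.2 = 0.01).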
- Selection and selection criteria process, model, system or algorithm in order to choose individuals in a population that will contribute genetic material to the next generation.
- process, model, system or algorithm can be based both on natural or artificial phenomena or procedural steps.
- Selection criteria can be based on phenotypic or genomic characteristics, for instance, but not limited to, the presence, or degree of presence, of genes, gene expression, genetic markers, combinations of genes, quantitative trait loci, traits or combinations of traits.
- Breeding value the genetic merit of a unit of inheritance such as an individual in a breeding program. This genetic merit is determined by the contribution to at least one phenotypic trait of interest of an individual's gene or genes or (genetic) loci in a breeding program aimed at improving the at least one phenotypic trait of interest.
- Estimated breeding value an approximation of an individual's breeding value, in particular based on the estimated difference between the average performance of that individual's offspring and the average performance of all offspring in a randomly mating population.
- the estimated average performance of all offspring in a randomly mating population may take into account that individuals with inter-familial relationships, i.e. pedigree relations, normally do not mate.
- Genome-wide estimated breeding value estimated breeding value based on genome-wide information, i.e. information derived from different or remote (genetic) loci of the genome such as loci of different chromosomes.
- genome-wide estimated breeding values are an approximation of an individual's genome-wide genetic merit, determined by the contribution to at least one phenotypic trait of interest of an individual's genome-wide genes or genome-wide (genetic) loci, or genome-wide haplotypes or genome-wide molecular marker scores in a breeding program aimed at improving the at least one phenotypic trait of interest.
- CGEBV Genome-wide Estimated Breeding Value of a combination of three or more individuals within a population.
- the individuals in the combination with the highest CGEBV (as compared to the other combinations) together have the highest estimated probability to produce the best performing offspring in subsequent generations in a breeding program aimed at improving at least one phenotypic trait of interest. So, the CGEBV actually accounts for the genome-wide estimated breeding values of the genotypes of the putative offspring, and does not solely consider the genotype of each individual potential parent separately. In particular, the potential parents may not be the best performing individuals per se, or the potential parents with the best genome-wide estimated breeding value.
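A minimal sketch of searching for the best combination of three individuals is given below. The scoring function is a deliberately simplified stand-in for a CGEBV: it sums the effects of favourable alleles available anywhere in the pooled haplotypes of the combination and ignores recombination probabilities. The population, the effects and the function names are hypothetical.

```python
from itertools import combinations


def combination_score(individuals, a_effects):
    """Sum of "A" effects over loci where any pooled haplotype carries "A"."""
    library = [hap for individual in individuals for hap in individual]
    return sum(effect for locus, effect in enumerate(a_effects)
               if any(hap[locus] == "A" for hap in library))


def best_combination(population, a_effects, size=3):
    """Exhaustively score every combination of `size` individuals."""
    return max(combinations(sorted(population), size),
               key=lambda names: combination_score(
                   [population[n] for n in names], a_effects))


# Hypothetical diploid individuals (two haplotypes each) and per-locus effects.
population = {
    "P1": ("AAAABB", "AAAABB"),
    "P2": ("AAABBB", "AAABBB"),
    "P3": ("AABABB", "AABABB"),
    "P4": ("BBBBAB", "BBBBBB"),
    "P5": ("BBBBBA", "BBBBBB"),
}
a_effects = [1.0] * 6

chosen = best_combination(population, a_effects)
top_three_per_se = combination_score(
    [population[n] for n in ("P1", "P2", "P3")], a_effects)
```

Here the three individuals with the highest own values (P1, P2, P3) jointly cover only four loci, whereas the chosen combination includes two individuals with low own values that contribute favourable alleles at the remaining loci.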
- Directed genome-wide selection selection method based on focusing on a combination of individuals in a population that together have the highest probability to produce the best performing offspring in the next generations in a breeding program aimed at one or more selection criteria.
- the focus is on genome-wide estimated breeding values of the genotypes of the putative offspring (combined genome-wide estimated breeding value), rather than on the genotype of each individual parent itself.
- this selection method is not based on selecting the best performing individuals per se.
- Regular genome-wide selection selection method based on crossing parents with the best genome-wide estimated breeding values per se.
- Offspring as used herein, the term “offspring”, refers to the first or further generation obtained by intercrossing.
- Phenotype the composite of an individual's characteristics or traits, particularly observable characteristics or traits, such as, but not limited to morphological, physical, biochemical, developmental or behavioral characteristics or traits.
- the phenotype of an individual can be formed by the expression of genes as well as by environmental factors, and by interactions between gene expression and environmental factors.
- Phenotypic trait of interest a heritable characteristic of a plant or animal species which may be quantified in a certain unit of measure.
- quantitative phenotypic traits of interest are (but are not limited to) for plants: fruit size, fruit count, yield in kg per ha, plant height, relative growth speed, flowering time, germination rate, leaf area, disease resistances, yield components, biochemical composition, and for animals: milk yield, milk protein content, carcass weight, fodder conversion, body fat composition, litter size, coat color, resistances to diseases. It can be desired that a quantitative phenotypic trait of interest is increased or decreased, and the respective shift of the average value for the characteristic in the population can improve the economic value of that population, variety or offspring relative to the parent generation(s).
- Genotype refers to the genetic makeup of a cell, an organism, or an individual (i.e. the specific allele makeup of the individual) usually with reference to a specific character or phenotypic trait of interest under consideration.
- not all organisms with the same genotype necessarily look or act the same way because appearance and behavior are modified by environmental and developmental conditions.
- not all organisms that look alike necessarily have the same genotype.
- Genotyping refers to the process of determining genetic variations among individuals in a species.
- Single nucleotide polymorphisms SNPs are the most common type of genetic variation used for genotyping and by definition are single-base differences at a specific locus that are found in more than 1% of the population. SNPs are found in both coding and non-coding regions of the genome and can be associated with a phenotypic trait of interest such as a quantitative phenotypic trait of interest. Hence, SNPs can be used as markers for quantitative phenotypic traits of interest.
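The 1% criterion can be illustrated with a short sketch that checks whether a single-base position varies often enough in a sample to count as a SNP. Interpreting "found in more than 1% of the population" as the frequency of the second most common allele is an assumption made here, and the sampled alleles are hypothetical.

```python
from collections import Counter


def second_allele_frequency(alleles):
    """Frequency of the second most common allele observed at one locus."""
    counts = sorted(Counter(alleles).values(), reverse=True)
    return counts[1] / len(alleles) if len(counts) > 1 else 0.0


def is_snp(alleles, threshold=0.01):
    """True if the alternative allele exceeds the frequency threshold."""
    return second_allele_frequency(alleles) > threshold


# Hypothetical samples of 200 alleles at two positions.
common_variant = ["A"] * 197 + ["G"] * 3   # alternative allele at 1.5%
rare_variant = ["C"] * 199 + ["T"]         # alternative allele at 0.5%

flags = (is_snp(common_variant), is_snp(rare_variant))
```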
- InDels: another common type of genetic variation used for genotyping are "InDels", i.e. insertions and deletions of nucleotides of varying length.
- SNP and InDel genotyping: many methods exist to determine the genotype of individuals. The chosen method generally depends on the throughput needed, which is a function of both the number of individuals being genotyped and the number of genotypes being tested for each individual. The chosen method also depends on the amount of sample material available from each individual or sample. For example, sequencing may be used for determining presence or absence of markers such as SNPs, e.g. Sanger sequencing and High Throughput Sequencing (HTS) technologies.
- Sanger sequencing may involve sequencing via detection through (capillary) electrophoresis, in which up to 384 capillaries may be sequence analysed in one run.
- High throughput sequencing involves the parallel sequencing of thousands or millions or more sequences at once.
- HTS can be defined as Next Generation sequencing, i.e. techniques based on solid phase pyrosequencing, or as Next-Next Generation sequencing based on single molecule real time sequencing (SMRT).
- HTS technologies are available such as offered by Roche, Illumina and Applied Biosystems (Life Technologies). Further high throughput sequencing technologies are described by and/or available from Helicos, Pacific Biosciences, Complete Genomics, Ion Torrent Systems, Oxford Nanopore Technologies, Nabsys, ZS Genetics, GnuBio.
- Each of these sequencing technologies has its own way of preparing samples prior to the actual sequencing step. These steps may be included in the high throughput sequencing method. In certain cases, steps that are particular for the sequencing step may be integrated in the sample preparation protocol prior to the actual sequencing step for reasons of efficiency or economy.
- adapters that are ligated to fragments may contain sections that can be used in subsequent sequencing steps (so-called sequencing adapters).
- Primers that are used to amplify a subset of fragments prior to sequencing may contain parts within their sequence that introduce sections that can later be used in the sequencing step, for instance by introducing through an amplification step a sequencing adapter or a capturing moiety in an amplicon that can be used in a subsequent sequencing step.
- amplification steps may be omitted.
- Genotype/phenotype relationship model: a model that can associate (correlate) genotype with phenotype for individuals in a population. To create such a model it is typically required to phenotype individuals of a population and genotype the same individuals. In particular, genotyping can be based on high-density marker data, such as data on the presence or absence of a SNP at a plurality of loci. Likewise, phenotyping can be performed at high accuracy, for example by measuring the value for the quantitative phenotypic trait of interest per individual. The genotype/phenotype relationship model can then be created by calculating correlations between the genotypic data and the phenotypic data.
- the model can attribute a contribution to the quantitative phenotypic trait of interest to the presence or absence of a marker.
- Said contribution may for example be expressed in kg, m, L, depending on the unit of measure as used for the quantitative phenotypic trait of interest (for example fruit size, milk production, etc.).
- Various methods are available in the art in order to construct such a model (Meuwissen et al., 2001).
- locus refers to a specific site (place) or sites on the genome.
- locus refers to the site in the genome where the two alleles of the locus are found (for diploid organisms).
- QTLs: quantitative trait loci are sites on the genome containing alleles that are associated with a quantitative trait (based on the genotype/phenotype relationship model).
- Allele refers to the nucleotide sequence variant that is present on one of the positions of a particular locus.
- a diploid individual has two positions for one allele per locus, one position on either one of the two homologous chromosomes.
- one or more alternative nucleotide sequence variants may exist in a population, i.e. for each position different possible alleles may exist in a population. However, each individual can have only one of the possible alleles on each one of the positions of a locus.
- The alternative nucleotide sequence variants, i.e. the different possible alleles, differ at least slightly in nucleotide sequence, and typically can be distinguished based on the presence or absence of at least one SNP or InDel.
- By "allelic state", reference is made to the presence or absence of an allele at a position within a particular locus, which can be expressed as the presence or absence of the respective marker (e.g. SNP or InDel) at the particular locus.
- Allele dose of a locus: the number of copies present in a genome of a given allele on a given locus.
- The allele dose ranges from 0 (no copies present) to the (auto)ploidy level of the genome; i.e. for diploid species, the allele dose for a given allele can be either 0, 1 or 2.
- the max allele dose corresponds to the number of homologous chromosome copies.
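- As a small illustration of the allele dose definition, the sketch below counts the copies of a given allele at one locus. The function name `allele_dose` and the representation of a locus as a tuple with one entry per homologous chromosome copy are assumptions made for this example only.

```python
def allele_dose(locus_alleles, allele):
    """Count the copies of `allele` at one locus.

    `locus_alleles` holds one entry per homologous chromosome copy,
    so its length equals the ploidy level (hypothetical representation).
    """
    return sum(1 for a in locus_alleles if a == allele)


diploid_locus = ("A", "B")               # heterozygous diploid
tetraploid_locus = ("A", "A", "B", "A")  # autotetraploid

print(allele_dose(diploid_locus, "A"))     # 1
print(allele_dose(tetraploid_locus, "A"))  # 3
```

- The dose can never exceed the length of the tuple, which mirrors the statement that the maximum allele dose corresponds to the number of homologous chromosome copies.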
- Attributed Allele substitution effect refers to the estimated quantitative effect on the trait when on a given locus the one allele (e.g. as measured by presence of a particular SNP) is substituted by the other allele (e.g. as measured by absence of the particular SNP) within a given genetic and/or environmental background.
- For example, if fruit yield is the quantitative phenotypic trait of interest in a population of plants, the quantitative effect on that trait may be expressed in kg. A particular allele on a given locus (e.g. as measured by presence of a particular SNP) may then be attributed an allele substitution effect of e.g. 0.0001 kg, which means that if the particular allele is replaced by the other possible allele (e.g. as measured by absence of the particular SNP), the quantitative effect on the trait, i.e. fruit yield, is estimated to be 0.0001 kg.
- Attributed allele substitution effect corrected for recombination probability: the attributed allele substitution effect can be corrected for recombination probabilities. The further away two loci are from each other, the more likely it is that recombination (crossing over) takes place between the two loci. The distance between loci is measured in terms of recombination probability and is given in cM (centiMorgans; 1 cM is a meiotic recombination probability between two markers of 1%). This is relevant because for both positively and negatively contributing alleles, one would like to know the chance that they are transmitted to offspring.
- a positive attributed allele substitution effect can be corrected for recombination probability by taking into account the probability that (after crossing with another individual) the allele is transmitted to the genome of offspring.
- a negative attributed allele substitution effect can be corrected for recombination probability by taking into account the probability that (after crossing with another individual) the allele is not transmitted to the genome of offspring.
- Heterozygous and homozygous: the term "heterozygous" refers to a genetic condition existing when two different alleles reside at a specific locus, for example a locus having alleles A/B, wherein A and B are positioned individually on either one of the two homologous chromosomes.
- the term “homozygous” refers to a genetic condition existing when two identical alleles reside at a specific locus, for example a locus having alleles A/A, positioned individually on either one of the two homologous chromosomes.
- Molecular marker technique refers to a (DNA based) assay that indicates (directly or indirectly) the presence or absence of a marker allele of interest in an individual (e.g. (crop) plant or cattle). Preferably, it allows one to determine, e.g. by sequencing, whether a particular allele is present or absent at one of the positions at the locus in any individual.
- the present disclosure relates to a method for identifying combination(s) of at least three individuals within a breeding population, wherein the combinations have, for at least one phenotypic trait of interest, a higher Combined Genome-Wide Estimated Breeding Value in the offspring for said at least one phenotypic trait of interest, as compared to at least 70% of the other combinations of at least three individuals within said breeding population.
- the present method comprises the following steps:
- the present method allows to identify which subset(s) of at least three parents together have the best combined genome-wide estimated breeding value. In this way, one could say that the present method actually assesses the genome-wide estimated breeding value of the putative offspring of different combinations of at least three potential parents.
- the method allows to identify (and/or subsequently select) at least one (e.g. at least 2, at least 3, at least 5, or at least 10) combination of at least three (e.g. at least 5, at least 10) individuals in a (breeding) population that together have the highest probability to produce the best performing offspring in the next generation(s) in a breeding program aimed at improving at least one (or at least 2, 3, 4) (quantitative) phenotypic trait of interest.
- By "a CGEBV in the offspring of a combination", the CGEBV of the respective combination of individuals is meant, which reflects the breeding value of their putative offspring.
- the method can identify the combination (b, c and d) as the best combination of three individuals (subset).
- The method also allows to identify and/or select more than one combination for use in a subsequent breeding program, e.g. the method can identify the combination (b, c and d) as well as the combination (a, c, and d), because their CGEBVs are both higher than that of the combination (a, b, and c).
- Exactly the same principle can be applied to extract subgroups of more than three parents out of larger panels, by calculating CGEBVs for all triplets, quartets etc. and ranking these.
- a (training) population of individuals is provided.
- This population can optionally be called a training population, because it serves for the establishment of a genotype/phenotype relationship model.
- Such model allows to attribute an allele substitution effect on the at least one phenotypic trait of interest to each of the alleles of a plurality of loci of individuals of a breeding population. Therefore, preferably, the training population and the breeding population relate to the same plant or non-human animal species, and more preferably the training population is the same as the breeding population or most preferably a selection of individuals therefrom. It is also possible that the training population is a specifically designed population, which means that the population is specifically compiled for the purpose of generating a phenotype/genotype relationship model.
- the term "individual” refers to living subjects, and in particular to (crop) plants or non-human animals such as cattle.
- the training population comprises at least 3, at least 10, or at least 50 individuals, but (in particular if the individuals are plants) the training population may also comprise at least 100, or at least 500 individuals.
- In step b) of the method, phenotypic data is collected for the at least one phenotypic trait of interest for each individual within said population. For example, if the trait concerns the quantity of milk production (cattle), or the size of the flowers (plants), one can measure, for each individual of the training population, the quantity of milk production (in L) or the size of the flowers (diameter in m).
- In step c) of the method, genotypic data is collected for each individual within the training population using methods well-known to the skilled person, such as molecular marker techniques, sequence-based genotyping or whole genome sequencing.
- genotyping, or determining the genotype refers to the process of determining genetic variations among individuals in the population.
- the skilled person has various molecular biology techniques at his disposal such as hybridisation analysis, PCR and preferably sequencing in order to examine DNA molecules of the individuals in order to unravel sequence variations between said individuals.
- the molecular marker technique(s) used in the present method are preferably selected from the group consisting of the detection of SNPs, the detection of RFLPs, the detection of SSR polymorphisms, RAPDs, the detection of indels or CNVs, and AFLP.
- Step c) then continues with attributing to each allele of a plurality of loci of each individual, an allele substitution effect for the at least one phenotypic trait of interest.
- Said attributing is typically based on the identification of correlations between the phenotypic data and the genotypic data.
- the term "allele substitution effect”, as also explained elsewhere herein, refers to an estimated quantitative effect on a certain phenotypic trait when on a given locus the one allele (e.g. as measured by presence/absence of a particular SNP) is substituted by the respective allele (e.g. as measured by presence/absence of the particular SNP) within a given genetic and/or environmental background.
- the allele substitution effect of a certain allele with SNP versus an allele without the SNP on the same position can be based on comparing phenotypes of individuals having only the allele with SNP with phenotypes of individuals having only the allele without SNP. Such comparing may identify correlations between the genotype and the phenotype.
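- The comparison described above can be sketched as a contrast between the mean phenotypes of the two homozygous classes at one locus. This is only a crude illustration under stated assumptions, not the estimator of the genotype/phenotype relationship model: the function name `homozygote_contrast`, the use of allele dose (0 or 2 for a diploid) to identify the classes, and the toy yields are all hypothetical.

```python
from statistics import mean


def homozygote_contrast(doses, phenotypes, ploidy=2):
    """Mean phenotype of individuals carrying only the SNP allele minus
    the mean phenotype of individuals carrying only the non-SNP allele."""
    only_with_snp = [y for d, y in zip(doses, phenotypes) if d == ploidy]
    only_without_snp = [y for d, y in zip(doses, phenotypes) if d == 0]
    if not only_with_snp or not only_without_snp:
        raise ValueError("both homozygous classes must be observed")
    return mean(only_with_snp) - mean(only_without_snp)


# Toy data: SNP allele dose and fruit yield (kg) for five plants.
doses = [2, 2, 0, 0, 1]
yields_kg = [5.2, 4.8, 4.0, 4.2, 4.6]
contrast = homozygote_contrast(doses, yields_kg)
print(round(contrast, 3))
```

- Heterozygous individuals (dose 1) are ignored here, in line with the text comparing only individuals that carry exclusively one of the two alleles.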
- step d) can provide a genotype/phenotype relationship model for the training population of individuals, wherein the model allows to estimate (and/or attribute) for a given genotype of an individual within a breeding population, what the quantitative contribution is of the allele substitution effects of the plurality of loci on the at least one phenotypic trait of interest.
- the model can attribute to each allele of the plurality of loci, an allele substitution effect based on the correlations found while or after producing the genotype/phenotype relationship model.
- steps a), b), c), and d) are not required (and thus optional).
- each individual within a breeding population is genotyped, for which molecular marker techniques well-known to the skilled person can be used.
- the molecular marker technique(s) used in the present method are preferably selected from the group consisting of the detection of SNPs, the detection of RFLPs (differing locations of restriction enzyme sites), the detection of SSR (Simple Sequence Repeat) polymorphisms, RAPDs (Random Amplification of Polymorphic DNA), the detection of indels or CNVs (Copy Number Variations), and AFLP (Amplified Fragment Length Polymorphism).
- In step f) of the present method, for each individual within the breeding population, for each allele of a (the) plurality of loci, the allele substitution effect is calculated (attributed) using the genotype/phenotype relationship model of step d).
- the breeding population refers to the population of individuals which can be further intercrossed with the aim of improving the at least one phenotypic trait in subsequent generation(s). So, within step e) (or prior to step e)) the providing of a breeding population is envisaged. It will be clear that the attribution of allele substitution effects to the individuals within the breeding population can be based on the results of prior genotyping of said individuals.
- the plurality of loci may refer to as few as two, five, or twenty loci which may be located on separate regions of the genome such as different chromosomes, but the term plurality may also refer to at least 10, at least 25, at least 100, or at least 500, or at least 1000, or at least 5000 or more different loci preferably located on separate regions of the genome such as different chromosomes.
- the plurality of loci are (genome-wide) loci located on the entire genome, e.g. located on at least 2, at least 5, at least 10, or at least 20 chromosomes.
- the plurality of loci comprises at least 100, at least 500, at least 1000, or at least 2000 loci.
- At least one locus of the plurality of loci is found every 100 cM, preferably every 50 cM, more preferably every 25 cM, even more preferably every 10 cM, even more preferably every 5 cM, most preferably every 1 cM of the genome.
- Step f) also aims to correct the attributed allele substitution effects as awarded to each allele of the plurality of loci, for (estimated) recombination probabilities with flanking loci (e.g. the previous or preferably the next locus (or both) in the 5' to 3' direction).
- step f) of the method allows for correcting of the attributed allele substitution effects attributed to each allele of the plurality of loci of each individual within the population for (estimated) recombination probabilities.
- Correction of a positive attributed allele substitution effect is preferably (not necessarily) done by multiplying the effect with the probability that the corresponding allele is transmitted to the offspring (i.e. a gamete), and correction of a negative attributed allele substitution effect is preferably (not necessarily) done by multiplying the effect with the probability that the corresponding allele is not transmitted to the offspring (i.e. a gamete).
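- The preferred correction rule can be sketched as follows. The function name `correct_effect` and the idea of passing the transmission probability in directly are assumptions for illustration; how that probability is derived from the recombination probabilities with flanking loci is left outside this sketch.

```python
def correct_effect(effect, p_transmitted):
    """Correct an attributed allele substitution effect.

    Positive effects are weighted by the probability that the allele is
    transmitted to a gamete, negative effects by the probability that it
    is not transmitted.
    """
    if not 0.0 <= p_transmitted <= 1.0:
        raise ValueError("p_transmitted must be a probability")
    if effect >= 0:
        return effect * p_transmitted
    return effect * (1.0 - p_transmitted)


favourable = correct_effect(0.5, 0.9)     # weighted by 0.9
unfavourable = correct_effect(-0.5, 0.9)  # weighted by 0.1
print(favourable, unfavourable)
```

- With this rule, a negative effect carried by an allele that is very likely to be transmitted is shrunk towards zero, as prescribed by the preferred correction for negative effects.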
- the recombination probabilities are calculated based on genetic distances between loci, or based on aligning physical and genetic maps. Further details hereon may be found in Liu (1998).
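- One common way to turn a genetic distance into a recombination probability is Haldane's map function. The text does not state which map function is used, so its choice here is an assumption, as is the function name `haldane_recombination`. For short distances the result is close to the rule of thumb that 1 cM corresponds to a recombination probability of 1%.

```python
import math


def haldane_recombination(distance_cm):
    """Recombination probability between two loci separated by
    `distance_cm` centiMorgans, using Haldane's map function
    r = 0.5 * (1 - exp(-2d)), with d expressed in Morgans."""
    if distance_cm < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


for d in (1, 10, 50, 200):
    print(d, round(haldane_recombination(d), 4))
```

- The probability rises with distance and levels off at 0.5, which corresponds to loci that segregate independently.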
- Step g) relates to determining the Combined Genome-wide Estimated Breeding Value (CGEBV) (in the offspring) for the at least one phenotypic trait of interest for each combination of at least three individuals within the (breeding) population, by calculating, for each combination of at least three individuals and for each locus of said plurality of loci (in the offspring), the highest combination of allele substitution effects using the corrected allele substitution effects of the individuals calculated in step f).
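- A minimal sketch of step g) for triplets is given below. It assumes, as one possible reading of "the highest combination of allele substitution effects" per locus, that the CGEBV of a combination is the sum over loci of the highest corrected locus effect found among its members; this is not necessarily the formula used in the disclosure. The individuals a to d, their values, and the function names `cgebv` and `rank_combinations` are made up for illustration.

```python
from itertools import combinations


def cgebv(members, corrected_effects):
    """Sum over loci of the highest corrected locus effect among members
    (assumed scoring rule, see lead-in)."""
    per_locus = zip(*(corrected_effects[m] for m in members))
    return sum(max(values) for values in per_locus)


def rank_combinations(corrected_effects, size=3):
    """Score every combination of `size` individuals, best first."""
    scored = [(cgebv(combo, corrected_effects), combo)
              for combo in combinations(sorted(corrected_effects), size)]
    return sorted(scored, reverse=True)


# Hypothetical corrected locus effects for four individuals and four loci.
corrected_effects = {
    "a": [3, 0, 0, 0],
    "b": [0, 4, 0, 0],
    "c": [0, 0, 2, 0],
    "d": [2, 0, 0, 5],
}
ranking = rank_combinations(corrected_effects, size=3)
for value, combo in ranking:
    print(combo, value)
```

- Changing `size` to 4 or 5 scores quartets or quintets in the same way; the number of combinations grows quickly with panel size, which is one reason a pre-selection of individuals can be useful.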
- The present method takes into account, for each attributed allele substitution effect, the chance that the corresponding allele actually ends up in a gamete of an individual. This is relevant because for both positively and negatively contributing alleles, one would like to know the probability that they are transmitted to a gamete. For example, one could consider the following plurality of loci consisting of three loci of individual 1 (wherein 1 refers to presence of the marker allele and 0 to the absence of the marker allele):

| Individual 1 | First allele | Second allele |
| --- | --- | --- |
| Locus 1 | 1 | 0 |
| Locus 2 | 1 | 0 |
| Locus 3 | 0 | 1 |

- Recombination between locus 1 and locus 2 is desired, such that the first allele of Locus 1 and the second allele of Locus 2 are transmitted to a gamete, because this leads to an increase in attributed allele substitution effects in the gamete.
- Recombination between the second allele of Locus 2 and the first allele of Locus 3 is not desired, because that would lead to a decrease of attributed allele substitution effects in the gamete.
- the probability that recombination occurs between two loci can be calculated based on the genetic distance between the two loci. This calculation is based on the fact that the chance of recombination occurring between loci that are located proximal to each other is lower as compared to the chance of recombination occurring between loci that are located less proximal to each other.
- For example, the following recombination probabilities may have been estimated:

| Locus | Probability of recombination occurring with the previous locus | Probability of no recombination occurring with the previous locus |
| --- | --- | --- |
| Locus 1 | 1 | 0 |
| Locus 2 | 0.1 | 0.9 |
| Locus 3 | 0.15 | 0.85 |
- CGEBV: Combined Genome-wide Estimated Breeding Value.
- step h) identifies the at least one combination(s) of at least three individuals within the (breeding) population that have a higher CGEBV (preferably the highest) for the at least one phenotypic trait of interest, as compared to at least 70% of the other combinations, preferably as compared to at least 80%, 90%, 95%, 99% or most preferably 100% of the other combinations of at least three individuals within the breeding population.
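- The selection criterion of step h) can be sketched as a simple filter over already scored combinations. The function name `combinations_above` and the toy CGEBV values are hypothetical; only the threshold logic (higher than at least a given fraction of the other combinations) follows the text.

```python
def combinations_above(scored, fraction=0.70):
    """Keep combinations whose CGEBV is higher than that of at least
    `fraction` of the other combinations.

    `scored` is a list of (cgebv_value, combination) pairs.
    """
    selected = []
    for value, combo in scored:
        others = [v for v, c in scored if c != combo]
        if not others:
            continue
        beaten = sum(1 for v in others if value > v)
        if beaten / len(others) >= fraction:
            selected.append((value, combo))
    return selected


# Toy CGEBV values for the four triplets of individuals a, b, c, d.
scored = [
    (13, ("b", "c", "d")),
    (12, ("a", "b", "d")),
    (10, ("a", "c", "d")),
    (9, ("a", "b", "c")),
]
print(combinations_above(scored, fraction=0.70))
print(combinations_above(scored, fraction=0.50))
```

- Setting `fraction=1.0` keeps only a combination that is strictly better than every other one, which corresponds to the most preferred case of 100%.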
- The present method allows for a pre-selection of individuals within the breeding population.
- This pre-selection preferably takes place after the providing of the breeding population in step e) of the method.
- The pre-selection allows to reduce the number of individuals to be considered for the calculating part of step e), and thus allows to reduce the number of combinations of individuals for which the CGEBV has to be calculated. In practice this may be worthwhile in certain situations, particularly when the number of individuals or combinations of individuals, or the number of loci to be considered, demands more computational power than the user of the method can provide. In such a scenario, it may be advantageous to perform a pre-selection to reduce the number of (combinations of) individuals for which the CGEBV has to be calculated.
- The pre-selection of individuals within the breeding population to be combined is made by selecting (exclusively) at most 30%, more preferably at most 20%, even more preferably at most 10%, yet even more preferably at most 5%, most preferably at most 2% of the individuals with the highest sum (or higher as compared to the other, not selected individuals) of all corrected allele substitution effects attributed to the plurality of loci.
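- The percentage-based pre-selection can be sketched as follows. The function name `preselect_top`, the integer `percent` parameter, and the choice to always keep at least one individual are assumptions for this illustration; the toy data are made up.

```python
def preselect_top(corrected_effects, percent=30):
    """Return the names of at most `percent` % of the individuals,
    ranked by the sum of their corrected allele substitution effects."""
    totals = {name: sum(values) for name, values in corrected_effects.items()}
    keep = max(1, len(totals) * percent // 100)  # at least one (assumption)
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:keep]


# Ten hypothetical individuals with three loci each; ind9 has the highest sum.
corrected_effects = {f"ind{i}": [0.1 * i, 0.2 * i, -0.05 * i] for i in range(10)}
print(preselect_top(corrected_effects, percent=30))
print(preselect_top(corrected_effects, percent=10))
```

- Only the retained individuals would then enter the enumeration of combinations, which reduces the number of CGEBV calculations.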
- the corrected attributed allele substitution effect of each locus of each individual in the population can be calculated.
- said corrected attributed allele substitution effect of a particular locus can be referred to as the "corrected locus effect" of that particular locus.
- The result of calculating the corrected locus effects of a (sub set of the) plurality of loci shows which individuals have higher corrected locus effects for certain subsets of the plurality of loci than the other individuals, which can form the basis of a pre-selection. For example, one can identify which individuals have higher corrected locus effects for different parts of the plurality of loci, e.g. relating to parts of the plurality of loci associated with at most part of the genome, e.g. one chromosome. For example, if the plurality of loci comprises 100 loci, one can make a pre-selection of e.g. five individuals, being the individual with the highest total of locus effects for loci 1-20, the individual with the highest total of locus effects for loci 21-40, the individual with the highest total of locus effects for loci 41-60, the individual with the highest total of locus effects for loci 61-80, and the individual with the highest total of locus effects for loci 81-100.
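- The block-wise pre-selection of the 100-loci example can be sketched as below, on a smaller toy data set. The function name `best_per_block`, the fixed `block_size`, and the alphabetical tie-break are assumptions for illustration.

```python
def best_per_block(locus_effects, block_size):
    """For each consecutive block of loci, return the individual with the
    highest total of (corrected) locus effects within that block."""
    names = sorted(locus_effects)
    n_loci = len(locus_effects[names[0]])
    winners = []
    for start in range(0, n_loci, block_size):
        stop = start + block_size
        totals = {name: sum(locus_effects[name][start:stop]) for name in names}
        winners.append(max(names, key=totals.get))  # ties: first by name
    return winners


# Three hypothetical individuals, six loci, blocks of two loci.
locus_effects = {
    "x": [5, 5, 0, 0, 1, 1],
    "y": [0, 0, 4, 4, 1, 1],
    "z": [1, 1, 1, 1, 3, 3],
}
print(best_per_block(locus_effects, block_size=2))  # ['x', 'y', 'z']
```

- For the example in the text, `block_size=20` on 100 loci would yield five winners; the same individual may win more than one block, in which case fewer than five distinct individuals are pre-selected.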
- S represents the corrected locus effect of a particular locus, i.e. the sum of the corrected attributed allele substitution effects of that locus.
- S can be calculated by multiplying P and F, wherein P represents the (uncorrected) locus effect of a particular locus, and F represents the estimated recombination probability between flanking loci.
- step g) of the present method is by calculating CGEBV according to Formula I:
- the CGEBV is determined for at least two, at least three, at least four, or at most one, at most two or at most three phenotypic traits of interest.
- the at least one phenotypic trait of interest is a quantitative trait.
- the trait is preferably influenced by multiple, e.g. at least 10, at least 20, at least 30, at least 40, at least 100, at least 200, at least 500, or at least 1000 genes, and/or preferably can (only) phenotypically be measured in quantitative terms (e.g. in kg, m, or L).
- The present method is preferably used prior to actual selection and/or intercrossing of the identified combination of individuals. It is therefore optional, but not preferred, that the present method comprises step i) of (enabling) intercrossing (or interbreeding) of members of the identified combination of at least three individuals, such that offspring, i.e. a next generation, is obtained, or even that the resulting offspring as obtained is intercrossed.
- the present method can be applied to more than one generation (see e.g. Figure 3 ), such as to at least 2, 3, or at least 5 generations, although this may not be necessary in every situation.
- the identification of the best parent subgroup finds its basis in the accurate estimation of allelic effects of chromosomal regions. Therefore, the accuracy of the applied genomic selection model ideally should be as high as possible.
- a way to enhance the level of accuracy of the model is to use state-of-the-art model construction methodology, to improve the quality of the phenotype data collected for the training panel, and to optimize the ratio marker density/average window of linkage disequilibrium, as well as the ratio loci contributing to the trait/number of observations in the training panel.
- Factors like the genetic complexity and heritability of the trait of interest, the genetic diversity of the training panel, and the level of genetic relationship between the training panel and the candidate parent panel may also influence the accuracy.
- the breeding population of individuals used in the present method can be of different ploidy nature.
- the population of individuals can be of a diploid, allopolyploid, or autopolyploid species. The same applies to the training population.
- the population of individuals preferably is a field crop, or vegetable crop, or woody fruit species, or forestry species, or plantation crop, preferably selected from the group consisting of Arabidopsis thaliana, Abyssinian mustard, alfalfa, barley, barrel clover, black mustard, buckwheat, canola, clover, common flax, common vetch, corn spurry, coffee, cotton, Egyptian clover, fodder beet, hemp, hop, Indian mustard, Jerusalem artichoke, maize, millet, mustard, lupin, oat, oilseed rape ( Brassica napus), field mustard (Brassica rapa), opium poppy, Persian clover, potato, red clover, rye, safflower, sisal, soy bean, sugar beet, sunflower, tea, tobacco, triticale, wheat, white clover, white mustard, wild rice, winter vetch, artichoke, asparagus, asparagus beans, aubergine, be
- the population of individuals is of a species selected from the group consisting of Cattle (Bos taurus, Bos indicus), Water buffalo (Bubalus bubalis), Equine (Equus caballus), Sheep (Ovis aries), Goat (Capra hircus), Pig (Sus scrofa), Chicken (Gallus gallus), Turkey (Meleagris gallopavo), Ducks (Anas platyrhynchos, Cairina moschata), Geese (Anser anser domesticus, Anser cygnoides), Pigeons (Columba livia domestica), Rat (Rattus norvegicus), Mouse (Mus musculus), Cat (Felis catus), Dog (Canis familiaris), Rabbit (Oryctolagus cuniculus), Guinea pig (Cavia porcellus), Zebra fish (Danio rerio) and Fruit fly (Drosophila me
- the population of individuals is of a fish species selected from the group consisting of Cyprinus carpio, Salmo salar, Oreochromis niloticus, Oncorhynchus mykiss, Ctenopharyngodon idella, Hypophthalmichthys molitrix, Gibelion catla, Cyprinus carpio, Hypophthalmichthys nobilis, Carassius carassius, Oreochromis niloticus, Pangasius pangasius and Labeo rohita, or wherein the method is applied to a shrimp species selected from the group consisting of Macrobrachium rosenbergii, Litopenaeus vannamei and Penaeus monodon.
- The method of the disclosure comprises a method for identifying combinations of at least three individuals, and as such the identification process itself does not require crossing and subsequent selection of plants or animals.
- the present method for identifying typically is not an essentially biological process for the production of plants or animals, and does not necessarily require crossing and subsequent selection of plants or animals.
- a computer-readable medium comprising instructions for performing the present method.
- the attributing of step b), and steps c), d), e), f), g) and h) as a whole of the present method are computer-implemented steps and/or the present method is (partly) a computer-implemented method.
- the method of the present disclosure can advantageously be used particularly for improving at least one (quantitative) phenotypic trait of interest in a breeding program.
- Also foreseen is a product obtainable by the method according to the disclosure, preferably wherein the product is a plant.
- Two-state allelic coding (-1/1) was applied to indicate the allelic status per allele at each locus (data derived from the dataset), where -1 means the absence of the marker allele and 1 means the presence of the marker allele.
- Each allele was attributed an allele substitution effect, i.e. the contribution to the phenotype of interest; the size of this effect was randomly drawn from a truncated normal distribution with mean 0 and standard deviation 1, for which all negative values were discarded. In this way each locus may contribute both positively and negatively to the trait depending on its allelic state, while the size of the contribution is determined by the effect size.
- regGWS: for each of the parental lines a per se performance was determined based on the accumulated effects of each of the individual loci, by multiplying the allele substitution effect (drawn from a truncated Normal distribution) with the allelic state (-1 or 1) at each locus for all loci present in the genome. The lines with the highest predicted genomic performance were selected and intercrossed. In a simple bi-parental simulation only two parents were involved. In more advanced simulations several breeding cycles (generations) were simulated, in which in each additional cycle an additional parent was crossed with selected progeny (see below) obtained from the previous cycle. The order of the selected parents for use in next cycles followed the predicted performance ranking, i.e. the third best parent entered the breeding cycle as the third parent.
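- The simulation set-up for the regular approach can be sketched as follows. Effects are drawn from a standard normal distribution with negative draws discarded, and the per se performance of a line is the sum over loci of effect times allelic state. The number of loci and lines, the seed, the random genotypes, the assumption of one allelic state per locus (fully inbred lines), and all names are hypothetical.

```python
import random


def draw_effect(rng):
    """Draw from N(0, 1), discarding negative values."""
    while True:
        value = rng.gauss(0.0, 1.0)
        if value >= 0.0:
            return value


def per_se_performance(states, effects):
    """Accumulated effect: allelic state (-1 or 1) times effect per locus."""
    return sum(s * e for s, e in zip(states, effects))


rng = random.Random(42)
n_loci = 50
effects = [draw_effect(rng) for _ in range(n_loci)]
lines = {f"line{i}": [rng.choice((-1, 1)) for _ in range(n_loci)]
         for i in range(6)}

ranking = sorted(lines,
                 key=lambda name: per_se_performance(lines[name], effects),
                 reverse=True)
best_two = ranking[:2]  # parents of a simple bi-parental cross
print(best_two)
```

- In a multi-cycle simulation, the next entries of `ranking` would supply the additional parents in order, following the predicted performance ranking.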
- dirGWS: parents were not selected based on their per se performance but rather on the potential performance of their combined genomes. For all combined sets (of size 3, 4 or 5) of lines taken from the parental set of lines a predicted combined performance was estimated. The set with the highest combined performance was selected and the members of this set were used as parents for crossing.
- the genome-wide estimated breeding value of a parent is calculated by the total of the values for S , i.e. the corrected locus effect, of each locus.
- The corrected data matrix S can then be used as a basis for (pre)selection.
- An example of S is given for five parental lines containing 250 loci located on 5 chromosomes. From this figure it can be seen that parent 4 exceeds all other parents at the first two chromosomes (up to locus 100) but is underperforming on the other chromosomes. The relative performance of an individual was calculated in this way. The combination of parents 4, 2 and 1 together gives the maximum S on all chromosomes and is most likely to outperform other combinations of two or more parents.
- Once the corrected population matrix S is known, the best combination of parental lines can be chosen.
- Stochastic simulation was used to evaluate the performance of each of the two methods. Two parents from the selected set of parents were intercrossed to generate a new hybrid genotype. This hybrid product of each cross was considered to be the base genotype from which, through stochastic simulation, 1000 gametes were generated. The generated gametes were used as a sample of the potential genetic performance of the pair of parents. These gametes were ordered by their performance, which was calculated by multiplying their allelic state at each locus (-1 or 1) with the locus genomic prediction values. The 95th percentile value of the ranked performances of the 1000 gametes was taken as a measure to judge the offspring performance.
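- The gamete simulation can be sketched as follows, under several assumptions that are not taken from the text: both parents are fully homozygous (so the hybrid carries one haplotype from each), all loci lie on a single linkage group with a uniform recombination probability of 0.1 between neighbours, and the 95th percentile is taken by a nearest-rank rule. All names, the seed and the toy sizes are hypothetical.

```python
import random


def make_gamete(hap_a, hap_b, recomb_probs, rng):
    """Walk along the loci, switching haplotype with probability
    recomb_probs[i] between locus i-1 and locus i."""
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    gamete = []
    for locus, r in enumerate(recomb_probs):
        if locus > 0 and rng.random() < r:
            current, other = other, current
        gamete.append(current[locus])
    return gamete


def percentile_95(values):
    """Nearest-rank style 95th percentile of a list of numbers."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, (95 * len(ordered)) // 100)
    return ordered[index]


rng = random.Random(1)
n_loci = 20
prediction_values = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_loci)]
parent_a = [rng.choice((-1, 1)) for _ in range(n_loci)]
parent_b = [rng.choice((-1, 1)) for _ in range(n_loci)]
recomb = [0.5] + [0.1] * (n_loci - 1)  # first entry is not used

gametes = [make_gamete(parent_a, parent_b, recomb, rng) for _ in range(1000)]
scores = [sum(s * v for s, v in zip(g, prediction_values)) for g in gametes]
offspring_measure = percentile_95(scores)
print(round(offspring_measure, 3))
```

- Repeating this for the parents chosen by each selection method gives two values of `offspring_measure` that can be compared directly.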
- FIG. 3 is a further graphical representation of genomewide breeding potentials in a population. The results of Figure 4 were obtained by simulated GS models and real Arabidopsis thaliana genotype data.
- The genomic selection models were constructed using Ridge Regression (Meuwissen et al., 2001). The models used in both regGWS and dirGWS were always pairwise identical for each test round.
- Table 3 shows the results of the dirGWS vs regGWS comparison.
- the directed GWS method outperformed the regular GWS selection in more than half of the cases and this frequency increased dramatically when more than three parents were involved in the breeding scheme.
- The increase in superiority with increasing number of parents demonstrates that the focus on an optimal combination of complementary genomic regions, as is the main idea of the directed GWS approach, indeed yields better results than a focus on per-se performance of the parents, as is typically done in classical GWS approaches.
- Example 1 Arabidopsis thaliana
- a public genotype data set of the model organism Arabidopsis thaliana (At) was retrieved, consisting of genotype data for 250K loci of 1179 ecotypes (Horton et al., 2012). Simulated (i.e. attributed) allele substitution effects were genomewide randomly distributed over 500-2000 loci.
- the At lines are divided uniformly at random in a training set of 50-1000 parents, which are used to construct GS models, and a validation set (for determining the accuracy) consisting of the remaining parents. The tests were done selecting either the 99% or 100% best ranking combinations and following 2-5 breeding cycles.
- Each parameter combination (i.e. number of loci, number of parents in the training set, selection percentage, and number of breeding cycles) was repeated 200 times.
- Fig 5 shows the results of the dirGWS vs regGWS comparison, wherein the training population serves as breeding population.
- Directed GWS method outperformed the regular GWS selection, in particular when the GS models are more accurate (R>0.6).
- the dirGS strategy is providing better selection results, in particular when more than 2 parents are involved and the crossing scenario spans more than one generation.
- The increase in superiority with increasing number of parents demonstrates that the focus on an optimal combination of complementary genomic regions, according to the present invention, indeed yields better results than a focus on per-se performance of the parents, as is typically done in classical GWS approaches.
- Test results wherein the training population serves as breeding population, shown in Fig 6 , indicate again that for most parameter combinations, the selection results are better when following the dirGS strategy, in particular when more than 2 parents are involved. The accuracy has a less dramatic effect on the overperformance of the dirGS strategy than in the previous example.
- A third test was conducted using Cucumber (Cucumis sativus) genotype data for 3.7 × 10^6 loci of 115 lines (Qi et al., 2013). The set was reduced to the homozygous marker subset with no missing data (179K markers) and 86 non-identical parent lines. Trait effects were simulated on 450-1789 loci. GS models were constructed on training populations of 50 training parents. Further test procedures were similar to those in the previous two examples.
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Genetics & Genomics (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Analytical Chemistry (AREA)
- Molecular Biology (AREA)
- Environmental Sciences (AREA)
- Developmental Biology & Embryology (AREA)
- Botany (AREA)
- Zoology (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioethics (AREA)
- Animal Behavior & Ethology (AREA)
- Animal Husbandry (AREA)
- Biodiversity & Conservation Biology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Organic Chemistry (AREA)
- Wood Science & Technology (AREA)
- Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
- General Engineering & Computer Science (AREA)
- Biochemistry (AREA)
Abstract
Description
- The present invention relates to the field of methods for improving at least one phenotypic trait of interest in subsequent generation(s) of a population of individuals, preferably crop plants or cattle. Particularly, the present invention relates to methods for identifying the combination of at least three individuals that gives, upon subsequent intercrossing, the highest estimated probability of improving the at least one phenotypic trait of interest in the subsequent generation(s).
- The limitations of marker assisted selection (MAS), which arise when it is applied to manage quantitative variation under the influence of a large number of loci, are expected to be greatly reduced by the simultaneous use of genome-wide markers. Cost-effective genotyping techniques enable breeding companies to generate information about genetic variation at thousands of loci across the genome for all breeding materials a plant breeder has at his or her disposal to create new varieties. As a result, new strategies to perform plant/animal breeding in this new landscape of abundant information are emerging. One strategy is Genomic Selection (GS), the use of genome-wide genotypic data to predict estimated breeding values (EBV) for selection purposes, which was originally proposed in animal breeding [Meuwissen et al., 2001], and has since been successfully applied in animal breeding, for instance in cattle and poultry. The EBV is calculated from performance data and is an approximation of an individual's genetic merit. In GS, EBVs are predicted based on genome-wide information. GS is performed with a class of statistical methods called ridge regression. Ridge regression was first introduced by E.A. Hoerl (1959). Ridge regression is used when ill-posed problems are encountered, for instance when more variables than the total number of observations are used to make a prediction. The most important parameter in ridge regression corresponds to the distribution assumed for the model parameters (GS model). This distribution is used to capture the genotype/phenotype relationship. Intense research has led to multiple distribution assumptions, which in turn has led to a large number of ridge regression models, also called GS models (BayesA, BayesB, BayesC, BayesC pi, BayesD, machine learning methods, information theory methods, etc.). With ever improving statistical methods and increasing sizes of breeding data, it is fair to expect that the accuracy of models capturing genotype/phenotype relationships will gradually improve. It is also expected that a variety of such models will become applicable in breeding, enabling GS to become an increasingly important breeding strategy.
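- For orientation, the standard textbook form of the ridge regression estimator is given below. The notation is an assumption introduced here and does not come from the source: y is the vector of n phenotypic observations, X the n × p matrix of marker scores (with p possibly larger than n), β the vector of marker effects, λ > 0 the penalty, and x_i the marker scores of individual i.

```latex
\hat{\beta}
  = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
  = \left( X^{\top} X + \lambda I_p \right)^{-1} X^{\top} y ,
\qquad
\widehat{\mathrm{GEBV}}_i = x_i^{\top} \hat{\beta}
```

- The penalty term makes the matrix to be inverted non-singular even when p exceeds n, which is why the estimator remains defined in the ill-posed setting described above.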
- In recent years, genomic selection has been under investigation by plant breeders as an alternative to marker assisted selection (MAS) and phenotypic selection. GS as a strategy has the potential to improve highly complex traits or combinations of multiple traits without the requirement to identify significantly linked/associated loci or candidate genes, simply by constructing quantitative genotype/phenotype models over a large number of genome-wide distributed markers. Use of genome-wide estimated breeding values (GEBVs) rather than actual phenotypic values gives breeders the opportunity to select individual animals or plants for trait performance without doing actual phenotyping, thus potentially saving costs and time. This can be applied both to single, complex traits and to multiple traits combined in an index. The possibility to estimate traits at an earlier stage is particularly advantageous in crops with a long breeding cycle (e.g. tree species), where multiple years can easily be gained in this way.
- One major application of GS or methods that capture whole genotype/phenotype relationships in the breeding practice is the selection of parents for the next breeding cycle. This is done by prediction of the GEBVs for a trait or an index of traits for all members of a panel of candidate parents after which the parents with the highest values are selected for further breeding, a practice not unlike the traditional selection practice based on actual phenotypes [Haley and Visscher, 1998].
- As a breeding method, prediction-based strategies based on genotype/phenotype relationships work in two phases, a training phase and a selection phase:
- In the training phase, the investment is made in setting up the phenotypic prediction model. This requires accurate phenotyping of the members of a test population or germplasm panel and high-density genotyping (10^4-10^5 markers) of the same individuals. From these two data types, a model is constructed by one of the multiple methods available for this in the public domain [e.g. Meuwissen et al., 2001]. Optionally, the prediction quality of the constructed model is tested on a second population for which both genotypes and phenotypes have been measured.
- In the second phase, all members of a breeding population are genotyped, using the same marker set of the training phase. By entering the genotypic data in the model constructed in the previous phase, breeding values are predicted for each individual without doing any actual phenotypic measurement. Selections are made from populations for breeding purposes, based on the predicted values.
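- The selection phase can be sketched as follows, assuming a purely additive model in which each marker score (coded here as -1 or 1) is multiplied by its estimated effect. The marker effects, line names and genotypes are made up for illustration; in practice the effects would come from the model built in the training phase.

```python
def predict_gebv(genotype, marker_effects):
    """Additive prediction: sum of marker score times estimated marker effect."""
    return sum(g * e for g, e in zip(genotype, marker_effects))


def select_top(genotypes, marker_effects, n_select):
    """Rank candidates by predicted breeding value and keep the n_select best."""
    gebv = {name: predict_gebv(g, marker_effects) for name, g in genotypes.items()}
    ranked = sorted(gebv, key=gebv.get, reverse=True)
    return ranked[:n_select], gebv


# Hypothetical marker effects from a trained model and hypothetical candidate genotypes.
marker_effects = [3, -1, 2, 4]
genotypes = {
    "line_A": [1, 1, 1, -1],
    "line_B": [1, -1, 1, 1],
    "line_C": [-1, -1, 1, 1],
    "line_D": [-1, 1, -1, -1],
}

selected, gebv = select_top(genotypes, marker_effects, n_select=2)
print(selected, gebv)
```

- This corresponds to selection on per-se predicted values; no phenotypic measurement of the candidates is involved.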
- However, the optimal strategy of selecting individuals to obtain the best gain in phenotype improvement has not been worked out in the prior art. Therefore, it is an object of the present invention to provide for an improved procedure which is better optimized, as compared to the prior art, to identify (and select) the individuals that have the highest probability to produce the best performing offspring in the next generation.
- The present inventors came to the realization that the prior art selection strategy does not consider the long-term impact on the breeding process of the selection made in the current population. This is because in prior art methods, the best per se performing parents will be selected, assuming that, under an additive model, crossing these will result in the best performing genotypes in the next generations.
- However, if, for example, two parents are selected from which the favorable alleles are largely overlapping, the future improvement of the genotype will be limited to obtaining fixed homozygous allele states for these loci (a state that may also be reached by inbreeding from a single parent), and missing the selection opportunity to gather additional favorable alleles on alternative loci by introduction via other parents.
- The prior art methodology to make selections in breeding based on or predicted by a GS model is shaped after the practice of traditional selection on phenotypic observations and overlooks that a genomic prediction model contains detailed information about which genome sections contribute most to the positive and negative performance of the trait. Although the basic principle of GS is the assumption that all loci of a genome contribute to the trait, most GS models specify only an additive effect for each single locus (see e.g. EP1962212 or US2010/0037342). Therefore, although both a directly measurable phenotype and a given GS prediction model might predict an equally high quantitative trait value for two arbitrary individuals, for the next breeding step it makes a large difference whether the superior performance is predicted by identical loci in both genomes or by different genomic regions. In the latter case, improvement of the trait average may be expected in the next breeding cycles, by combining the favorable alleles, whereas in the first case it may not. In phenotypic as well as genomic selection, the performance of the parents does not adequately predict their mutual combining ability and the expected success of the genotypes resulting from the crosses to be made in the next generation or generations.
- A first attempt to predict the combining ability and expected success of different parent pairs is described in WO2012075125. Briefly, the document suggests calculating breeding values of different parent pairs by taking the mean breeding value of simulated offspring genotypes of each parent pair. However, the present inventors recognized that the full potential of a population, e.g. with respect to a certain trait, lies in the combination of multiple parents (i.e. more than a single parent pair), because complete genome-wide complementarity is unlikely to occur within a single parent pair. Indeed, in several experiments the present inventors discovered a strongly enhanced efficiency of a genomic selection strategy that is based on the selection of a subset of at least 3 parents as compared to strategies that are based on the selection of single parent pairs (regular genomic selection) and crossing of said pairs.
- Technically, the present inventors achieved this by focusing on Genome-wide Estimated Breeding Values from the genotypes of the combination of at least 3 individuals of the parent generation, rather than by focusing on the genotype of individual pairs. Each subset of at least 3 marker genotypes can be considered as a library of haplotypes, from which multiple combinations have a predictable likelihood to produce genotypes with the predicted highest achievable phenotypic value. In some embodiments, this will be reached by recombination of the existing haplotypes within a genotype prior to transmission to the offspring in the next breeding cycle or plurality of breeding cycles. Of course, genotypes can only be recombined by crossing of two parents or self-fertilization. However, as will be clear to the skilled person, mixing of subsets of at least 3 genomes can be achieved via several parallel and/or subsequent crosses, which can be performed after the selection method according to the present disclosure.
- The present disclosure thus extends the use of the GS model as developed in the training phase (in which no changes are proposed) by improving the efficiency in the selection phase. This procedure enables the selection of groups of at least 3 individuals that have the highest probability to produce the best performing offspring in the next generations, rather than to select the best performing pairs. This approach was shown to achieve unexpectedly better breeding results in simulation models.
- As an example of the principle underlying the present disclosure, Figure 1 shows a graphical representation of the selection process in a breeding population consisting of 3 diploid individuals represented by their genotypes G1-3. The individuals have been genotyped for 5 loci L1-5 and the phenotype for an individual with a particular genotype can be predicted using a mathematical genome-wide prediction model that assigns positive or negative effects to each allele occurring on the loci. The concept of the current disclosure involves the construction of the putative future genotype that predicts the highest phenotypic performance from haplotypes (H1.1-H3.2) occurring in the current population, or recombinants of those. In the example of Figure 1, the best obtainable genotype that can be obtained with a single cross is combining haplotypes H1.1 and H3.1 (indicated bold), which complement each other in locus L5 versus the others. According to the present disclosure, and extrapolating from Figure 1, it will be clear that in larger populations, the putative genotype can be constructed analogously from haplotypes of more than two individuals.
- In case recombination between loci with another haplotype would lead to an increase of positively contributing alleles (as in haplotype H1.1), the breeding value is calculated from the recombinant haplotype, multiplied with the probability that this recombination occurs between the loci (example P[θ1→2], with θ1→2 defining the estimated frequency of recombination between locus 1 and 2). On the other hand, if recombination would lead to a decrease of positively contributing alleles, the predicted phenotype value of the haplotype is multiplied with the probability of no recombination occurring (example P[not θ3→4] = 1-P[θ3→4]).
- Another example is shown in Table 1 which shows a selection strategy from a population using Combined Genomic Estimated Breeding Values. From a very small population of 5 diploid homozygous individuals the genotypes have been established (represented by 12 loci; top side of the table). If each allele marked "A" is positively contributing to a desired trait value, according to a genomic prediction model and "B" indicates all other alleles, the best achievable genotype contains the highest fraction of "A" alleles.
- The genomic estimated breeding values (GEBVs) of each individual are shown in the boxed cells on the diagonal of the lower part of the table, the off-diagonal cells contain CGEBVs.
- Following the prior art selection procedure using GEBVs, the individuals with the highest values would have been selected as parents for the next generation, which are in the example of Table 1
individuals individuals
- The prospect of finding the best allele gathering in the described population becomes even better when combinations of more than two parents are compared. When all combinations of three parents are compared, it becomes apparent that the combinations (1, 2 and 3) and (1, 2 and 5) are the superior triplets with a CGEBV of 0.92 (not shown in the table), while the triplet of superior per se GEBV parents (1, 2 and 4) has a CGEBV of only 0.83. The superior triplets cannot directly be inferred from the superior CGEBVs as calculated for the two-parent combinations, which are (1,2), (1,3) and (1,5). In the example of Table 1, this becomes clear because the combination (1, 3 and 5) has a lower CGEBV (0.83) than the combinations (1, 2 and 5) and (1, 2 and 3), which can only be found if the three-parent combinations are compared.
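- The comparison of parent combinations in the style of Table 1 can be sketched as below. The sketch assumes homozygous individuals and takes the CGEBV of a combination to be the fraction of loci at which at least one member carries the favorable "A" allele. The genotypes are invented for illustration and are not those of Table 1, so the resulting values differ from the ones quoted above.

```python
from itertools import combinations

# Hypothetical homozygous genotypes over 12 loci ("A" = favorable allele, "B" = any other allele).
genotypes = {
    1: "AAAAAABBBBBB",
    2: "AAAAABABBBBB",
    3: "BBBBBBBAAAAB",
    4: "AAAAABBBBBBB",
    5: "BBBBBBBBBBAA",
}


def cgebv(genotypes, members):
    """Fraction of loci where at least one member of the combination carries 'A'."""
    n_loci = len(next(iter(genotypes.values())))
    covered = sum(
        any(genotypes[m][j] == "A" for m in members) for j in range(n_loci)
    )
    return covered / n_loci


def best_combinations(genotypes, size):
    """Score every combination of the given size and return all top-scoring ones."""
    scored = {c: cgebv(genotypes, c) for c in combinations(sorted(genotypes), size)}
    top = max(scored.values())
    return [c for c, v in scored.items() if v == top], top


best_triplets, best_value = best_combinations(genotypes, 3)
per_se_top3 = tuple(sorted(sorted(genotypes, key=lambda i: cgebv(genotypes, (i,)), reverse=True)[:3]))
print(best_triplets, best_value, per_se_top3, cgebv(genotypes, per_se_top3))
```

- With a single member the function returns the per-se value of that individual, so the same routine covers both the diagonal and the off-diagonal cells of such a table.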
- A further example is shown in Table 2 which shows a selection strategy as in Table 1, but now applied to a heterozygous population and taking recombination events into account. In the example of Table 2, according to GEBV values, individual 1 and 2 are the top ranking parents (0.33 and 0.21, respectively), but the CGEBV ranking for the different parent pairs shows that actually the combinations (1, 3), (1, 4) and (1, 5) are able to produce the best offspring genotypes. The genotypes of
individual individual 3 includes all favorable alleles in linkage phase and requires no recombination event to pass on all favorable alleles to the offspring. In the genotypes of individuals individual 4, the first of the required recombinations is between loci individual 5 this is between loci loci individual 4. By correcting the initial CGEBV value in the table with the probability of the recombination events occurring, the CGEBV ranking of individual 1 with the others will become 3, 4, 5, 2. As a result, the combination of individual
- In the following description and examples, a number of terms are used. In order to provide a clear and consistent understanding of the description and claims, including the scope to be given such terms, the following definitions are provided for the terms as used in the description and claims. Unless otherwise defined herein, all technical and scientific terms used have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
- Selection and selection criteria: process, model, system or algorithm in order to choose individuals in a population that will contribute genetic material to the next generation. In particular, such a process, model, system or algorithm can be based both on natural or artificial phenomena or procedural steps. Selection criteria can be based on phenotypic or genomic characteristics, for instance, but not limited to, the presence, or degree of presence, of genes, gene expression, genetic markers, combinations of genes, quantitative trait loci, traits or combinations of traits.
- Breeding value: the genetic merit of a unit of inheritance such as an individual in a breeding program. This genetic merit is determined by the contribution to at least one phenotypic trait of interest of an individual's gene or genes or (genetic) loci in a breeding program aimed at improving the at least one phenotypic trait of interest.
- Estimated breeding value: an approximation of an individual's breeding value, in particular based on the estimated difference between the average performance of that individual's offspring and the average performance of all offspring in a randomly mating population. The estimated average performance of all offspring in a randomly mating population may take into account that individuals with inter-familial relationships, i.e. pedigree relations, normally do not mate.
- Genome-wide estimated breeding value: estimated breeding value based on genome-wide information, i.e. information derived from different or remote (genetic) loci of the genome such as loci of different chromosomes. In particular, genome-wide estimated breeding values are an approximation of an individual's genome-wide genetic merit, determined by the contribution to at least one phenotypic trait of interest of an individual's genome-wide genes or genome-wide (genetic) loci, or genome-wide haplotypes or genome-wide molecular marker scores in a breeding program aimed at improving the at least one phenotypic trait of interest.
- Combined Genome-wide Estimated Breeding Value (CGEBV): Genome-wide Estimated Breeding Value of a combination of three or more individuals within a population. The members of the combination with the highest CGEBV (as compared to the other combinations) together have the highest estimated probability to produce the best performing offspring in subsequent generations in a breeding program aimed at improving at least one phenotypic trait of interest. So, the CGEBV actually accounts for the genome-wide estimated breeding values of the genotypes of the putative offspring, and does not solely consider the genotype of each individual potential parent separately. In particular, the potential parents may not be the best performing individuals per se, or the potential parents with the best genome-wide estimated breeding value.
- Directed genome-wide selection: selection method based on focusing on a combination of individuals in a population that together have the highest probability to produce the best performing offspring in the next generations in a breeding program aimed at one or more selection criteria. With directed genome-wide selection the focus is on genome-wide estimated breeding values of the genotypes of the putative offspring (combined genome-wide estimated breeding value), rather than by focusing on the genotype of each individual parent itself. In particular, this selection method is not based on selecting the best performing individuals per se.
- Regular genome-wide selection: selection method based on crossing parents with the best genome-wide estimated breeding values per se.
- Offspring: as used herein, the term "offspring", refers to the first or further generation obtained by intercrossing.
- Phenotype: the composite of an individual's characteristics or traits, particularly observable characteristics or traits, such as, but not limited to morphological, physical, biochemical, developmental or behavioral characteristics or traits. The phenotype of an individual can be formed by the expression of genes as well as environmental factors, and on interactions between gene expression and environmental factors.
- Phenotypic trait of interest: a heritable characteristic of a plant or animal species which may be quantified in a certain unit of measure. Examples of quantitative phenotypic traits of interest are (but are not limited to) for plants: fruit size, fruit count, yield in kg per ha, plant height, relative growth speed, flowering time, germination rate, leaf area, disease resistances, yield components, biochemical composition, and for animals: milk yield, milk protein content, carcass weight, fodder conversion, body fat composition, litter size, coat color, resistances to diseases. It can be desired that a quantitative phenotypic trait of interest is increased or decreased, and the respective shift of the average value for the characteristic in the population can improve the economic value of that population, variety or offspring relative to the parent generation(s).
- Genotype: as used herein, the term "genotype" refers to the genetic makeup of a cell, an organism, or an individual (i.e. the specific allele makeup of the individual) usually with reference to a specific character or phenotypic trait of interest under consideration. However, not all organisms with the same genotype necessarily look or act the same way because appearance and behavior are modified by environmental and developmental conditions. Likewise, not all organisms that look alike necessarily have the same genotype.
- Genotyping: as used herein, the term "genotyping" or "determining the genotype" refers to the process of determining genetic variations among individuals in a species. Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation used for genotyping and by definition are single-base differences at a specific locus that are found in more than 1% of the population. SNPs are found in both coding and non-coding regions of the genome and can be associated with a phenotypic trait of interest such as a quantitative phenotypic trait of interest. Hence, SNPs can be used as markers for quantitative phenotypic traits of interest. Another common type of genetic variation used for genotyping is "InDels", i.e. insertions and deletions of nucleotides of varying length. For both SNP and InDel genotyping, many methods exist to determine genotype among individuals. The chosen method generally depends on the throughput needed, which is a function of both the number of individuals being genotyped and the number of genotypes being tested for each individual. The chosen method also depends on the amount of sample material available from each individual or sample. For example, sequencing may be used for determining presence or absence of markers such as SNPs, e.g. Sanger sequencing and High Throughput Sequencing technologies (HTS). Sanger sequencing may involve sequencing via detection through (capillary) electrophoresis, in which up to 384 capillaries may be sequence analysed in one run. High throughput sequencing involves the parallel sequencing of thousands or millions or more sequences at once. HTS can be defined as Next Generation sequencing, i.e. techniques based on solid phase pyrosequencing, or as Next-Next Generation sequencing based on single molecule real time sequencing (SMRT). HTS technologies are available such as offered by Roche, Illumina and Applied Biosystems (Life Technologies). Further high throughput sequencing technologies are described by and/or available from Helicos, Pacific Biosciences, Complete Genomics, Ion Torrent Systems, Oxford Nanopore Technologies, Nabsys, ZS Genetics, GnuBio. Each of these sequencing technologies has its own way of preparing samples prior to the actual sequencing step. These steps may be included in the high throughput sequencing method. In certain cases, steps that are particular for the sequencing step may be integrated in the sample preparation protocol prior to the actual sequencing step for reasons of efficiency or economy. For instance, adapters that are ligated to fragments may contain sections that can be used in subsequent sequencing steps (so-called sequencing adapters). Primers that are used to amplify a subset of fragments prior to sequencing may contain parts within their sequence that introduce sections that can later be used in the sequencing step, for instance by introducing through an amplification step a sequencing adapter or a capturing moiety in an amplicon that can be used in a subsequent sequencing step. Depending also on the sequencing technology used, amplification steps may be omitted.
- Genotype/phenotype relationship model: a model that can associate (correlate) genotype with phenotype for individuals in a population. To create such model it is typically required to phenotype individuals of a population and genotype the same individuals. In particular, genotyping can be based on high-density marker data, such as data on the presence or absence of a SNP at a plurality of loci. Likewise, phenotyping can be performed at high accuracy, for example by measuring the value for the quantitative phenotypic trait of interest per individual. The genotype/phenotype relationship model can then be created by calculating correlations between the genotypic data and the phenotypic data. For example, with a dense marker map, such as SNP map, some markers can be correlated with positive or negative effects on a particular quantitative phenotypic trait of interest. In this way, the model can attribute a contribution to the quantitative phenotypic trait of interest to the presence or absence of a marker. Said contribution may for example be expressed in kg, m, L, depending on the unit of measure as used for the quantitative phenotypic trait of interest (for example fruit size, milk production, etc.). Various methods are available in the art in order to construct such a model (Meuwissen et al., 2001).
- Locus: as used herein, the term "locus" or "loci" (plural) refers to a specific site (place) or sites on the genome. For example, the "locus" refers to the site in the genome where the two alleles of the locus are found (for diploid organisms). Quantitative trait loci (QTLs) are sites on the genome containing alleles that are associated to a quantitative trait (based on the genotype/phenotype relationship model).
- Allele: the term "allele" refers to the nucleotide sequence variant that is present on one of the positions of a particular locus. A diploid individual has two positions for one allele per locus, one position on either one of the two homologous chromosomes. For each of the positions of a particular locus, one or more alternative nucleotide sequence variants may exist in a population, i.e. for each position different possible alleles may exist in a population. However, each individual can have only one of the possible alleles on each one of the positions of a locus. The alternative nucleotide sequence variants, i.e. the different possible alleles, differ at least slightly in nucleotide sequence, and typically can be distinguished based on the presence or absence of at least one SNP or InDel. When referred herein to an "allelic state", reference is made to the presence or absence of an allele at a position within a particular locus, which can be expressed as the presence or absence of the respective marker (e.g. SNP or indel) at the particular locus.
- Allele dose of a locus: the number of copies present in a genome of a given allele on a given locus. The allele dose ranges from 0 (no copies present) to the (auto)ploidy level of the genome; i.e. for diploid species, the allele dose for a given allele can be either 0, 1 or 2. For polyploid genomes the maximum allele dose corresponds to the number of homologous chromosome copies.
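- As a small illustration of this definition, the allele dose can be computed by counting copies of an allele among the homologous chromosome copies at one locus. The representation of a locus as a tuple of allele labels, and the function name, are assumptions made for this sketch.

```python
def allele_dose(alleles_at_locus, allele):
    """Count the copies of `allele` among the homologous copies at one locus.

    The result lies between 0 and the ploidy level, i.e. len(alleles_at_locus).
    """
    return sum(1 for a in alleles_at_locus if a == allele)


diploid_locus = ("A", "B")               # heterozygous diploid
tetraploid_locus = ("A", "A", "A", "B")  # autotetraploid with three copies of "A"

print(allele_dose(diploid_locus, "A"), allele_dose(tetraploid_locus, "A"))
```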
- Attributed Allele substitution effect: this term refers to the estimated quantitative effect on the trait when on a given locus the one allele (e.g. as measured by presence of a particular SNP) is substituted by the other allele (e.g. as measured by absence of the particular SNP) within a given genetic and/or environmental background. For example, if fruit yield is the quantitative phenotypic trait of interest in a population of plants, the quantitative effect on that trait may be expressed in kg. Based on the genotype/phenotype relationship model, a particular allele on a given locus (e.g. as measured by presence of a particular SNP) can thus be attributed an allele substitution effect of e.g. 0.0001 kg, which means that if the particular allele is replaced by the other possible allele (e.g. as measured by absence of the particular SNP), the quantitative effect on the trait, i.e. fruit yield is estimated to be 0.0001 kg.
- Attributed Allele substitution effect corrected for recombination probability: Attributed allele substitution effects can be corrected for recombination probabilities. The further away two loci are from each other, the more likely it is that recombination (crossing over) takes place between the two loci. The distance between loci is measured in terms of recombination probability and is given in cM (centiMorgans; 1 cM is a meiotic recombination probability between two markers of 1%). This is relevant because for both positively and negatively contributing alleles, one would like to know the chance that they are transmitted to offspring. A positive attributed allele substitution effect can be corrected for recombination probability by taking into account the probability that (after crossing with another individual) the allele is transmitted to the genome of offspring. A negative attributed allele substitution effect can be corrected for recombination probability by taking into account the probability that (after crossing with another individual) the allele is not transmitted to the genome of offspring.
- Heterozygous and homozygous: as used herein, the term "heterozygous" refers to a genetic condition existing when two different alleles reside at a specific locus, for example a locus having alleles A/B, wherein A and B are positioned individually on either one of the two homologous chromosomes. Conversely, as used herein, the term "homozygous" refers to a genetic condition existing when two identical alleles reside at a specific locus, for example a locus having alleles A/A, positioned individually on either one of the two homologous chromosomes.
- Molecular marker technique: as used herein, the term "molecular marker technique" refers to a (DNA based) assay that indicates (directly or indirectly) the presence or absence of a marker allele of interest in an individual (e.g. (crop) plant or cattle). Preferably, it allows one to determine, e.g. by sequencing, whether a particular allele is present or absent at one of the positions at the locus in any individual.
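- As a small illustration of the allele dose definition above, the following sketch counts the copies of a given allele at one locus. The genotype encoding (a tuple of allele labels, one per homologous chromosome copy) and the function name `allele_dose` are assumptions made for this example only.

```python
def allele_dose(genotype, allele):
    """Count the copies of `allele` among the homologous positions of one locus.

    `genotype` is a hypothetical encoding: one allele label per chromosome copy.
    """
    return sum(1 for a in genotype if a == allele)


# Diploid locus A/B and an autotetraploid locus A/A/A/B (illustrative values).
diploid = ("A", "B")
tetraploid = ("A", "A", "A", "B")
print(allele_dose(diploid, "A"), allele_dose(tetraploid, "A"))
```

The dose can never exceed the length of the tuple, which mirrors the statement that the maximum allele dose equals the number of homologous chromosome copies.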
- The present disclosure relates to a method for identifying combination(s) of at least three individuals within a breeding population, wherein the combinations have, for at least one phenotypic trait of interest, a higher Combined Genome-Wide Estimated Breeding Value in the offspring for said at least one phenotypic trait of interest, as compared to at least 70% of the other combinations of at least three individuals within said breeding population. The present method comprises the following steps:
- a) providing a training population of individuals;
- b) collecting phenotypic data for the at least one trait of interest for each individual within said training population;
- c) collecting genotypic data for each individual within said training population using molecular marker techniques, sequence-based genotyping or whole genome sequencing, and attributing to each allele of a plurality of loci of each individual, an allele substitution effect for the at least one phenotypic trait of interest;
- d) providing a genotype/phenotype relationship model for said training population of individuals, wherein the model estimates for a given genotype of an individual what the quantitative contribution is of the allele substitution effects of said plurality of loci on the at least one phenotypic trait of interest;
- e) genotyping each individual within a breeding population, preferably (by collecting genotypic data for each individual within said breeding population) in the same way as in step c);
- f) calculating for each individual within the breeding population the allele substitution effect for each allele of said plurality of loci by using the genotype/phenotype relationship model of step d), and correcting for recombination probabilities with flanking loci, wherein for an allele with a positive allele substitution effect said effect is multiplied with the probability that said allele is transmitted to the offspring, and for an allele with a negative allele substitution effect said effect is multiplied with the probability that said allele is not transmitted to the offspring;
- g) determining the Combined Genome-Wide Estimated Breeding Value in the offspring for the at least one phenotypic trait of interest for each combination of at least three individuals within the breeding population by calculating for each combination of at least three individuals for each locus of said plurality of loci in the offspring the highest combination of allele substitution effects using the calculated and corrected allele substitution effects of the individuals calculated in step f);
- h) identifying the combinations of at least three individuals within the breeding population that provide for said at least one phenotypic trait of interest Combined Genome-Wide Estimated Breeding Values in the offspring that are higher than at least 70% of the Combined Genome-Wide Estimated Breeding Values in the offspring of other combinations of individuals within the breeding population.
- While previous methods focus on identifying the (pairs of) individuals within the breeding population that on their own have the best genome-wide estimated breeding values, the present method allows to identify which subset(s) of at least three parents together have the best combined genome-wide estimated breeding value. In this way, one could say that the present method actually assesses the genome-wide estimated breeding value of the putative offspring of different combinations of at least three potential parents.
- In other words, the method allows to identify (and/or subsequently select) at least one (e.g. at least 2, at least 3, at least 5, or at least 10) combination of at least three (e.g. at least 5, at least 10) individuals in a (breeding) population that together have the highest probability to produce the best performing offspring in the next generation(s) in a breeding program aimed at improving at least one (or at least 2, 3, 4) (quantitative) phenotypic trait of interest. Where reference is made to a CGEBV in the offspring of a combination, the CGEBV of the respective combination of individuals is meant which reflects the breeding value of their putative offspring.
- For example, in a population of four individuals a, b, c, d, three of the possible combinations are: (a, b, and c); (a, c, and d); (b, c and d). If in this example the combination (a, b, and c) has a CGEBV of 10, the combination (a, c, and d) has a CGEBV of 20, and the combination (b, c and d) has a CGEBV of 30, the method can identify the combination (b, c and d) as the best combination of three individuals (subset). However, the method also allows to identify and/or select more than one combination for use in a subsequent breeding program, e.g. in this example the method can identify the combination (b, c and d) as well as the combination (a, c, and d), because their CGEBVs are both higher than that of the combination (a, b, and c). Exactly the same principle can be applied to extract subgroups of more than three parents out of larger panels, by calculating CGEBVs for all triplets, quartets etc. and ranking these.
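- The ranking in this example can be sketched as follows. The CGEBV values are those stated in the example; the dictionary layout and the function name `rank_combinations` are assumptions made for illustration.

```python
# CGEBV values taken from the example; combinations are stored as tuples.
cgebv = {
    ("a", "b", "c"): 10,
    ("a", "c", "d"): 20,
    ("b", "c", "d"): 30,
}


def rank_combinations(values):
    """Return (combination, CGEBV) pairs sorted from highest to lowest CGEBV."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


ranking = rank_combinations(cgebv)
best = ranking[0][0]                           # single best subset
top_two = [combo for combo, _ in ranking[:2]]  # more than one subset may be kept
print(best, top_two)
```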
- In step a) of the method, a (training) population of individuals is provided. This population can optionally be called a training population, because it serves for the establishment of a genotype/phenotype relationship model. Such model allows to attribute an allele substitution effect on the at least one phenotypic trait of interest to each of the alleles of a plurality of loci of individuals of a breeding population. Therefore, preferably, the training population and the breeding population relate to the same plant or non-human animal species, and more preferably the training population is the same as the breeding population or most preferably a selection of individuals therefrom. It is also possible that the training population is a specifically designed population, which means that the population is specifically compiled for the purpose of generating a phenotype/genotype relationship model.
- In the present disclosure, the term "individual" refers to living subjects, and in particular to (crop) plants or non-human animals such as cattle. Preferably the training population comprises at least 3, at least 10, or at least 50 individuals, but (in particular if the individuals are plants) the training population may also comprise at least 100, or at least 500 individuals.
- In step b) of the method, phenotypic data is collected for the at least one phenotypic trait of interest for each individual within said population. For example, if the trait concerns the quantity of milk production (cattle), or the size of the flowers (plants), one can measure, for each individual of the training population, the quantity of milk production (in L) or the size of the flowers (diameter in m).
- Then step c) of the method collects genotypic data for each individual within the training population using methods well-known to the skilled person, such as molecular marker techniques, sequence-based genotyping or whole genome sequencing. As explained earlier herein, genotyping, or determining the genotype, refers to the process of determining genetic variations among individuals in the population. For this, the skilled person has various molecular biology techniques at his disposal, such as hybridisation analysis, PCR and preferably sequencing, to examine DNA molecules of the individuals and unravel sequence variations between said individuals.
- The molecular marker technique(s) used in the present method are preferably selected from the group consisting of the detection of SNPs, the detection of RFLPs, the detection of SSR polymorphisms, RAPDs, the detection of indels or CNVs, and AFLP.
- Molecular biology techniques are well-known to the skilled person and for example described in standard handbooks such as Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, NY; and in Volumes
- Step c) then continues with attributing to each allele of a plurality of loci of each individual, an allele substitution effect for the at least one phenotypic trait of interest. Said attributing is typically based on the identification of correlations between the phenotypic data and the genotypic data. In this respect, the term "allele substitution effect", as also explained elsewhere herein, refers to an estimated quantitative effect on a certain phenotypic trait when on a given locus the one allele (e.g. as measured by presence/absence of a particular SNP) is substituted by the respective allele (e.g. as measured by presence/absence of the particular SNP) within a given genetic and/or environmental background.
- For example, the allele substitution effect of a certain allele with SNP versus an allele without the SNP on the same position can be based on comparing phenotypes of individuals having only the allele with SNP with phenotypes of individuals having only the allele without SNP. Such comparing may identify correlations between the genotype and the phenotype.
- After steps a), b), and c), step d) can provide a genotype/phenotype relationship model for the training population of individuals, wherein the model allows to estimate (and/or attribute) for a given genotype of an individual within a breeding population, what the quantitative contribution is of the allele substitution effects of the plurality of loci on the at least one phenotypic trait of interest. In other words, the model can attribute to each allele of the plurality of loci, an allele substitution effect based on the correlations found while or after producing the genotype/phenotype relationship model.
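- One way such a model could be sketched is as a penalised linear regression of phenotype on allele dose. Ridge regression is used here purely as a simple stand-in and is an assumption of this example; the text does not prescribe a particular estimator. The function name `estimate_effects`, the `ridge` parameter and the toy data are hypothetical.

```python
import numpy as np


def estimate_effects(doses, phenotypes, ridge=1.0):
    """Estimate one allele substitution effect per locus by ridge regression.

    doses:      (individuals x loci) allele doses, e.g. 0, 1 or 2 for diploids
    phenotypes: one measured trait value per individual
    ridge:      penalty strength (hypothetical tuning parameter)
    """
    X = np.asarray(doses, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    Xc = X - X.mean(axis=0)          # centre so that no intercept is needed
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + ridge * np.eye(p), Xc.T @ yc)


# Toy training population: 6 individuals, 2 loci, noise-free phenotypes.
doses = np.array([[0, 1], [1, 0], [2, 1], [1, 2], [0, 0], [2, 2]])
true_effects = np.array([0.5, -0.2])          # illustrative values only
phenotypes = 10.0 + doses @ true_effects
effects = estimate_effects(doses, phenotypes, ridge=1e-9)
print(effects)
```

With many more loci than individuals, as is typical for dense marker data, a larger penalty would be needed to keep the estimates stable.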
- If the present method uses a preexisting (or prior prepared) genotype/phenotype relationship model for the individuals of the breeding population, steps a), b), c), and d) are not required (and thus optional).
- In step e), each individual within a breeding population is genotyped, for which molecular marker techniques well-known to the skilled person can be used. The molecular marker technique(s) used in the present method are preferably selected from the group consisting of the detection of SNPs, the detection of RFLPs (differing locations of restriction enzyme sites), the detection of SSR (Simple Sequence Repeat) polymorphisms, RAPDs (Random Amplification of Polymorphic DNA), the detection of indels or CNVs (Copy Number Variations), and AFLP (Amplified Fragment Length Polymorphism).
- In step f) of the present method, for each individual within the breeding population, for each allele of a (the) plurality of loci, the allele substitution effect is calculated (attributed) using the genotype/phenotype relationship model of step d). In this respect, the breeding population refers to the population of individuals which can be further intercrossed with the aim of improving the at least one phenotypic trait in subsequent generation(s). So, within step e) (or prior to step e)) the providing of a breeding population is envisaged. It will be clear that the attribution of allele substitution effects to the individuals within the breeding population can be based on the results of prior genotyping of said individuals.
- In the context of the present method, the plurality of loci may refer to as few as two, five, or twenty loci which may be located on separate regions of the genome such as different chromosomes, but the term plurality may also refer to at least 10, at least 25, at least 100, or at least 500, or at least 1000, or at least 5000 or more different loci preferably located on separate regions of the genome such as different chromosomes.
- In a preferred embodiment of the present method, the plurality of loci are (genome-wide) loci located on the entire genome, e.g. located on at least 2, at least 5, at least 10, or at least 20 chromosomes. At the same time or alternatively, the plurality of loci comprises at least 100, at least 500, at least 1000, or at least 2000 loci. Furthermore, it is preferred that at least one locus of the plurality of loci is found every 100 cM, preferably every 50 cM, more preferably every 25 cM, even more preferably every 10 cM, even more preferably every 5 cM, most preferably every 1 cM of the genome.
- Step f) also aims to correct the attributed allele substitution effects as awarded to each allele of the plurality of loci, for (estimated) recombination probabilities with flanking loci (e.g. the previous or preferably the next loci (or both) in the 5' to 3' direction). This is done by correcting each allele of the plurality of loci that is attributed a positive allele substitution effect for the (estimated) recombination probability that (after crossing with another individual) the allele is transmitted to the genome of offspring, and, correcting each allele of the plurality of loci that is attributed a negative allele substitution effect for the (estimated) recombination probability that (after crossing with another individual) the allele is not transmitted to the genome of the offspring. In this way, step f) of the method allows for correcting of the attributed allele substitution effects attributed to each allele of the plurality of loci of each individual within the population for (estimated) recombination probabilities. Correction of a positive attributed allele substitution effect is preferably (not necessarily) done by multiplying the effect with the probability that the corresponding allele is transmitted to the offspring (i.e. a gamete), and correction of a negative attributed allele substitution effect is preferably (not necessarily) done by multiplying the effect with the probability that the corresponding allele is not transmitted to the offspring (i.e. a gamete).
- In a preferred embodiment of step f) of the method, the recombination probabilities are calculated based on genetic distances between loci, or based on aligning physical and genetic maps. Further details hereon may be found in Liu (1998).
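- As an illustration of deriving a recombination probability from a genetic distance, the sketch below uses Haldane's mapping function. The choice of this particular mapping function is an assumption of the example; the text only states that genetic distances are used.

```python
import math


def haldane_recombination(distance_cm):
    """Recombination probability between two loci under Haldane's mapping function.

    distance_cm: genetic map distance in centiMorgans.
    """
    d = distance_cm / 100.0                    # convert cM to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


for cm in (1, 10, 50, 200):
    print(cm, round(haldane_recombination(cm), 4))
```

For short distances the result is close to 1% per cM, and for distant loci it approaches 0.5, i.e. independent segregation.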
- The present method then continues with step g) relating to determining the Combined Genome-wide Estimated Breeding Value (in the offspring) for the at least one phenotypic trait of interest for each combination of at least three individuals within the (breeding) population by calculating for each combination of at least three individuals for each locus of said plurality of loci (in the offspring) the highest combination of allele substitution effects using the calculated and corrected allele substitution effects of the individuals calculated in step f).
- So, the present method takes into account, for each attributed allele substitution effect, the chance that the corresponding allele actually ends up in a gamete of an individual. This is relevant because for both positively and negatively contributing alleles, one would like to know the probability that they are transmitted to a gamete. For example, one could consider the following plurality of loci consisting of three loci of individual 1:
    Individual 1    First allele    Second allele
    Locus 1         1               0
    Locus 2         1               0
    Locus 3         0               1

- Based on a genotype/phenotype relationship model, the following allele substitution effects may have been attributed to the alleles of the loci:

                    First allele    Second allele
    Locus 1         0.1             0
    Locus 2         -0.2            0
    Locus 3         0               0.15

- As can be seen, recombination between Locus 1 and Locus 2 is desired such that the first allele of Locus 1 and the second allele of Locus 2 are transmitted to a gamete, because this leads to an increase in attributed allele substitution effects in the gamete. On the other hand, recombination between the second allele of Locus 2 and the first allele of Locus 3 is not desired, because that would lead to a decrease of attributed allele substitution effects in the gamete.
- Based on methods well-known to the skilled person, such as methods disclosed in Liu (1998), the probability that recombination occurs between two loci can be calculated based on the genetic distance between the two loci. This calculation is based on the fact that the chance of recombination occurring between loci that are located proximal to each other is lower as compared to the chance of recombination occurring between loci that are located less proximal to each other. In this example, the following recombination probabilities may have been estimated:

                    Probability of recombination    Probability of no recombination
                    occurring with the              occurring with the
                    previous locus                  previous locus
    Locus 1         1                               0
    Locus 2         0.1                             0.9
    Locus 3         0.15                            0.85

- So, in this example, the positive allele substitution effect attributed to the first allele of Locus 1 now should be corrected for the probability that this allele is transmitted to a gamete, and the negative attributed allele substitution effect of the first allele of Locus 2 should be corrected for the probability that the allele is not transmitted to a gamete. Finally, the positive attributed allele substitution effect of the second allele of Locus 3 should be corrected for the probability that this allele is transmitted to a gamete:

                    Corrected attributed allele substitution effect
    Locus 1         0.1 x 1 = 0.1
    Locus 2         -0.2 x 0.1 = -0.02
    Locus 3         0.15 x 0.85 = 0.1275

- The total of corrected allele substitution effects for loci 1-3 of this individual thus is 0.1 + (-0.02) + 0.1275 = 0.2075.
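- The arithmetic of this worked example can be reproduced with a few lines. The effects and probabilities are the values given in the example; the variable names are assumptions.

```python
# Attributed allele substitution effects of the relevant alleles at loci 1-3.
effects = [0.1, -0.2, 0.15]
# Probability used for the correction at each locus (from the example).
probabilities = [1.0, 0.1, 0.85]

# Corrected effect per locus, and the total for this individual.
corrected = [e * p for e, p in zip(effects, probabilities)]
total = sum(corrected)
print([round(c, 4) for c in corrected], round(total, 4))
```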
- Then, the Combined Genome-wide Estimated Breeding Value (CGEBV) of a combination of individuals is calculated by taking, for each locus of the plurality of loci, the highest corrected locus effect S (corrected allele substitution effects of the locus) of the individuals of the combination.
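- A minimal sketch of this calculation is given below. It takes the per-locus maximum over the members of a combination, as described, and then sums over loci; the summation step, the function name `combined_gebv` and the numeric values are assumptions made for illustration.

```python
import numpy as np


def combined_gebv(S, members):
    """CGEBV of a combination: per-locus maximum over members, summed over loci.

    S:       (individuals x loci) corrected locus effects
    members: indices of the individuals forming the combination
    """
    S = np.asarray(S, dtype=float)
    return float(S[list(members)].max(axis=0).sum())


# Hypothetical corrected locus effects for 3 individuals and 3 loci.
S = np.array([
    [0.10, -0.02, 0.00],
    [0.00,  0.05, 0.02],
    [0.03,  0.00, 0.12],
])
print(combined_gebv(S, (0, 1, 2)))
```

In this toy matrix each individual contributes the best value at a different locus, so the combination of all three scores higher than any pair.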
- Then, step h) identifies the at least one combination(s) of at least three individuals within the (breeding) population that have a higher CGEBV (preferably the highest) for the at least one phenotypic trait of interest, as compared to at least 70% of the other combinations, preferably as compared to at least 80%, 90%, 95%, 99% or most preferably 100% of the other combinations of at least three individuals within the breeding population.
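- The threshold of step h) can be sketched as follows: a combination is retained when its CGEBV is higher than that of at least a given fraction of the other combinations. The function name `select_top_combinations` and the placeholder CGEBV values are hypothetical.

```python
from itertools import combinations


def select_top_combinations(cgebv, min_fraction=0.7):
    """Keep combinations whose CGEBV exceeds at least `min_fraction` of the others."""
    selected = []
    for combo, value in cgebv.items():
        others = [v for c, v in cgebv.items() if c != combo]
        beaten = sum(1 for v in others if value > v)
        if others and beaten / len(others) >= min_fraction:
            selected.append(combo)
    return selected


individuals = ["a", "b", "c", "d", "e"]
combos = list(combinations(individuals, 3))       # all 10 triplets
# Placeholder CGEBVs 0..9, one per triplet, only to exercise the selection.
cgebv = {combo: float(i) for i, combo in enumerate(combos)}
selected = select_top_combinations(cgebv, min_fraction=0.7)
print(selected)
```

Raising `min_fraction` towards 1.0 narrows the result to the single best combination.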
- In a preferred embodiment, the present method allows for a pre selection of individuals within the breeding population. This pre selection preferably takes place after the providing of the breeding population in step e) of the method. The pre selection allows to reduce the number of individuals to be considered for the calculating part of step e), and thus allows to reduce the number of combinations of individuals for which the CGEBV has to be calculated. In practice this may be worthwhile in certain situations, particularly when the number of individuals or combinations of individuals, or the number of loci to be considered demands more computational power than the user of the method can provide. In such scenario, it may be advantageous to perform a pre selection to reduce the number of (combinations of) individuals for which the CGEBV has to be calculated.
- Preferably, the pre selection of individuals within the breeding population to be combined is made by selecting (exclusively) at most 30%, more preferably at most 20%, even more preferably at most 10%, yet even more preferably at most 5%, most preferably at most 2% of the individuals with the highest sum (or higher as compared to the other, not selected individuals) of all corrected allele substitution effects attributed to the plurality of loci.
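- A sketch of this pre selection, assuming a matrix of corrected locus effects with one row per individual, is shown below. The function name, the rule of always keeping at least one individual and the toy data are assumptions.

```python
import numpy as np


def preselect_top_fraction(S, fraction=0.3):
    """Return indices of at most `fraction` of individuals with the highest effect sums."""
    S = np.asarray(S, dtype=float)
    totals = S.sum(axis=1)                      # sum of corrected effects per individual
    # Small tolerance guards against floating-point round-off in fraction * n.
    keep = max(1, int(fraction * len(totals) + 1e-9))
    order = np.argsort(-totals, kind="stable")  # highest totals first
    return sorted(order[:keep].tolist())


# Ten hypothetical individuals, two loci; totals increase with the row index.
S = np.array([[i, 0.5 * i] for i in range(10)], dtype=float)
kept = preselect_top_fraction(S, fraction=0.3)
print(kept)
```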
- So, for the optional pre selection, first, the corrected attributed allele substitution effect of each locus of each individual in the population can be calculated. In this respect, said corrected attributed allele substitution effect of a particular locus can be referred to as the "corrected locus effect" of that particular locus. The result of calculating the corrected locus effects of a (sub set of the) plurality of loci shows which individuals have higher corrected locus effects for certain subsets of the plurality of loci than the other individuals, which can form the basis of a pre selection. For example, one can identify which individuals have higher corrected locus effects for different parts of the plurality of loci, e.g. relating to parts of the plurality of loci associated with at most part of the genome, e.g. one chromosome. For example, if the plurality of loci comprises 100 loci, one can make a pre selection of e.g. five individuals, being the individual with the highest total of locus effects for loci 1-20, the individual with the highest total of locus effects for loci 21-40, the individual with the highest total of locus effects for loci 41-60, the individual with the highest total of locus effects for loci 61-80, and the individual with the highest total of locus effects for loci 81-100.
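- The block-wise pre selection in this example (100 loci, five blocks of 20 loci) can be sketched as below. The function name `preselect_by_block` and the toy effect matrix are assumptions.

```python
import numpy as np


def preselect_by_block(S, block_size):
    """For each consecutive block of loci, pick the individual with the highest total."""
    S = np.asarray(S, dtype=float)
    chosen = []
    for start in range(0, S.shape[1], block_size):
        block_totals = S[:, start:start + block_size].sum(axis=1)
        chosen.append(int(np.argmax(block_totals)))
    return chosen


# Six hypothetical individuals, 100 loci; individual k is strongest in block k.
S = np.zeros((6, 100))
for k in range(5):
    S[k, 20 * k:20 * (k + 1)] = 0.01
chosen = preselect_by_block(S, block_size=20)
print(chosen)
```

Note that the same individual may be returned for several blocks, in which case the pre selection contains fewer distinct individuals than blocks.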
- The pre selection can for example be performed by selecting individuals with a higher value for S for part of the plurality of loci, e.g. a plurality of loci of at least part (or at most part) of the genome, such as only the loci located on one chromosome, as compared to the value S as determined for the same part of the plurality of loci of the other individuals, wherein $S = PF$, wherein $P \in \mathbb{R}^{n \times p}$ and $F \in \mathbb{R}^{p \times p}$, wherein $\mathbb{R}$ is the set of real numbers (excluding imaginary numbers), $n$ is the number of parents, and $p$ is the number of loci that is considered. In other words, S represents the corrected locus effect of a particular locus, i.e. the sum of corrected attributed allele substitution effects of that locus. S can be calculated by multiplying P and F, wherein P represents the (uncorrected) locus effect of a particular locus, and F represents the estimated recombination probability between flanking loci.
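- The product S = PF can be illustrated with the numbers of the worked example given earlier for individual 1. The internal structure of F is not specified in the text; treating it as a diagonal matrix of per-locus probabilities is purely an assumption of this sketch.

```python
import numpy as np

# P: uncorrected locus effects, one row per parent (n = 1) and one column per locus (p = 3).
P = np.array([[0.1, -0.2, 0.15]])

# F: p x p matrix; ASSUMED diagonal here, holding the correction probability per locus.
F = np.diag([1.0, 0.1, 0.85])

# S: corrected locus effects, same shape as P.
S = P @ F
print(S, S.sum(axis=1))
```

Under this assumption the row sum reproduces the total of 0.2075 from the worked example.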
- In a preferred embodiment of the present method, the CGEBV is determined for at least two, at least three, at least four, or at most one, at most two or at most three phenotypic traits of interest. In another preferred embodiment of the method, the at least one phenotypic trait of interest is a quantitative trait. The trait is preferably influenced by multiple, e.g. at least 10, at least 20, at least 30, at least 40, at least 100, at least 200, at least 500, or at least 1000 genes, and/or preferably can (only) phenotypically be measured in quantitative terms (e.g. in kg, m, or L).
- The present method is preferably used prior to actual selection and/or intercrossing of the identified combination of individuals. It is therefore optional, but not preferred, that the present method comprises step i) of (enabling) intercrossing (or interbreeding) of members of the identified combination of at least three individuals, such that offspring, i.e. a next generation, is obtained, or even that the resulting offspring is intercrossed. However, the present method can be applied to more than one generation (see e.g. Figure 3), such as to at least 2, 3, or at least 5 generations, although this may not be necessary in every situation.
- As will be clear to the skilled person, the identification of the best parent subgroup finds its basis in the accurate estimation of allelic effects of chromosomal regions. Therefore, the accuracy of the applied genomic selection model ideally should be as high as possible. A way to enhance the level of accuracy of the model is to use state-of-the-art model construction methodology, to improve the quality of the phenotype data collected for the training panel, and to optimize the ratio marker density/average window of linkage disequilibrium, as well as the ratio loci contributing to the trait/number of observations in the training panel. Of course, factors like the genetic complexity and heritability of the trait of interest, the genetic diversity of the training panel, and the level of genetic relationship between the training panel and the candidate parent panel may also influence the accuracy. A person skilled in the field of the present disclosure will have no problem in appreciating that variations in these input factors will result in models with higher or lower accuracy and that model accuracy can be determined through standard cross-validation procedures (Hastie, Tibshirani and Friedman, 2001; Arlot and Celisse, 2010).
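- A minimal sketch of such a cross-validation is given below: the training panel is split into k folds, the model is fitted on k-1 folds, and accuracy is reported as the correlation between predicted and observed phenotypes of the held-out individuals. The ridge model, the function names and the simulated data are assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, ridge):
    """Fit a centred ridge regression; returns column means, phenotype mean, effects."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return x_mean, y_mean, beta


def cross_validated_accuracy(X, y, k=5, ridge=1.0, seed=0):
    """Correlation between held-out predictions and observed phenotypes (k-fold CV)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(y))
    predicted = np.empty(len(y))
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        x_mean, y_mean, beta = ridge_fit(X[train_idx], y[train_idx], ridge)
        predicted[test_idx] = y_mean + (X[test_idx] - x_mean) @ beta
    return float(np.corrcoef(predicted, y)[0, 1])


# Simulated, noise-free panel: 20 individuals, 3 loci (illustrative only).
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(20, 3)).astype(float)
y = 5.0 + X @ np.array([0.4, -0.3, 0.2])
accuracy = cross_validated_accuracy(X, y, k=5, ridge=1e-6)
print(round(accuracy, 3))
```

With real, noisy phenotypes the reported correlation would be lower, and it would vary with the input factors listed above.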
- The breeding population of individuals used in the present method can be of different ploidy nature. For example, the population of individuals can be of a diploid, allopolyploid, or autopolyploid species. The same applies to the training population.
- In the method of the present disclosure, the population of individuals preferably is a field crop, or vegetable crop, or woody fruit species, or forestry species, or plantation crop, preferably selected from the group consisting of Arabidopsis thaliana, Abyssinian mustard, alfalfa, barley, barrel clover, black mustard, buckwheat, canola, clover, common flax, common vetch, corn spurry, coffee, cotton, Egyptian clover, fodder beet, hemp, hop, Indian mustard, Jerusalem artichoke, maize, millet, mustard, lupin, oat, oilseed rape (Brassica napus), field mustard (Brassica rapa), opium poppy, Persian clover, potato, red clover, rye, safflower, sisal, soy bean, sugar beet, sunflower, tea, tobacco, triticale, wheat, white clover, white mustard, wild rice, winter vetch, artichoke, asparagus, asparagus beans, aubergine, beetroot, black radish, black bean, black salsify, broad bean, broccoli, Brussels sprouts, cabbage, cantaloupe, carrot, cauliflower, celery, chard, chicory, chili pepper, Chinese cabbage, choi sum, common bean, corn salad, courgette, cucumber, daikon, eggplant, endive, fennel, garlic, goosefoot, green bean, Indian lettuce, kale, kidney bean, kohlrabi, leek, lettuce, lentil, lima bean, maize, melon, mizuna, napa cabbage, onion, parsnip, pea, pepper, potato, pumpkin, quinoa, radicchio, radish, rapini, red cabbage, rhubarb, runner bean, rutabaga, salad rocket, Savoy cabbage, shallot, soy bean, spinach, squash, sugar cane, swede, tomatillo, tomato, turnip, watercress, watermelon, yellow turnip, almond, apple, apricot, bird cherry, butternut, cashew, cherry, chokeberry, crabapple, filbert, greengage, hawthorn, hazel, heartnut, loquat, medlar, mirabelle prune, nectarine, peach, peacherine, pear, pecan, pistachio, plum, prune, quince, rowan, walnut, acacia, alder, Allegheny chinkapin, American beech, American chestnut, American hornbeam, ash, aspen, basswood, beech, bigtooth aspen, birch, bitternut hickory, black alder, black birch, black cherry, black gum, black locust, black maple, black oak, black poplar, black walnut, black willow, butternut, cedar, chestnut, chestnut oak, Chinese chestnut, Corsican pine, cottonwood, crabapple, cucumbertree, cypress, dogwood, Douglas fir, Eastern hemlock, elm, English oak, eucalyptus, European beech, European larch, European silver fir, European white birch, fir, flowering dogwood, gum, hawthorn, hornbeam, horse chestnut, hybrid poplar, Japanese chestnut, Japanese larch, larch, lodgepole pine, maple, maritime pine, mockernut hickory, Norway spruce, oak, Oregon pine, Pacific silver fir, pedunculate oak, pignut hickory, pine, pitch pine, poplar, Scots pine, sweet chestnut, red alder, red cedar, red maple, red oak, red pine, red spruce, redwood, rowan, sassafras, Scots pine, Serbian spruce, serviceberry, shagbark hickory, silver birch, Sitka spruce, southern beech, spruce, striped maple, sugar maple, sweet birch, sweet chestnut, sycamore, tamarack, tulip tree, Western hemlock, white ash, white oak, white pine, yellow birch, banana, breadfruit, coconut, date palm, jackfruit, mango, oil palm, olive, papaya, pineapple, plantain, rubber tree and sugar palm.
- In another preferred embodiment, the population of individuals is of a species selected from the group consisting of Cattle (Bos taurus, Bos indicus), Water buffalo (Bubalus bubalis), Equine (Equus caballus), Sheep (Ovis aries), Goat (Capra hircus), Pig (Sus scrofa), Chicken (Gallus gallus), Turkey (Meleagris gallopavo), Ducks (Anas platyrhynchos, Cairina moschata), Geese (Anser anser domesticus, Anser cygnoides), Pigeons (Columba livia domestica), Rat (Rattus norvegicus), Mouse (Mus musculus), Cat (Felis catus), Dog (Canis familiaris), Rabbit (Oryctolagus cuniculus), Guinea pig (Cavia porcellus), Zebra fish (Danio rerio) and Fruit fly (Drosophila melanogaster).
- In yet another preferred embodiment, the population of individuals is of a fish species selected from the group consisting of Cyprinus carpio, Salmo salar, Oreochromis niloticus, Oncorhynchus mykiss, Ctenopharyngodon idella, Hypophthalmichthys molitrix, Gibelion catla, Hypophthalmichthys nobilis, Carassius carassius, Pangasius pangasius and Labeo rohita, or wherein the method is applied to a shrimp species selected from the group consisting of Macrobrachium rosenbergii, Litopenaeus vannamei and Penaeus monodon.
- The method of the disclosure comprises a method for identifying combinations of at least three individuals, and as such the identification process itself does not require crossing and subsequent selection of plants or animals. The present method for identifying typically is not an essentially biological process for the production of plants or animals, and does not necessarily require crossing and subsequent selection of plants or animals.
- In another aspect of the present disclosure, a computer-readable medium comprising instructions for performing the present method is provided. In a preferred embodiment, the attributing of step b), and steps c), d), e), f), g) and h) as a whole of the present method are computer-implemented steps and/or the present method is (partly) a computer-implemented method.
- The method of the present disclosure can advantageously be used particularly for improving at least one (quantitative) phenotypic trait of interest in a breeding program.
- Also foreseen is a product obtainable by the method according to the disclosure, preferably wherein the product is a plant.
Figure 1 : Graphical representation of the selection process in a breeding population consisting of 3 diploid individuals represented by their genotypes G1-3. The individuals have been genotyped for 5 loci L1-5, and the phenotype of an individual with a particular genotype can be predicted using a mathematical genome-wide prediction model that assigns positive or negative effects to each allele occurring at the loci. The concept underlying the current disclosure involves the construction of the putative future genotype that predicts the highest phenotypic performance from haplotypes (H1.1-H3.2) occurring in the current population, or recombinants of those. In this example, the best genotype obtainable with a single cross combines haplotypes H1.1 and H3.1 (indicated in bold), which complement each other in locus L5 versus the others. According to the present disclosure, and extrapolating from Figure 1, it will be clear that in larger populations the putative genotype can be constructed analogously from haplotypes of more than two individuals.
Figure 2 : An example of the filtered effects of the loci of 5 parental lines. A high value means that this part of the genome has a positive effect on the trait of interest. A combination of genotypes that yields high positive values on the entire genome is highly beneficial.
Figure 3 : Schematic representation of the crossing scheme that is used for both regGWS and dirGWS for 5 selected parents.
Figure 4 : Graphical representation of breeding potentials in a population. Current results were obtained by simulated GS models and real Arabidopsis thaliana genotype data. Visible are the potentials over five chromosomes. The thick solid line ("fullPop") plots the maximum breeding potential over an entire population of 100 individuals; the dotted line ("GWS") indicates the total potential of the selected 5 best parents; the dashed line ("dirGWS") plots the potential of the selected set of 5 best combining parents. The thin solid line at the bottom ("difference") indicates the superiority of the combined best parents over the best parents, mainly found on the chromosomes […].
Figure 5 : Test results with the described method according to the present disclosure of directed genomic selection, in comparison to "regular" genomic selection (i.e. selection of parents with best per se GEBV). Simulations were done using Arabidopsis thaliana genotype data and simulated trait effects, randomly distributed over 500-2000 loci. The horizontal axis indicates the accuracy of the genomic prediction model in each situation. The vertical axis indicates the fraction of repetitions in which the performance of the final result in the final breeding cycle is better in the directed GS method than in the normal GS procedure (0.6 means that in 60% of the cases directed genomic selection had a better result, and the 0.5 line indicates the regular GS performance level).
Figure 6 : as Figure 5, but tests performed with Maize (Zea mays) genotype data and simulated traits.
Figure 7 : as Figure 5, but tests performed with Cucumber (Cucumis sativus) genotype data.
- In validation simulation experiments, directed genome wide selection (dirGWS) was compared with regular genome wide selection (regGWS). For both methods the realized progress through simulated breeding and selection was determined and compared. Both methods use the same starting material for the simulations. The model plant species used in the simulations contained 5 chromosomes with a length of 1 Morgan each. A random (parental) starting population of N=50-1000 parents was generated. The genomic scores, i.e. the presence of specific alleles at SNP marker positions, present in the population were sampled from various plant datasets that were retrieved from the public domain. In this way realistic values for allele frequency (the proportion of all copies of a gene that is made up of the allele) and inter-marker correlation were applied. Two-state allelic coding (-1/1) was applied to indicate the allelic status per allele at each locus (data derived from the dataset), where -1 means the absence of the marker allele and 1 means the presence of the marker allele. Each allele was attributed an allele substitution effect, i.e. the contribution to the phenotype of interest; the size of this effect was randomly drawn from a truncated normal distribution with mean 0 and standard deviation 1, for which all negative values were discarded. In this way each locus may contribute both positively and negatively to the trait depending on its allelic state, while the size of the contribution is determined by the effect size.
- Next, a multi-generation breeding effort was simulated in which selected parents were intercrossed, and selection was applied on the resulting progeny in order to advance in phenotype. In our simulations phenotype was not observed directly but was implicitly determined through summation of allelic states multiplied with allele substitution effects. In order to combine, through crossing and selection, several favorable genomic regions from different sources, our simulated breeding schemes involved up to 4 cycles of crossing and selection and up to 5 different parental genotypes from the starting population. All crossing steps were simulated using commonly used methodology, abiding by Mendelian genetic rules. The approach taken to select the optimal parents for breeding from the starting population in both methods is elaborated in the next paragraphs.
- regGWS: for each of the parental lines a per se performance was determined based on the accumulated effects of each of the individual loci, by multiplying the allele substitution effect (drawn from a truncated normal distribution) with the allelic state (-1 or 1) at each locus for all loci present in the genome. The lines with the highest predicted genomic performance were selected and intercrossed. In simple bi-parental simulation only two parents were involved. In more advanced simulations several breeding cycles (generations) were simulated, in which in each additional cycle an additional parent was crossed with selected progeny (see below) obtained from the previous cycle. The order of the selected parents for use in next cycles followed the predicted performance ranking, i.e. the third best parent entered the breeding cycle as the third parent.
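The regGWS ranking described above can be illustrated with a minimal Python sketch. It assumes NumPy, and it uses randomly generated -1/1 genotypes purely as a stand-in for the public datasets that were sampled in the simulations; the function names `sample_effects`, `per_se_performance` and `select_best_parents` are hypothetical and do not come from the disclosure.

```python
import numpy as np

def sample_effects(n_loci, rng):
    """Draw effect sizes from N(0, 1) and discard all negative draws."""
    effects = np.empty(0)
    while effects.size < n_loci:
        draws = rng.standard_normal(2 * n_loci)
        effects = np.concatenate([effects, draws[draws >= 0]])
    return effects[:n_loci]

def per_se_performance(genotypes, effects):
    """Per line: sum over loci of allelic state (-1/1) times effect size."""
    return genotypes @ effects

def select_best_parents(genotypes, effects, k):
    """Indices of the k lines with the highest per se performance, best first."""
    scores = per_se_performance(genotypes, effects)
    return np.argsort(scores)[::-1][:k]

# Assumption: random genotypes replace the real marker data used in the source.
rng = np.random.default_rng(0)
genotypes = rng.choice([-1, 1], size=(50, 250))  # 50 lines, 250 loci
effects = sample_effects(250, rng)
best = select_best_parents(genotypes, effects, k=5)
print(best)
```

The order of `best` corresponds to the ranking rule in the text: the third entry would enter the breeding scheme as the third parent.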
- dirGWS: Parents were not selected based on their per se performance but rather on the potential performance of their combined genomes. This potential was evaluated for all combined sets of parents (of size k). The selection of the set with the highest combined expected performance, however, is not straightforward because cross-over frequencies and allele substitution effects should be taken into account. The selection of the best combination of parents was done as follows. From the genetic map the cross-over frequency between two alleles can be calculated. Because two neighboring loci are in linkage disequilibrium, it is likely that when one locus is passed on to the next generation, the linked loci will also be transmitted. Because two chromosomes, or linkage blocks, segregate independently of each other, no linkage drag occurs between two linkage blocks. In order to take linkage drag into account a filter was designed based on Kosambi's mapping function. Note that other mapping functions such as Haldane's, or others (see Liu), can also be applied. The cross-over probability estimation between all locus pairs results in a block diagonal matrix $F \in \mathbb{R}^{p \times p}$, with $p$ being the number of loci. With the parental genotype data matrix $P \in \mathbb{R}^{n \times p}$, with $n$ being the number of parental lines, the corrected data matrix can be calculated by:
$$S = PF$$
- In other words, the genome-wide estimated breeding value of a parent is calculated by the total of the values for S, i.e. the corrected locus effect, of each locus. The corrected data matrix, S, can then be used as a basis for (pre)selection. In Figure 2 an example is given for five parental lines containing 250 loci located on 5 chromosomes. From this figure it can be seen that parent 4 exceeds all other parents on the first two chromosomes (up to locus 100) but underperforms on the other chromosomes. The relative performance of an individual was calculated in this way. The combination of parent […] and parent 1 together gives the maximum S on all chromosomes and is most likely to outperform other combinations of two or more parents.
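The linkage filter and the corrected matrix S = PF described above can be sketched as follows. This is a sketch under stated assumptions, not the filter used in the disclosure: the Kosambi mapping function r = 0.5·tanh(2d) is standard, but the choice of 1 − 2r as the "amount of linkage" between two loci, the absence of any row normalization, and the convention that P holds per-locus effect contributions (allelic state times effect) are assumptions made here for illustration. The names `kosambi_recombination` and `linkage_filter` are hypothetical.

```python
import numpy as np

def kosambi_recombination(d_morgan):
    """Kosambi mapping function: map distance (Morgan) -> recombination fraction."""
    return 0.5 * np.tanh(2.0 * np.asarray(d_morgan, dtype=float))

def linkage_filter(positions, chromosomes):
    """Build F (p x p).

    Assumption: the linkage weight between two loci on the same chromosome
    is 1 - 2r; loci on different chromosomes get weight 0, so F is block
    diagonal when loci are ordered by chromosome.
    """
    positions = np.asarray(positions, dtype=float)
    chromosomes = np.asarray(chromosomes)
    dist = np.abs(positions[:, None] - positions[None, :])
    weights = 1.0 - 2.0 * kosambi_recombination(dist)
    same_chrom = chromosomes[:, None] == chromosomes[None, :]
    return np.where(same_chrom, weights, 0.0)

# Five loci on two chromosomes, positions in Morgan (illustrative values).
positions = [0.0, 0.1, 0.5, 0.0, 0.2]
chromosomes = [1, 1, 1, 2, 2]
F = linkage_filter(positions, chromosomes)

# Assumption: rows of P are parental lines, entries are state * effect.
P = np.array([[ 1.0, -0.5,  0.2,  0.3, -0.1],
              [-1.0,  0.5, -0.2, -0.3,  0.1]])
S = P @ F  # filtered (corrected) locus values, one row per parental line
print(np.round(S, 3))
```

Each entry of S is then a locus value smoothed over its linked neighbors, which is what makes a per-locus comparison between parents meaningful when only a few cross-overs occur per chromosome.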
- Once the corrected population matrix S is known, the best combination of parental lines can be chosen. The potential value of a parental line $\vartheta_i$ is taken to be the total of its filtered locus values, with $s_{ij}$ the entry of S for parental line $i$ and locus $j$:
$$\sum_{j=1}^{p} s_{ij}$$
- Because there are only a limited number of cross-overs per chromosome, the regions with the highest corrected allele substitution effects can be combined, while the remaining part of the genome should not have too low values since that would affect the phenotype in a negative way. The estimated value of two parental lines ($\vartheta_1$ and $\vartheta_2$) to be crossed can therefore be determined by:
$$c = \sum_{j=1}^{p} \max(s_{1j}, s_{2j})$$
In other words, the combined genome-wide estimated breeding value (CGEBV) of two parents is calculated by taking the highest corrected locus effect S at each of the loci of the combination, over the total number of loci of the combination. The subset of 2 parental lines that has the largest potential value c (=CGEBV) will be crossed. This procedure is easily extended to multiple parental lines by taking the maximum over each filtered locus for multiple parental lines. Because the procedure is limited to basic, computationally inexpensive matrix manipulations, many subsets can be tested. The number of subsets to be tested, however, grows with the binomial coefficient $\binom{n}{k}$.
- Stochastic simulation was used to evaluate the performance of each of the two methods. Two parents from the selected set of parents were intercrossed to generate a new hybrid genotype. This hybrid product of each cross was considered to be the base genotype from which, through stochastic simulation, 1000 gametes were generated. The generated gametes were used as a sample of the potential genetic performance of the pair of parents. These gametes were ordered by their performance, which was calculated by multiplying their allelic state at each locus (-1 or 1) with the locus genomic prediction values. The 95th percentile value of the ranked performances of the 1000 gametes was taken as a measure to judge the offspring performance. In the case that more than two parents (as a result of more than one crossing) were involved, the gamete representing the 95th performance percentile was also selected to represent the selected cross result, and was subsequently combined in a next breeding cycle with a gamete from the third (or 4th, 5th) selected parent. Again 1000 gametes were derived from this new cross and ordered, and the 99th or 100th percentile performance value was again used to judge the performance. A schematic representation of the applied crossing scheme is given in Figure 3.
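The CGEBV calculation and the subset search described above (the highest filtered value per locus, totalled over all loci, evaluated for every set of k parents) can be sketched in Python. This is an illustrative sketch only: the function names `cgebv` and `best_combination` are hypothetical, the exhaustive search is one simple way to test all subsets, and the small matrix S is invented to show the effect.

```python
from itertools import combinations
from math import comb

import numpy as np

def cgebv(S, members):
    """Sum over loci of the highest filtered value among the chosen parents."""
    return float(S[list(members)].max(axis=0).sum())

def best_combination(S, k):
    """Score every k-parent subset exhaustively; return (subset, CGEBV)."""
    n = S.shape[0]
    scored = ((m, cgebv(S, m)) for m in combinations(range(n), k))
    return max(scored, key=lambda item: item[1])

# Invented filtered matrix S: 3 parental lines, 4 loci.
S = np.array([[ 2.0,  2.0, -1.0, -1.0],   # strong on loci 0-1
              [-1.0, -1.0,  2.0,  2.0],   # strong on loci 2-3
              [ 1.0,  1.0,  1.0,  1.0]])  # best per se (row sum 4)

subset, value = best_combination(S, k=2)
print(subset, value)          # complementary parents 0 and 1 win with 8.0
print(comb(100, 5))           # number of 5-parent subsets among 100 lines
```

In this toy example the line with the highest per se value (row sum 4) is not part of the best pair, because the two complementary lines together reach 8. The last line shows how quickly the number of subsets grows, which is why a pre-selection of individuals can be useful before the search.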
Figure 4 is a further graphical representation of genome-wide breeding potentials in a population. The results of Figure 4 were obtained by simulated GS models and real Arabidopsis thaliana genotype data. Visible are the potentials over five chromosomes. The thick solid line ("fullPop") plots the maximum breeding potential over an entire population of 100 individuals; the dotted line ("GWS") indicates the total potential of the selected 5 best parents; the dashed line ("dirGWS") plots the potential of the selected set of 5 best combining parents. The thin solid line at the bottom ("difference") indicates the superiority of the combined best parents over the best parents, mainly found on the chromosomes […].
- Except for the initial choice of parents, the evaluation of both methods was thus performed in an identical fashion. In each round, the final values for the selected gametes of both methods were compared and it was recorded which of the methods yielded better predicted values after 2-5 cycles of breeding.
- The genomic selection models were constructed using Ridge Regression (Meuwissen et al., 2001). The models used in both regGWS and dirGWS were always pairwise identical for each test round.
- The entire procedure described above was repeated several hundred times to produce reliable estimates on the method comparison.
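The gamete-sampling step used in the evaluation above can be sketched as follows. This is a minimal sketch under assumptions: the hybrid of two homozygous parents is taken to carry the two parental haplotypes, cross-overs between adjacent loci are drawn independently with a fixed recombination fraction (no interference), and the recombination values are invented. The names `simulate_gametes` and `percentile_score` are hypothetical.

```python
import numpy as np

def simulate_gametes(hap_a, hap_b, recomb, n_gametes, rng):
    """Sample gametes from a hybrid carrying haplotypes hap_a and hap_b.

    recomb[j] is the recombination fraction between locus j-1 and locus j.
    Setting recomb[0] = 0.5 picks the starting haplotype at random; 0.5
    should also be used at the first locus of every further chromosome.
    """
    switches = rng.random((n_gametes, hap_a.size)) < recomb
    source = np.cumsum(switches, axis=1) % 2   # 0 -> hap_a, 1 -> hap_b
    return np.where(source == 0, hap_a, hap_b)

def percentile_score(gametes, effects, q=95):
    """q-th percentile of gamete performance (allelic state times effect)."""
    return float(np.percentile(gametes @ effects, q))

rng = np.random.default_rng(42)
n_loci = 20
hap_a = np.array([1] * 10 + [-1] * 10)   # favorable alleles on the first half
hap_b = -hap_a                           # favorable alleles on the second half
effects = np.ones(n_loci)                # assumption: equal effect sizes
recomb = np.full(n_loci, 0.05)           # assumption: invented map
recomb[0] = 0.5

gametes = simulate_gametes(hap_a, hap_b, recomb, 1000, rng)
score = percentile_score(gametes, effects)
print(score)
```

Repeating this for the parents chosen by each method, and comparing the resulting percentile scores over many repetitions, mirrors the comparison procedure described in the text.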
- We tested this procedure for breeding schemes in which 3, 4 or 5 parents were combined. The modeled genomic effects were drawn from a truncated normal distribution with mean 0 and standard deviation 1, for which all negative values were discarded. As such all locus effects have a positive value and the sign of the contribution to the trait comes from the allelic state (-1/1). All scenarios were tested while selecting the 95th percentile best combinations.
- Table 3 shows the results of the dirGWS vs regGWS comparison. When three or more parents were used, the directed GWS method outperformed regular GWS selection in more than half of the cases, and this frequency increased dramatically when more than three parents were involved in the breeding scheme. The increase in superiority with increasing number of parents demonstrates that the focus on an optimal combination of complementary genomic regions, which is the main idea of the directed GWS approach, indeed yields better results than a focus on per se performance of the parents, as is typically done in classical GWS approaches.
- A public genotype data set of the model organism Arabidopsis thaliana (At) was retrieved, consisting of genotype data for 250K loci of 1179 ecotypes (Horton et al., 2012). Simulated (i.e. attributed) allele substitution effects were randomly distributed genome-wide over 500-2000 loci. The At lines were divided uniformly at random into a training set of 50-1000 parents, which were used to construct GS models, and a validation set (for determining the accuracy) consisting of the remaining parents. The tests were done selecting either the 99% or 100% best ranking combinations and following 2-5 breeding cycles. Each combination of parameters (i.e. number of loci, number of parents in the training set, selection percentage, and number of breeding cycles) was repeated 200 times.
- One of the presumptions of the dirGWS strategy is the availability of reasonably accurate GEBVs. This accuracy depends on the GS model construction methodology and on training data set properties such as the number of parents, the genetic diversity within the panel and the distribution of the allele substitution effects. Results for performance of dirGS vs regGS were therefore ordered by the accuracy of the GS model (expressed as the correlation between Breeding Values and GEBVs).
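The accuracy measure mentioned above, the correlation between breeding values and GEBVs in a validation set, can be sketched together with a ridge regression model. This is a sketch under assumptions: the closed-form ridge estimator is standard, but the penalty `lam`, the random -1/1 genotypes, the noise level and the train/validation split sizes are invented for illustration, and the names `fit_ridge` and `accuracy` are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression: beta = (X'X + lam * I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def accuracy(true_bv, gebv):
    """Pearson correlation between true breeding values and GEBVs."""
    return float(np.corrcoef(true_bv, gebv)[0, 1])

rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(300, 100))     # assumption: random genotypes
true_effects = np.abs(rng.standard_normal(100))  # N(0, 1) with negatives folded away
y = X @ true_effects + rng.standard_normal(300)  # phenotype = genetic value + noise

train, valid = np.arange(200), np.arange(200, 300)
beta = fit_ridge(X[train], y[train], lam=10.0)   # assumption: arbitrary penalty
r = accuracy(X[valid] @ true_effects, X[valid] @ beta)
print(round(r, 3))
```

With fewer training lines than loci, or with noisier phenotypes, `r` drops, which corresponds to the lower-accuracy end of the horizontal axis in Figures 5 to 7.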
- Fig 5 shows the results of the dirGWS vs regGWS comparison, wherein the training population serves as breeding population. The directed GWS method outperformed regular GWS selection, in particular when the GS models are more accurate (R>0.6). However, even with less accurate models, the dirGS strategy provides better selection results, in particular when more than 2 parents are involved and the crossing scenario spans more than one generation. The increase in superiority with increasing number of parents demonstrates that the focus on an optimal combination of complementary genomic regions, according to the present invention, indeed yields better results than a focus on per se performance of the parents, as is typically done in classical GWS approaches.
- A test similar to the one done in Arabidopsis was performed using Maize (Zea mays) genotype data for $10^6$ loci of 368 lines (Li et al., 2013). Traits were simulated by assigning allele substitution effects to 258-2015 loci. Training sets of 50-200 training parents were randomly selected. Further test procedures were similar to those in the Arabidopsis example.
- Test results, wherein the training population serves as breeding population, shown in Fig 6, indicate again that for most parameter combinations the selection results are better when following the dirGS strategy, in particular when more than 2 parents are involved. The accuracy has a less dramatic effect on the overperformance of the dirGS strategy than in the previous example.
- A third test was conducted using Cucumber (Cucumis sativus) genotype data for $3.7 \times 10^6$ loci of 115 lines (Qi et al., 2013). The set was reduced to the homozygous marker subset with no missing data (179K markers) and 86 non-identical parent lines. Trait effects were simulated on 450-1789 loci. GS models were constructed on training populations of 50 training parents. Further test procedures were similar to those in the previous two examples.
- By the nature of this dataset, which contains a relatively low number of lines, it was not possible to construct very accurate models, so results were obtained for the accuracy range (0.55 < R < 0.67) only. Again, using the training population as breeding population, we observed an improved performance when following the dirGS strategy for most cases, and even within this rather narrow range we observed a modest impact of model accuracy (more overperformance of dirGS is found when the accuracy is higher), see Fig. 7.
- P: matrix of parental lines; each row contains a parental line, each column represents a single locus
- F: a block diagonal symmetric filter matrix; the entry in the ith row and the jth column represents the amount of linkage between the ith and the jth locus
- S: the filtered matrix of parental lines
- R: set of real numbers
- s: a vector of a single parental line containing the filtered locus values of that parental line
- c: the potential value after crossing of two parental lines
- p: the total number of loci involved in selection
- n: the total number of parental lines involved in selection
- ϑi: ith parental line
- k: number of parental lines to be crossed
-
- Haley, C.S. and Visscher, P.M. (1998) Strategies to utilize marker-quantitative trait loci association. J Dairy Sci 81: 85-97.
- Hoerl, A.E. (1959) Optimum Solution of Many Variables Equations. Chemical Engineering Progress 55: 69-78.
- Horton, M.W. et al. (2012) Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel. Nature Genetics 44: 212-216.
- Johnson, G.R. and Yang, X.S. (2010) Methods and compositions for breeding plants with enhanced yield. US 2010/0037342.
- Meuwissen, T.H.E. et al. (2001) Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps. Genetics 157: 1819-1829.
- Haldane, J.B.S. (1919) The combination of linkage values and the calculation of distances between the loci of linked factors. J Genet 8.29: 299-309.
- Hastie, T., Tibshirani, R. and Friedman, J. (2001) The Elements of Statistical Learning. New York, NY, USA: Springer New York Inc.
- Arlot, S. and Celisse, A. (2010) A survey of cross-validation procedures for model selection. Statistics Surveys 4: 40-79.
- Kishore, V.K. and Guo, Z. (2012) Methods for increasing genetic gain in a breeding population. WO 2012/075125.
- Kosambi, D.D. (1943) The estimation of map distances from recombination values. Annals of Eugenics 12.1: 172-175.
- Li, H. et al. (2013) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nature Genetics 45: 43-50.
- Liu, B.H. (1998) Statistical Genomics, Linkage, Mapping and QTL analysis. CRC Press, pp. 611.
- Peleman, J.D. and Rouppe van der Voort, J. (2003) The challenges in Marker Assisted Breeding. CGN Eucarpia Leafy Vegetables (eds. Van Hintum, Th.J.L., Lebeda, A., Pink, D., Schut, J.W.).
- Qi, J. et al. (2013) A genomic variation map provides insights into the genetic basis of cucumber domestication and diversity. Nature Genetics 45: 1510-1515.
- Ragot, M. et al. (2008) Process for selecting individuals and designing a breeding program. EP 1962212.
Claims (16)
- Method for identifying combinations of at least three individuals within a breeding population, wherein the combinations have for at least one phenotypic trait of interest a higher Combined Genome-Wide Estimated Breeding Value in the offspring for said at least one phenotypic trait of interest, as compared to at least 70% of the other combinations of at least three individuals within said breeding population, wherein the method comprises:
a) providing a training population of individuals;
b) collecting phenotypic data for the at least one trait of interest for each individual within said training population;
c) collecting genotypic data for each individual within said training population using molecular marker techniques, sequence-based genotyping or whole genome sequencing, and attributing to each allele of a plurality of loci of each individual, an allele substitution effect for the at least one phenotypic trait of interest;
d) providing a genotype/phenotype relationship model for said training population of individuals, wherein the model estimates for a given genotype of an individual what the quantitative contribution is of the allele substitution effects of said plurality of loci on the at least one phenotypic trait of interest;
e) genotyping each individual within a breeding population in the same way as in step c);
f) calculating for each individual within the breeding population the allele substitution effect for each allele of said plurality of loci by using the genotype/phenotype relationship model of step d), and correcting for recombination probabilities with flanking loci, wherein for an allele with a positive allele substitution effect said effect is multiplied with the probability that said allele is transmitted to the offspring, and for an allele with a negative allele substitution effect said effect is multiplied with the probability that said allele is not transmitted to the offspring;
g) determining the Combined Genome-Wide Estimated Breeding Value in the offspring for the at least one phenotypic trait of interest for each combination of at least three individuals within the breeding population by calculating for each combination of at least three individuals for each locus of said plurality of loci in the offspring the highest combination of allele substitution effects using the calculated and corrected allele substitution effects of the individuals calculated in step f);
h) identifying the combinations of at least three individuals within the breeding population that provide for said at least one phenotypic trait of interest Combined Genome-Wide Estimated Breeding Values in the offspring that are higher than at least 70% of the Combined Genome-Wide Estimated Breeding Values in the offspring of other combinations of at least three individuals within the breeding population,
wherein the method does not involve crossing and subsequent selection of plants or animals.
- The method according to claim 1, wherein the combinations of at least three individuals within the breeding population provide for said at least one phenotypic trait of interest Combined Genome-Wide Estimated Breeding Values in the offspring that are higher than at least 80%, more preferably at least 90%, even more preferably at least 95%, yet even more preferably 97%, most preferably 99% of the Combined Genome-Wide Estimated Breeding Values in the offspring of other combinations of at least three individuals within the breeding population.
- The method according to any one of the previous claims, wherein a pre-selection of individuals of the breeding population to be combined is made by selecting less than 30%, more preferably less than 20%, even more preferably less than 10%, yet even more preferably less than 5%, most preferably less than 2% of the individuals with the highest sum of all allele substitution effects for said plurality of loci.
- The method according to any one of the previous claims, wherein the recombination probabilities are calculated based on genetic distances between loci, or based on aligning physical and genetic maps.
- The method according to any one of the previous claims, wherein said molecular marker technique is selected from the group consisting of the detection of SNPs, the detection of RFLPs, the detection of SSR polymorphisms, RAPDs, the detection of indels or CNVs, and AFLP.
- The method according to any one of the previous claims, wherein the training population is a specifically designed population or wherein the training population is equal to the breeding population.
- The method according to any one of the previous claims, wherein at least one phenotypic trait of interest is a quantitative trait.
- The method according to any one of the previous claims, wherein the method is applied to more than one generation.
- The method according to any one of the previous claims, wherein the method is applied to a species that is diploid.
- The method according to any one of the previous claims, wherein the method is applied to a species that is allopolyploid.
- The method according to any one of the previous claims, wherein the method is applied to a species that is autopolyploid.
- Method according to any one of the previous claims, wherein the breeding population of individuals is a field crop, or vegetable crop, or woody fruit species, or forestry species, or plantation crop, preferably selected from the group consisting of Arabidopsis thaliana, Abyssinian mustard, alfalfa, barley, barrel clover, black mustard, buckwheat, canola, clover, common flax, common vetch, corn spurry, coffee, cotton, Egyptian clover, fodder beet, hemp, hop, Indian mustard, Jerusalem artichoke, maize, millet, mustard, lupin, oat, oilseed rape (Brassica napus), field mustard (Brassica rapa), opium poppy, Persian clover, potato, red clover, rye, safflower, sisal, soy bean, sugar beet, sunflower, tea, tobacco, triticale, wheat, white clover, white mustard, wild rice, winter vetch, artichoke, asparagus, asparagus beans, aubergine, beetroot, black radish, black bean, black salsify, broad bean, broccoli, Brussels sprouts, cabbage, cantaloupe, carrot, cauliflower, celery, chard, chicory, chili pepper, chinese cabbage, choi sum, common bean, corn salad, courgette, cucumber, daikon, eggplant, endive, fennel, garlic, goosefoot, green bean, Indian lettuce, kale, kidney bean, kohlrabi, leek, lettuce, lentil, lima bean, maize, melon, mizuna, napa cabbage, onion, parsnip, pea, pepper, potato, pumpkin, quinoa, radicchio, radish, rapini, red cabbage, rhubarb, runner bean, rutabaga, salad rocket, Savoy cabbage, shallot, soy bean, spinach, squash, sugar cane, swede, tomatillo, tomato, turnip, watercress, watermelon, yellow turnip, almond, apple, apricot, bird cherry, butternut, cashew, cherry, chokeberry, crabapple, filbert, greengage, hawthorn, hazel, heartnut, loquat, medlar, mirabelle prune, nectarine, peach, peacherine, pear, pecan, pistacio, plum, prune, quince, rowan, walnut, acacia, alder, Allegheny chinkapin, American beech, American chestnut, American hornbeam, ash, aspen, basswood, beech, bigtoothed, aspen, birch, bitternut hickory, black alder, black birch, black cherry, 
black gum, black locust, black maple, black oak, black poplar, black walnut, black willow, butternut, cedar, chestnut, chestnut oak, Chinese chestnut, Corsican pine, cottonwood, crabapple, cucumbertree, cypress, dogwood, Douglas fir, Eastern hemlock, elm, English oak, eucalyptus, European beech, European larch, European silver fir, European white birch, fir, flowering dogwood, gum, hawthorn, hornbeam, horse chestnut, hybrid poplar, Japanese chestnut, Japanese larch, larch, lodgepole pine, maple, maritime pine, mockernut hickory, Norway spruce, oak, Oregon pine, Pacific silver fir, pedunculate oak, pignut hickory, pine, pitch pine, poplar, Scots pine, sweet chestnut, red alder, red cedar, red maple, red oak, red pine, red spruce, redwood, rowan, sassafrass, Scots pine, Serbian spruce, serviceberry, shagbark hickory, silver birch, Sitka spruce, southern beech, spruce, striped maple, sugar maple, sweet birch, sweet chestnut, sycamore, tamarack, tulip tree, Western hemlock, white ash, white oak, white pine, yellow birch, banana, breadfruit, coconut, date palm, jackfruit, mango, oil palm, olive, papaya, pineapple, plantain, rubber tree and sugar palm.
- Method according to any one of the previous claims, wherein the breeding population of individuals is a species selected from the group consisting of Cattle (Bos taurus, Bos indicus), Water buffalo (Bubalus bubalis), Equine (Equus caballus), Sheep (Ovis aries), Goat (Capra hircus), Pig (Sus scrofa), Chicken (Gallus gallus), Turkey (Meleagris gallopavo), Ducks (Anas platyrhynchos, Cairina moschata), Geese (Anser anser domesticus, Anser cygnoides), Pigeons (Columba livia domestica), Rat (Rattus norvegicus), Mouse (Mus musculus), Cat (Felis catus), Dog (Canis familiaris), Rabbit (Oryctolagus cuniculus), Guinea pig (Cavia porcellus), Zebra fish (Danio rerio) and Fruit fly (Drosophila melanogaster).
- Method according to any one of the previous claims, wherein the breeding population of individuals is a fish species selected from the group consisting of Cyprinus carpio, Salmo salar, Oreochromis niloticus, Oncorhynchus mykiss, Ctenopharyngodon idella, Hypophthalmichthys molitrix, Gibelion catla, Cyprinus carpio, Hypophthalmichthys nobilis, Carassius carassius, Oreochromis niloticus, Pangasius pangasius and Labeo rohita, or wherein the breeding population of individuals is a shrimp species selected from the group consisting of Macrobrachium rosenbergii, Litopenaeus vannamei and Penaeus monodon.
- Computer-readable medium comprising instructions for performing the method according to any of the previous claims.
- Product obtainable by the method according to any one of claims 1-14, preferably wherein the product is a plant.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15176340.6A EP2949204B2 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
DK15176340.6T DK2949204T4 (en) | 2013-06-14 | 2014-06-13 | Targeted strategies to improve phenotypic traits |
EP16193259.5A EP3135103B1 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NL2010982 | 2013-06-14 |
Related Child Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP15176340.6A Division-Into EP2949204B2 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
EP15176340.6A Division EP2949204B2 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
EP16193259.5A Division-Into EP3135103B1 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
EP16193259.5A Division EP3135103B1 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
Publications (3)
Publication Number | Publication Date |
---|---|
EP2813141A1 true EP2813141A1 (en) | 2014-12-17 |
EP2813141B1 EP2813141B1 (en) | 2015-08-05 |
EP2813141B2 EP2813141B2 (en) | 2018-11-28 |
Family
ID=48951556
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP14172374.2A Active EP2813141B2 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
EP15176340.6A Active EP2949204B2 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
EP16193259.5A Active EP3135103B1 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP15176340.6A Active EP2949204B2 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
EP16193259.5A Active EP3135103B1 (en) | 2013-06-14 | 2014-06-13 | Directed strategies for improving phenotypic traits |
Country Status (5)
Country | Link |
---|---|
US (1) | US11107551B2 (en) |
EP (3) | EP2813141B2 (en) |
JP (1) | JP6566484B2 (en) |
DK (2) | DK2949204T4 (en) |
WO (1) | WO2014200348A1 (en) |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105028326A (en) * | 2015-08-22 | 2015-11-11 | 昆明亚灵生物科技有限公司 | Method for raising non-specific-pathogen-class experimental animals during export quarantine period |
CN105420357A (en) * | 2015-12-08 | 2016-03-23 | 中国农业科学院蔬菜花卉研究所 | Universal SSR marker tightly linked with dominant genic male sterility genes in broccoli |
CN105512513A (en) * | 2015-12-02 | 2016-04-20 | 新疆农业大学 | Method for identifying prunus persica plant species based on SSR molecular markers |
CN106222278A (en) * | 2016-08-09 | 2016-12-14 | 南京林业大学 | A kind of primer special combination for quickly detection willow polyploid and detection method thereof |
CN106258663A (en) * | 2016-11-10 | 2017-01-04 | 辽宁省农业科学院 | A kind of prevention and controls being suitable to the Northeast winter greenhouse Folium Allii tuberosi root maggot |
EP2949204B1 (en) | 2013-06-14 | 2017-01-04 | Keygene N.V. | Directed strategies for improving phenotypic traits |
CN106755387A (en) * | 2016-12-14 | 2017-05-31 | 李兴盛 | A kind of utilization molecular labeling Rapid identification cucumber stock is made a concerted effort the method for two purity |
CN107385052A (en) * | 2017-08-08 | 2017-11-24 | 中国林业科学研究院热带林业研究所 | For identifying STR primers and its application of Eucalyptus clone |
CN109479827A (en) * | 2019-01-03 | 2019-03-19 | 湖南沩滨农业科技有限公司 | The method of domestic pig thick forest free-ranging |
CN110029174A (en) * | 2019-04-22 | 2019-07-19 | 浙江省淡水水产研究所 | A kind of SSR marker related to Macrobrachium rosenbergii weight |
CN110551841A (en) * | 2019-08-28 | 2019-12-10 | 福建省农业科学院作物研究所 | SSR primer group developed based on eggplant transcriptome sequencing data and application thereof |
CN110643735A (en) * | 2019-11-15 | 2020-01-03 | 中国农业科学院郑州果树研究所 | InDel molecular marker 5mBi3 for identifying bitter taste character of melon fruit, and primer and application thereof |
CN110699478A (en) * | 2019-11-15 | 2020-01-17 | 中国农业科学院郑州果树研究所 | InDel molecular marker 2mBi134 for identifying bitter taste character of melon fruit as well as primer and application thereof |
CN111088388A (en) * | 2020-01-18 | 2020-05-01 | 中国农业科学院郑州果树研究所 | InDel marker and primer pair for identifying flesh red/non-red character of peach fruit and application of InDel marker and primer pair |
CN111312335A (en) * | 2020-02-24 | 2020-06-19 | 吉林省农业科学院 | Soybean parent selection method, soybean parent selection device, storage medium and electronic equipment |
CN111387141A (en) * | 2020-04-28 | 2020-07-10 | 青岛康大兔业发展有限公司 | Meat rabbit breeding method |
CN111394474A (en) * | 2020-03-24 | 2020-07-10 | 西北农林科技大学 | Method for detecting copy number variation of cattle GAL3ST1 gene and application thereof |
CN111826429A (en) * | 2020-07-28 | 2020-10-27 | 辽宁省果树科学研究所 | Non-hybrid progeny identification method based on simplified genome sequencing and SNP (single nucleotide polymorphism) sub-allele frequency |
CN113151489A (en) * | 2021-02-26 | 2021-07-23 | 河南省畜牧总站 | Molecular diagnosis method for evaluating growth traits based on cow ZNF146 gene CNV marker and application thereof |
CN113430300A (en) * | 2021-08-30 | 2021-09-24 | 广东省农业科学院蚕业与农产品加工研究所 | SSR molecular marker of mulberry variety Yuehen 123, core primer group and kit thereof, and application of SSR molecular marker |
CN113621710A (en) * | 2021-07-13 | 2021-11-09 | 武汉中科瑞华生态科技股份有限公司 | Bighead microsatellite marker primer and Bighead marker discharge effect evaluation method |
CN113801954A (en) * | 2021-09-07 | 2021-12-17 | 云南省农业科学院质量标准与检测技术研究所 | SSR core primer group for purity identification of pepper hybrid and screening method thereof |
CN113796353A (en) * | 2021-09-18 | 2021-12-17 | 江苏农牧科技职业学院 | Marking method and device for systemic breeding of black muscovy ducks |
CN113862392A (en) * | 2021-11-15 | 2021-12-31 | 西北农林科技大学 | SSR molecular marker primer linked with Chinese cabbage yellow cotyledon gene Bryc and application thereof |
CN114262748A (en) * | 2021-12-29 | 2022-04-01 | 广东省农业科学院蚕业与农产品加工研究所 | Molecular marker for identifying variety 'Yueshi 143', identifying primer group, kit and application |
CN116200521A (en) * | 2022-12-05 | 2023-06-02 | 东北林业大学 | SSR (simple sequence repeat) marker primer group for identifying Korean pine clone and construction method and application of SSR marker primer group and fingerprint |
Families Citing this family (51)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016069078A1 (en) * | 2014-10-27 | 2016-05-06 | Pioneer Hi-Bred International, Inc. | Improved molecular breeding methods |
AU2015362942B2 (en) | 2014-12-18 | 2022-02-17 | Pioneer Hi-Bred International, Inc. | Improved molecular breeding methods |
EP3307756A4 (en) * | 2015-06-12 | 2019-01-02 | Anandia Laboratories Inc. | Methods and compositions for cannabis characterization |
CN105046108B (en) * | 2015-07-07 | 2017-12-05 | 中国农业大学 | Corn hybridization compound formulation and system based on self-mating system SSR and phenotypic information |
CN105063028B (en) * | 2015-07-31 | 2018-04-13 | 青岛啤酒股份有限公司 | SSR primer sets and the method using primer sets structure malt finger-print |
CN105145479A (en) * | 2015-09-14 | 2015-12-16 | 蒙山县瑶山大鲵繁育专业合作社 | Giant salamander incubator |
CN106811465A (en) * | 2015-12-01 | 2017-06-09 | 深圳华大农业与循环经济科技有限公司 | With the molecular labeling SVmc4 of hasked millet color gene close linkage |
CN106811464A (en) * | 2015-12-01 | 2017-06-09 | 深圳华大农业与循环经济科技有限公司 | With the molecular labeling SVmc1 of hasked millet color gene close linkage |
CN105994129B (en) * | 2016-05-12 | 2021-04-20 | 瑞昌市久兴农业科技有限公司 | Ecological production method of local chicken and vegetables |
CN105907869B (en) * | 2016-06-01 | 2019-04-12 | 青岛科技大学 | The SSR primer and method identified for anvil with pumpkin " Huang Chenggen 2 " hybrid seed purity |
WO2017210603A1 (en) * | 2016-06-03 | 2017-12-07 | Illumina, Inc. | Genotyping polyploid loci |
CN107012200B (en) * | 2016-08-17 | 2019-09-10 | 青岛科技大学 | Anvil pumpkin ' the method for identifying molecules of big anvil 1 ' cenospecies |
CN107805672B (en) * | 2016-09-09 | 2021-03-12 | 宁波市农业科学研究院 | Method for identifying authenticity of Indian pumpkin and Chinese pumpkin hybrid stock varieties |
CN106755315B (en) * | 2016-11-21 | 2019-08-02 | 中国农业科学院烟草研究所 | Tobacco glandular hairs secretion characteristic related gene Nt-te Molecular mapping and preparation method and application |
CN106665495A (en) * | 2016-12-31 | 2017-05-17 | 贵阳富之源农业科技有限公司 | Cross breeding method for new-American-line boars and new-Canadian-line sows |
US20180239866A1 (en) * | 2017-02-21 | 2018-08-23 | International Business Machines Corporation | Prediction of genetic trait expression using data analytics |
CN107201404B (en) * | 2017-06-15 | 2020-12-01 | 江西省农业科学院蔬菜花卉研究所 | Molecular biological identification method for sex of asparagus hermaphrodite plants and application thereof |
WO2018234639A1 (en) * | 2017-06-22 | 2018-12-27 | Aalto University Foundation Sr. | Method and system for selecting a plant variety |
CN107338295A (en) * | 2017-07-20 | 2017-11-10 | 扬州大学 | A kind of waterlogging phenotype of eggplant and the method for SSR molecular marker association analysis |
CN107509686A (en) * | 2017-07-21 | 2017-12-26 | 平南县德湖种养农民专业合作社 | A kind of cultural method of arhat chicken |
CN107345256B (en) * | 2017-08-22 | 2020-06-26 | 山西省农业科学院农作物品种资源研究所 | Transcriptome sequencing-based EST-SSR primer group for developing mucuna pruriens, method and application |
CN107354222B (en) * | 2017-08-30 | 2020-10-30 | 中国林业科学研究院热带林业研究所 | STR primer, PCR kit and method for identifying clone of eucalyptus |
SG11202001715YA (en) * | 2017-09-07 | 2020-03-30 | Regeneron Pharma | Systems and methods for leveraging relatedness in genomic data analysis |
WO2019079464A1 (en) * | 2017-10-17 | 2019-04-25 | Jungla Inc. | Molecular evidence platform for auditable, continuous optimization of variant interpretation in genetic and genomic testing and analysis |
CN107581151B (en) * | 2017-10-20 | 2020-07-07 | 葛洪伟 | Method for cultivating Chinese bamboo rat terminal father line |
CN108103231A (en) * | 2018-02-07 | 2018-06-01 | 武汉蔬博农业科技有限公司 | A kind of method of Rapid identification new water melon breed ' Wu Nong 8 ' hybrid seed purity |
CN108300798A (en) * | 2018-04-10 | 2018-07-20 | 新疆农业科学院园艺作物研究所 | A kind of primer pair of walnut microsatellite DNA mark fingerprint map construction method and its application |
CN108611434B (en) * | 2018-05-18 | 2022-02-08 | 中国热带农业科学院南亚热带作物研究所 | Sisal hemp internal reference gene ACT/GAPDH and application thereof |
CN108835040A (en) * | 2018-07-12 | 2018-11-20 | 湖北富山生态农业股份有限公司 | A kind of pigeon eggs artificial incubation method |
CN109169503B (en) * | 2018-08-29 | 2021-04-27 | 扬州大学 | Screening method of duck special for rice-duck farming |
CN109349209A (en) * | 2018-10-17 | 2019-02-19 | 张古权 | A kind of six draw the cultural method of pheasant |
CN109221003A (en) * | 2018-10-26 | 2019-01-18 | 蒙细梅 | The feed and its method for breeding of sugarcane chicken |
CN109697302A (en) * | 2018-11-16 | 2019-04-30 | 天津大学 | The behavior modeling method of radio-frequency power amplifier based on OP-ELM |
WO2020132683A1 (en) | 2018-12-21 | 2020-06-25 | TeselaGen Biotechnology Inc. | Method, apparatus, and computer-readable medium for efficiently optimizing a phenotype with a specialized prediction model |
CN109526869A (en) * | 2018-12-28 | 2019-03-29 | 安庆永强农业科技股份有限公司 | Kind duck hatching method suitable for industrial aquaculture |
CN109845697A (en) * | 2019-01-28 | 2019-06-07 | 薛广奎 | A kind of cultural method of fructus lycii sheep |
WO2020197891A1 (en) * | 2019-03-28 | 2020-10-01 | Monsanto Technology Llc | Methods and systems for use in implementing resources in plant breeding |
WO2020224636A1 (en) * | 2019-05-09 | 2020-11-12 | 江苏省农业科学院 | Gossypium anomalum-sourced ssr sequence associated with high lint percentage and drought tolerance and application thereof |
CN110050751A (en) * | 2019-05-21 | 2019-07-26 | 吕孝龙 | A kind of ecological cultivation method of turnip pig |
CN110463657A (en) * | 2019-09-04 | 2019-11-19 | 河北省畜牧良种工作站(河北省种畜禽质量监测站) | A kind of method that Taihang chicken yield is put in winter raising in a suitable place to breed |
CN110564832B (en) * | 2019-09-12 | 2023-06-23 | 广东省农业科学院动物科学研究所 | Genome breeding value estimation method based on high-throughput sequencing platform and application |
CN112514846A (en) * | 2019-09-18 | 2021-03-19 | 中国农业大学 | Method for exploring, screening and purifying colored-feather chickens from dominant white-feather chickens |
CN110853710B (en) * | 2019-11-20 | 2023-09-12 | 云南省烟草农业科学研究院 | Whole genome selection model for predicting starch content of tobacco and application thereof |
CN110853711B (en) * | 2019-11-20 | 2023-09-12 | 云南省烟草农业科学研究院 | Whole genome selection model for predicting fructose content of tobacco and application thereof |
WO2021217138A1 (en) * | 2020-04-24 | 2021-10-28 | TeselaGen Biotechnology Inc. | Method for efficiently optimizing a phenotype with a combination of a generative and a predictive model |
CN111621586B (en) * | 2020-05-11 | 2021-01-26 | 广东省农业科学院蔬菜研究所 | SNP molecular marker closely linked with pumpkin yellow stem character and application thereof |
CN111926020B (en) * | 2020-07-24 | 2022-09-20 | 中国科学院海洋研究所 | Two prawn growth related genes and application thereof in genetic breeding |
CN111903609B (en) * | 2020-08-13 | 2021-09-07 | 江苏京海禽业集团有限公司 | Seed production method of high-yield green-foot high-quality broiler chicken with carcass identification |
CN112514790B (en) * | 2020-11-27 | 2022-04-01 | 上海师范大学 | Rice molecular navigation breeding method and application |
CN117821650B (en) * | 2024-01-11 | 2024-06-11 | 武汉市农业科学院 | Taro whole genome SNP-Panel and application thereof |
CN117831636B (en) * | 2024-03-04 | 2024-06-11 | 北京市农林科学院信息技术研究中心 | Method, device, equipment and medium for implementing genome selection by fusion model |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008074101A2 (en) * | 2006-12-21 | 2008-06-26 | Agriculture Victoria Services Pty Limited | Artificial selection method and reagents |
WO2008085046A1 (en) * | 2007-01-09 | 2008-07-17 | Asg Veehouderij B.V | Method for estimating a breeding value for an organism without a known phenotype |
EP1962212A1 (en) | 2007-01-17 | 2008-08-27 | Syngenta Participations AG | Process for selecting individuals and designing a breeding program |
US20100037342A1 (en) | 2008-08-01 | 2010-02-11 | Monsanto Technology Llc | Methods and compositions for breeding plants with enhanced yield |
WO2010020252A1 (en) * | 2008-08-19 | 2010-02-25 | Viking Genetics Fmba | Methods for determining a breeding value based on a plurality of genetic markers |
WO2012075125A1 (en) | 2010-11-30 | 2012-06-07 | Syngenta Participations Ag | Methods for increasing genetic gain in a breeding population |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100304353A1 (en) * | 2007-07-16 | 2010-12-02 | Pfizer Inc | Methods of improving a genomic marker index of dairy animals and products |
AU2008300011A1 (en) * | 2007-09-12 | 2009-03-19 | Pfizer, Inc. | Methods of using genetic markers and related epistatic interactions |
WO2009085689A2 (en) * | 2007-12-17 | 2009-07-09 | Pfizer Inc. | Methods of improving genetic profiles of dairy animals and products |
CA2721816A1 (en) * | 2008-05-09 | 2009-11-12 | Pfizer Inc. | Methods of generating genetic predictors employing dna markers and quantitative trait data |
US20100269216A1 (en) † | 2009-04-16 | 2010-10-21 | Syngenta Participations Ag | Network population mapping |
US11107551B2 (en) † | 2013-06-14 | 2021-08-31 | Keygene N.V. | Directed strategies for improving phenotypic traits |
2014
- 2014-06-13 US US14/898,005 patent/US11107551B2/en active Active
- 2014-06-13 EP EP14172374.2A patent/EP2813141B2/en active Active
- 2014-06-13 DK DK15176340.6T patent/DK2949204T4/en active
- 2014-06-13 EP EP15176340.6A patent/EP2949204B2/en active Active
- 2014-06-13 WO PCT/NL2014/050389 patent/WO2014200348A1/en active Application Filing
- 2014-06-13 DK DK14172374.2T patent/DK2813141T4/en active
- 2014-06-13 EP EP16193259.5A patent/EP3135103B1/en active Active
- 2014-06-13 JP JP2016519471A patent/JP6566484B2/en active Active
Non-Patent Citations (19)
Title |
---|
ARLOT, SYLVAIN; CELISSE, ALAIN: "A survey of cross-validation procedures for model selection", STATISTICS SURVEYS, vol. 4, 2010, pages 40 - 79 |
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", vol. 1, 2, 1994, CURRENT PROTOCOLS |
BUNTJER J B ET AL: "Haplotype diversity: the link between statistical and biological association", TRENDS IN PLANT SCIENCE, ELSEVIER SCIENCE, OXFORD, GB, vol. 10, no. 10, 1 October 2005 (2005-10-01), pages 466 - 471, XP027846795, ISSN: 1360-1385, [retrieved on 20051001] * |
HALDANE, J. B. S.: "The combination of linkage values and the calculation of distances between the loci of linked factors", J GENET, vol. 8, no. 29, 1919, pages 299 - 309 |
HALEY C.S.; VISSCHER P.M.: "Strategies to utilize marker-quantitative trait loci association", J DAIRY SCI, vol. 81, 1998, pages 85 - 97, XP026993385 |
HASTIE, T.; TIBSHIRANI, R.; FRIEDMAN, J.: "The Elements of Statistical Learning", 2001, SPRINGER NEW YORK INC. |
HOERL, A. E.: "Optimum Solution of Many Variables Equations", CHEMICAL ENGINEERING PROGRESS, vol. 55, 1959, pages 69 - 78 |
HORTON M.W. ET AL.: "Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel", NATURE GENETICS, vol. 44, 2012, pages 212 - 216 |
KOSAMBI, D. D.: "The estimation of map distances from recombination values", ANNALS OF EUGENICS, vol. 12.1, 1943, pages 172 - 175 |
LI, H. ET AL.: "Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels", NATURE GENETICS, vol. 45, 2013, pages 43 - 50 |
LIU, B.H.: "Statistical Genomics, Linkage, Mapping and QTL analysis", 1998, CRC PRESS, pages: 611 |
MEUWISSEN T H E ET AL: "Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps", GENETICS, GENETICS SOCIETY OF AMERICA, AUSTIN, TX, US, vol. 157, 1 April 2001 (2001-04-01), pages 1819 - 1829, XP007901970, ISSN: 0016-6731 * |
MEUWISSEN T: "Genomic selection: the future of marker assisted selection and animal breeding", INTERNET CITATION, 17 October 2003 (2003-10-17), XP002334468, Retrieved from the Internet <URL:http://www.fao.org/biotech/Torino.htm> [retrieved on 20050704] * |
MEUWISSEN, T.H.E. ET AL.: "Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps", GENETICS, vol. 157, 2001, pages 1819 - 1829, XP007901970 |
PELEMAN, J.D.; ROUPPE VAN DER VOORT, J.: "The challenges in Marker Assisted Breeding, CGN Eucarpia Leafy Vegetables", 2003 |
QI, J ET AL.: "A genomic variation map provides insights into the genetic basis of cucumber domestication and diversity", NATURE GENETICS, vol. 45, 2013, pages 1510 - 1515 |
R.D.D. CROY: "Plant Molecular Biology Labfax", 1993, BIOS SCIENTIFIC PUBLICATIONS LTD |
SAMBROOK; RUSSELL: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS |
TINGTING GUO ET AL: "Performance prediction of F1 hybrids between recombinant inbred lines derived from two elite maize inbred lines", THEORETICAL AND APPLIED GENETICS ; INTERNATIONAL JOURNAL OF PLANT BREEDING RESEARCH, SPRINGER, BERLIN, DE, vol. 126, no. 1, 13 September 2012 (2012-09-13), pages 189 - 201, XP035157575, ISSN: 1432-2242, DOI: 10.1007/S00122-012-1973-9 * |
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2949204B2 (en) † | 2013-06-14 | 2020-06-03 | Keygene N.V. | Directed strategies for improving phenotypic traits |
EP2949204B1 (en) | 2013-06-14 | 2017-01-04 | Keygene N.V. | Directed strategies for improving phenotypic traits |
US11107551B2 (en) | 2013-06-14 | 2021-08-31 | Keygene N.V. | Directed strategies for improving phenotypic traits |
CN105028326A (en) * | 2015-08-22 | 2015-11-11 | 昆明亚灵生物科技有限公司 | Method for raising non-specific-pathogen-class experimental animals during export quarantine period |
CN105512513A (en) * | 2015-12-02 | 2016-04-20 | 新疆农业大学 | Method for identifying prunus persica plant species based on SSR molecular markers |
CN105512513B (en) * | 2015-12-02 | 2018-04-13 | 新疆农业大学 | A kind of method for differentiating Amygdalus plant germplasm based on SSR molecular marker |
CN105420357A (en) * | 2015-12-08 | 2016-03-23 | 中国农业科学院蔬菜花卉研究所 | Universal SSR marker tightly linked with dominant genic male sterility genes in broccoli |
CN106222278A (en) * | 2016-08-09 | 2016-12-14 | 南京林业大学 | A kind of primer special combination for quickly detection willow polyploid and detection method thereof |
CN106258663A (en) * | 2016-11-10 | 2017-01-04 | 辽宁省农业科学院 | A kind of prevention and controls being suitable to the Northeast winter greenhouse Folium Allii tuberosi root maggot |
CN106755387A (en) * | 2016-12-14 | 2017-05-31 | 李兴盛 | A kind of utilization molecular labeling Rapid identification cucumber stock is made a concerted effort the method for two purity |
CN107385052A (en) * | 2017-08-08 | 2017-11-24 | 中国林业科学研究院热带林业研究所 | For identifying STR primers and its application of Eucalyptus clone |
CN107385052B (en) * | 2017-08-08 | 2020-10-30 | 中国林业科学研究院热带林业研究所 | STR primer for identifying clone of eucalyptus and application thereof |
CN109479827A (en) * | 2019-01-03 | 2019-03-19 | 湖南沩滨农业科技有限公司 | The method of domestic pig thick forest free-ranging |
CN110029174B (en) * | 2019-04-22 | 2023-01-10 | 浙江省淡水水产研究所 | SSR (simple sequence repeat) marker related to quality of macrobrachium rosenbergii bodies |
CN110029174A (en) * | 2019-04-22 | 2019-07-19 | 浙江省淡水水产研究所 | A kind of SSR marker related to Macrobrachium rosenbergii weight |
CN110551841B (en) * | 2019-08-28 | 2022-03-18 | 福建省农业科学院作物研究所 | SSR primer group developed based on eggplant transcriptome sequencing data and application thereof |
CN110551841A (en) * | 2019-08-28 | 2019-12-10 | 福建省农业科学院作物研究所 | SSR primer group developed based on eggplant transcriptome sequencing data and application thereof |
CN110643735A (en) * | 2019-11-15 | 2020-01-03 | 中国农业科学院郑州果树研究所 | InDel molecular marker 5mBi3 for identifying bitter taste character of melon fruit, and primer and application thereof |
CN110699478A (en) * | 2019-11-15 | 2020-01-17 | 中国农业科学院郑州果树研究所 | InDel molecular marker 2mBi134 for identifying bitter taste character of melon fruit as well as primer and application thereof |
CN110699478B (en) * | 2019-11-15 | 2022-07-08 | 中国农业科学院郑州果树研究所 | InDel molecular marker 2mBi134 for identifying bitter taste character of melon fruit as well as primer and application thereof |
CN111088388A (en) * | 2020-01-18 | 2020-05-01 | 中国农业科学院郑州果树研究所 | InDel marker and primer pair for identifying flesh red/non-red character of peach fruit and application of InDel marker and primer pair |
CN111312335A (en) * | 2020-02-24 | 2020-06-19 | 吉林省农业科学院 | Soybean parent selection method, soybean parent selection device, storage medium and electronic equipment |
CN111394474A (en) * | 2020-03-24 | 2020-07-10 | 西北农林科技大学 | Method for detecting copy number variation of cattle GAL3ST1 gene and application thereof |
CN111394474B (en) * | 2020-03-24 | 2022-08-16 | 西北农林科技大学 | Method for detecting copy number variation of GAL3ST1 gene of cattle and application thereof |
CN111387141A (en) * | 2020-04-28 | 2020-07-10 | 青岛康大兔业发展有限公司 | Meat rabbit breeding method |
CN111826429B (en) * | 2020-07-28 | 2022-06-17 | 辽宁省果树科学研究所 | Non-hybrid progeny identification method based on simplified genome sequencing and SNP (single nucleotide polymorphism) sub-allele frequency |
CN111826429A (en) * | 2020-07-28 | 2020-10-27 | 辽宁省果树科学研究所 | Non-hybrid progeny identification method based on simplified genome sequencing and SNP (single nucleotide polymorphism) sub-allele frequency |
CN113151489A (en) * | 2021-02-26 | 2021-07-23 | 河南省畜牧总站 | Molecular diagnosis method for evaluating growth traits based on cow ZNF146 gene CNV marker and application thereof |
CN113151489B (en) * | 2021-02-26 | 2022-09-27 | 河南省畜牧总站 | Molecular diagnosis method for evaluating growth traits based on cow ZNF146 gene CNV marker and application thereof |
CN113621710A (en) * | 2021-07-13 | 2021-11-09 | 武汉中科瑞华生态科技股份有限公司 | Bighead microsatellite marker primer and Bighead marker discharge effect evaluation method |
CN113621710B (en) * | 2021-07-13 | 2023-11-03 | 武汉中科瑞华生态科技股份有限公司 | Bighead microsatellite marked primer and bighead mark releasing effect evaluation method |
CN113430300A (en) * | 2021-08-30 | 2021-09-24 | 广东省农业科学院蚕业与农产品加工研究所 | SSR molecular marker of mulberry variety Yuehen 123, core primer group and kit thereof, and application of SSR molecular marker |
CN113430300B (en) * | 2021-08-30 | 2021-11-09 | 广东省农业科学院蚕业与农产品加工研究所 | SSR molecular marker of mulberry variety Yuehen 123, core primer group and kit thereof, and application of SSR molecular marker |
CN113801954A (en) * | 2021-09-07 | 2021-12-17 | 云南省农业科学院质量标准与检测技术研究所 | SSR core primer group for purity identification of pepper hybrid and screening method thereof |
CN113796353A (en) * | 2021-09-18 | 2021-12-17 | 江苏农牧科技职业学院 | Marking method and device for systemic breeding of black muscovy ducks |
CN113862392A (en) * | 2021-11-15 | 2021-12-31 | 西北农林科技大学 | SSR molecular marker primer linked with Chinese cabbage yellow cotyledon gene Bryc and application thereof |
CN114262748A (en) * | 2021-12-29 | 2022-04-01 | 广东省农业科学院蚕业与农产品加工研究所 | Molecular marker for identifying variety 'Yueshi 143', identifying primer group, kit and application |
CN116200521A (en) * | 2022-12-05 | 2023-06-02 | 东北林业大学 | SSR (simple sequence repeat) marker primer group for identifying Korean pine clone and construction method and application of SSR marker primer group and fingerprint |
CN116200521B (en) * | 2022-12-05 | 2023-08-18 | 东北林业大学 | SSR (simple sequence repeat) marker primer group for identifying Korean pine clone and construction method and application of SSR marker primer group and fingerprint |
Also Published As
Publication number | Publication date |
---|---|
JP2016521984A (en) | 2016-07-28 |
EP2813141B2 (en) | 2018-11-28 |
EP2949204B2 (en) | 2020-06-03 |
DK2813141T4 (en) | 2019-03-18 |
EP2813141B1 (en) | 2015-08-05 |
DK2949204T3 (en) | 2017-04-03 |
WO2014200348A1 (en) | 2014-12-18 |
EP3135103B1 (en) | 2019-09-04 |
DK2813141T3 (en) | 2015-11-09 |
US20160132635A1 (en) | 2016-05-12 |
EP2949204A1 (en) | 2015-12-02 |
JP6566484B2 (en) | 2019-08-28 |
EP3135103A1 (en) | 2017-03-01 |
US11107551B2 (en) | 2021-08-31 |
EP2949204B1 (en) | 2017-01-04 |
DK2949204T4 (en) | 2020-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2949204B1 (en) | Directed strategies for improving phenotypic traits | |
Lovell et al. | Genomic mechanisms of climate adaptation in polyploid bioenergy switchgrass | |
Zila et al. | Genome-wide association study of Fusarium ear rot disease in the USA maize inbred line collection | |
Aguilar‐Meléndez et al. | Genetic diversity and structure in semiwild and domesticated chiles (Capsicum annuum; Solanaceae) from Mexico | |
Tian et al. | Tracking footprints of maize domestication and evidence for a massive selective sweep on chromosome 10 | |
US8874420B2 (en) | Methods for increasing genetic gain in a breeding population | |
US10455783B2 (en) | Compositions and methods of plant breeding using high density marker information | |
Xue et al. | Genetic architecture of domestication-related traits in maize | |
Rowley et al. | A draft genome and high-density genetic map of European hazelnut (Corylus avellana L.) | |
Alcântara et al. | Genetic diversity of teak (Tectona grandis LF) from different provenances using microsatellite markers | |
Koopman et al. | Linked vs. unlinked markers: multilocus microsatellite haplotype‐sharing as a tool to estimate gene flow and introgression | |
Newman et al. | Initiation of genomics-assisted breeding in Virginia-type peanuts through the generation of a de novo reference genome and informative markers | |
Germain‐Aubrey et al. | Are microsatellite fragment lengths useful for population‐level studies? The case of Polygala lewtonii (Polygalaceae) | |
Class et al. | Patent application title: DIRECTED STRATEGIES FOR IMPROVING PHENOTYPIC TRAITS Inventors: Jacob Bernhard Buntjer (Wageningen, NL) José Lúcio Lima Guerra (Wageningen, NL) Timotheus Gerardus Doeswijk (Wageningen, NL) Remco Van Berloo (Wageningen, NL) Niek Bouman (Wageningen, NL) Assignees: Keygene NV | |
Niu et al. | Inferring the invasion history of coral berry Ardisia crenata from China to the USA using molecular markers | |
WO2010120844A1 (en) | Network population mapping | |
Oliveira | Practical considerations for genotype imputation and multi-trait multi-environment genomic prediction in a tropical maize breeding program | |
Ledesma | Molecular and phenotypic characterization of doubled haploid lines derived from different cycles of the Iowa Stiff Stalk Synthetic maize population | |
Crawford et al. | Reticulate speciation and adaptive introgression in the Anopheles gambiae species complex | |
Shirley | Utilization of genetic and genomic resources in hardwood forest trees | |
Jimenez Madrigal | Next-Generation Sequencing Technologies in Tree Improvement and Conservation Genetics of Dipteryx oleifera Benth. | |
Lötter | Haplotype-resolved genome assembly of an F1 hybrid of Eucalyptus urophylla x E. grandis | |
Santo | The application of reduced-representation sequencing techniques for studying the structure of plant populations: A case study in common bean (Phaseolus vulgaris L.) | |
Ence et al. | Genomics of Disease Resistance in Loblolly Pine | |
Wu et al. | Complex history of admixture during citrus domestication revealed by genome analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
17P | Request for examination filed |
Effective date: 20140613 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
R17P | Request for examination filed (corrected) |
Effective date: 20141223 |
|
RBV | Designated contracting states (corrected) |
Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
INTG | Intention to grant announced |
Effective date: 20150318 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: REF Ref document number: 740127 Country of ref document: AT Kind code of ref document: T Effective date: 20150815 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602014000112 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: NV Representative's name: ISLER AND PEDRAZZINI AG, CH |
|
REG | Reference to a national code |
Ref country code: DK Ref legal event code: T3 Effective date: 20151103 |
|
REG | Reference to a national code |
Ref country code: SE Ref legal event code: TRGR |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 740127 Country of ref document: AT Kind code of ref document: T Effective date: 20150805 |
|
REG | Reference to a national code |
Ref country code: LT Ref legal event code: MG4D |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: FP |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20151105 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20151106 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20151207 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20151205 Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R026 Ref document number: 602014000112 Country of ref document: DE |
|
PLBI | Opposition filed |
Free format text: ORIGINAL CODE: 0009260 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
26 | Opposition filed |
Opponent name: KWS SAAT SE Effective date: 20160506 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 3 |
|
PLAX | Notice of opposition and request to file observation + time limit sent |
Free format text: ORIGINAL CODE: EPIDOSNOBS2 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
PLAF | Information modified related to communication of a notice of opposition and request to file observations + time limit |
Free format text: ORIGINAL CODE: EPIDOSCOBS2 |
|
PLBB | Reply of patent proprietor to notice(s) of opposition received |
Free format text: ORIGINAL CODE: EPIDOSNOBS3 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: MM4A |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20160613 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 4 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20140613 Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: PLFP Year of fee payment: 5 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20160630 Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20160613 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: MK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
PUAH | Patent maintained in amended form |
Free format text: ORIGINAL CODE: 0009272 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: PATENT MAINTAINED AS AMENDED |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20150805 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: AELC |
|
27A | Patent maintained in amended form |
Effective date: 20181128 |
|
AK | Designated contracting states |
Kind code of ref document: B2 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R102 Ref document number: 602014000112 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: BE Ref legal event code: FP Effective date: 20150915 |
|
REG | Reference to a national code |
Ref country code: SE Ref legal event code: RPEO |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: FP |
|
REG | Reference to a national code |
Ref country code: DK Ref legal event code: T4 Effective date: 20190312 |
|
P01 | Opt-out of the competence of the Unified Patent Court (UPC) registered |
Effective date: 20230519 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: CH Payment date: 20230702 Year of fee payment: 10 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20240618 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20240627 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DK Payment date: 20240624 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: NL Payment date: 20240610 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20240625 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: SE Payment date: 20240624 Year of fee payment: 11 Ref country code: BE Payment date: 20240625 Year of fee payment: 11 |