CN107419000A - A genome-wide selection method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and its application - Google Patents

A genome-wide selection method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and its application

Info

Publication number
CN107419000A
Authority
CN
China
Prior art keywords
snp
soybean
haplotype
genotype
agronomic trait
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610349022.8A
Other languages
Chinese (zh)
Inventor
邱丽娟
马岩松
郭勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Original Assignee
Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Crop Sciences of Chinese Academy of Agricultural Sciences
Priority to CN201610349022.8A
Publication of CN107419000A
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/13: Plant traits
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Abstract

The invention discloses a method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and its application. The invention first provides a method for building a model that predicts an agronomic trait of soybean, comprising the following steps: replacing the genome-wide SNP sampling principle with a haplotype sampling principle, and building the model from the SNP marker set obtained under the haplotype sampling principle. The agronomic trait is plant height, number of effective branches, pods per plant, seed weight per plant or 100-seed weight. The invention creates a marker sampling strategy based on haplotype analysis of soybean varieties that can improve the genomic selection prediction accuracy for different soybean agronomic traits, and it is of substantial value for soybean breeding.

Description

A genome-wide selection method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and its application
Technical field
The present invention relates to a genome-wide selection method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and to its application.
Background art
Soybean breeding research in China began in 1913, and China's first bred soybean variety, Jinda 332, was released in 1923. Since the founding of the People's Republic, and particularly since 1978, soybean breeding in China has made great progress: the number of improved varieties has grown rapidly and now exceeds 1800, meeting the needs of national economic and social development in different periods. Breeding methods have also developed from the initial selection and use of natural variation to hybridization (backcross) breeding, radiation mutagenesis, introduction of exogenous DNA and molecular marker-assisted selection. As living standards in China rise, demand for soybean grows day by day. Meanwhile, owing to adjustments in cropping patterns, the soybean planting area has shrunk markedly and soybean imports have increased year after year. According to data published by the National Bureau of Statistics, the national soybean planting area fell by 17.80% from 2010 to 2013, while soybean imports rose by 30.30% from 2010 to 2014. Raising soybean yield per unit area and accelerating the breeding of new soybean varieties is an effective way to ease this imbalance.
Genomic selection (genome-wide selection) uses high-density molecular markers distributed across the whole genome to compute simultaneously the genomic estimated breeding value (GEBV) of complex agronomic traits, and uses it as the criterion for selection in the segregating generations of a breeding population. The theory was proposed by Meuwissen et al. in 2001 and applied to animal breeding. In 2007, Bernardo et al. first applied genomic selection theory to plant breeding. Compared with traditional pedigree-based phenotypic selection, genomic selection increases genetic gain, shortens the breeding cycle and improves breeding efficiency. Compared with molecular marker-assisted selection (MAS), genomic selection has the following advantages: (1) no mapping population needs to be constructed; the training population can be built directly from improved varieties widely used in production and from elite germplasm with pedigree information, which speeds up breeding; (2) building the training population from the parents used to configure crosses in conventional breeding ensures that the resulting GEBV model can be applied directly to progeny selection in the breeding population, removing the further validation that marker-assisted selection results require and saving breeding cost; (3) genomic selection estimates the GEBV from all molecular markers distributed across the genome, overcoming the ineffectiveness of marker-assisted selection for complex traits controlled by many minor-effect genes. As genome information is published for more and more plants, genomic selection is being applied in more and more crops.
In recent years, with the completion of the soybean whole-genome sequence and progress in soybean resequencing, abundant SNP sites have been identified in the soybean genome. On this basis, soybean genome chips containing different numbers of SNP markers have been developed, creating the conditions for applying genomic selection to soybean. For breeders, screening out markers suited to the breeding population is the prerequisite for soybean genomic selection. A reasonable number of markers is also an important means of reducing the cost of genomic selection, accelerating its application in crop breeding and improving the efficiency of selection in the target population.
Summary of the invention
The object of the invention is to provide a method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and its application.
The invention first provides a method for building a model that predicts an agronomic trait of soybean, comprising the following steps: replacing the genome-wide SNP sampling principle with a haplotype sampling principle, and building the model from the SNP marker set obtained under the haplotype sampling principle.
The agronomic trait is plant height, number of effective branches, pods per plant, seed weight per plant or 100-seed weight.
The "haplotype sampling principle" is: analyze the linkage disequilibrium relationships among all SNP markers; define SNPs that form haplotype blocks as "block-SNPs" and SNPs that do not lie in any haplotype block as "blank-SNPs"; in each haplotype block, randomly select one SNP marker as the representative of that block; these representatives, together with all blank-SNP markers, form an SNP marker set of m SNP markers.
The "haplotype sampling principle" is implemented by the following steps in order:
1. Perform a genome-wide scan on each soybean in the training population to obtain the genotype data of all SNP markers;
2. Analyze the linkage disequilibrium relationships among all SNP markers; define SNPs that form haplotype blocks as "block-SNPs" and SNPs that do not lie in any haplotype block as "blank-SNPs"; in each haplotype block, randomly select one SNP marker as the representative of that block; these representatives, together with all blank-SNP markers, form an SNP marker set of m SNP markers.
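The sampling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the haplotype blocks are taken as already computed (for example, exported from an LD analysis tool), and the function name `haplotype_sample`, its parameters and the SNP identifiers are hypothetical, not part of the source.

```python
import random


def haplotype_sample(snp_ids, blocks, seed=None):
    """Pick one random SNP per haplotype block and keep every blank-SNP.

    snp_ids: all SNP identifiers, in genome order (hypothetical input).
    blocks:  list of lists; each inner list holds the SNP ids of one block.
    """
    rng = random.Random(seed)
    # block-SNPs: every SNP that lies inside some haplotype block
    block_snps = {snp for block in blocks for snp in block}
    # one randomly chosen representative per block
    representatives = {rng.choice(block) for block in blocks}
    # keep genome order: representatives plus all blank-SNPs
    return [s for s in snp_ids if s in representatives or s not in block_snps]


snps = ["s1", "s2", "s3", "s4", "s5", "s6"]
blocks = [["s1", "s2"], ["s4", "s5", "s6"]]
marker_set = haplotype_sample(snps, blocks, seed=1)  # m = len(marker_set)
```

Here `s3` is a blank-SNP and is always retained, so the resulting set has m = 3 markers: two block representatives plus one blank-SNP.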
The training population consists of n1 or more soybeans; n1 is a natural number of 50 or more.
The invention also protects the application of any of the above methods in predicting agronomic traits of soybean.
The invention also protects a method for building a model from a training population and screening soybeans having a target agronomic trait from a population to be tested, comprising the following steps:
(1) Obtain the unbiased estimate of the agronomic trait of each soybean in the training population; the training population consists of n1 or more soybeans; n1 is a natural number of 50 or more;
(2) Perform a genome-wide scan on each soybean in the training population to obtain the genotype data of all SNP markers; analyze the linkage disequilibrium relationships among all SNP markers; define SNPs that form haplotype blocks as "block-SNPs" and SNPs that do not lie in any haplotype block as "blank-SNPs"; in each haplotype block, randomly select one SNP marker as the representative of that block; these representatives, together with all blank-SNP markers, form an SNP marker set of m SNP markers;
(3) Based on the training population, establish the following equation A for the relationship between the agronomic trait and genotype: Y = μ1 + X×g + ε1; Y is an n1-dimensional vector representing the unbiased estimates of the agronomic trait of each soybean in the training population; μ1 is the mean of the unbiased estimates of the agronomic trait over the soybeans of the training population; X is an n1 × m matrix of SNP genotype codes made up of -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 denotes a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε1 is the residual. The corresponding data from step (1), together with the genotype data of each SNP marker in the SNP marker set converted to genotype codes, are entered into equation A to obtain the g value of each SNP for the agronomic trait;
(4) Based on the g values obtained in step (3), obtain the predicted value of the agronomic trait of each soybean in the validation population with a random regression best linear unbiased prediction (rrBLUP) model, and thereby screen from the validation population the soybeans whose agronomic trait meets the expected standard; the validation population consists of n2 or more soybeans; n2 is a natural number of 5 or more.
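As a rough illustration of steps (3) and (4), the sketch below estimates marker effects g by ridge regression and then predicts trait values as μ + x×g. It is not the rrBLUP package itself: the shrinkage parameter `lam` is a hypothetical fixed value supplied by the caller, whereas a mixed-model implementation would normally derive it from estimated variance components. The function names and the simulated data are assumptions for illustration only.

```python
import numpy as np


def fit_marker_effects(X, Y, lam):
    """Ridge estimate of SNP effects g in Y = mu + X g + e.

    X: (n1, m) matrix of genotype codes (-1, 0, +1).
    Y: length-n1 vector of unbiased trait estimates.
    lam: shrinkage parameter (assumed fixed here, not estimated).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    mu = Y.mean()
    n = X.shape[0]
    # Dual form: g = X' (X X' + lam I)^-1 (Y - mu); cheap when m >> n1
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), Y - mu)
    return mu, X.T @ alpha


def predict(mu, x, g):
    """Predicted trait values for genotype codes x of shape (n2, m)."""
    return mu + np.asarray(x, dtype=float) @ g


# Simulated toy data standing in for a training and a validation population
rng = np.random.default_rng(0)
X_train = rng.choice([-1, 0, 1], size=(20, 50))
Y_train = 10 + X_train @ rng.normal(size=50)
mu, g = fit_marker_effects(X_train, Y_train, lam=0.5)
y_pred = predict(mu, rng.choice([-1, 0, 1], size=(5, 50)), g)
```

The dual form solves an n1 × n1 system instead of an m × m one, which matters when there are thousands of SNPs but only a few hundred varieties.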
The agronomic trait is plant height, number of effective branches, pods per plant, seed weight per plant or 100-seed weight.
In step (1), raw data for the agronomic trait of the soybeans are first obtained by field trials; then, with soybean as a random effect, the agronomic trait is estimated by best linear unbiased prediction (BLUP) to obtain the unbiased estimate of the agronomic trait for each soybean. The trait survey standard follows Qiu Lijuan et al., "Soybean Germplasm Resources Description Specification and Data Standard (2006)".
In step (2), the method of "performing a genome-wide scan on each soybean in the training population to obtain the genotype data of all SNP markers" is as follows: take the genomic DNA of each soybean and genotype it with the Illumina SoySNP6k iSelect BeadChip (the chip carries probes for detecting 5361 soybean SNPs) following the standard GoldenGate chip detection protocol of Illumina (http://www.illumina.com); then obtain the data with the GenomeStudio Genotyping Module software and, using a missing-data rate below 95% as the criterion, obtain the genotype data of all SNP markers. The resulting genotype data are shown in Table 3.
In step (2), the linkage disequilibrium relationships among all SNP markers are analyzed with Haploview 4.2; data are imported in linkage format, and the linkage disequilibrium scan window is 500 kb. For the operating procedure, see the Haploview user manual.
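To make the 500 kb scan window concrete, the sketch below computes a pairwise LD statistic only for SNP pairs that lie within the window on one chromosome. It is a swapped-in simplification, not a reproduction of Haploview: it uses the squared Pearson correlation of -1/0/+1 genotype codes as an r² proxy, whereas Haploview works from haplotype-level LD measures and its own block definitions. The function names and toy data are hypothetical.

```python
def genotype_r2(a, b):
    """Squared Pearson correlation between two genotype-code vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_a * var_b)


def ld_pairs(positions, genotypes, window_bp=500_000):
    """Yield (i, j, r2) for SNP pairs no more than window_bp apart.

    positions: base-pair positions on one chromosome, sorted ascending.
    genotypes: genotypes[i] is the code vector of SNP i across individuals.
    """
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if positions[j] - positions[i] > window_bp:
                break  # positions are sorted, so later SNPs are farther away
            yield i, j, genotype_r2(genotypes[i], genotypes[j])


positions = [100, 200_000, 900_000]
genotypes = [[1, 1, -1, -1], [1, 1, -1, -1], [1, -1, 1, -1]]
pairs = list(ld_pairs(positions, genotypes))
```

Only the first two SNPs fall within 500 kb of each other, so a single pair is reported; the third SNP is never compared with either of them.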
The invention also protects a method for building a model from a training population and screening soybeans having a target agronomic trait from a population to be tested, comprising the following steps:
(1) Obtain the unbiased estimate of the agronomic trait of each soybean in the training population; the training population consists of n1 or more soybeans; n1 is a natural number of 50 or more;
(2) Perform a genome-wide scan on each soybean in the training population to obtain the genotype data of all SNP markers; analyze the linkage disequilibrium relationships among all SNP markers; define SNPs that form haplotype blocks as "block-SNPs" and SNPs that do not lie in any haplotype block as "blank-SNPs"; in each haplotype block, randomly select one SNP marker as the representative of that block; these representatives, together with all blank-SNP markers, form an SNP marker set of m SNP markers;
(3) Based on the training population, establish the following equation A for the relationship between the agronomic trait and genotype: Y = μ1 + X×g + ε1; Y is an n1-dimensional vector representing the unbiased estimates of the agronomic trait of each soybean in the training population; μ1 is the mean of the unbiased estimates of the agronomic trait over the soybeans of the training population; X is an n1 × m matrix of SNP genotype codes made up of -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 denotes a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε1 is the residual. The corresponding data from step (1), together with the genotype data of each SNP marker in the SNP marker set converted to genotype codes, are entered into equation A to obtain the g value of each SNP for the agronomic trait;
(4) Obtain the mean of the unbiased estimates of the agronomic trait of all soybeans in the validation population; the validation population consists of n2 or more soybeans; n2 is a natural number of 5 or more;
(5) Perform a genome-wide scan on each soybean in the validation population to obtain the genotype data of each SNP in the SNP marker set;
(6) Establish the following equation B: y = μ2 + x×g + ε2; μ2 is the mean of the unbiased estimates of the agronomic trait over the soybeans of the validation population; x is an n2 × m matrix of SNP genotype codes made up of -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 denotes a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε2 is the residual. The corresponding data from step (4), together with the SNP genotype data from step (5) converted to SNP genotype codes, are entered into equation B to obtain the y values, which are the predicted values of the agronomic trait of each soybean in the validation population;
(7) Based on the y values obtained in step (6), screen from the population to be tested the soybeans whose agronomic trait meets the expected standard.
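Steps (6) and (7) amount to computing y = μ2 + x×g for each candidate and keeping those that meet a standard. The sketch below shows one way to do that; the threshold, the assumption that larger predicted values are better, and all names and numbers are hypothetical, since the source does not define the "expected standard".

```python
def predict_trait(mu, genotype_codes, effects):
    """y = mu + sum_j x_j * g_j for one soybean (the residual is not predicted)."""
    return mu + sum(x * g for x, g in zip(genotype_codes, effects))


def screen(candidates, mu, effects, threshold):
    """Return candidate names whose predicted value is >= threshold, best first.

    candidates: dict mapping a variety name to its list of genotype codes.
    threshold:  hypothetical stand-in for the "expected standard".
    """
    predicted = {
        name: predict_trait(mu, codes, effects)
        for name, codes in candidates.items()
    }
    selected = [name for name, value in predicted.items() if value >= threshold]
    return sorted(selected, key=predicted.get, reverse=True), predicted


effects = [2.0, -1.0, 0.5]  # illustrative g values for three SNPs
candidates = {"A": [1, 1, 0], "B": [-1, -1, 1], "C": [1, -1, 1]}
selected, predicted = screen(candidates, mu=80.0, effects=effects, threshold=81.0)
```

For a trait where smaller values are preferred, the comparison and sort order would simply be reversed.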
The agronomic trait is plant height, number of effective branches, pods per plant, seed weight per plant or 100-seed weight.
In step (1) and/or step (4), raw data for the agronomic trait of the soybeans are first obtained by field trials; then, with soybean (soybean variety) as a random effect, the agronomic trait is estimated by best linear unbiased prediction (BLUP) to obtain the unbiased estimate of the agronomic trait for each soybean. The trait survey standard follows Qiu Lijuan et al., "Soybean Germplasm Resources Description Specification and Data Standard (2006)".
In step (2), the method of "performing a genome-wide scan on each soybean in the training population to obtain the genotype data of all SNP markers" is as follows: take the genomic DNA of each soybean and genotype it with the Illumina SoySNP6k iSelect BeadChip (the chip carries probes for detecting 5361 soybean SNPs) following the standard GoldenGate chip detection protocol of Illumina (http://www.illumina.com); then obtain the data with the GenomeStudio Genotyping Module software and, using a missing-data rate below 95% as the criterion, obtain the genotype data of all SNP markers. The resulting genotype data are shown in Table 3.
In step (2), the linkage disequilibrium relationships among all SNP markers are analyzed with Haploview 4.2; data are imported in linkage format, and the linkage disequilibrium scan window is 500 kb. For the operating procedure, see the Haploview user manual.
In step (5), the "genome-wide scan" is performed with the Illumina SoySNP6k iSelect BeadChip (the chip carries probes for detecting 5361 soybean SNPs) following the standard GoldenGate chip detection protocol of Illumina (http://www.illumina.com). Data are obtained with the GenomeStudio Genotyping Module software.
In any of the above, n1 may specifically be a natural number of 150 or more, more specifically 192-224, and more specifically 192 or 224. In any of the above, n2 may specifically be a natural number of 40 or more, more specifically 48-56, and more specifically 48 or 56.
By comparing how different numbers of SNP markers obtained by random sampling, uniform sampling and sampling based on linkage disequilibrium values affect soybean genomic selection prediction accuracy, the invention establishes a method for screening SNP markers suited to soybean genomic selection. The invention creates a marker sampling strategy based on haplotype analysis of soybean varieties that can improve the genomic selection prediction accuracy for different soybean agronomic traits, and it is of great value for soybean breeding and selection.
Brief description of the drawings
Fig. 1 shows the genomic selection prediction accuracy for the main soybean agronomic traits with different numbers of SNPs.
Fig. 2 is a box plot of the genomic selection prediction accuracy for the main soybean agronomic traits under different sampling methods.
Fig. 3 is a box plot comparing the genomic selection prediction accuracy of different sampling methods for northern spring-sown soybean varieties.
Detailed description of the embodiments
The following embodiments help in understanding the invention but do not limit it. Unless otherwise specified, the experimental methods in the following embodiments are conventional methods. Unless otherwise specified, the test materials used in the following embodiments were purchased from conventional biochemical reagent suppliers. The quantitative tests in the following embodiments were all performed in triplicate, and the results were averaged.
Haplotype block: a region of the genome in which no recombination has occurred and which contains only one haplotype; the boundaries of such regions and their haplotypes are highly similar across different populations (Gabriel et al. 2002).
Embodiment 1. Establishment of the method
A total of 280 soybean varieties were used (the soybean varieties of Table 1 and Table 2 together), of which 224 formed the training population and 56 formed the validation population. The experiment was repeated 500 times; in each repetition, the 280 soybean varieties were randomly divided into the training population and the validation population. All 280 soybean varieties were obtained from the soybean medium-term genebank of the National Crop Germplasm Resources Conservation Center (http://www.cgris.net/query/croplist.php).
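The repeated random division described above can be sketched as follows. The 224/56 split and the 500 repetitions come from the text; the function names, the seeding scheme and the use of integer indices in place of variety names are assumptions for illustration.

```python
import random


def split_population(varieties, n_train, rng):
    """Randomly divide varieties into a training and a validation population."""
    shuffled = list(varieties)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def repeated_splits(varieties, n_train=224, n_repeats=500, seed=0):
    """Yield one independent (training, validation) split per repetition."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        yield split_population(varieties, n_train, rng)


varieties = range(280)  # stand-ins for the 280 variety names
for training, validation in repeated_splits(varieties, n_repeats=3):
    print(len(training), len(validation))
```

Each repetition would then fit the model on `training`, predict `validation`, and contribute one accuracy value to the reported mean.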
In one of the repetitions, the soybean varieties of the training population are listed in Table 1 and those of the validation population in Table 2.
I. Multi-year evaluation of the phenotypic data of the soybean varieties and processing of the phenotypic data
From 2008 to 2012, the main agronomic traits of the 280 soybean varieties were evaluated in Heilongjiang, Jilin, Inner Mongolia, Hebei, Henan, Shandong, Anhui, Hubei, Jiangxi and Guangxi. The field trials used a randomized block design with three replicates and 4-row plots with a row length of 5 m. Planting density and cultivation followed the conventional local soybean management practice at each test site. At harvest, 10 plants of uniform growth were randomly selected from the middle of each plot for trait evaluation. The agronomic traits surveyed were plant height, number of effective branches (branch number for short), pods per plant, seed weight per plant and 100-seed weight, and the mean was recorded as the result for the plot. The trait survey standard follows Qiu Lijuan et al., "Soybean Germplasm Resources Description Specification and Data Standard (2006)".
With soybean variety as a random effect, each agronomic trait (plant height, number of effective branches, pods per plant, seed weight per plant and 100-seed weight) was estimated separately by best linear unbiased prediction (BLUP), giving the unbiased estimate of each agronomic trait for each soybean variety. The BLUP values were calculated following Henderson (1975): Henderson C R. Best linear unbiased estimation and prediction under a selection model[J]. Biometrics, 1975, 31(2): 423-447. The unbiased estimates of each agronomic trait for the 280 soybean varieties are given in Table 1 and Table 2.
Table 1
Table 2
The broad-sense heritability of each agronomic trait (plant height, number of effective branches, pods per plant, seed weight per plant and 100-seed weight) was calculated separately.
H² = V(G)/V(P);
H² denotes the broad-sense heritability, V(G) the genetic variance and V(P) the phenotypic variance.
Heritability was calculated following Fehr: Fehr W R. Principles of cultivar development[M]. Vol. Ⅰ, Theory and technique. Iowa State University, 1987, Macmillan Inc., New York.
II. Genome-wide scan of the soybean varieties
The 224 soybean varieties of the training population and the 56 soybean varieties of the validation population were each processed as follows:
1. Genomic DNA extraction
Soybean genomic DNA was extracted with the Genomic DNA Purification Kit from Thermo.
2. SNP chip genotyping
Each genomic DNA sample obtained in step 1 was diluted with ultrapure water to give a nucleic acid solution with a DNA concentration of 50 ng/μl. A 10 μl aliquot of the nucleic acid solution was genotyped with the Illumina SoySNP6k iSelect BeadChip (the chip carries probes for detecting 5361 soybean SNPs) following the standard GoldenGate chip detection protocol of Illumina (http://www.illumina.com); SNP data were then obtained with the GenomeStudio Genotyping Module software to give the genome-wide information.
Genome-wide information was obtained for the 280 soybean varieties. Using a missing-data rate below 95% as the criterion, information for 5354 SNP sites in total was obtained; it is given in Table 3.
Table 3. Distribution of the 5354 SNPs in the soybean genome
III. Accuracy evaluation of agronomic trait prediction using the existing method
1. Genotype data preprocessing
For individuals with missing data among the 5354 SNP markers (that is, a given SNP was not read for a given sample), the missing genotypes were estimated with the "A.mat" function of the rrBLUP package. The mean imputation method was used, and the maximum missing-data rate was set to 50%. The calculation formula is as follows:
Xij denotes the genotype of the i-th variety at the j-th SNP. N denotes the number of individuals with non-missing data at the j-th SNP marker.
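The formula itself did not survive in this text. Given the description (mean imputation, Xij and N as defined above), it is presumably the per-SNP mean of the observed codes, X_ij = (1/N) Σ_k X_kj over the N individuals k with non-missing data at SNP j; that reading is an assumption. The Python sketch below illustrates it, together with the 50% missing-data cut-off. It is not the rrBLUP implementation, and the function name and the use of `None` for a missing call are hypothetical.

```python
def impute_mean(X, max_missing=0.5):
    """Mean-impute missing genotype codes, dropping markers with too many gaps.

    X: list of rows (one per individual) of codes -1/0/+1, None if missing.
    max_missing: markers whose missing fraction exceeds this are removed.
    Returns (imputed matrix, indices of the markers that were kept).
    """
    n, m = len(X), len(X[0])
    kept, means = [], {}
    for j in range(m):
        observed = [row[j] for row in X if row[j] is not None]
        if not observed or (n - len(observed)) / n > max_missing:
            continue  # too little data to impute this marker
        kept.append(j)
        means[j] = sum(observed) / len(observed)  # (1/N) * sum of observed codes
    imputed = [
        [row[j] if row[j] is not None else means[j] for j in kept]
        for row in X
    ]
    return imputed, kept


X = [
    [1, None, None],
    [-1, 0, None],
    [1, 1, None],
    [None, 1, 1],
]
imputed, kept = impute_mean(X)  # the third marker is 75% missing and is dropped
```

Note that an imputed value is generally fractional (for example 1/3), so the imputed matrix is no longer restricted to -1, 0 and +1.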
Information for 5268 SNP sites was obtained (that is, the SNP sites in Table 3 other than those in the last 43 rows).
2. Building the baseline model with the training population
The following equation is established for the relationship between agronomic trait phenotype and genotype: Y = μ1 + X×g + ε1. Y is an n1-dimensional vector representing the unbiased estimates of a given agronomic trait of the test materials; μ1 is the mean of the unbiased estimates of that trait over the 224 soybean varieties of the training population; X is an n1 × m matrix of SNP genotype codes made up of -1, 0 and +1, where ±1 denotes a homozygous genotype (the assignment of +1 and -1 is arbitrary) and 0 denotes a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε1 is the residual.
Taking plant height as an example, the details are as follows: Y = μ1 + X×g + ε1; Y is an n1-dimensional vector representing the unbiased estimates of the plant height of the test materials; μ1 is the mean of the unbiased estimates of the plant height of the 224 soybean varieties of the training population; X is an n1 × m matrix of SNP genotype codes made up of -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 denotes a heterozygous genotype (if an SNP is an A/G polymorphism, the homozygous genotypes AA and GG take X values of ±1, with +1 and -1 assigned arbitrarily, and the genotype AG takes an X value of 0); g is an m-dimensional vector of SNP effect values; ε1 is the residual. X × g expands to Xij × gj, where i and j are natural numbers not exceeding n1 and m respectively; that is, the genotype code Xi1 of the first SNP corresponds to the effect value g1 of the first SNP, and so on. The test materials are the 224 soybean varieties of the training population, and 5268 SNPs were obtained in step 1, so n1 = 224 and m = 5268. Entering Y (the unbiased estimates of the plant height of the 224 soybean varieties of the training population), μ1 (the mean of those unbiased estimates) and X (224 soybean varieties with 5268 SNPs each; X holds the SNP genotype codes "+1", "-1" or "0") gives the g value of each SNP for the plant height phenotype.
In the same way, the g value of each SNP for the number of effective branches can be obtained.
In the same way, the g value of each SNP for pods per plant can be obtained.
In the same way, the g value of each SNP for seed weight per plant can be obtained.
In the same way, the g value of each SNP for 100-seed weight can be obtained.
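The genotype coding used to fill X can be illustrated with a small helper. The A/G example and the arbitrary assignment of +1 and -1 to the two homozygotes follow the text; the function name, the two-letter call format and the treatment of anything else as missing are assumptions.

```python
def encode_genotype(call, alleles):
    """Map a two-letter genotype call to the -1/0/+1 code used in X.

    call:    e.g. "AA", "AG" or "GG" for an A/G polymorphism.
    alleles: (a, b); a/a is coded +1 and b/b is coded -1 (arbitrary choice).
    Returns None for anything else, e.g. a failed call such as "--".
    """
    a, b = alleles
    if call == a + a:
        return 1
    if call == b + b:
        return -1
    if len(call) == 2 and set(call) == {a, b}:
        return 0  # heterozygote, regardless of allele order
    return None


calls = ["AA", "AG", "GA", "GG", "--"]
codes = [encode_genotype(c, ("A", "G")) for c in calls]
```

Swapping the allele order flips the sign of the corresponding g value but leaves the product X × g, and hence the prediction, unchanged, provided the same convention is applied to both populations.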
3rd, the Accuracy evaluation of rudimentary model
Each of the 56 soybean varieties in the validation population is processed as follows (taking plant height as an example):
(1) The genotype data of each soybean variety obtained in step 1 are converted into SNP genotype codes and, together with the g value of each SNP for the plant height phenotype obtained in step 2, entered into the following equation to obtain the y value, i.e. the predicted plant height phenotype: y = μ2 + x × g + ε2. μ2 is the mean of the unbiased estimates of plant height of the 56 soybean varieties making up the validation population; x is an n2 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype (if a SNP has an A/G polymorphism, the homozygous genotypes AA and GG are coded ±1, with +1 and -1 assigned arbitrarily, and the genotype AG is coded 0); g is an m-dimensional vector of SNP effect values; ε2 is the residual. x × g expands into the terms xij × gj, where i and j are natural numbers not exceeding n2 and m respectively, i.e. the genotype code xi1 of the first SNP is paired with the effect value g1 of the first SNP, and so on. The test material is the 56 soybean varieties of the validation population, and 5268 SNPs were obtained in step 1, so n2 = 56 and m = 5268. Entering μ2 (the mean of the unbiased estimates of plant height of the 56 soybean varieties making up the validation population), x (56 soybean varieties with 5268 SNPs each; x holds the SNP genotype codes "+1", "-1" or "0") and the g value of each SNP for the plant height phenotype obtained in step 2 gives the y values.
(2) The unbiased estimates of plant height of the 56 soybean varieties are taken as the observed plant height phenotypes.
(3) The correlation coefficient (rMP) between the predicted and the observed plant height phenotypes is calculated.
(4) The prediction accuracy for plant height (rGS) is calculated as rGS = rMP/h, where h is the square root of the broad-sense heritability of plant height obtained in Step I.
The prediction accuracy for plant height is 0.7602 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for effective branch number is 0.5347 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for pods per plant is 0.4974 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for seed weight per plant is 0.1481 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for 100-seed weight is 0.5377 (mean of 500 replicate experiments).
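The validation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the genotype codes, SNP effects, phenotypes and the heritability value 0.9 are synthetic placeholders, and the function names are hypothetical rather than taken from the patent.

```python
import numpy as np

def predict_phenotype(mu2, x, g):
    """Predicted phenotype y = mu2 + x @ g (the residual term is dropped)."""
    return mu2 + x @ g

def prediction_accuracy(y_pred, y_obs, heritability):
    """Return (rMP, rGS), where rGS = rMP / sqrt(broad-sense heritability)."""
    r_mp = np.corrcoef(y_pred, y_obs)[0, 1]
    return r_mp, r_mp / np.sqrt(heritability)

# Synthetic stand-ins only: none of these numbers come from the patent data.
rng = np.random.default_rng(0)
n2, m = 56, 5268                      # validation varieties, SNPs
x = rng.choice([-1, 0, 1], size=(n2, m), p=[0.48, 0.04, 0.48])
g = rng.normal(0.0, 0.05, size=m)     # assumed SNP effect values
y_obs = 80.0 + x @ g + rng.normal(0.0, 2.0, size=n2)

y_pred = predict_phenotype(y_obs.mean(), x, g)
r_mp, r_gs = prediction_accuracy(y_pred, y_obs, heritability=0.9)
```

Because h is at most 1, dividing by h can only raise the magnitude of rMP, so rGS is never smaller in absolute value than the raw correlation.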
IV. Screening for the optimal marker number for genomic selection of different soybean agronomic traits
Plant height is taken as an example; the other agronomic traits are handled in the same way.
1. Starting from the 5354 SNP markers obtained in Step II, the number of SNP markers is reduced stepwise by random sampling, in intervals of 5% of the total number of SNP markers.
2. Using the SNPs determined in step 1, items 2 and 3 of Step III are carried out in turn.
The prediction accuracy (rGS) of each agronomic trait is calculated as the mean of 500 replicate experiments.
The significance of the differences in prediction accuracy between different SNP marker numbers was tested with the least significant range method (results are shown in Table 4 and Fig. 1). Without significantly reducing the prediction accuracy of genomic selection, the optimal SNP marker number for genomic selection is 2371 for soybean plant height, 1844 for effective branch number, 3424 for pods per plant, 1317 for seed weight per plant and 3688 for 100-seed weight.
Table 4 Comparison of the significance of differences in genomic selection prediction accuracy between different SNP marker numbers for the main agronomic traits of soybean
Note: the same letter indicates that the difference is not significant.
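The stepwise marker reduction can be illustrated with a short Python sketch. One assumption is made loudly here: the total is taken as 5268 (the SNP number used in Step III) rather than 5354, because all five reported optimal marker numbers fall exactly on 5% steps of 5268. The function names are hypothetical.

```python
import numpy as np

def marker_count_grid(total, step_percent=5):
    """Marker counts at step_percent, 2*step_percent, ..., 100 % of total."""
    return [round(total * p / 100) for p in range(step_percent, 101, step_percent)]

def random_subset(total, size, rng):
    """Sorted indices of `size` markers drawn without replacement from `total`."""
    return np.sort(rng.choice(total, size=size, replace=False))

# Assumed total of 5268 SNPs; see the note above.
grid = marker_count_grid(5268)

# Optimal marker numbers as reported in the text.
reported_optima = {
    "plant height": 2371,
    "effective branch number": 1844,
    "pods per plant": 3424,
    "seed weight per plant": 1317,
    "100-seed weight": 3688,
}

rng = np.random.default_rng(1)
subset = random_subset(5268, reported_optima["plant height"], rng)
```

In a full run, each marker count in `grid` would be resampled and evaluated 500 times before the accuracies are compared.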
V. Comparison of the effect of different sampling strategies on the prediction accuracy of genomic selection for different soybean agronomic traits
Plant height is taken as an example; the other agronomic traits are handled in the same way.
1. Haplotype sampling strategy
(1) The linkage disequilibrium among the 5354 SNP markers obtained in Step II was analysed with the Haploview 4.2 software. Data were imported in linkage format, and the linkage disequilibrium scan window was 500 kb. The procedure follows the Haploview user manual. Haplotype blocks were defined with the 95% confidence interval method of Gabriel et al. (2005). A total of 351 haplotype blocks were formed on the 20 chromosomes (containing 2091 SNPs, 39.05% of all markers). The number of SNPs per haplotype block ranged from 2 to 22; 84 blocks consisted of 4 SNPs, accounting for 23.93% of all haplotype blocks. Six haplotype blocks consisted of more than 15 SNPs and were distributed on chromosomes 9, 11, 19 and 20.
(2) SNPs forming a haplotype block are defined as "block-SNPs", and SNPs not located in any haplotype block are defined as "blank-SNPs". Of the 5354 SNPs, 2091 are block-SNPs and 3263 are blank-SNPs. One SNP marker is randomly selected from each haplotype block to represent that block and, together with all blank-SNP markers, forms a SNP marker set of 3614 SNPs.
(3) The following equation is established for the relationship between agronomic trait phenotype and genotype: Y = μ1 + X × g + ε1. Y is an n1-dimensional vector of the unbiased estimates of a given agronomic trait of the test material; μ1 is the mean of the unbiased estimates of that agronomic trait over the 224 soybean varieties making up the training population; X is an n1 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype (+1 and -1 assigned arbitrarily) and 0 a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε1 is the residual.
Taking plant height as an example, the details are as follows: Y = μ1 + X × g + ε1; Y is an n1-dimensional vector of the unbiased estimates of plant height of the test material; μ1 is the mean of the unbiased estimates of plant height of the 224 soybean varieties making up the training population; X is an n1 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype (if a SNP has an A/G polymorphism, the homozygous genotypes AA and GG are coded ±1, with +1 and -1 assigned arbitrarily, and the genotype AG is coded 0); g is an m-dimensional vector of SNP effect values; ε1 is the residual. X × g expands into the terms Xij × gj, where i and j are natural numbers not exceeding n1 and m respectively, i.e. the genotype code Xi1 of the first SNP is paired with the effect value g1 of the first SNP, and so on. The test material is the 224 soybean varieties of the training population, and 3614 SNPs were obtained in step (2), so n1 = 224 and m = 3614. Entering Y (the unbiased estimates of plant height of the 224 soybean varieties making up the training population), μ1 (the mean of those unbiased estimates) and X (224 soybean varieties with 3614 SNPs each; X holds the SNP genotype codes "+1", "-1" or "0") gives the g value of each SNP for the plant height phenotype.
By the same method, the g value of each SNP for the effective branch number phenotype is obtained.
By the same method, the g value of each SNP for the pods per plant phenotype is obtained.
By the same method, the g value of each SNP for the seed weight per plant phenotype is obtained.
By the same method, the g value of each SNP for the 100-seed weight phenotype is obtained.
(4) Each of the 56 soybean varieties in the validation population is processed as follows (taking plant height as an example):
The genotype data of the 3614 SNPs of each soybean variety are converted into SNP genotype codes and, together with the g value of each SNP for the plant height phenotype obtained in step (3), entered into the following equation to obtain the y value, i.e. the predicted plant height phenotype: y = μ2 + x × g + ε2. μ2 is the mean of the unbiased estimates of plant height of the 56 soybean varieties making up the validation population; x is an n2 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype (if a SNP has an A/G polymorphism, the homozygous genotypes AA and GG are coded ±1, with +1 and -1 assigned arbitrarily, and the genotype AG is coded 0); g is an m-dimensional vector of SNP effect values; ε2 is the residual. x × g expands into the terms xij × gj, where i and j are natural numbers not exceeding n2 and m respectively, i.e. the genotype code xi1 of the first SNP is paired with the effect value g1 of the first SNP, and so on. The test material is the 56 soybean varieties of the validation population, and 3614 SNPs were obtained in step (2), so n2 = 56 and m = 3614. Entering μ2 (the mean of the unbiased estimates of plant height of the 56 soybean varieties making up the validation population), x (56 soybean varieties with 3614 SNPs each; x holds the SNP genotype codes "+1", "-1" or "0") and the g value of each SNP for the plant height phenotype obtained in step (3) gives the y values.
(5) The unbiased estimates of plant height of the 56 soybean varieties in the validation population are taken as the observed plant height phenotypes.
(6) The correlation coefficient (rMP) between the predicted and the observed plant height phenotypes is calculated.
(7) The prediction accuracy for plant height (rGS) is calculated as rGS = rMP/h, where h is the square root of the broad-sense heritability of plant height obtained in Step I. The prediction accuracy for plant height is 0.7698 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for effective branch number is 0.5354 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for pods per plant is 0.5091 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for seed weight per plant is 0.1503 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for 100-seed weight is 0.5715 (mean of 500 replicate experiments).
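The marker selection of step (2) of the haplotype sampling strategy can be sketched as follows. The block sizes in the example are invented so that 351 blocks hold 2091 block-SNPs; in practice the block assignment would come from the Haploview output. The function name and the convention of marking blank-SNPs with -1 are assumptions made for illustration.

```python
import numpy as np

def haplotype_sample(block_ids, rng):
    """Keep every blank-SNP plus one randomly chosen SNP per haplotype block.

    block_ids[i] is the block index of SNP i, or -1 for a blank-SNP.
    """
    block_ids = np.asarray(block_ids)
    blank = np.flatnonzero(block_ids < 0)
    representatives = [
        rng.choice(np.flatnonzero(block_ids == b))
        for b in np.unique(block_ids[block_ids >= 0])
    ]
    return np.sort(np.concatenate([blank, np.array(representatives, dtype=np.int64)]))

# Hypothetical block sizes: 336 blocks of 6 SNPs and 15 blocks of 5 SNPs (2091 in total).
sizes = np.full(351, 5)
sizes[:336] += 1
block_ids = np.concatenate([np.repeat(np.arange(351), sizes), np.full(3263, -1)])

rng = np.random.default_rng(2)
selected = haplotype_sample(block_ids, rng)
```

With 351 blocks and 3263 blank-SNPs the selection always contains 351 + 3263 = 3614 markers, matching the marker set size given in step (2); only the identity of the block representatives changes between replicates.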
2. Random sampling strategy
(1) Starting from the 5354 SNPs obtained in Step II, the number of markers used for soybean genomic selection is reduced stepwise by random sampling, in intervals of 10% of the marker number.
(2) Using the SNPs determined in step (1), items 2 and 3 of Step III are carried out in turn.
The prediction accuracy (rGS) of each agronomic trait is calculated as the mean of 500 replicate experiments.
3. Uniform sampling strategy
(1) Starting from the 5354 SNPs obtained in Step II, every 2 to 10 adjacent markers are treated as one sampling unit, and one SNP marker is randomly selected from each sampling unit to form a new marker set.
(2) Using the SNPs determined in step (1), items 2 and 3 of Step III are carried out in turn.
The prediction accuracy (rGS) of each agronomic trait is calculated as the mean of 500 replicate experiments.
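A minimal sketch of the uniform sampling strategy is given below. It assumes that the markers are already ordered by genomic position, that sampling units are formed by simple consecutive grouping, and that a shorter final unit is kept; none of these details is specified in the text, and the function name is hypothetical.

```python
import numpy as np

def uniform_sample(total, unit_size, rng):
    """Pick one random marker index from each run of `unit_size` adjacent markers."""
    starts = np.arange(0, total, unit_size)
    ends = np.minimum(starts + unit_size, total)   # the last unit may be shorter
    return np.array([rng.integers(s, e) for s, e in zip(starts, ends)])

rng = np.random.default_rng(3)
# One marker set per sampling-unit size from 2 to 10 adjacent markers.
marker_sets = {k: uniform_sample(5354, k, rng) for k in range(2, 11)}
```

Under these assumptions a unit size of 2 keeps 2677 markers and a unit size of 10 keeps 536.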
The results of step 1, step 2 and step 3 are compared in Fig. 2. Comparing the genomic prediction accuracy for the main agronomic traits of soybean under the different sampling strategies, for marker numbers in the range 260-3700, shows that the haplotype-based sampling strategy can improve the genomic selection prediction accuracy for the main agronomic traits of soybean.
Embodiment 2. Verification in northern spring-sown soybean varieties that the haplotype-based sampling method improves the prediction accuracy of soybean genomic selection
The population used for verification consists of 240 northern spring soybean varieties from Heilongjiang, Jilin, Liaoning, Inner Mongolia, Shanxi, Hebei and Beijing (the 240 soybean varieties are all the soybean varieties in Tables 1 and 2 other than those listed in Table 5). 500 replicate experiments were carried out; in each replicate, 192 of the 240 soybean varieties were randomly assigned to the training population and the remaining 48 to the validation population.
Table 5
Middle yellow 24; Shandong beans 8; Middle yellow No. 3; Ji beans 12; Change lures 542; He beans 13; Henan beans 20; Shanxi is big by 70
Ten victory come into leaves; Iron stalk 1; Middle yellow 13; Five-pointed star No. 1; Shanxi beans 29; Gaofeng No.1; Zheng 90007; Middle yellow 28
No.1 is drawn in conjunction; Rich No. 1 of text; Middle yellow 19; Five-pointed star No. 2; Shandong beans 10; 84-51; Xu's beans 8; Section rich 14
Conjunction draws No. 2; Give beans No.1; Middle yellow 20; Handan beans 3; Shandong beans 11; Henan beans 15; Xu's beans 11; Middle product 03-5368
Henan beans 12; Precocity 17; Middle product 662; Handan beans 5; Neat tea beans 2; Henan beans 19; Southern agriculture 217; Middle product 95-5383
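The repeated random partition of the 240 varieties described before Table 5 can be sketched as follows. Varieties are represented only by their indices, and the function name and random seed are arbitrary choices for illustration.

```python
import numpy as np

def random_partition(n_total, n_train, rng):
    """Split indices 0..n_total-1 into a training and a validation set at random."""
    perm = rng.permutation(n_total)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])

rng = np.random.default_rng(4)
# 500 replicate experiments, each with 192 training and 48 validation varieties.
splits = [random_partition(240, 192, rng) for _ in range(500)]
```

Each replicate draws a fresh permutation, so the training and validation sets are disjoint within a replicate but differ from one replicate to the next.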
I. Multi-year, multi-site evaluation of the phenotypes of the soybean varieties and processing of the phenotypic data
Between 2008 and 2012, the main agronomic traits of the 240 soybean varieties were evaluated in Heilongjiang, Jilin, Inner Mongolia, Hebei, Henan, Shandong, Anhui, Hubei, Jiangxi and Guangxi. The field trials used a randomized block design with three replicates, 4-row plots and a row length of 5 m. Planting density and cultivation followed the conventional local soybean management practice at each test site. At harvest, 10 uniformly growing plants were randomly selected from the middle of each plot for trait measurement. The agronomic traits surveyed were plant height, effective branch number (branch number for short), pods per plant, seed weight per plant and 100-seed weight, and the mean was recorded as the result for the plot. The survey standards for the agronomic traits follow the "Soybean Germplasm Description Specification and Data Standard (2006)" compiled by Qiu Lijuan et al.
With soybean variety as a random effect, best linear unbiased prediction (BLUP) was used to estimate each agronomic trait (plant height, effective branch number, pods per plant, seed weight per plant and 100-seed weight), giving the unbiased estimate of each agronomic trait for each soybean variety. The BLUP values were calculated following the method of Henderson (1975): Henderson C R. Best linear unbiased estimation and prediction under a selection model[J]. Biometrics, 1975, 31(2): 423-447.
The broad-sense heritability of each agronomic trait was calculated (plant height, effective branch number, pods per plant, seed weight per plant and 100-seed weight).
H² = V(G)/V(P);
H² is the broad-sense heritability, V(G) is the genetic variance and V(P) is the phenotypic variance.
Heritability was calculated following the method of Fehr: Fehr W R. Principles of cultivar development[M]. Vol. Ⅰ, Theory and technique. Iowa State University, 1987, Macmillan Inc., New York.
II. Genomic selection using the haplotype sampling principle
1. A genome-wide scan is carried out (by the same method as Step II of Embodiment 1) to obtain the genotype data of the 5354 SNPs obtained in Step II of Embodiment 1.
2. Haplotype sampling strategy
(1) The linkage disequilibrium among the 5354 SNP markers was analysed with the Haploview 4.2 software. Data were imported in linkage format, and the linkage disequilibrium scan window was 500 kb. The procedure follows the Haploview user manual. Haplotype blocks were defined with the 95% confidence interval method of Gabriel et al. (2005). A total of 328 haplotype blocks were formed on the 20 chromosomes.
(2) SNPs forming a haplotype block are defined as "block-SNPs", and SNPs not located in any haplotype block are defined as "blank-SNPs". Of the 5354 SNPs, 1987 are block-SNPs and 3367 are blank-SNPs. One SNP marker is randomly selected from each haplotype block to represent that block and, together with all blank-SNP markers, forms a SNP marker set of 3695 SNPs.
3. The following equation is established for the relationship between agronomic trait phenotype and genotype: Y = μ1 + X × g + ε1.
Y is an n1-dimensional vector of the unbiased estimates of plant height of each soybean variety in the training population; μ1 is the mean of the unbiased estimates of plant height of the 192 soybean varieties making up the training population; X is an n1 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype (if a SNP has an A/G polymorphism, the homozygous genotypes AA and GG are coded ±1, with +1 and -1 assigned arbitrarily, and the genotype AG is coded 0); g is an m-dimensional vector of SNP effect values; ε1 is the residual. X × g expands into the terms Xij × gj, where i and j are natural numbers not exceeding n1 and m respectively, i.e. the genotype code Xi1 of the first SNP is paired with the effect value g1 of the first SNP, and so on. The test material is the 192 soybean varieties of the training population, and 3695 SNPs were obtained in step 2(2), so n1 = 192 and m = 3695. Entering Y (the unbiased estimates of plant height of the 192 soybean varieties making up the training population), μ1 (the mean of those unbiased estimates) and X (192 soybean varieties with 3695 SNPs each; X holds the SNP genotype codes "+1", "-1" or "0") gives the g value of each SNP for the plant height phenotype.
4. Each of the 48 soybean varieties in the validation population is processed as follows:
The genotype data of the 3695 SNPs of each soybean variety are converted into SNP genotype codes and, together with the g value of each SNP for the plant height phenotype obtained in step 3, entered into the following equation to obtain the y value, i.e. the predicted plant height phenotype: y = μ2 + x × g + ε2. μ2 is the mean of the unbiased estimates of plant height of the 48 soybean varieties making up the validation population; x is an n2 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype (if a SNP has an A/G polymorphism, the homozygous genotypes AA and GG are coded ±1, with +1 and -1 assigned arbitrarily, and the genotype AG is coded 0); g is an m-dimensional vector of SNP effect values; ε2 is the residual. x × g expands into the terms xij × gj, where i and j are natural numbers not exceeding n2 and m respectively, i.e. the genotype code xi1 of the first SNP is paired with the effect value g1 of the first SNP, and so on. The test material is the 48 soybean varieties of the validation population, and 3695 SNPs were obtained in step 2, so n2 = 48 and m = 3695. Entering μ2 (the mean of the unbiased estimates of plant height of the 48 soybean varieties making up the validation population), x (48 soybean varieties with 3695 SNPs each; x holds the SNP genotype codes "+1", "-1" or "0") and the g value of each SNP for the plant height phenotype obtained in step 3 gives the y values.
5. The unbiased estimates of plant height of the 48 soybean varieties in the validation population are taken as the observed plant height phenotypes.
6. The correlation coefficient (rMP) between the predicted and the observed plant height phenotypes is calculated.
7. The prediction accuracy for plant height (rGS) is calculated as rGS = rMP/h, where h is the square root of the broad-sense heritability of plant height obtained in Step I. The prediction accuracy for plant height is 0.7928 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for effective branch number is 0.5348 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for pods per plant is 0.5375 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for seed weight per plant is 0.1769 (mean of 500 replicate experiments).
By the same method, the prediction accuracy for 100-seed weight is 0.5282 (mean of 500 replicate experiments).
8. Control experiment (random sampling strategy)
Taking plant height as an example, the specific steps are as follows: starting from the 5354 SNPs obtained in item 1 of Step II, the number of markers used for soybean genomic selection is reduced stepwise by random sampling, in intervals of 10% of the marker number. The method otherwise follows steps 3 to 7. The number of replicates is set to 500.
The results of step 7 and step 8 are compared in Fig. 3. The results show that the sampling strategy based on haplotype analysis significantly improves the genomic selection prediction accuracy for soybean plant height, pods per plant, seed weight per plant and 100-seed weight, by 1.58%, 2.01%, 34.95% and 9.20% respectively. These results demonstrate that the marker screening method based on haplotype analysis proposed by this invention can effectively improve the prediction accuracy of soybean genomic selection.
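Step 3 of this embodiment obtains the SNP effect values g from Y = μ1 + X × g + ε1 without stating the solver. The sketch below uses a ridge-regression solve as a stand-in for the random regression BLUP model named in the claims. This is an assumption: the penalty `lam` is a hypothetical fixed value rather than one estimated from variance components, and all data are synthetic.

```python
import numpy as np

def estimate_snp_effects(X, Y, lam):
    """Ridge-type estimate of g in Y = mu1 + X g + e (illustrative stand-in).

    Uses the dual form g = X' (X X' + lam I)^-1 (Y - mu1), which only needs
    an n1 x n1 solve when there are far more SNPs than varieties.
    """
    X = np.asarray(X, dtype=float)
    mu1 = Y.mean()
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(X.shape[0]), Y - mu1)
    return mu1, X.T @ alpha

# Synthetic training data with the dimensions of this embodiment.
rng = np.random.default_rng(5)
n1, m = 192, 3695
X = rng.choice([-1, 0, 1], size=(n1, m), p=[0.48, 0.04, 0.48])
Y = 80.0 + X @ rng.normal(0.0, 0.05, size=m) + rng.normal(0.0, 2.0, size=n1)

mu1, g = estimate_snp_effects(X, Y, lam=100.0)
```

The dual form gives the same g as solving the m × m system (X'X + lam I) g = X'(Y − μ1), but with 192 varieties and 3695 SNPs it is much cheaper.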

Claims (9)

1. A method for establishing a model for predicting an agronomic trait of soybean, comprising the following step: the haplotype sampling principle is used in place of the genome-wide SNP sampling principle, and the model for predicting the agronomic trait of soybean is established from the SNP marker set obtained according to the haplotype sampling principle.
2. The method according to claim 1, characterized in that the agronomic trait is plant height, effective branch number, pods per plant, seed weight per plant or 100-seed weight.
3. The method according to claim 1 or 2, characterized in that the "haplotype sampling principle" is: the linkage disequilibrium among all SNP markers is analysed; SNPs forming a haplotype block are defined as "block-SNPs", and SNPs not located in any haplotype block are defined as "blank-SNPs"; one SNP marker is randomly selected from each haplotype block to represent that block and, together with all blank-SNP markers, forms a SNP marker set consisting of m SNP markers.
4. The method according to claim 1 or 2, characterized in that the "haplotype sampling principle" is implemented by the following steps in order:
① A genome-wide scan is carried out on each soybean in the training population to obtain the genotype data of all SNP markers; the training population consists of n1 or more soybeans; n1 is a natural number of 50 or more;
② The linkage disequilibrium among all SNP markers is analysed; SNPs forming a haplotype block are defined as "block-SNPs", and SNPs not located in any haplotype block are defined as "blank-SNPs"; one SNP marker is randomly selected from each haplotype block to represent that block and, together with all blank-SNP markers, forms a SNP marker set consisting of m SNP markers.
5. Use of the method according to any one of claims 1 to 4 in predicting an agronomic trait of soybean.
6. A method for screening soybeans having a target agronomic trait from a population to be tested, based on a model established on a training population, comprising the following steps:
(1) The unbiased estimate of the agronomic trait of each soybean in the training population is obtained; the training population consists of n1 or more soybeans; n1 is a natural number of 50 or more;
(2) A genome-wide scan is carried out on each soybean in the training population to obtain the genotype data of all SNP markers; the linkage disequilibrium among all SNP markers is analysed; SNPs forming a haplotype block are defined as "block-SNPs", and SNPs not located in any haplotype block are defined as "blank-SNPs"; one SNP marker is randomly selected from each haplotype block to represent that block and, together with all blank-SNP markers, forms a SNP marker set consisting of m SNP markers;
(3) Based on the training population, the following equation A is established for the relationship between the agronomic trait and genotype: Y = μ1 + X × g + ε1; Y is an n1-dimensional vector of the unbiased estimates of the agronomic trait of each soybean in the training population; μ1 is the mean of the unbiased estimates of the agronomic trait of the soybeans making up the training population; X is an n1 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε1 is the residual; the corresponding data from step (1) and the genotype data of each SNP marker in the SNP marker set, converted into genotype codes, are entered into equation A to obtain the g value of each SNP for the agronomic trait;
(4) Based on the g values obtained in step (3), the predicted value of the agronomic trait of each soybean in the validation population is obtained with a random regression best linear unbiased prediction model, so as to screen from the validation population the soybeans whose agronomic trait meets the expected standard; the validation population consists of n2 or more soybeans; n2 is a natural number of 5 or more.
7. The method according to claim 6, characterized in that the agronomic trait is plant height, effective branch number, pods per plant, seed weight per plant or 100-seed weight.
8. A method for screening soybeans having a target agronomic trait from a population to be tested, based on a model established on a training population, comprising the following steps:
(1) The unbiased estimate of the agronomic trait of each soybean in the training population is obtained; the training population consists of n1 or more soybeans; n1 is a natural number of 50 or more;
(2) A genome-wide scan is carried out on each soybean in the training population to obtain the genotype data of all SNP markers; the linkage disequilibrium among all SNP markers is analysed; SNPs forming a haplotype block are defined as "block-SNPs", and SNPs not located in any haplotype block are defined as "blank-SNPs"; one SNP marker is randomly selected from each haplotype block to represent that block and, together with all blank-SNP markers, forms a SNP marker set consisting of m SNP markers;
(3) Based on the training population, the following equation A is established for the relationship between the agronomic trait and genotype: Y = μ1 + X × g + ε1; Y is an n1-dimensional vector of the unbiased estimates of the agronomic trait of each soybean in the training population; μ1 is the mean of the unbiased estimates of the agronomic trait of the soybeans making up the training population; X is an n1 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε1 is the residual; the corresponding data from step (1) and the genotype data of each SNP marker in the SNP marker set, converted into genotype codes, are entered into equation A to obtain the g value of each SNP for the agronomic trait;
(4) The mean of the unbiased estimates of the agronomic trait of all soybeans in the validation population is obtained; the validation population consists of n2 or more soybeans; n2 is a natural number of 5 or more;
(5) A genome-wide scan is carried out on each soybean in the validation population to obtain the genotype data of each SNP in the SNP marker set;
(6) The following equation B is established: y = μ2 + x × g + ε2; μ2 is the mean of the unbiased estimates of the agronomic trait of the soybeans making up the validation population; x is an n2 × m matrix of SNP genotype codes taking the values -1, 0 and +1, where ±1 denotes a homozygous genotype and 0 a heterozygous genotype; g is an m-dimensional vector of SNP effect values; ε2 is the residual; the corresponding data from step (4) and the SNP genotype data obtained in step (5), converted into SNP genotype codes, are entered into equation B to obtain the y values, which are the predicted values of the agronomic trait of each soybean in the validation population;
(7) Based on the y values obtained in step (6), the soybeans whose agronomic trait meets the expected standard are screened from the population to be tested.
9. The method according to claim 8, characterized in that the agronomic trait is plant height, effective branch number, pods per plant, seed weight per plant or 100-seed weight.
CN201610349022.8A 2016-05-24 2016-05-24 A kind of full genome system of selection and its application that prediction Soybean Agronomic Characters phenotype is sampled based on haplotype Pending CN107419000A (en)

Publications (1)

Publication Number Publication Date
CN107419000A (en) 2017-12-01

Family

ID=60422724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610349022.8A Pending CN107419000A (en) 2016-05-24 2016-05-24 A kind of full genome system of selection and its application that prediction Soybean Agronomic Characters phenotype is sampled based on haplotype

Country Status (1)

Country Link
CN (1) CN107419000A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109273052A (en) * 2018-09-13 2019-01-25 北京百迈客生物科技有限公司 A kind of genome monoploid assembling method and device
CN109727641A (en) * 2019-01-22 2019-05-07 袁隆平农业高科技股份有限公司 A kind of full-length genome prediction technique and device
CN110400597A (en) * 2018-04-23 2019-11-01 成都二十三魔方生物科技有限公司 A kind of genetype for predicting method based on deep learning
CN111676308A (en) * 2020-06-01 2020-09-18 中国农业科学院作物科学研究所 QTL (quantitative trait locus) and SNP (Single nucleotide polymorphism) marker related to quantitative traits of soybean branches and application
CN111798920A (en) * 2020-07-14 2020-10-20 云南省烟草农业科学研究院 Tobacco economic trait phenotypic value prediction method based on whole genome selection and application
CN112102880A (en) * 2020-10-19 2020-12-18 北京诺禾致源科技股份有限公司 Method for identifying variety, and method and device for constructing prediction model thereof
CN112233722A (en) * 2020-10-19 2021-01-15 北京诺禾致源科技股份有限公司 Method for identifying variety, and method and device for constructing prediction model thereof
CN112746121A (en) * 2020-12-31 2021-05-04 中国科学院东北地理与农业生态研究所 SNP locus combination related to soybean agronomic traits, gene chip and application
CN112852989A (en) * 2020-12-31 2021-05-28 中国科学院东北地理与农业生态研究所 SNP locus combination related to soybean agronomic traits, liquid phase gene chip and application

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150181822A1 (en) * 2013-12-31 2015-07-02 Dow Agrosciences Llc Selection based on optimal haploid value to create elite lines

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150181822A1 (en) * 2013-12-31 2015-07-02 Dow Agrosciences Llc Selection based on optimal haploid value to create elite lines

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A. H. SALLAMA, ET AL.: "Assessing Genomic Selection Prediction Accuracy in a Dynamic Barley Breeding Population", 《THE PLANT GENOME》 *
MA, YANSONG ET AL.: "Potential of marker selection to increase prediction accuracy of genomic selection in soybean (Glycine max L.)", 《MOLECULAR BREEDING(2016)》 *
R. VERONEZE,ET AL.: "Linkage disequilibrium and haplotype block structure in six commercial pig lines", 《JOURNAL OF ANIMAL SCIENCE》 *
马岩松等: "群体构成方式对大豆百粒重全基因组选择预测准确度的影响", 《作物学报》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110400597A (en) * 2018-04-23 2019-11-01 成都二十三魔方生物科技有限公司 A kind of genetype for predicting method based on deep learning
CN109273052B (en) * 2018-09-13 2022-03-18 北京百迈客生物科技有限公司 Genome haploid assembling method and device
CN109273052A (en) * 2018-09-13 2019-01-25 北京百迈客生物科技有限公司 A genome haploid assembly method and device
CN109727641A (en) * 2019-01-22 2019-05-07 袁隆平农业高科技股份有限公司 A whole-genome prediction method and device
CN109727641B (en) * 2019-01-22 2021-03-23 隆平农业发展股份有限公司 Whole genome prediction method and device
CN111676308A (en) * 2020-06-01 2020-09-18 中国农业科学院作物科学研究所 QTL (quantitative trait locus) and SNP (Single nucleotide polymorphism) marker related to quantitative traits of soybean branches and application
CN111798920A (en) * 2020-07-14 2020-10-20 云南省烟草农业科学研究院 Tobacco economic trait phenotypic value prediction method based on whole genome selection and application
CN111798920B (en) * 2020-07-14 2023-10-20 云南省烟草农业科学研究院 Tobacco economic character phenotype value prediction method based on whole genome selection and application
CN112233722A (en) * 2020-10-19 2021-01-15 北京诺禾致源科技股份有限公司 Method for identifying variety, and method and device for constructing prediction model thereof
CN112102880A (en) * 2020-10-19 2020-12-18 北京诺禾致源科技股份有限公司 Method for identifying variety, and method and device for constructing prediction model thereof
CN112233722B (en) * 2020-10-19 2024-01-30 北京诺禾致源科技股份有限公司 Variety identification method, and method and device for constructing prediction model thereof
CN112852989A (en) * 2020-12-31 2021-05-28 中国科学院东北地理与农业生态研究所 SNP locus combination related to soybean agronomic traits, liquid phase gene chip and application
CN112746121A (en) * 2020-12-31 2021-05-04 中国科学院东北地理与农业生态研究所 SNP locus combination related to soybean agronomic traits, gene chip and application
CN112746121B (en) * 2020-12-31 2023-08-18 中国科学院东北地理与农业生态研究所 SNP locus combination related to soybean agronomic traits, gene chip and application

Similar Documents

Publication Publication Date Title
CN107419000A (en) A genomic selection method for predicting soybean agronomic trait phenotypes based on haplotype sampling, and its application
CN107278877B (en) A whole-genome selection application method for corn seed-producing rate
Singh et al. Evaluation of microsatellite markers for genetic diversity analysis among sugarcane species and commercial hybrids
CN106676172B (en) 212 tomato SNP sites and their application in identifying tomato variety authenticity and seed purity
CN103789306B (en) SNP marker of the rice blast resistance gene Pia, and method and application thereof
CN102395678B (en) Major QTL of maize stalk rot resistance, molecular markers linked with the same and uses thereof
Locatelli et al. Genome-wide association mapping of agronomic traits in relevant barley germplasm in Uruguay
Kim et al. Genome-wide SNP discovery and core marker sets for DNA barcoding and variety identification in commercial tomato cultivars
Ademe et al. Association mapping analysis of fiber yield and quality traits in Upland cotton (Gossypium hirsutum L.)
CN102140506B (en) Molecular marker linked with gummy stem blight resistance gene Gsb-2 and application thereof
Parisi et al. Phenotypic and molecular diversity in a collection of ‘Pomodoro di Sorrento’ Italian tomato landrace
Liao et al. Aus rice root architecture variation contributing to grain yield under drought suggests a key role of nodal root diameter class
CN105238866B (en) One SNP site related to upland cotton Early mature apricot and its application
CN105734057A (en) SSR mark linked with pseudoperonospora cubensis resistance main effect QTL and application of SSR mark
Aravanopoulos et al. Population and conservation genomics in forest and fruit trees
CN105063201A (en) Molecular marker of corn chromosome 9 ear row number major QTL and application thereof
CN101824479B (en) SCAR marker of sorghum head smut resistance germ No. 3 physiological strain
CN106011136B (en) SSR (simple sequence repeat) marker related to ramie yield and application thereof
Ochieng et al. Genetic variation within two sympatric spotted gum eucalypts exceeds between taxa variation
CN111118192B (en) KASP molecular marker of wheat ear base small ear fruition main effect QTL and application thereof
CN108060247B (en) Haplotype related to upland cotton No. 8 chromosome fiber strength
CN107254535B (en) SNP molecular marker related to salt tolerance of corn and application thereof
CN105018487A (en) Molecular marker for major QTL of chromosome-3 ear row number of corn and application thereof
Ongom et al. A mid-density single-nucleotide polymorphism panel for molecular applications in cowpea (Vigna unguiculata (L.) Walp)
CN110257553A (en) A KASP molecular marker method for identifying the rice blast resistance gene Pigm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination