CN113345520A - Richards equation-based Quantitative Trait Locus (QTL) positioning frame method for trees

Info

Publication number: CN113345520A
Authority: CN (China)
Legal status: Granted; Expired - Fee Related (as listed; the legal status is an assumption and is not a legal conclusion)
Application number: CN202110629578.3A
Other languages: Chinese (zh)
Other versions: CN113345520B (en)
Inventors: 张晓宇, 龚慧莹, 姜立波, 邬荣领
Current Assignee: Beijing Forestry University
Original Assignee: Beijing Forestry University
Application filed by: Beijing Forestry University
Priority: CN202110629578.3A (granted publication CN113345520B)
Prior art keywords: qtl, equation, growth, tree, quantitative

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/30 Detection of binding sites or motifs
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B5/00 ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical & Material Sciences (AREA)
  • Physiology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a Richards equation-based framework for mapping quantitative trait loci (QTLs) in trees and belongs to the technical field of bioinformatics analysis. The method fits the growth curve of a tree quantitative trait with the Richards growth equation to obtain Richards parameter estimates for the trait, and obtains the structural parameters of an SAD(1) model from the correlation of the trait phenotype over time; performs functional mapping based on differences in the growth parameters among genotypes to screen for significant QTLs that regulate growth of the quantitative trait; and calculates the genetic effect values and heritability of the significant QTLs, builds a regulatory network among the QTLs, and identifies the key regulatory QTLs that explain the growth process of the quantitative trait. The method is biologically meaningful and maps QTLs with high precision, and the genetic-effect regulatory network it builds offers an effective means of analyzing genetic architecture, making the analysis of trait growth more comprehensive.

Description

Richards equation-based Quantitative Trait Locus (QTL) positioning frame method for trees
Technical Field
The invention belongs to the technical field of bioinformatics analysis, and particularly relates to a Richards equation-based framework for mapping quantitative trait loci (QTLs) in trees.
Background
Tree biomass is a basic raw material for forestry production and has important ecological and economic value. The growth potential and developmental characteristics of trees are mainly reflected in the accumulation of quantitative traits such as diameter, stem height, and timber volume. Such traits show no clear boundaries between contrasting phenotypes, and differences among individuals can only be distinguished quantitatively. A quantitative trait of trees is jointly controlled by multiple genes, and a chromosomal locus that controls the expression of a quantitative trait is called a quantitative trait locus (QTL). Establishing the biological association between quantitative trait phenotypes and genotypes, and screening for QTLs that regulate trait growth, plays an important role in revealing the growth mechanism of trees at the genetic level. Studying how significant QTLs genetically control quantitative traits, and establishing the regulatory structure among QTLs, can provide a reliable basis for molecular breeding of trees.
Among existing QTL mapping methods, functional mapping combines a biologically meaningful dynamic curve of trait development with a statistical model to locate important QTLs that control the developmental trajectory of a dynamic trait. For example, in 2018 Wangping et al. analyzed the epistatic interaction mechanisms underlying dynamic growth data for four phenotypes (stem height, main root length, total lateral root length, and lateral root number) at the seedling stage of Populus diversifolia, based on functional mapping and a 2HIGWAS main-effect model. In addition, in 2017 Zhang et al. studied the genetic architecture of the shoots and roots of Populus euphratica during development, and detected heterochronic QTLs located in candidate gene regions by fitting a growth equation and using the heterogeneity parameters as phenotypes.
However, current research does not comprehensively establish the genetic mechanism of key quantitative traits of trees: it lacks in-depth analysis of how significant QTLs genetically control growth and of the structure of the regulatory network among QTLs.
Disclosure of Invention
In view of the above, the present invention aims to provide a Richards equation-based QTL mapping framework for quantitative traits of trees, so as to establish a complete system for analyzing tree quantitative traits.
The invention provides a Richards equation-based framework for mapping quantitative trait loci (QTLs) in trees, comprising the following steps:
1) fitting the Richards equation to quantitative trait phenotype data of woody plant sample individuals grown in the same environment, and searching for the estimated parameters of the equation;
2) performing genome-wide single nucleotide polymorphism (SNP) genotyping of each sample individual to obtain its genotype;
3) taking the genotype and phenotype data of the woody plant sample individuals as analysis data, and applying the estimated parameters from step 1) within the functional mapping model framework to locate the significant QTLs that regulate growth of the tree quantitative trait;
4) calculating the genetic effect values of the quantitative trait for the significant QTLs from step 3);
5) establishing linear relationships among the different significant QTLs from the genetic effect values in step 4) to obtain a QTL regulatory network;
6) using the QTL regulatory network from step 5) to identify the key regulatory QTLs that explain the growth process of the tree quantitative trait.
Preferably, the Richards equation in step 1) is as shown in equation (3):
[Equation (3): rendered as an image in the original; not reproduced in this text]
where $b_1$ represents the growth threshold; $b_2$ represents a shape parameter; $b_3$ is a parameter related to growth rate; $y$ represents the fitted value of the quantitative trait; and $t$ represents time in years.
Preferably, the fitting of the Richards equation in step 1) is performed by the least squares method;
the estimated parameters are searched for with the BFGS quasi-Newton method;
the estimated parameters are the Richards equation parameters that minimize the residual sum of squares between the observed phenotypic values and the fitted values.
Preferably, the functional mapping model framework in step 3) determines whether a genetic locus affects growth of the quantitative trait through the following hypothesis test:
First, the hypotheses are established:
$$H_0: \Theta_j = \Theta \quad \text{versus} \quad H_1: \Theta_j \neq \Theta, \quad j = 1, \dots, J \qquad (4)$$
Under the null hypothesis $H_0$, the locus does not affect growth of the quantitative trait, that is, growth does not differ among genotypes, so the estimated parameters are equal across genotypes and are denoted $\Theta$;
under the alternative hypothesis $H_1$, the locus affects growth of the quantitative trait, that is, growth differs among tree samples with different genotypes, so the estimated parameters are not equal across genotypes; the number of genotypes at the locus is $J$, and the estimated parameters for trees with genotype $j$ are denoted $\Theta_j$ ($j = 1, \dots, J$);
Second, the likelihood functions $L_0(y)$ and $L_1(y)$ are established under the null and alternative hypotheses, respectively:
$$L_0(y) = \prod_{i=1}^{n} f(y_i; \Theta, \Psi) \qquad (5)$$
$$L_1(y) = \prod_{j=1}^{J} \prod_{i=1}^{n_j} f_j(y_i; \Theta_j, \Psi) \qquad (6)$$
where $n$ is the number of tree samples and $n_j$ is the number of tree samples with genotype $j$; $y_i = (y_i(1), \dots, y_i(T))$ is the growth of the quantitative trait of tree sample individual $i$ at observation times $1, \dots, T$; the tree samples as a whole approximately follow a normal distribution; under the null hypothesis the mean of this distribution is $\mu = (\mu(1), \dots, \mu(T))$, constructed from the Richards equation parameters $\Theta$; the mean for trees with genotype $j$ is $\mu_j = (\mu_j(1), \dots, \mu_j(T))$, constructed from the Richards equation parameters $\Theta_j$; $f(y_i; \Theta, \Psi)$ is the probability density function of the normal distribution under the null hypothesis, and $f_j(y_i; \Theta_j, \Psi)$ is the probability density function of the normal distribution for genotype $j$ under the alternative hypothesis; $\Psi$ denotes the structural parameters used to construct the covariance matrix of the normal distribution;
the likelihood ratio statistic LR is then established from the likelihood functions $L_0(y)$ and $L_1(y)$:
$$LR = -2\left(\log L_0(y) - \log L_1(y)\right) \qquad (7)$$
LR approximately follows a $\chi^2$ distribution whose degrees of freedom equal the difference in the number of parameters between the alternative and null hypotheses;
a p value is determined from the statistic LR and a statistical inference is made: the rejection region of LR is $W = \{LR \geq c\}$, where the critical value $c$ satisfies $P(LR \geq c) \leq \alpha$; at test level $\alpha$, if $p \leq \alpha$, LR falls in the rejection region, so the null hypothesis $H_0$ is rejected and the alternative hypothesis $H_1$ is accepted; growth of the trait is then judged to differ among genotypes at the locus at level $\alpha$, and the locus is a significant QTL.
Preferably, the covariance matrix of the normal distribution followed by the trees under both the null and alternative hypotheses is constructed with a first-order structured antedependence model, SAD(1);
the structural parameters $\Psi$ of the SAD(1) model comprise the innovation variance $\gamma^2$ and the first-order antedependence parameter $\phi$;
the covariance matrix is denoted $\Sigma$:
[Equation (8): rendered as an image in the original; not reproduced in this text]
each element of $\Sigma$ depends on time $t$; the diagonal elements are the variances of the quantitative trait of the tree samples at each time and are constructed by equation (9); the off-diagonal elements are the covariances of the quantitative trait between different times and are constructed by equation (10);
[Equation (9): rendered as an image in the original; not reproduced in this text]
[Equation (10): rendered as an image in the original; not reproduced in this text]
Preferably, the genetic effect values of the quantitative trait in step 4) comprise an additive effect and a dominant effect.
Preferably, the additive effect is calculated as shown in equation (11), and the dominant effect as shown in equation (12):
[Equation (11): rendered as an image in the original; not reproduced in this text]
[Equation (12): rendered as an image in the original; not reproduced in this text]
where $\mu_0(t)$, $\mu_1(t)$, and $\mu_2(t)$ represent the mean effects of the three QTL genotypes QQ, Qq, and qq, respectively.
Preferably, the linear relationship among the different significant QTLs in step 5) is as shown in equation (13), which represents the regulatory relationship between the i-th significant QTL and the other p-1 QTLs:
$$E_i = \beta_1 E_1 + \dots + \beta_{i-1} E_{i-1} + \beta_{i+1} E_{i+1} + \dots + \beta_p E_p + \varepsilon \qquad (13)$$
where $E_i$ is the genetic effect value of the i-th significant QTL, $i = 1, \dots, p$; $\varepsilon \sim N(0, \sigma^2)$; and $\beta_1, \dots, \beta_{i-1}, \beta_{i+1}, \dots, \beta_p$ and $\sigma^2$ are unknown parameters;
if $(E_1(t), \dots, E_{i-1}(t), E_i(t), E_{i+1}(t), \dots, E_p(t))$, $t = 1, \dots, T$, are the calculated genetic effect values at $T$ time points, then equation (13) can be expressed in matrix form as follows:
[Matrix form of equation (13): rendered as an image in the original; not reproduced in this text]
where the $\varepsilon_i \sim N(0, \sigma^2)$ are independent and identically distributed; $\varepsilon$ is a p-dimensional error vector satisfying
$E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2 I_n$;
$\beta$ is a p-dimensional parameter vector, and whether a regulatory relationship exists between significant QTLs is judged by testing the parameter vector of the regression equation: when an element $\beta_j$, $j \in \{1, \dots, i-1, i+1, \dots, p\}$, of the parameter vector is 0, there is no regulatory relationship between significant QTL i and QTL j; otherwise, a regulatory relationship exists.
Preferably, in step 6) the key regulatory QTLs that explain the growth process of the tree quantitative trait are identified by counting, for each QTL, the number of other QTLs it regulates; QTLs that regulate a larger number of other QTLs are called key regulatory QTLs. The number of key regulatory QTLs depends on the total number of QTLs in the regulatory network, and the selection criterion in the invention is the top 5% of all QTLs in the network.
Preferably, heritability is calculated for the significant QTLs regulating growth of the tree quantitative trait located in step 3); heritability is the ratio of the genetic variance of each QTL to the phenotypic variance of the quantitative trait and measures the contribution of the QTL's genetic factors to phenotypic differences; when heritability exceeds 10% (a criterion set by statistical comparison of the heritability curves), the QTL contributes substantially to phenotypic differences;
heritability is calculated as shown in equation (14):
$$H^2 = \frac{V_A + V_D}{V_P} \qquad (14)$$
where $V_A$ is the additive variance, $V_D$ is the dominance variance, and $V_P = V_A + V_D + V_E$ is the phenotypic variance, composed of the genetic and environmental variances.
The invention provides a Richards equation-based framework for mapping quantitative trait loci (QTLs) in trees. The highly flexible Richards equation is fitted to the growth curve of a tree quantitative trait to obtain the estimated parameters of the trait; the functional mapping model framework is then used to locate the significant QTLs regulating growth of the quantitative trait, which improves computational efficiency without loss of precision; finally, a genetic regulatory network among the QTLs is built from the genetic effects of the significant QTLs to identify all key regulatory QTLs that explain the growth process of the quantitative trait. The invention thus provides an effective method for analyzing the interrelationships among genetic loci.
Drawings
FIG. 1 is the Richards growth curve of poplar stem height in Example 1;
FIG. 2 is a Manhattan plot of the significance tests for stem height growth at the test-cross SNPs and hybrid SNPs on the 19 chromosomes of the poplar genome in Example 1;
FIG. 3 shows how the genetic effects and heritability of poplar stem height change over time in Example 1; FIG. 3(A) is the additive effect curve of the hybrid QTLs, FIG. 3(B) is the dominant effect curve of the hybrid QTLs, FIG. 3(C) is the heritability curve of the hybrid QTLs, FIG. 3(D) is the additive effect curve of the test-cross QTLs, and FIG. 3(E) is the heritability curve of the test-cross QTLs;
FIG. 4 is the regulatory network of the significant QTLs regulating poplar stem height growth in Example 1; FIG. 4(A) is the regulatory network among the additive effects of the hybrid QTLs, FIG. 4(B) is the regulatory network among the dominant effects of the hybrid QTLs, and FIG. 4(C) is the regulatory network among the additive effects of the test-cross QTLs;
FIG. 5 is the Richards growth curve of poplar diameter in Example 2;
FIG. 6 is a Manhattan plot of the significance tests for diameter growth at the test-cross SNPs and hybrid SNPs on the 19 chromosomes of the poplar genome in Example 2;
FIG. 7 shows how the genetic effects and heritability of poplar diameter change over time in Example 2; FIG. 7(A) is the additive effect curve of the hybrid QTLs, FIG. 7(B) is the dominant effect curve of the hybrid QTLs, FIG. 7(C) is the heritability curve of the hybrid QTLs, FIG. 7(D) is the additive effect curve of the test-cross QTLs, and FIG. 7(E) is the heritability curve of the test-cross QTLs;
FIG. 8 is the regulatory network of the significant QTLs regulating poplar diameter growth in Example 2; FIG. 8(A) is the regulatory network among the additive effects of the hybrid QTLs, FIG. 8(B) is the regulatory network among the dominant effects of the hybrid QTLs, and FIG. 8(C) is the regulatory network among the additive effects of the test-cross QTLs.
Detailed Description
The invention provides a Richards equation-based framework for mapping quantitative trait loci (QTLs) in trees, comprising the following steps:
1) fitting the Richards equation to quantitative trait phenotype data of woody plant sample individuals grown in the same environment, and searching for the estimated parameters of the equation;
2) performing genome-wide single nucleotide polymorphism (SNP) genotyping of each sample individual to obtain its genotype;
3) taking the genotype and phenotype data of the woody plant sample individuals as analysis data, and applying the estimated parameters from step 1) within the functional mapping model framework to locate the significant QTLs that regulate growth of the tree quantitative trait;
4) calculating the genetic effect values of the quantitative trait for the significant QTLs from step 3);
5) establishing linear relationships among the different significant QTLs from the genetic effect values in step 4) to obtain a QTL regulatory network;
6) using the QTL regulatory network from step 5) to identify the key regulatory QTLs that explain the growth process of the tree quantitative trait.
The method preferably takes individuals of the same woody plant species grown in the same or a similar environment as samples and measures the growth of each sample individual over several consecutive years to obtain the quantitative trait phenotype data.
The method provided by the invention is applicable to all kinds of woody plants. In the examples of the present invention, poplar is analyzed as a representative woody plant to illustrate specific embodiments, but this should not be construed as limiting the invention. The number of samples is preferably more than 50, and more preferably 65 to 150. The number of years is preferably 6 to 20, and more preferably 10 to 15. A quantitative trait is a trait that shows no clear boundaries between contrasting phenotypes, whose differences among individuals can only be distinguished quantitatively, and whose variation is continuous.
After the quantitative trait phenotype data are obtained, they are used to fit the Richards equation, and the estimated parameters of the equation are obtained by searching.
In the present invention, the Richards equation is preferably as shown in equation (3):
[Equation (3): rendered as an image in the original; not reproduced in this text]
where $b_1$ represents the growth threshold; $b_2$ represents a shape parameter; $b_3$ is a parameter related to growth rate; $y$ represents the fitted value of the quantitative trait; and $t$ represents time in years.
In the present invention, the Richards equation is preferably derived as follows:
the growth rate equation of the tree quantitative trait is shown in equation (1):
$$\frac{dy}{dt} = a y^m - b y \qquad (1)$$
where $y = y(t)$ is the total growth of the tree quantitative trait and $dy/dt$ is its growth rate.
According to the biological characteristics of tree growth, the growth rate is divided into an assimilation rate $a y^m$ and a dissimilation rate $b y$, where $a$ is the assimilation coefficient ($a > 0$), $m$ is the assimilation power exponent ($m < 1$), and $b$ is the dissimilation coefficient ($b > 0$);
a particular solution of the growth rate differential equation (1) is obtained, as shown in equation (2):
[Equation (2): rendered as an image in the original; not reproduced in this text]
Let
[definition rendered as an image in the original; not reproduced in this text]
$b_2 = (1 - m)b$,
[definition rendered as an image in the original; not reproduced in this text]
The Richards growth equation is then obtained.
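The derivation outlined above can be sketched as follows. This is a hedged reconstruction, not the patent's own equations: the images for equations (2) and (3) are unavailable in this text, so the sketch assumes that the growth rate is assimilation minus dissimilation, that the particular solution uses the initial condition $y(0) = 0$, and that $b_1$ and $b_3$ are defined so as to be consistent with the stated $b_2 = (1 - m)b$.

```latex
% Assumption: equation (1) is dy/dt = a y^m - b y with a > 0, b > 0, m < 1.
\begin{align*}
u &= y^{1-m}
  &&\Rightarrow\quad \frac{du}{dt} = (1-m)\,y^{-m}\,\frac{dy}{dt} = (1-m)(a - b\,u), \\
u(t) &= \frac{a}{b} - C e^{-(1-m) b t}
  &&\text{(general solution of the linear equation in } u\text{)}, \\
y(t) &= \left[\frac{a}{b}\left(1 - e^{-(1-m) b t}\right)\right]^{\frac{1}{1-m}}
  &&\text{(particular solution, assuming } y(0) = 0 \text{, so } C = a/b\text{)}.
\end{align*}
% Assumed substitutions: b_1 = (a/b)^{1/(1-m)}, b_2 = (1-m) b (as in the text), b_3 = 1/(1-m).
\[
y(t) = b_1 \left(1 - e^{-b_2 t}\right)^{b_3}
\]
```

In this form $b_1$ is the asymptote approached as $t \to \infty$. Whether the patent assigns the "shape" and "rate" roles to $b_2$ and $b_3$ in exactly this way cannot be confirmed from the text alone.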
In the present invention, the Richards equation is preferably fitted by the least squares method, and the estimated parameters are searched for with the BFGS quasi-Newton method. The estimated parameters are the Richards equation parameters that minimize the residual sum of squares between the observed phenotypic values and the fitted values; specifically, this is implemented with optim, the general-purpose optimization function in the R language.
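A minimal sketch of the least-squares objective described above, written in Python for illustration rather than in R. It assumes the common three-parameter Richards form y(t) = b1 (1 - exp(-b2 t))^b3, since the image for equation (3) is not available here; the function names and the synthetic data are hypothetical.

```python
import math

def richards(t, b1, b2, b3):
    """Assumed Richards form: y(t) = b1 * (1 - exp(-b2 * t)) ** b3."""
    return b1 * (1.0 - math.exp(-b2 * t)) ** b3

def residual_sum_of_squares(params, times, observed):
    """Sum of squared differences between observed and fitted values."""
    b1, b2, b3 = params
    return sum((y - richards(t, b1, b2, b3)) ** 2 for t, y in zip(times, observed))

# Synthetic, noise-free observations for 14 yearly measurements (illustrative only).
times = list(range(1, 15))
true_params = (20.0, 0.3, 2.0)
observed = [richards(t, *true_params) for t in times]

rss_at_truth = residual_sum_of_squares(true_params, times, observed)
rss_off = residual_sum_of_squares((18.0, 0.3, 2.0), times, observed)
```

In practice an objective like this would be handed to a quasi-Newton optimizer; in R, `optim` accepts `method = "BFGS"` for that purpose.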
The invention performs genome-wide single nucleotide polymorphism (SNP) genotyping of each sample individual to obtain its genotype.
The SNP genotyping method is not particularly limited in the present invention. In the examples, SNP genotyping of the representative poplar samples used the Applied Biosystems (Foster City, CA, USA) QuantStudio 12K Flex Real-Time PCR System. SNP genotypes were determined after quality control screening.
After the phenotype data and genotypes are obtained, the genotype and phenotype data of each sample individual are used as analysis data, and the functional mapping model framework is used to locate the significant QTLs that regulate growth of the tree quantitative trait.
In the present invention, the functional mapping model framework determines whether a genetic locus affects growth of the quantitative trait through the following hypothesis test:
First, the hypotheses are established:
$$H_0: \Theta_j = \Theta \quad \text{versus} \quad H_1: \Theta_j \neq \Theta, \quad j = 1, \dots, J \qquad (4)$$
Under the null hypothesis $H_0$, the locus does not affect growth of the quantitative trait, that is, growth does not differ among genotypes, so the estimated parameters are equal across genotypes and are denoted $\Theta$;
under the alternative hypothesis $H_1$, the locus affects growth of the quantitative trait, that is, growth differs among tree samples with different genotypes, so the estimated parameters are not equal across genotypes; the number of genotypes at the locus is $J$, and the estimated parameters for trees with genotype $j$ are denoted $\Theta_j$ ($j = 1, \dots, J$);
Second, the likelihood functions $L_0(y)$ and $L_1(y)$ are established under the null and alternative hypotheses, respectively:
$$L_0(y) = \prod_{i=1}^{n} f(y_i; \Theta, \Psi) \qquad (5)$$
$$L_1(y) = \prod_{j=1}^{J} \prod_{i=1}^{n_j} f_j(y_i; \Theta_j, \Psi) \qquad (6)$$
where $n$ is the number of tree samples and $n_j$ is the number of tree samples with genotype $j$; $y_i = (y_i(1), \dots, y_i(T))$ is the growth of the quantitative trait of tree sample individual $i$ at observation times $1, \dots, T$; the tree samples as a whole approximately follow a normal distribution; under the null hypothesis the mean of this distribution is $\mu = (\mu(1), \dots, \mu(T))$, constructed from the Richards equation parameters $\Theta$; the mean for trees with genotype $j$ is $\mu_j = (\mu_j(1), \dots, \mu_j(T))$, constructed from the Richards equation parameters $\Theta_j$; $f(y_i; \Theta, \Psi)$ is the probability density function of the normal distribution under the null hypothesis, and $f_j(y_i; \Theta_j, \Psi)$ is the probability density function of the normal distribution for genotype $j$ under the alternative hypothesis; $\Psi$ denotes the structural parameters used to construct the covariance matrix of the normal distribution;
the likelihood ratio statistic LR is then established from the likelihood functions $L_0(y)$ and $L_1(y)$:
$$LR = -2\left(\log L_0(y) - \log L_1(y)\right) \qquad (7)$$
LR approximately follows a $\chi^2$ distribution whose degrees of freedom equal the difference in the number of parameters between the alternative and null hypotheses;
a p value is determined from the statistic LR and a statistical inference is made: the rejection region of LR is $W = \{LR \geq c\}$, where the critical value $c$ satisfies $P(LR \geq c) \leq \alpha$; at test level $\alpha$, if $p \leq \alpha$, LR falls in the rejection region, so the null hypothesis $H_0$ is rejected and the alternative hypothesis $H_1$ is accepted; growth of the trait is then judged to differ among genotypes at the locus at level $\alpha$, and the locus is a significant QTL.
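The decision rule above can be sketched as follows, assuming the two maximized log-likelihoods have already been computed. The helper names, the example log-likelihood values, and the choice of 3 degrees of freedom are illustrative assumptions; the chi-square tail probability is computed from the standard recurrence for the regularized incomplete gamma function so that only the Python standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(log_l0, log_l1, df, alpha=0.05):
    """Return (LR, p value, reject H0?) following LR = -2 (log L0 - log L1)."""
    lr = -2.0 * (log_l0 - log_l1)
    p_value = chi2_sf(lr, df)
    return lr, p_value, p_value <= alpha

# Hypothetical log-likelihoods for one locus; df = 3 is an illustrative choice.
lr, p_value, reject = likelihood_ratio_test(-120.0, -110.0, df=3)
```

The sketch tests a single locus at a fixed level; how the level is chosen across the many SNPs of a genome scan is not specified in the text and is left out here.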
In the invention, the covariance matrix of the normal distribution followed by the trees under both the null and alternative hypotheses is constructed with a first-order structured antedependence model, SAD(1);
the structural parameters $\Psi$ of the SAD(1) model comprise the innovation variance $\gamma^2$ and the first-order antedependence parameter $\phi$;
the covariance matrix is denoted $\Sigma$:
[Equation (8): rendered as an image in the original; not reproduced in this text]
each element of $\Sigma$ depends on time $t$; the diagonal elements are the variances of the quantitative trait of the tree samples at each time and are constructed by equation (9); the off-diagonal elements are the covariances of the quantitative trait between different times and are constructed by equation (10);
[Equation (9): rendered as an image in the original; not reproduced in this text]
[Equation (10): rendered as an image in the original; not reproduced in this text]
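Because equations (8) to (10) are not reproduced in this text, the following is only a sketch of how an SAD(1) covariance matrix is commonly built, not the patent's exact formulas. It assumes residuals that follow e(t) = φ e(t-1) + ε(t) with e(0) = 0 and a constant innovation variance γ², which gives variances γ²(1 - φ^(2t))/(1 - φ²) when φ² ≠ 1 and covariances φ^(t2-t1) times the variance at the earlier time. The function name is hypothetical.

```python
def sad1_covariance(T, phi, gamma2):
    """Covariance matrix of an SAD(1) process observed at times 1..T.

    Assumed model (not taken from the patent's equation images):
    e(t) = phi * e(t - 1) + eps(t), with e(0) = 0 and Var(eps(t)) = gamma2.
    """
    # Variances follow the recursion var(t) = phi^2 * var(t - 1) + gamma2.
    variances = []
    previous = 0.0
    for _ in range(T):
        previous = phi ** 2 * previous + gamma2
        variances.append(previous)
    # Covariance between times s <= t is phi^(t - s) * var(s).
    sigma = [[0.0] * T for _ in range(T)]
    for s in range(T):
        for t in range(s, T):
            sigma[s][t] = sigma[t][s] = phi ** (t - s) * variances[s]
    return sigma

sigma = sad1_covariance(T=4, phi=0.5, gamma2=2.0)
```

Only two parameters describe the whole T x T matrix, which is why the structural parameter set Ψ stays small no matter how many years of measurements are used.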
After the significant QTLs are obtained, the invention calculates the genetic effect values of the quantitative trait for the significant QTLs.
In the present invention, the genetic effect values of the quantitative trait preferably comprise an additive effect and a dominant effect. The additive effect is preferably calculated as shown in equation (11), and the dominant effect as shown in equation (12):
[Equation (11): rendered as an image in the original; not reproduced in this text]
[Equation (12): rendered as an image in the original; not reproduced in this text]
where $\mu_0(t)$, $\mu_1(t)$, and $\mu_2(t)$ represent the mean effects of the three QTL genotypes QQ, Qq, and qq, respectively.
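As an illustration of how time-varying genetic effects can be computed from genotype mean curves, the sketch below uses the usual quantitative-genetics convention: the additive effect is half the difference between the two homozygotes, and the dominant effect is the deviation of the heterozygote from the homozygote midpoint. Equations (11) and (12) are not available in this text, so the sign and scaling conventions here are assumptions, and the function name and numbers are hypothetical.

```python
def genetic_effects(mu_QQ, mu_Qq, mu_qq):
    """Return time-varying additive and dominant effect curves.

    Assumed convention (equations (11) and (12) are not available):
    additive a(t) = (mu_QQ(t) - mu_qq(t)) / 2
    dominant d(t) = mu_Qq(t) - (mu_QQ(t) + mu_qq(t)) / 2
    """
    additive = [(hi - lo) / 2.0 for hi, lo in zip(mu_QQ, mu_qq)]
    dominant = [het - (hi + lo) / 2.0 for hi, het, lo in zip(mu_QQ, mu_Qq, mu_qq)]
    return additive, dominant

# Hypothetical genotype mean curves at two time points.
additive, dominant = genetic_effects([4.0, 8.0], [3.5, 7.0], [2.0, 4.0])
```

In the method, the genotype mean curves would come from the Richards curves fitted for each genotype at a significant QTL.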
After the genetic effect values are obtained, the invention establishes linear relationships among the different significant QTLs, separately for the dominant and the additive genetic effect values, to obtain a QTL regulatory network.
In the present invention, the linear relationship among the different significant QTLs is preferably as shown in equation (13), which represents the regulatory relationship between the i-th significant QTL and the other p-1 QTLs:
$$E_i = \beta_1 E_1 + \dots + \beta_{i-1} E_{i-1} + \beta_{i+1} E_{i+1} + \dots + \beta_p E_p + \varepsilon \qquad (13)$$
where $E_i$ is the genetic effect value (additive or dominant effect) of the i-th significant QTL, $i = 1, \dots, p$; $\varepsilon \sim N(0, \sigma^2)$; and $\beta_1, \dots, \beta_{i-1}, \beta_{i+1}, \dots, \beta_p$ and $\sigma^2$ are unknown parameters;
if $(E_1(t), \dots, E_{i-1}(t), E_i(t), E_{i+1}(t), \dots, E_p(t))$, $t = 1, \dots, T$, are the calculated genetic effect values at $T$ time points, equation (13) can be expressed in matrix form as follows:
[Matrix form of equation (13): rendered as an image in the original; not reproduced in this text]
where the $\varepsilon_i \sim N(0, \sigma^2)$ are independent and identically distributed; $\varepsilon$ is a p-dimensional error vector satisfying
$E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2 I_n$;
$\beta$ is a p-dimensional parameter vector, and whether a regulatory relationship exists between significant QTLs is judged by testing the parameter vector of the regression equation: when an element $\beta_j$, $j \in \{1, \dots, i-1, i+1, \dots, p\}$, of the parameter vector is 0, there is no regulatory relationship between significant QTL i and QTL j; otherwise, a regulatory relationship exists.
The QTL regulatory network is built with lm, the multiple linear regression function in the R language.
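A minimal sketch of the regression in equation (13), written in plain Python instead of R's `lm` so that it is self-contained. It regresses one QTL's effect curve on the curves of the other QTLs without an intercept, as equation (13) is written, by solving the normal equations. The function names and the toy effect curves are hypothetical, and the sketch only estimates coefficients: deciding whether a coefficient is statistically zero requires a significance test that is not shown here.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def regress_effect_on_others(effects, target):
    """Least-squares coefficients of effects[target] on all other effect curves (no intercept)."""
    y = effects[target]
    others = [k for k in effects if k != target]
    xtx = [[sum(u * v for u, v in zip(effects[j], effects[k])) for k in others] for j in others]
    xty = [sum(u * v for u, v in zip(effects[j], y)) for j in others]
    return dict(zip(others, solve_linear_system(xtx, xty)))

# Toy effect curves at T = 4 time points; QTL1 is exactly twice QTL2.
effects = {
    "QTL1": [2.0, 4.0, 6.0, 8.0],
    "QTL2": [1.0, 2.0, 3.0, 4.0],
    "QTL3": [1.0, 0.0, 1.0, 0.0],
}
coefficients = regress_effect_on_others(effects, "QTL1")
```

Here the coefficient for QTL2 comes out at about 2 and the coefficient for QTL3 at about 0, so in this toy network only QTL2 would be linked to QTL1. The normal equations are solvable only when there are at least as many time points as predictor QTLs and the predictor curves are not collinear.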
After the QTL regulatory network is obtained, it is used to identify the key regulatory QTLs that explain the growth process of the tree quantitative trait.
In the invention, the key regulatory QTLs are preferably identified by counting, for each QTL, the number of other QTLs it regulates and selecting the QTLs that regulate the largest numbers as key regulatory QTLs. The number of key regulatory QTLs depends on the total number of QTLs in the regulatory network. In the present invention, the number of key regulatory QTLs is preferably 5% of the number of QTLs in the entire QTL regulatory network.
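The selection rule can be sketched as a ranking by the number of regulated QTLs. The data structure, the function name, and the rounding rule (round up, keep at least one QTL) are assumptions of this sketch, since the text gives only the 5% criterion.

```python
import math

def key_regulatory_qtls(regulates, fraction=0.05):
    """Rank QTLs by how many other QTLs they regulate and keep the top fraction.

    `regulates` maps each QTL name to the set of QTLs it regulates.
    Rounding up and keeping at least one QTL are assumptions of this sketch.
    """
    out_degree = {qtl: len(targets - {qtl}) for qtl, targets in regulates.items()}
    n_keep = max(1, math.ceil(fraction * len(regulates)))
    # Sort by descending count; ties are broken by name for reproducibility.
    ranked = sorted(out_degree, key=lambda q: (-out_degree[q], q))
    return ranked[:n_keep]

# Hypothetical network of 30 QTLs in which three QTLs regulate others.
network = {f"Q{i}": set() for i in range(1, 31)}
network["Q7"] = {"Q1", "Q2", "Q3", "Q4"}
network["Q12"] = {"Q5", "Q6"}
network["Q30"] = {"Q8"}
key_qtls = key_regulatory_qtls(network)
```

With 30 QTLs, 5% rounds up to two, so the two QTLs with the most outgoing regulatory links are returned.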
In the invention, the method further comprises genetic structure analysis, which preferably comprises calculating the heritability of each identified significant QTL, i.e. the ratio of the genetic variance of the QTL to the phenotypic variance of the quantitative trait.
The heritability is preferably calculated by equation (14):
$H^2 = \frac{V_A + V_D}{V_P}$ (14)
where $V_A$ is the additive variance, $V_D$ is the dominant variance, and $V_P = V_A + V_D + V_E$ is the phenotypic variance, composed of the genetic and environmental variances.
Heritability measures the contribution of a QTL's genetic factors to phenotypic differences. When the heritability exceeds 10% (a threshold set by statistical comparison of the heritability curves), the QTL contributes substantially to the phenotypic difference and transmits the trait to offspring more effectively, which provides a reliable basis for breeding selection; significant QTLs with large contributions can then undergo functional verification.
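As a small worked illustration, heritability can be computed from the variance components and compared with the 10% cut-off. The sketch reads heritability as genetic variance (additive plus dominant) over phenotypic variance, following the definition given in the text; the function names and the example numbers in the usage note are illustrative assumptions, not values from the patent.

```python
def heritability(v_a, v_d, v_e):
    """Heritability as genetic variance over phenotypic variance.

    v_a: additive variance, v_d: dominant variance, v_e: environmental variance.
    The phenotypic variance is V_P = V_A + V_D + V_E, as defined in the text.
    """
    v_p = v_a + v_d + v_e
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return (v_a + v_d) / v_p

def contributes_strongly(h2, threshold=0.10):
    """Apply the 10% cut-off described in the text."""
    return h2 > threshold
```

For instance, made-up components V_A = 1.5, V_D = 0.5 and V_E = 8.0 give a heritability of 0.2, which is above the cut-off.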
The following will describe in detail a QTL mapping framework and genetic structure analysis method for tree quantitative traits based on Richards equation provided by the present invention with reference to the following examples, but they should not be construed as limiting the scope of the present invention.
Example 1
The present invention is described in further detail using the stem height growth data disclosed in A computational frame for mapping the timing of a genetic phase change (New Phytologist; 211: 750-. The sample data come from 450 offspring obtained by artificial hybridization of Populus deltoides clone I-69 as the female parent and Populus deltoides clone I-45 as the male parent, planted uniformly at Zhang Jilin farm (34.14° N, 117.38° E) in Jiangsu province. The phenotypic data used are the stem height and diameter growth data, over the first 14 years of growth (1987-2010), of 66 poplar samples: 64 randomly selected offspring together with the female parent I-69 and the male parent I-45. The genetic data comprise 156362 SNPs distributed over 19 chromosomes, of which 94591 are test-cross markers and 61771 are hybrid markers. A test-cross marker is one for which one parent is heterozygous and the other homozygous; a hybrid marker is one for which both parents are heterozygous.
1. Fit the stem height growth data of the poplar samples using the quasi-Newton method.
The Richards equation is fitted to the phenotypic data of the poplar quantitative trait, and its estimated parameters are obtained by search. The Richards equation is shown in equation (3):
Figure BDA0003102887610000131
where $b_1$ represents the growth threshold; $b_2$ represents the shape parameter; $b_3$ is a parameter related to the growth rate; $y$ represents the fitted value of stem height growth; and $t$ represents time.
The Richards equation is fitted by the least squares method. The estimated parameters are searched with the BFGS quasi-Newton method, which finds the Richards parameters that minimize the residual sum of squares between the stem height phenotype values and the fitted values; this is implemented with optim, the general-purpose optimization function in the R language.
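The least-squares objective handed to the optimizer can be sketched as below. Equation (3) is available here only as an image, so the sketch assumes one common three-parameter Richards form, y(t) = b1 · (1 − exp(−b2 · t))^b3, purely for illustration; the patent's own parameterization and parameter roles may differ. The function names are hypothetical. In R the objective would be passed to `optim` with `method = "BFGS"`; in Python an optimizer such as `scipy.optimize.minimize` with `method="BFGS"` could play the same role.

```python
import math

def richards(t, b1, b2, b3):
    """Assumed Richards-type curve; the patent's equation (3) may differ."""
    return b1 * (1.0 - math.exp(-b2 * t)) ** b3

def residual_sum_of_squares(params, times, observed):
    """Least-squares objective: sum of squared (phenotype - fitted) residuals."""
    b1, b2, b3 = params
    return sum((y - richards(t, b1, b2, b3)) ** 2 for t, y in zip(times, observed))

def r_squared(params, times, observed):
    """Coefficient of determination of the fitted curve."""
    mean_y = sum(observed) / len(observed)
    total = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - residual_sum_of_squares(params, times, observed) / total
```

The optimizer searches over `params` for the smallest `residual_sum_of_squares`; `r_squared` then summarizes goodness of fit in the way the coefficient of determination is reported for the fitted curves.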
Fig. 1 shows the Richards growth curves of poplar stem height; the light curves are the growth curves of the 66 samples and the dark curve is the average growth curve. The coefficient of determination of the Richards fit to the poplar stem height data is 0.9926, indicating that the Richards growth curve fits the sample data well. The Richards parameters $b_1$, $b_2$, $b_3$ of the mean growth curve are 29.11, 0.07 and 0.84, respectively, and the residual sum of squares of the mean curve fit is 0.2759.
2. Locate the significant QTLs controlling poplar stem height growth based on the functional mapping framework.
Taking the genotype and phenotype data of each sample individual as the analysis data, the functional mapping model framework is used to locate the significant QTLs that regulate the growth of the tree quantitative trait.
The functional mapping model framework judges whether a gene locus affects the growth of the quantitative trait through the following hypothesis test:
The hypotheses are established:
$H_0: \Theta_j = \Theta$ versus $H_1: \Theta_j \neq \Theta$, $j = 1, \ldots, J$ (4)
Under the null hypothesis $H_0$, the gene locus does not affect the growth of the quantitative trait, i.e. growth does not differ among genotypes, so the estimated parameters are equal across genotypes and are denoted $\Theta$;
under the alternative hypothesis $H_1$, the gene locus affects the growth of the quantitative trait, i.e. growth differs among tree samples with different genotypes, so the estimated parameters are not equal across genotypes; the number of genotypes at the locus is $J$, and the estimated parameters of trees with genotype $j$ are denoted $\Theta_j$ $(j = 1, \ldots, J)$;
Second, the likelihood functions $L_0(y)$ and $L_1(y)$ of the null and alternative hypotheses are established, respectively:
$L_0(y) = \prod_{i=1}^{n} f(y_i; \Theta, \Psi)$ (5)
$L_1(y) = \prod_{j=1}^{J} \prod_{i=1}^{n_j} f_j(y_i; \Theta_j, \Psi)$ (6)
where $n$ is the number of tree samples and $n_j$ is the number of tree samples with genotype $j$; $y_i = (y_i(1), \ldots, y_i(T))$ represents the growth of the quantitative trait of tree sample individual $i$ at observation times $1, \ldots, T$; the tree sample population approximately follows a normal distribution; the mean of the normal distribution followed by the tree population is $\mu = (\mu(1), \ldots, \mu(T))$, constructed from the Richards equation parameters $\Theta$; the mean of trees with genotype $j$ is $\mu_j = (\mu_j(1), \ldots, \mu_j(T))$, constructed from the Richards equation parameters $\Theta_j$; $f(y_i; \Theta, \Psi)$ is the probability density function of the tree normal distribution under the null hypothesis, and $f_j(y_i; \Theta_j, \Psi)$ is the probability density function of the normal distribution of trees with genotype $j$ under the alternative hypothesis; $\Psi$ denotes the structural parameters used to construct the covariance matrix of the normal distribution;
the likelihood ratio statistic LR is established from the likelihood functions $L_0(y)$ and $L_1(y)$:
$LR = -2(\log L_0(y) - \log L_1(y))$ (7)
LR approximately follows a $\chi^2$ distribution whose degrees of freedom equal the difference in the number of parameters between the null and alternative hypotheses;
the p value is determined from the statistic LR and the statistical inference is made: the rejection region of LR is $W = \{LR \geq c\}$, where the critical value $c$ satisfies $P(LR \geq c) \leq \alpha$. With the test level set to $\alpha$, if $p \leq \alpha$, LR falls in the rejection region; the null hypothesis $H_0$ is then rejected and the alternative hypothesis $H_1$ accepted, i.e. at test level $\alpha$ the growth of the trait differs among the genotypes at the locus, and the locus is a significant QTL.
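The test can be sketched in Python using only the standard library. This is an illustrative sketch, not the patent's R implementation: the function names are hypothetical, the log-likelihoods are taken as given inputs, and the chi-square upper tail is computed with the standard recurrence for integer degrees of freedom. As an inferred example that the text does not state, three Richards parameters per genotype plus two covariance parameters would give 8 − 5 = 3 degrees of freedom for a two-genotype marker.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases are df = 1 and df = 2; larger df follow from the recurrence
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    k = 1 if df % 2 else 2
    q = math.erfc(math.sqrt(half)) if k == 1 else math.exp(-half)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def likelihood_ratio_test(loglik_null, loglik_alt, n_params_null, n_params_alt, alpha=0.05):
    """Return (LR, p_value, reject_null) for two nested models."""
    lr = -2.0 * (loglik_null - loglik_alt)
    df = n_params_alt - n_params_null
    p_value = chi2_sf(lr, df)
    return lr, p_value, p_value <= alpha
```

A locus is declared a significant QTL when `reject_null` is true at the chosen level; in a genome-wide scan the level `alpha` would be the multiplicity-corrected threshold rather than a raw 0.05.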
In the invention, the covariance matrix of the normal distribution followed by the tree population under both the null and alternative hypotheses is constructed with a first-order structured antedependence model (SAD(1));
the structural parameters $\Psi$ of the SAD(1) model comprise the innovation variance $\gamma^2$ and the first-order antedependence parameter $\phi$;
the covariance matrix is denoted $\Sigma$:
Figure BDA0003102887610000151
each element of $\Sigma$ depends on time $t$: the diagonal elements are the variances of the quantitative trait of the tree samples at each time and are constructed by equation (9); the off-diagonal elements are the covariances of the quantitative trait between different times and are constructed by equation (10);
Figure BDA0003102887610000152
Figure BDA0003102887610000153
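Equations (9) and (10) appear here only as images, so the following Python sketch builds the covariance matrix from an assumed first-order antedependence recursion rather than from the patent's formulas: the residual at time t equals φ times the residual at time t − 1 plus an independent innovation with constant variance γ², starting from a zero residual before the first observation. Under that assumption Var(t) = φ² · Var(t − 1) + γ² and Cov(t1, t2) = φ^(t2 − t1) · Var(t1) for t1 < t2. The function name is hypothetical.

```python
def sad1_covariance(n_times, phi, gamma2):
    """T x T covariance matrix implied by a first-order antedependence recursion.

    n_times: number of observation times T
    phi:     first-order antedependence parameter
    gamma2:  innovation variance (gamma squared)
    """
    sigma = [[0.0] * n_times for _ in range(n_times)]
    var_prev = 0.0
    for t in range(n_times):
        # Var(t) = phi^2 * Var(t-1) + gamma^2, with zero variance before time 1.
        var_t = phi * phi * var_prev + gamma2
        sigma[t][t] = var_t
        var_prev = var_t
    for t1 in range(n_times):
        for t2 in range(t1 + 1, n_times):
            # Cov(t1, t2) = phi^(t2 - t1) * Var(t1) for t1 < t2; matrix is symmetric.
            cov = phi ** (t2 - t1) * sigma[t1][t1]
            sigma[t1][t2] = cov
            sigma[t2][t1] = cov
    return sigma
```

Only two parameters describe the whole T × T matrix, which is why this structure keeps the number of covariance parameters fixed as more time points are observed.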
FIG. 2 is a Manhattan plot of the significance tests of the test-cross single nucleotide polymorphisms (SNPs) and hybrid SNPs on the 19 chromosomes of the poplar whole genome, obtained by functional mapping of the sample stem height growth curves. The red horizontal line is the threshold after Bonferroni correction at the 0.05 level. 83 test-cross SNPs and 40 hybrid SNPs were screened out as regulating the stem height growth of poplar. These significant QTLs are distributed mainly on chromosomes 8, 11, 18 and 19. Specific information such as the type, SNP number, chromosomal location and p value of each significant QTL is shown in Table 1.
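The height of a Bonferroni threshold line on a Manhattan plot can be worked out as follows. The sketch assumes the correction divides the 0.05 level by the total number of SNPs tested (156362 in Example 1); the text does not state which count was used, so that choice and the function name are assumptions.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level and its height on a -log10(p) axis."""
    per_test = alpha / n_tests
    return per_test, -math.log10(per_test)

# Illustrative use with the SNP count reported for the data set.
per_test_level, line_height = bonferroni_threshold(0.05, 156362)
```

Under this assumption the per-test level is about 3.2e-7, so the line would sit near 6.5 on the −log10(p) axis, and SNPs plotted above it are declared significant.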
3. Calculate the genetic effects (additive effect $a(t)$ and dominant effect $d(t)$) and the heritability ($H^2$) of the quantitative trait from the 83 test-cross QTLs and 40 hybrid QTLs.
The genetic effect values of the quantitative trait are calculated from the significant QTLs. The additive effect is preferably calculated as shown in equation (11), and the dominant effect as shown in equation (12):
Figure BDA0003102887610000161
Figure BDA0003102887610000162
where $\mu_0(t)$, $\mu_1(t)$ and $\mu_2(t)$ represent the mean effects of the three QTL genotypes QQ, Qq and qq, respectively.
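Equations (11) and (12) are available here only as images. The sketch below therefore assumes the conventional quantitative-genetics definitions, a(t) = (μ_QQ(t) − μ_qq(t)) / 2 and d(t) = μ_Qq(t) − (μ_QQ(t) + μ_qq(t)) / 2, which may differ in sign or indexing from the patent's formulas. Arguments are named by genotype rather than by subscript to avoid guessing the index mapping; the function name is hypothetical.

```python
def genetic_effects(mu_QQ, mu_Qq, mu_qq):
    """Time-varying additive and dominant effects from genotype mean curves.

    Each argument is a list of genotype mean values at times 1..T
    (for example, Richards curves evaluated with genotype-specific parameters).
    """
    # Additive effect: half the difference between the two homozygote means.
    additive = [(hi - lo) / 2.0 for hi, lo in zip(mu_QQ, mu_qq)]
    # Dominant effect: heterozygote mean minus the homozygote midpoint.
    dominant = [het - (hi + lo) / 2.0 for hi, het, lo in zip(mu_QQ, mu_Qq, mu_qq)]
    return additive, dominant
```

For a test-cross marker only two genotypes segregate, so only one effect curve can be computed from the two genotype means; the three-genotype form above applies to hybrid markers.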
The heritability is preferably calculated by equation (13):
$H^2 = \frac{V_A + V_D}{V_P}$ (13)
where $V_A$ is the additive variance, $V_D$ is the dominant variance, and $V_P = V_A + V_D + V_E$ is the phenotypic variance, composed of the genetic and environmental variances.
FIG. 3 shows the time trends of the genetic effects and heritability of poplar stem height. Fig. 3(A) shows the additive effect curves over time of the 40 significant hybrid QTLs, which fall mainly into two types with opposite directions of change; in both types the effect values rise and fall over time. Fig. 3(B) shows the dominant effect curves over time of the 40 significant hybrid QTLs, which fall into three categories. Among the curves with effect values greater than 0, the dominant effect of one category of QTLs first increases and then decreases, while that of another category first decreases and then increases, with a relatively gentle trend. Curves with effect values greater than 0 are curves with effect values increasing and then decreasing in opposite directions. Fig. 3(C) shows the heritability curves of the 40 significant hybrid QTLs. The heritability of hybrid QTL9 (SNP100096 on chromosome 11) is the largest in years 1-3 of growth, contributing strongly to stem height growth. The heritability of QTL10 (Chr11/SNP102549) in years 2-6 and of QTL1 (Chr21/SNP152514) in years 3-14 exceeds 10%, a large genetic contribution. In addition, the heritability of chr8/SNP 77873 and QTL7 (chr18/SNP146811) rises relative to the earlier period after year 7 and exceeds 10%.
Fig. 3(D) shows the additive effect curves over time of the 83 significant test-cross QTLs; as with the hybrid QTLs, they fall mainly into two types with opposite directions of change, and in both types the effect values rise and fall over time. Fig. 3(E) shows the heritability curves of the 83 significant test-cross QTLs. The heritability curve of chr11/SNP 100722 declines gradually; in years 1-5 it is above 10% and significantly higher than that of the other QTLs. The heritability of chr18/SNP 145821 exceeds 10% after year 5. These two test-cross QTLs make significant genetic contributions to stem height growth in the early and late stages of the analysis period, respectively.
4. Establish a regulatory network among the QTLs based on genetic effects, and identify the key regulatory QTLs that explain the stem height growth process.
A linear relationship between the different significant QTLs is established from the additive and dominant genetic effect values to obtain the QTL regulatory network. The linear relationship is preferably as shown in equation (14), which represents the regulatory relationship between the i-th significant QTL and the other p-1 QTLs:
$E_i = \beta_1 E_1 + \cdots + \beta_{i-1} E_{i-1} + \beta_{i+1} E_{i+1} + \cdots + \beta_p E_p + \varepsilon$ (14)
where $E_i$ represents the genetic effect value (additive or dominant genetic effect) of the i-th significant QTL, $i = 1, \ldots, p$; $\varepsilon \sim N(0, \sigma^2)$; and $\beta_1, \ldots, \beta_{i-1}, \beta_{i+1}, \ldots, \beta_p$ and $\sigma^2$ are unknown parameters;
$(E_1(t), \ldots, E_{i-1}(t), E_i(t), E_{i+1}(t), \ldots, E_p(t))$ $(t = 1, \ldots, T)$ are the calculated values of the genetic effects at the $T$ time points, so equation (14) can be expressed in matrix form as follows:
Figure RE-GDA0003157391750000181
where $\varepsilon_i \sim N(0, \sigma^2)$ are independent and identically distributed;
Figure BDA0003102887610000172
$\varepsilon$ is a p-dimensional error vector satisfying:
$E(\varepsilon) = 0$,
$\mathrm{Var}(\varepsilon) = \sigma^2 I_n$
Figure BDA0003102887610000173
is a p-dimensional parameter vector. Whether a regulatory relationship exists between significant QTLs is judged by testing the parameter vector of the regression equation: when an element $\beta_j$, $j \in (1, \ldots, i-1, i+1, \ldots, p)$, of the parameter vector is 0, significant QTL i and QTL j have no regulatory relationship; otherwise a regulatory relationship exists. The QTL regulatory network is built with lm, the multiple linear regression function in the R language.
The QTL regulatory network is then used to identify the key regulatory QTLs that explain the growth process of the tree quantitative trait. They are preferably identified by counting, for each QTL, the number of other QTLs it regulates; the QTLs that regulate the most are called key regulatory QTLs. The number of key regulatory QTLs depends on the total number of QTLs in the regulatory network, and the selection criterion in this method is preferably the top 5% of all QTLs in the network.
According to the genetic effects of the significant QTLs regulating poplar stem height growth, a system of linear equations among the QTLs is established, and the regulatory relationships between QTLs are represented by the coefficients of the equations. Fig. 4(A) is the gene regulatory network of the additive effects of the 40 hybrid QTLs, in which 2 QTLs (2, 8), chr11/SNP 99352 and chr11/SNP 99316, are the key regulatory QTLs. Fig. 4(B) is the gene regulatory network of the dominant effects of the 40 hybrid QTLs, in which the dominant effects of 2 key hybrid QTLs (2, 6), chr11/SNP 99352 and chr11/SNP99268, play a key role in the genetic structure of poplar stem height growth. Fig. 4(C) is the gene regulatory network of the additive effects of the 83 test-cross QTLs, in which the additive effects of test-cross QTLs (1, 3, 4, 5), chr18/SNP 145821, chr25/SNP 153154, chr25/SNP 153279, chr18/SNP 145831 and chr25/SNP 153164, play a key role in the genetic structure of poplar stem height growth. The QTL numbers used in the network structure are given in Table 1.
TABLE 1 significant QTL information for regulating and controlling stem height growth of poplar samples
Figure BDA0003102887610000181
Figure BDA0003102887610000191
Figure BDA0003102887610000201
Example 2
1. Fit the diameter growth data of the poplar samples using the quasi-Newton method.
The Richards equation is fitted to the phenotypic data of the poplar quantitative trait, and its estimated parameters are obtained by search. The Richards equation is shown in equation (3):
Figure BDA0003102887610000202
where $b_1$ represents the growth threshold; $b_2$ represents the shape parameter; $b_3$ is a parameter related to the growth rate; and $y$ represents the fitted value of diameter growth.
The Richards equation is fitted by the least squares method. The estimated parameters are searched with the BFGS quasi-Newton method, which finds the Richards parameters that minimize the residual sum of squares between the diameter phenotype values and the fitted values; this is implemented with optim, the general-purpose optimization function in the R language.
Fig. 5 shows the Richards growth curves of poplar diameter; the light curves are the growth curves of the 66 samples and the dark curve is the average growth curve. The coefficient of determination of the Richards fit to the poplar diameter data is 0.9948, indicating that the Richards growth curve fits the sample data well. The Richards parameters $b_1$, $b_2$, $b_3$ of the mean growth curve are 26.00, 0.24 and 1.76, respectively, and the residual sum of squares is 3.9968.
2. Locate the significant QTLs controlling poplar diameter growth based on the functional mapping framework.
Taking the genotype and phenotype data of each sample individual as the analysis data, the functional mapping model framework is used to locate the significant QTLs that regulate the growth of the tree quantitative trait.
The functional mapping model framework judges whether a gene locus affects the growth of the quantitative trait through the following hypothesis test:
The hypotheses are established:
$H_0: \Theta_j = \Theta$ versus $H_1: \Theta_j \neq \Theta$, $j = 1, \ldots, J$ (4)
Under the null hypothesis $H_0$, the gene locus does not affect the growth of the quantitative trait, i.e. growth does not differ among genotypes, so the estimated parameters are equal across genotypes and are denoted $\Theta$;
under the alternative hypothesis $H_1$, the gene locus affects the growth of the quantitative trait, i.e. growth differs among tree samples with different genotypes, so the estimated parameters are not equal across genotypes; the number of genotypes at the locus is $J$, and the estimated parameters of trees with genotype $j$ are denoted $\Theta_j$ $(j = 1, \ldots, J)$;
Second, the likelihood functions $L_0(y)$ and $L_1(y)$ of the null and alternative hypotheses are established, respectively:
$L_0(y) = \prod_{i=1}^{n} f(y_i; \Theta, \Psi)$ (5)
$L_1(y) = \prod_{j=1}^{J} \prod_{i=1}^{n_j} f_j(y_i; \Theta_j, \Psi)$ (6)
where $n$ is the number of tree samples and $n_j$ is the number of tree samples with genotype $j$; $y_i = (y_i(1), \ldots, y_i(T))$ represents the growth of the quantitative trait of tree sample individual $i$ at observation times $1, \ldots, T$; the tree sample population approximately follows a normal distribution; the mean of the normal distribution followed by the tree population is $\mu = (\mu(1), \ldots, \mu(T))$, constructed from the Richards equation parameters $\Theta$; the mean of trees with genotype $j$ is $\mu_j = (\mu_j(1), \ldots, \mu_j(T))$, constructed from the Richards equation parameters $\Theta_j$; $f(y_i; \Theta, \Psi)$ is the probability density function of the tree normal distribution under the null hypothesis, and $f_j(y_i; \Theta_j, \Psi)$ is the probability density function of the normal distribution of trees with genotype $j$ under the alternative hypothesis; $\Psi$ denotes the structural parameters used to construct the covariance matrix of the normal distribution;
the likelihood ratio statistic LR is established from the likelihood functions $L_0(y)$ and $L_1(y)$:
$LR = -2(\log L_0(y) - \log L_1(y))$ (7)
LR approximately follows a $\chi^2$ distribution whose degrees of freedom equal the difference in the number of parameters between the null and alternative hypotheses;
the p value is determined from the statistic LR and the statistical inference is made: the rejection region of LR is $W = \{LR \geq c\}$, where the critical value $c$ satisfies $P(LR \geq c) \leq \alpha$. With the test level set to $\alpha$, if $p \leq \alpha$, LR falls in the rejection region; the null hypothesis $H_0$ is then rejected and the alternative hypothesis $H_1$ accepted, i.e. at test level $\alpha$ the growth of the trait differs among the genotypes at the locus, and the locus is a significant QTL.
In the invention, the covariance matrix of the normal distribution followed by the tree population under both the null and alternative hypotheses is constructed with a first-order structured antedependence model (SAD(1));
the structural parameters $\Psi$ of the SAD(1) model comprise the innovation variance $\gamma^2$ and the first-order antedependence parameter $\phi$;
the covariance matrix is denoted $\Sigma$:
Figure BDA0003102887610000221
each element of $\Sigma$ depends on time $t$: the diagonal elements are the variances of the quantitative trait of the tree samples at each time and are constructed by equation (9); the off-diagonal elements are the covariances of the quantitative trait between different times and are constructed by equation (10);
Figure BDA0003102887610000222
Figure BDA0003102887610000223
FIG. 6 is a Manhattan plot of the significance tests of the test-cross single nucleotide polymorphisms (SNPs) and hybrid SNPs on the 19 chromosomes of the poplar whole genome, obtained by functional mapping of the sample diameter growth curves. The red horizontal line is the Bonferroni-corrected threshold at the FDR-corrected 0.001 level. 37 test-cross SNPs and 57 hybrid SNPs were screened out as regulating the diameter growth of poplar. These significant QTLs are distributed mainly on chromosomes 5, 9, 11 and 14. Specific information such as the type, SNP number, chromosomal location and p value of each significant QTL is shown in Table 2.
3. Calculate the genetic effects (additive effect $a(t)$ and dominant effect $d(t)$) and the heritability ($H^2$) of the quantitative trait from the 37 test-cross QTLs and 57 hybrid QTLs.
The genetic effect values of the quantitative trait are calculated from the significant QTLs. The additive effect is preferably calculated as shown in equation (11), and the dominant effect as shown in equation (12):
Figure BDA0003102887610000231
Figure BDA0003102887610000232
where $\mu_0(t)$, $\mu_1(t)$ and $\mu_2(t)$ represent the mean effects of the three QTL genotypes QQ, Qq and qq, respectively.
The heritability is preferably calculated by equation (13):
$H^2 = \frac{V_A + V_D}{V_P}$ (13)
where $V_A$ is the additive variance, $V_D$ is the dominant variance, and $V_P = V_A + V_D + V_E$ is the phenotypic variance, composed of the genetic and environmental variances.
FIG. 7 shows the time trends of the genetic effects and heritability of poplar diameter. Fig. 7(A) shows the additive effect curves over time of the 57 significant hybrid QTLs, which fall mainly into two types with opposite directions of change; in both types the effect values rise and fall over time. The additive genetic effect of hybrid QTL12 (SNP117737 on chromosome 14) changes in an unusual way: its effect curve is negative in years 1-5 of early growth, strengthening in the first two years and weakening in years 3-5; it strengthens in the positive direction from year 10 and then declines slowly. Fig. 7(B) shows the dominant effect curves over time of the 57 significant hybrid QTLs, which fall into two types whose effect values change in opposite directions. More unusually, the dominant genetic effects of hybrid QTL34 (Chr4/SNP44034) and hybrid QTL52 (Chr8/SNP81991) are positive at first, then fall to zero and become negative. Fig. 7(C) shows the heritability curves of the 57 significant hybrid QTLs. The heritability of hybrid QTL4 (Chr9/SNP85666) and hybrid QTL20 (Chr4/SNP44577) exceeds 10% in the early stage of growth, contributing more to radial growth. Although the heritability of hybrid QTL1 (chr11/SNP101069) for diameter growth is not notable in the early stage, it is above 10% after year 6, the highest among the 57 significant hybrid QTLs, so this QTL contributes substantially to the diameter growth of the poplar samples in years 6-14.
Fig. 7(D) shows the additive effect curves over time of the 37 significant test-cross QTLs; as with the hybrid QTLs, they fall mainly into two types with opposite directions of change, and in both types the effect values rise and fall over time. Fig. 7(E) shows the heritability curves of the 37 significant test-cross QTLs. The heritability of test-cross QTL1 (chr17/SNP 137076) is significantly greater than that of the other QTLs after year 2 and exceeds 10% in years 3-11, during which the QTL contributes substantially to radial growth. Test-cross QTL18 (chr9/SNP85906) contributes significantly to radial growth during the early growth stage.
4. Establish a regulatory network among the QTLs based on genetic effects, and identify the key regulatory QTLs that explain the diameter growth process.
A linear relationship between the different significant QTLs is established from the additive and dominant genetic effect values to obtain the QTL regulatory network. The linear relationship is preferably as shown in equation (14), which represents the regulatory relationship between the i-th significant QTL and the other p-1 QTLs:
$E_i = \beta_1 E_1 + \cdots + \beta_{i-1} E_{i-1} + \beta_{i+1} E_{i+1} + \cdots + \beta_p E_p + \varepsilon$ (14)
where $E_i$ represents the genetic effect value (additive or dominant effect) of the i-th significant QTL, $i = 1, \ldots, p$; $\varepsilon \sim N(0, \sigma^2)$; and $\beta_1, \ldots, \beta_{i-1}, \beta_{i+1}, \ldots, \beta_p$ and $\sigma^2$ are unknown parameters;
$(E_1(t), \ldots, E_{i-1}(t), E_i(t), E_{i+1}(t), \ldots, E_p(t))$ $(t = 1, \ldots, T)$ are the calculated values of the genetic effects at the $T$ time points, so equation (14) can be expressed in matrix form as follows:
Figure RE-GDA0003157391750000251
where $\varepsilon_i \sim N(0, \sigma^2)$ are independent and identically distributed;
Figure BDA0003102887610000251
$\varepsilon$ is a p-dimensional error vector satisfying:
$E(\varepsilon) = 0$,
$\mathrm{Var}(\varepsilon) = \sigma^2 I_n$
Figure BDA0003102887610000252
is a p-dimensional parameter vector. Whether a regulatory relationship exists between significant QTLs is judged by testing the parameter vector of the regression equation: when an element $\beta_j$, $j \in (1, \ldots, i-1, i+1, \ldots, p)$, of the parameter vector is 0, significant QTL i and QTL j have no regulatory relationship; otherwise a regulatory relationship exists. The QTL regulatory network is built with lm, the multiple linear regression function in the R language.
The QTL regulatory network is then used to identify the key regulatory QTLs that explain the growth process of the tree quantitative trait. They are preferably identified by counting, for each QTL, the number of other QTLs it regulates; the QTLs that regulate the most are called key regulatory QTLs. The number of key regulatory QTLs depends on the total number of QTLs in the regulatory network, and the selection criterion in this method is preferably the top 5% of all QTLs in the network.
According to the genetic effects of the significant QTLs regulating poplar diameter growth, a system of linear equations among the QTLs is established, and the regulatory relationships between QTLs are represented by the coefficients of the equations. Fig. 8(A) is the gene regulatory network of the additive effects of the 57 hybrid QTLs, in which QTLs (2, 4, 7), chr9/SNP84578, chr9/SNP85666 and chr5/SNP48359, are the key QTLs. Fig. 8(B) is the gene regulatory network of the dominant effects of the 57 hybrid QTLs, in which 4 key hybrid QTLs (1, 8, 9), chr11/SNP101069, chr4/SNP44290 and chr9/SNP 86045, are the key QTLs. Fig. 8(C) is the gene regulatory network of the additive effects of the 37 test-cross QTLs, in which the additive effects of test-cross QTLs (3, 8), chr9/SNP 86354 and chr14/SNP 118150, play a key role in the genetic structure of poplar diameter growth. The QTL numbers used in the network structure are given in Table 2.
TABLE 2 significant QTL information for regulating poplar sample diameter growth
Figure BDA0003102887610000253
Figure BDA0003102887610000261
Figure BDA0003102887610000271
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for a Quantitative Trait Locus (QTL) positioning frame of trees based on Richards equation is characterized by comprising the following steps of:
1) fitting the Richards equation to the quantitative trait phenotype data of woody plant sample individuals grown in the same environment, and searching to obtain the estimated parameters of the equation;
2) performing single nucleotide polymorphism typing on the whole genome of each sample individual to obtain the genotype of each sample individual;
3) taking the genotype and phenotype data of the woody plant sample individuals as analysis data, and applying the estimated parameters of step 1) to the functional mapping model framework to locate the significant QTLs regulating the growth of the tree quantitative trait;
4) calculating the genetic effect values of the quantitative trait from the significant QTLs of step 3);
5) establishing linear relationships among the different significant QTLs from the genetic effect values of step 4) to obtain a QTL regulatory network;
6) using the QTL regulatory network of step 5) to identify the key regulatory QTLs that explain the growth process of the tree quantitative trait.
2. The method for the QTL positioning framework for quantitative traits of trees according to Richards equation in claim 1, wherein the Richards equation in step 1) is as shown in equation (3):
Figure FDA0003102887600000011
where $b_1$ represents the growth threshold; $b_2$ represents the shape parameter; $b_3$ is a parameter related to the growth rate; $y$ represents the fitted value of the quantitative trait; and $t$ represents time in years.
3. The method for the QTL location frame of the quantitative trait of trees based on the Richard equation according to claim 1 or 2, wherein the fitting of the Richard equation in the step 1) is realized by a least square method;
the search method of the estimated parameters is a BFGS quasi-Newton method;
the estimated parameters are the Richards equation parameters that minimize the residual sum of squares between the phenotype values and the fitted values.
4. The method for the QTL mapping frame for quantitative traits in trees according to Richards equation in claim 1, wherein the model frame for functional mapping in step 3) determines whether a genetic locus affects the growth of quantitative traits by the following hypothesis test:
establishing an assumption that:
H0j=Θversus H1j≠Θ,j=1,…,J (4)
primitive hypothesis H0If the gene locus does not affect the growth of the quantitative trait, namely the growth of the quantitative trait has no difference among different genotypes, the estimation parameters are equal under different genotypes, and the estimation parameters are expressed as theta;
alternative hypothesis H1The genetic locus influences the growth of quantitative traits, namely the growth of the quantitative traits of the tree samples with different genotypes has difference, so that the estimated parameters are not equal under different genotypes, the number of the genotypes of the genetic locus is J, and the estimated parameter of the tree with the genotype of J is expressed as thetaj(j=1,…,J);
Secondly, establishing the likelihood functions $L_0(y)$ and $L_1(y)$ under the null hypothesis and the alternative hypothesis, respectively:
$$L_0(y) = \prod_{i=1}^{n} f(y_i; \Theta, \Psi) \tag{5}$$
$$L_1(y) = \prod_{j=1}^{J} \prod_{i=1}^{n_j} f_j(y_i; \Theta_j, \Psi) \tag{6}$$
where $n$ is the number of tree samples and $n_j$ is the number of tree samples with genotype $j$; $y_i = (y_i(1), \dots, y_i(T))$ denotes the growth of the quantitative trait of tree sample $i$ at observation times $1, \dots, T$; the tree samples as a whole approximately follow a normal distribution; the mean vector of the normal distribution followed by the tree population is $\mu = (\mu(1), \dots, \mu(T))$, constructed from the Richards equation parameters $\Theta$; the mean vector for trees with genotype $j$ is $\mu_j = (\mu_j(1), \dots, \mu_j(T))$, constructed from the Richards equation parameters $\Theta_j$; $f(y_i; \Theta, \Psi)$ is the probability density function of the normal distribution of the trees under the null hypothesis, and $f_j(y_i; \Theta_j, \Psi)$ is the probability density function of the normal distribution of trees with genotype $j$ under the alternative hypothesis; $\Psi$ denotes the structural parameters used to construct the covariance matrix of the normal distribution;
constructing the likelihood ratio statistic LR from the likelihood functions $L_0(y)$ and $L_1(y)$:
$$LR = -2\left(\log L_0(y) - \log L_1(y)\right) \tag{7}$$
the statistic LR approximately follows a $\chi^2$ distribution whose degrees of freedom equal the difference between the numbers of parameters under the alternative hypothesis and the null hypothesis;
determining a p-value from the statistic LR and making the statistical inference: the rejection region of LR is $W = \{LR \ge c\}$, where the critical value $c$ satisfies $P(LR \ge c) \le \alpha$; when the significance level is set to $\alpha$ and $p \le \alpha$, LR falls in the rejection region, so the null hypothesis $H_0$ is rejected and the alternative hypothesis $H_1$ is accepted; it is then concluded that, at significance level $\alpha$, the growth of the trait differs among the genotypes of the genetic locus and the locus is a significant QTL.
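The decision rule built on equation (7) can be sketched as below. The log-likelihood values are hypothetical placeholders, not results from the patent. The degrees of freedom assume three Richards parameters per genotype and covariance parameters shared between the two hypotheses, giving $3(J-1)$. The helper `chi2_sf` is a small stdlib-only upper-tail function for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1, x >= 0.

    Uses Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    starting from Q(1/2, z) = erfc(sqrt(z)) or Q(1, z) = exp(-z).
    """
    z = x / 2.0
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(loglik0, loglik1, df, alpha=0.05):
    """Return (LR, p-value, reject H0?) following equation (7)."""
    lr = -2.0 * (loglik0 - loglik1)
    p_value = chi2_sf(lr, df)
    return lr, p_value, p_value <= alpha

# Hypothetical log-likelihoods for one locus with J = 3 genotypes:
# df = 3 * (J - 1) = 6 under the assumptions stated above.
lr, p_value, significant = likelihood_ratio_test(-512.4, -498.1, df=6)
```

When many loci are scanned, the level $\alpha$ is usually adjusted for multiple testing; that adjustment is outside this sketch.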
5. The Richards equation-based QTL positioning frame method for trees, characterized in that the covariance matrix of the normal distribution followed by the tree population under the null hypothesis and the alternative hypothesis is constructed using a first-order structured antedependence model, SAD(1);
the structural parameters $\Psi$ of the SAD(1) model comprise the innovation variance $\gamma^2$ and the first-order antedependence parameter $\phi$;
the covariance matrix is denoted $\Sigma$:
[Equation (8), image not reproduced]
wherein each element of $\Sigma$ depends on time $t$; the diagonal elements are the variances of the quantitative trait of the tree samples at each time point, constructed by equation (9); the off-diagonal elements are the covariances of the quantitative trait between different time points, constructed by equation (10);
[Equation (9), image not reproduced]
[Equation (10), image not reproduced]
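Equations (8) to (10) are not reproduced in this text. As a hedged illustration, the sketch below uses the closed form that follows from a textbook SAD(1) process $e(t) = \phi\, e(t-1) + \epsilon(t)$ with $e(0) = 0$ and constant innovation variance $\gamma^2$ at equally spaced times: $\sigma^2(t) = \gamma^2 (1 - \phi^{2t}) / (1 - \phi^2)$ and $\sigma(t_1, t_2) = \phi^{\,t_2 - t_1} \sigma^2(t_1)$ for $t_2 \ge t_1$. Whether the patent's equations match this form exactly is an assumption; the function name `sad1_covariance` is hypothetical.

```python
def sad1_covariance(T, phi, gamma2):
    """T x T covariance matrix of an assumed SAD(1) process at times 1..T.

    Requires phi**2 != 1. phi is the antedependence parameter and
    gamma2 the innovation variance.
    """
    def variance(t):
        return gamma2 * (1.0 - phi ** (2 * t)) / (1.0 - phi ** 2)

    sigma = [[0.0] * T for _ in range(T)]
    for r in range(T):
        for c in range(T):
            t1, t2 = min(r, c) + 1, max(r, c) + 1  # earlier and later time
            sigma[r][c] = phi ** (t2 - t1) * variance(t1)
    return sigma

# Hypothetical parameter values.
sigma = sad1_covariance(3, 0.5, 1.0)
```

Only two parameters describe the whole matrix, which is the practical appeal of this structure when the number of time points is large.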
6. The Richards equation-based QTL positioning frame method for trees according to claim 1, wherein the genetic effect value of the quantitative trait in step 4) comprises an additive effect or a dominant effect.
7. The Richards equation-based QTL positioning frame method for trees, wherein the additive effect is calculated as shown in equation (11) and the dominant effect is calculated as shown in equation (12):
[Equation (11), image not reproduced]
[Equation (12), image not reproduced]
wherein $\mu_0(t)$, $\mu_1(t)$ and $\mu_2(t)$ represent the mean effects of the three QTL genotypes QQ, Qq and qq, respectively.
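Equations (11) and (12) are not reproduced in this text. The sketch below uses the usual quantitative-genetics definitions as an assumption: the additive effect is half the difference between the two homozygote means, and the dominant effect is the deviation of the heterozygote mean from the homozygote midpoint. The sign convention (QQ minus qq) and the function name `genetic_effects` are assumptions, and the mean curves are made-up numbers.

```python
def genetic_effects(mu_QQ, mu_Qq, mu_qq):
    """Time-varying additive and dominant effects from genotype mean curves.

    Assumed definitions (the patent's equations may use another sign):
      a(t) = (mu_QQ(t) - mu_qq(t)) / 2
      d(t) = mu_Qq(t) - (mu_QQ(t) + mu_qq(t)) / 2
    """
    additive = [(hi - lo) / 2.0 for hi, lo in zip(mu_QQ, mu_qq)]
    dominant = [het - (hi + lo) / 2.0 for hi, het, lo in zip(mu_QQ, mu_Qq, mu_qq)]
    return additive, dominant

# Hypothetical genotype means at two time points.
additive, dominant = genetic_effects([10.0, 12.0], [9.0, 10.0], [6.0, 8.0])
```

Evaluating the genotype mean curves on the same time grid yields one effect curve per significant QTL, which is the input needed for the regulation analysis in step 5).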
8. The Richards equation-based QTL positioning frame method for trees according to claim 1, wherein the linear relationship between the different significant QTLs in step 5) is given by equation (13), which represents the regulatory relationship between the $i$-th significant QTL and the other $p-1$ QTLs:
$$E_i = \beta_1 E_1 + \dots + \beta_{i-1} E_{i-1} + \beta_{i+1} E_{i+1} + \dots + \beta_p E_p + \varepsilon \tag{13}$$
wherein $E_i$ denotes the genetic effect value of the $i$-th significant QTL, $i = 1, \dots, p$; $\varepsilon \sim N(0, \sigma^2)$; and $\beta_1, \dots, \beta_{i-1}, \beta_{i+1}, \dots, \beta_p$ and $\sigma^2$ are unknown parameters;
letting $(E_1(t), \dots, E_{i-1}(t), E_i(t), E_{i+1}(t), \dots, E_p(t))$, $t = 1, \dots, T$, be the calculated genetic effect values at the $T$ time points, equation (13) can be written in matrix form as follows:
[Matrix form of equation (13), image not reproduced]
wherein the $\varepsilon_i \sim N(0, \sigma^2)$ are independent and identically distributed; $\varepsilon = [\varepsilon_1, \dots, \varepsilon_{i-1}, \varepsilon_{i+1}, \dots, \varepsilon_p]'$ is a $p$-dimensional error vector satisfying:
$$E(\varepsilon) = 0, \qquad \mathrm{Var}(\varepsilon) = \sigma^2 I_n$$
$\beta = [\beta_1, \dots, \beta_{i-1}, \beta_{i+1}, \dots, \beta_p]'$ is a $p$-dimensional parameter vector, and whether a regulatory relationship exists between significant QTLs is judged from the parameter vector of the regression equation: when an element $\beta_j$, $j \in \{1, \dots, i-1, i+1, \dots, p\}$, of the parameter vector equals 0, no regulatory relationship exists between significant QTL $i$ and QTL $j$; otherwise, a regulatory relationship exists.
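A minimal sketch of the regression in equation (13) follows. The claim does not state how the coefficients are estimated, so plain ordinary least squares without an intercept, solved through the normal equations, is used here as an assumption. The tolerance used to treat a coefficient as zero, the helper names and the effect curves are all hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def regress_effects(target, predictors):
    """Least-squares beta for target(t) = sum_j beta_j * predictors[j](t) + error."""
    p = len(predictors)
    xtx = [[sum(a * b for a, b in zip(predictors[r], predictors[c]))
            for c in range(p)] for r in range(p)]
    xty = [sum(a * b for a, b in zip(predictors[r], target)) for r in range(p)]
    return solve(xtx, xty)

def regulators(beta, names, tol=1e-8):
    """Names of QTLs whose coefficient is non-zero (tol is a sketch assumption)."""
    return [name for name, b in zip(names, beta) if abs(b) > tol]

# Made-up effect curves over T = 5 time points; E3 depends only on E1.
E1 = [1.0, 2.0, 3.0, 4.0, 5.0]
E2 = [2.0, 1.0, 0.0, 1.0, 2.0]
E3 = [2.0 * v for v in E1]
beta = regress_effects(E3, [E1, E2])
linked = regulators(beta, ["QTL1", "QTL2"])
```

With noisy effect curves, ordinary least squares rarely returns exact zeros, so a sparse estimator or a significance test on each coefficient would be needed to decide which links to keep.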
9. The Richards equation-based QTL positioning frame method for trees according to any one of claims 1-8, wherein the key regulatory QTLs in step 6) are identified by counting, for each QTL, the number of other QTLs that it regulates, and selecting the QTLs that regulate the largest numbers of other QTLs as the key regulatory QTLs;
the key QTLs are the top 5% of all QTLs in the QTL regulation structure network, ranked by the number of QTLs they regulate.
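The selection rule can be sketched as a simple ranking by out-degree. The network below is invented for illustration, and two details are assumptions that the claim does not specify: the 5% cutoff is rounded up, and at least one QTL is always returned.

```python
import math

def key_qtls(regulates, fraction=0.05):
    """Rank QTLs by how many other QTLs they regulate and keep the top fraction.

    regulates: dict mapping each QTL name to the set of QTLs it regulates.
    Rounding up and keeping at least one QTL are sketch assumptions.
    """
    ranked = sorted(regulates, key=lambda q: len(regulates[q]), reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:k]

# Hypothetical regulation network with four significant QTLs.
network = {
    "Q1": {"Q2"},
    "Q2": {"Q1", "Q3", "Q4"},
    "Q3": set(),
    "Q4": {"Q3"},
}
selected = key_qtls(network)
```

Ties at the cutoff are broken here by input order, because Python's sort is stable; a real analysis may prefer to keep all tied QTLs.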
10. The Richards equation-based QTL positioning frame method for trees according to claim 9, further comprising genetic structure analysis;
the genetic structure analysis comprises calculating the heritability of each significant QTL identified in step 3) and evaluating, from that heritability, the contribution of the genetic factors of the significant QTL to the phenotypic difference: when the heritability of a significant QTL is greater than 10%, the genetic factors of that QTL are considered to contribute substantially to the phenotypic difference;
the heritability is calculated as shown in equation (14):
[Equation (14), image not reproduced]
wherein $V_A$ is the additive variance, $V_D$ is the dominant variance, and $V_P = V_A + V_D + V_E$ is the phenotypic variance, composed of the genetic variance and the environmental variance $V_E$.
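Equation (14) is not reproduced in this text. The sketch below assumes the broad-sense form, $(V_A + V_D) / V_P$, which is consistent with the variance components defined above but is not confirmed by the source. The function name and the variance values are hypothetical.

```python
def heritability(v_a, v_d, v_e):
    """Assumed broad-sense heritability: (V_A + V_D) / (V_A + V_D + V_E)."""
    v_p = v_a + v_d + v_e  # phenotypic variance
    return (v_a + v_d) / v_p

# Hypothetical variance components for one significant QTL.
h2 = heritability(1.0, 0.5, 8.5)
contributes_substantially = h2 > 0.10  # 10% threshold from the claim
```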
CN202110629578.3A 2021-06-07 2021-06-07 Richards equation-based Quantitative Trait Locus (QTL) positioning frame method for trees Expired - Fee Related CN113345520B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110629578.3A CN113345520B (en) 2021-06-07 2021-06-07 Richards equation-based Quantitative Trait Locus (QTL) positioning frame method for trees

Publications (2)

Publication Number Publication Date
CN113345520A (en) 2021-09-03
CN113345520B (en) 2021-12-14

Family

ID=77474436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110629578.3A Expired - Fee Related CN113345520B (en) 2021-06-07 2021-06-07 Richards equation-based Quantitative Trait Locus (QTL) positioning frame method for trees

Country Status (1)

Country Link
CN (1) CN113345520B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116825191A (en) * 2023-06-25 2023-09-29 北京林业大学 Method for screening key regulation QTL of microorganism bacteria

Also Published As

Publication number Publication date
CN113345520B (en) 2021-12-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 20211214
CF01 Termination of patent right due to non-payment of annual fee