WO2020133588A1 - Fast and stable method for evaluating genomic breeding values of individual animals - Google Patents

Fast and stable method for evaluating genomic breeding values of individual animals

Info

Publication number
WO2020133588A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
genetic
estimated
value
effect
Prior art date
Application number
PCT/CN2019/071514
Other languages
English (en)
French (fr)
Inventor
赵书红
刘小磊
杨翔
李新云
朱猛进
项韬
马云龙
余梅
王志全
尹立林
Original Assignee
华中农业大学
广州影子科技有限公司
Priority date: 2018-12-28
Filing date: 2019-01-14
Publication date: 2020-07-02
Application filed by 华中农业大学, 广州影子科技有限公司
Priority to EP19903054.5A (EP3905253B1)
Priority to US17/417,007 (US20220076781A1)
Publication of WO2020133588A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20: Supervised data analysis

Definitions

  • The invention relates to the technical field of animal breeding, in particular to a fast and stable method for evaluating genomic breeding values of individual animals.
  • The technical problem to be solved by the present invention is to provide a fast and stable method for evaluating genomic breeding values of individual animals that addresses the above-mentioned deficiencies of the prior art.
  • BLUP: best linear unbiased prediction
  • HIBLUP makes full use of pedigree, phenotype, and genotype information to predict the genetic value (additive and dominance effects) of each animal and the effect of each SNP marker, implementing state-of-the-art algorithms for genomic breeding value prediction and variance component estimation in order to achieve genomic selection.
  • The technical solution adopted by the present invention is a fast and stable method for evaluating genomic breeding values of individual animals, which uses HIBLUP to predict genomic breeding values from phenotype, genotype, and pedigree information. The final output includes the estimated genetic value of each individual, the additive and dominance effect values of each individual, and the back-solved effect of each genetic marker on the genotyping chip. The method specifically comprises the following steps:
  • Step 1: Convert the genotypes to numeric codes, with genotypes AA, AB, and BB coded as 0, 1, and 2, respectively. Construct the A matrix (relationship by descent, IBD) among individuals from pedigree information using Henderson's tabular method, and the G matrix (relationship by state, IBS) from genomic information using VanRaden's method. Then, based on the A and G matrices, construct the blended relationship matrix H among individual animals, in which:
  • subscript "1" denotes the group of individuals with pedigree information only and no genotype information
  • subscript "2" denotes the group of individuals with both pedigree and genotype information
  • A_11 and A_22 are the pedigree relationship matrices among individuals within group "1" and within group "2", respectively
  • A_12 is the pedigree relationship matrix between the individuals of group "1" and group "2"
  • A_21 is the transpose of A_12
  • α is the blending percentage used to combine matrix G with matrix A_22
  • Step 2: Use the HE regression algorithm to derive the genetic variance and the residual variance from the H matrix and the phenotype values.
  • In the corresponding equation:
  • y is the vector of phenotype values
  • the variance explained by the i-th random effect and the residual variance are the quantities to be estimated
  • n is the number of random effects in the model
  • A_j is a symmetric non-negative matrix
  • K_i and K_j are the i-th and j-th additive effect covariate matrices, respectively
  • Step 3: Set the genetic variance and residual variance from the HE regression as the initial (prior) values for the subsequent AI iterations, then iterate the AI algorithm on the genetic variance and residual variance until the convergence criterion is met, yielding the estimated genetic parameters.
  • θ is the genetic parameter to be estimated
  • k is the iteration number
  • the first derivative of the log-likelihood function is taken with respect to each parameter to be estimated
  • Hes is the Hessian matrix, i.e. the second derivative of the log-likelihood function with respect to each variance
  • the AI matrix is calculated as AI = (-Hes + F)/2, where F is the expectation matrix used in the Fisher scoring method
  • Step 4: Using Henderson's method 3 and the genetic parameters estimated in Step 3, solve the mixed model equations to obtain the estimated breeding value of each individual.
  • Step 5: Calculate the additive effect of each SNP marker on the genotyping chip by back-solving.
  • In the corresponding formula:
  • m is the number of SNP markers
  • M′ is the additive marker covariate matrix
  • p_i and q_i are the allele frequencies of the i-th SNP marker
  • Step 6: With the genotypes AA, AB, and BB coded as 0, 1, and 0, respectively, apply the same procedure as in Steps 2 to 5 to the dominance model to back-solve the dominance effect of each SNP marker.
  • FIG. 1 is a flowchart of a fast and stable method for evaluating an individual animal genome breeding value according to an embodiment of the present invention.
  • Applying HIBLUP to genomic selection in pigs can shorten the breeding cycle (generation interval), improve selection accuracy, and accelerate genetic progress for the selected traits.
  • The application mainly comprises the following steps: obtain genotype, pedigree, and phenotype data; prepare these data sets in the input format required by HIBLUP; run the HIBLUP program to obtain the estimated breeding value (EBV) of each individual; compute a selection index from the EBVs of multiple traits; and rank the individuals by the composite selection index to provide a candidate list.
  • EBV: estimated breeding value

Abstract

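A fast and stable method for evaluating genomic breeding values of individual animals, relating to the technical field of animal breeding. The method uses HIBLUP to predict genomic breeding values from phenotype, genotype, and pedigree information. The final output includes the estimated genetic value of each individual, the additive and dominance effect values of each individual, and the back-solved effect of each genetic marker on the genotyping chip. The method makes full use of pedigree, phenotype, and genotype information to predict the genetic value of each animal and the effect of each SNP marker, implementing state-of-the-art algorithms for genomic breeding value prediction and variance component estimation in order to achieve genomic selection.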

Description

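Fast and stable method for evaluating genomic breeding values of individual animals
Technical Field
The present invention relates to the technical field of animal breeding, and in particular to a fast and stable method for evaluating genomic breeding values of individual animals.
Background
With the development of high-density single nucleotide polymorphism (SNP) genotyping technology covering the whole genome, genomic selection (prediction) has become a powerful tool for statistical genomic analysis. It is widely used to predict and evaluate the genetic value (breeding value) of complex traits in plant and animal breeding, and is increasingly applied in human genetics research. Variance component estimation is probably the most time-consuming part of the genomic selection process. Variance component estimation algorithms popular in genomic selection, such as EMAI, require iterative computation, and the computational complexity of each iteration is very high. Previous genomic selection programs need to compute the inverse of the genomic relationship matrix, and computing time increases rapidly as the number of genotyped samples grows.
Summary of the Invention
The technical problem to be solved by the present invention is to provide, in view of the above deficiencies of the prior art, a fast and stable method for evaluating genomic breeding values of individual animals. BLUP (best linear unbiased prediction) based on the HE-AI algorithm is called HIBLUP. HIBLUP makes full use of pedigree, phenotype, and genotype information to predict the genetic value (additive and dominance effects) of each animal and the effect of each SNP marker, implementing state-of-the-art algorithms for genomic breeding value prediction and variance component estimation in order to achieve genomic selection.
To solve the above technical problem, the technical solution adopted by the present invention is a fast and stable method for evaluating genomic breeding values of individual animals, which uses HIBLUP to predict genomic breeding values from phenotype, genotype, and pedigree information. The final output includes the estimated genetic value of each individual, the additive and dominance effect values of each individual, and the back-solved effect of each genetic marker on the genotyping chip. The method specifically comprises the following steps: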
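Step 1: Convert the genotypes to numeric codes, with genotypes AA, AB, and BB coded as 0, 1, and 2, respectively. Construct the A matrix (relationship by descent, IBD) among individuals from pedigree information using Henderson's tabular method, and the G matrix (relationship by state, IBS) from genomic information using VanRaden's method. Then, based on the A and G matrices, construct the blended relationship matrix H among individual animals, as shown in the following formula:
[Formula image: PCTCN2019071514-appb-000001]
The individuals are divided into two groups according to whether they have genotyping information: subscript "1" denotes the group of individuals with pedigree information only and no genotype information, and subscript "2" denotes the group with both pedigree and genotype information. A_11 and A_22 are the pedigree relationship matrices among individuals within group "1" and within group "2", respectively; A_12 is the pedigree relationship matrix between the individuals of group "1" and group "2"; A_21 is the transpose of A_12; and α is the blending percentage used to combine matrix G with matrix A_22.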
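The pedigree-based A matrix of Step 1 can be illustrated with a minimal sketch of the tabular method. The function name `tabular_a_matrix`, the use of `None` for an unknown parent, and the toy pedigree are assumptions made for this illustration; the patent does not specify an implementation.

```python
def tabular_a_matrix(sires, dams):
    """Pedigree relationship matrix A built with the tabular method.

    sires[i] and dams[i] are the parent indices of animal i, or None if unknown.
    Animals must be ordered so that parents appear before their offspring.
    """
    n = len(sires)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 plus the inbreeding coefficient (half the parents' relationship).
        both_known = s is not None and d is not None
        A[i][i] = 1.0 + (0.5 * A[s][d] if both_known else 0.0)
        for j in range(i):
            # Off-diagonal: mean relationship of animal j with the two parents of i.
            a_js = A[j][s] if s is not None else 0.0
            a_jd = A[j][d] if d is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
    return A


# Toy pedigree: 0 and 1 are founders, 2 and 3 are full sibs, 4 is the offspring of 2 x 3.
A = tabular_a_matrix(sires=[None, None, 0, 0, 2], dams=[None, None, 1, 1, 3])
```

In this toy pedigree animal 4 comes from a full-sib mating, so its diagonal element exceeds 1 by its inbreeding coefficient.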
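Step 2: Use the HE regression algorithm to derive the genetic variance and the residual variance from the H matrix and the phenotype values, according to the following equation:
[Formula image: PCTCN2019071514-appb-000002]
where y is the vector of phenotype values; [symbol image: PCTCN2019071514-appb-000003] is the variance explained by the i-th random effect; [symbol image: PCTCN2019071514-appb-000004] is the residual variance; n is the number of random effects in the model; A_j is a symmetric non-negative matrix; [symbol image: PCTCN2019071514-appb-000005] is the optimal estimate of A_j;
[Formula image: PCTCN2019071514-appb-000006]
[Formula image: PCTCN2019071514-appb-000007]
K_i and K_j are the i-th and j-th additive effect covariate matrices, respectively.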
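As a rough illustration of Step 2, the sketch below fits a Haseman-Elston type moment estimator by least squares on the entries of the phenotype cross-product matrix. This is a generic textbook form, not necessarily the exact equation in the patent's formula images; the function name `he_regression` and the choice to remove only the overall mean are assumptions.

```python
import numpy as np


def he_regression(y, kernels):
    """Moment (Haseman-Elston type) estimates of variance components.

    Fits y y' ~ sum_i K_i * s2_i + I * s2_e by least squares over all matrix
    entries and returns [s2_1, ..., s2_n, s2_e].
    """
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                      # only the overall mean is removed here
    mats = list(kernels) + [np.eye(len(y))]
    # Normal equations: lhs[i, j] = tr(K_i K_j), rhs[i] = y' K_i y.
    lhs = np.array([[np.sum(Ki * Kj) for Kj in mats] for Ki in mats])
    rhs = np.array([y @ Ki @ y for Ki in mats])
    return np.linalg.solve(lhs, rhs)
```

Because no iteration is involved, the estimates are cheap to compute, which is what makes them useful as starting values for an iterative method.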
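Step 3: Set the genetic variance and residual variance from the HE regression as the initial (prior) values for the subsequent AI iterations, then iterate the AI algorithm on the genetic variance and residual variance until the convergence criterion is met, yielding the estimated genetic parameters.
The AI algorithm is described in parts as follows:
a. Newton-Raphson algorithm:
[Formula image: PCTCN2019071514-appb-000008]
where θ is the genetic parameter to be estimated, k is the iteration number, [symbol image: PCTCN2019071514-appb-000009] is the first derivative of the log-likelihood function with respect to each parameter to be estimated, and Hes is the Hessian matrix, i.e. the second derivative of the log-likelihood function with respect to each variance.
b. Fisher scoring method, in which the inverse of the Hes matrix is replaced by its expectation matrix F, giving:
[Formula image: PCTCN2019071514-appb-000010]
The AI matrix is calculated by:
AI = (-Hes + F)/2;
The parameters are then estimated as:
[Formula image: PCTCN2019071514-appb-000011]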
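The three update rules of Step 3 are shown only as images in this text. The block below writes them in their commonly used textbook form, using the symbols defined above; it is a reconstruction under assumptions, not a transcription of the patent's figures. In particular, L denotes the log-likelihood and F is taken to be the expected information, F = E[-Hes], which is the reading under which AI = (-Hes + F)/2 is an average of observed and expected information.

```latex
\begin{aligned}
\text{Newton-Raphson:}\quad
  & \theta^{(k+1)} = \theta^{(k)} - \mathrm{Hes}^{-1}\,
    \frac{\partial L}{\partial \theta}\bigg|_{\theta^{(k)}} \\
\text{Fisher scoring:}\quad
  & \theta^{(k+1)} = \theta^{(k)} + F^{-1}\,
    \frac{\partial L}{\partial \theta}\bigg|_{\theta^{(k)}},
    \qquad F = \mathrm{E}\left[-\mathrm{Hes}\right] \\
\text{Average information:}\quad
  & \mathrm{AI} = \tfrac{1}{2}\left(-\mathrm{Hes} + F\right),
    \qquad
    \theta^{(k+1)} = \theta^{(k)} + \mathrm{AI}^{-1}\,
    \frac{\partial L}{\partial \theta}\bigg|_{\theta^{(k)}}
\end{aligned}
```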
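Step 4: Using Henderson's method 3 and the genetic parameters estimated in Step 3, solve the mixed model equations to obtain the estimated breeding value of each individual. The mixed model equations are:
[Formula image: PCTCN2019071514-appb-000012]
where
[Formula image: PCTCN2019071514-appb-000013]
Cov(u, e') = 0,
[Formula image: PCTCN2019071514-appb-000014]
X is the design matrix for the fixed effects, Z is the design matrix for the random effects, I is the identity matrix, K^-1 is the inverse of the relationship matrix, [symbol image: PCTCN2019071514-appb-000015] is the vector of estimated fixed effects, and [symbol image: PCTCN2019071514-appb-000016] is the vector of estimated breeding values.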
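A minimal sketch of Step 4 follows. It assembles and solves mixed model equations in the standard Henderson form for a single random effect, assuming Var(u) = K * s2_u and Var(e) = I * s2_e; the function name `solve_mme` and the dense direct solve are illustrative assumptions, and the exact equation in the patent's formula image may be laid out differently.

```python
import numpy as np


def solve_mme(y, X, Z, K, s2_u, s2_e):
    """Solve the mixed model equations for fixed effects b and breeding values u.

    Assumes y = X b + Z u + e with Var(u) = K * s2_u and Var(e) = I * s2_e.
    """
    lam = s2_e / s2_u                       # variance ratio
    K_inv = np.linalg.inv(K)                # inverse of the relationship matrix
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * K_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]     # (b_hat, u_hat)
```

The solution is equivalent to generalized least squares for b combined with BLUP for u, which offers a convenient check on small examples.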
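Step 5: Calculate the additive effect of each SNP marker on the genotyping chip by back-solving, using the formula:
[Formula image: PCTCN2019071514-appb-000017]
where [symbol image: PCTCN2019071514-appb-000018] is the vector of additive effects of the SNP markers, m is the number of SNP markers, M′ is the additive marker covariate matrix, and p_i and q_i are the allele frequencies of the i-th SNP marker.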
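The back-solving of Step 5 can be sketched as below, using a commonly cited form in which marker effects are recovered from genomic breeding values through the centred marker matrix and the genomic relationship matrix. This form, the function name `backsolve_snp_effects`, and the use of a pseudo-inverse are assumptions for illustration; the patent's own formula is only available as an image.

```python
import numpy as np


def backsolve_snp_effects(M, gebv):
    """Back-solve additive SNP effects from genomic breeding values.

    M is an (individuals x markers) array coded 0/1/2; gebv holds one value
    per genotyped individual.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                # allele frequency per marker
    Zc = M - 2.0 * p                        # centred marker covariates
    scale = 2.0 * np.sum(p * (1.0 - p))     # 2 * sum_i p_i q_i
    G = Zc @ Zc.T / scale
    # Centring makes G rank-deficient, so a pseudo-inverse is used here.
    G_pinv = np.linalg.pinv(G, rcond=1e-10, hermitian=True)
    return Zc.T @ G_pinv @ np.asarray(gebv, dtype=float) / scale
```

Multiplying the centred marker matrix by the returned effects reproduces the part of the breeding values that lies in the marker space.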
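Step 6: With the genotypes AA, AB, and BB coded as 0, 1, and 0, respectively, apply the same procedure as in Steps 2 to 5 to the dominance model to back-solve the dominance effect of each SNP marker.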
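The beneficial effects of the above technical solution are as follows. The fast and stable method for evaluating genomic breeding values of individual animals proposed by the present invention uses a combined strategy of Haseman-Elston (HE) regression and the average information (AI) algorithm to efficiently obtain stable estimates of the variance components. BLUP (best linear unbiased prediction) based on the HE-AI algorithm is called HIBLUP. HIBLUP makes full use of pedigree, phenotype, and genotype information to predict the genetic value (additive and dominance effects) of each animal and the effect of each SNP marker, implementing state-of-the-art algorithms for genomic breeding value prediction and variance component estimation in order to achieve genomic selection.
Brief Description of the Drawings
FIG. 1 is a flowchart of the fast and stable method for evaluating genomic breeding values of individual animals according to an embodiment of the present invention.
Detailed Description
Specific embodiments of the present invention are described in further detail below with reference to the drawing and examples. The following examples illustrate the invention but do not limit its scope.
As shown in FIG. 1, the method of this embodiment is as follows.
A fast and stable method for evaluating genomic breeding values of individual animals uses HIBLUP to predict genomic breeding values from phenotype, genotype, and pedigree information. The final output includes the estimated genetic value of each individual, the additive and dominance effect values of each individual, and the back-solved effect of each genetic marker on the genotyping chip. The method specifically comprises the following steps: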
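Step 1: Convert the genotypes to numeric codes, with genotypes AA, AB, and BB coded as 0, 1, and 2, respectively. Construct the A matrix (relationship by descent, IBD) among individuals from pedigree information using Henderson's tabular method, and the G matrix (relationship by state, IBS) from genomic information using VanRaden's method. Then, based on the A and G matrices, construct the blended relationship matrix H among individual animals, which contains information from both the A matrix and the G matrix, as shown in the following formula:
[Formula image: PCTCN2019071514-appb-000019]
The individuals are divided into two groups according to whether they have genotyping information: the group with subscript "1" contains the individuals with pedigree information only and no genotype information, and the group with subscript "2" contains the individuals with both pedigree and genotype information. A_11 and A_22 are the pedigree relationship matrices among individuals within group "1" and within group "2", respectively; A_12 is the pedigree relationship matrix between the individuals of group "1" and group "2"; A_21 is the transpose of A_12; and α is the blending percentage used to combine matrix G with matrix A_22.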
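To make Step 1 concrete, the sketch below computes a VanRaden-style G matrix from 0/1/2 genotypes and combines it with A into a blended matrix. The H construction shown is the widely used single-step form with G_w = (1 - α)G + αA_22; it is an assumption that the patent's formula image matches this form, and the names `vanraden_g`, `build_h`, and the default `alpha=0.05` are hypothetical.

```python
import numpy as np


def vanraden_g(M):
    """Genomic relationship matrix from an (individuals x markers) 0/1/2 array."""
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                   # frequency of the counted allele
    Z = M - 2.0 * p                            # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def build_h(A, G, n1, alpha=0.05):
    """Blended matrix H (assumed single-step form).

    The first n1 rows/columns of A are group "1" (pedigree only); the
    remaining rows/columns are group "2" (pedigree and genotypes).
    """
    A11, A12 = A[:n1, :n1], A[:n1, n1:]
    A21, A22 = A[n1:, :n1], A[n1:, n1:]
    Gw = (1.0 - alpha) * G + alpha * A22       # blend G with A22
    A22_inv = np.linalg.inv(A22)
    H11 = A11 + A12 @ A22_inv @ (Gw - A22) @ A22_inv @ A21
    H12 = A12 @ A22_inv @ Gw
    return np.block([[H11, H12], [H12.T, Gw]])
```

A useful sanity check on this form is that H reduces to A when G equals A_22, since the genomic information then adds nothing beyond the pedigree.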
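Step 2: Use the HE regression algorithm to derive the genetic variance and the residual variance from the H matrix and the phenotype values, according to the following equation:
[Formula image: PCTCN2019071514-appb-000020]
where y is the vector of phenotype values; [symbol image: PCTCN2019071514-appb-000021] is the variance explained by the i-th random effect; [symbol image: PCTCN2019071514-appb-000022] is the residual variance; n is the number of random effects in the model; A_j is a symmetric non-negative matrix; [symbol image: PCTCN2019071514-appb-000023] is the optimal estimate of A_j;
[Formula image: PCTCN2019071514-appb-000024]
[Formula image: PCTCN2019071514-appb-000025]
K_i and K_j are the i-th and j-th additive effect covariate matrices, respectively.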
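Step 3: Set the genetic variance and residual variance from the HE regression as the initial (prior) values for the subsequent AI iterations, then iterate the AI algorithm on the genetic variance and residual variance until the convergence criterion is met, yielding the estimated genetic parameters.
The AI algorithm is described in parts as follows:
a. Newton-Raphson algorithm:
[Formula image: PCTCN2019071514-appb-000026]
where θ is the genetic parameter to be estimated, k is the iteration number, [symbol image: PCTCN2019071514-appb-000027] is the first derivative of the log-likelihood function with respect to each parameter to be estimated, and Hes is the Hessian matrix, i.e. the second derivative of the log-likelihood function with respect to each variance.
b. Fisher scoring method, in which the inverse of the Hes matrix is replaced by its expectation matrix F, giving:
[Formula image: PCTCN2019071514-appb-000028]
The AI matrix is calculated by:
AI = (-Hes + F)/2;
The parameters are then estimated as:
[Formula image: PCTCN2019071514-appb-000029]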
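The iteration in Step 3 might look like the following sketch of average-information REML for one genetic variance and one residual variance. It assumes one record per animal (Z = I), uses the standard REML score and AI matrix built from the projection matrix P, and adds a crude step-halving guard to keep the variances positive; the function name `ai_reml`, the tolerance, and the guard are illustrative assumptions rather than details from the patent.

```python
import numpy as np


def ai_reml(y, X, K, theta0, tol=1e-8, max_iter=100):
    """AI-REML for V = theta[0] * K + theta[1] * I, started from theta0.

    theta0 would typically be the HE regression estimates from Step 2.
    """
    n = len(y)
    theta = np.array(theta0, dtype=float)
    dV = [K, np.eye(n)]                         # derivatives of V w.r.t. each variance
    for _ in range(max_iter):
        V = theta[0] * K + theta[1] * np.eye(n)
        Vi = np.linalg.inv(V)
        ViX = Vi @ X
        P = Vi - ViX @ np.linalg.solve(X.T @ ViX, ViX.T)
        Py = P @ y
        # First derivatives of the REML log-likelihood.
        score = np.array([-0.5 * (np.trace(P @ D) - Py @ D @ Py) for D in dV])
        # Average information matrix.
        ai = np.array([[0.5 * (Py @ Di @ P @ Dj @ Py) for Dj in dV] for Di in dV])
        step = np.linalg.solve(ai, score)
        new = theta + step
        while np.any(new <= 0):                 # crude guard: halve the step
            step = step * 0.5
            new = theta + step
        if np.max(np.abs(new - theta)) < tol:   # convergence criterion
            return new
        theta = new
    return theta
```

Starting from moment estimates rather than arbitrary values is the design choice the text describes: a reasonable starting point usually reduces the number of expensive iterations.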
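Step 4: Using Henderson's method 3 and the genetic parameters estimated in Step 3, solve the mixed model equations to obtain the estimated breeding value of each individual. The mixed model equations are:
[Formula image: PCTCN2019071514-appb-000030]
where
[Formula image: PCTCN2019071514-appb-000031]
Cov(u, e') = 0,
[Formula image: PCTCN2019071514-appb-000032]
X is the design matrix for the fixed effects, Z is the design matrix for the random effects, I is the identity matrix, K^-1 is the inverse of the relationship matrix, [symbol image: PCTCN2019071514-appb-000033] is the vector of estimated fixed effects, and [symbol image: PCTCN2019071514-appb-000034] is the vector of estimated breeding values.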
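Step 5: Calculate the additive effect of each SNP marker on the genotyping chip by back-solving, using the formula:
[Formula image: PCTCN2019071514-appb-000035]
where [symbol image: PCTCN2019071514-appb-000036] is the vector of additive effects of the SNP markers, m is the number of SNP markers, M′ is the additive marker covariate matrix, and p_i and q_i are the allele frequencies of the i-th SNP marker.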
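Step 6: With the genotypes AA, AB, and BB coded as 0, 1, and 0, respectively, apply the same procedure as in Steps 2 to 5 to the dominance model to back-solve the dominance effect of each SNP marker.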
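The recoding described in Step 6 is simple enough to show directly. The helper name `dominance_coding` is hypothetical; it maps the additive codes 0/1/2 (AA/AB/BB) to the dominance codes 0/1/0, so that only heterozygotes carry a 1.

```python
def dominance_coding(genotypes):
    """Recode additive genotype codes 0/1/2 (AA/AB/BB) as dominance codes 0/1/0."""
    return [[1 if g == 1 else 0 for g in row] for row in genotypes]


# Rows are individuals, columns are SNP markers.
additive = [[0, 1, 2], [1, 1, 0]]
dominance = dominance_coding(additive)
```

The recoded matrix then takes the place of the additive marker matrix when Steps 2 to 5 are repeated for the dominance model.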
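Applying HIBLUP to genomic selection in pigs can shorten the breeding cycle (generation interval), improve selection accuracy, and accelerate genetic progress for the selected traits. The application mainly comprises the following steps: obtain genotype, pedigree, and phenotype data; prepare these data sets in the input format required by HIBLUP; run the HIBLUP program to obtain the estimated breeding value (EBV) of each individual; compute a selection index from the EBVs of multiple traits; and rank the individuals by the composite selection index to provide a candidate list.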
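The last two steps of this workflow (index calculation and ranking) can be sketched as follows. The text does not say how the selection index is formed, so a plain weighted sum of trait EBVs is assumed; the function name `rank_candidates`, the weights, and the animal IDs are made up for illustration.

```python
def rank_candidates(ids, ebv_rows, weights, top=None):
    """Rank animals by a linear selection index of their trait EBVs.

    ebv_rows[i] holds the EBVs of animal ids[i], one value per trait;
    weights holds one (assumed) economic weight per trait.
    """
    index = [sum(w * e for w, e in zip(weights, row)) for row in ebv_rows]
    # Highest index first; sorted() is stable, so ties keep the input order.
    order = sorted(range(len(ids)), key=lambda i: -index[i])
    ranked = [(ids[i], index[i]) for i in order]
    return ranked if top is None else ranked[:top]


candidates = rank_candidates(
    ids=["pig_a", "pig_b", "pig_c"],
    ebv_rows=[[1.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
    weights=[1.0, 0.5],
)
```

Passing `top=k` returns only the k best animals as the candidate list.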
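Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope defined by the claims of the present invention.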

Claims (2)

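  1. A fast and stable method for evaluating genomic breeding values of individual animals, characterized in that HIBLUP is used to predict genomic breeding values from phenotype, genotype, and pedigree information, and the final output includes the estimated genetic value of each individual, the additive and dominance effect values of each individual, and the back-solved effect of each genetic marker on the genotyping chip; the method specifically comprises the following steps:
    Step 1: convert the genotypes to numeric codes, with genotypes AA, AB, and BB coded as 0, 1, and 2, respectively; construct the A matrix (relationship by descent, IBD) among individuals from pedigree information using Henderson's tabular method, and the G matrix (relationship by state, IBS) from genomic information using VanRaden's method; then, based on the A and G matrices, construct the blended relationship matrix H among individual animals, as shown in the following formula:
    [Formula image: PCTCN2019071514-appb-100001]
    the individuals are divided into two groups according to whether they have genotyping information, subscript "1" denoting the group of individuals with pedigree information only and no genotype information, and subscript "2" denoting the group with both pedigree and genotype information; A_11 and A_22 are the pedigree relationship matrices among individuals within group "1" and within group "2", respectively; A_12 is the pedigree relationship matrix between the individuals of group "1" and group "2"; A_21 is the transpose of A_12; and α is the blending percentage used to combine matrix G with matrix A_22;
    Step 2: use the HE regression algorithm to derive the genetic variance and the residual variance from the H matrix and the phenotype values, according to the following equation:
    [Formula image: PCTCN2019071514-appb-100002]
    where y is the vector of phenotype values; [symbol image: PCTCN2019071514-appb-100003] is the variance explained by the i-th random effect; [symbol image: PCTCN2019071514-appb-100004] is the residual variance; n is the number of random effects in the model; A_j is a symmetric non-negative matrix; [symbol image: PCTCN2019071514-appb-100005] is the optimal estimate of A_j;
    [Formula image: PCTCN2019071514-appb-100006]
    [Formula image: PCTCN2019071514-appb-100007]
    K_i and K_j are the i-th and j-th additive effect covariate matrices, respectively;
    Step 3: set the genetic variance and residual variance from the HE regression as the initial (prior) values for the subsequent AI iterations, then iterate the AI algorithm on the genetic variance and residual variance until the convergence criterion is met, yielding the estimated genetic parameters;
    Step 4: using Henderson's method 3 and the genetic parameters estimated in Step 3, solve the mixed model equations to obtain the estimated breeding value of each individual, the mixed model equations being:
    [Formula image: PCTCN2019071514-appb-100008]
    where
    [Formula image: PCTCN2019071514-appb-100009]
    Cov(u, e') = 0,
    [Formula image: PCTCN2019071514-appb-100010]
    X is the design matrix for the fixed effects, Z is the design matrix for the random effects, I is the identity matrix, K^-1 is the inverse of the relationship matrix, [symbol image: PCTCN2019071514-appb-100011] is the vector of estimated fixed effects, and [symbol image: PCTCN2019071514-appb-100012] is the vector of estimated breeding values;
    Step 5: calculate the additive effect of each SNP marker on the genotyping chip by back-solving, using the formula:
    [Formula image: PCTCN2019071514-appb-100013]
    where [symbol image: PCTCN2019071514-appb-100014] is the vector of additive effects of the SNP markers, m is the number of SNP markers, M′ is the additive marker covariate matrix, and p_i and q_i are the allele frequencies of the i-th SNP marker;
    Step 6: with the genotypes AA, AB, and BB coded as 0, 1, and 0, respectively, apply the same procedure as in Steps 2 to 5 to the dominance model to back-solve the dominance effect of each SNP marker.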
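  2. The fast and stable method for evaluating genomic breeding values of individual animals according to claim 1, characterized in that the AI algorithm in Step 3 is described in parts as follows:
    a. Newton-Raphson algorithm:
    [Formula image: PCTCN2019071514-appb-100015]
    where θ is the genetic parameter to be estimated, k is the iteration number, [symbol image: PCTCN2019071514-appb-100016] is the first derivative of the log-likelihood function with respect to each parameter to be estimated, and Hes is the Hessian matrix, i.e. the second derivative of the log-likelihood function with respect to each variance;
    b. Fisher scoring method, in which the inverse of the Hes matrix is replaced by its expectation matrix F, giving:
    [Formula image: PCTCN2019071514-appb-100017]
    the AI matrix is calculated by:
    AI = (-Hes + F)/2;
    the parameters are then estimated as:
    [Formula image: PCTCN2019071514-appb-100018]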
PCT/CN2019/071514 2018-12-28 2019-01-14 Fast and stable method for evaluating genomic breeding values of individual animals WO2020133588A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19903054.5A EP3905253B1 (en) 2018-12-28 2019-01-14 Rapid and stable method for evaluating individual animal genome breeding values
US17/417,007 US20220076781A1 (en) 2018-12-28 2019-01-14 Fast and stable genomic breeding value evaluating method for animal individuals

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811620927.X 2018-12-28
CN201811620927.XA CN109524059B (zh) 2018-12-28 2018-12-28 Fast and stable method for evaluating genomic breeding values of individual animals

Publications (1)

Publication Number Publication Date
WO2020133588A1 true WO2020133588A1 (zh) 2020-07-02

Family

ID=65797805

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/071514 WO2020133588A1 (zh) 2018-12-28 2019-01-14 Fast and stable method for evaluating genomic breeding values of individual animals

Country Status (4)

Country Link
US (1) US20220076781A1 (zh)
EP (1) EP3905253B1 (zh)
CN (1) CN109524059B (zh)
WO (1) WO2020133588A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
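CN112837750A (zh) * 2021-02-07 2021-05-25 深圳市华大农业应用研究院 Method and device for predicting breeding values of animals and plants
CN113555063A (zh) * 2021-07-28 2021-10-26 仲恺农业工程学院 Method for estimating genomic breeding values of threshold traits based on SNP chips, and application thereof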

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
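CN110060783A (zh) * 2019-04-16 2019-07-26 广州影子科技有限公司 Authentication method, authentication device, and authentication system for epidemic-prevention grade
EP3970092A1 (en) * 2019-05-14 2022-03-23 Agriculture and Food Development Authority (TEAGASC) A method and system for estimation of the breeding value of an animal for eating quality and/or commercial yield prediction
CN110317884A (zh) * 2019-07-30 2019-10-11 河南省农业科学院畜牧兽医研究所 Method for rapidly selecting beef cattle line progenitors for breeding
CN110459265B (zh) * 2019-08-14 2022-07-05 中国农业科学院作物科学研究所 Method for improving the accuracy of whole-genome prediction
CN112750494B (zh) * 2021-01-22 2024-04-09 贵州大学 Method for evaluating individual genomic breeding values for phenotypic traits of Xiang pigs
CN113223606B (zh) * 2021-05-13 2022-05-27 浙江大学 Genomic selection method for genetic improvement of complex traits
CN113517020A (zh) * 2021-08-04 2021-10-19 华中农业大学 Fast and accurate analysis method for genomic mating of animals
CN114639446B (zh) * 2022-04-01 2024-03-15 中国海洋大学 Method for estimating genomic breeding values of aquatic animals based on an MCP sparse deep neural network model
CN114743601B (zh) * 2022-04-18 2023-02-03 中国农业科学院农业基因组研究所 Breeding method, apparatus, and device based on multi-omics data and deep learning
CN115588465B (zh) * 2022-10-19 2023-05-23 温州医科大学 Method and system for screening trait-associated genes
CN117852824B (zh) * 2024-01-09 2024-09-03 内蒙古盛健农牧业工程技术研究有限公司 Intelligent monitoring and management system for dairy goat breeding status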

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050177316A1 (en) * 2003-09-30 2005-08-11 Mitsubishi Research Institute, Inc. Algorithm for estimating and testing association between a haplotype and quantitative phenotype
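CN103914632A (zh) * 2014-02-26 2014-07-09 中国农业大学 Method for rapidly estimating genomic breeding values, and application thereof
CN105052729A (zh) * 2015-08-31 2015-11-18 华中农业大学 Method for evaluating the breeding potential of animal and plant varieties based on a selected-locus index, and application thereof
CN106022005A (zh) * 2016-05-21 2016-10-12 安徽省农业科学院畜牧兽医研究所 Bayesian method for joint estimation of genomic breeding values of continuous and threshold traits
CN107338321A (zh) * 2017-08-29 2017-11-10 集美大学 Method for determining the optimal number of SNPs and performing genomic selection breeding for production performance of large yellow croaker using screened markers
CN107590364A (zh) * 2017-08-29 2018-01-16 集美大学 New fast Bayesian method for estimating genomic breeding values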

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150181822A1 (en) * 2013-12-31 2015-07-02 Dow Agrosciences Llc Selection based on optimal haploid value to create elite lines

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3905253A4 *

Also Published As

Publication number Publication date
CN109524059A (zh) 2019-03-26
EP3905253B1 (en) 2024-10-16
EP3905253A1 (en) 2021-11-03
US20220076781A1 (en) 2022-03-10
CN109524059B (zh) 2023-02-28
EP3905253A4 (en) 2022-09-07

Legal Events

Date Code Title Description
(no date) 121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19903054; Country of ref document: EP; Kind code of ref document: A1
(no date) NENP Non-entry into the national phase. Ref country code: DE
(no date) ENP Entry into the national phase. Ref document number: 2019903054; Country of ref document: EP; Effective date: 20210728