WO2023103303A1 - Genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper (Plectropomus leopardus) - Google Patents

Info

Publication number
WO2023103303A1
WO2023103303A1 PCT/CN2022/096720 CN2022096720W WO2023103303A1 WO 2023103303 A1 WO2023103303 A1 WO 2023103303A1 CN 2022096720 W CN2022096720 W CN 2022096720W WO 2023103303 A1 WO2023103303 A1 WO 2023103303A1
Authority
WO
WIPO (PCT)
Prior art keywords
disease
resistant
population
snp
genome
Prior art date
Application number
PCT/CN2022/096720
Other languages
English (en)
French (fr)
Inventor
陈松林
卢昇
刘洋
周茜
王磊
朱春华
张天时
陈亚东
徐文腾
Original Assignee
中国水产科学研究院黄海水产研究所
南方海洋科学与工程广东省实验室湛江
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-12-06
Filing date
2022-06-02
Publication date
2023-06-15
Application filed by 中国水产科学研究院黄海水产研究所, 南方海洋科学与工程广东省实验室湛江
Priority to CN202280002105.3A (published as CN116917504A)
Publication of WO2023103303A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/124: Animal traits, i.e. production traits, including athletic performance or the like
    • C12Q2600/156: Polymorphic or mutational markers
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/80: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in fisheries management
    • Y02A40/81: Aquaculture, e.g. of fish

Definitions

  • The invention belongs to the technical field of aquatic genetics and breeding, and in particular relates to a genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper (Dongxingban).
  • The leopard coral grouper, scientific name Plectropomus leopardus, belongs to the order Perciformes, family Serranidae, subfamily Epinephelinae, and genus Plectropomus.
  • Leopard coral grouper has delicate, flavorful flesh and a bright red body, which gives it high economic and ornamental value; it is a prized farmed marine fish. Because market demand is high, overfishing has sharply reduced wild populations and made the fish scarce on the market. In recent years, with breakthroughs in artificial culture technology, the range and scale of farming have gradually expanded.
  • Artificial propagation of leopard coral grouper relies mainly on natural spawning and natural fertilization, so families cannot be established. Effective selective breeding methods are therefore lacking for species, such as the leopard coral grouper, in which families are difficult to establish.
  • The invention describes in detail a method for breeding disease-resistant improved varieties of leopard coral grouper from non-family populations. It aims to provide a technical method for breeding improved varieties and a new approach to selective breeding of fish in which families are difficult to establish.
  • The purpose of the present invention is to establish a genomic selection method, based on non-family populations, for breeding disease-resistant improved varieties of leopard coral grouper. The method overcomes the need to establish families and the low selection accuracy of traditional breeding, provides a molecular breeding method for disease-resistant improved varieties, raises the disease resistance of farmed populations, and advances the breeding of such varieties.
  • The genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper provided by the present invention comprises the following steps:
  • the juveniles used to construct the disease-resistant reference population are obtained after screening for disease resistance;
  • step 4) based on the results of step 3), select a set of SNP markers for calculating the GEBV of the candidate population; the SNP markers include SNP loci associated with disease resistance in leopard coral grouper;
  • step 5) use the validation population to verify the predictive performance of the genomic selection method chosen in step 3) and the SNP markers chosen in step 4). If the mean GEBV of surviving validation individuals is greater than the mean GEBV of dead individuals, GEBV is positively correlated with disease resistance: individuals with high GEBV are also more resistant, and GEBV can be used to select candidate parents for breeding offspring with strong disease resistance;
  • step 6) use the genomic selection method chosen in step 3) and the SNP markers chosen in step 4) to calculate the GEBV of the candidate population; individuals with high GEBV have strong disease resistance and can be used as parents to breed offspring.
  • The SNP loci total 20,000 (20k) and are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 10000; the sequence within 35 bp on either side of each locus is the flanking sequence;
  • The SNP loci include 30 loci associated with disease resistance in leopard coral grouper, located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 15; the sequence within 35 bp on either side of each locus is the flanking sequence;
  • The breeding method based on genomic selection and the 20,000 SNP loci provided by the present invention can be used to screen for parents with strong disease resistance. The selected parents can be used directly in selective breeding, providing an efficient technical means of raising the survival rate of farmed populations.
  • Figure 1: Box plot of the missing rate of loci shared by the reference and validation populations of leopard coral grouper;
  • Figure 2: Curves showing how the accuracy of G-BLUP and BayesCπ in predicting GEBV for disease resistance changes;
  • Figure 3: Density distribution of the 20k SNPs used to estimate the GEBV of the validation population;
  • Figure 4: Comparison of the mean disease-resistance GEBV of surviving and dead individuals in the validation population;
  • Figure 5: Comparison of the mean disease-resistance GEBV of resistant and susceptible individuals in the candidate population.
  • Example 1: Establishing a disease-resistant reference population of leopard coral grouper
  • Juveniles were collected from several large farms in Hainan and Shandong, transported to a facility where artificial infection experiments could be carried out, and artificially infected by intraperitoneal injection of a Vibrio harveyi suspension.
  • The time of injection was taken as the starting point (hour 0).
  • Survival was checked every 8 hours after injection. Dead individuals were removed promptly, their information (such as total length, body weight, and time of death) was recorded, and caudal fin rays were clipped and stored in absolute ethanol.
  • The experiment lasted 5 days.
  • At the end of the experiment, caudal fin rays of surviving individuals were collected and stored in absolute ethanol, and origin, total length, body weight, and time of death were recorded. Some individuals were selected for whole-genome resequencing to construct the reference population for resistance to Vibrio harveyi (Table 1). The survival status of each juvenile in the infection experiment was used as its disease-resistance phenotype in subsequent analysis.
  • The disease-resistance phenotype was defined as a binary trait: 0 denotes juveniles that died during the infection experiment; 1 denotes juveniles that survived it.
  • The collected juveniles were divided into 6 groups and, after injection, placed in fiberglass-reinforced plastic tanks of the same specification.
  • The mean body weight and mean total length of juveniles from Shandong were higher than those of juveniles from Hainan.
  • Post-infection survival ranged from 31.85% to 43.41%; a total of 534 juveniles were selected to construct the reference population (Table 1).
  • Example 2: Genome resequencing of the disease-resistant reference population and screening of high-quality SNPs
  • Genomic DNA of the disease-resistant reference population was extracted and purified following the standard protocol of a DNA extraction kit (TIANGEN, Beijing).
  • Genomic DNA was also extracted from 298 leopard coral grouper with disease-resistance phenotypes to validate the method of the present invention.
  • Purified genomic DNA was used to build Illumina paired-end libraries, which were sequenced on the Illumina HiSeq 2000 platform. After low-quality reads were filtered out, BWA was used to align the reads to the leopard coral grouper reference genome, and GATK was used to call variants (SNPs and indels).
  • Example 3: Accuracy of two genomic selection methods in predicting disease-resistance GEBV in leopard coral grouper
  • Using the phenotype and genotype data of the reference population, the present invention compares, by 5-fold cross-validation, how the accuracy of G-BLUP and BayesCπ in predicting GEBV for resistance to Vibrio harveyi changes as the number of SNPs decreases.
  • The area under the receiver operating characteristic curve (AUC) was used as the measure of prediction accuracy. The calculation was performed 25 times in total, and the final accuracy is reported as the mean of the 25 runs.
  • The following R packages must be installed before the calculation: data.table, ASReml, BGLR, Rcpp, pROC, and parallel.
  • R is used to extract subsets containing different numbers of SNPs from the reference population and store them in binary form; the subset sizes are 500, 700, 1k, 2k, 8k, 10k, 20k, 50k, 80k, and 100k SNPs.
  • In the model, $y_{ijk}$ is the phenotype of individual k, collected from location i and infected at body weight j; $\mathrm{Ori.Location}_i$ is a fixed effect for the sampling location of the individual; $\mathrm{b.weight}_j$ is a covariate for the body weight of the individual at the time of the infection experiment; $a_k$ is the random additive effect of the individual; and $\Phi$ is the cumulative distribution function of the standard normal distribution.
  • The last line of the file "Ref.cv.results.gblup.csv" gives the accuracy of G-BLUP in predicting disease-resistance GEBV as the number of SNPs changes.
  • The prediction accuracy of BayesCπ was evaluated in R with the same model as G-BLUP. Because Bayesian methods are time-consuming, a separate command is issued for each SNP subset; SNP subset "s1" is used as the example of how to estimate the prediction accuracy of BayesCπ.
  • Overall, the prediction accuracy of BayesCπ is slightly better than that of G-BLUP, so BayesCπ can be used to estimate disease-resistance GEBV. Because genotyping cost rises with the number of markers used to estimate GEBV, the present invention uses 20k SNP loci, which maintains the prediction accuracy of genomic selection while moderately reducing genotyping cost and computation time.
  • Example 4: Selecting the SNP markers used to predict the GEBV of the candidate population
  • The present invention selects 20,000 (20k) SNP loci for estimating the disease-resistance GEBV of the candidate population: 30 loci associated with resistance to Vibrio harveyi and 19,970 loci evenly distributed across the leopard coral grouper genome.
  • The disease-resistance-associated loci are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 15 and were obtained by genome-wide association analysis.
  • The other SNP loci are evenly distributed across the genome and are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 16 to SEQ ID NO: 10000.
  • gcta --bpfile geno4gwas --grm-sparse sparse.grm.gcta --pheno pheno4gcta.phe --qcovar qcov4gcta.qcov --fastGWA-mlm --autosome-num 24 --out res.gwas
  • The file "snps.20k.txt" contains the IDs of the 20k SNPs, which PLINK2 can read to extract the corresponding loci.
  • The file contains 20,000 SNP loci in total: 30 loci associated with disease resistance and 19,970 loci that evenly cover the leopard coral grouper reference genome.
  • Figure 3 shows the distribution of these 20k SNPs across the genome in 1 Mb windows; the loci cover the genome essentially evenly.
  • Example 5: The 20k SNPs obtained in Example 4 were validated with the same model as in Example 3. Compared with the results of Example 3, BayesCπ with 20k SNP loci evenly distributed across the genome (without the 30 disease-resistance-associated loci selected in Example 4) had a prediction accuracy of 0.670 (Fig. 2), whereas with the 20k loci that include these 30 loci its accuracy was 0.682, an improvement of 0.012 and close to the accuracy (0.681) of BayesCπ with 50k SNPs in Example 3. The 20k SNPs provided by the present invention therefore support genomic selection and give good predictive performance.
  • Example 6: Calculating the disease-resistance GEBV of the candidate population
  • One hundred juvenile leopard coral grouper were selected as the candidate population and genotyped by whole-genome resequencing; the genotypes are stored in the file "candidates.geno.vcf.gz". After sequencing, the same 20k SNPs as in Example 4 were extracted and, together with the reference population from Example 3, used to calculate the disease-resistance GEBV of the candidate population with BayesCπ.
  • The present invention used an infection experiment to measure the disease resistance of the candidate population. Surviving individuals were classed as resistant and dead individuals as susceptible. The 100 candidate fish comprised 50 resistant and 50 susceptible individuals. The commands are run in a Linux environment.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

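The present invention provides a genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper (Plectropomus leopardus, known in Chinese as Dongxingban, 东星斑). The method uses non-family juveniles of different geographic origins to construct a disease-resistant reference population; the juveniles used for the reference population are obtained after screening for disease resistance. Individuals that have a disease-resistance phenotype but are not included in the reference population serve as a validation population, which is used to verify how well the SNP loci provided by the invention perform in calculating genomic estimated breeding values (GEBV). These SNP loci include loci associated with disease resistance in leopard coral grouper. The SNP loci are then used to calculate the GEBV of a candidate population; individuals with high GEBV have stronger disease resistance and can be used to breed disease-resistant improved varieties. The breeding method based on genomic selection can be used to screen for parents with strong disease resistance, and the selected parents can be used in selective breeding for disease resistance, providing a technical means of raising the survival rate of farmed populations.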

Description

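Genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper
Technical Field
The present invention belongs to the technical field of aquatic genetics and breeding, and specifically relates to a genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper.
Background Art
The leopard coral grouper, known in Chinese as Dongxingban (scientific name Plectropomus leopardus), belongs to the order Perciformes, family Serranidae, subfamily Epinephelinae, genus Plectropomus. Its flesh is delicate and flavorful and its whole body is bright red, giving it high economic and ornamental value; it is a prized farmed marine fish. Because market demand is high, overfishing has sharply reduced wild populations and made the species scarce on the market. In recent years, with breakthroughs and advances in artificial culture technology, the range and scale of farming have gradually expanded. However, because high-density farming is widespread and both the culture environment and the germplasm have deteriorated, disease outbreaks are frequent and cause heavy economic losses. Among these diseases, "rotten body disease" caused by Vibrio harveyi has become a major problem for farmers. For example, a bacterial disease dominated by Vibrio harveyi broke out in Hainan in 2011; large numbers of diseased fish died, with symptoms such as tail ulceration and whitening of the whole body. Farmers usually rely on prevention, and once disease appears they often treat it by broadcasting antibiotics such as oxytetracycline, enrofloxacin, and norfloxacin over the whole pond. Large-scale antibiotic use, however, lowers the quality of aquatic products, creates food safety problems, reduces the value of market fish, and poses potential harm to consumers. Breeding disease-resistant improved varieties and raising the intrinsic disease resistance of farmed populations is therefore essential.
Accurately selecting individuals with breeding potential is the core problem in breeding improved varieties. In selective breeding of fish for disease resistance, disease-resistance phenotypes are usually obtained from large numbers of established families and artificial infection experiments, and selection is then based on estimated breeding values (EBV) computed by best linear unbiased prediction (PBLUP). To avoid vertical transmission of the pathogen, however, individuals that survive the infection experiment are generally not used as broodstock; instead, uninfected healthy individuals are chosen from families with high EBV. Selection efficiency is therefore low and the genetic gain achievable each year is limited. If families cannot be established for the species, selection must rely on phenotype or experience, and breeding efficiency falls sharply. At present, artificial propagation of leopard coral grouper relies mainly on natural spawning and natural fertilization, so families cannot be established. Effective selective breeding methods are therefore lacking for species such as this one. The present invention describes in detail a method of breeding disease-resistant improved varieties of leopard coral grouper from non-family populations, with the aim of providing a technical method for breeding improved varieties and a new approach for species in which families are difficult to establish.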
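Summary of the Invention
The purpose of the present invention is to establish a genomic selection method, based on non-family populations, for breeding disease-resistant improved varieties of leopard coral grouper. The method overcomes the need to establish families and the low selection accuracy of traditional breeding, provides a molecular breeding method for disease-resistant improved varieties, raises the disease resistance of farmed populations, and advances the breeding of such varieties.
The genomic selection method provided by the present invention comprises the following steps:
1) Construct a disease-resistant reference population from leopard coral grouper juveniles of different geographic origins;
in addition, use a portion of the individuals that have a disease-resistance phenotype but are not included in the reference population as a validation population for verifying the selection performance of the method;
the juveniles used to construct the reference population are obtained after screening for disease resistance;
2) Perform whole-genome resequencing on the reference and validation populations to capture high-quality genome-wide single nucleotide polymorphism (SNP) loci;
3) Compare how the accuracy of genomic best linear unbiased prediction (G-BLUP) and BayesCπ in predicting the genomic estimated breeding value (GEBV) of the disease-resistance trait changes with the number of SNPs, and choose the number of SNPs and the method to be used for calculating the GEBV of the candidate population;
4) Based on the results of step 3), select a set of SNP markers for calculating the GEBV of the candidate population; the SNP markers include SNP loci associated with disease resistance in leopard coral grouper;
5) Use the validation population to verify the predictive performance of the genomic selection method chosen in step 3) and the SNP markers chosen in step 4). If the mean GEBV of surviving validation individuals is greater than the mean GEBV of dead individuals, GEBV is positively correlated with disease resistance: individuals with high GEBV are also more resistant, and GEBV can be used to select candidate parents and thereby breed offspring with strong disease resistance;
6) Use the genomic selection method chosen in step 3) and the SNP markers chosen in step 4) to calculate the GEBV of the candidate population; individuals with high GEBV have strong disease resistance and can be used as parents to breed offspring.
The SNP loci total 20,000 (20k) and are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 10000; the sequence within 35 bp on either side of each locus is the flanking sequence;
the SNP loci include 30 loci associated with disease resistance in leopard coral grouper, located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 15; the sequence within 35 bp on either side of each locus is the flanking sequence.
The breeding method based on genomic selection and the 20,000 SNP loci provided by the present invention can be used to screen for parents with strong disease resistance. The selected parents can be used directly in selective breeding, providing an efficient technical means of raising the survival rate of farmed populations.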
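Brief Description of the Drawings
Figure 1: Box plot of the missing rate of loci shared by the reference and validation populations of leopard coral grouper;
Figure 2: Curves showing how the accuracy of G-BLUP and BayesCπ in predicting GEBV for disease resistance changes;
Figure 3: Density distribution of the 20k SNPs used to estimate the GEBV of the validation population;
Figure 4: Comparison of the mean disease-resistance GEBV of surviving and dead individuals in the validation population;
Figure 5: Comparison of the mean disease-resistance GEBV of resistant and susceptible individuals in the candidate population.
Detailed Description of Embodiments
The method of the present invention for breeding a farmed leopard coral grouper population resistant to Vibrio harveyi is described in detail below.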
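Example 1: Establishing a disease-resistant reference population of leopard coral grouper
Juveniles were collected from several large farms in Hainan and Shandong, transported to a facility where artificial infection experiments could be carried out, and artificially infected by intraperitoneal injection of a Vibrio harveyi suspension. The time of injection was taken as the starting point (hour 0). Survival was checked every 8 hours after injection; dead individuals were removed promptly, their information (such as total length, body weight, and time of death) was recorded, and caudal fin rays were clipped and stored in absolute ethanol. The experiment lasted 5 days. At its end, caudal fin rays of surviving individuals were collected and stored in absolute ethanol, and origin, total length, body weight, and time of death were recorded. Some individuals were selected for whole-genome resequencing to construct the reference population for resistance to Vibrio harveyi (Table 1). The survival status of each juvenile in the infection experiment was used as its disease-resistance phenotype in subsequent analysis. The phenotype was defined as a binary trait: 0 denotes juveniles that died during the infection experiment; 1 denotes juveniles that survived it.
In the experiment, the collected juveniles were divided into 6 groups and, after injection, placed in fiberglass-reinforced plastic tanks of the same specification. The mean body weight and mean total length of juveniles from Shandong were higher than those of juveniles from Hainan. One-way analysis of variance showed no significant difference in post-infection survival among the 6 groups (p=0.05). Post-infection survival ranged from 31.85% to 43.41%, and a total of 534 juveniles were selected to construct the reference population (Table 1).
Table 1: Artificial infection experiment and reference population information for leopard coral grouper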
Figure PCTCN2022096720-appb-000001
Figure PCTCN2022096720-appb-000002
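Example 2: Genome resequencing of the disease-resistant reference population and screening of high-quality SNPs
1. Whole-genome resequencing and variant calling in the reference population
Genomic DNA of the disease-resistant reference population was extracted and purified following the standard protocol of a DNA extraction kit (TIANGEN, Beijing). In addition to the reference population, genomic DNA was extracted from 298 leopard coral grouper with disease-resistance phenotypes to validate the method of the present invention. Purified genomic DNA was used to build Illumina paired-end libraries, which were sequenced on the Illumina HiSeq 2000 platform. After low-quality reads were filtered out, BWA was used to align the reads to the leopard coral grouper reference genome, and GATK was used to call variants (SNPs and indels). In the reference population, 498 individuals were sequenced successfully, with 10 G of sequencing data per individual; 34,026,719 variants were obtained, of which 28,789,387 were on chromosomes and 5,237,332 were not; the mean sequencing depth per individual was 7.66X; the variants are stored in the file "variances.Ref.vcf.gz". In the validation population, 298 individuals were sequenced successfully, with 5 G of sequencing data per individual; 12,452,483 variants were obtained, of which 11,355,373 were on chromosomes and 1,097,110 were not; the mean sequencing depth per individual was 3.66X; the variants are stored in the file "variances.Val.vcf.gz".
2. Screening high-quality SNP loci
The variants above were filtered in the following steps; high-quality SNP loci were retained and missing genotypes were then imputed. Run the following commands in a Linux environment:
Figure PCTCN2022096720-appb-000003
Figure PCTCN2022096720-appb-000004
After running, the chromosomal SNP loci of the reference and validation populations are stored in the files "raw.snps.Ref.vcf.gz" and "raw.snps.Val.vcf.gz", respectively.
Figure PCTCN2022096720-appb-000005
Figure PCTCN2022096720-appb-000006
Figure PCTCN2022096720-appb-000007
After running, the file "kept.loci.txt" holds the loci whose missing rate is below 0.05 in both files (Figure 1).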
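The missing-rate filter described above can be sketched as follows. This is a minimal illustration in Python rather than the commands used in the patent, which survive only as images in the original. The genotype encoding (`None` for a missing call), the function names, and the toy data are all assumptions. A locus is kept only when its missing rate is below 0.05 in both populations.

```python
def missing_rate(calls):
    """Fraction of missing genotype calls (None) at one locus."""
    return sum(c is None for c in calls) / len(calls)


def shared_low_missing_loci(ref, val, threshold=0.05):
    """Return IDs of loci present in both populations whose missing rate
    is below `threshold` in each of them.

    `ref` and `val` map a locus ID to that locus's list of genotype calls.
    """
    kept = []
    for locus in ref:
        if locus not in val:
            continue
        if (missing_rate(ref[locus]) < threshold
                and missing_rate(val[locus]) < threshold):
            kept.append(locus)
    return kept


# Hypothetical toy data: 40 individuals per population, two loci.
ref = {"snp1": [0] * 39 + [None], "snp2": [1] * 30 + [None] * 10}
val = {"snp1": [2] * 40, "snp2": [1] * 40}
print(shared_low_missing_loci(ref, val))
```

In this toy case only `snp1` passes, because a quarter of the calls for `snp2` are missing in the reference population.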
Figure PCTCN2022096720-appb-000008
Figure PCTCN2022096720-appb-000009
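After running, 1,213,496 high-quality SNPs remain for subsequent analysis; the imputed genotypes are stored in the file "snps.imputed.vcf.gz".
Example 3: Accuracy of two genomic selection methods in predicting disease-resistance GEBV in leopard coral grouper
To determine the method and a suitable number of SNP markers for estimating GEBV for resistance to Vibrio harveyi disease, the present invention used the phenotype and genotype data of the reference population and compared, by 5-fold cross-validation, how the accuracy of G-BLUP and BayesCπ in predicting this GEBV changes as the number of SNPs decreases. The area under the receiver operating characteristic curve (AUC) was used as the measure of prediction accuracy. The calculation was performed 25 times in total, and the final accuracy is reported as the mean of the 25 runs. The following R packages must be installed before the calculation: data.table, ASReml, BGLR, Rcpp, pROC, and parallel.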
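As a rough illustration of the evaluation scheme, the sketch below computes AUC from binary survival labels and predicted scores, and splits individuals into five folds. It is written in Python for self-containment and is not the patent's R code (which relies on pROC); the function names and toy values are assumptions, and how the 25 runs were organized is not stated in the source.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    `labels` are 0 (died) or 1 (survived); `scores` are predicted values
    such as GEBV. Tied scores count as half a concordant pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Hypothetical toy example: 1 = survived, 0 = died.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(auc(labels, scores))
folds = k_fold_indices(10, k=5, seed=1)
print([len(f) for f in folds])
```

In a cross-validation run, each fold is held out in turn, the model is fitted on the other four, and AUC is computed on the held-out predictions.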
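1. Extracting SNP sets of different sizes (SNP subsets) from the disease-resistant reference population
SNP subsets of different sizes were extracted from the high-quality SNPs obtained in Example 2. The IDs of the individuals in the reference population are stored in the file "indiv.ref.txt". Run the following commands in a Linux environment to extract the SNP subsets:
# Read the VCF file and prepare the related data
## Read the VCF file with PLINK2 and apply quality control again
plink2 --vcf snps.imputed.vcf.gz --autosome-num 24 --maf 0.05 --make-bpgen --out snps.imputed
## Compute allele frequencies with PLINK2
plink2 --bpfile snps.imputed --autosome-num 24 --freq --out snps.imputed
## Run the following code in R to prepare the file needed for exporting genotypes
library(data.table)
freq <- fread("snps.imputed.afreq")
# For each SNP, count the minor allele: ALT when its frequency is at most 0.5, otherwise REF
alleles <- rbind(freq[ALT_FREQS <= 0.5, c("ID", "ALT")], freq[ALT_FREQS > 0.5, c("ID", "REF")], use.names = F)
# row.names = F keeps the file to two columns (variant ID, allele)
write.table(alleles, "export.A.txt", sep = "\t", col.names = F, row.names = F, quote = F)
After running, 1,211,259 high-quality SNPs and 796 individuals remain for subsequent analysis: 498 individuals in the disease-resistant reference population and 298 in the validation population.
# Use R to extract subsets containing different numbers of SNPs from the reference population and store them in binary form; the subset sizes are 500, 700, 1k, 2k, 8k, 10k, 20k, 50k, 80k, and 100k SNPs
## Use PLINK2 to generate a file that encodes genotypes AA/Aa/aa as 0/1/2
plink2 --bpfile snps.imputed --autosome-num 24 --export-allele export.A.txt --keep indiv.ref.txt --export A --out geno.ref
## Output the individual IDs
awk '{print $2}' geno.ref.raw | sed '1d' > iid.ref.txt
## Run the following script in R to extract the SNP subsets and save them in binary form
Figure PCTCN2022096720-appb-000010
After running, all SNP subsets are stored in binary form for fast loading later.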
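2. Evaluating the prediction accuracy of G-BLUP
The following model was fitted to the phenotype data in ASReml-R and used to evaluate the prediction accuracy of G-BLUP.
$y_{ijk} = \Phi(\mathrm{Ori.Location}_i + \mathrm{b.weight}_j + a_k)$
In this model, $y_{ijk}$ is the phenotype of individual k, collected from location i and infected at body weight j; $\mathrm{Ori.Location}_i$ is a fixed effect for the sampling location of the individual; $\mathrm{b.weight}_j$ is a covariate for the body weight of the individual at the time of the infection experiment; $a_k$ is the random additive effect of the individual; and $\Phi$ is the cumulative distribution function of the standard normal distribution. Run the following commands in a Linux environment:
## Run the following script in R to build the inverse of the G matrix in three-column form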
Figure PCTCN2022096720-appb-000011
Figure PCTCN2022096720-appb-000012
Figure PCTCN2022096720-appb-000013
Figure PCTCN2022096720-appb-000014
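The script that builds the G matrix survives only as images in the original, so the exact construction is not recoverable. As a hedged illustration, the sketch below builds a genomic relationship matrix from 0/1/2 genotypes using the widely used VanRaden (method 1) formula; that choice, the function name, and the toy genotypes are assumptions, and the inversion and three-column export steps are omitted.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix G = Z Z' / (2 * sum(p * (1 - p))).

    `genotypes` is a list of individuals, each a list of 0/1/2 allele
    counts; Z holds the counts centred by twice the allele frequency p.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all loci are monomorphic")
    z = [[ind[j] - 2.0 * freqs[j] for j in range(m)] for ind in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
            for i in range(n)]


# Hypothetical toy genotypes: 3 individuals x 4 SNPs.
geno = [[0, 1, 2, 1],
        [1, 1, 0, 2],
        [2, 0, 1, 1]]
g = vanraden_g(geno)
for row in g:
    print([round(v, 3) for v in row])
```

Because the genotypes are centred with frequencies estimated from the same individuals, each row of G sums to zero, which is one reason practical pipelines often add a small value to the diagonal before inverting.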
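## Run the following script in R to evaluate the prediction accuracy of G-BLUP. The disease-resistance phenotype variable is named sur.status; Ori.Location and b.weight are the sampling location and body weight and enter the model as fixed terms.
Figure PCTCN2022096720-appb-000015
Figure PCTCN2022096720-appb-000016
After running, the last line of the file "Ref.cv.results.gblup.csv" gives the accuracy of G-BLUP in predicting disease-resistance GEBV as the number of SNPs changes.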
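The model fitted for G-BLUP passes a linear predictor (sampling location, body weight, and the individual's additive effect) through the standard normal cumulative distribution function. The sketch below shows only that link step, in Python; the effect values and the explicit weight coefficient are hypothetical, since the source does not report fitted estimates.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def survival_probability(location_effect, weight_coef, weight, additive_effect):
    """Probit link: Phi(location effect + coefficient * body weight
    + additive genetic effect)."""
    eta = location_effect + weight_coef * weight + additive_effect
    return std_normal_cdf(eta)


# Hypothetical effect values, for illustration only.
p = survival_probability(location_effect=-0.3, weight_coef=0.01,
                         weight=20.0, additive_effect=0.4)
print(round(p, 3))
```

A larger additive effect moves the predictor to the right and raises the predicted survival probability, which is why individuals can be ranked on the additive term alone.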
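3. Evaluating the prediction accuracy of BayesCπ
The prediction accuracy of BayesCπ was evaluated in R with the same model as G-BLUP. Because Bayesian methods are time-consuming, a separate command is issued for each SNP subset; SNP subset "s1" is used below as the example of how to estimate the prediction accuracy of BayesCπ.
Run the following commands in a Linux environment:
Figure PCTCN2022096720-appb-000017
Figure PCTCN2022096720-appb-000018
After running, the last line of the file "Ref.results.cv.BC.s1.csv" gives the accuracy of BayesCπ in predicting disease resistance with SNP subset "s1". Running the script above 10 times (i taking values 1 to 10) gives the accuracy of BayesCπ in predicting GEBV for resistance to Vibrio harveyi disease as the number of SNPs changes (Figure 2).
4. Comparing the prediction accuracy of the two genomic selection methods at different SNP numbers
Figure 2 shows how the accuracy of G-BLUP and BayesCπ in predicting GEBV for resistance to Vibrio harveyi changes with the number of SNPs. As the number of SNPs decreases, the prediction accuracy of both methods declines. At 50k SNPs, both methods reach their highest accuracy, 0.670 for G-BLUP and 0.681 for BayesCπ. When the number of SNP markers drops from 20k to 3k, accuracy declines within an acceptable range; with fewer than 2k markers, the accuracy of the two methods falls quickly to 0.606 and 0.611. Overall, the prediction accuracy of BayesCπ is slightly better than that of G-BLUP, so BayesCπ can be used to estimate disease-resistance GEBV. Because genotyping cost rises with the number of markers used to estimate GEBV, the present invention uses 20k SNP loci, which maintains the prediction accuracy of genomic selection while moderately reducing genotyping cost and computation time.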
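Example 4: Selecting the SNP markers used to predict the GEBV of the candidate population
Based on the results of Example 3, the present invention selects 20,000 (20k) SNP loci for estimating the disease-resistance GEBV of the candidate population: 30 loci associated with resistance to Vibrio harveyi and 19,970 loci evenly distributed across the leopard coral grouper genome. The disease-resistance-associated loci are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 15 and were obtained by genome-wide association analysis. The other SNP loci are evenly distributed across the genome and are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 16 to SEQ ID NO: 10000.
Run the following commands in a Linux environment:
# Prepare the genotype data with PLINK2
plink2 --bpfile snps.imputed --autosome-num 24 --keep indiv.ref.txt --make-bpgen --out geno4gwas
# Build the GRM with GCTA
gcta --bpfile geno4gwas --autosome-num 24 --make-grm --out grm.gcta
# Build the sparse GRM with GCTA
gcta --grm grm.gcta --make-bK-sparse 0.05 --out sparse.grm.gcta
# Arrange the disease-resistance phenotype and the covariates in the text files "pheno4gcta.phe" and "qcov4gcta.qcov"; the covariates are body weight and the first 10 principal components from PCA
# Run the genome-wide association analysis with GCTA
gcta --bpfile geno4gwas --grm-sparse sparse.grm.gcta --pheno pheno4gcta.phe --qcovar qcov4gcta.qcov --fastGWA-mlm --autosome-num 24 --out res.gwas
# Read the results in R and output the information for the 30 SNPs associated with the disease-resistance trait
library(data.table)
res <- fread("res.gwas.fastGWA")
setorder(res, P)
sel.loci <- res[1:30, ]
setorder(sel.loci, CHR, POS)
write.table(sel.loci, "selected.loci.info.txt", sep = "\t", quote = F)
After running, the IDs, chromosome numbers, physical positions, and other information for the 30 disease-resistance-associated SNP loci are stored in the file "selected.loci.info.txt". Next, the 20k SNPs used to calculate GEBV are selected; these 20k SNPs include the 30 SNPs above.
Run the following commands in a Linux environment:
# Select 20k SNPs that evenly cover the genome
## Use PLINK2 to extract 19,970 SNPs that evenly cover the genome; increasing the --bp-space value yields fewer SNPs, and decreasing it yields more
plink2 --bpfile snps.imputed --extract comm.loci.txt --bp-space 37267 --make-bpgen --out snps.20k.part1
## Use R to prepare the file used to extract the final 20k SNPs
library(data.table)
bim.p1 <- fread("snps.20k.part1.bim")
loci.info <- fread("trait.realted.loci.txt")
loci.20k <- c(bim.p1$V2, loci.info$SNP)
write.table(loci.20k, "snps.20k.txt", sep = "\t", col.names = F, row.names = F, quote = F)
After running, the file "snps.20k.txt" contains the IDs of the 20k SNPs, which PLINK2 can read to extract the corresponding loci. The file contains 20,000 SNP loci in total: 30 loci associated with disease resistance and 19,970 loci that evenly cover the leopard coral grouper reference genome. Figure 3 shows the distribution of these 20k SNPs across the genome in 1 Mb windows; the loci cover the genome essentially evenly.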
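To make the effect of a minimum-spacing filter concrete, the sketch below thins variants greedily by position and then appends the trait-associated SNPs. It is an approximation written in Python for illustration: PLINK2's own `--bp-space` rule may differ in detail, the function names and toy variants are hypothetical, and the duplicate check in the merge step is an added safeguard that the R code above does not perform.

```python
def thin_by_spacing(variants, min_space):
    """Greedy thinning: walk each chromosome in position order and keep a
    variant only if it lies at least `min_space` bp after the last kept one.

    `variants` is an iterable of (chromosome, position, snp_id) tuples.
    """
    kept = []
    last_pos = {}
    for chrom, pos, snp_id in sorted(variants):
        if chrom not in last_pos or pos - last_pos[chrom] >= min_space:
            kept.append(snp_id)
            last_pos[chrom] = pos
    return kept


def merge_with_trait_loci(uniform_ids, trait_ids):
    """Append trait-associated SNP IDs not already in the thinned panel."""
    seen = set(uniform_ids)
    return list(uniform_ids) + [s for s in trait_ids if s not in seen]


# Hypothetical variants on two chromosomes.
variants = [(1, 100, "a"), (1, 20_000, "b"), (1, 40_000, "c"),
            (1, 80_000, "d"), (2, 500, "e"), (2, 10_000, "f")]
panel = thin_by_spacing(variants, min_space=37_267)
print(merge_with_trait_loci(panel, ["b", "d"]))
```

Raising `min_space` leaves fewer variants in the panel and lowering it leaves more, matching the behaviour the comment on `--bp-space` describes.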
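Example 5: Validating the predictive performance of BayesCπ and the preferred 20k SNP markers
1. Evaluating predictive performance by cross-validation
The 20k SNPs obtained in Example 4 were validated with the same model as in Example 3. Compared with the results of Example 3, BayesCπ with 20k SNP loci evenly distributed across the genome (without the 30 disease-resistance-associated loci selected in Example 4) had a prediction accuracy of 0.670 (Figure 2), whereas with the 20k loci that include these 30 loci its accuracy was 0.682, an improvement of 0.012 and close to the accuracy (0.681) of BayesCπ with 50k SNPs in Example 3. The 20k SNPs provided by the present invention therefore support genomic selection and give good predictive performance.
2. Evaluating predictive performance using the disease-resistance GEBV of the validation population
Individuals that have a disease-resistance phenotype but were not included in the reference population were used as validation individuals. Their disease-resistance GEBV was calculated with BayesCπ and the 20k SNPs obtained in Example 4, and the mean GEBV of surviving and dead individuals was compared to verify the feasibility of the method. Run the following commands in a Linux environment:
Figure PCTCN2022096720-appb-000019
Figure PCTCN2022096720-appb-000020
After running, the disease-resistance GEBV of the 298 validation individuals was predicted; the mean GEBV of surviving and dead individuals is shown in Figure 4. The 110 surviving individuals had a mean GEBV of 0.413, above the overall mean; the 188 dead individuals had a mean GEBV of 0.255, below the overall mean. GEBV is therefore positively correlated with disease resistance: individuals with high GEBV are more resistant, so GEBV can be used to select the most resistant individuals for breeding.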
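The overall mean GEBV is not reported, but it follows from the two group sizes and group means given above. The short Python sketch below derives it as a size-weighted mean (the function name is an assumption) and confirms that it lies between the two group means, at roughly 0.313.

```python
def pooled_mean(groups):
    """Mean over all individuals from (group size, group mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Group sizes and mean GEBVs reported for the validation population.
survivors = (110, 0.413)
dead = (188, 0.255)
overall = pooled_mean([survivors, dead])
print(round(overall, 3))
print(survivors[1] > overall > dead[1])
```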
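Example 6: Calculating the disease-resistance GEBV of the candidate population
One hundred juvenile leopard coral grouper were selected as the candidate population and genotyped by whole-genome resequencing; the genotypes are stored in the file "candidates.geno.vcf.gz". After sequencing, the same 20k SNPs as in Example 4 were extracted and, together with the reference population from Example 3, used to calculate the disease-resistance GEBV of the candidate population with BayesCπ. To show further that individuals with high disease-resistance GEBV are more resistant, the present invention measured the disease resistance of the candidate population in an infection experiment; surviving individuals were classed as resistant and dead individuals as susceptible. The 100 candidate fish comprised 50 resistant and 50 susceptible individuals. Run the following commands in a Linux environment:
Figure PCTCN2022096720-appb-000021
Figure PCTCN2022096720-appb-000022
Figure PCTCN2022096720-appb-000023
After running, the disease-resistance GEBV of the 100 candidate fish was calculated. The mean GEBV of the 50 resistant individuals was 0.643, higher than that of the susceptible individuals (0.525) (Figure 5). In addition, among the individuals ranked in the top 20% by GEBV, 15 were resistant and only 5 were susceptible. This further shows that individuals with high GEBV are more resistant and can be used as parents to raise the disease resistance of their offspring. It also shows that the BayesCπ method chosen in Example 3 and the 20k SNPs selected in Example 4 give good predictive performance and can be used to screen for individuals with strong disease resistance.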
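The top-20% comparison above (15 resistant fish among the 20 highest-ranked, or 75%) amounts to ranking candidates by GEBV and counting phenotypes within the selected group. The Python sketch below shows that step on hypothetical toy data; the individual IDs, GEBV values, and function names are illustrative only.

```python
def top_fraction(gebv_by_id, fraction=0.2):
    """IDs of the individuals with the highest GEBV, best first."""
    n_keep = max(1, round(len(gebv_by_id) * fraction))
    ranked = sorted(gebv_by_id, key=gebv_by_id.get, reverse=True)
    return ranked[:n_keep]


def count_resistant(selected_ids, phenotype_by_id):
    """Number of selected individuals whose phenotype is 1 (survived)."""
    return sum(phenotype_by_id[i] for i in selected_ids)


# Hypothetical toy data: 10 candidates, 1 = resistant, 0 = susceptible.
gebv = {"f01": 0.81, "f02": 0.35, "f03": 0.66, "f04": 0.72, "f05": 0.41,
        "f06": 0.58, "f07": 0.29, "f08": 0.77, "f09": 0.52, "f10": 0.47}
pheno = {"f01": 1, "f02": 0, "f03": 1, "f04": 0, "f05": 0,
         "f06": 1, "f07": 0, "f08": 1, "f09": 1, "f10": 0}
selected = top_fraction(gebv, fraction=0.2)
print(selected, count_resistant(selected, pheno))
```

In a breeding program the infection phenotype of the candidates would normally be unknown, so only the ranking step would be used to choose broodstock.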

Claims (7)

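  1. A method for breeding improved varieties of leopard coral grouper based on genomic selection, characterized in that the method comprises the following steps:
    1) constructing a disease-resistant reference population from leopard coral grouper juveniles of different geographic origins;
    using individuals that have a disease-resistance phenotype but are not included in the disease-resistant reference population as a validation population for verifying the selection performance of the method;
    2) performing whole-genome resequencing on the reference and validation populations to capture high-quality genome-wide single nucleotide polymorphism (SNP) loci;
    3) comparing how the accuracy of genomic best linear unbiased prediction (G-BLUP) and BayesCπ in predicting the genomic estimated breeding value (GEBV) of the disease-resistance trait changes with the number of SNPs, and choosing the number of SNPs and the method to be used for calculating the GEBV of the candidate population;
    4) based on the results of step 3), selecting the SNP markers for calculating the GEBV of the candidate population, the SNP markers including SNP loci associated with disease resistance in leopard coral grouper;
    5) using the genomic selection method chosen in step 3) and the SNP markers chosen in step 4) to calculate the GEBV of the candidate population, wherein individuals with high GEBV have strong disease resistance and can be used as parents to breed offspring.
  2. The genomic selection method according to claim 1, characterized in that the leopard coral grouper juveniles used in step 1) to construct the disease-resistant reference population are obtained after screening for disease resistance.
  3. The genomic selection method according to claim 1, characterized in that the method uses the validation population to verify the predictive performance of the genomic selection method chosen in step 3) and the SNP markers chosen in step 4).
  4. The genomic selection method according to claim 1, characterized in that the SNP markers are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 10000.
  5. The genomic selection method according to claim 1, characterized in that the SNP loci associated with disease resistance in leopard coral grouper are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 15.
  6. A set of SNP markers, characterized in that the SNP loci in the set are located at positions 36 and 107 of any one of the sequences SEQ ID NO: 1 to SEQ ID NO: 10000.
  7. Use of the set of SNP markers according to claim 6 in screening for disease-resistant improved varieties of leopard coral grouper.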
PCT/CN2022/096720 2021-12-06 2022-06-02 Genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper WO2023103303A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202280002105.3A CN116917504A (zh) 2021-12-06 2022-06-02 Genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111478938.0 2021-12-06
CN202111478938.0A CN114015789A (zh) 2021-12-06 2021-12-06 A genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper

Publications (1)

Publication Number Publication Date
WO2023103303A1 (zh) 2023-06-15

Family

ID=80067796

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096720 WO2023103303A1 (zh) 2021-12-06 2022-06-02 Genomic selection method for breeding disease-resistant improved varieties of leopard coral grouper

Country Status (2)

Country Link
CN (2) CN114015789A (zh)
WO (1) WO2023103303A1 (zh)

Also Published As

Publication number Publication date
CN116917504A (zh) 2023-10-20
CN114015789A (zh) 2022-02-08

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202280002105.3

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22902741

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023570117

Country of ref document: JP