WO2020052101A1 - Virtual PCR method for sequence extension based on NGS read search - Google Patents

Virtual PCR method for sequence extension based on NGS read search

Info

Publication number
WO2020052101A1
Authority
WO
WIPO (PCT)
Prior art keywords
contig
sequencing
sequence
seed
primer
Prior art date
Application number
PCT/CN2018/118584
Other languages
English (en)
French (fr)
Inventor
段乃彬
王效睦
丁汉凤
马玉敏
宫永超
谢坤
白静
杨永义
Original Assignee
山东省农作物种质资源中心
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 山东省农作物种质资源中心
Publication of WO2020052101A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • The invention relates to the field of bioinformatics or molecular biology, in particular to a virtual PCR method that achieves sequence extension by searching NGS reads.
  • Polymerase chain reaction (PCR)
  • Sequencing reads are obtained by NGS, also known as second-generation sequencing.
  • The main high-throughput sequencing platforms include Roche's 454 sequencer (Roche GS FLX sequencer), Illumina's Solexa genome analyzer (Illumina Genome Analyzer), and ABI's SOLiD sequencer (ABI SOLiD sequencer).
  • Each of these NGS platforms can sequence millions of small DNA fragments in parallel, producing massive sequencing data, that is, sequencing reads. These reads can be assembled into a genome, or each read can be mapped to a reference genome for genome comparison.
  • NGS is used not only to sequence entire genomes but also to sequence specific regions of interest, including all coding genes (the whole exome) or a small number of individual genes.
  • The target gene is usually amplified by wet-lab PCR; gel electrophoresis is used to check whether amplification succeeded; first-generation Sanger sequencing is then used to obtain the sequence in detail. One complete PCR amplification and sequencing cycle takes several days or even a week, so experimental efficiency is low.
  • To improve efficiency, experimenters usually accumulate multiple sequences for batch amplification and batch sequencing; even so, with an actual PCR workflow the preparation of experimental materials is complicated and reagent costs are high. Wet-lab PCR therefore cannot meet the need for fast and efficient molecular biology.
  • The present invention provides a virtual PCR method that achieves sequence extension by searching NGS reads.
  • The invention first performs one round of high-depth, high-throughput sequencing of the whole genome of the test sample using NGS technology, thereby obtaining massive sequencing reads covering the entire genome.
  • Combined with bioinformatics methods, a method was developed that realizes virtual PCR by searching sequencing reads to build a contig of sequence fragments. The method can virtually amplify any gene fragment of interest; the amplified fragments are long and the amplification cycle is short, which significantly improves experimental efficiency.
  • In the virtual PCR method for sequence extension based on NGS read search, contigs of sequence fragments are built by searching the sequencing reads, so that PCR amplification of the target gene is simulated in a program without operating an actual PCR instrument.
  • The above method includes the following steps:
  • The coverage of the NGS sequencing is not less than 50X;
  • Preprocessing of sequencing read data: remove sequencing duplicates from the sequencing read data obtained in step 1), and remove sequencing adapters, barcodes, and low-quality data;
  • The virtual PCR process described in step 3) above includes the following steps:
  • a) Primer design: use the flanks at the two ends of the target fragment S1 to design a pair of virtual primers, 30-40 bp long, at the 3' and 5' ends respectively. The two primers are called seed_L and seed_R;
  • b) 3' end extension: at the start of the program, set the expected extension length and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_L primer sequence and sort the hit reads by the position of seed_L within each read, from left to right. Select the first-ranked read, cut out the sequence to the right of seed_L on that read and use it for extension, and take the rightmost 30-40 bp of the cut sequence as the new seed_L. Continue the search-extend cycle until the sequence has been extended rightward to the set length, forming Contig_L1. Likewise, use seed_L_reverse, the reverse complement of seed_L, as the primer and extend leftward to the set length (extending seed_L_reverse leftward is equivalent to extending seed_L rightward), forming Contig_L_reverse. Reverse-complement Contig_L_reverse to form Contig_L2, then take the union of Contig_L2 and Contig_L1 to obtain the 3' extension product Contig_L;
  • c) 5' end extension: at the start of the program, set the expected extension length and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_R primer sequence and sort the hit reads by the position of seed_R within each read, from right to left. Select the first-ranked read, cut out the sequence to the left of seed_R on that read and use it for extension, and take the leftmost 30-40 bp of the cut sequence as the new seed_R. Continue the search-extend cycle until the sequence has been extended leftward to the set length, forming Contig_R1. Likewise, use seed_R_reverse, the reverse complement of seed_R, as the primer and extend rightward to the set length (extending seed_R_reverse rightward is equivalent to extending seed_R leftward), forming Contig_R_reverse. Reverse-complement Contig_R_reverse to form Contig_R2, then take the union of Contig_R1 and Contig_R2 to obtain the 5' extension product Contig_R;
  • If the virtual PCR process of step 3) is blocked, the program raises an alarm and prompts the user to replace the initial primer seed; the process then loops back to the first step (step a)) and the virtual PCR process is restarted.
  • The blocking in step 3) means, for example, that the primer sequence cut from the first-ranked hit read does not meet the set requirements.
  • Replacing the primer in step 3) means redesigning the initial primer seed.
  • The present invention provides a virtual PCR method for sequence extension based on NGS read search. The method has the following advantages over the prior art: (1) the amplified fragment is longer: the largest fragment extended in practice is 20 kb, far beyond the limits imposed by the enzymes and reaction systems of wet-lab PCR; (2) the amplification cycle is shorter: one round of virtual PCR on a minicomputer usually takes only two hours to complete a 5 kb extension; (3) no actual PCR experiment is needed: the sequence information of the target gene is obtained in the program, so experimental cost is low and efficiency is high; (4) dozens or even hundreds of sequences can be amplified at once (limited by the configuration of the compute server): a minicomputer with a 40-core CPU and 128 GB of memory can complete the extension of 50 sequences in half a day.
  • Figure 1: Flow chart of the virtual PCR process (the reverse-complement case in search-extension is not illustrated in this figure).
  • The sequencing coverage in step 2) is calculated for a radish genome of 500 Mb.
  • a) Primer design: use the flanks at the two ends of the target fragment S1 to design a pair of virtual primers, 35 bp long, at the 3' and 5' ends respectively. The two primers are called seed_L and seed_R;
  • b) 3' end extension: at the start of the program, set the expected extension length to 5 kb and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_L primer sequence and sort the hit reads by the position of seed_L within each read, from left to right; select the first-ranked read, cut out the sequence to the right of seed_L on that read and use it for extension, and take the rightmost 35 bp of the cut sequence as the new seed_L. Continue the search-extend cycle until the sequence has been extended rightward to the set length, forming Contig_L1. Likewise, use seed_L_reverse, the reverse complement of seed_L, as the primer and extend leftward to the set length (extending seed_L_reverse leftward is equivalent to extending seed_L rightward), forming Contig_L_reverse. Reverse-complement Contig_L_reverse to form Contig_L2, then take the union of Contig_L2 and Contig_L1 to obtain the 3' extension product Contig_L;
  • c) 5' end extension: at the start of the program, set the expected extension length to 5 kb and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_R primer sequence and sort the hit reads by the position of seed_R within each read, from right to left; select the first-ranked read, cut out the sequence to the left of seed_R on that read and use it for extension, and take the leftmost 35 bp of the cut sequence as the new seed_R. Continue the search-extend cycle until the sequence has been extended leftward to the set length, forming Contig_R1. Likewise, use seed_R_reverse, the reverse complement of seed_R, as the primer and extend rightward to the set length (extending seed_R_reverse rightward is equivalent to extending seed_R leftward), forming Contig_R_reverse. Reverse-complement Contig_R_reverse to form Contig_R2, then take the union of Contig_R1 and Contig_R2 to obtain the 5' extension product Contig_R;
  • e) In steps b) and c), if the extension has not reached the set length of 5 kb, take the rightmost 35 bp for the 3' primer (or the leftmost 35 bp for the 5' primer) of the first-ranked hit read as the new primer and continue the primer search-extension cycle;
  • f) If in steps b) and c) a 35 bp sequence cannot be cut from the first-ranked hit read to serve as a primer, return to step a), redesign the initial primer, and then proceed to steps b) and c) to start the virtual PCR process again.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

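The invention discloses a virtual PCR method that achieves sequence extension by searching NGS reads. The method first performs one round of high-depth, high-throughput sequencing of the whole genome of the test sample using NGS technology, obtaining massive sequencing reads that cover the entire genome; then, combined with bioinformatics methods, it searches the sequencing reads and extends the sequence in a program according to the search results, thereby building a contig of sequence fragments and amplifying the target gene in a virtual program.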

Description

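Virtual PCR method for sequence extension based on NGS read search
Technical Field
The invention relates to the field of bioinformatics or molecular biology, and in particular to a virtual PCR method that achieves sequence extension by searching NGS reads.
Background Art
Polymerase chain reaction (PCR) is a molecular biology technique for amplifying specific DNA fragments; it can be regarded as a special form of DNA replication outside the organism. The basic principle of PCR resembles the natural semi-conservative replication of DNA, and its specificity depends on oligonucleotide primers complementary to the two ends of the target sequence. PCR consists of three basic reaction steps: denaturation, annealing, and extension. Repeating this three-step cycle amplifies the target gene several million-fold.
Sequencing reads are obtained by NGS, also known as second-generation sequencing. The main high-throughput sequencing platforms at present are Roche's 454 sequencer (Roche GS FLX sequencer), Illumina's Solexa genome analyzer (Illumina Genome Analyzer), and ABI's SOLiD sequencer (ABI SOLiD sequencer). Each of these NGS platforms can sequence millions of small DNA fragments in parallel, producing massive sequencing data, that is, sequencing reads. These reads can be assembled into a genome, or each read can be mapped to a reference genome for genome comparison. NGS is used not only to sequence entire genomes but also to sequence specific regions of interest, including all coding genes (the whole exome) or a small number of individual genes.
The target gene is usually amplified by wet-lab PCR; gel electrophoresis is used to check whether amplification succeeded; first-generation Sanger sequencing is then used to obtain the sequence in detail. One complete PCR amplification and sequencing cycle takes several days or even a week, so experimental efficiency is low. To improve efficiency, experimenters usually accumulate multiple sequences for batch amplification and batch sequencing; even so, with an actual PCR workflow the preparation of experimental materials is complicated and reagent costs are high. Wet-lab PCR therefore cannot meet the need for fast and efficient molecular biology.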
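Summary of the Invention
To overcome the above shortcomings of the prior art, the invention provides a virtual PCR method that achieves sequence extension by searching NGS reads. The invention first performs one round of high-depth, high-throughput sequencing of the whole genome of the test sample using NGS technology, thereby obtaining massive sequencing reads covering the entire genome. Combined with bioinformatics methods, a method was then developed that realizes virtual PCR by searching sequencing reads to build a contig of sequence fragments. The method can virtually amplify any gene fragment of interest; the amplified fragments are long and the amplification cycle is short, which significantly improves experimental efficiency.
In the virtual PCR method for sequence extension based on NGS read search, contigs of sequence fragments are built by searching the sequencing reads, so that PCR amplification of the target gene is simulated in a program without operating an actual PCR instrument.
The above method includes the following steps:
1) Whole-genome NGS sequencing of the experimental material: select the required experimental material and extract long-fragment DNA with a whole-genome DNA extraction kit; fragment the long-fragment DNA sample and load it onto the gene chip; perform sequencing by synthesis on a sequencer to obtain paired sequencing read data of short fragments, 90-150 bp, for the sample;
wherein the coverage of the NGS sequencing is not less than 50X;
2) Preprocessing of sequencing read data: remove sequencing duplicates from the sequencing read data obtained in step 1), and remove sequencing adapters, barcodes, and low-quality data;
3) Virtual PCR process: without operating an actual PCR instrument, simulate in a program the PCR amplification of the target gene (as shown in Figure 1), thereby obtaining the amplified sequence of the target gene.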
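The 50X coverage floor in step 1) fixes a minimum number of reads once the genome size and read length are known. The following is a minimal sketch of that arithmetic; the function name `min_read_count` is hypothetical, and the example values (a 500 Mb genome and 125 bp reads) are taken from the radish example later in the description.

```python
import math


def min_read_count(coverage: float, genome_size_bp: int, read_length_bp: int) -> int:
    """Smallest number of reads whose total length reaches the requested coverage.

    coverage = (number of reads * read length) / genome size, solved for the
    number of reads and rounded up to a whole read.
    """
    return math.ceil(coverage * genome_size_bp / read_length_bp)


# 50X coverage of a 500 Mb genome with 125 bp reads
print(min_read_count(50, 500_000_000, 125))  # 200000000
```

With paired-end data the same total applies to individual reads, so the number of read pairs needed would be half this figure.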
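The virtual PCR process described in step 3) above includes the following steps:
a) Primer design: use the flanks at the two ends of the target fragment S1 to design a pair of virtual primers, 30-40 bp long, at the 3' and 5' ends respectively. The two primers are called seed_L and seed_R;
b) 3' end extension: at the start of the program, set the expected extension length and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_L primer sequence and sort the hit reads by the position of seed_L within each read, from left to right. Select the first-ranked read, cut out the sequence to the right of seed_L on that read and use it for extension, and take the rightmost 30-40 bp of the cut sequence as the new seed_L. Continue the search-extend cycle until the sequence has been extended rightward to the set length, forming Contig_L1. Likewise, use seed_L_reverse, the reverse complement of seed_L, as the primer and extend leftward to the set length (extending seed_L_reverse leftward is equivalent to extending seed_L rightward), forming Contig_L_reverse. Reverse-complement Contig_L_reverse to form Contig_L2, then take the union of Contig_L2 and Contig_L1 to obtain the 3' extension product Contig_L;
c) 5' end extension: at the start of the program, set the expected extension length and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_R primer sequence and sort the hit reads by the position of seed_R within each read, from right to left. Select the first-ranked read, cut out the sequence to the left of seed_R on that read and use it for extension, and take the leftmost 30-40 bp of the cut sequence as the new seed_R. Continue the search-extend cycle until the sequence has been extended leftward to the set length, forming Contig_R1. Likewise, use seed_R_reverse, the reverse complement of seed_R, as the primer and extend rightward to the set length (extending seed_R_reverse rightward is equivalent to extending seed_R leftward), forming Contig_R_reverse. Reverse-complement Contig_R_reverse to form Contig_R2, then take the union of Contig_R1 and Contig_R2 to obtain the 5' extension product Contig_R;
d) Obtaining the amplified sequence: merge the two extended sequences Contig_L and Contig_R according to their overlap, taking the union to obtain a contig of the set length and thereby the complete amplified sequence S2. This completes the virtual PCR.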
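Steps b) to d) rely on two string operations: reverse-complementing a contig and merging two contigs by their overlap. The text does not say how the union is computed, so the sketch below assumes the simplest case, an exact match between the end of the left contig and the start of the right one. The names `reverse_complement` and `merge_by_overlap` and the `min_overlap` parameter are hypothetical.

```python
# Translation table for the four DNA bases, in upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_by_overlap(left: str, right: str, min_overlap: int = 10) -> str:
    """Merge two contigs where a suffix of `left` equals a prefix of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    Raises ValueError when no such overlap exists.
    """
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("contigs do not overlap by at least min_overlap bases")


print(reverse_complement("AACG"))                       # CGTT
print(merge_by_overlap("ACGTTGCA", "TGCAGGTT", 4))      # ACGTTGCAGGTT
```

Real reads contain sequencing errors, so a practical merge would probably tolerate mismatches in the overlap; exact matching is used here only to keep the sketch short.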
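If the virtual PCR process of step 3) is blocked, the program raises an alarm and prompts the user to replace the initial primer seed; the process then loops back to the first step (step a)) and the virtual PCR process is restarted.
The blocking in step 3) means, for example, that the primer sequence cut from the first-ranked hit read does not meet the set requirements.
Replacing the primer in step 3) means redesigning the initial primer seed.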
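The rightward search-extend cycle of step b), together with the blocking condition just described, can be sketched as follows. This is an illustration under stated assumptions, not the inventors' script: reads are held in memory as plain strings, the seed is matched exactly and on the forward strand only, and "blocked" is taken to mean that no read contains the seed or that the best read cannot supply a full-length new seed. The names `extend_right` and `ExtensionBlocked` are hypothetical.

```python
from typing import List


class ExtensionBlocked(LookupError):
    """Raised when the search-extend cycle cannot continue from the current seed."""


def extend_right(seed: str, reads: List[str], target_len: int) -> str:
    """Greedily extend `seed` to the right until the contig reaches `target_len`."""
    seed_len = len(seed)
    contig = seed
    while len(contig) < target_len:
        # Search: collect (position of seed in read, read) for every hit.
        hits = [(read.find(seed), read) for read in reads if seed in read]
        if not hits:
            raise ExtensionBlocked("no read contains the current seed")
        # Sort hits by seed position, left to right; the first one carries
        # the longest sequence to the right of the seed.
        hits.sort(key=lambda hit: hit[0])
        pos, best = hits[0]
        tail = best[pos + seed_len:]
        if len(tail) < seed_len:
            raise ExtensionBlocked("best read is too short to yield a new seed")
        # Extend, then take the rightmost seed_len bases of the cut sequence
        # as the new seed.
        contig += tail
        seed = tail[-seed_len:]
    return contig[:target_len]
```

Leftward extension from seed_R would mirror this loop; alternatively, reverse-complementing both the seed and the result turns a leftward extension into a rightward one, which is the equivalence the text notes for seed_L_reverse.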
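Beneficial effects of the invention:
The invention provides a virtual PCR method for sequence extension based on NGS read search. The method has the following advantages over the prior art: (1) the amplified fragment is longer: the largest fragment extended in practice is 20 kb, far beyond the limits imposed by the enzymes and reaction systems of wet-lab PCR; (2) the amplification cycle is shorter: one round of virtual PCR on a minicomputer usually takes only two hours to complete a 5 kb extension; (3) no actual PCR experiment is needed: the sequence information of the target gene is obtained in the program, so experimental cost is low and efficiency is high; (4) dozens or even hundreds of sequences can be amplified at once (limited by the configuration of the compute server): a minicomputer with a 40-core CPU and 128 GB of memory can complete the extension of 50 sequences in half a day.
Brief Description of the Drawings
Figure 1: Flow chart of the virtual PCR process (the reverse-complement case in search-extension is not illustrated in this figure).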
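Detailed Description
Example
1) Whole-genome sequencing of radish: take 10 g of young, tender spring-shoot leaves from radish plants, wash them, and freeze them directly in liquid nitrogen; extract long-fragment DNA with a plant whole-genome DNA extraction kit; fragment the long-fragment DNA sample, build a library with the paired-end PE125 strategy, and load it onto the sequencing chip (flow cell); then perform sequencing by synthesis on an Illumina HiSeq 2500 sequencer (Illumina, San Diego, CA) following the standard protocol, yielding paired 125 bp short-fragment sequencing read data for each sample;
2) Preprocessing of sequencing read data: the raw data files from the sequencer are in FASTQ format. The raw data are first filtered with a custom-designed Perl script to remove PCR duplicates; Trimmomatic 3.0 is then used to remove sequencing adapters, barcodes, and low-quality data, after which FastQC is used to check data quality;
The sequencing coverage in step 2) is calculated for a radish genome of 500 Mb: with a read length of 125 bp, the total number of reads should be no fewer than $S = (50 \times 500 \times 10^{6}) / 125 = 2.0 \times 10^{8}$ reads;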
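3) Virtual PCR process: using the in-house script from_seed_to_contigs, with the following steps:
a) Primer design: use the flanks at the two ends of the target fragment S1 to design a pair of virtual primers, 35 bp long, at the 3' and 5' ends respectively. The two primers are called seed_L and seed_R;
b) 3' end extension: at the start of the program, set the expected extension length to 5 kb and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_L primer sequence and sort the hit reads by the position of seed_L within each read, from left to right; select the first-ranked read, cut out the sequence to the right of seed_L on that read and use it for extension, and take the rightmost 35 bp of the cut sequence as the new seed_L. Continue the search-extend cycle until the sequence has been extended rightward to the set length, forming Contig_L1. Likewise, use seed_L_reverse, the reverse complement of seed_L, as the primer and extend leftward to the set length (extending seed_L_reverse leftward is equivalent to extending seed_L rightward), forming Contig_L_reverse. Reverse-complement Contig_L_reverse to form Contig_L2, then take the union of Contig_L2 and Contig_L1 to obtain the 3' extension product Contig_L;
c) 5' end extension: at the start of the program, set the expected extension length to 5 kb and input the initial primer and all sequencing read data. Search the sequencing read data for the seed_R primer sequence and sort the hit reads by the position of seed_R within each read, from right to left; select the first-ranked read, cut out the sequence to the left of seed_R on that read and use it for extension, and take the leftmost 35 bp of the cut sequence as the new seed_R. Continue the search-extend cycle until the sequence has been extended leftward to the set length, forming Contig_R1. Likewise, use seed_R_reverse, the reverse complement of seed_R, as the primer and extend rightward to the set length (extending seed_R_reverse rightward is equivalent to extending seed_R leftward), forming Contig_R_reverse. Reverse-complement Contig_R_reverse to form Contig_R2, then take the union of Contig_R1 and Contig_R2 to obtain the 5' extension product Contig_R;
d) Obtaining the amplified sequence S2 of radish gene S1: merge the two extended sequences Contig_L and Contig_R according to their overlap, taking the union to obtain a 5 kb contig and thereby the complete amplified sequence S2. This completes the virtual PCR;
e) In steps b) and c), if the extension has not reached the set length of 5 kb, take the rightmost 35 bp for the 3' primer (or the leftmost 35 bp for the 5' primer) of the first-ranked hit read as the new primer and continue the primer search-extension cycle;
f) If in steps b) and c) a 35 bp sequence cannot be cut from the first-ranked hit read to serve as a primer, return to step a), redesign the initial primer, and then proceed to steps b) and c) to start the virtual PCR process again.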
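Step f) sends the process back to primer design when extension is blocked, but the text does not say how the replacement primer is chosen. One possible reading, shown purely as an assumption, is to slide the seed window along the flank of S1 and retry until a seed extends successfully. The names `candidate_seeds` and `extend_with_reseeding` are hypothetical, and `extend` stands for any extension routine that raises `LookupError` when it is blocked.

```python
from typing import Callable, Iterator, Optional, Tuple


def candidate_seeds(flank: str, seed_len: int = 35, step: int = 5) -> Iterator[str]:
    """Yield seed_len-bp windows of the flank, shifting by `step` bases each time."""
    for start in range(0, len(flank) - seed_len + 1, step):
        yield flank[start:start + seed_len]


def extend_with_reseeding(
    flank: str,
    extend: Callable[[str], str],
    seed_len: int = 35,
    step: int = 5,
) -> Optional[Tuple[str, str]]:
    """Try each candidate seed in turn; return (seed, contig) for the first success.

    Returns None when every candidate is blocked, which corresponds to the
    alarm that asks the user to redesign the primer by hand.
    """
    for seed in candidate_seeds(flank, seed_len, step):
        try:
            return seed, extend(seed)
        except LookupError:  # stand-in for the "blocked" condition
            continue
    return None
```

Passing the extension routine in as a callable keeps the retry logic independent of how reads are stored or searched.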

Claims (7)

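  1. A virtual PCR method for sequence extension based on NGS read search, characterized in that contigs of sequence fragments are built by searching sequencing reads, so that PCR amplification of a target gene is simulated in a program without operating an actual PCR instrument.
  2. The virtual PCR method according to claim 1, characterized by comprising the following steps:
    1) Whole-genome NGS sequencing of the experimental material: select the required experimental material and extract long-fragment DNA; fragment the long-fragment DNA and load it onto the gene chip; perform sequencing by synthesis on a sequencer to obtain paired sequencing read data of short fragments, 90-150 bp, for the sample;
    2) Preprocessing of sequencing read data: remove sequencing duplicates from the sequencing read data obtained in step 1), and remove sequencing adapters, barcodes, and low-quality data;
    3) Virtual PCR process: without operating an actual PCR instrument, simulate in a program the PCR amplification of the target gene, thereby obtaining the amplified sequence of the target gene.
  3. The virtual PCR method according to claim 2, characterized in that the coverage of the NGS sequencing is not less than 50X.
  4. The virtual PCR method according to claim 2 or 3, characterized in that the virtual PCR process of step 3) comprises the following steps:
    a) Primer design: use the flanks at the two ends of the target fragment S1 to design a pair of virtual primers, 30-40 bp long, at the 3' and 5' ends respectively, the two primers being called seed_L and seed_R;
    b) 3' end extension: at the start of the program, set the expected extension length and input the initial primer and all sequencing read data; search the sequencing read data for the seed_L primer sequence and sort the hit reads by the position of seed_L within each read, from left to right; select the first-ranked read, cut out the sequence to the right of seed_L on that read and use it for extension, and take the rightmost 30-40 bp of the cut sequence as the new seed_L; continue the search-extend cycle until the sequence has been extended rightward to the set length, forming Contig_L1; likewise, use seed_L_reverse, the reverse complement of seed_L, as the primer and extend leftward to the set length, forming Contig_L_reverse; reverse-complement Contig_L_reverse to form Contig_L2, then take the union of Contig_L2 and Contig_L1 to obtain the 3' extension product Contig_L;
    c) 5' end extension: at the start of the program, set the expected extension length and input the initial primer and all sequencing read data; search the sequencing read data for the seed_R primer sequence and sort the hit reads by the position of seed_R within each read, from right to left; select the first-ranked read, cut out the sequence to the left of seed_R on that read and use it for extension, and take the leftmost 30-40 bp of the cut sequence as the new seed_R; continue the search-extend cycle until the sequence has been extended leftward to the set length, forming Contig_R1; likewise, use seed_R_reverse, the reverse complement of seed_R, as the primer and extend rightward to the set length, forming Contig_R_reverse; reverse-complement Contig_R_reverse to form Contig_R2, then take the union of Contig_R1 and Contig_R2 to obtain the 5' extension product Contig_R;
    d) Obtaining the amplified sequence: merge the two extended sequences Contig_L and Contig_R according to their overlap, taking the union to obtain a contig of the set length and thereby the complete amplified sequence S2, which completes the virtual PCR.
  5. The virtual PCR process according to claim 4, characterized in that if blocking occurs while the target gene is being amplified, the program raises an alarm and prompts replacement with a new primer seed, and the process loops back to the first step and restarts the virtual PCR process.
  6. The virtual PCR process according to claim 5, characterized in that the blocking means that the primer sequence cut from the first-ranked hit read does not meet the set requirements.
  7. The virtual PCR process according to claim 5, characterized in that replacing the primer means redesigning the initial primer seed.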
PCT/CN2018/118584 2018-09-12 2019-01-31 Virtual PCR method for sequence extension based on NGS read search WO2020052101A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811062669.8A CN109097458A (zh) 2018-09-12 2018-09-12 Virtual PCR method for sequence extension based on NGS read search
CN201811062669.8 2018-09-12

Publications (1)

Publication Number Publication Date
WO2020052101A1 (zh) 2020-03-19

Family

ID=64865924

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/118584 WO2020052101A1 (zh) 2018-09-12 2019-01-31 Virtual PCR method for sequence extension based on NGS read search

Country Status (2)

Country Link
CN (1) CN109097458A (zh)
WO (1) WO2020052101A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109097458A (zh) * 2018-09-12 2018-12-28 山东省农作物种质资源中心 Virtual PCR method for sequence extension based on NGS read search

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112687334B (zh) * 2020-12-29 2022-09-23 中南大学 Read mapping and extension method applicable to infectious disease pathogen sequencing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
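CN105303068A (zh) * 2015-10-27 2016-02-03 华中农业大学 Second-generation sequencing data assembly method combining a reference genome with de novo assembly
CN106834465A (zh) * 2017-01-22 2017-06-13 西北农林科技大学 Simple, efficient, and universal method for sequencing plant chloroplast genomes
CN107858408A (zh) * 2016-09-19 2018-03-30 深圳华大基因科技服务有限公司 Method and system for assembling second-generation genome sequences
CN109097458A (zh) * 2018-09-12 2018-12-28 山东省农作物种质资源中心 Virtual PCR method for sequence extension based on NGS read search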

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130123120A1 (en) * 2010-05-18 2013-05-16 Natera, Inc. Highly Multiplex PCR Methods and Compositions
CN104673884B (zh) * 2014-05-24 2017-11-07 四川农业大学 Method for developing polymorphic EST-SSR markers using whole-genome and EST data
WO2017127741A1 (en) * 2016-01-22 2017-07-27 Grail, Inc. Methods and systems for high fidelity sequencing
CN109321646A (zh) * 2018-09-12 2019-02-12 山东省农作物种质资源中心 Virtual PCR method based on alignment of NGS reads to a reference sequence

Also Published As

Publication number Publication date
CN109097458A (zh) 2018-12-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18933372

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18933372

Country of ref document: EP

Kind code of ref document: A1