CN114317529B

CN114317529B - Random splicing method of oligonucleotide chains

Info

Publication number: CN114317529B
Application number: CN202111522088.XA
Authority: CN
Inventors: 夏朋延; 王硕; 朱芳蕊; 钱言
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2021-12-13
Filing date: 2021-12-13
Publication date: 2023-12-01
Anticipated expiration: 2041-12-13
Also published as: CN114317529A

Abstract

The invention relates to the technical field of molecular biology, and in particular to a method for the random splicing of oligonucleotide chains. In the method, any k of n oligonucleotide chains are randomly spliced into a long-chain oligonucleotide, using magnetic beads as a carrier and a primer set designed for the random splicing of oligonucleotide chains. The method splices short-chain oligonucleotides into long-chain oligonucleotides with high efficiency. Because the synthesized random sequences are kept short and the target chain length is reached only afterwards by random splicing, the randomness of the nucleotides within each short-chain oligonucleotide and the polymorphism of the sequences are better preserved. Long-chain oligonucleotide libraries and random peptide libraries constructed by the method have a larger library capacity, higher polymorphism and better application prospects.

Description

A method for random assembly of oligonucleotide chains

Technical field

The invention relates to the technical field of molecular biology, and specifically to a primer set for random assembly of oligonucleotide chains and a method for random assembly of oligonucleotide chains.

Background art

A peptide library is a collection of a large number of short peptides of a specific length and differing sequence; it covers all (or most) of the possible amino acid sequence permutations for peptides of that length. Screening with random peptide libraries is now widely used in fields such as protein-protein interaction studies, drug design and drug screening.

A common way to synthesize a random peptide library is to design palindromic nucleotide sequences during long-chain DNA synthesis, in which the first two nucleotides encoding the random sequence are any nucleotide N (A/T/C/G) and the last nucleotide is designed as K (G/T) on the basis of codon degeneracy. Two single-stranded oligonucleotides are designed according to the vector and the restriction sites (Christian RB, Zuckermann RN, Kerr JM, Wang L, Malcolm BA. Simplified methods for construction, assessment and rapid screening of peptide libraries in bacteriophage. J Mol Biol. 1992;227(3):711-718. doi:10.1016/0022-2836(92)90219-a). During PCR amplification, the two single strands are extended complementarily into a double strand by Taq polymerase. Optimizing the PCR conditions and parameters helps preserve the diversity of the random-sequence DNA when the oligonucleotide library is constructed (Hu Youjia, Gao Feng, Zhu Chunbao, et al. Construction of a random-sequence octapeptide library and its application in the two-hybrid system [J]. Biotechnology, 2007, 17(2):82-86. DOI:10.3969/j.issn.1004-311X.2007.02.030).

At present, the commonly used commercial peptide libraries are mainly the three random peptide libraries supplied by NEB: a 7-peptide library (Ph.D.-7), a 12-peptide library (Ph.D.-12) and a disulfide-constrained cyclic 7-peptide library (Ph.D.-C7C). In the Ph.D.-C7C library, the displayed peptide is flanked by one cysteine on each side. All of these libraries have a capacity of more than 2 billion clones.

There are two main methods for synthesizing random peptide libraries in vitro. The first is Split and Mix, which consists of three basic steps: Split, Couple and Mix. In the Split step, the resin beads are divided into equal portions, one portion for each kind of amino acid in the peptide library. In the Couple step, one specific amino acid component is added to each portion and allowed to react to completion. In the Mix step, all portions are mixed thoroughly. The mixture is then divided again into the same number of portions as before, which gives a homogeneous mixture of equal amounts of peptides. Repeating these three steps N times (N being the number of amino acids in a target peptide) quickly generates a series of new molecules while the number of resin beads stays the same. Each resin bead reacts with only one reactant at a time, so each bead carries only one compound (one-bead one-compound, OBOC library). Because every coupling goes to completion, all peptides in the mixture are present in equal proportions. After the protecting groups are removed, the peptides on the resin beads are ready for testing.

The second method is Pre-mix. Because the OBOC method is limited when larger peptide libraries are synthesized, mixed peptide libraries are commonly synthesized from premixed amino acid solutions. In this method, the amino acids used for library synthesis are mixed in advance and then coupled to a batch of resin beads. Before the last amino acid is coupled, the resin beads are divided into N equal portions (N being the number of amino acid kinds in the peptide library), and a single amino acid is added to each portion. This yields sub-libraries that are each marked by their N-terminal residue. Because every resin bead reacts in the same environment with the same reagents, each bead in a library synthesized this way carries the full set of peptides of its sub-library.

Current methods for synthesizing random peptide libraries generally ensure the richness and diversity of the library only when they are used to generate short peptide sequences (5-10 aa). A short peptide, however, offers only a small binding site for the target protein and therefore a weak interaction, which can reduce the overall screening efficiency and specificity. When existing methods are used to synthesize libraries of longer peptides directly and at random, the length limits the randomness of the amino acid at each position and the diversity of the synthesized peptides, so the resulting library rarely meets the ideal criteria for library construction and screening.

At present, the main DNA assembly methods are the exonuclease method, the uracil-DNA glycosylase method, the special restriction endonuclease method and PCR. The first three methods all work by generating complementary sequences on both sides of the DNA fragments or by joining fragments through overlapping sequences. They require additional restriction sites and therefore introduce extra interfering sequences, so they are not suitable for DNA assembly aimed at constructing short peptide libraries for screening. DNA assembly by PCR requires a complementary region of 10-25 bp between each pair of fragments. This region is often longer than the oligonucleotide chains to be assembled, so PCR is likewise unsuitable for the multi-fragment assembly of short oligonucleotide chains.

Summary of the invention

One object of the present invention is to provide a primer set and a kit for random assembly of oligonucleotide chains. Another object of the present invention is to provide a method for random assembly of oligonucleotide chains and a method for constructing a random long-chain peptide library using that method.

When a random peptide library is expressed and synthesized in an organism, synthesis of the oligonucleotide chain library is the key step. The diversity of the random-sequence DNA in the random oligonucleotide library determines the richness and diversity of the peptide library. For long-chain peptide libraries, synthesizing the random oligonucleotide chains directly leads to poor randomness in the nucleotide distribution, which in turn reduces the diversity of the random DNA sequences. If shorter oligonucleotide chains are synthesized first and then assembled, the randomness of the nucleotide distribution can be better preserved. During development of the present invention, however, it was found that the DNA recombination and assembly methods commonly used in the prior art have several problems when used to join short oligonucleotide chains (within 20 bp): low assembly efficiency, failure to preserve the characteristics of the short fragments during joining, inability to produce a relatively seamless junction (too many interfering sequences are introduced), inability to complete random assembly within a single reaction system, and inability to support blunt-end joining. To solve these problems, the present invention developed an efficient method for the random assembly of oligonucleotide chains. The method uses magnetic beads as a carrier and achieves a high assembly efficiency through the combined action of primers with a specifically designed structure, linkers and different DNA polymerases.

Specifically, the present invention provides the following technical solutions.

First, the present invention provides a primer set for random assembly of oligonucleotide chains, used to randomly assemble any k of n oligonucleotide chains into a long-chain oligonucleotide. The primer set contains n×k primers, where n and k are both integers greater than 1 and k < n.

So that every oligonucleotide chain can appear at any assembly position of the assembled long-chain oligonucleotide, the n×k primers are divided into n subgroups of k primers each. The k primers of each subgroup are as follows:

The primer for the oligonucleotide chain at position 1, counted from the 5' end of the assembled long-chain oligonucleotide, comprises in the 5'-3' direction the reverse complement of the 1st linker and the reverse complement of the 1st oligonucleotide chain;

The primer for the oligonucleotide chain at position 2, counted from the 5' end of the assembled long-chain oligonucleotide, comprises in the 5'-3' direction the reverse complement of the 2nd linker (or the reverse complement of the 2nd linker without its 3'-terminal A), the reverse complement of the 2nd oligonucleotide chain and the reverse complement of the 1st linker;

The primer for the oligonucleotide chain at position i, counted from the 5' end of the assembled long-chain oligonucleotide, comprises in the 5'-3' direction the reverse complement of the i-th linker (or the reverse complement of the i-th linker without its 3'-terminal A), the reverse complement of the i-th oligonucleotide chain and the reverse complement of the (i-1)-th linker, where 2 < i ≤ k-1 and i is an integer;

The primer for the oligonucleotide chain at position k, counted from the 5' end of the assembled long-chain oligonucleotide, comprises in the 5'-3' direction the reverse complement of the k-th oligonucleotide chain and the reverse complement of the (k-1)-th linker.

The position of an oligonucleotide chain in the assembled long-chain oligonucleotide is its rank when the chains are listed in order in the 5'-3' direction.

The present invention found that, because the oligonucleotide chains are short, a linker must be added between any two oligonucleotide chains to complete the assembly. Pairing and extension at the linker regions make it possible to assemble multiple short-chain oligonucleotides.

For the i-th oligonucleotide chain, 2 < i ≤ k-1: if the 3' end of the corresponding i-th linker is not A, the primer comprises in the 5'-3' direction the reverse complement of the i-th linker, the reverse complement of the i-th oligonucleotide chain and the reverse complement of the (i-1)-th linker. If the 3' end of the corresponding i-th linker is A, the primer comprises in the 5'-3' direction the reverse complement of the i-th linker without its 3'-terminal A, the reverse complement of the i-th oligonucleotide chain and the reverse complement of the (i-1)-th linker.
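The primer layout described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the optional vector-complementary and overlap sequences at the two ends are left out, and the rule for dropping a 3'-terminal A is applied to positions 2 through k-1 as the text states.

```python
# Illustrative sketch only; helper names are hypothetical, not from the patent.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def linker_template(linker: str) -> str:
    """Reverse complement of a linker, dropping a 3'-terminal A if present."""
    trimmed = linker[:-1] if linker.upper().endswith("A") else linker
    return reverse_complement(trimmed)


def build_subgroup(oligo: str, linkers: list) -> list:
    """Build the k primers (positions 1..k) for one oligonucleotide chain.

    `linkers` holds the k-1 linker sequences (5'->3'), so k = len(linkers) + 1.
    """
    k = len(linkers) + 1
    rc_oligo = reverse_complement(oligo)
    primers = []
    for pos in range(1, k + 1):
        if pos == 1:
            # RC(linker 1) + RC(oligo)
            primers.append(reverse_complement(linkers[0]) + rc_oligo)
        elif pos < k:
            # RC(linker pos, minus 3' A if any) + RC(oligo) + RC(linker pos-1)
            primers.append(
                linker_template(linkers[pos - 1])
                + rc_oligo
                + reverse_complement(linkers[pos - 2])
            )
        else:
            # RC(oligo) + RC(linker k-1)
            primers.append(rc_oligo + reverse_complement(linkers[k - 2]))
    return primers
```

Calling `build_subgroup` once per oligonucleotide chain gives the n subgroups, that is, n×k primers in total.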

Regarding the design of the linkers between the oligonucleotide chains, the present invention found that the length of the linkers, and whether their sequences are identical, clearly affect the assembly efficiency. When a linker is shorter than 6 nt, the efficiency of correct assembly drops markedly. When the linkers differ in length, or when some linkers share the same sequence, the assembly efficiency also drops markedly.

Preferably, the linkers described above are ≥ 6 nt long, the 1st to (k-1)-th linkers have the same length, and the sequences of the linkers all differ from one another.

Here, sequences that all differ from one another means that no two linkers have 100% sequence similarity.

When a random peptide library is expressed, the synthesized oligonucleotide chain library usually has to be joined to a vector first. To facilitate this, the primers at the two ends of the primer set can additionally contain sequences complementary to the vector sequence.

Preferably, the 3' end of the primer for the oligonucleotide chain at position 1 (counted from the 5' end of the assembled long-chain oligonucleotide) further contains a sequence complementary to the sequence of the vector used for cloning the long-chain oligonucleotide and/or to a restriction site sequence.

The 5' end of the primer for the oligonucleotide chain at position k further contains a sequence that overlaps the 3' end of the primer used to PCR-amplify the assembled single-stranded long-chain oligonucleotide into a blunt-ended double strand.

The sequence complementary to the vector sequence and restriction site sequence only needs to allow efficient joining to the vector; its preferred length is 10-45 bp. The complementary sequence depends on the expression vector chosen.

The sequence that overlaps the 3' end of the primer used to PCR-amplify the assembled single-stranded long-chain oligonucleotide into a blunt-ended double strand is preferably 10-30 bp, more preferably 15-20 bp. If a stop codon is required, it can be introduced at the 3' end of this overlapping sequence.

In one embodiment of the present invention, the vector used is pGADT7-Rec (the vector sequence is shown in SEQ ID NO.40).

The random assembly primer set of the present invention can be used to assemble oligonucleotide chains of different lengths; the chains to be assembled may be of the same length or of different lengths. Experiments confirm that the primer set supports efficient assembly of oligonucleotide chains at least over the length range of 10-20 nt.

Preferably, the oligonucleotide chains to be assembled are 10-20 nt long.

In one embodiment of the present invention, the oligonucleotide chains to be assembled are 12 nt long.

As n increases, the number of possible random combinations and the number of primers both increase, but the size of n does not affect the assembly efficiency. There is therefore no particular limit on n in the primer set.

In theory there is no particular limit on k in the primer set either, but the assembly efficiency may fall as more oligonucleotide chains are assembled into a single long-chain oligonucleotide.

It has been verified that the present invention achieves efficient assembly of at least 4 oligonucleotide chains.

In a preferred embodiment of the present invention, k = 4 and the sequences of the 1st to (k-1)-th linkers (5'-3' direction) are, in order, GGTGCA, GCTGCA and GGAGCA.

The present invention found that these three linkers are more favorable for ensuring a high assembly efficiency.
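The linker preferences stated earlier in the description (length of at least 6 nt, equal lengths, and pairwise distinct sequences) are simple to check mechanically. The sketch below is one hypothetical way to do so; `check_linkers` is an illustrative name, and the check covers only these three stated preferences, not assembly efficiency itself.

```python
def check_linkers(linkers, min_len=6):
    """Return a list of violated preferences; an empty list means all hold."""
    seqs = [s.upper() for s in linkers]
    problems = []
    if any(len(s) < min_len for s in seqs):
        problems.append("a linker is shorter than %d nt" % min_len)
    if len({len(s) for s in seqs}) > 1:
        problems.append("linkers differ in length")
    if len(set(seqs)) < len(seqs):
        problems.append("two linkers share the same sequence")
    return problems


# The three linkers of the preferred embodiment (k = 4).
preferred = ["GGTGCA", "GCTGCA", "GGAGCA"]
print(check_linkers(preferred))
```

For the three preferred linkers the function reports no violations; a set such as `["GGTGC", "GGTGCA", "GGTGCA"]` would trigger all three messages.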

In addition to the primers above, the primer set of the present invention also contains a Block primer, which is a mixture of the reverse-complementary strands of the n oligonucleotide chains.

The Block primer prevents double strands from forming through self-complementation of the oligonucleotide chains. Once assembly has extended to two or more oligonucleotides, such double strands could otherwise serve as templates in subsequent reactions and block further assembly and extension. Suppressing them keeps the unidirectional assembly controllable.

Preferably, the primer set further contains:

an F1 primer, used for coupling to oligo dT and for joining the assembled long-chain oligonucleotide to the cloning vector;

an F2 primer and an R primer, used to PCR-amplify the assembled single-stranded long-chain oligonucleotide into a blunt-ended double strand and to join it to the cloning vector.

The F1 primer serves as an adapter fragment complementary to the vector, so that the assembled long-chain oligonucleotide can be joined to the vector used for its cloning. The F2 primer and the R primer are used together for the final PCR amplification of the random long-chain oligonucleotide library, in which the assembled single strand is filled in to a blunt-ended double strand; they can also be used for sequencing.

Preferably, the F1 primer comprises in the 5'-3' direction the reverse complement of the 20-45 bp upstream of the vector insertion site and a polyA tail (preferably 10-14 bp) that can couple to oligo dT.

The F2 primer comprises in the 5'-3' direction the forward sequence, identical to the vector, of the 15-35 bp upstream of the vector insertion site, and can be used for sequencing.

The R primer comprises in the 5'-3' direction the reverse complement of the 20-45 bp downstream of the vector insertion site and 6-8 nucleotides that overlap the primer for the oligonucleotide chain at position k (counted from the 5' end of the assembled long-chain oligonucleotide).

Preferably, the F2 primer sequence is fully complementary to the 25-35 bp preceding the polyA tail of the F1 primer sequence.

In one embodiment of the present invention, six 12-nt DNA fragments (SEQ ID NO.1-6) are used as the oligonucleotide chains to be assembled, and any 4 of them are randomly assembled. The primer set used for assembly contains 24 (n×k) primers, whose sequences are shown in SEQ ID NO.13-36.
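The primer count for this embodiment follows directly from n×k, and the number of distinct ordered assemblies can be estimated the same way. The short sketch below works this out for n = 6 and k = 4. Whether the same chain can occupy more than one position is an assumption here, so both cases are shown; the function names are illustrative.

```python
from math import perm


def primer_count(n: int, k: int) -> int:
    """Number of assembly primers: n subgroups of k primers each."""
    return n * k


def ordered_assemblies(n: int, k: int, allow_repeats: bool = True) -> int:
    """Count ordered arrangements of k chains chosen from n.

    allow_repeats=True assumes each position is filled independently
    from all n chains; allow_repeats=False assumes k distinct chains.
    """
    return n ** k if allow_repeats else perm(n, k)


print(primer_count(6, 4))                             # 24
print(ordered_assemblies(6, 4))                       # 1296
print(ordered_assemblies(6, 4, allow_repeats=False))  # 360
```

Under either assumption the number of possible long chains grows much faster than the number of primers that have to be synthesized.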

The vector used for cloning after assembly is pGADT7-Rec, and the sequences of the F1 primer, R primer and F2 primer are shown in SEQ ID NO.37-39, in that order.

On the basis of the primer set above, the present invention provides a kit that contains the primer set for random assembly of oligonucleotide chains.

The kit is used for the random assembly of oligonucleotide chains. It can also contain other reagents for the random assembly of oligonucleotide chains, for example DNA polymerase, dNTPs, Klenow enzyme, reaction buffer, magnetic beads and ddH₂O.

The present invention provides the use of the primer set for random assembly of oligonucleotide chains, or of the kit, in the construction of a random oligonucleotide chain library or a random peptide library.

Further, the present invention provides a method for random assembly of oligonucleotide chains, in which magnetic beads are used as a carrier and the primer set described above is used to randomly assemble any k of n oligonucleotide chains into a long-chain oligonucleotide.

Preferably, the method includes the following steps:

(1) PCR: with magnetic beads as the carrier, PCR is performed using the F1 primer and the first primer mixture under the action of a high-fidelity DNA polymerase; after the PCR, the first reaction product is obtained by solid-liquid separation;

The first primer mixture is the mixture of the position-1 primers from each of the n subgroups, that is, the primers for the 1st oligonucleotide chain counted from the 5' end of the assembled long-chain oligonucleotide;

(2) Elution: the first reaction product is mixed with the Block primer and, once the oligonucleotides have paired complementarily, eluted to give the first elution product; the first elution product is mixed with the Block primer again and, once the oligonucleotides have paired complementarily, eluted to give the second elution product;

(3) Extension: starting from the second elution product of step (2), an extension reaction is carried out with the Block primer and the second primer mixture, using dNTPs as substrates and Klenow enzyme, to give the second reaction product;

The second primer mixture is the mixture of the position-2 primers from each of the n subgroups;

The present invention found that, compared with other DNA polymerases, using Klenow enzyme for the extension significantly improves the assembly efficiency between oligonucleotide chains;

(4) Steps (2)-(3) are repeated to assemble the remaining chains of the k oligonucleotide chains one by one, where the extension step for the i-th oligonucleotide chain uses the Block primer and the i-th primer mixture;

The i-th primer mixture is the mixture of the position-i primers from each of the n subgroups, where 2 < i ≤ k-1 and i is an integer;

Finally, steps (2)-(3) are repeated once more to assemble the k-th oligonucleotide chain, where the extension step uses the Block primer and the k-th primer mixture;

The k-th primer mixture is the mixture of the position-k primers from each of the n subgroups;

(5) Elution: after the assembly of step (4) is complete, the assembly product is mixed with the Block primer and eluted to give the elution product;

(6) PCR is performed with the elution product of step (5) as the template and the F2 primer and R primer as primers, and the PCR product is recovered to give the randomly assembled oligonucleotide library.
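The expected sequence layout of one assembled chain (oligo 1, linker 1, oligo 2, linker 2, and so on up to oligo k) can be illustrated with a small in-silico sketch. It models only the sequence outcome, not the bead chemistry, and it assumes that each position is filled independently and uniformly from all n chains. The 12-nt sequences in `OLIGOS` are made-up placeholders, not SEQ ID NO.1-6.

```python
import random

# Hypothetical 12-nt placeholder chains; NOT the patent's SEQ ID NO.1-6.
OLIGOS = [
    "ATGCATGCAAGG", "CCGTTAGGCTAA", "GGATCCTTAGCA",
    "TTAGGCCATGGA", "CAGTCAGTTGCA", "AGCTTGCAAGTC",
]
# Linkers of the preferred embodiment (k = 4), 5'->3'.
LINKERS = ["GGTGCA", "GCTGCA", "GGAGCA"]


def assemble_once(oligos, linkers, rng):
    """Pick one chain per position and join them with the fixed linkers.

    Returns (assembled sequence, list of chosen chains).
    Assumes independent, uniform choice at every position.
    """
    k = len(linkers) + 1
    picks = [rng.choice(oligos) for _ in range(k)]
    parts = []
    for i, oligo in enumerate(picks):
        parts.append(oligo)
        if i < k - 1:
            parts.append(linkers[i])  # linker i+1 follows chain i+1
    return "".join(parts), picks


rng = random.Random(0)  # fixed seed so the run is repeatable
sequence, picks = assemble_once(OLIGOS, LINKERS, rng)
print(len(sequence), sequence)
```

With four 12-nt chains and three 6-nt linkers, every simulated product is 66 nt long, with the linkers at fixed offsets between the randomly chosen chains.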

In step (1) above, the magnetic beads are beads coupled with oligo dT, preferably 25-nt oligo dT.

The high-fidelity DNA polymerase can be any high-fidelity DNA polymerase that yields blunt-ended amplification products, for example Phanta Max Master Mix.

For the solid-liquid separation, a magnetic stand can be used to separate the supernatant from the pellet; the supernatant is then discarded and the pellet is recovered as the first reaction product.

In the PCR reaction system, the final concentration of the F1 primer is 0.3-0.5 μM and that of the first primer mixture is 0.3-0.5 μM.

A preferred 50 μl PCR reaction system is: 10 μl magnetic beads, 25 μl 2× high-fidelity DNA polymerase, the reaction buffer matching the high-fidelity DNA polymerase, 0.4 μM F1 primer and 0.4 μM first primer mixture, made up to volume with water.

The PCR program is: 94-98°C for 5-30 s, 55°C for 10-30 s and 72°C for 10-20 s, for 18-25 cycles.
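The final concentrations in the preferred 50 μl system translate into pipetting volumes through C1·V1 = C2·V2. The sketch below works one case through. The 10 μM primer stock concentration is an assumption (the text does not state stock concentrations), and the code further assumes that the reaction buffer is supplied within the 2× polymerase mix, so it adds no separate buffer volume.

```python
def stock_volume_ul(final_um: float, total_ul: float, stock_um: float) -> float:
    """Volume of stock needed to reach a final concentration (C1*V1 = C2*V2)."""
    return final_um * total_ul / stock_um


def step1_pcr_mix(total_ul=50.0, beads_ul=10.0, polymerase_2x_ul=25.0,
                  primer_final_um=0.4, primer_stock_um=10.0):
    """Volumes for the preferred step (1) PCR system.

    primer_stock_um is a hypothetical stock concentration.
    """
    f1_ul = stock_volume_ul(primer_final_um, total_ul, primer_stock_um)
    mix1_ul = stock_volume_ul(primer_final_um, total_ul, primer_stock_um)
    water_ul = total_ul - beads_ul - polymerase_2x_ul - f1_ul - mix1_ul
    return {
        "magnetic beads": beads_ul,
        "2x polymerase": polymerase_2x_ul,
        "F1 primer": f1_ul,
        "first primer mixture": mix1_ul,
        "water": water_ul,
    }


for component, volume in step1_pcr_mix().items():
    print(f"{component}: {volume:.1f} ul")
```

With a different stock concentration only `primer_stock_um` needs to change; the water volume adjusts so that the total stays at 50 μl.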

In step (2) of the method above, the final concentration of the Block primer is 18-22 μM. Elution consists of incubation at 90-95°C for 2 min followed by incubation at 0-4°C for 1-3 min.

In step (3) of the method above, the final concentration of the Block primer in the extension reaction system is 1-3 μM and that of the primer mixture is 0.5-2 μM.

A preferred extension reaction system (total volume 20 μl) is: the first elution product, 2 μM Block primer, 0.5 mM dNTPs, 1× Klenow reaction buffer, 1 μl Klenow enzyme and 1 μM primer mixture, made up to volume with water.

The primer mixture is preferably denatured at 94°C for 2 min and then held at 37°C.

The components of this reaction system are preferably added in the following order: the Block primer, water, dNTPs and reaction buffer are mixed first, incubated at 94°C for 1-3 min and then incubated at 0-4°C for 1-3 min.

The extension reaction is run at 37°C for 15-25 min.

In step (6) above, the final concentrations of the F2 primer and the R primer in the PCR reaction system are 0.3-0.8 μM.

A preferred PCR reaction system (total volume 50 μl) is: 25 μl 2× high-fidelity DNA polymerase mix, 0.4 μM F2 primer and 0.4 μM R primer, made up to volume with water.

The preferred PCR program is: 94-98°C for 5-30 s, 55°C for 10-30 s and 72°C for 10-20 s, for 18-35 cycles.

The present invention also provides a method for constructing a random peptide library, comprising the following steps: short-chain oligonucleotides encoding a short-chain peptide library are randomly assembled by the oligonucleotide chain random assembly method to obtain a random long-chain oligonucleotide library; the random long-chain oligonucleotide library is ligated into a vector and transferred into host cells for expression to obtain a random long-chain peptide library.

In the above method for constructing a random peptide library, the oligonucleotide chains to be assembled may be random oligonucleotide chains constructed with NNK coding (where N represents any nucleotide and K represents G or T);

alternatively, the oligonucleotide chains to be assembled may be the oligonucleotide chains corresponding to a preliminary-screened peptide library, which is obtained by ligating random oligonucleotide chains constructed with NNK coding (where N represents any nucleotide and K represents G or T) into a vector, expressing the short peptide library in host cells, and performing a preliminary screen.
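To make the NNK scheme concrete, the following sketch enumerates all NNK codons and translates them. It assumes the standard genetic code, which the text does not specify; the variable names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: N = any base at positions 1 and 2, K = G or T at position 3
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
encoded = {CODON_TABLE[c] for c in nnk_codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

print(len(nnk_codons), "NNK codons")
print(len(amino_acids), "amino acids:", "".join(amino_acids))
print("stop codons still possible:", stop_codons)
```

Under the standard code, restricting the third position to G or T keeps every amino acid reachable while leaving TAG as the only possible stop codon, which is the usual motivation for NNK over fully random NNN codons.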

By selecting the short peptide sequences that bind the target more strongly in the preliminary screen for subsequent assembly, long-chain peptide library construction, and re-screening, the workload of library construction and screening is reduced and efficiency is improved; at the same time, the preliminary screening results are verified, which strengthens the reliability of the screening results.

As one embodiment, the present invention provides a method for constructing a 16aa random peptide library, comprising the following steps:

(1) random oligonucleotide chains encoding 4aa, synthesized with NNK coding (where N represents any nucleotide and K represents G or T), are ligated into a vector, the short peptide library is expressed in host cells, and a preliminary screen is performed to obtain a preliminary-screened peptide library;

(2) the oligonucleotide chains corresponding to each polypeptide in the preliminary-screened peptide library are synthesized and assembled by the oligonucleotide chain random assembly method to obtain a random oligonucleotide library encoding the random peptide library; the random oligonucleotide library is ligated into a vector and transferred into host cells for expression to obtain the 16aa random peptide library;

wherein, in the oligonucleotide chain random assembly method, each of the n oligonucleotide chains is 12 nt long and k = 4.

Compared with a random short peptide library, the 16aa random peptide library can interact with the target protein through multiple binding sites, improving screening efficiency and specificity.

The beneficial effects of the present invention include at least the following. The primer set and random assembly method provided by the present invention solve the problem that, during synthesis of a random long-chain oligonucleotide library, certain positions show a bias, so that the nucleotide distribution at those positions in the double-stranded DNA is insufficiently random, leaving the library with insufficient polymorphism and unsatisfactory screening performance. The oligonucleotide chain random assembly method provided by the present invention can randomly assemble short-chain oligonucleotides into long-chain oligonucleotides with high efficiency. By reducing the length of the random nucleotide sequence that is synthesized directly, it preserves the randomness and polymorphism of the DNA in the short-chain oligonucleotide library; the target oligonucleotide length is then reached through efficient random assembly. The long-chain oligonucleotide library and random peptide library constructed in this way have a large library capacity and maintain both the randomness of each position and the sequence polymorphism of the library.

The random peptide library constructed by the random assembly method of the present invention can be used in screens such as yeast two-hybrid and phage display, and has good application prospects.

Description of the drawings

Figure 1 is a schematic diagram of the oligonucleotide chain assembly process in Example 1 of the present invention.

Figure 2 shows the insert sequencing results of positive assembled clones in Example 1 of the present invention.

Figure 3 shows the electrophoresis results of the assembly products obtained with 6 nt linkers in Example 1 and without linkers in Comparative Example 1. Lanes 1 to 5 are, in order: marker; final product after 33 cycles without linkers (Comparative Example 1); final product after 20 cycles with 6 nt linkers (Example 1); final product after 33 cycles with 6 nt linkers (Example 1); and the 132 bp positive control.

Figure 4 shows the sequencing results of the incorrectly assembled clones in Comparative Example 2.

Figure 5 shows the sequencing results of the incorrectly assembled clones in Comparative Example 3.

In Figures 4 and 5, insert num is the number of 12 nt oligonucleotide chains, insert length is the length of the insert, and insert sequence is the sequenced insert.

Detailed description

The following examples illustrate the present invention but are not intended to limit its scope.

The magnetic beads used in the following examples are coupled to 25 nt oligo dT and were purchased from Invitrogen (Dynabeads™ Oligo(dT)25, catalog no. 61002); 2× Phanta Max Master Mix is Vazyme catalog no. P515-01; the Klenow enzyme is the 3'-5' exo- product from NEB.

Example 1: Assembly of oligonucleotide chains

In this example, six 12 nt DNA fragments serve as the oligonucleotide chains to be assembled, and any four of them (designated A, B, C, and D) are randomly assembled. The basic procedure is as follows: with magnetic beads as the carrier, Smart3_RC is first attached to the oligo-dT of the beads; A, B, C, and D are then attached to the beads in different orders; finally, the assembled long-chain oligonucleotides are amplified with the AdrecF primer and the CDSIII_RC primer. The assembled long-chain oligonucleotide has the structure: oligonucleotide chain A - linker 1 - oligonucleotide chain B - linker 2 - oligonucleotide chain C - linker 3 - oligonucleotide chain D. The sequences (5'-3') of linkers 1, 2, and 3 are GGTGCA, GCTGCA, and GGAGCA, respectively.
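The chain-linker structure described above can be sketched as plain string concatenation. The six chain sequences are the ones listed for this example; the `assemble` helper is hypothetical, and the library count assumes that each of the four positions draws independently from all six chains (repeats allowed), which is an inference from the fact that every position-specific primer mixture contains all six chains.

```python
from itertools import product

CHAINS = [
    "GTGGCGATTCAG", "TGGGCTAGTGAT", "CGGGTGCCGCTT",
    "TTGCTTGTTCAG", "AATGCTACTGGT", "CCGTGTACGGCT",
]
LINKERS = ["GGTGCA", "GCTGCA", "GGAGCA"]  # linkers 1, 2, 3 (5'-3')

def assemble(chains, linkers=LINKERS):
    """Join chains as chain-linker-chain-...-chain."""
    if len(chains) != len(linkers) + 1:
        raise ValueError("need exactly one more chain than linkers")
    parts = [chains[0]]
    for linker, chain in zip(linkers, chains[1:]):
        parts += [linker, chain]
    return "".join(parts)

# Assumption: every position can hold any of the six chains
library = {assemble(combo) for combo in product(CHAINS, repeat=4)}

example = assemble(CHAINS[:4])
print(example, len(example))
print(len(library), "distinct assembled sequences")
```

With these inputs each assembled sequence is 66 nt long, and six chains over four positions give 6^4 = 1296 possible sequences.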

The specific random assembly method is as follows; the main workflow is shown in Figure 1:

1. PCR: take 10 μl of magnetic beads, add 50 μl of water, mix, and wash at room temperature; wash twice in total;

Add 2 μl F1 primer (10 μM), 2 μl A_RC (10 μM), 21 μl H₂O, and 25 μl 2× Phanta Mix to the washed beads and mix to obtain the PCR reaction system;

Run the PCR reaction system with the following program: 95°C for 15 s, 55°C for 15 s, 72°C for 15 s, 20 cycles, mixing by pipetting every 10 min;

After the PCR, mix by pipetting, place on a magnetic rack at room temperature, and remove the supernatant to obtain the first reaction product;

2. Elution: add 20 μl Block primer (20 μM) to the first reaction product from step 1, mix, incubate at 94°C for 2 min, place on ice for 2 min, place on the magnetic rack, and remove the supernatant; repeat the Block primer elution once, place on the magnetic rack, and remove the supernatant to obtain the elution product;

3. Extension: to the elution product from step 2 add 2 μl Block primer (20 μM), 12 μl H₂O, 1 μl dNTPs (10 mM each), and 2 μl 10× NEB Buffer 2; mix by pipetting, incubate at 94°C for 2 min, and place on ice for 2 min; then add 1 μl Klenow (exo-, NEB); finally add 2 μl B_RC primer (10 μM) that has been denatured at 94°C for 2 min and held at 37°C, mix, and react at 37°C for 20 min, mixing by pipetting every 10 min;

4. Elution: repeat step 2;

5. Extension: repeat step 3, except that 2 μl C_RC primer (10 μM) replaces the B_RC primer (10 μM);

6. Elution: repeat step 2;

7. Extension: repeat step 3, except that 2 μl D_RC primer (10 μM) replaces the B_RC primer (10 μM);

8. Elution: repeat step 2 to obtain the elution product;

9. Using the elution product from step 8 as the template, perform PCR amplification with the F2 primer and the R primer and recover the amplification product to obtain the randomly assembled oligonucleotide library;

The PCR amplification reaction system (total volume 50 μl) is: 25 μl 2× Phanta Max Master Mix, 0.4 μM AdrecF primer, and 0.4 μM Cds3_RC primer, made up to volume with water.

The PCR amplification program is: 95°C for 30 s, 55°C for 15 s, and 72°C for 15 s, for 20 or 33 cycles.
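As a quick consistency check, the cycling programs used in this example can be compared against the preferred ranges given in the general description. The sketch below is illustrative only: the dictionary keys and the `out_of_range` helper are hypothetical names, and the annealing (55°C) and extension (72°C) temperatures are omitted because they are fixed values in both places.

```python
# Preferred ranges from the general description: (min, max) per parameter
RANGES = {
    "bead PCR": {"denature_C": (94, 98), "denature_s": (5, 30),
                 "anneal_s": (10, 30), "extend_s": (10, 20), "cycles": (18, 25)},
    "final PCR": {"denature_C": (94, 98), "denature_s": (5, 30),
                  "anneal_s": (10, 30), "extend_s": (10, 20), "cycles": (18, 35)},
}

# Programs actually run in Example 1 (step 1 and step 9)
EXAMPLE_1 = {
    "bead PCR": [dict(denature_C=95, denature_s=15, anneal_s=15, extend_s=15, cycles=20)],
    "final PCR": [dict(denature_C=95, denature_s=30, anneal_s=15, extend_s=15, cycles=c)
                  for c in (20, 33)],
}

def out_of_range(program, ranges):
    """Return the parameter names whose values fall outside their (min, max) range."""
    return [k for k, (lo, hi) in ranges.items() if not lo <= program[k] <= hi]

for stage, programs in EXAMPLE_1.items():
    for program in programs:
        problems = out_of_range(program, RANGES[stage])
        print(stage, program["cycles"], "cycles:", problems or "within preferred ranges")
```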

In the above method, the sequences of the six 12 nt oligonucleotide chains are as follows (written 5'-3', as are all primer sequences below):

GTGGCGATTCAG;

TGGGCTAGTGAT;

CGGGTGCCGCTT;

TTGCTTGTTCAG;

AATGCTACTGGT;

CCGTGTACGGCT;

The Block primer is a mixture of the reverse complements of the six 12 nt oligonucleotide chains; the primer sequences it contains are as follows:

CTGAATCGCCAC;

ATCACTAGCCCA;

AAGCGGCACCCG;

CTGAACAAGCAA;

ACCAGTAGCATT;

AGCCGTACACGG;
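The relationship between the six chains and the Block primers can be checked with a few lines of code. This is a minimal sketch; `reverse_complement` is an illustrative helper, not part of the described method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

CHAINS = [
    "GTGGCGATTCAG", "TGGGCTAGTGAT", "CGGGTGCCGCTT",
    "TTGCTTGTTCAG", "AATGCTACTGGT", "CCGTGTACGGCT",
]
BLOCK_PRIMERS = [
    "CTGAATCGCCAC", "ATCACTAGCCCA", "AAGCGGCACCCG",
    "CTGAACAAGCAA", "ACCAGTAGCATT", "AGCCGTACACGG",
]

derived = [reverse_complement(chain) for chain in CHAINS]
for chain, block in zip(CHAINS, derived):
    print(chain, "->", block)
print("matches listed Block primers:", derived == BLOCK_PRIMERS)
```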

Primer A_RC is a mixture of all the primers (six primers) in which one of the six 12 nt oligonucleotide chains occupies the first oligonucleotide chain position (in the 5'-3' direction) of the assembled long-chain oligonucleotide;

The sequences of these six primers are as follows:

A_RC1: TGCACCCTGAATCGCCACGGGCCATAATGGCCACTC;

A_RC2: TGCACCATCACTAGCCCAGGGCCATAATGGCCACTC;

A_RC3: TGCACCAAGCGGCACCCGGGGCCATAATGGCCACTC;

A_RC4: TGCACCCTGAACAAGCAAGGGCCATAATGGCCACTC;

A_RC5: TGCACCACCAGTAGCATTGGGCCATAATGGCCACTC;

A_RC6: TGCACCAGCCGTACACGGGGGCCATAATGGCCACTC.

Primer B_RC is a mixture of all the primers (six primers) in which one of the six 12 nt oligonucleotide chains occupies the second oligonucleotide chain position (in the 5'-3' direction) of the assembled long-chain oligonucleotide;

The sequences of these six primers are as follows:

B_RC1: GCAGCCTGAATCGCCACTGCACC;

B_RC2: GCAGCATCACTAGCCCATGCACC;

B_RC3: GCAGCAAGCGGCACCCGTGCACC;

B_RC4: GCAGCCTGAACAAGCAATGCACC;

B_RC5: GCAGCACCAGTAGCATTTGCACC;

B_RC6: GCAGCAGCCGTACACGGTGCACC.

Primer C_RC is a mixture of all the primers (six primers) in which one of the six 12 nt oligonucleotide chains occupies the third oligonucleotide chain position (in the 5'-3' direction) of the assembled long-chain oligonucleotide;

The sequences of these six primers are as follows:

C_RC1: GCTCCCTGAATCGCCACTGCAGC;

C_RC2: GCTCCATCACTAGCCCATGCAGC;

C_RC3: GCTCCAAGCGGCACCCGTGCAGC;

C_RC4: GCTCCCTGAACAAGCAATGCAGC;

C_RC5: GCTCCACCAGTAGCATTTGCAGC;

C_RC6: GCTCCAGCCGTACACGGTGCAGC.

Primer D_RC is a mixture of all the primers (six primers) in which one of the six 12 nt oligonucleotide chains occupies the fourth oligonucleotide chain position (in the 5'-3' direction) of the assembled long-chain oligonucleotide.

The sequences of these six primers are as follows:

D_RC1: GAGGCGGCCGACATGCTACTGAATCGCCACTGCTCC;

D_RC2: GAGGCGGCCGACATGCTAATCACTAGCCCATGCTCC;

D_RC3: GAGGCGGCCGACATGCTAAAGCGGCACCCGTGCTCC;

D_RC4: GAGGCGGCCGACATGCTACTGAACAAGCAATGCTCC;

D_RC5: GAGGCGGCCGACATGCTAACCAGTAGCATTTGCTCC;

D_RC6: GAGGCGGCCGACATGCTAAGCCGTACACGGTGCTCC.
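The four primer sets follow a regular pattern that can be reproduced from the chain and linker sequences. The sketch below is inferred from the listed sequences rather than stated in the text: in particular, the B_RC and C_RC primers as listed carry only five of the six bases of the next linker's reverse complement at their 5' end. The names `F1_ANCHOR`, `R_ANCHOR`, and `position_primers` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def rc(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

LINKER1, LINKER2, LINKER3 = "GGTGCA", "GCTGCA", "GGAGCA"
F1_ANCHOR = "GGGCCATAATGGCCACTC"  # 3' end shared by all listed A_RC primers
R_ANCHOR = "GAGGCGGCCGACATGCTA"   # 5' end shared by all listed D_RC primers

def position_primers(chain):
    """Rebuild the A_RC..D_RC primers for one 12 nt chain (pattern inferred from the listing)."""
    core = rc(chain)
    return {
        "A_RC": rc(LINKER1) + core + F1_ANCHOR,
        # As listed, B_RC and C_RC omit the first base of the next linker's reverse complement
        "B_RC": rc(LINKER2)[1:] + core + rc(LINKER1),
        "C_RC": rc(LINKER3)[1:] + core + rc(LINKER2),
        "D_RC": R_ANCHOR + core + rc(LINKER3),
    }

for name, seq in position_primers("GTGGCGATTCAG").items():
    print(f"{name}1: {seq} ({len(seq)} nt)")
```

For the first chain this reproduces A_RC1, B_RC1, C_RC1, and D_RC1 as listed above, with lengths of 36, 23, 23, and 36 nt.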

The sequence of the F1 primer is as follows:

GGGCCATAATGGCCACTCTGCGTTGATACCACTGCTTGGGTGGAAAAAAAAAAAAAAAA.

The sequence of the R primer is as follows:

GTATCGATGCCCACCCTCTAGAGGCCGAGGCGGCCGACATGCTA.

The sequence of the F2 primer is as follows:

TTCCACCCAAGCAGTGGTATCAACGCAGAGT.

Comparative Example 1

This comparative example provides a method for assembling oligonucleotide chains that differs from Example 1 only in that no linkers are placed between the short-chain oligonucleotides to be assembled; the linker sequences are accordingly deleted from each primer sequence of Example 1. All other procedures are the same as in Example 1.

Comparative Example 2

This comparative example provides a method for assembling oligonucleotide chains that differs from Example 1 only in that the linkers of Example 1 are replaced with the following linkers:

Linker 1 is GGA, linker 2 is GGA, and linker 3 is GGA.

The linker sequences in each primer sequence are replaced accordingly. All other procedures are the same as in Example 1.

Comparative Example 3

Linker 1 is GGGGGA, linker 2 is GGGGGA, and linker 3 is GGGGGA.

Experimental example

The random oligonucleotide chain libraries constructed in the above Example and Comparative Examples were tested for assembly efficiency, library capacity, and polymorphism, as follows:

1. Library concentration

Agarose gel electrophoresis was used to examine the final assembled products of Example 1 and Comparative Example 1 obtained with 20 and 33 PCR amplification cycles in step 9. As shown in Figure 3, Comparative Example 1 (no linkers) gave no clear product band even after 33 cycles, indicating that no successfully assembled product is obtained without linkers; the assembly efficiency was 0, so no subsequent cloning or sequencing was performed. In contrast, the final assembled products of Example 1 gave clear bands after both 20 and 33 cycles, indicating a final product amount of about 100 ng.

Calculated according to the following formula, given that a clear band is visible after 20 cycles of amplification, the capacity of the random oligonucleotide chain library corresponding to the assembled 16aa peptide library is about 80 million copies.

where N is the number of cycles, M is the copy number, and mass is in ng.

The above results show that the random oligonucleotide chain library constructed by the random assembly method of Example 1 can meet the capacity requirements for a screening library.

2. Assembly efficiency and polymorphism

The final assembled products of Example 1 and each Comparative Example were run on a DNA agarose gel; the DNA products were gel-purified and ligated into a T vector for cloning, and the resulting clones were sent for sequencing. A successfully assembled long-chain oligonucleotide in Example 1 should be 66 nt long, so the insert in a positive clone should be 66 bp.

For Example 1, 50 clones each were randomly picked from the 20-cycle and 33-cycle final assembled products for sequencing, 100 in total. After excluding clones with nothing ligated into the vector, 81 clones with inserts were obtained and verified by DNA sequencing. The clone sequencing statistics for the final assembled product from 20 PCR cycles are shown in Table 1; the mutations in the successfully assembled clones (insert length 66 bp) and the proportion at which each short-chain oligonucleotide occurs among all successfully assembled clones are shown in Table 2. The clone sequencing statistics for the final assembled product from 33 PCR cycles are shown in Table 3, and the corresponding mutation and occurrence statistics for its successfully assembled clones (insert length 66 bp) are shown in Table 4. Sequencing results for some of the successfully assembled long-chain oligonucleotides are shown in Figure 2.

The results show that the assembly success rate for the short-chain oligonucleotides of Example 1 is 46%, indicating that the assembly method of the present invention has a high assembly efficiency. Among the successfully assembled clones, the six 12 bp short-chain oligonucleotides occur at relatively even rates, showing that the long-chain oligonucleotide library is highly polymorphic and can meet the polymorphism requirements for oligonucleotide libraries in peptide library screening.

Table 1. Clone sequencing statistics for the final assembled product after 20 amplification cycles

Category                 Number of clones   Proportion of clones (%)
Insert present           42                 -
Insert length 66 bp      17                 40.50
Insert length 30 bp      1                  2.40
Insert length 48 bp      7                  16.60
Other lengths            17                 40.50

Table 2. Clone sequencing statistics for the successfully assembled final product after 20 amplification cycles

Note: In Table 2, the proportion given for each oligonucleotide under "no mutation" is its share of all mutation-free clones, and the proportion under "with mutation" is the share of mutations among a total of 68 clones.

Table 3. Clone sequencing statistics for the final assembled product after 33 amplification cycles

Category                 Number of clones   Proportion of clones (%)
Insert present           39                 -
Insert length 66 bp      20                 51.30
Insert length 30 bp      2                  5.10
Insert length 48 bp      9                  23.10
Other lengths            8                  20.50

Table 4. Clone sequencing statistics for the successfully assembled final product after 33 amplification cycles

Note: In Table 4, the proportion given for each oligonucleotide under "no mutation" is its share of all mutation-free clones, and the proportion under "with mutation" is the share of mutations among a total of 80 clones.
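The overall success rate quoted for Example 1 can be recomputed from the clone counts in Tables 1 and 3. The sketch below only re-derives percentages from those counts; the dictionary layout and the `percentages` helper are illustrative.

```python
# Clone counts by insert length, taken from Table 1 (20 cycles) and Table 3 (33 cycles)
TABLE_1 = {"66 bp": 17, "30 bp": 1, "48 bp": 7, "other": 17}
TABLE_3 = {"66 bp": 20, "30 bp": 2, "48 bp": 9, "other": 8}

def percentages(counts):
    """Share of each category in percent, rounded to one decimal place."""
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()}

for label, table in (("20 cycles", TABLE_1), ("33 cycles", TABLE_3)):
    print(label, sum(table.values()), "clones with insert", percentages(table))

success = TABLE_1["66 bp"] + TABLE_3["66 bp"]
total = sum(TABLE_1.values()) + sum(TABLE_3.values())
print(f"overall: {success}/{total} = {100 * success / total:.0f}%")
```

The combined figure, 37 of 81 clones, rounds to the 46% stated in the text. Recomputed per-category values can differ from the tabulated ones in the last digit (for example, 7 of 42 rounds to 16.7, whereas Table 1 lists 16.60).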

For Comparative Examples 2 and 3, cloning of the final assembled products obtained after 33 PCR cycles yielded 21 and 36 clones with inserts, respectively. These clones were verified by DNA sequencing. A successfully assembled long-chain oligonucleotide should be 57 nt long in Comparative Example 2 and 66 nt long in Comparative Example 3, so the inserts in positive clones should be 57 bp and 66 bp, respectively.
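The expected insert lengths follow from simple arithmetic: k chains of 12 nt joined by k - 1 linkers. The sketch below assumes all linkers in a design have the same length, which holds for the designs described here; the 48 nt value for the linker-free design of Comparative Example 1 is computed from the same formula and is not stated in the text.

```python
CHAIN_NT = 12  # length of each short-chain oligonucleotide
K = 4          # number of chains assembled

def expected_insert_length(linker_nt, k=K, chain_nt=CHAIN_NT):
    """Length of k chains joined by k - 1 linkers of equal length."""
    return k * chain_nt + (k - 1) * linker_nt

designs = {
    "Example 1 (6 nt linkers)": 6,
    "Comparative Example 1 (no linkers)": 0,
    "Comparative Example 2 (GGA)": 3,
    "Comparative Example 3 (GGGGGA)": 6,
}
for name, linker_nt in designs.items():
    print(f"{name}: {expected_insert_length(linker_nt)} nt")
```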

The clone sequencing statistics for Comparative Examples 2 and 3 are shown in Figures 4 and 5. None of the clones in Comparative Example 2 or Comparative Example 3 had an insert of the expected assembled length (57 bp and 66 bp, respectively); the positive rate of the assembly methods of Comparative Examples 2 and 3 is therefore 0.

Although the present invention has been described in detail above by way of a general description and specific embodiments, it will be apparent to those skilled in the art that modifications or improvements can be made on the basis of the present invention. Such modifications or improvements, made without departing from the spirit of the present invention, all fall within the scope of protection claimed.

Sequence listing

<110> Peking University
<120> A method for random assembly of oligonucleotide chains
<130> KHP211121626.2
<160> 40
<170> SIPOSequenceListing 1.0

<210> 1
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 1
gtggcgattc ag 12

<210> 2
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 2
tgggctagtg at 12

<210> 3
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 3
cgggtgccgc tt 12

<210> 4
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 4
ttgcttgttc ag 12

<210> 5
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 5
aatgctactg gt 12

<210> 6
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 6
ccgtgtacgg ct 12

<210> 7
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 7
ctgaatcgcc ac 12

<210> 8
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 8
atcactagcc ca 12

<210> 9
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 9
aagcggcacc cg 12

<210> 10
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 10
ctgaacaagc aa 12

<210> 11
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 11
accagtagca tt 12

<210> 12
<211> 12
<212> DNA
<213> Artificial Sequence
<400> 12
agccgtacac gg 12

<210> 13
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 13
tgcaccctga atcgccacgg gccataatgg ccactc 36

<210> 14
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 14
tgcaccatca ctagcccagg gccataatgg ccactc 36

<210> 15
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 15
tgcaccaagc ggcacccggg gccataatgg ccactc 36

<210> 16
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 16
tgcaccctga acaagcaagg gccataatgg ccactc 36

<210> 17
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 17
tgcaccacca gtagcattgg gccataatgg ccactc 36

<210> 18
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 18
tgcaccagcc gtacacgggg gccataatgg ccactc 36

<210> 19
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 19
gcagcctgaa tcgccactgc acc 23

<210> 20
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 20
gcagcatcac tagcccatgc acc 23

<210> 21
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 21
gcagcaagcg gcacccgtgc acc 23

<210> 22
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 22
gcagcctgaa caagcaatgc acc 23

<210> 23
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 23
gcagcaccag tagcatttgc acc 23

<210> 24
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 24
gcagcagccg tacacggtgc acc 23

<210> 25
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 25
gctccctgaa tcgccactgc agc 23

<210> 26
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 26
gctccatcac tagcccatgc agc 23

<210> 27
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 27
gctccaagcg gcacccgtgc agc 23

<210> 28
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 28
gctccctgaa caagcaatgc agc 23

<210> 29
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 29
gctccaccag tagcatttgc agc 23

<210> 30
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 30
gctccagccg tacacggtgc agc 23

<210> 31
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 31
gaggcggccg acatgctact gaatcgccac tgctcc 36

<210> 32
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 32
gaggcggccg acatgctaat cactagccca tgctcc 36

<210> 33
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 33
gaggcggccg acatgctaaa gcggcacccg tgctcc 36

<210> 34
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 34
gaggcggccg acatgctact gaacaagcaa tgctcc 36

<210> 35
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 35
gaggcggccg acatgctaac cagtagcatt tgctcc 36

<210> 36
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 36
gaggcggccg acatgctaag ccgtacacgg tgctcc 36

<210> 37<210> 37

<211> 59<211> 59

<212> DNA<212> DNA

<213> 人工序列(Artificial Sequence)<213> Artificial Sequence

<400> 37<400> 37

gggccataat ggccactctg cgttgatacc actgcttggg tggaaaaaaa aaaaaaaaa 59gggccataat ggccactctg cgttgatacc actgcttggg tggaaaaaaa aaaaaaaaa 59

<210> 38<210> 38

<211> 44<211> 44

<212> DNA<212> DNA

<213> 人工序列(Artificial Sequence)<213> Artificial Sequence

<400> 38<400> 38

gtatcgatgc ccaccctcta gaggccgagg cggccgacat gcta 44gtatcgatgc ccaccctcta gaggccgagg cggccgacat gcta 44

<210> 39<210> 39

<211> 31<211> 31

<212> DNA<212> DNA

<213> 人工序列(Artificial Sequence)<213> Artificial Sequence

<400> 39<400> 39

ttccacccaa gcagtggtat caacgcagag t 31ttccacccaa gcagtggtat caacgcagag t 31

<210> 40

<211> 8058

<212> DNA

<213> Artificial Sequence

<400> 40

tgcatgcctg caggtcgaga tccgggatcg aagaaatgat ggtaaatgaa ataggaaatc 60

aaggagcatg aaggcaaaag acaaatataa gggtcgaacg aaaaataaag tgaaaagtgt 120

tgatatgatg tatttggctt tgcggcgccg aaaaaacgag tttacgcaat tgcacaatca 180

tgctgactct gtggcggacc cgcgctcttg ccggcccggc gataacgctg ggcgtgaggc 240

tgtgcccggc ggagtttttt gcgcctgcat tttccaaggt ttaccctgcg ctaaggggcg 300

agattggaga agcaataaga atgccggttg gggttgcgat gatgacgacc acgacaactg 360

gtgtcattat ttaagttgcc gaaagaacct gagtgcattt gcaacatgag tatactagaa 420

gaatgagcca agacttgcga gacgcgagtt tgccggtggt gcgaacaata gagcgaccat 480

gaccttgaag gtgagacgcg cataaccgct agagtacttt gaagaggaaa cagcaatagg 540

gttgctacca gtataaatag acaggtacat acaacactgg aaatggttgt ctgtttgagt 600

acgctttcaa ttcatttggg tgtgcacttt attatgttac aatatggaag ggaactttac 660

acttctccta tgcacatata ttaattaaag tccaatgcta gtagagaagg ggggtaacac 720

ccctccgcgc tcttttccga tttttttcta aaccgtggaa tatttcggat atccttttgt 780

tgtttccggg tgtacaatat ggacttcctc ttttctggca accaaaccca tacatcggga 840

ttcctataat accttcgttg gtctccctaa catgtaggtg gcggagggga gatatacaat 900

agaacagata ccagacaaga cataatgggc taaacaagac tacaccaatt acactgcctc 960

attgatggtg gtacataacg aactaatact gtagccctag acttgatagc catcatcata 1020

tcgaagtttc actacccttt ttccatttgc catctattga agtaataata ggcgcatgca 1080

acttcttttc tttttttttc ttttctctct cccccgttgt tgtctcacca tatccgcaat 1140

gacaaaaaaa tgatggaaga cactaaagga aaaaattaac gacaaagaca gcaccaacag 1200

atgtcgttgt tccagagctg atgaggggta tctcgaagca cacgaaactt tttccttcct 1260

tcattcacgc acactactct ctaatgagca acggtatacg gccttccttc cagttacttg 1320

aatttgaaat aaaaaaaagt ttgctgtctt gctatcaagt ataaatagac ctgcaattat 1380

taatcttttg tttcctcgtc attgttctcg ttccctttct tccttgtttc tttttctgca 1440

caatatttca agctatacca agcatacaat caactccaag ctttgcaaag atggataaag 1500

cggaattaat tcccgagcct ccaaaaaaga agagaaaggt cgaattgggt accgccgcca 1560

attttaatca aagtgggaat attgctgata gctcattgtc cttcactttc actaacagta 1620

gcaacggtcc gaacctcata acaactcaaa caaattctca agcgctttca caaccaattg 1680

cctcctctaa cgttcatgat aacttcatga ataatgaaat cacggctagt aaaattgatg 1740

atggtaataa ttcaaaacca ctgtcacctg gttggacgga ccaaactgcg tataacgcgt 1800

ttggaatcac tacagggatg tttaatacca ctacaatgga tgatgtatat aactatctat 1860

tcgatgatga agatacccca ccaaacccaa aaaaagagat ctttaatacg actcactata 1920

gggcgagcgc cgccatggag tacccatacg acgtaccaga ttacgctcat atggccatgg 1980

aggccagtga attccaccca agcagtggta tcaacgcaga gtggccatta tggcccggga 2040

aaaaacatgt cggccgcctc ggcctctaga gggtgggcat cgatacggga tccatcgagc 2100

tcgagctgca gatgaatcgt agatactgaa aaaccccgca agttcacttc aactgtgcat 2160

cgtgcaccat ctcaatttct ttcatttata catcgttttg ccttctttta tgtaactata 2220

ctcctctaag tttcaatctt ggccatgtaa cctctgatct atagaatttt ttaaatgact 2280

agaattaatg cccatctttt ttttggacct aaattcttca tgaaaatata ttacgagggc 2340

ttattcagaa gctttggact tcttcgccag aggtttggtc aagtctccaa tcaaggttgt 2400

cggcttgtct accttgccag aaatttacga aaagatggaa aagggtcaaa tcgttggtag 2460

atacgttgtt gacacttcta aataagcgaa tttcttatga tttatgattt ttattattaa 2520

ataagttata aaaaaaataa gtgtatacaa attttaaagt gactcttagg ttttaaaacg 2580

aaaattctta ttcttgagta actctttcct gtaggtcagg ttgctttctc aggtatagca 2640

tgaggtcgct cttattgacc acacctctac cggccggtcg aaattcccct accctatgaa 2700

catattccat tttgtaattt cgtgtcgttt ctattatgaa tttcatttat aaagtttatg 2760

tacaaatatc ataaaaaaag agaatctttt taagcaagga ttttcttaac ttcttcggcg 2820

acagcatcac cgacttcggt ggtactgttg gaaccaccta aatcaccagt tctgatacct 2880

gcatccaaaa cctttttaac tgcatcttca atggccttac cttcttcagg caagttcaat 2940

gacaatttca acatcattgc agcagacaag atagtggcga tagggttgac cttattcttt 3000

ggcaaatctg gagcagaacc gtggcatggt tcgtacaaac caaatgcggt gttcttgtct 3060

ggcaaagagg ccaaggacgc agatggcaac aaacccaagg aacctgggat aacggaggct 3120

tcatcggaga tgatatcacc aaacatgttg ctggtgatta taataccatt taggtgggtt 3180

gggttcttaa ctaggatcat ggcggcagaa tcaatcaatt gatgttgaac cttcaatgta 3240

ggaaattcgt tcttgatggt ttcctccaca gtttttctcc ataatcttga agaggccaaa 3300

acattagctt tatccaagga ccaaataggc aatggtggct catgttgtag ggccatgaaa 3360

gcggccattc ttgtgattct ttgcacttct ggaacggtgt attgttcact atcccaagcg 3420

acaccatcac catcgtcttc ctttctctta ccaaagtaaa tacctcccac taattctctg 3480

acaacaacga agtcagtacc tttagcaaat tgtggcttga ttggagataa gtctaaaaga 3540

gagtcggatg caaagttaca tggtcttaag ttggcgtaca attgaagttc tttacggatt 3600

tttagtaaac cttgttcagg tctaacacta cctgtacccc atttaggacc acccacagca 3660

cctaacaaaa cggcatcagc cttcttggag gcttccagcg cctcatctgg aagtgggaca 3720

cctgtagctt cgatagcagc accaccaatt aaatgatttt cgaaatcgaa cttgacattg 3780

gaacgaacat cagaaatagc tttaagaacc ttaatggctt cggctgtgat ttcttgacca 3840

acgtggtcac ctggcaaaac gacgatcttc ttaggggcag acattagaat ggtatatcct 3900

tgaaatatat atatatattg ctgaaatgta aaaggtaaga aaagttagaa agtaagacga 3960

ttgctaacca cctattggaa aaaacaatag gtccttaaat aatattgtca acttcaagta 4020

ttgtgatgca agcatttagt catgaacgct tctctattct atatgaaaag ccggttccgg 4080

cgctctcacc tttccttttt ctcccaattt ttcagttgaa aaaggtatat gcgtcaggcg 4140

acctctgaaa ttaacaaaaa atttccagtc atcgaatttg attctgtgcg atagcgcccc 4200

tgtgtgttct cgttatgttg aggaaaaaaa taatggttgc taagagattc gaactcttgc 4260

atcttacgat acctgagtat tcccacagtt gggggatctc gactctagct agaggatcaa 4320

ttcgtaatca tgtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac 4380

aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 4440

acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 4500

ataacttcgt ataatgtatg ctatacgaag ttattaggtc tgaagaggag tttacgtcca 4560

gccaagctag cttggctgca ggtcgagcgg ccgcgatccg gaacccttaa tataacttcg 4620

tataatgtat gctatacgaa gttatcagct gcattaatga atcggccaac gcgcggggag 4680

aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 4740

cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 4800

atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 4860

taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 4920

aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 4980

tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 5040

gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 5100

cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 5160

cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 5220

atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 5280

tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat 5340

ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 5400

acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 5460

aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 5520

aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 5580

tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 5640

cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 5700

catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 5760

ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 5820

aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 5880

ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 5940

caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 6000

attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 6060

agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 6120

actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 6180

ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 6240

ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 6300

gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 6360

atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 6420

cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 6480

gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 6540

gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 6600

ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca ttattatcat 6660

gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc gtttcggtga 6720

tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt gtctgtaagc 6780

ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg ggtgtcgggg 6840

ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata acgcatttaa 6900

gcataaacac gcactatgcc gttcttctca tgtatatata tatacaggca acacgcagat 6960

ataggtgcga cgtgaacagt gagctgtatg tgcgcagctc gcgttgcatt ttcggaagcg 7020

ctcgttttcg gaaacgcttt gaagttccta ttccgaagtt cctattctct agctagaaag 7080

tataggaact tcagagcgct tttgaaaacc aaaagcgctc tgaagacgca ctttcaaaaa 7140

accaaaaacg caccggactg taacgagcta ctaaaatatt gcgaataccg cttccacaaa 7200

cattgctcaa aagtatctct ttgctatata tctctgtgct atatccctat ataacctacc 7260

catccacctt tcgctccttg aacttgcatc taaactcgac ctctacattt tttatgttta 7320

tctctagtat tactctttag acaaaaaaat tgtagtaaga actattcata gagtgaatcg 7380

aaaacaatac gaaaatgtaa acatttccta tacgtagtat atagagacaa aatagaagaa 7440

accgttcata attttctgac caatgaagaa tcatcaacgc tatcactttc tgttcacaaa 7500

gtatgcgcaa tccacatcgg tatagaatat aatcggggat gcctttatct tgaaaaaatg 7560

cacccgcagc ttcgctagta atcagtaaac gcgggaagtg gagtcaggct ttttttatgg 7620

aagagaaaat agacaccaaa gtagccttct tctaacctta acggacctac agtgcaaaaa 7680

gttatcaaga gactgcatta tagagcgcac aaaggagaaa aaaagtaatc taagatgctt 7740

tgttagaaaa atagcgctct cgggatgcat ttttgtagaa caaaaaagaa gtatagattc 7800

tttgttggta aaatagcgct ctcgcgttgc atttctgttc tgtaaaaatg cagctcagat 7860

tctttgtttg aaaaattagc gctctcgcgt tgcatttttg ttttacaaaa atgaagcaca 7920

gattcttcgt tggtaaaata gcgctttcgc gttgcatttc tgttctgtaa aaatgcagct 7980

cagattcttt gtttgaaaaa ttagcgctct cgcgttgcat ttttgttcta caaaatgaag 8040

cacagatgct tcgttgct 8058

Claims

1. A primer set for random splicing of oligonucleotide chains, characterized in that the primer set is used to randomly splice any k of n oligonucleotide chains into a long-chain oligonucleotide; the primer set comprises n×k primers, where n and k are integers greater than 1 and k is less than n;

the n×k primers are divided into n subgroups of k primers each, and the k primers of each subgroup are as follows:

the primer for the oligonucleotide chain at the 1st position from the 5' end of the spliced long-chain oligonucleotide comprises, in order in the 5'-3' direction, the reverse complement of the 1st linker and the reverse complement of the oligonucleotide chain;

the primer for the oligonucleotide chain at the 2nd position from the 5' end of the spliced long-chain oligonucleotide comprises, in order in the 5'-3' direction, the reverse complement of the 2nd linker (or the reverse complement of the 2nd linker excluding its 3'-terminal A), the reverse complement of the oligonucleotide chain, and the reverse complement of the 1st linker;

the primer for the oligonucleotide chain at the i-th position from the 5' end of the spliced long-chain oligonucleotide comprises, in order in the 5'-3' direction, the reverse complement of the i-th linker (or the reverse complement of the i-th linker excluding its 3'-terminal A), the reverse complement of the oligonucleotide chain, and the reverse complement of the (i-1)-th linker, where i is an integer greater than 2 and less than or equal to k-1;

the primer for the oligonucleotide chain at the k-th position from the 5' end of the spliced long-chain oligonucleotide comprises, in order in the 5'-3' direction, the reverse complement of the oligonucleotide chain and the reverse complement of the (k-1)-th linker;

each linker is at least 6 nt long, the 1st to (k-1)-th linkers have the same length, and the linker sequences differ from one another;

wherein k=4, and the sequences of the 1st to (k-1)-th linkers are, in order, GGTGCA, GCTGCA and GGAGCA.
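The primer layout defined in claim 1 can be illustrated with a minimal Python sketch. It assumes k=4 and the three linkers recited above. The function names `revcomp` and `subgroup_primers` and the `drop_terminal_a` flag are hypothetical, and the 12-nt oligonucleotide GTGGCGATTCAG is back-derived from SEQ ID NO. 25 purely for illustration. The additional vector-complementary and PCR-overlap sequences of claim 2 are not modelled.

```python
# Illustrative sketch only; names are hypothetical and not part of the patent.
LINKERS = ["GGTGCA", "GCTGCA", "GGAGCA"]  # linkers 1..k-1 for k = 4 (claim 1)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def subgroup_primers(oligo, linkers=LINKERS, drop_terminal_a=True):
    """Build the k primers of one subgroup, written 5'->3'.

    Position 1:      rc(linker 1) + rc(oligo)
    Position 2..k-1: rc(linker i) + rc(oligo) + rc(linker i-1)
    Position k:      rc(oligo) + rc(linker k-1)
    With drop_terminal_a=True, the 3'-terminal A of linker i is removed
    before taking its reverse complement (positions 2..k-1 only).
    """
    k = len(linkers) + 1
    primers = []
    for pos in range(1, k + 1):
        parts = []
        if pos < k:
            linker = linkers[pos - 1]
            if pos > 1 and drop_terminal_a and linker.endswith("A"):
                linker = linker[:-1]
            parts.append(revcomp(linker))
        parts.append(revcomp(oligo))
        if pos > 1:
            parts.append(revcomp(linkers[pos - 2]))
        primers.append("".join(parts))
    return primers


if __name__ == "__main__":
    for position, primer in enumerate(subgroup_primers("GTGGCGATTCAG"), start=1):
        print(position, primer.lower())
```

With this input, the 3rd primer produced by the sketch reads gctccctgaatcgccactgcagc, which is the same string as SEQ ID NO. 25 in the sequence listing.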

2. The primer set for random splicing of oligonucleotide chains according to claim 1, wherein the 3' end of the primer for the oligonucleotide chain at the 1st position from the 5' end of the spliced long-chain oligonucleotide further contains a sequence complementary to a vector sequence and/or a restriction site sequence used for cloning the long-chain oligonucleotide;

the 5' end of the primer for the oligonucleotide chain at the k-th position from the 5' end of the spliced long-chain oligonucleotide further contains a sequence that overlaps the 3' end of the primer used to PCR-amplify the spliced single-stranded long-chain oligonucleotide into a blunt-ended double strand.

3. The primer set for random splicing of oligonucleotide chains according to claim 1, wherein each oligonucleotide chain to be spliced is 10 to 20 nt long.

4. The primer set for random splicing of oligonucleotide chains according to any one of claims 1 to 3, wherein the primer set further comprises a Block primer;

the Block primer is a mixture of the reverse complementary strands of the n oligonucleotide chains.

5. The primer set for random splicing of oligonucleotide chains according to any one of claims 1 to 3, further comprising:

an F1 primer, used for coupling to oligo dT and for ligating the spliced long-chain oligonucleotide to a vector for cloning;

an F2 primer and an R primer, used to PCR-amplify the spliced single-stranded long-chain oligonucleotide into a blunt-ended double strand and to ligate it to a vector for cloning.

6. A kit, characterized in that it comprises the primer set for random splicing of oligonucleotide chains according to any one of claims 1 to 5.

7. Use of the primer set for random splicing of oligonucleotide strands according to any one of claims 1 to 5 or the kit according to claim 6 in random oligonucleotide strand library construction or random peptide library construction.

8. A method for random splicing of oligonucleotide chains, characterized in that magnetic beads are used as a carrier and any k of n oligonucleotide chains are randomly spliced into a long-chain oligonucleotide using the primer set for random splicing of oligonucleotide chains according to any one of claims 1 to 5.

9. The method for random splicing of oligonucleotide strands according to claim 8, comprising the steps of:

(1) PCR: using magnetic beads as a carrier, carrying out PCR with the F1 primer and a first primer mixture under the action of a high-fidelity DNA polymerase, and carrying out solid-liquid separation after the PCR is complete to obtain a first reaction product;

the first primer mixture is a mixture of the primers, one from each of the n subgroups, for the oligonucleotide chain at the 1st position from the 5' end of the spliced long-chain oligonucleotide;

(2) Elution: mixing the first reaction product with the Block primer, allowing the oligonucleotides to pair complementarily, and eluting to obtain a first elution product; mixing the first elution product with the Block primer, allowing the oligonucleotides to pair complementarily, and eluting to obtain a second elution product;

(3) Extension: starting from the second elution product of step (2), carrying out an extension reaction with the Block primer and a second primer mixture under the action of Klenow enzyme, with dNTPs as raw materials, to obtain a second reaction product;

the second primer mixture is a mixture of the primers, one from each of the n subgroups, for the oligonucleotide chain at the 2nd position from the 5' end of the spliced long-chain oligonucleotide;

(4) repeating steps (2)-(3) to splice the remaining oligonucleotide chains of the k oligonucleotide chains one by one, wherein the extension step for the i-th oligonucleotide chain uses the Block primer and an i-th primer mixture;

the i-th primer mixture is a mixture of the primers, one from each of the n subgroups, for the oligonucleotide chain at the i-th position from the 5' end of the spliced long-chain oligonucleotide, where i is an integer greater than 2 and less than or equal to k-1;

finally, repeating steps (2)-(3) to splice the k-th oligonucleotide chain, wherein the extension step for the k-th oligonucleotide chain uses the Block primer and a k-th primer mixture;

the k-th primer mixture is a mixture of the primers, one from each of the n subgroups, for the oligonucleotide chain at the k-th position from the 5' end of the spliced long-chain oligonucleotide;

(5) Elution: after the splicing in step (4) is complete, mixing the spliced product with the Block primer and eluting to obtain an elution product;

(6) using the elution product of step (5) as a template, carrying out PCR with the F2 primer and the R primer, and recovering the PCR product to obtain the randomly spliced oligonucleotide library.
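The stepwise splicing of claim 9 can be pictured with the following Python sketch. It is a simplified model, not the claimed procedure: it assumes that each extension round copies the annealed primer onto the 3' end of the bead-bound strand, that each position is drawn independently from the primer mixture, and that the full linker sequences are used (the optional removal of the 3'-terminal A is ignored). The F1, F2, R and Block primers and the elution steps are omitted. The names `extend` and `splice` and the three placeholder oligonucleotides are hypothetical.

```python
# Simplified, hypothetical model of rounds (1)-(4); not the claimed protocol.
import random

LINKERS = ["GGTGCA", "GCTGCA", "GGAGCA"]  # linkers recited in claim 1 (k = 4)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extend(strand, primer, overlap):
    """Extend `strand` at its 3' end using `primer` as the template.

    The last `overlap` bases of the primer must pair with the last
    `overlap` bases of the strand, otherwise no extension is modelled.
    """
    template = revcomp(primer)  # sequence the strand acquires when copied
    if strand[-overlap:] != template[:overlap]:
        raise ValueError("primer does not anneal to the 3' end of the strand")
    return strand + template[overlap:]


def splice(oligos, k, rng):
    """Pick k oligonucleotides at random and splice them round by round."""
    chosen = [rng.choice(oligos) for _ in range(k)]
    # Round 1 (the PCR step) is modelled as leaving oligo 1 + linker 1.
    strand = chosen[0] + LINKERS[0]
    for pos in range(2, k + 1):
        parts = []
        if pos < k:
            parts.append(revcomp(LINKERS[pos - 1]))
        parts.append(revcomp(chosen[pos - 1]))
        parts.append(revcomp(LINKERS[pos - 2]))
        primer = "".join(parts)
        strand = extend(strand, primer, overlap=len(LINKERS[pos - 2]))
    return chosen, strand


if __name__ == "__main__":
    # Placeholder 12-nt oligonucleotides, used only for illustration.
    pool = ["GTGGCGATTCAG", "TGGGCTAGTGAT", "CGGGTGCCGCTT"]
    picked, product = splice(pool, k=4, rng=random.Random(0))
    print(picked)
    print(product, len(product))
```

Under these assumptions the spliced strand reads oligonucleotide 1, linker 1, oligonucleotide 2, linker 2, and so on up to oligonucleotide k, so four 12-nt oligonucleotides joined by three 6-nt linkers give a 66-nt product.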

10. The method for random splicing of oligonucleotide chains according to claim 9, wherein in step (3), the final concentration of the Block primer in the extension reaction system is 1 to 3 μM and the final concentration of the primer mixture is 0.5 to 2 μM,

and/or, the extension reaction conditions are: reaction at 37 ℃ for 15-25 min.

11. The method according to claim 9 or 10, wherein in step (1), the final concentration of the F1 primer in the PCR reaction system is 0.3 to 0.5 μM and the final concentration of the first primer mixture is 0.3 to 0.5 μM;

the PCR program comprises: 94-98 ℃ for 5-30 s, 55 ℃ for 10-30 s and 72 ℃ for 10-20 s, for 18-25 cycles;

and/or,

in step (2), the final concentration of the Block primer is 18-22 μM;

the elution comprises incubating at 90-95 ℃ for 2 min and then at 0-4 ℃ for 1-3 min.

12. A method for constructing a random long-chain peptide library, characterized by comprising the following steps: randomly splicing short-chain oligonucleotides encoding a short-chain peptide library by the method for random splicing of oligonucleotide chains according to any one of claims 8 to 11 to obtain a random long-chain oligonucleotide library; ligating the random long-chain oligonucleotide library to a vector; and transferring it into host cells for expression to obtain the random long-chain peptide library.