WO2014169810A1 - Isolated oligonucleotide and use thereof

Isolated oligonucleotide and use thereof

Info

Publication number
WO2014169810A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
rvd
oligonucleotide
identification module
single base
Prior art date
Application number
PCT/CN2014/075403
Other languages
French (fr)
Chinese (zh)
Inventor
吴璐
许奇武
王磊
原辉
Original Assignee
深圳华大基因科技服务有限公司
Priority date
Filing date
Publication date
Application filed by 深圳华大基因科技服务有限公司
Priority to CN201480011125.2A (CN105008536A)
Publication of WO2014169810A1
Priority to HK16102277.2A (HK1214302A1)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6816: Hybridisation assays characterised by the detection means

Definitions

  • the invention relates to the field of biotechnology.
  • the invention relates to isolated oligonucleotides and uses thereof. More specifically, the present invention relates to an isolated oligonucleotide, an oligonucleotide library, a method of constructing a nucleic acid expressing a TALE repeat, and a method of altering a cellular genome.
  • TALEN: transcription activator-like effector nuclease.
  • TALEN-directed gene modification consists of two parts: a TALEN that cleaves the target sequence to form a DNA double-strand break (DSB), and a donor vector that provides the sequence to be edited, completing the directed editing of the genome.
  • DSB: DNA double-strand break.
  • a TALEN is composed of 12 or more tandem protein modules and the FokI endonuclease. Each protein module contains 34 amino acids. The amino acid residues at positions 12 and 13 are the key sites for base recognition and are called the repeat variable di-residue (RVD).
  • the target gene sequence can be recognized by changing the order of the RVD modules.
  • the TALEN binds to the target sequence and cleaves the DNA double strand to form a DSB.
  • TALEN technology is characterized by high knockout efficiency and strong specificity. A TALEN can target any part of the genome to knock out genes or change the target gene sequence, so TALEN technology is the preferred method for studying gene function.
  • the present invention is directed to solving at least some of the above technical problems or at least providing a useful commercial option. To this end, it is an object of the present invention to provide an oligonucleotide which can be effectively used to construct a TALE repeat and its use.
  • the invention proposes an isolated oligonucleotide.
  • the isolated oligonucleotide comprises: a first nucleic acid molecule encoding a double-base recognition module, the double-base recognition module consisting of a first single-base recognition module and a second single-base recognition module in tandem, wherein the first single-base recognition module and the second single-base recognition module each comprise a repeat variable di-residue (RVD); a first amplification sequence located on the 5' side of the first nucleic acid molecule and comprising a Type IIs endonuclease recognition sequence; and a second amplification sequence located on the 3' side of the first nucleic acid molecule and comprising a Type IIs endonuclease recognition sequence.
  • because the oligonucleotide molecule comprises a nucleic acid sequence encoding two single-base recognition modules, such oligonucleotide molecules encoding different RVDs can be used as starting materials: cutting with a restriction endonuclease forms cohesive ends on both sides of the oligonucleotide molecule, and ligation can then be performed directly, so that various RVD combinations can be obtained quickly and any target nucleic acid sequence can be recognized.
  • the number of digestion-ligation rounds can be greatly reduced, and the inventors have surprisingly found that the mismatch rate of module-unit ligation is greatly reduced. In general, the longer the RVD combination, the more digestion-ligation rounds are needed and the harder it is to obtain the correct ligation.
  • the oligonucleotide may also have the following additional technical features:
  • the first single base recognition module and the second single base recognition module each have the following amino acid sequence:
  • the RVD corresponding to base A is NI; the RVD corresponding to base C is HD; the RVD corresponding to base T is NG; and the RVD corresponding to base G is NN.
  • the first single base recognition module and the second single base recognition module each have an amino acid sequence selected from one of the following:
  • Identification base A LTPDQVVAIASNIGGKQALETVQRLLPVLCQDHG;
  • Identification base C LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG;
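As a small illustration of the statement that residues 12 and 13 of each 34-amino-acid module form the RVD, the sketch below pulls the RVD out of the two module sequences listed above. The function name `extract_rvd` is hypothetical and only serves this example.

```python
# Module sequences copied from the text above (modules recognizing A and C).
MODULE_A = "LTPDQVVAIASNIGGKQALETVQRLLPVLCQDHG"
MODULE_C = "LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG"


def extract_rvd(module: str) -> str:
    """Return residues 12 and 13 (1-based) of a 34-residue TALE repeat module."""
    if len(module) != 34:
        raise ValueError(f"expected 34 residues, got {len(module)}")
    # 1-based positions 12 and 13 correspond to the 0-based slice [11:13].
    return module[11:13]


print(extract_rvd(MODULE_A))  # NI
print(extract_rvd(MODULE_C))  # HD
```

The two modules differ only at these two positions, which is consistent with the description of the RVD as the base-determining site.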
  • the inventors have surprisingly found that the single-base recognition modules described above can effectively perform specific base recognition in a variety of cells, including animal cells and plant cells.
  • compared with a wild-type TALEN, the recognition efficiency of the resulting TALEN for the target nucleotide can be further improved by shortening the C-terminal and N-terminal regions flanking the RVD and optimizing some of the amino acids.
  • the RVDs of the first single-base recognition module and the second single-base recognition module meet one of the following conditions (first module/second module): NI/NI; NI/NG; NI/HD; NI/NN; NG/NI; NG/NG; NG/HD; NG/NN; …
  • the oligonucleotide can encode a module that recognizes any one of AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CG, CC, GA, GT, GC, GG.
  • by constructing an oligonucleotide molecule for each combination of the first single-base recognition module and the second single-base recognition module, a library of oligonucleotide molecules that can be used to construct any RVD combination is obtained efficiently, so that only the corresponding oligonucleotide molecules need to be ligated to obtain the desired RVD combination that recognizes the predetermined target nucleotide sequence.
  • the Type IIs endonuclease is at least one selected from the group consisting of BsaI, BbsI and BsmBI.
  • the Type IIs endonuclease is BsaI.
  • the Type IIs endonuclease recognition sequence is GGTCTCNNNN, wherein N is A, T, G or C.
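To make concrete how a Type IIs site produces a cohesive end outside the recognition sequence, here is a minimal sketch that scans the top strand for GGTCTC and reports the 4-nt overhang each site would leave. It assumes the commonly documented BsaI geometry (cut 1 nt past the site on the top strand and 5 nt past it on the bottom strand), which the text above does not spell out. The function name and the test sequence are hypothetical.

```python
import re

BSAI_SITE = "GGTCTC"  # recognition sequence named in the text


def bsai_overhangs(seq: str) -> list[tuple[int, str]]:
    """Return (site_start, 4-nt overhang) for each top-strand BsaI site.

    Assumption: BsaI cuts 1 nt downstream of the site on the top strand and
    5 nt downstream on the bottom strand, leaving a 4-nt 5' overhang.
    Only sites on the given strand are scanned.
    """
    seq = seq.upper()
    results = []
    # A lookahead is used so that overlapping sites would also be found.
    for match in re.finditer(f"(?={BSAI_SITE})", seq):
        overhang_start = match.start() + len(BSAI_SITE) + 1
        overhang = seq[overhang_start:overhang_start + 4]
        if len(overhang) == 4:
            results.append((match.start(), overhang))
    return results


# Made-up sequence, for illustration only.
print(bsai_overhangs("AAGGTCTCATGCCTT"))  # [(2, 'TGCC')]
```

Because the overhang lies in the variable N positions, its sequence can be chosen freely per fragment, which is what allows ordered ligation of many modules.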
  • the first amplification sequence further comprises an XbaI cleavage site.
  • the first amplification sequence may have the general formula (M)10-20TCTAGA(M)2-8GGTCTC(H)18-25, wherein M is A, T, C or G.
  • the second amplification sequence further comprises an XhoI cleavage site, and the second amplification sequence has the general formula (M')10-20CTCGAG(M')2-8GGTCTC(H')18-25, wherein M' is A, T, C or G; H' is A, T, C or G and matches the coding sequence of the second single-base recognition module.
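The general formula for the second amplification sequence can be read as a pattern: 10 to 20 arbitrary bases, the XhoI site CTCGAG, 2 to 8 arbitrary bases, the BsaI site GGTCTC, then 18 to 25 bases matching the module coding sequence. The sketch below checks only that layout, under the assumption that the garbled subscripts in the source read 10-20, 2-8 and 18-25. It does not check that the final stretch actually matches a module coding sequence. The names and test strings are hypothetical.

```python
import re

# Layout assumed from the general formula:
# (M')10-20 CTCGAG (M')2-8 GGTCTC (H')18-25
SECOND_AMPLIFICATION_PATTERN = re.compile(
    r"[ATCG]{10,20}CTCGAG[ATCG]{2,8}GGTCTC[ATCG]{18,25}"
)


def matches_second_formula(seq: str) -> bool:
    """True if the whole sequence fits the assumed general-formula layout."""
    return SECOND_AMPLIFICATION_PATTERN.fullmatch(seq.upper()) is not None


# Made-up sequences, not SEQ ID NOs from the patent.
ok = "A" * 12 + "CTCGAG" + "AT" + "GGTCTC" + "A" * 20
too_short = "A" * 5 + "CTCGAG" + "AT" + "GGTCTC" + "A" * 20
print(matches_second_formula(ok))         # True
print(matches_second_formula(too_short))  # False
```

Replacing CTCGAG with TCTAGA would give the corresponding check for the first amplification sequence.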
  • in the context of the present invention, the sequences of M and M' are variable.
  • when the first nucleic acid molecule or the single-base recognition module coding sequence is ligated into a vector and used as an amplification template, the sequences of M and M' can be set to match the 5' end sequence and the 3' end sequence of the vector, respectively.
  • the corresponding oligonucleotide can be conveniently stored or amplified, thereby improving the efficiency of subsequent TALEN construction.
  • the vector or plasmid can contain a selection marker, such as a drug resistance gene, so that amplification or screening of the vector or plasmid can be conveniently performed.
  • the first amplified sequence and the second amplified sequence satisfy the following condition: the first amplified sequence has a nucleotide sequence as shown in SEQ ID NO: 1, The second amplified sequence has a nucleotide sequence as shown in SEQ ID NO: 10; the first amplified sequence has a nucleotide sequence as shown in SEQ ID NO: 2, and the second amplified sequence has a nucleotide sequence of SEQ ID NO: 11; the first amplified sequence has a nucleotide sequence as set forth in SEQ ID NO: 3, and the second amplified sequence has SEQ ID NO: 12 a nucleotide sequence shown; the first amplification sequence has a nucleotide sequence as shown in SEQ ID NO: 4, and the second amplification sequence has a nucleotide sequence as shown in SEQ ID NO: The first amplified sequence has a nucleotide sequence as shown in SEQ ID NO: 5, and the second amplified sequence has a
  • the first amplified sequence has a nucleotide sequence as shown in SEQ ID NO: 7, and the second amplified sequence has a nucleotide sequence as shown in SEQ ID NO: 16;
  • the sequence has the nucleotide sequence set forth in SEQ ID NO: 8
  • the second amplified sequence has the nucleotide sequence set forth in SEQ ID NO: 17
  • the first amplified sequence has the SEQ ID NO:
  • the inventors have surprisingly found that using the above combinations of the first amplification sequence and the second amplification sequence effectively improves the efficiency of amplifying the oligonucleotide molecules. The inventors have further found that, when these first and second amplification sequences are used to construct an oligonucleotide molecule for each double-base recognition module as a starting material for ligation, a larger RVD combination can be assembled simply by selecting the first and second amplification sequences of each oligonucleotide molecule, and the final RVD combination can be completed in a single digestion-ligation reaction.
  • the construction does not require replacement and amplification of an intermediate vector, and only one transformation is needed after the single ligation. In theory, the correct plasmid can be obtained in only 6 hours, and downstream experiments can proceed after one round of transformation, amplification and identification. The whole experimental process is very simple, which makes the experiment controllable.
  • the invention provides a library of oligonucleotides, comprising a plurality of isolated oligonucleotides, wherein the isolated oligonucleotides are the oligonucleotides described above.
  • the oligonucleotide library can be effectively used to construct different RVD combinations.
  • the different oligonucleotides are each arranged in a different container. Thereby, the construction of the RVD combination can be conveniently performed.
  • a large number of oligonucleotide types are preset in the oligonucleotide library.
  • for each combination of the first single-base recognition module and the second single-base recognition module, the respective first amplification sequence and second amplification sequence are used to construct the respective oligonucleotide molecules.
  • when a larger RVD combination is assembled, only the first amplification sequence and the second amplification sequence of each oligonucleotide molecule need to be selected, and the final RVD combination can be constructed in a single digestion-ligation reaction.
  • the invention proposes a method of constructing a nucleic acid expressing a TALE repeat.
  • the method comprises: providing a first oligonucleotide and a second oligonucleotide, the first oligonucleotide and the second oligonucleotide being the oligonucleotides described above; cleaving the first oligonucleotide and the second oligonucleotide with a Type IIs endonuclease to obtain a first oligonucleotide cleavage product and a second oligonucleotide cleavage product, wherein the first oligonucleotide cleavage product forms cohesive ends at the first amplification sequence and the second amplification sequence,
  • the second oligonucleotide cleavage product forms cohesive ends at the first amplification sequence and the second amplification sequence, and the cohesive end formed by the second amp
  • first and second are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, features defining “first” and “second” may include one or more of the features, either explicitly or implicitly. In the description of the present invention, the meaning of "plurality" is two or more, unless specifically defined otherwise.
  • an RVD combination of any length can be completed in a single digestion-ligation reaction in one reaction system. Replacement and amplification of an intermediate vector is not necessary, and only one transformation is needed after the single ligation. In theory, the correct plasmid can be obtained in only 6 hours, and downstream experiments can proceed after one round of transformation, amplification and identification. The whole experimental process is very simple, which makes the experiment controllable. With current conventional methods, the construction process alone takes 2 to 5 days.
  • PCR amplification is not needed during the experiment to obtain the individual single and double module units. Instead, each single or double module unit is constructed directly on a plasmid, and a large fragment library is obtained by amplifying the plasmids, which reduces mutations introduced by PCR and improves experimental controllability.
  • the final vector is obtained directly by restriction enzyme ligation of the plasmid and the plasmid, and different vectors can be used to screen for the correct vector.
  • the above method may further have the following additional technical features:
  • the RVD sequence of the first oligonucleotide and the second oligonucleotide molecule is determined based on a predetermined target nucleic acid sequence.
  • the RVD sequence of the first oligonucleotide and the second oligonucleotide molecule is determined based on the following relationship:
  • the RVD corresponding to base A is NI;
  • the RVD corresponding to base C is HD;
  • the RVD corresponding to base T is NG;
  • the RVD corresponding to base G is NN. Thereby, the efficiency of the designed RVD combination to recognize a predetermined target nucleotide sequence can be effectively improved.
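The base-to-RVD relationship listed above (A to NI, C to HD, T to NG, G to NN) can be applied mechanically to a target sequence. The following is a minimal sketch of that lookup; the function name `rvds_for_target` is hypothetical.

```python
# Base-to-RVD code as stated in the text.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def rvds_for_target(target: str) -> list[str]:
    """Translate a DNA target (5' to 3') into the ordered list of RVDs."""
    try:
        return [RVD_FOR_BASE[base] for base in target.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported base: {err.args[0]!r}") from None


print(rvds_for_target("TGAC"))  # ['NG', 'NN', 'NI', 'HD']
```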
  • at least one of the first oligonucleotide and the second oligonucleotide comprises a plurality of different oligonucleotides, wherein at least one cohesive end of each oligonucleotide matches one cohesive end of another oligonucleotide.
  • the cohesive end Fn formed by the forward primer of each oligonucleotide can be ligated only to the cohesive end Rn+1 formed by the reverse primer of another oligonucleotide, and not to the cohesive ends of other primers, whereby a plurality of oligonucleotides can be ligated at one time, thereby improving the construction efficiency.
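The pairing rule described here (each cohesive end joins exactly one partner, so fragments assemble in a single fixed order) can be checked before an experiment. The sketch below is one hypothetical way to express that check: each fragment is represented by the top-strand sequence of its left and right junction, and the assembly is accepted only if neighbours share a junction and no junction is reused. The overhang sequences are invented for illustration and are not taken from the patent.

```python
from collections import Counter

# (name, left junction, right junction); junctions written as top-strand 4-mers.
Fragment = tuple[str, str, str]


def ordered_assembly_ok(fragments: list[Fragment]) -> bool:
    """Check that the fragments can only join in the listed order."""
    lefts = Counter(left for _, left, _ in fragments)
    rights = Counter(right for _, _, right in fragments)
    # A junction used twice on the same side would allow an unintended partner.
    if any(n > 1 for n in lefts.values()) or any(n > 1 for n in rights.values()):
        return False
    # Each fragment's right junction must equal the next fragment's left junction.
    return all(
        right == next_left
        for (_, _, right), (_, next_left, _) in zip(fragments, fragments[1:])
    )


design = [
    ("1TG", "AAAA", "CTGA"),
    ("2AC", "CTGA", "GGAT"),
    ("3AC", "GGAT", "TTCC"),
]
print(ordered_assembly_ok(design))                             # True
print(ordered_assembly_ok([design[1], design[0], design[2]]))  # False
```

A fuller check would also reject palindromic overhangs, which can self-ligate; that is left out here to keep the sketch short.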
  • the invention provides a method for altering a cellular genome, comprising: determining an RVD sequence of a TALE for a predetermined target nucleic acid sequence in a genome of a cell; constructing according to the method described above An oligonucleotide expressing a TALE repeat sequence capable of specifically recognizing the target nucleic acid sequence; and introducing an oligonucleotide expressing a TALE repeat sequence and a nucleic acid encoding a TALE DNA modifying enzyme into the cell.
  • a specific RVD combination can be efficiently constructed for a predetermined target nucleic acid sequence, thereby effectively improving the efficiency of changing the genome of the cell.
  • the target nucleic acid sequence has a length of 4 to 20 nt, preferably 12 to 20 nt.
  • the TALE DNA modifying enzyme is FokI.
  • the oligonucleotide expressing the TALE repeat sequence and the nucleic acid encoding the TALE DNA modifying enzyme are constructed on the same vector. Therefore, it is possible to effectively improve the efficiency of changing the genome of the cell.
  • the invention provides an isolated polypeptide, characterized in that the polypeptide is encoded by the oligonucleotide described above. Thus, the obtained polypeptide is capable of specifically recognizing the base sequence.
  • the invention proposes a vector.
  • the vector comprises a nucleic acid molecule encoding the oligonucleotide described above.
  • the vector is a plasmid.
  • the corresponding oligonucleotide can be conveniently stored or amplified, thereby improving the efficiency of subsequent TALEN construction.
  • a vector or a plasmid may contain a selection marker, such as a drug resistance gene, so that amplification or screening of the vector or plasmid can be conveniently performed.
  • the selection marker of a vector containing an oligonucleotide, such as a drug resistance gene, may be different from the selection marker of the finally constructed expression vector, which makes it easier to screen, after a one-step ligation reaction, for the required combination of recognition modules.
  • the invention proposes a plasmid library.
  • the plasmid library comprises: a plurality of plasmids, wherein the plasmid is the vector described above.
  • the different plasmids are each placed in separate containers.
  • the invention proposes a cell.
  • the cells are obtained by altering the genome by the method of changing the genome as described above.
  • the resulting cells carry mutations, such as silencing of gene expression, in the genome as compared to the original cells.
  • the invention proposes a method of modifying a gene.
  • the method comprises: selecting at least two corresponding plasmids from the plasmid library described above based on the target sequence of the gene; cutting the at least two plasmids with a Type IIs endonuclease to obtain a plurality of plasmid cleavage products, wherein each plasmid cleavage product has cohesive ends, and one cohesive end of each plasmid matches at most one cohesive end of another plasmid; ligating the plurality of plasmid cleavage products to obtain an oligonucleotide expressing a TALE repeat, the TALE repeat recognizing the target sequence of the gene; and, based on recognition of the target sequence of the gene by the TALE repeat, by means of a TALE DNA modifying enzyme,
  • the cohesive end Fn formed by the forward primer of each plasmid can be ligated only to the cohesive end Rn+1 formed by the reverse primer of another plasmid, and not to the cohesive ends of other primers, whereby multiple plasmids are ligated at one time, thereby increasing construction efficiency.
  • the method may also have the following additional technical features:
  • the corresponding plasmid is determined based on the following relationship:
  • the RVD corresponding to base A is NI;
  • the RVD corresponding to base C is HD;
  • the RVD corresponding to base T is NG;
  • the RVD corresponding to base G is NN.
  • the gene is PPARγ2, and the target sequence is at least one of TGACACAGAGATGCCATT and GAATCAGCTCTGTGG.
  • the inventors can choose AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC,
  • the NI-NI, NI-NG, NI-HD, NI-NN, NG-NI, NG-NG, NG-HD, NG-NN, HD-NI, HD-NG, HD-HD, HD-NN, NN-NI, NN-NG, NN-HD and NN-NN sequences were each ligated into the vector p-FUS-B2 and, once verified as correct, used as amplification templates.
  • the base sequence of a single module is as follows:
  • F n can be attached to the sticky end of R n+1 and not to the sticky end of other primers.
  • XhoI and XbaI restriction sites were introduced at the two ends of the amplification product, respectively: C'TCGAG and T'CTAGA.
  • the PCR reaction system includes:
  • for single-module products, the PCR product was recovered and purified directly; for double-module products, because of interference from single-module products, the PCR product was separated on a gel and the band of about 280 bp was excised, recovered and purified.
  • because the PNN4 plasmid carries XhoI and XbaI cleavage sites, the inventors selected the tetracycline-resistant PNN4 plasmid as the backbone for the plasmid library of single and double modules. Because the resistance of PNN4 differs from the kanamycin resistance of the final vector, the one-step ligation can be achieved more efficiently.
  • PNN4 was purchased from Addgene.
  • the PNN4 module vector plasmid and the previously obtained recognition module amplification products were digested with XhoI and XbaI, the target fragments were recovered, and the digested products were ligated to obtain the plasmid library of recognition modules, 180 plasmids in total.
  • the plasmids are named aN, where a is the primer code from 1 to 9 and N is the base or bases recognized by the module, which can be a single base (A, T, C, G) or one of the 16 double bases (AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC, CG, GA, GT, GC, GG).
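The figure of 180 plasmids follows from the naming scheme: 9 primer positions times (4 single bases + 16 double bases) gives 9 × 20 = 180. The short sketch below enumerates the names to confirm the count; the function name is hypothetical, and the position-10 half modules described elsewhere in the text are not included.

```python
from itertools import product

BASES = "ATCG"


def library_names() -> list[str]:
    """Enumerate plasmid names aN for positions 1-9 and all 1- and 2-base modules."""
    singles = list(BASES)
    doubles = ["".join(pair) for pair in product(BASES, repeat=2)]
    return [
        f"{position}{recognized}"
        for position in range(1, 10)
        for recognized in singles + doubles
    ]


names = library_names()
print(len(names))      # 180
print("1TG" in names)  # True
```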
  • the conditions for the digestion reaction are as follows:
  • the reaction conditions are as follows:
  • the final half-repeat module contains half of an RVD repeat sequence together with part of the C-terminus of the TALEN protein.
  • the two parts are combined into one module unit, which is ligated simultaneously with the other modules. This simplifies the ligation process and reduces the final vectors to a single final vector, so that there is no need to select a different final vector for A, T, C or G according to the last base of the recognition sequence.
  • the present invention uses a synthetic C-terminal sequence as the template, and uses a forward fragment primer and half-R for PCR amplification.
  • the PCR amplification method is as follows:
  • the reaction conditions are as follows:
  • the reaction conditions are as follows:
  • a fragment of about 200 bp is recovered from the PCR product by gel extraction and purification, and then ligated into the P-easy vector.
  • the reaction conditions are as follows:
  • a module unit consisting of the half module for the last position of the recognition sequence and part of the C-terminus is thus obtained, and this module unit is ligated into PNN4 in the same way as the other single and double modules.
  • the method is the same as for the other single and double modules, and plasmids verified as correct by sequencing are obtained. Since the half modules occupy the 10th position of the ligation, they are named 10A, 10T, 10C and 10G according to the base they recognize.
  • recognition of the TALEN target site runs from 5' to 3', and the base immediately preceding the recognized sequence is T.
  • N A, G, T or C
  • the middle 18 bp is the target sequence of the TALEN vector to be constructed as needed.
  • the inventors selected the gene PPARγ2 (NCBI accession number NG_011749, region 5001..151507). The target sequence of the present invention is located on exon 2 of the PPARγ2 gene, and the sequence is TGACACAGAGATGCCATTctggcccaccaacttcgGAATCAGCTCTGTGGA.
  • the left arm of the constructed TALEN recognizes the 17 base sequence TGACACAGAGATGCCATT, and the right arm recognizes the 15 base sequence GAATCAGCTCTGTGG, with a spacer of 17 bases.
  • all spacers selected in the design contain a restriction site, which facilitates subsequent verification of the targeting efficiency of the TALEN.
  • the TALEN design is based on the target site starting with T at the 5' end of the gene sequence, i.e. the sequence starts with T and ends with A. This T is very important for the efficiency of the TALEN.
  • the sequence of the RVD module is determined as follows:
  • after determining the RVDs recognized by the TALEN targeting PPARγ2, the following plasmids were selected from the plasmid library: 1TG, 2AC, 3AC, 4AG, 5AG, 6AT, 7GC, 8CA, 9T, 10T. These 10 plasmids and the final vector plasmid of the present invention were subjected to digestion-ligation, and the correct TALEN left arm was obtained.
  • for the right arm, the following plasmids were selected: 1CC, 2AC, 3AG, 4AG, 5CT, 6G, 7A, 8T, 9T, 10C. These 10 plasmids and the final vector plasmid of the present invention were subjected to digestion-ligation, and the correct TALEN right arm was obtained.
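The two plasmid lists above appear to follow a simple rule, which the sketch below reconstructs as an assumption rather than as the inventors' stated procedure: ten slots are always used, slot 10 is the half module for the last base, and among slots 1 to 9 the first (n − 10) slots take double-base modules while the rest take single-base modules, where n is the target length. The right arm binds the opposite strand, so its target is reverse-complemented first. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ATCG", "TAGC")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def select_plasmids(target: str) -> list[str]:
    """Pick library plasmids for a target of 10 to 19 bases (assumed rule)."""
    target = target.upper()
    n = len(target)
    if not 10 <= n <= 19:
        raise ValueError("this scheme covers targets of 10 to 19 bases")
    n_double = n - 10  # slots 1..n_double use double-base modules
    names, pos = [], 0
    for slot in range(1, 10):
        width = 2 if slot <= n_double else 1
        names.append(f"{slot}{target[pos:pos + width]}")
        pos += width
    names.append(f"10{target[pos]}")  # half module for the last base
    return names


left = select_plasmids("TGACACAGAGATGCCATT")
right = select_plasmids(reverse_complement("GAATCAGCTCTGTGG"))
print(left)   # ['1TG', '2AC', '3AC', '4AG', '5AG', '6AT', '7GC', '8CA', '9T', '10T']
print(right)  # ['1CC', '2AC', '3AG', '4AG', '5CT', '6G', '7A', '8T', '9T', '10C']
```

Under this reading, both lists in the text are reproduced from the two target sequences, which also supports TGACACAGAGATGCCATT as the left-arm target.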
  • each plasmid in the library was diluted to a uniform 100 ng/µL. Since the restriction site on the final vector is Esp3I, the final vector was digested first, and the vector fragment of about 4 kb was recovered by gel extraction:
  • the volume of the enzyme digestion component is as follows:
  • reaction conditions were rrc 4h.
  • the reaction system for the final talen left and right arms is obtained by enzyme digestion as follows:
  • the ligation product was transformed, with kanamycin resistance used for selection.
  • the clones were screened by Sanger sequencing and then used.
  • RVD sequence of Talen's left arm is:
  • RVD amino acid sequence is:
  • the primer sequences for Sanger sequencing are:
  • the vector matched by the detection sequence can be used for the detection of downstream target efficiency.
  • Example 3 Gene knockdown The targeting vector constructed in the previous Example 2 was transfected into 293T cells:
  • the transfection complex was evenly added dropwise to the cells prepared in advance, and the dish was gently shaken by the cross method.
  • the cells were cultured at 37 ° C in a 5% CO 2 incubator.
  • the cell transfection efficiency was estimated by observing the green fluorescence ratio of the positive control group (CHO cells were normally over 80%). Discard the medium and replace with new complete medium.
  • the medium can be changed to a selection medium containing puromycin. After about 7 days of culture, the medium is replaced with normal complete medium. At this point the negative control cells are dead, while the experimental group still has viable cells (the antibiotic concentration and the screening time should be determined in a preliminary experiment).
  • when the cells in the experimental group have grown to more than 80% confluence, genomic DNA can be extracted for subsequent identification. The medium was discarded, each well was washed once with PBS, and the cells were digested with 0.25% trypsin. After the cells had rounded up, complete medium was added to terminate the digestion, and the cell suspension was collected and centrifuged at 1000 rpm for 5 min, and the supernatant was discarded.
  • PCR was performed using primers designed in advance according to the target site, the size of the target fragment was checked by agarose gel electrophoresis, and the PCR product was sent for sequencing.
  • the PCR product verified as correct by sequencing was digested in bulk to remove the non-knockout gene, and the digestion product was recovered and subjected to TA cloning. After colony PCR and restriction digestion, the correct clones were sequenced, and the sequencing results verified the correctness of the experiment.
  • the single-cell clones that had expanded after limiting dilution were transferred to a 96-well plate with a pipette. After the cell density reached 90%, they were passaged into a 48-well plate, and a small number of cells were set aside.
  • PCR amplification was performed with a tissue direct PCR kit, and agarose gel electrophoresis showed that 93 of the 96 samples had a band at 530 bp.
  • the PCR product was digested with the PhoI restriction enzyme at 75 °C. Clones whose original 530 bp band was not cut, as shown by electrophoresis, were positive clones, and their PCR products were sent for sequencing.

Abstract

Disclosed in the present invention are an isolated oligonucleotide and use thereof. The isolated oligonucleotide comprises: a first nucleic acid molecule encoding a double-base recognition module, wherein the double-base recognition module consists of a first single-base recognition module and a second single-base recognition module in tandem, and the first single-base recognition module and the second single-base recognition module both contain a repeat variable di-residue (RVD); a first amplification sequence, wherein the first amplification sequence is located on the 5' side of the first nucleic acid molecule and comprises a Type IIs endonuclease recognition sequence; and a second amplification sequence, wherein the second amplification sequence is located on the 3' side of the first nucleic acid molecule and comprises a Type IIs endonuclease recognition sequence. With this oligonucleotide, various RVD combinations can be obtained quickly, so that any target nucleic acid sequence can be recognized.

Description

分离的寡核苷酸及其用途  Isolated oligonucleotides and uses thereof
技术领域 Technical field
本发明涉及生物技术领域。 具体的, 本发明涉及分离的寡核苷酸及其用途。 更具体的, 本发明涉及一种分离的寡核苷酸、 一种寡核苷酸文库、 一种构建表达 TALE重复序列的核 酸的方法以及一种改变细胞基因组的方法。 背景技术  The invention relates to the field of biotechnology. In particular, the invention relates to isolated oligonucleotides and uses thereof. More specifically, the present invention relates to an isolated oligonucleotide, an oligonucleotide library, a method of constructing a nucleic acid expressing a TALE repeat, and a method of altering a cellular genome. Background technique
TALEN (Transcription activator-like effectors nucleases) 定向基因修饰技术由两个部分 组成, 一部分 TALEN定向切割靶序列形成 DNA双链切口 (DSB), 另一部分 Doner载体 提供需要编辑的序列, 完成在基因组定向编辑工作。 TALEN是由 12个以上串联蛋白模块 和 Fokl内切酶组成, 每个蛋白模块包含 34个氨基酸, 第 12和 13位氨基酸残基是碱基识 别的关键位点, 被称作双氨基酸残基 (RVD)。 通过改变 RVD模块顺序就可以识别靶基因 序列, TALEN结合到靶序列处切割 DNA双链形成 DSB, 在细胞内非同源性末端修复作用 机制下, DBS处发生移码突变, 形成终止密码子。 由于靶基因转录提前终止, 完成靶基因 敲除目的。 TALEN技术有敲除效率高, 专一性强等特点。 TALEN可以在基因组任何部分 选择靶点,做到基因敲除,改变靶点基因序列,所以 TALEN技术是研究基因功能首选方法。  TALEN (Transcription activator-like effectors nucleases) The directed gene modification technology consists of two parts. One part of the TALEN directed cutting target sequence forms a DNA double-stranded incision (DSB), and the other part of the Doner vector provides the sequence to be edited, completing the genome-directed editing work. . TALEN is composed of more than 12 tandem protein modules and Fokl endonuclease. Each protein module contains 34 amino acids. The 12th and 13th amino acid residues are key sites for base recognition and are called double amino acid residues. RVD). The target gene sequence can be identified by changing the sequence of the RVD module. TALEN binds to the target sequence and cleaves the DNA double strand to form DSB. Under the mechanism of intracellular non-homologous end repair, a frameshift mutation occurs at the DBS to form a stop codon. Target gene knockout is accomplished due to early termination of target gene transcription. TALEN technology has the characteristics of high knockout efficiency and strong specificity. TALEN can select targets in any part of the genome, knock out genes, and change target gene sequences, so TALEN technology is the preferred method for studying gene function.
However, because the recognition sequence of a TALEN vector is highly repetitive, direct synthesis of an RVD module array is very difficult. Current methods for constructing TALE repeat sequences therefore still need improvement.

Summary of the Invention
The present invention aims to solve at least one of the above technical problems at least to some extent, or at least to provide a useful commercial alternative. To this end, one object of the invention is to provide an oligonucleotide that can be used effectively to construct TALE repeat sequences, and uses thereof.
In a first aspect, the invention provides an isolated oligonucleotide. According to an embodiment of the invention, the isolated oligonucleotide comprises: a first nucleic acid molecule encoding a dual-base recognition module, the dual-base recognition module consisting of a first single-base recognition module and a second single-base recognition module in tandem, each of which contains a repeat variable di-residue (RVD); a first amplification sequence located on the 5' side of the first nucleic acid molecule and containing a type IIs endonuclease recognition sequence; and a second amplification sequence located on the 3' side of the first nucleic acid molecule and containing a type IIs endonuclease recognition sequence. Because the oligonucleotide contains a nucleic acid sequence encoding two single-base recognition modules, oligonucleotides of this kind encoding different RVDs can be used as starting material: cleavage with the restriction endonuclease creates sticky ends on both sides of each oligonucleotide, and the fragments can be ligated directly. Different RVD combinations, able to recognize any target nucleic acid sequence, can thus be obtained quickly. The number of digestion-ligation steps is greatly reduced, and the inventors surprisingly found that the rate of mismatched joins between module units is also greatly reduced. In general, the longer the RVD combination, the more digestion-ligation steps are required and the harder it is to obtain the correct assembly.
According to embodiments of the invention, the oligonucleotide may also have the following additional technical features:
In one embodiment of the invention, the first single-base recognition module and the second single-base recognition module each have the following amino acid sequence:
LTPDQVVAIAS*RVD*GGKQALETVQRLLPVLCQDHG, where *RVD* at positions 12 and 13 denotes the RVD sequence, i.e. the repeat variable di-residue, which determines the base that is recognized. The RVD corresponding to base A is NI; the RVD corresponding to base C is HD; the RVD corresponding to base T is NG; and the RVD corresponding to base G is NN. Accordingly, the first and second single-base recognition modules each have an amino acid sequence selected from the following:
Recognizing base A: LTPDQVVAIASNIGGKQALETVQRLLPVLCQDHG;
Recognizing base C: LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG;
Recognizing base T: LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG; and
Recognizing base G: LTPDQVVAIASNNGGKQALETVQRLLPVLCQDHG.
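The relationship between the 34-residue scaffold and the four RVDs can be illustrated with a minimal Python sketch. The names TEMPLATE, RVD_FOR_BASE and repeat_for_base are hypothetical and chosen only for this illustration; the scaffold and the base-to-RVD correspondence are taken from the text above.

```python
# Scaffold of one 34-residue repeat; "{rvd}" marks positions 12 and 13.
TEMPLATE = "LTPDQVVAIAS{rvd}GGKQALETVQRLLPVLCQDHG"

# Base-to-RVD correspondence as stated in the description.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def repeat_for_base(base):
    """Return the single-base recognition module for one target base."""
    return TEMPLATE.format(rvd=RVD_FOR_BASE[base.upper()])


# One module per base, keyed by the base it recognizes.
modules = {base: repeat_for_base(base) for base in "ACTG"}

for base, peptide in modules.items():
    # Positions 12-13 (1-based) are indices 11-12 (0-based).
    print(base, peptide, peptide[11:13])
```

Each module differs from the others only at positions 12 and 13, which is why the coding sequences are so repetitive.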
The inventors surprisingly found that these single-base recognition modules specifically recognize bases effectively in many cell types, including animal cells and plant cells.
The inventors also surprisingly found that, compared with wild-type TALENs, shortening the C-terminal and N-terminal regions flanking the RVDs and optimizing some of the amino acids further improves the efficiency with which the resulting TALEN recognizes its target nucleotides.
In one embodiment of the invention, the RVDs of the first single-base recognition module and the second single-base recognition module are, respectively, one of the following pairs: NI and NI; NI and NG; NI and HD; NI and NN; NG and NI; NG and NG; NG and HD; NG and NN; HD and NI; HD and NG; HD and HD; HD and NN; NN and NI; NN and NG; NN and HD; or NN and NN. The oligonucleotide can thus encode a module that recognizes one of AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC, CG, GA, GT, GC and GG. By constructing an oligonucleotide for every possible combination of first and second single-base recognition modules, a library of oligonucleotides sufficient to build any RVD combination is obtained; the desired RVD combination recognizing a predetermined target nucleotide sequence is then obtained simply by ligating the corresponding oligonucleotides.
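The sixteen pairings listed above are simply every ordered pair of the four RVDs. As a rough sketch (the function name dimer_library is hypothetical, and only the base-to-RVD correspondence comes from the description), they can be enumerated as follows:

```python
from itertools import product

# Base-to-RVD correspondence as stated in the description.
RVD_FOR_BASE = {"A": "NI", "T": "NG", "C": "HD", "G": "NN"}


def dimer_library():
    """Map each dinucleotide to the RVD pair of its dual-base module."""
    return {
        first + second: (RVD_FOR_BASE[first], RVD_FOR_BASE[second])
        for first, second in product("ATCG", repeat=2)
    }


library = dimer_library()
for dinucleotide, (rvd1, rvd2) in library.items():
    print(dinucleotide, f"{rvd1}-{rvd2}")
```

Sixteen dual-base modules cover every possible pair of adjacent bases, so a target can be read two bases at a time.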
According to one embodiment of the invention, the type IIs endonuclease is at least one selected from BsaI, BbsI and BsmBI. Preferably, the type IIs endonuclease is BsaI and the type IIs endonuclease recognition sequence is GGTCTCNNNN, where N is A, T, G or C. This further improves the efficiency of obtaining different RVD combinations by endonuclease cleavage and ligation of the cleavage products.
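To illustrate why a type IIs enzyme yields freely chosen sticky ends, the following Python sketch locates BsaI sites on the top strand and reports the 4-nt overhang each one would leave. It assumes the commonly documented BsaI geometry, GGTCTC(N)1/(N)5, in which the top strand is cut 1 nt and the bottom strand 5 nt downstream of the site; this geometry is not stated in the text above. The function name and the example sequence are hypothetical, and sites on the reverse strand are deliberately ignored.

```python
BSAI_SITE = "GGTCTC"


def bsai_overhangs(seq):
    """Return the 4-nt overhangs left by top-strand BsaI sites in seq.

    Assumption: BsaI cuts 1 nt after the site on the top strand and
    5 nt after it on the bottom strand, leaving a 4-nt 5' overhang.
    """
    seq = seq.upper()
    overhangs = []
    start = seq.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + 1  # skip one spacer nucleotide
        overhang = seq[cut:cut + 4]
        if len(overhang) == 4:  # ignore sites too close to the end
            overhangs.append(overhang)
        start = seq.find(BSAI_SITE, start + 1)
    return overhangs


# Hypothetical fragment: site, one spacer base, then module coding sequence.
example = "AAGGTCTCACTGACCCC"
print(bsai_overhangs(example))
```

Because the cut falls outside the recognition site, the overhang is determined by the neighbouring sequence rather than by the enzyme, which is what allows each junction to be given its own sticky end.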
According to one embodiment of the invention, the first amplification sequence further contains an XbaI restriction site and may have the general formula (M)₁₀₋₂₀TCTAGA(M)₂₋₈GGTCTC(H)₁₈₋₂₅, where each M is A, T, C or G, and each H is A, T, C or G, the H segment being identical to the coding sequence of the first single-base recognition module. Optionally, the second amplification sequence further contains an XhoI restriction site and has the general formula (M')₁₀₋₂₀CTCGAG(M')₂₋₈GGTCTC(H')₁₈₋₂₅, where each M' is A, T, C or G, and each H' is A, T, C or G, the H' segment matching the coding sequence of the second single-base recognition module. The sequences of M and M' are variable. In one embodiment of the invention, the first nucleic acid molecule, or the coding sequence of a single-base recognition module, is ligated into a vector to serve as the amplification template, and M and M' can be set to be identical to the 5' end of the vector and to match the 3' end sequence of the vector, respectively. The corresponding oligonucleotide can thus be conveniently stored or amplified, which facilitates subsequent TALEN construction. According to an embodiment of the invention, the vector or plasmid may contain a selectable marker, such as a drug resistance gene, so that the vector or plasmid can be conveniently amplified or selected.
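The two general formulas can be read as patterns: 10 to 20 variable nucleotides, the XbaI site TCTAGA (or the XhoI site CTCGAG), 2 to 8 variable nucleotides, the BsaI site GGTCTC, then 18 to 25 nucleotides taken from the module coding sequence. The sketch below checks only this layout with regular expressions. It does not check that the final segment actually agrees with a module coding sequence, and the function names and the example primer are hypothetical.

```python
import re

# Layout of the first amplification sequence (XbaI site TCTAGA).
FIRST_PATTERN = re.compile(r"[ATCG]{10,20}TCTAGA[ATCG]{2,8}GGTCTC[ATCG]{18,25}")
# Layout of the second amplification sequence (XhoI site CTCGAG).
SECOND_PATTERN = re.compile(r"[ATCG]{10,20}CTCGAG[ATCG]{2,8}GGTCTC[ATCG]{18,25}")


def fits_first_formula(seq):
    """True if seq has the layout of the first general formula."""
    return FIRST_PATTERN.fullmatch(seq.upper()) is not None


def fits_second_formula(seq):
    """True if seq has the layout of the second general formula."""
    return SECOND_PATTERN.fullmatch(seq.upper()) is not None


# Hypothetical primer built segment by segment, for illustration only.
example = "A" * 12 + "TCTAGA" + "AC" + "GGTCTC" + "A" * 20
print(fits_first_formula(example), fits_second_formula(example))
```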
According to one embodiment of the invention, the first amplification sequence and the second amplification sequence have the nucleotide sequences shown, respectively, in one of the following pairs: SEQ ID NO: 1 and SEQ ID NO: 10; SEQ ID NO: 2 and SEQ ID NO: 11; SEQ ID NO: 3 and SEQ ID NO: 12; SEQ ID NO: 4 and SEQ ID NO: 13; SEQ ID NO: 5 and SEQ ID NO: 14; SEQ ID NO: 6 and SEQ ID NO: 15; SEQ ID NO: 7 and SEQ ID NO: 16; SEQ ID NO: 8 and SEQ ID NO: 17; or SEQ ID NO: 9 and SEQ ID NO: 18.
The sequences of SEQ ID NOs: 1-18 are summarized in Table 1 below.
[Table 1 appears only as images (imgf000004_0001, imgf000005_0001) in the source and is not reproduced here.]
The inventors surprisingly found that using the above combinations of first and second amplification sequences effectively improves the efficiency of amplifying the oligonucleotides. The inventors also found that, by applying each of the above first and second amplification sequences to every dual-base recognition module, starting material for ligation can be built for every dual-base recognition module. When assembling a larger RVD combination, it is then only necessary to choose the first and second amplification sequences of each oligonucleotide, and the final RVD combination is built in a single digestion-ligation reaction. No intermediate vector needs to be swapped or amplified; a single ligation is followed by transformation. In theory the correct plasmid can be obtained in only 6 hours, and after one further round of transformation, amplification and verification, downstream experiments can begin. The whole procedure is very simple and genuinely streamlined, and it makes the experiment controllable.
In a second aspect, the invention provides an oligonucleotide library comprising a plurality of isolated oligonucleotides, each being an oligonucleotide as described above. The library can be used effectively to construct different RVD combinations. According to an embodiment of the invention, different oligonucleotides are kept in separate containers, which makes construction of RVD combinations convenient. In addition, according to an embodiment of the invention, the library is pre-stocked with a large number of oligonucleotide types: first, an oligonucleotide is constructed for each combination of first and second single-base recognition modules; then, for each dual-base recognition module, oligonucleotides are constructed with each of the above first and second amplification sequences. The desired RVD combination recognizing a predetermined target nucleotide sequence is then obtained simply by ligating the corresponding oligonucleotides. When assembling an RVD combination, it is only necessary to choose the first and second amplification sequences of each oligonucleotide, and the final RVD combination is built in a single digestion-ligation reaction. No intermediate vector needs to be swapped or amplified; a single ligation is followed by transformation. In theory the correct plasmid can be obtained in only 6 hours, and after one further round of transformation, amplification and verification, downstream experiments can begin. The whole procedure is very simple and genuinely streamlined, and it makes the experiment controllable. Moreover, PCR amplification is not needed to obtain the individual single and dual unit modules; instead, each single or dual unit module is built directly into a plasmid, and a large fragment library is obtained by amplifying the plasmids, which reduces PCR-introduced mutations and makes the experiment more controllable. The final vector is obtained directly by digestion and ligation between plasmids, and different resistance markers can be used to select the correct vector.
In a third aspect, the invention provides a method of constructing a nucleic acid expressing a TALE repeat sequence. The method comprises: providing a first oligonucleotide and a second oligonucleotide, each being an oligonucleotide as described above; cleaving the first and second oligonucleotides with a type IIs endonuclease to obtain a first oligonucleotide cleavage product and a second oligonucleotide cleavage product, wherein each cleavage product has a sticky end formed at its first amplification sequence and at its second amplification sequence, and the sticky end formed at the second amplification sequence of the first cleavage product matches the sticky end formed at the first amplification sequence of the second cleavage product; and ligating the first cleavage product to the second cleavage product to obtain an oligonucleotide expressing a TALE repeat sequence. With this method, different RVD combinations can be obtained efficiently by ligating different oligonucleotides. In practice, a large RVD combination is obtained in a single ligation simply by choosing the first and second amplification sequences flanking each oligonucleotide. Note that the terms "first" and "second" are used for description only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the invention, "a plurality" means two or more, unless expressly and specifically defined otherwise. Thus, according to embodiments of the invention, an RVD combination of any size can be assembled in a single digestion-ligation reaction in one reaction system. No intermediate vector needs to be swapped or amplified; a single ligation is followed by transformation. In theory the correct plasmid can be obtained in only 6 hours, and after one further round of transformation, amplification and verification, downstream experiments can begin. The whole procedure is very simple and genuinely streamlined, and it makes the experiment controllable. By contrast, with current conventional methods the construction step alone takes 2 to 5 days. Moreover, PCR amplification is not needed to obtain the individual single and dual unit modules; instead, each single or dual unit module is built directly into a plasmid, and a large fragment library is obtained by amplifying the plasmids, which reduces PCR-introduced mutations and makes the experiment more controllable. The final vector is obtained directly by digestion and ligation between plasmids, and different resistance markers can be used to select the correct vector.
According to embodiments of the invention, the above method may further have the following additional technical features:
In one embodiment of the invention, the RVD sequences of the first oligonucleotide and the second oligonucleotide are determined from the predetermined target nucleic acid sequence. Preferably, they are determined according to the following correspondence:
the RVD corresponding to base A is NI;
the RVD corresponding to base C is HD;
the RVD corresponding to base T is NG; and
the RVD corresponding to base G is NN. This effectively improves the efficiency with which the designed RVD combination recognizes the predetermined target nucleotide sequence.
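The base-to-RVD correspondence above amounts to a lookup over the target sequence. A minimal Python sketch follows; the helper names are hypothetical, and the example target TGACACAGAGATGCCATT is the PPARγ target sequence quoted later in this description. Splitting the target into consecutive dinucleotides, one per dual-base module, is an illustrative assumption about how modules would be chosen; a target of odd length leaves a single final base, which this sketch returns unchanged.

```python
# Base-to-RVD correspondence as stated in the description.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def target_to_rvds(target):
    """Return the RVD for each base of the target, in order."""
    return [RVD_FOR_BASE[base] for base in target.upper()]


def split_into_dimers(target):
    """Split the target into consecutive 2-nt chunks (last may be 1 nt)."""
    target = target.upper()
    return [target[i:i + 2] for i in range(0, len(target), 2)]


target = "TGACACAGAGATGCCATT"
print("-".join(target_to_rvds(target)))
print(split_into_dimers(target))
```

For this 18-nt target the sketch yields nine dinucleotides, so nine dual-base modules would be needed instead of eighteen single-base modules.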
In addition, according to an embodiment of the invention, at least one of the first oligonucleotide and the second oligonucleotide comprises several different oligonucleotides, wherein any given sticky end of an oligonucleotide matches at most one sticky end among the other oligonucleotides. For example, with multiple oligonucleotides, the sticky end Fₙ on the forward-primer side of each oligonucleotide joins only to the sticky end Rₙ₊₁ on the reverse-primer side of one other oligonucleotide and cannot join to the sticky ends from other primers. Multiple oligonucleotides can therefore be ligated at once, which improves construction efficiency.
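The requirement that each sticky end has at most one partner can be checked before an assembly is attempted. The sketch below is a simplified model, not the inventors' procedure: each fragment is represented by a hypothetical pair of 4-nt overhangs (left, right) written on the same strand, consecutive fragments must share a junction overhang, and every overhang in the assembly must be distinct. Complementarity between the two strands and palindromic overhangs are not modelled.

```python
def is_unambiguous_assembly(fragments):
    """Check an ordered list of (left, right) overhang pairs.

    Fragment n must end with the overhang that fragment n+1 starts
    with, and no overhang may be shared by more than one junction or
    by the two outer ends.
    """
    if not fragments:
        return False
    lefts = [left for left, _ in fragments]
    rights = [right for _, right in fragments]
    # Each junction: right end of fragment n pairs with left end of n+1.
    for n in range(len(fragments) - 1):
        if rights[n] != lefts[n + 1]:
            return False
    junctions = rights[:-1]
    all_ends = junctions + [lefts[0], rights[-1]]
    # Every end must be unique so that it has exactly one partner.
    return len(set(all_ends)) == len(all_ends)


# Hypothetical overhangs for a three-fragment assembly.
ordered = [("AAAA", "CTGA"), ("CTGA", "TGAC"), ("TGAC", "GGGG")]
print(is_unambiguous_assembly(ordered))
```

With unique junctions, all fragments can be mixed in one reaction and still join in only one order.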
In a fourth aspect, the invention provides a method of altering the genome of a cell, comprising: determining the RVD sequence of a TALE for a predetermined target nucleic acid sequence in the genome of the cell; constructing, by the method described above, an oligonucleotide expressing a TALE repeat sequence that specifically recognizes the target nucleic acid sequence; and introducing the oligonucleotide expressing the TALE repeat sequence, together with a nucleic acid encoding a TALE DNA-modifying enzyme, into the cell. According to embodiments of the invention, a specific RVD combination can thus be constructed efficiently for the predetermined target nucleic acid sequence, which effectively improves the efficiency of altering the genome of the cell. According to an embodiment of the invention, the target nucleic acid sequence is 4 to 20 nt long, preferably 12 to 20 nt. According to an embodiment of the invention, the TALE DNA-modifying enzyme is FokI. Preferably, the oligonucleotide expressing the TALE repeat sequence and the nucleic acid encoding the TALE DNA-modifying enzyme are constructed on the same vector, which effectively improves the efficiency of altering the genome of the cell.
In a fifth aspect, the invention provides an isolated polypeptide encoded by the oligonucleotide described above. The resulting polypeptide can specifically recognize a base sequence.
In a fifth aspect, the invention provides a vector. According to an embodiment of the invention, the vector comprises a nucleic acid molecule encoding the oligonucleotide described above. Preferably, the vector is a plasmid. The corresponding oligonucleotide can thus be conveniently stored or amplified, which facilitates subsequent TALEN construction. According to an embodiment of the invention, the vector or plasmid may contain a selectable marker, such as a drug resistance gene, so that the vector or plasmid can be conveniently amplified or selected. In addition, according to an embodiment of the invention, the selectable marker (for example, a drug resistance gene) of the vector containing the oligonucleotide may differ from that of the final expression vector, which makes it easier to select the required combination of recognition modules after a single ligation reaction.
In a sixth aspect, the invention provides a plasmid library. According to an embodiment of the invention, the plasmid library comprises a plurality of plasmids, each being a vector as described above. Preferably, different plasmids are kept in separate containers.
In a seventh aspect, the invention provides a cell. According to an embodiment of the invention, the cell is obtained by altering its genome using the genome-altering method described above. Compared with the original cell, the genome of the resulting cell carries a mutation, for example one that silences expression of a gene.
In an eighth aspect, the invention provides a method of modifying a gene. According to an embodiment of the invention, the method comprises: selecting at least two corresponding plasmids from the plasmid library described above, based on a target sequence of the gene; cleaving the at least two plasmids with a type IIs endonuclease to obtain a plurality of plasmid cleavage products, wherein each plasmid cleavage product has a sticky end at each of its two ends, and any given sticky end of a plasmid matches at most one sticky end among the other plasmids; ligating the plurality of plasmid cleavage products to obtain an oligonucleotide expressing a TALE repeat sequence that recognizes the target sequence of the gene; and, based on recognition of the target sequence by the TALE repeat sequence, altering the gene by means of a TALE DNA-modifying enzyme. A predetermined gene can thus be modified effectively, for example silenced, to achieve gene knockout. According to an embodiment of the invention, among the plurality of plasmids, any given sticky end of a plasmid matches at most one sticky end among the other plasmids. For example, with multiple plasmids, the sticky end Fₙ on the forward-primer side of each plasmid joins only to the sticky end Rₙ₊₁ on the reverse-primer side of one other plasmid and cannot join to the sticky ends from other primers. Multiple plasmids can therefore be ligated at once, which improves construction efficiency.
According to embodiments of the invention, the method may also have the following additional technical features:
In one embodiment of the invention, the corresponding plasmids are determined according to the following correspondence:
the RVD corresponding to base A is NI;
the RVD corresponding to base C is HD;
the RVD corresponding to base T is NG; and
the RVD corresponding to base G is NN.
In one embodiment of the invention, the gene is PPARγ, and the target sequence is at least one of TGACACAGAGATGCCATT and GAATCAGCTCTGTGG.
Additional aspects and advantages of the invention will be given in part in the description below, will in part become apparent from that description, or will be learned through practice of the invention.

Detailed Description
The invention is described below by way of specific examples. The examples are illustrative only and do not limit the scope of the invention. The reagents used in the examples are all commercially available, and, unless otherwise stated, any step whose method is not explicitly described is carried out by conventional methods known to those skilled in the art.
Example 1 Construction of an oligonucleotide library
1. Determining the recognition modules
The inventors selected 16 recognition modules, NI-NI, NI-NG, NI-HD, NI-NN, NG-NI, NG-NG, NG-HD, NG-NN, HD-NI, HD-NG, HD-HD, HD-NN, NN-NI, NN-NG, NN-HD and NN-NN, which recognize the 16 dinucleotides AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC, CG, GA, GT, GC and GG respectively. The sequence of each module was ligated, in order, into the vector p-FUS-B2 and, once verified, used as an amplification template.
The base sequences of the single modules are as follows:
NI (A)
CTGACCCCGG ACCAAGTGGT GGCTATCGCC AGCAACATTG GCGGCAAGCA AGCGCTCGAA
HD (C)
CTGACTCCGG ACCAAGTGGT GGCTATCGCC AGCCACGATG GCGGCAAGCA AGCGCTCGAA
NG (T)
CTGACCCCGG ACCAAGTGGT GGCTATCGCC AGCAACGGTG GCGGCAAGCA AGCGCTCGAA
NN (G)
CTGACCCCGG ACCAAGTGGT GGCTATCGCC AGCAACAATG GCGGCAAGCA AGCGCTCGAA
p-FUS-B2 was purchased from Addgene. Using p-FUS-B2 as the vector, two single modules were joined together to form the basic amplification templates used by the inventors. The templates were set aside for use once sequencing had confirmed them.
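As a cross-check on the four single-module sequences listed above, the following minimal sketch translates each 60-nt sequence with the standard genetic code and reads the two residues at positions 12 and 13, where the repeat-variable di-residue (RVD) sits. The helper names `translate` and `rvd_of` are illustrative and are not taken from the patent.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Single-module sequences exactly as given in the text (spaces are ignored).
MODULES = {
    "NI": "CTGACCCCGG ACCAAGTGGT GGCTATCGCC AGCAACATTG GCGGCAAGCA AGCGCTCGAA",
    "HD": "CTGACTCCGG ACCAAGTGGT GGCTATCGCC AGCCACGATG GCGGCAAGCA AGCGCTCGAA",
    "NG": "CTGACCCCGG ACCAAGTGGT GGCTATCGCC AGCAACGGTG GCGGCAAGCA AGCGCTCGAA",
    "NN": "CTGACCCCGG ACCAAGTGGT GGCTATCGCC AGCAACAATG GCGGCAAGCA AGCGCTCGAA",
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is stripped first."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def rvd_of(dna: str) -> str:
    """Return residues 12 and 13 of the translated module (the RVD)."""
    return translate(dna)[11:13]


for name, seq in MODULES.items():
    print(name, translate(seq), rvd_of(seq))
```

Each module translates to a 20-residue peptide of the form LTPDQVVAIAS-RVD-GGKQALE, and the extracted RVD agrees with the module label.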
2. Amplification of the recognition modules
[Figures imgf000010_0001 and imgf000011_0001: not reproduced]
5'-ctagacgtctcgatagcctcgagTGGTTggtctcTcAGTCCATGGTCCTGGCAC
All 18 primers above carry the BsaI recognition sequence GGTCTCN'NNNN. BsaI is a Type IIS enzyme: one and the same recognition sequence can give rise to many different sticky ends, in theory 4^4 = 256 of them. Because the amino acids at the end and at the beginning of each module are Gly and Leu respectively, the degenerate codons available (4 for Gly, 6 for Leu) limit the choice, so that one Type IIS enzyme can generate 24 kinds of junction. We chose 18 of these for primer design. Apart from F1 and R9, the sticky end of Fn can ligate to that of Rn+1 and to no sticky end on any other primer. Note that the amplification primers also introduce XhoI and XbaI restriction sites, C'TCGAG and T'CTAGA, at the two ends of the amplification product.
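The counts quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch below assumes, as one reading of the text, that the figure of 24 is simply the number of Gly codons multiplied by the number of Leu codons in the standard genetic code; variable names are illustrative.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# A 4-nt overhang can take any of 4^4 sequences in principle.
n_overhangs = 4 ** 4

# Codons available for the Gly that ends a module and the Leu that starts the next.
gly_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "G")
leu_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")

# Assumed interpretation: one junction type per (Gly codon, Leu codon) pair.
n_junctions = len(gly_codons) * len(leu_codons)

print(n_overhangs, gly_codons, leu_codons, n_junctions)
```

Under that assumption the 18 junctions actually used (nine positions, forward and reverse) fit within the 24 available.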
The PCR amplification was carried out as follows.
PCR reaction system:
Single- or double-module template plasmid (50 ng/µl)    1 µl
Phusion High-Fidelity PCR Master Mix    25 µl
Forward and reverse primers (10 µM)    5 µl
Double-distilled water    19 µl
Total volume    50 µl
Reaction conditions:
35 cycles
[Figure imgf000011_0002: thermal cycling conditions, not reproduced]
After PCR, the single-module products were recovered and purified directly. Because the double-module PCR products are contaminated with single-module products, they were recovered by gel extraction: the band of about 280 bp was excised and purified.
3. Construction of the plasmid library of recognition modules
Because the PNN-4 plasmid carries XhoI and XbaI restriction sites, the inventors chose PNN-4, which confers tetracycline resistance, as the backbone plasmid of the library for ligation with the single and double modules. Since the resistance conferred by PNN-4 differs from the kanamycin resistance of the final vector, the one-step ligation can be carried out more efficiently. PNN-4 was purchased from Addgene.
The PNN-4 module vector plasmid and the recognition-module amplification products obtained above were each digested with XhoI and XbaI, the target fragments were recovered, and the digestion products were ligated, giving a plasmid library for the recognition modules of 180 plasmids in total. The plasmids are named aN according to the primer number, where a is the primer code (1 to 9) and N is the base or bases recognized by the module: one of the four single bases A, T, C, G, or one of the 16 dinucleotides AA, AT, AC, AG, TA, TT, TC, TG, CA, CT, CC, CG, GA, GT, GC, GG. The digestion conditions were as follows:
Digestion system:
XbaI, 10 U/µl    0.75 µl
XhoI, 10 U/µl    0.75 µl
NEB buffer 4    5 µl
PNN-4 or single/double module    500 ng
Double-distilled water    to 50 µl
Total volume    50 µl
Reaction conditions: 37 °C, 4 h
The ligation conditions were as follows:
Ligation system:
T4 ligase    0.4 µl
T4 buffer    1 µl
PNN-4 recovered after digestion    20 ng
Single/double module recovered after digestion    200 ng
Double-distilled water    to 10 µl
Total volume    10 µl
Reaction conditions: 16 °C, 2 h.
The ligation product was transformed into TransT1 competent bacteria and plasmid was extracted. Plasmids that yielded a fragment of the expected single- or double-module size on double digestion (XhoI and XbaI) were sent for Sanger sequencing to confirm the sequence.
In this way, plasmids carrying every single and double module ligated into PNN-4 were obtained and set aside for use.
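The naming scheme of the library built in this section can be sketched as below. The enumeration is an illustration of the aN convention described in the text (nine positions, each with 4 single-base and 16 double-base modules); the variable names are hypothetical.

```python
from itertools import product

BASES = "ATCG"

# 4 single-base modules and 16 double-base modules per position.
single_modules = list(BASES)
double_modules = ["".join(pair) for pair in product(BASES, repeat=2)]

# Positions 1 to 9 correspond to the nine primer pairs; names follow "aN".
library = [
    f"{position}{module}"
    for position in range(1, 10)
    for module in single_modules + double_modules
]

print(len(library))   # size of the library
print(library[:6])    # first few plasmid names
```

Nine positions times twenty modules gives the 180 plasmids stated in the text; names such as 1TG or 9T, used later when assembling a TALEN arm, are members of this list.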
4. Construction of plasmids expressing the half repeat unit
In the TALEN protein, the final half repeat unit actually comprises half a repeat (RVD) sequence together with part of the C-terminus of the TALEN protein. In this example the two parts are merged into a single module unit that is ligated at the same time as the other modules. This simplifies the ligation workflow and reduces the final vector to a single type, so that there is no need to choose, each time, a different final vector according to whether the last recognized base is A, T, C or G.
Because the last module is a half unit and includes part of the C-terminal sequence, the inventors obtained the corresponding sequences by a combination of synthesis and PCR. The synthesized C terminal seq was used as the template, and PCR amplification was performed with the forward fragment primers and half-R.
The PCR amplification was carried out as follows.
PCR amplification system:
Synthesized C terminal seq    10 ng
Phusion High-Fidelity PCR Master Mix    25 µl
half-R (10 µM)    2 µl
Double-distilled water    to 50 µl
Total volume    50 µl
Reaction conditions:
98 °C    30 s
20 cycles of:
98 °C    10 s
60 °C    20 s
72 °C    20 s
then:
72 °C    5 min
After the first PCR, 10 µl was taken from the reaction and 10 µl of forward fragment primer was added; the mixture was heated to 95 °C for 5 min and then left at room temperature to cool and anneal. After annealing, 1 µl of 10× Klenow fragment buffer and 1 µl of Klenow fragment were added and the mixture was incubated at 37 °C for 30 min.
3 µl of the product of the preceding step was used as the template for PCR with the amplification primers F/R under the following conditions.
PCR amplification system:
Klenow-treated product    3 µl
Phusion High-Fidelity PCR Master Mix    25 µl
Amplification primers F/R (10 µM)    2 µl
Double-distilled water    to 50 µl
Total volume    50 µl
Reaction conditions:
98 °C    30 s
35 cycles of:
98 °C    10 s
60 °C    20 s
72 °C    20 s
then:
72 °C    5 min
PCR products of about 200 bp were recovered by gel extraction, purified, and ligated into the P-easy vector. The reaction was set up as follows:
P-easy blunt vector    1 µl
PCR product    0.5 µl to 4 µl (20 ng)
After mixing, ligation proceeded at room temperature for 15 min. The product was transformed directly into competent cells, and single colonies were picked for Sanger sequencing with M13F. Correct plasmids were stored for use.
PCR amplification with the amplification primers F/R yields a module unit comprising the half module for the last position of the recognition sequence together with part of the C-terminus. This unit was ligated to PNN-4 in the same way as the other single and double modules, and plasmids with the correct sequence were obtained. Because the half module occupies the 10th ligation position, the plasmids are named 10A, 10T, 10C and 10G according to the base recognized.
The full-length sequences are as follows:
HD:
gcgcTCTAGACCTTAAACCGGCCAACATACCggtctcCactgaccCCGGACCAAGTGG
[Figure imgf000014_0001: remainder of the sequence listing for the half modules and the amplification primers F and R (SEQ ID NOs: 25 to 31), not reproduced]
Example 2 Construction of a targeting vector for a target sequence
Selection of TALEN target sites
TALEN target sites are read from 5' to 3', and the base immediately preceding the recognized sequence is T. Target sites can be selected with the online tool https://boglab.plp.iastate.edu/node/add/talen. For a 20 bp target (where N = A, G, T or C), the middle 18 bp are the target sequence chosen, as required, for the TALEN vector to be constructed.
1. Selecting the target sequence
In this example the inventors chose the gene PPARγ2 (NCBI accession number NG_011749, REGION: 5001..151507). The target sequence lies in exon 2 of the PPARγ2 gene and reads TGACACAGAGATGCCATTctggcccaccaacttcgGAATCAGCTCTGTGGA. The left arm of the TALEN constructed accordingly recognizes the 17-base sequence TGACACAGAGATGCCATT, the right arm recognizes the 15-base sequence GAATCAGCTCTGTGG, and the spacer between them is 17 bases long. It is recommended that every design use a TALEN pair whose spacer contains a restriction site, since this makes later verification of targeting efficiency easier. TALEN design relies on the 5' end of each targeted sequence beginning with T; that is, the sequence begins with T and ends with A. This T is very important for TALEN efficiency.
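The layout of the target site described above can be inspected programmatically. The sketch below splits the quoted sequence into its upper-case arms and lower-case spacer and looks for a GGCC motif in the spacer; GGCC is the site named later in the document for the restriction-based screen. Note that the upper-case arm strings include the flanking T and A, so their lengths (18 and 16) are one greater than the 17 and 15 recognized bases quoted in the text. Variable names are illustrative.

```python
import re

# Target site as printed: upper case = TALEN arms, lower case = spacer.
SITE = "TGACACAGAGATGCCATTctggcccaccaacttcgGAATCAGCTCTGTGGA"

match = re.fullmatch(r"([ACGT]+)([acgt]+)([ACGT]+)", SITE)
left_arm, spacer, right_arm = match.groups()

print("left arm :", left_arm, len(left_arm))
print("spacer   :", spacer, len(spacer))
print("right arm:", right_arm, len(right_arm))

# The design rule in the text: the site starts with T and ends with A.
print("starts with T:", left_arm.startswith("T"))
print("ends with A  :", right_arm.endswith("A"))

# Position of a GGCC motif inside the spacer (-1 would mean absent).
print("GGCC offset in spacer:", spacer.upper().find("GGCC"))
```

A cut-and-repair event that disrupts the GGCC motif would make the amplicon resistant to digestion at that site, which is the basis of the screen described in Example 3.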
2. Determining the RVD modules
Based on the target sequence determined above, the sequences of the RVD modules are as follows:
TALEN left arm RVD sequence:
TALEN right arm RVD sequence:
HD HD NI HD NI NN NI NN HD NG NN NI NG NG HD
Once the RVDs recognized by the PPARγ2-targeting TALEN had been determined, plasmids were selected from the plasmid library. For the left arm, the ten plasmids 1TG, 2AC, 3AC, 4AG, 5AG, 6AT, 7GC, 8CA, 9T and 10T were digested and ligated together with the final vector plasmid of the invention, giving a sequence-verified TALEN left arm. For the right arm, the ten plasmids 1CC, 2AC, 3AG, 4AG, 5CT, 6G, 7A, 8T, 9T and 10C were digested and ligated together with the final vector plasmid of the invention, giving a sequence-verified TALEN right arm.
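The plasmid choices above can be checked against the target site with a short script. This is a sketch under two assumptions: the base-to-RVD code is A to NI, C to HD, T to NG and G to NN, as used by the library modules, and the right arm binds the reverse complement of the right-hand site. Function names are hypothetical.

```python
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def rvds_for(target: str) -> list[str]:
    """One RVD per target base, read 5' to 3'."""
    return [RVD_FOR_BASE[base] for base in target]


def bases_from_plasmids(names: list[str]) -> str:
    """Strip the position number from each 'aN' name and join the bases."""
    return "".join(name.lstrip("0123456789") for name in names)


left_site = "TGACACAGAGATGCCATT"
right_site = "GAATCAGCTCTGTGGA"

left_plasmids = "1TG 2AC 3AC 4AG 5AG 6AT 7GC 8CA 9T 10T".split()
right_plasmids = "1CC 2AC 3AG 4AG 5CT 6G 7A 8T 9T 10C".split()

left_bases = bases_from_plasmids(left_plasmids)
right_bases = bases_from_plasmids(right_plasmids)

print("left arm bases :", left_bases)
print("right arm bases:", right_bases)
print("right arm RVDs :", " ".join(rvds_for(right_bases)))

# The right arm reads the opposite strand, after its leading T.
print(reverse_complement(right_site)[1:] == right_bases)
```

The ten left-arm plasmids spell out the left-hand site, and the ten right-arm plasmids spell out the reverse complement of the right-hand site minus its leading T.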
3. Module ligation
The plasmids of the library were all diluted to the same concentration, 100 ng/µl. Because the restriction site on the vector is Esp3I, the final vector was digested first and the 4 kb vector fragment was recovered by gel extraction.
Digestion components and volumes:
Esp3I (BsmBI), 10 U/µl    0.75 µl
T4 buffer    1 µl
Double-distilled water    14.85 µl
Final vector    1 mg
Total volume    20 µl
Reaction conditions: 37 °C, 4 h.
The digestion-ligation reaction that yields the final TALEN left and right arms was set up as follows:
Component    Volume
Digested, recovered and purified vector (100 ng/µl)    1 µl
10 module plasmids (100 ng/µl)    0.5 µl each
BsaI    0.75 µl
BSA (10×)    1 µl
ATP (10 mM)    1 µl
T7 ligase    0.25 µl
Double-distilled water    to 20 µl
Total volume    20 µl
Reaction conditions:
35 cycles of:
37 °C    5 min
20 °C    10 min
When the reaction was complete, the ligation product was transformed and selected on kanamycin. Clones were picked for Sanger sequencing and, once confirmed correct, set aside for use.
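The volumes in the digestion-ligation reaction above leave the amount of water implicit ("to 20 µl"). The following sketch works it out from the listed components and also totals the cycling time; it is plain arithmetic on the figures given in the text, with illustrative variable names.

```python
TOTAL_VOLUME_UL = 20.0

# Components with fixed volumes (µl), as listed in the reaction table.
fixed_components_ul = {
    "digested vector": 1.0,
    "BsaI": 0.75,
    "BSA (10x)": 1.0,
    "ATP (10 mM)": 1.0,
    "T7 ligase": 0.25,
}

# Ten module plasmids at 0.5 µl each.
n_module_plasmids = 10
module_plasmids_ul = n_module_plasmids * 0.5

used_ul = sum(fixed_components_ul.values()) + module_plasmids_ul
water_ul = TOTAL_VOLUME_UL - used_ul

# 35 cycles of 5 min at 37 °C plus 10 min at 20 °C.
cycling_minutes = 35 * (5 + 10)

print(f"water to add: {water_ul} µl")
print(f"cycling time: {cycling_minutes} min ({cycling_minutes / 60} h)")
```

With these figures the reaction needs 11 µl of water, and the cycling programme runs for 525 min, not counting ramp times.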
4. Verification
After sequencing, the TALEN left-arm RVD sequence for PPARγ2 described above was assembled and aligned. The RVD sequence of the TALEN left arm is:
The amino acid sequence of the corresponding RVDs is:
ETVQRLLPVLCQDHG (SEQ ID NO: 37)
TVQRLLPVLCQDHG (SEQ ID NO: 38)
TVQRLLPVLCQDHG (SEQ ID NO: 39)
TVQRLLPVLCQDHG (SEQ ID NO: 40)
TVQRLLPVLCQDHG (SEQ ID NO: 41)
TVQRLLPVLCQDHG (SEQ ID NO: 42)
ETVQRLLPVLCQDHG (SEQ ID NO: 43)
TVQRLLPVLCQDHG (SEQ ID NO: 44)
LTPDQVVAIAS
E (SEQ ID NO: 45)
The RVD region consists of 598 amino acids in total: 34 × 17 + 20 = 598.
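The total of 598 amino acids can be rebuilt from the repeat architecture. The sketch below assumes a full repeat of the form LTPDQVVAIAS-RVD-GGKQALETVQRLLPVLCQDHG, as given in claim 2, and a terminal half repeat of the form LTPDQVVAIAS-RVD-GGKQALE; the placeholder "XX" stands for the two RVD residues, and the names are illustrative.

```python
# "XX" is a placeholder for the two variable RVD residues.
FULL_REPEAT = "LTPDQVVAIAS" + "XX" + "GGKQALETVQRLLPVLCQDHG"
HALF_REPEAT = "LTPDQVVAIAS" + "XX" + "GGKQALE"

N_FULL_REPEATS = 17

total_residues = N_FULL_REPEATS * len(FULL_REPEAT) + len(HALF_REPEAT)
coding_length_bp = 3 * total_residues

print(len(FULL_REPEAT), len(HALF_REPEAT))  # residues per full and half repeat
print(total_residues)                       # residues in the repeat region
print(coding_length_bp)                     # corresponding coding length in bp
```

Seventeen 34-residue repeats plus one 20-residue half repeat give 598 residues, which corresponds to a coding length of 1794 bp, consistent with the figure of roughly 2 kb quoted for the sequencing strategy.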
The primer sequences for Sanger sequencing are:
TAL-F1 5'-ttggcgtcggcaaacagtgg (SEQ ID NO: 35)
TAL-R2 5'-ggcgacgaggtggtcgttgg (SEQ ID NO: 36)
Because the RVD region of a whole TALEN is about 2 kb long, the full RVD sequence was Sanger sequenced in three parts to make sure every RVD is correct. The alignment rule is that the three reads must overlap, starting at the first amino acids, LTPE, and ending at the QAHG of the final half module, so that every amino acid is matched.
Vectors whose sequence matched on inspection can be used for the downstream assay of targeting efficiency.
Example 3 Gene knockout
The targeting vectors constructed in Example 2 were transfected into 293T cells.
Cell transfection
Before transfection, the cells were brought into good condition (clear morphology, plump and bright). One day before transfection, the cells were passaged into a 6-well plate so that the density at the time of transfection would be 80 to 90%. In addition to the experimental wells, one well was reserved as a negative control and one as a positive control.
Transfection was performed with X-tremeGENE HP DNA Transfection Reagent (a 6-well plate is taken as the example). Briefly, the steps are as follows.
Preparing the transfection complex:
a) Take 200 µl of serum-free medium or Opti-MEM I medium.
b) Add 2 µg of DNA (experimental group: 1 µg of each TALEN plasmid; positive control: 2 µg of EGFP-N1 plasmid; negative control: none).
c) Vortex briefly and gently (no more than 10 s).
d) Add 6 µl of X-tremeGENE HP DNA Transfection Reagent.
e) Vortex briefly and gently.
f) Incubate at room temperature (15 to 25 °C) for 15 to 30 minutes.
Add the transfection complex evenly, drop by drop, to the prepared cells and rock the dish gently in a cross pattern. Culture the cells at 37 °C in a 5% CO2 incubator.
24 h after transfection, transfection efficiency was estimated from the proportion of green-fluorescent cells in the positive control (normally above 80% for CHO cells). The medium was discarded and replaced with fresh complete medium.
24 h after transfection, the medium can be replaced with selection medium containing puromycin. After about 7 days of culture it is replaced with ordinary complete medium; by then the cells of the negative control have all died, while the experimental group still contains viable cells. (The antibiotic concentration and the duration of selection must be determined in a preliminary experiment.)
During selection, large numbers of dead cells will float in the medium. The medium can be changed every 1 to 2 days, keeping the puromycin concentration constant throughout.
After the switch to ordinary complete medium, once the cells of the experimental group have grown to about 80% confluence or more, genomic DNA can be extracted for identification. The medium was discarded, each well was washed once with PBS, and 0.25% trypsin was added. When the cells had rounded up, complete medium was added to stop digestion, the cell suspension was collected and centrifuged at 1000 rpm for 5 min, and the supernatant was discarded.
Genomic DNA was extracted from the cells with a blood/tissue/cell genomic DNA extraction kit (TIANGEN).
With the genomic DNA as template, PCR was performed using primers designed in advance around the target site. Agarose gel electrophoresis was used to check that the amplicon was of the expected size, and the PCR product was sent for sequencing.
Overlapping peaks at the target site in the sequencing trace indicate that the site has been disrupted and that the TALEN is active. PCR product confirmed by sequencing was then digested in bulk to remove the non-knockout allele; the digestion product was recovered and TA cloned, colonies were screened by PCR followed by digestion, and clones that passed were sent for sequencing to confirm the result.
Preparation of monoclonal cells
10 cm culture dishes were coated in advance with 0.1% gelatin in a 37 °C incubator. Cells provisionally identified as carrying a knockout were trypsinized into a cell suspension and counted on a counting chamber, and 2000 cells were seeded per dish. After one week of culture at 37 °C in a 5% CO2 incubator, island-like colonies with clear edges had formed; these are the monoclonal colonies.
Screening of monoclonal cells
Single-cell clones that grew up after limiting dilution were picked under the microscope with a pipette and transferred to a 96-well plate. When the cell density reached 90%, the clones were passaged to a 48-well plate, and a small number of cells were set aside for PCR amplification with the Phire Animal Tissue Direct PCR Kit (F-140). Agarose gel electrophoresis showed a band at 530 bp in 93 of the 96 samples. The PCR products were then digested with the restriction endonuclease PhoI at 75 °C. Clones in which the 530 bp band remained uncut on electrophoresis were scored as positive, and their PCR products were sent for sequencing. The sequencing results were then analyzed: clones in which the PhoI site (GGCC) had been disrupted are single cell lines carrying the TALEN knockout. Preliminary sequencing covered 23 samples: 6 without knockout, 2 with knockout but contaminated by other cells, 13 with a single knockout (of which 9 gave unclear sequencing results), and 2 with a double knockout (clones 27 and 64).
The PCR products of clones 27 and 64 were further ligated into PMD18-T for T-clone sequencing. The results of the analysis are as follows:
[Figure imgf000018_0001: T-clone sequencing analysis, not reproduced]
The experiments above thus yielded several cell clones carrying a homozygous PPARγ2 knockout, two of which are homozygous double knockouts.
In this specification, references to "one embodiment", "some embodiments", "an example", "a specific example" or "some examples" mean that a particular feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such expressions do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described above, they are illustrative and are not to be understood as limiting the invention. A person of ordinary skill in the art may vary, modify, substitute and adapt the embodiments within the scope of the invention without departing from its principles and purpose.

Claims

1. An isolated oligonucleotide, characterized in that it comprises:
a first nucleic acid molecule, the first nucleic acid molecule encoding a double-base recognition module, the double-base recognition module consisting of a first single-base recognition module and a second single-base recognition module in tandem, the first single-base recognition module and the second single-base recognition module each comprising a repeat-variable di-residue (RVD);
a first amplification sequence, the first amplification sequence being located on the 5' side of the first nucleic acid molecule and comprising a Type IIS endonuclease recognition sequence; and
a second amplification sequence, the second amplification sequence being located on the 3' side of the first nucleic acid molecule, the second nucleic acid sequence comprising a Type IIS endonuclease recognition sequence.
2. The oligonucleotide according to claim 1, characterized in that the first single-base recognition module and the second single-base recognition module each have the amino acid sequence:
LTPDQVVAIAS*RVD*GGKQALETVQRLLPVLCQDHG,
wherein *RVD* denotes the repeat-variable di-residue.
3. The oligonucleotide according to claim 2, wherein the RVDs of the first single base identification module and the second single base identification module satisfy one of the following conditions:
the RVD of the first single base identification module is NI, and the RVD of the second single base identification module is NI;
the RVD of the first single base identification module is NI, and the RVD of the second single base identification module is NG;
the RVD of the first single base identification module is NI, and the RVD of the second single base identification module is HD;
the RVD of the first single base identification module is NI, and the RVD of the second single base identification module is NN;
the RVD of the first single base identification module is NG, and the RVD of the second single base identification module is NI;
the RVD of the first single base identification module is NG, and the RVD of the second single base identification module is NG;
the RVD of the first single base identification module is NG, and the RVD of the second single base identification module is HD;
the RVD of the first single base identification module is NG, and the RVD of the second single base identification module is NN;
the RVD of the first single base identification module is HD, and the RVD of the second single base identification module is NI;
the RVD of the first single base identification module is HD, and the RVD of the second single base identification module is NG;
the RVD of the first single base identification module is HD, and the RVD of the second single base identification module is HD;
the RVD of the first single base identification module is HD, and the RVD of the second single base identification module is NN;
the RVD of the first single base identification module is NN, and the RVD of the second single base identification module is NI;
the RVD of the first single base identification module is NN, and the RVD of the second single base identification module is NG;
the RVD of the first single base identification module is NN, and the RVD of the second single base identification module is HD; or
the RVD of the first single base identification module is NN, and the RVD of the second single base identification module is NN.
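The sixteen conditions of claim 3 appear to be every ordered pairing of four RVDs. A minimal sketch, assuming the four RVDs are NI, NG, HD and NN (the names `RVDS` and `dimer_combinations` are illustrative and not part of the claims):

```python
from itertools import product

# Assumed set of RVDs available to each single base identification module.
RVDS = ("NI", "NG", "HD", "NN")


def dimer_combinations(rvds=RVDS):
    """Return every ordered (first module RVD, second module RVD) pair."""
    return list(product(rvds, repeat=2))


if __name__ == "__main__":
    for first, second in dimer_combinations():
        print(f"first module: {first}, second module: {second}")
```

With four RVDs per position, the enumeration yields 4 x 4 ordered pairs, one for each condition listed in the claim.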
4. The oligonucleotide according to claim 1, wherein the Type IIs endonuclease is at least one selected from BsaI, BbsI and BsmBI.
5. The oligonucleotide according to claim 1, wherein the Type IIs endonuclease recognition sequence is GGTCTCNNNN, where N is A, T, G or C.
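The pattern in claim 5 can be located in a sequence with a short regular expression. The following is only a sketch: the function name `find_sites` is hypothetical, and only the strand passed in is scanned (the reverse complement is not searched).

```python
import re

# GGTCTC followed by four arbitrary bases (N = A, T, G or C), as in claim 5.
# The lookahead lets overlapping occurrences be reported as well.
SITE = re.compile(r"(?=(GGTCTC[ATGC]{4}))")


def find_sites(seq):
    """Return (start index, matched 10-mer) for each occurrence in seq."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_sites("AAGGTCTCACGTTT"))
```

A site with fewer than four bases after GGTCTC is not reported, because the claimed pattern requires all ten positions.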
6. The oligonucleotide according to claim 1, wherein the first amplification sequence further comprises an XbaI restriction site, and the first amplification sequence has the general formula (M)₁₀₋₂₀TCTAGA(M)₂₋₈GGTCTC(H)₈₋₂₅, where M is A, T, C or G, and H is A, T, C or G and is identical to the coding sequence of the first single base identification module.
7. The oligonucleotide according to claim 1, wherein the second amplification sequence further comprises an XhoI restriction site, and the second amplification sequence has the general formula (M')₁₀₋₂₀CTCGAG(M')₂₋₈GGTCTC(H')₈₋₂₅, where M' is A, T, C or G, and H' is A, T, C or G and matches the coding sequence of the second single base identification module.
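The general formulas of claims 6 and 7 can be checked mechanically. The sketch below assumes the formulas read as 10 to 20 arbitrary bases, the XbaI site TCTAGA (or the XhoI site CTCGAG), 2 to 8 arbitrary bases, GGTCTC, and then 8 to 25 bases. That reading, and the function names, are assumptions. The check covers the layout only; it does not verify that the final stretch agrees with a module's coding sequence.

```python
import re

BASE = "[ATCG]"

# Assumed reading: (M)10-20 + site + (M)2-8 + GGTCTC + (H)8-25.
FIRST_AMPLIFICATION = re.compile(
    rf"{BASE}{{10,20}}TCTAGA{BASE}{{2,8}}GGTCTC{BASE}{{8,25}}"
)
SECOND_AMPLIFICATION = re.compile(
    rf"{BASE}{{10,20}}CTCGAG{BASE}{{2,8}}GGTCTC{BASE}{{8,25}}"
)


def matches_first_formula(seq):
    """True if seq has the assumed layout of the first amplification sequence."""
    return FIRST_AMPLIFICATION.fullmatch(seq.upper()) is not None


def matches_second_formula(seq):
    """True if seq has the assumed layout of the second amplification sequence."""
    return SECOND_AMPLIFICATION.fullmatch(seq.upper()) is not None
```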
8. The oligonucleotide according to claim 1, wherein the first amplification sequence and the second amplification sequence satisfy one of the following conditions:
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 1, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 10;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 2, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 11;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 3, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 12;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 4, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 13;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 5, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 14;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 6, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 15;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 7, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 16;
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 8, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 17; or
the first amplification sequence has the nucleotide sequence shown in SEQ ID NO: 9, and the second amplification sequence has the nucleotide sequence shown in SEQ ID NO: 18.
9. An oligonucleotide library, comprising:
a plurality of isolated oligonucleotides, wherein each isolated oligonucleotide is the oligonucleotide according to any one of claims 1 to 8.
10. The oligonucleotide library according to claim 9, wherein different oligonucleotides are placed in separate containers.
11. A method of constructing a nucleic acid expressing a TALE repeat sequence, comprising:
providing a first oligonucleotide and a second oligonucleotide, each being the oligonucleotide according to any one of claims 1 to 8;
cleaving the first oligonucleotide and the second oligonucleotide with a Type IIs endonuclease to obtain a first oligonucleotide cleavage product and a second oligonucleotide cleavage product, wherein the first oligonucleotide cleavage product has sticky ends formed at its first amplification sequence and its second amplification sequence, the second oligonucleotide cleavage product has sticky ends formed at its first amplification sequence and its second amplification sequence, and the sticky end formed at the second amplification sequence of the first oligonucleotide cleavage product matches the sticky end formed at the first amplification sequence of the second oligonucleotide cleavage product; and
ligating the first oligonucleotide cleavage product to the second oligonucleotide cleavage product to obtain an oligonucleotide expressing a TALE repeat sequence.
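The matching condition in the ligation step of claim 11 can be illustrated with a toy model. This is a deliberately simplified sketch, not the claimed procedure: each cleavage product is reduced to a left overhang, a body and a right overhang, all written on the same strand, so that two sticky ends "match" when their strings are equal. The class `CutProduct`, the function `ligate` and the example sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CutProduct:
    left: str   # overhang formed at the first amplification sequence
    body: str   # retained sequence between the two overhangs
    right: str  # overhang formed at the second amplification sequence


def ligate(first, second):
    """Join two products if the right end of `first` matches the left end of `second`."""
    if first.right != second.left:
        raise ValueError("sticky ends do not match")
    # The shared overhang appears once in the joined product.
    return CutProduct(first.left, first.body + first.right + second.body, second.right)


if __name__ == "__main__":
    a = CutProduct("AAAA", "ACGTAC", "CCCC")
    b = CutProduct("CCCC", "TTGACA", "GGGG")
    print(ligate(a, b))
```

Writing both overhangs on one strand avoids handling reverse complements, which keeps the example short at the cost of ignoring fragment orientation.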
12. The method according to claim 11, wherein the RVD sequences of the first oligonucleotide and the second oligonucleotide are determined based on the following correspondence:
the RVD corresponding to base A is NI;
the RVD corresponding to base C is HD;
the RVD corresponding to base T is NG; and
the RVD corresponding to base G is NN.
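The base-to-RVD correspondence of claim 12 is a simple lookup. The sketch below assumes the widely used TALE code (A to NI, C to HD, T to NG, G to NN); the names `RVD_FOR_BASE` and `rvd_sequence` are illustrative.

```python
# Assumed correspondence between target bases and RVDs.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def rvd_sequence(target):
    """Translate a target DNA sequence into its list of RVDs, one per base."""
    rvds = []
    for position, base in enumerate(target.upper()):
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base {base!r} at position {position}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds


if __name__ == "__main__":
    print(" ".join(rvd_sequence("ACTG")))
```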
13. The method according to claim 11, wherein at least one of the first oligonucleotide and the second oligonucleotide comprises a plurality of different oligonucleotides, wherein a sticky end of each oligonucleotide matches at most one sticky end of the other oligonucleotides.
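The "at most one match" condition of claim 13 is what keeps the assembly order unambiguous, and it can be checked before mixing fragments. The following is a rough sketch under simplifying assumptions: each fragment is given as a (left overhang, right overhang) pair written on the same strand, and only right-to-left joins are considered. The function name is hypothetical.

```python
def ends_are_unambiguous(fragments):
    """Check that each sticky end matches at most one end among the other fragments.

    fragments: list of (left_overhang, right_overhang) tuples, same-strand notation.
    """
    for i, (left_i, right_i) in enumerate(fragments):
        others = [frag for j, frag in enumerate(fragments) if j != i]
        # The right end of fragment i may pair with at most one left end.
        if sum(left == right_i for left, _ in others) > 1:
            return False
        # The left end of fragment i may pair with at most one right end.
        if sum(right == left_i for _, right in others) > 1:
            return False
    return True


if __name__ == "__main__":
    chain = [("AAAA", "CCCC"), ("CCCC", "GGGG"), ("GGGG", "TTTT")]
    print(ends_are_unambiguous(chain))
```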
14. A method of altering the genome of a cell, comprising:
determining the RVD sequence of a TALE for a predetermined target nucleic acid sequence in the genome of the cell;
constructing, by the method according to any one of claims 11 to 13, an oligonucleotide expressing a TALE repeat sequence, the TALE repeat sequence being capable of specifically recognizing the target nucleic acid sequence; and
introducing the oligonucleotide expressing the TALE repeat sequence and a nucleic acid encoding a TALE DNA-modifying enzyme into the cell.
15. The method according to claim 14, wherein the target nucleic acid sequence is 4 to 20 nt in length, preferably 12 to 20 nt.
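The length limits of claim 15 are easy to apply as a pre-check on a candidate target. A small sketch follows; the function name and the three labels are illustrative, while the two example sequences are the targets recited in claim 26.

```python
def classify_target_length(target):
    """Classify a target by length: 12 to 20 nt preferred, 4 to 20 nt allowed."""
    n = len(target)
    if 12 <= n <= 20:
        return "preferred"
    if 4 <= n <= 20:
        return "allowed"
    return "out of range"


if __name__ == "__main__":
    for seq in ("TGACACAGAGATGCCATT", "GAATCAGCTCTGTGG"):
        print(seq, len(seq), classify_target_length(seq))
```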
16. The method according to claim 14, wherein the TALE DNA-modifying enzyme is Fok I.
17. The method according to claim 14, wherein the oligonucleotide expressing the TALE repeat sequence and the nucleic acid encoding the TALE DNA-modifying enzyme are constructed on the same vector.
18. An isolated polypeptide, wherein the polypeptide is encoded by the oligonucleotide according to any one of claims 1 to 8.
19. A vector, comprising a nucleic acid molecule encoding the oligonucleotide according to any one of claims 1 to 8.
20. The vector according to claim 19, wherein the vector is a plasmid.
21. A plasmid library, comprising:
a plurality of plasmids, wherein each plasmid is the vector according to claim 19 or 20.
22. The plasmid library according to claim 21, wherein different plasmids are placed in separate containers.
23. A cell, wherein the cell is obtained by altering its genome by the method according to any one of claims 14 to 17.
24. A method of modifying a gene, comprising:
selecting, based on a target sequence of the gene, at least two corresponding plasmids from the plasmid library according to claim 21 or 22;
cleaving the at least two plasmids with a Type IIs endonuclease to obtain a plurality of plasmid cleavage products, wherein sticky ends are formed at both ends of each plasmid cleavage product, and a sticky end of each plasmid matches at most one sticky end of the other plasmids;
ligating the plurality of plasmid cleavage products to obtain an oligonucleotide expressing a TALE repeat sequence, the TALE repeat sequence recognizing the target sequence of the gene; and
altering the gene by means of a TALE DNA-modifying enzyme, based on recognition of the target sequence of the gene by the TALE repeat sequence.
25. The method according to claim 24, wherein the corresponding plasmids are determined based on the following correspondence:
the RVD corresponding to base A is NI;
the RVD corresponding to base C is HD;
the RVD corresponding to base T is NG; and
the RVD corresponding to base G is NN.
26. The method according to claim 24, wherein the gene is PPARγ, and the target sequence is at least one of TGACACAGAGATGCCATT and GAATCAGCTCTGTGG.
PCT/CN2014/075403 2013-04-16 2014-04-15 Isolated oligonucleotide and use thereof WO2014169810A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201480011125.2A CN105008536A (en) 2013-04-16 2014-04-15 Isolated oligonucleotide and use thereof
HK16102277.2A HK1214302A1 (en) 2013-04-16 2016-02-26 Isolated oligonucleotide and use thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310132864 2013-04-16
CN201310132864.4 2013-04-16

Publications (1)

Publication Number Publication Date
WO2014169810A1 (en) 2014-10-23

Family

ID=51730802

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/075403 WO2014169810A1 (en) 2013-04-16 2014-04-15 Isolated oligonucleotide and use thereof

Country Status (3)

Country Link
CN (1) CN105008536A (en)
HK (1) HK1214302A1 (en)
WO (1) WO2014169810A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011072246A2 (en) * 2009-12-10 2011-06-16 Regents Of The University Of Minnesota Tal effector-mediated dna modification
WO2011146121A1 (en) * 2010-05-17 2011-11-24 Sangamo Biosciences, Inc. Novel dna-binding proteins and uses thereof

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012106725A2 (en) * 2011-02-04 2012-08-09 Sangamo Biosciences, Inc. Methods and compositions for treating occular disorders
US9315788B2 (en) * 2011-04-05 2016-04-19 Cellectis, S.A. Method for the generation of compact TALE-nucleases and uses thereof
BR112013025567B1 (en) * 2011-04-27 2021-09-21 Amyris, Inc METHODS FOR GENOMIC MODIFICATION
WO2012152912A1 (en) * 2011-05-12 2012-11-15 Newvectys Genetically modified pig as a cancer prone model
EP2736538A4 (en) * 2011-07-27 2015-08-05 Broad Inst Inc Compositions and methods of treating head and neck cancer
CN102787125B (en) * 2011-08-05 2013-12-04 北京大学 Method for building TALE (transcription activator-like effector) repeated sequences
CN102864158B (en) * 2012-09-29 2014-04-23 北京大学 High-efficiency synthesis method of TALE (transcription activator like effectors) repeated segments for genetic fixed-point modification

Also Published As

Publication number Publication date
CN105008536A (en) 2015-10-28
HK1214302A1 (en) 2016-07-22

Similar Documents

Publication Publication Date Title
CN105518138B (en) Method for specifically knocking out pig GFRA1 gene by CRISPR-Cas9 and sgRNA for specifically targeting GFRA1 gene
CN113699135B (en) Adenine base editor fusion protein without PAM limitation and application thereof
WO2016197355A1 (en) Crispr-cas9 method for specific knockout of swine sall1 gene and sgrna for use in targeting specifically sall1 gene
WO2016197361A1 (en) Method for specific knockout of swine ggta1 gene using crispr-cas9 specificity, and sgrna used for specifically targeting ggta1 gene
CN108441520A (en) The gene conditionity knockout technique built using CRISPR/Cas9 systems
CN106701808A (en) DNA polymerase I defective strain and construction method thereof
WO2019153902A1 (en) Plant genome site-directed substitution method
Carninci et al. Balanced-size and long-size cloning of full-length, cap-trapped cDNAs into vectors of the novel λ-FLC family allows enhanced gene discovery rate and functional analysis
WO2013143438A1 (en) Nucleic acid molecular cloning method based on homologous recombination, and related reagent kit
CN110551761B (en) CRISPR/Sa-SepCas9 gene editing system and application thereof
WO2021083183A9 (en) Hematopoietic stem cell hbb gene repair method and product
CN114829600A (en) Plant MAD7 nuclease and PAM recognition capacity of amplification thereof
WO2019206236A1 (en) Implementation of efficient and precise targeted integration by means of tild-crispr
WO2018113799A1 (en) Method and test kit for constructing simplified genomic library
CN110835635A (en) Plasmid construction method for promoting expression of multiple tandem sgRNAs by different promoters
JPS584799A (en) Dna arrangement
US10793867B2 (en) Methods for targeted transgene-integration using custom site-specific DNA recombinases
KR20220151175A (en) RNA-guided genomic recombination at the kilobase scale
EP4047087A1 (en) Construction of high-fidelity crispr/ascpf1 mutant and application thereof
WO2014169810A1 (en) Isolated oligonucleotide and use thereof
CN113549650B (en) CRISPR-SaCas9 gene editing system and application thereof
CN112979822B (en) Construction method of disease animal model and fusion protein
CN111235153B (en) sgRNA for targeted knockout of human MC1R gene and cell strain constructed by same
CN113249409A (en) BMI1 gene-deleted zebra fish
CN105695509B (en) Method for obtaining high-purity myocardial cells

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14784580

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07.03.2016)

122 Ep: pct application non-entry in european phase

Ref document number: 14784580

Country of ref document: EP

Kind code of ref document: A1