WO2022142963A1 - Vector system for screening regulatory sequences and application - Google Patents

Publication number
WO2022142963A1
WO2022142963A1 PCT/CN2021/134329 CN2021134329W WO2022142963A1 WO 2022142963 A1 WO2022142963 A1 WO 2022142963A1 CN 2021134329 W CN2021134329 W CN 2021134329W WO 2022142963 A1 WO2022142963 A1 WO 2022142963A1
Authority
WO
WIPO (PCT)
Prior art keywords
library
random
sequence
fragment
tag
Prior art date
Application number
PCT/CN2021/134329
Other languages
French (fr)
Chinese (zh)
Inventor
施金秀
罗燕
肖晓丹
叶知晟
蓝田
Original Assignee
云舟生物科技(广州)股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 云舟生物科技(广州)股份有限公司 filed Critical 云舟生物科技(广州)股份有限公司
Publication of WO2022142963A1 publication Critical patent/WO2022142963A1/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70: Vectors or expression systems specially adapted for E. coli
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/65: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the invention belongs to the field of bioengineering, and more particularly relates to a vector system and application for screening regulatory sequences.
  • A promoter or enhancer is a DNA sequence located upstream or downstream of the 5' region of a structural gene. It binds specific RNA polymerases and related transcription factors precisely, thereby initiating transcription of the downstream gene, and is the most important cis-acting element for regulating gene expression.
  • Eukaryotic promoters contain three conserved sequences with important biological functions: the TATA box, located in the -35 to -25 region; the CAAT box, located in the -80 to -70 region; and the GC box, located in the -110 to -80 region.
  • The TATA box is involved in regulating the precise transcription initiation of downstream genes, and the CAAT box and GC box are involved in regulating the frequency of transcription initiation.
  • Although these three functional regions are important determinants of promoter activity, not every promoter contains all three, and a change in any base or in the relative position of these regions often causes dramatic changes in promoter activity and specificity.
  • Promoter or enhancer activity upstream of a gene, together with regulatory sequences near the gene such as the 5'UTR and 3'UTR, are key factors in determining whether the downstream gene is expressed and whether its expression level is appropriate. Therefore, to obtain better expression of a target gene, it is particularly important to modify and screen promoter regulatory sequences in vitro using molecular directed evolution.
  • Natural evolution is a long process of survival of the fittest and the accumulation of favorable mutations.
  • researchers simulate the natural evolutionary mechanisms of mutation, recombination and selection in vitro, so that evolution develops in the expected direction.
  • Early researchers mainly used physical methods, chemical methods, mutagenic strains or error-prone PCR to introduce random mutations into protein-coding genes, and then performed functional screening at the cellular or animal level to obtain proteins with new functions or improved performance. Although these methods can improve some properties of proteins to a certain extent, the diversity they generate is far from sufficient.
  • DNA shuffling involves breaking multiple related genes of a gene family from different sources into random fragments by DNase I digestion or sonication. Using the homology between fragments, which serve as templates and primers for each other, the fragments are reassembled into full-length genes by primerless PCR. Template switching or crossover events during this process increase the diversity of the mutant library.
  • The protein mutants are then amplified using 5' and 3' primers specific to the different protein-coding frames, and cloned into suitable cloning vectors to form mutant libraries.
  • The library diversity (10^6 or more) is verified by NGS sequencing, and functional screening is finally performed at the cellular or animal level to obtain a protein with improved properties.
  • This method is mainly aimed at the directed evolution of protein molecules. Different genes used as starting templates need to have a certain degree of homology, so as to generate in vitro homologous recombination between small fragments and introduce mutations to form a mutant library for screening.
  • The purpose of promoter or enhancer DNA shuffling is to enhance promoter activity or to specifically change the expression characteristics of a gene.
  • However, the promoters or enhancers used as shuffling templates often have extremely low homology to one another, so the DNA shuffling technology described above for protein molecules cannot be applied directly to the directed evolution of promoter regulatory sequences.
  • Promoter shuffling generally adopts the following technical route: (1) perform two rounds of error-prone PCR on a single promoter and recover the PCR products (forming a large number of mutants with homologous sequences); (2) digest with DNase I or sonicate into random fragments and recover them; (3) use the recovered product as a template for primer-free PCR; (4) add specific primers containing specific restriction sites to the primer-free PCR system to amplify the full-length promoter, and recover products of the expected size; (5) digest the cloning vector and the full-length promoter mutants with the corresponding restriction enzymes and ligate them; (6) verify the diversity of the promoter library by NGS sequencing.
  • the technical problem to be solved by the present invention is to provide a vector system and application for screening regulatory sequences.
  • the purpose of the present invention is to overcome the limitations of the promoter shuffling method, namely insufficient library diversity caused by the low homology of promoters from different sources and the resulting inability to carry out efficient in vitro recombination, and to provide a method for constructing a functional element library.
  • a first aspect of the present invention provides a plasmid vector comprising: an index tag, a reporter gene and a barcode tag;
  • the barcode label is a random fragment with a length of 5-200 bp;
  • the number of the index tags is at least 1, and it is independently selected from random fragments with a length of 5-100 bp;
  • the expression product of the reporter gene is capable of emitting light by itself, of producing light or a color change by catalyzing a substrate reaction, of producing emitted light or a color change when irradiated with excitation light, or of conferring resistance in the corresponding drug screening.
  • the barcode tag is a random fragment with a length of 40bp; the number of the index tags is 2, wherein index1 is a random fragment with a length of 30bp, index2 is a random fragment with a length of 30bp, and the reporter gene At least one selected from the group consisting of fluorescent protein, luciferase, LacZ gene or a resistance gene that can play a screening role, and the resistance gene includes a puromycin resistance gene.
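To give a sense of scale for these tag lengths, the following minimal sketch computes how many distinct sequences a random fragment of a given length can take (4^n) and a rough birthday-style estimate of how many tag pairs would collide in a library of a given size. The function names and the example library size are illustrative assumptions and do not come from the patent text; tags are assumed to be drawn uniformly at random.

```python
def tag_space(length_bp: int) -> int:
    """Number of distinct DNA sequences of the given length (4 bases per position)."""
    return 4 ** length_bp


def expected_collisions(library_size: int, length_bp: int) -> float:
    """Approximate expected number of tag pairs that share the same sequence,
    assuming uniformly random tags: n * (n - 1) / 2 pairs, each colliding
    with probability 1 / 4**length_bp."""
    pairs = library_size * (library_size - 1) / 2
    return pairs / tag_space(length_bp)


if __name__ == "__main__":
    # Tag lengths taken from the text: 40 bp barcode, 30 bp index1 and index2.
    for name, length in [("barcode", 40), ("index1", 30), ("index2", 30)]:
        print(f"{name} ({length} bp): {tag_space(length):.3e} possible sequences")
    # Hypothetical library of ten million clones (an assumption for illustration).
    print("expected barcode collisions:", expected_collisions(10_000_000, 40))
```

Under these assumptions the tag space is far larger than any practical library, which is consistent with using the tags as near-unique identifiers.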
  • Some provided herein include a vector backbone with a first terminator, a recombination site, a reporter gene, a multiple cloning site (MCS), a post-transcriptional regulatory sequence (WPRE), and a second terminator sequentially attached to the vector backbone.
  • MCS multiple cloning site
  • WPRE post-transcriptional regulatory sequence
  • the vector further comprises at least one enzyme cleavage site.
  • the expression product of the reporter gene is capable of emitting light by itself, of producing light or a color change by catalyzing a substrate reaction, of producing emitted light or a color change when irradiated with excitation light, or of conferring resistance in the corresponding drug screening.
  • the reporter gene is selected from at least one of fluorescent protein, luciferase, LacZ gene or a resistance gene that can play a screening role, such as a puromycin resistance gene.
  • TurboGFP is selected as the reporter gene.
  • the first terminator and the second terminator are elements capable of terminating transcription.
  • the terminator is selected from an SV40 terminator, hGH terminator, BGH terminator or rbGlob terminator.
  • both the first terminator and the second terminator are selected as BGH terminators, denoted as BGH-pA.
  • the plasmid vector sequentially comprises the following elements: pUC ori, 5' ITR, BGH pA, index1, index2, reporter gene, barcode tag, WPRE, BGH pA, 3' ITR, and a resistance selection marker.
  • an enzyme cleavage site and a random recombination regulatory sequence are also included between the index1 and index2.
  • the enzyme cleavage site is AsiSI; the number of the enzyme cleavage site is 2, located at both ends of the random recombination control sequence;
  • the random recombination regulatory sequence is a fragmented promoter fragment or a fragmented enhancer fragment.
  • the random recombination regulatory sequence is an enzymatically digested promoter fragment or an enzymatically digested enhancer fragment.
  • the fragmentation method in the step of preparing the random recombination regulatory sequence, includes enzymatic digestion, ultrasonication or artificial synthesis.
  • the enzyme used for the digestion is DNase I.
  • the promoter is selected from hRO, hRK, mCAR, ProA1, CMV, EF1A, EFS, CAG, CBh, SFFV, MSCV, SV40, mPGK, hPGK, UBC, Nanog, Nes, Tuba1a, Camk2a, SYN1, Hb9, Th, NSE, GFAP, Iba1, hRHO, hBEST1, Prnp, Cnp, K14, BK5, mTyr, cTnT, αMHC, Myog, ACTA1, MHCK7, SM22a, EnSM22a, Runx2, OC, Col1a1, Col2a1, aP2, Adipoq, Tie1, Cd144, CD68, CD11b, Afp, Alb, TBG, MMTV,
  • the plasmid vector includes the following elements in sequence: pUC ori, 5'ITR, BGH pA, index1, AsiSI restriction site, index2, Kozak, TurboGFP gene, barcode tag, WPRE, BGH pA, 3'ITR and Amp resistance screening marker; wherein, the barcode label is a random fragment with a length of 40bp, the index1 is a random fragment with a length of 30bp, and the index2 is a random fragment with a length of 30bp;
  • the plasmid vector sequentially includes the following elements: pUC ori, 5'ITR, BGH pA, index1, AsiSI restriction site, random recombination control sequence, AsiSI restriction site, index2, Kozak, TurboGFP gene , barcode tags, WPRE, BGH pA, 3'ITR and Amp resistance screening markers; wherein, the length of the random recombination control sequence is 50-2000bp, which is the promoter fragment after DnaseI digestion or the enhancer fragment after enzyme digestion , the barcode label is a random fragment with a length of 40bp, the index1 is a random fragment with a length of 30bp, and the index2 is a random fragment with a length of 30bp.
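The element order described above can be written down as a simple ordered data structure, which makes positional statements (for example, that the barcode tag sits between the reporter gene and WPRE) easy to check. This is only an illustrative sketch: the `Element` class, the list name and the helper functions are assumptions introduced here, and lengths are filled in only where the text states them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    name: str
    length_bp: Optional[int] = None  # None where the text gives no fixed length


# Element order as listed in the text for the regulatory sequence library vector.
REGULATORY_LIBRARY_VECTOR: List[Element] = [
    Element("pUC ori"), Element("5' ITR"), Element("BGH pA"),
    Element("index1", 30), Element("AsiSI site"),
    Element("random recombination control sequence"),  # 50-2000 bp per the text
    Element("AsiSI site"), Element("index2", 30), Element("Kozak"),
    Element("TurboGFP"), Element("barcode", 40), Element("WPRE"),
    Element("BGH pA"), Element("3' ITR"), Element("Amp resistance marker"),
]


def position(layout: List[Element], name: str) -> int:
    """Index of the first element with the given name."""
    return next(i for i, e in enumerate(layout) if e.name == name)


def is_between(layout: List[Element], name: str, left: str, right: str) -> bool:
    """True if `name` lies after `left` and before `right` in the layout."""
    return position(layout, left) < position(layout, name) < position(layout, right)


if __name__ == "__main__":
    print(is_between(REGULATORY_LIBRARY_VECTOR, "barcode", "TurboGFP", "WPRE"))
    print(is_between(REGULATORY_LIBRARY_VECTOR,
                     "random recombination control sequence", "index1", "index2"))
```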
  • the second aspect of the present invention provides a method for constructing the plasmid vector, wherein barcode tags, index tags and random recombination control sequences are inserted into a backbone vector containing a reporter gene.
  • the present invention does not limit the order in which the barcode tag, index tags and random recombination control sequence are inserted.
  • Any method of connecting a vector and a nucleic acid fragment that can be used in the art can be used in the present invention.
  • the insert fragment and the vector are ligated after enzyme digestion, or the fragment and the vector are ligated by Gibson cloning reaction.
  • the insertion of the barcode tag is as follows: preparing a barcode tag carrying the homology arm of the backbone vector, making it react with the linearized backbone vector through Gibson cloning, and constructing a tag library.
  • the insertion of the index tag and the random recombination regulatory sequence includes:
  • the tag library is linearized; the fragments are then ligated to the linearized tag library to obtain a library of regulatory sequences.
  • the method for preparing the random recombination regulatory sequence comprises enzymatically digesting the promoter or enhancer.
  • the enzyme is the DnaseI enzyme.
  • the preparation of the homology arm 1-index tag 1-enzyme cleavage site 1-random recombination regulatory sequence-enzyme cleavage site 2-index tag 2-homology arm 2 fragment specifically includes:
  • the primer F and primer R are annealed to form a Y-shaped adapter; the structure of the primer F is homology arm 1-index tag 1-restriction site 1-protection sequence 1; the structure of the primer R is protection sequence 2- Restriction site 2-index tag 2-homology arm 2; the protection sequence 1 and protection sequence 2 are complementary;
  • a linear fragment is obtained by PCR on the random long fragment of the functional element containing the Y-shaped linker
  • the linear fragment is ligated with the linearized tag library to construct a library of regulatory sequences.
  • the method for constructing the plasmid vector of the present invention further includes removing the random recombination regulatory sequence from the regulatory sequence library by enzyme digestion to obtain an index tag library.
  • the third aspect of the present invention provides the application of the vector described in the first aspect of the present invention in library construction or functional element screening.
  • the fourth aspect of the present invention provides a method for library construction, using a Y-shaped linker to integrate randomly interrupted sequences into the vector.
  • the integration site is the recombination site of the vector.
  • the Y-shaped linker is structurally divided into a complementary region and a non-complementary region.
  • non-complementary sequences at the 5' end of the Y-shaped linker respectively comprise the first homology arm, the second homology arm, the first index sequence and the second index sequence from the front and rear ends of the backbone vector cloning site,
  • the complementary sequence at the 3' end contains the restriction enzyme cleavage site.
  • the structure of the Y-shaped linker is a first homology arm, a first index sequence, an enzyme cutting site, a random sequence embedding site, an enzyme cutting site, a second index sequence and a second homology arm.
  • the homologous sequence facilitates subsequent Gibson cloning reactions with the backbone vector.
  • the enzyme cleavage site is different from the enzyme cleavage site on the vector described in the first aspect of the present invention, and the enzyme cleavage site can be used for sequencing verification of functional elements after functional screening.
  • the enzyme cleavage site is selected as an AsiSI enzyme cleavage site.
  • the Y-shaped linker is prepared by annealing PCR primers.
  • downstream primer of the PCR also has an enzyme cleavage site.
  • the Y-shaped linker is prepared by mixing and annealing primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO.3) and primer B: Phos-GAATGAAGCGATCGCNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO.4).
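The Y shape follows from the primer sequences themselves: the fixed 3' end of primer A and the fixed 5' end of primer B are reverse complements of each other (and contain the AsiSI site), while the outer arms are not complementary. The short sketch below checks this on the fixed portions of SEQ ID NO.3 and SEQ ID NO.4 as printed above. The helper function and constant names are assumptions made for illustration, and the random N stretches are left out of the check.

```python
# Translation table for complementing DNA; N pairs with N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Fixed (non-random) portions of the two primers, copied from the text.
PRIMER_A_5P_ARM = "GGGCTCACCTCAGGCTACGG"  # 5' arm of primer A, before the N stretch
PRIMER_A_3P_END = "GCGATCGCTTCATTC"       # 3' end of primer A, after the N stretch
PRIMER_B_5P_END = "GAATGAAGCGATCGC"       # 5' end of primer B, before the N stretch
PRIMER_B_3P_ARM = "CCCTGACGTAGGCTGACGGC"  # 3' arm of primer B, after the N stretch
ASISI_SITE = "GCGATCGC"

if __name__ == "__main__":
    # Complementary region: these two ends can anneal to form the stem.
    print("stem anneals:", reverse_complement(PRIMER_A_3P_END) == PRIMER_B_5P_END)
    # Non-complementary region: the outer arms stay single-stranded.
    print("arms anneal:", reverse_complement(PRIMER_B_3P_ARM) == PRIMER_A_5P_ARM)
    print("AsiSI site in stem:", ASISI_SITE in PRIMER_A_3P_END)
```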
  • the method according to the third aspect of the present invention comprises the following steps:
  • the label in step S01 is a random sequence with about 40 bases
  • the random tag sequence is located downstream of the fluorescent tracer screening gene TurboGFP,
  • the random tag sequence is located between the fluorescent tracer screening gene TurboGFP and polyA, and the Barcode sequence can be determined at the mRNA level, thereby indirectly determining its corresponding functional element sequence.
  • More specifically, the operations of step S01 are:
  • the vector is linearized by single enzyme cleavage in step a.
  • the upstream primer of the primer in step b contains the random tag sequence.
  • both the upstream and downstream primers of the primers in step b contain restriction enzyme cleavage sites.
  • the PCR fragment thus amplified can be ligated with the digested vector backbone after digestion.
  • the ligation product is transformed into E. coli for storage.
  • XbaI is used to cut the backbone vector
  • the primers used are: F-terminal primer: CACCAAGGAAGCCCTCGAGGACGCGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGATCCCGACCTACCGACCCAGCTTTC (SEQ ID NO.1) and R-terminal primer AGGCGAAGACGCGGAAGAGG (SEQ ID NO.2).
  • the specific technical route of step S01 is as follows: use a specific restriction enzyme to excise the MCS and part of the element sequence from the backbone vector, and recover the large backbone fragment; PCR-amplify the backbone vector with primers whose 5' end carries the random Barcode tag sequence and a homology arm, to obtain a PCR product carrying the Barcode; subject the PCR product to a Gibson cloning reaction with the backbone vector recovered after enzyme digestion to construct a tag library, also referred to as the Barcode library.
  • the diversity of the tagged Barcode library can also be verified by high-throughput NGS sequencing.
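As an illustration of how Barcode diversity might be tallied from sequencing reads, the sketch below extracts a 40-base random tag flanked by the fixed sequences that surround the N stretch in the F-terminal primer (ACGCGT upstream, GGATCC downstream) and counts distinct barcodes. The regular expression, the function names and the assumption that reads are in the primer's orientation are all illustrative; this is not the analysis pipeline used by the inventors.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Flanking sequences taken from the F-terminal primer (SEQ ID NO.1);
# the 40-base tag length follows the text.
BARCODE_RE = re.compile(r"ACGCGT([ACGT]{40})GGATCC")


def extract_barcode(read: str) -> Optional[str]:
    """Return the 40-base barcode found between the flanks, or None."""
    match = BARCODE_RE.search(read)
    return match.group(1) if match else None


def barcode_counts(reads: Iterable[str]) -> Counter:
    """Count how often each barcode occurs across reads; reads without one are skipped."""
    return Counter(bc for bc in map(extract_barcode, reads) if bc is not None)


if __name__ == "__main__":
    # Toy reads built for illustration only.
    bc1, bc2 = "ACGT" * 10, "TTGCA" * 8
    reads = [
        "TTT" + "ACGCGT" + bc1 + "GGATCC" + "AAA",
        "CCC" + "ACGCGT" + bc1 + "GGATCC" + "GGG",
        "ACGCGT" + bc2 + "GGATCC",
        "NO_BARCODE_HERE",
    ]
    counts = barcode_counts(reads)
    print("distinct barcodes:", len(counts))
```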
  • step S02 utilizes the Y-shaped linker to integrate the randomly interrupted sequence of the functional element into the vector.
  • step S02 is:
  • step d the functional element fragments are randomly broken into fragments smaller than 100 bp.
  • the functional element fragments are randomly broken into fragments of about 50 bp in step d.
  • the nucleic acid fragments are randomly interrupted and then blunted to form blunt-ended short fragments of different sizes.
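The fragmentation and size selection described here can be pictured with a small simulation: cut a sequence at random positions, then keep only fragments within a target size window. This is a deliberately simplified sketch; it assumes uniformly random cut positions (real DNase I digestion is not perfectly uniform), and the function names, seed and size window are illustrative assumptions.

```python
import random
from typing import List


def random_fragments(seq: str, n_cuts: int, rng: random.Random) -> List[str]:
    """Cut `seq` at `n_cuts` distinct random internal positions."""
    cuts = sorted(rng.sample(range(1, len(seq)), n_cuts))
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments: List[str], min_bp: int, max_bp: int) -> List[str]:
    """Keep fragments whose length lies in [min_bp, max_bp], like gel size selection."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy run is reproducible
    # Hypothetical 500 bp element made of random bases.
    element = "".join(rng.choice("ACGT") for _ in range(500))
    fragments = random_fragments(element, n_cuts=12, rng=rng)
    selected = size_select(fragments, min_bp=30, max_bp=100)
    print(len(fragments), "fragments,", len(selected), "within 30-100 bp")
```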
  • the nucleic acid fragments of the functional elements in step d are nucleic acid fragments of multiple functional elements of a specific function.
  • the yield of random long fragments of functional elements containing Y-shaped linkers can be increased by PCR, and the random long fragments of functional elements containing Y-shaped linkers can be reconfigured into double-stranded DNA fragments.
  • the PCR product was purified and then ligated with the tagged Barcode library.
  • the primers used in this step are: F2:CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG (SEQ ID NO.5);
  • R2 GTCTAGACCTCGAGGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG (SEQ ID NO. 6).
  • the Y-shaped adapter in step e is prepared by annealing PCR primers.
  • downstream primer of the PCR also has an enzyme cleavage site.
  • the Y-shaped linker is prepared by mixing and annealing primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO.3) and primer B: Phos-GAATGAAGCGATCGCNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO.4).
  • the specific technical route of step S02 is as follows: PCR-amplify and recover, separately, several functional elements with the same tissue specificity or with specific or unknown functions; randomly break them into fragments smaller than 100 bp, blunt the ends, and recover the band of the target size, such as a small band of about 50 bp; add the annealed Y-type adaptor to the random blunt-ended short fragments for ligation and PCR, to obtain a fragment with the structure first homology arm, first index sequence, enzyme cleavage site, random sequence, enzyme cleavage site, second index sequence, second homology arm.
  • the first index sequence is denoted as index1
  • the second index sequence is denoted as index2.
  • the Barcode library obtained in step S01 is digested with XcmI, and the 4845 bp fragment is recovered as the library backbone; the random functional element fragments are ligated to the library backbone and transformed into Escherichia coli DH10B to obtain the promoter library.
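Recovering a backbone of a specific size after digestion comes down to fragment-length arithmetic on a circular plasmid. The sketch below computes the fragment sizes produced by cutting a circle at given positions. The plasmid length and cut positions in the example are hypothetical values chosen so that one fragment is 4845 bp; the text does not state the actual XcmI cut positions or the total plasmid size.

```python
from typing import Iterable, List


def circular_digest(plasmid_len: int, cut_positions: Iterable[int]) -> List[int]:
    """Fragment lengths from cutting a circular plasmid at the given positions.

    An uncut circle yields an empty list; a single cut yields one linear
    fragment equal to the full plasmid length.
    """
    cuts = sorted({p % plasmid_len for p in cut_positions})
    if not cuts:
        return []
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    # The last fragment wraps around the origin of the circle.
    lengths.append(plasmid_len - cuts[-1] + cuts[0])
    return lengths


if __name__ == "__main__":
    # Hypothetical 5000 bp plasmid cut at two hypothetical positions.
    fragments = circular_digest(5000, [100, 255])
    print("fragment sizes:", fragments)
    print("backbone to recover:", max(fragments), "bp")
```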
  • step S03 the specific technical route of step S03 is as follows: the functional element library constructed in step S02 is digested with enzymes, the random sequence embedding site is removed, the vector backbone is recovered and self-ligated, and an index tag library is constructed.
  • the fragments obtained by digesting the functional element library are added to the ligation reaction in several small portions, so that the ligation is intramolecular as far as possible, that is, each linearized fragment self-circularizes; the ligation product is then transformed into E. coli DH10B to obtain the index tag library.
  • the functional element library is digested with enzymes to cut out the random fragments of functional elements, and the backbone is recovered and self-ligated, so that index1, index2 and the Barcode can be sequenced together in a single high-throughput sequencing reaction (the maximum read length of high-throughput NGS sequencing is 1 kb).
  • an index tag library is constructed, and the inventor named the library as the Marriage library.
  • the index1, index2 and Barcode sequences in the Marriage library are amplified by PCR for high-throughput NGS sequencing, and the correspondence between the three is determined through data analysis.
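The data analysis step can be pictured as building a lookup table from sequenced (index1, index2, Barcode) triples. The sketch below is one possible way to do this, under the assumption that the triples have already been parsed from the reads; it keeps only barcodes that map to a single index pair and reports the number of distinct triples as a simple diversity measure. The function names and the rule for discarding ambiguous barcodes are illustrative choices, not taken from the patent.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (index1, index2, barcode)


def build_marriage_table(triples: Iterable[Triple]) -> Dict[str, Tuple[str, str]]:
    """Map each barcode to its (index1, index2) pair.

    Barcodes observed with more than one index pair are treated as
    ambiguous and left out of the table.
    """
    seen = defaultdict(set)
    for index1, index2, barcode in triples:
        seen[barcode].add((index1, index2))
    return {bc: next(iter(pairs)) for bc, pairs in seen.items() if len(pairs) == 1}


def library_diversity(triples: Iterable[Triple]) -> int:
    """Number of distinct (index1, index2, barcode) combinations observed."""
    return len(set(triples))


if __name__ == "__main__":
    # Toy triples with short placeholder sequences, for illustration only.
    observed = [
        ("AAAC", "GGTT", "BC1"),
        ("AAAC", "GGTT", "BC1"),
        ("CCCA", "TTGA", "BC2"),
        ("GGGA", "ACAC", "BC2"),  # BC2 is ambiguous and will be dropped
    ]
    print(build_marriage_table(observed))
    print("distinct triples:", library_diversity(observed))
```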
  • the fifth aspect of the present invention provides the application of the method described in the third aspect of the present invention in screening functional elements.
  • the sixth aspect of the present invention provides a method for screening functional elements, comprising the step of building a library, and the method for constructing the library is the method described in the third aspect of the present invention.
  • the method according to the sixth aspect of the present invention comprises the following steps:
  • the promoter library is firstly transfected into specific cells or microinjected into experimental animals. If it is a viral vector, it needs to be packaged into virus particles to infect cells or live animals, and then the fluorescence expression of TurboGFP is observed.
  • the specific index1 and index2 sequences can be obtained from the correspondence between index1, index2 and Barcode in the Marriage library; finally, using the index1 and index2 sequences as primers and the functional element library as a template, the specific functional elements are amplified by PCR, and functional element sequences with excellent performance (such as small size, high specificity and strong activation ability) are selected.
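The lookup from Barcode back to index primers can be sketched as follows: rank barcodes by read count in the screened sample, look each one up in the barcode-to-index table, and derive a primer pair. This is a hedged illustration only. The ranking by raw read count, the function names and the choice to reverse-complement index2 for the reverse primer (standard practice when index2 is given in top-strand orientation) are assumptions, not details stated in the patent.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primers_for_top_barcodes(
    counts: Dict[str, int],
    marriage: Dict[str, Tuple[str, str]],
    top_n: int = 3,
) -> List[dict]:
    """Return primer pairs for the most abundant barcodes that have a known index pair."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    results: List[dict] = []
    for barcode, reads in ranked:
        if len(results) >= top_n:
            break
        if barcode not in marriage:
            continue  # barcode has no unambiguous index pair in the table
        index1, index2 = marriage[barcode]
        results.append({
            "barcode": barcode,
            "reads": reads,
            "forward_primer": index1,
            "reverse_primer": reverse_complement(index2),
        })
    return results


if __name__ == "__main__":
    # Toy inputs with short placeholder sequences, for illustration only.
    counts = {"BC1": 50, "BC2": 5, "BCX": 100}
    marriage = {"BC1": ("AAAC", "GGTT"), "BC2": ("CCCC", "ACGT")}
    for hit in primers_for_top_barcodes(counts, marriage, top_n=2):
        print(hit)
```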
  • the existing specific functional element shuffling technology is based on in vitro homologous recombination in protein shuffling, and often only a single functional element can be used as an initial shuffling template, resulting in insufficient library diversity.
  • the invention provides a library construction method with high library diversity. It abandons the in vitro homologous recombination technology used in protein shuffling and overcomes the defect that functional elements with similar characteristics, or with specific or unknown functions, cannot be effectively recombined.
  • This method mainly involves the construction of three libraries, namely the Barcode library, the functional element library and the Marriage library. It enables rapid, high-throughput construction of highly diverse promoter or enhancer libraries, can also be applied to the construction and screening of other functional elements, and lays a solid foundation for the final screening of functional elements with excellent performance.
  • Figure 1 is a schematic diagram of the Y-shaped linker;
  • Figure 3 is the vector map of the constructed Barcode tag library;
  • Figure 4 is the vector map of the constructed promoter library;
  • Figure 5 is the vector map of the constructed index tag (Marriage) library;
  • Figure 6 shows the ratio of random fragments to adapters: the ratio of random short fragments to Y-type adapter is 1:3 in lane 1 and 1:1 in lane 2;
  • Figure 7 shows fluorescence images of retinal slices after in situ subretinal injection in mice: the brighter red fluorescence in panel A represents the distribution of cone cells, panel B shows the distribution and expression of the entire library in the retina, and panel C is the co-staining image of panels A and B; the yellow-orange fluorescently labeled cells in panel C can be used for subsequent identification of promoter sequences;
  • AAV vector is selected as an example, and the plasmid map is shown in FIG. 2 .
  • the purpose of the present invention can also be achieved with vectors used for conventional library construction, such as pUC18 or pBR322; different vectors can be selected according to the subsequent screening method and application scenario.
  • Genomic functional elements are the elements involved in the regulation of gene expression, mainly including cis-acting elements and trans-acting elements. Common ones include promoters, enhancers, silencers, regulatory regions and sequences, inducible elements, activators and repressors.
  • Barcode tag: a label used in high-throughput sequencing to distinguish different samples.
  • Index: a tag used in high-throughput sequencing to further distinguish different samples that carry the same Barcode tag.
  • a method for constructing a functional element library comprising the construction of three kinds of libraries, respectively constructing a Barcode library, a functional element library and a Marriage library, and specifically comprising the following steps:
  • the 40 N bases in the F-terminal primer represent random sequencing tag Barcode sequences.
  • the F-terminal primer contains the Mlu I restriction site (ACGCGT); the Tfi I restriction site is located on the transcriptional regulatory element WPRE of the vector backbone, and the amplification product of primers F and R contains the Tfi I restriction site.
  • use DNase I to digest the nucleic acid fragments of the functional elements (the digestion conditions and times differ for fragments of different lengths), so that the promoter fragments are randomly cut into short fragments of different sizes;
  • use the End Repair Module to blunt the ends of the short fragments, forming blunt-ended short fragments of different sizes, which are purified and recovered to obtain random short fragments of the functional elements;
  • primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO. 3) and primer B: Phos-GAATGAA GCGATCGC NNNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO. 4) are mixed and annealed to form a Y-type adapter (containing an AsiSI restriction site);
  • the GCGATCGC sequence in primer B is the AsiSI restriction site.
  • the random short fragments of the functional element and the Y-type adapter are mixed in a certain proportion, and then the ligation reaction is performed to generate a long fragment of the functional element containing the Y-type adapter;
  • the long fragments of functional elements are screened by agarose gel electrophoresis, gel cutting and recovery, and the long fragments of functional elements within the expected range are recovered and purified;
  • the primer sequence of PCR is: F2: CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG (SEQ ID NO.5);
  • R2: GTCTAGACCTCGAGGAGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG (SEQ ID NO. 6).
  • the index1, index2 and Barcode sequences in the index tag Marriage library were amplified by PCR for high-throughput sequencing, and the corresponding relationship between the three was determined by data analysis.
  • the inventors selected four photoreceptor cell-specific promoters hRO, hRK, mCAR and ProA1 as raw materials for DNA shuffling.
  • the difference in seed promoter strength was hRO ⁇ hRK>mCAR>ProA1.
  • the ProA1 promoter is a promoter that is specifically expressed only in cone cells, but its full length is about 2 kb, which is obviously not suitable for AAV vectors.
  • the full-length hRK promoter is only about 500 bp and can be expressed in both cone and rod cells, but its specificity does not meet the expected requirements.
  • the hRO and mCAR promoters are expressed only in rod cells, which again does not meet expectations. Therefore, the inventors used these four promoters for random DNA recombination, selected random recombination fragments of about 500 bp and cloned them into an AAV vector to form a promoter library, and then packaged the resulting highly diverse promoter library into type 8 AAV virus. At the same time, a control virus was used as the reference for cone cell targeting, and subretinal in situ injection was performed at the animal level. By observing the fluorescence expression of TurboGFP (green reporter gene) and Tdtomato (red reporter gene), random recombinant promoters with excellent characteristics were screened out. The specific experimental steps are as follows:
  • use the End Repair Module (E6050S) to blunt the ends of the short fragments;
  • mix primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO. 3) and primer B: Phos-GAATGAA GCGATCGC NNNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO. 4), and anneal them to form a Y-type adaptor (containing an AsiSI restriction site);
  • the primer sequence is: F2: CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG (SEQ ID NO.5);
  • R2 GTCTAGACCTCGAGGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG (SEQ ID NO. 6).
  • the random recombination promoter fragment and the library backbone are ligated, and transformed into Escherichia coli DH10B to obtain a promoter library;
  • PCR amplify the index1, index2 and Barcode sequences in the index tag of the Marriage library for NGS sequencing, and determine the corresponding relationship between the three through data analysis.
  • the diversity of the library was determined to be 8.5 × 10^6 by sequence analysis.
  • the present embodiment also provides a method for screening functional elements, comprising the following steps:
  • using the index1 and index2 sequences as primers and the promoter library as a template, amplify the corresponding promoter fragments by PCR;
  • Figure 6 shows the fluorescence images obtained by subretinal orthotopic injection in mice after premixing the library virus with the control ProA1-Tdtomato virus. Since the ProA1 promoter only specifically targets cone cells, the brighter red fluorescence in Figure A represents the distribution of cone cells, Figure B shows the distribution and expression of the entire library in the retina, and Figure C is Figure A The co-staining map of Figure B, the yellow-orange fluorescently labeled cells in Figure C can be used for subsequent identification of promoter sequences.
  • the library construction method provided by the present invention achieves high library diversity and successfully overcomes the defect that functional elements with similar characteristics, or with specific or unknown functions, cannot be effectively recombined. It is a construction method that quickly and with high throughput achieves high diversity of promoters, enhancers or other functional elements, and it can screen out functional element sequences with excellent performance (such as smaller fragments, high specificity, and strong activation ability).
  • the CMV_en, HBB_en and SV40_en enhancers were used as raw materials for DNA shuffling. All three enhancers play a role in gene regulation; the CMV_en enhancer is 300 bp, the HBB_en enhancer is 3 kb, and the SV40_en enhancer is 237 bp.
  • in order to obtain a new, shorter and positively regulating HBB enhancer, we used these three enhancers for random recombination, selected random recombination fragments of about 800 bp to 1 kb and cloned them into a mammalian enhancer test expression vector containing the SCP1_mini promoter (Figure 8) to form an enhancer library. The enhancer library was then transiently transfected into K562 cells, cells with different fluorescence intensities were isolated by flow sorting, and random recombinant enhancers that met the purpose were further screened.
  • the ligation product is subjected to agarose electrophoresis, and the enhancer recombination fragments of about 800 bp to 1 kb are excised from the gel, recovered and purified to obtain the final random recombinant enhancer fragments.
  • the specific sequence of the candidate enhancer can be obtained by Sanger sequencing.

Abstract

Provided is a library construction method for highly diverse libraries, which abandons in-vitro homologous recombination technology used in protein reorganization, and by introducing a Y-type adaptor to connect to a random fragment which is smaller than 100 bp after being digested by DNase I, successfully solves the defect in which functional elements having similar characteristics or specific or unknown functions cannot be effectively recombined. A label library, a functional element library and an index label library are constructed, and the diversity of functional elements may be quickly achieved and with high throughput, thus laying a foundation for final screening so as to obtain functional elements having excellent performance.

Description

A vector system for screening regulatory sequences and its application
This application claims priority to Chinese patent application No. 202011630533.X, filed with the China Patent Office on December 31, 2020 and entitled "A method for building a library of functional elements and its application", the entire contents of which are incorporated herein by reference.
This application claims priority to Chinese patent application No. 202111247419.3, filed with the China Patent Office on October 26, 2021 and entitled "A vector system and application for screening regulatory sequences", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention belongs to the field of bioengineering and, more specifically, relates to a vector system for screening regulatory sequences and its application.
Background
Gene expression depends on regulatory sequences. A promoter or enhancer, as one type of regulatory sequence, is a DNA sequence located in the region upstream or downstream of the 5' end of a structural gene. It binds precisely to a specific RNA polymerase and the associated transcription factors, thereby initiating transcription of the downstream gene, and is the most important cis-acting element regulating gene expression. Eukaryotic promoters contain three conserved sequences with important biological functions: the TATA box, located in the -35 to -25 region; the CAAT box, located in the -80 to -70 region; and the GC box, located in the -110 to -80 region. The TATA box is involved in regulating precise transcription initiation of the downstream gene, while the CAAT box and the GC box are involved in regulating the frequency of transcription initiation. Although these three functional regions are important contributors to promoter activity, not every promoter contains all three, and a change to any base in these regions or to their relative positions often causes a dramatic change in promoter activity and specificity. The activity of the upstream promoter or enhancer, together with regulatory sequences near the gene such as the 5'UTR and 3'UTR, is a key factor determining whether the downstream gene can be expressed smoothly and at an appropriate level. Therefore, to obtain better expression of a target gene, it is particularly important to modify and screen promoter regulatory sequences in vitro using directed molecular evolution.
Natural evolution is a long process of survival of the fittest in which favorable mutations accumulate. To accelerate this process, researchers simulate the natural evolutionary mechanisms of mutation, recombination and selection in vitro so that evolution proceeds in the desired direction. Early researchers mainly used physical methods, chemical methods, mutator strains or error-prone PCR to introduce random mutations into protein-coding genes, followed by functional screening at the cellular or animal level, to obtain proteins with new functions or improved properties. Although these methods can improve certain protein properties to some extent, the diversity they provide falls far short of what is needed. As molecular biology developed, a new PCR-based technique for directed molecular evolution in vitro was established: DNA shuffling, first proposed by Stemmer in 1994, which can be used for the in vitro directed evolution of nucleic acids and proteins. DNA shuffling involves breaking several related genes of a gene family from different sources into random fragments by DNase I digestion or sonication. The fragments then serve as templates and primers for one another by virtue of their homology and are reassembled into full-length genes by primerless PCR. Template switching or crossover events occur during this process, which increases the diversity of the mutant library. The protein mutants are then amplified with specific 5' and 3' primers for the respective protein coding frames and cloned into a suitable cloning vector to form a mutant library, the diversity of the library (about 10^6 or more) is verified by NGS, and functional screening is finally performed at the cellular or animal level to obtain a protein with improved properties. This method is aimed mainly at the directed evolution of protein molecules: the different genes used as starting templates must share a certain degree of homology, so that in vitro homologous recombination between the small fragments introduces mutations and forms a mutant library for screening.
The main purpose of promoter or enhancer shuffling is to enhance promoter activity or to specifically change the expression characteristics of a gene. Promoter regulatory sequences with similar characteristics (for example, specifically targeting the same tissue or organ) often share very little homology, so the DNA shuffling technique described above for protein molecules clearly cannot be applied mechanically to the directed evolution of promoter regulatory sequences. At present, promoter shuffling generally follows this technical route: (1) perform two rounds of error-prone PCR on a single promoter and recover the PCR products (forming a large number of mutants with homologous sequences); (2) digest with DNase I or sonicate into random fragments and recover them; (3) use the recovered product as template for primerless PCR; (4) add specific primers containing specific restriction sites to the primerless PCR system to amplify the full-length promoter, and recover PCR products of the specified size; (5) digest the cloning vector and the full-length promoter mutants with the corresponding restriction enzymes and ligate them; (6) verify the diversity of the promoter library by NGS. This route closely replicates the protein directed-evolution method and can only shuffle a single promoter source. Even though the first round of error-prone PCR increases template diversity, the templates are still essentially derived from the same promoter, so the diversity of the promoter library remains severely limited, generally reaching only 10^4 to 10^5 under the same system, and screening for regulatory sequences with specific functions is difficult.
Summary of the Invention
In view of this, the technical problem to be solved by the present invention is to provide a vector system for screening regulatory sequences and its application.
The purpose of the present invention is to overcome the limitations of promoter shuffling methods, to solve the problem of insufficient library diversity caused by the inability of promoters from different sources to undergo efficient in vitro recombination owing to their low homology, and to provide a method for constructing a functional element library.
The technical solution adopted by the present invention is as follows:
A first aspect of the present invention provides a plasmid vector comprising an index tag, a reporter gene and a barcode tag;
the barcode tag is a random fragment 5 to 200 bp in length;
the number of index tags is at least one, each independently selected from random fragments 5 to 100 bp in length;
the expression product of the reporter gene can itself emit light or change color by catalyzing a substrate reaction, can cause the substrate to emit light or change color by catalyzing a substrate reaction, can produce emitted light or a color change upon irradiation with excitation light, or can confer resistance to selection with a corresponding drug.
In some embodiments, the barcode tag is a random fragment 40 bp in length; there are two index tags, of which index1 is a random fragment 30 bp in length and index2 is a random fragment 30 bp in length; and the reporter gene is selected from at least one of a fluorescent protein, luciferase, the LacZ gene or a resistance gene usable for selection, the resistance gene including a puromycin resistance gene.
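As a rough worked calculation, the tag lengths above can be compared with the size of a library. The sketch below assumes, purely for illustration, that every clone receives its tag independently and uniformly at random (synthesis and cloning bias are ignored); the function names are hypothetical, and the 8.5 × 10^6 figure is the diversity reported for the example promoter library elsewhere in this document.

```python
def sequence_space(length_bp: int) -> int:
    """Number of distinct DNA sequences of a given length (4 bases per position)."""
    return 4 ** length_bp


def expected_shared_tag_pairs(n_clones: int, length_bp: int) -> float:
    """Expected number of clone pairs carrying the same tag by chance.

    Assumption: tags are drawn independently and uniformly at random,
    which real oligo synthesis only approximates.
    """
    pairs = n_clones * (n_clones - 1) / 2
    return pairs / sequence_space(length_bp)


if __name__ == "__main__":
    library_size = 8_500_000  # diversity reported for the example promoter library
    for name, length in (("barcode tag", 40), ("index tag", 30)):
        print(name, sequence_space(length),
              expected_shared_tag_pairs(library_size, length))
```

Under this assumption, accidental reuse of the same 40 bp barcode or 30 bp index by two clones is expected to be very rare at this library size.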
Some vectors provided by the present invention comprise a vector backbone and, connected in sequence to the vector backbone, a first terminator, a recombination site, a reporter gene, a multiple cloning site (MCS), a post-transcriptional regulatory sequence (WPRE) and a second terminator.
Preferably, the vector according to the first aspect of the present invention further comprises at least one restriction site.
Preferably, the expression product of the reporter gene can itself emit light or change color by catalyzing a substrate reaction, can cause the substrate to emit light or change color by catalyzing a substrate reaction, can produce emitted light or a color change upon irradiation with excitation light, or can confer resistance to selection with a corresponding drug.
Specifically, the reporter gene is selected from at least one of a fluorescent protein, luciferase, the LacZ gene or a resistance gene usable for selection, for example a puromycin resistance gene.
In some embodiments of the present invention, the reporter gene is TurboGFP.
Preferably, the first terminator and the second terminator are elements capable of terminating transcription.
Specifically, the terminator is the SV40 terminator, hGH terminator, BGH terminator or rbGlob terminator.
In some embodiments of the present invention, both the first terminator and the second terminator are the BGH terminator, denoted BGH-pA.
In some embodiments, the plasmid vector comprises the following elements in order: pUC ori, 5’ITR, BGH pA, index1, index2, reporter gene, barcode tag, WPRE, BGH pA, 3’ITR and a resistance selection marker.
In some embodiments, restriction sites and a random recombination regulatory sequence are further included between index1 and index2.
In some specific embodiments, the restriction site is AsiSI; there are two restriction sites, located at the two ends of the random recombination regulatory sequence.
In some embodiments, the random recombination regulatory sequence is a fragmented promoter fragment or a fragmented enhancer fragment.
In some specific embodiments, the random recombination regulatory sequence is an enzyme-digested promoter fragment or an enzyme-digested enhancer fragment.
In embodiments of the present invention, in the step of preparing the random recombination regulatory sequence, the fragmentation method includes enzymatic digestion, sonication or artificial synthesis. In some embodiments, the enzyme used for digestion is DNase I.
In embodiments of the present invention, in the step of preparing the random recombination regulatory sequence, the enzyme used for digestion is DNase I; the promoter is selected from hRO, hRK, mCAR, ProA1, CMV, EF1A, EFS, CAG, CBh, SFFV, MSCV, SV40, mPGK, hPGK, UBC, Nanog, Nes, Tuba1a, Camk2a, SYN1, Hb9, Th, NSE, GFAP, Iba1, hRHO, hBEST1, Prnp, Cnp, K14, BK5, mTyr, cTnT, αMHC, Myog, ACTA1, MHCK7, SM22a, EnSM22a, Runx2, OC, Col1a1, Col2a1, aP2, Adipoq, Tie1, Cd144, CD68, CD11b, Afp, Alb, TBG, MMTV, Wap, HIP, Pdx1, Ins2, Hcn4, NPHS2, SPB, CD144, TERT, TRE, TRE3G, GAL1, MET17, CUP1, AOX1, sCMV, bactin2, Ubi, cmlc2, zK5, 503unc, HSP70, 5×UAS, CaMV35S, Nos, ZmUbi, TEF1, GPD, ADH1, GAP, actin5C, Polyubiquitin, α1-tubulin, Rh2, Mtn, U6, U3, H1, U6-26, TK, RSV, MC1, GAL1, PH, p5, p10, p40, p41, araBAD, cspA or Hsp68; the enhancer is selected from CMV_en, HBB_en or SV40_en.
In some embodiments, the plasmid vector comprises the following elements in order: pUC ori, 5’ITR, BGH pA, index1, AsiSI restriction site, index2, Kozak, TurboGFP gene, barcode tag, WPRE, BGH pA, 3’ITR and an Amp resistance selection marker; the barcode tag is a random fragment 40 bp in length, index1 is a random fragment 30 bp in length, and index2 is a random fragment 30 bp in length.
In other embodiments, the plasmid vector comprises the following elements in order: pUC ori, 5’ITR, BGH pA, index1, AsiSI restriction site, random recombination regulatory sequence, AsiSI restriction site, index2, Kozak, TurboGFP gene, barcode tag, WPRE, BGH pA, 3’ITR and an Amp resistance selection marker; the random recombination regulatory sequence is 50 to 2000 bp in length and is a promoter fragment digested with DNase I or an enzyme-digested enhancer fragment, the barcode tag is a random fragment 40 bp in length, index1 is a random fragment 30 bp in length, and index2 is a random fragment 30 bp in length.
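The two element orders just listed differ only in the segment between the AsiSI sites. As an illustrative sketch (the list representation and the function name are assumptions introduced here, not part of the disclosure), the layout without the regulatory sequence can be derived from the full layout by dropping everything between the first and last AsiSI entries:

```python
# Element order of the vector carrying the random recombination regulatory
# sequence, as listed in the text. Plain strings are used only as labels.
REGULATORY_LAYOUT = [
    "pUC ori", "5'ITR", "BGH pA", "index1", "AsiSI",
    "random recombination regulatory sequence",
    "AsiSI", "index2", "Kozak", "TurboGFP", "barcode tag",
    "WPRE", "BGH pA", "3'ITR", "Amp resistance marker",
]


def excise_between_sites(layout, site="AsiSI"):
    """Return the layout after removing what lies between the first and last
    occurrence of `site`, leaving a single site (hypothetical helper that
    mimics digestion and re-closure at the label level only)."""
    first = layout.index(site)
    last = len(layout) - 1 - layout[::-1].index(site)
    if first == last:
        raise ValueError("layout must contain the site at least twice")
    return layout[: first + 1] + layout[last + 1:]


index_layout = excise_between_sites(REGULATORY_LAYOUT)
```

The resulting `index_layout` has index1, a single AsiSI site and index2 adjacent to one another, matching the first of the two element orders above.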
A second aspect of the present invention provides a method for constructing the plasmid vector, in which a barcode tag, index tags and a random recombination regulatory sequence are inserted into a backbone vector containing a reporter gene.
The present invention does not limit the order in which the barcode tag, the index tags or the random recombination regulatory sequence are inserted. Any method available in the art for joining a vector to a nucleic acid fragment can be used to construct the plasmid vector of the present invention; for example, the insert and the vector may be ligated after restriction digestion, or the fragment may be joined to the vector by a Gibson cloning reaction.
In the present invention, the barcode tag is inserted as follows: a barcode tag carrying homology arms of the backbone vector is prepared and reacted with the linearized backbone vector by Gibson cloning to construct a tag library. In the present invention, the insertion of the index tags and the random recombination regulatory sequence includes:
preparing a random recombination regulatory sequence, then adding homology arms of the backbone vector and index tags to both of its ends to obtain an insert with the structure homology arm 1 - index tag 1 - restriction site 1 - random recombination regulatory sequence - restriction site 2 - index tag 2 - homology arm 2;
linearizing the tag library, then ligating the insert to the linearized tag library to obtain a regulatory sequence library.
In some embodiments, the method for preparing the random recombination regulatory sequence comprises digesting a promoter or enhancer with an enzyme. In some specific embodiments, the enzyme is DNase I.
In some embodiments, preparation of the homology arm 1 - index tag 1 - restriction site 1 - random recombination regulatory sequence - restriction site 2 - index tag 2 - homology arm 2 fragment specifically includes:
annealing primer F and primer R to form a Y-type adaptor; primer F has the structure homology arm 1 - index tag 1 - restriction site 1 - protection sequence 1; primer R has the structure protection sequence 2 - restriction site 2 - index tag 2 - homology arm 2; protection sequence 1 and protection sequence 2 are complementary;
ligating the adaptor to the blunt-ended random recombination regulatory sequence to obtain a random long fragment of the functional element containing the Y-type adaptor;
obtaining a linear fragment from the random long fragment of the functional element containing the Y-type adaptor by PCR;
ligating the linear fragment to the linearized tag library to construct the regulatory sequence library.
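The insert structure described above can be sketched as plain string assembly. This is a minimal illustration only: the function names and the short placeholder sequences in the usage note are hypothetical, the protection sequences are omitted for simplicity, and the recovery step assumes that the AsiSI recognition sequence (GCGATCGC) does not also occur inside the index tags or the regulatory fragment.

```python
ASISI = "GCGATCGC"  # AsiSI recognition sequence


def assemble_insert(arm1, index1, fragment, index2, arm2, site=ASISI):
    """Concatenate the parts in the order
    arm 1 - index 1 - site - regulatory fragment - site - index 2 - arm 2."""
    return arm1 + index1 + site + fragment + site + index2 + arm2


def recover_fragment(insert, site=ASISI):
    """Return the sequence lying between the first and last occurrence of the site.

    Assumption: the site occurs exactly at the two fragment boundaries;
    an internal site in the fragment or tags would need extra handling.
    """
    first = insert.find(site)
    last = insert.rfind(site)
    if first == -1 or first == last:
        raise ValueError("expected two restriction sites in the insert")
    return insert[first + len(site):last]
```

For example, `recover_fragment(assemble_insert("AAAC", "CCGT", "TTGACATATAAT", "GGTA", "CTTT"))` returns the placeholder fragment `"TTGACATATAAT"`.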
In some embodiments, the method for constructing the plasmid vector of the present invention further includes removing the random recombination regulatory sequence from the regulatory sequence library by restriction digestion to obtain an index tag library.
A third aspect of the present invention provides use of the vector according to the first aspect of the present invention in library construction or functional element screening.
A fourth aspect of the present invention provides a method for library construction, in which randomly fragmented sequences are integrated into the vector using a Y-type adaptor.
Specifically, the integration site is the recombination site of the vector.
Further, the Y-type adaptor is structurally divided into a complementary region and a non-complementary region.
Further, the non-complementary sequences at the 5' ends of the Y-type adaptor respectively contain the first homology arm and the second homology arm, derived from the two ends flanking the cloning site of the backbone vector, and the first index sequence and the second index sequence; the complementary sequence at the 3' end contains a restriction site.
Specifically, the structure of the Y-type adaptor is, in order: first homology arm, first index sequence, restriction site, random sequence insertion site, restriction site, second index sequence and second homology arm.
The homology sequences facilitate the subsequent Gibson cloning reaction with the backbone vector.
The restriction site differs from the restriction sites on the vector according to the first aspect of the present invention, and can be used for sequencing verification of functional elements after functional screening.
In some embodiments of the present invention, the restriction site is an AsiSI site.
In some embodiments of the present invention, the Y-type adaptor is prepared by annealing PCR primers.
Further, the downstream primer of the PCR also carries a restriction site.
More specifically, the Y-type adaptor is prepared by mixing primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO.3) and primer B: Phos-GAATGAAGCGATCGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO.4) and then annealing them.
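The complementary and non-complementary regions of the adaptor can be checked directly from the two primer sequences. The sketch below is an illustration under stated assumptions: the random index positions (N) are written as 30 bases, following the 30 bp index length given for some embodiments, and are treated as unpaired because the two index tags are independent random sequences; the 5' phosphate on primer B is not modeled; `stem_length` is a hypothetical helper name.

```python
# Watson-Crick pairs; N (a random index position) is deliberately absent.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

INDEX_LEN = 30  # assumed index length, per the 30 bp embodiment
ASISI = "GCGATCGC"
PRIMER_A = "GGGCTCACCTCAGGCTACGG" + "N" * INDEX_LEN + "GCGATCGCTTCATTC"
PRIMER_B = "GAATGAAGCGATCGC" + "N" * INDEX_LEN + "CCCTGACGTAGGCTGACGGC"


def stem_length(primer_a, primer_b):
    """Count consecutive base pairs formed, antiparallel, between the 3' end
    of primer_a and the 5' end of primer_b."""
    paired = 0
    for a, b in zip(reversed(primer_a), primer_b):
        if (a, b) not in PAIRS:
            break
        paired += 1
    return paired


stem = stem_length(PRIMER_A, PRIMER_B)
```

With these sequences the paired stem covers the AsiSI site plus the seven terminal bases, while the index tags and homology arms remain single-stranded and form the two arms of the Y.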
The method according to the third aspect of the present invention comprises the following steps:
S01. Construct a tag library;
S02. Construct a functional element library;
S03. Construct an index tag library.
Further, the tag in step S01 is a random sequence of about 40 bases.
Preferably, the random tag sequence is located downstream of the fluorescent tracer screening gene TurboGFP.
More preferably, the random tag sequence is located between the fluorescent tracer screening gene TurboGFP and the polyA signal, so that the Barcode sequence can be determined at the mRNA level and the corresponding functional element sequence thereby determined indirectly.
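The indirect identification described above amounts to two table lookups: Barcode to index pair, then index pair to functional element. The following sketch assumes both tables have already been built from sequencing of the index tag library and the functional element library; the function name, the dictionary layout and the toy identifiers in the usage note are hypothetical.

```python
from collections import Counter


def rank_elements(mrna_barcodes, barcode_to_indexes, indexes_to_element):
    """Count how often each functional element is represented among
    Barcodes read from mRNA, most frequent first.

    mrna_barcodes      : iterable of Barcode strings recovered from transcripts
    barcode_to_indexes : dict mapping Barcode -> (index1, index2)
    indexes_to_element : dict mapping (index1, index2) -> element identifier
    Barcodes or index pairs missing from the tables are skipped.
    """
    counts = Counter()
    for barcode in mrna_barcodes:
        index_pair = barcode_to_indexes.get(barcode)
        if index_pair is None:
            continue
        element = indexes_to_element.get(index_pair)
        if element is None:
            continue
        counts[element] += 1
    return counts.most_common()
```

A toy call such as `rank_elements(["bc1", "bc1", "bc2"], {"bc1": ("i1", "i2"), "bc2": ("i3", "i4")}, {("i1", "i2"): "promoter_X", ("i3", "i4"): "promoter_Y"})` ranks `promoter_X` ahead of `promoter_Y`. Read counts are only a proxy for expression here; normalization against Barcode abundance in the DNA library is not shown.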
More specifically, the specific operations of step S01 are:
a. linearize the vector and recover the linearized vector backbone;
b. amplify the vector backbone with primers carrying the random tag sequence and homology arms to obtain a PCR product carrying the random tag sequence;
c. ligate the PCR product to the recovered linearized vector backbone to construct the tag library.
Preferably, the vector is linearized in step a by digestion with a single enzyme.
Preferably, the upstream primer of the primers in step b contains the random tag sequence.
Preferably, both the upstream and downstream primers in step b contain restriction sites, so that the amplified PCR fragment can, after digestion, be ligated to the digested vector backbone.
Preferably, in step c the ligation product is transformed into E. coli for storage.
In some embodiments of the present invention, the backbone vector is digested with XbaI, and the primers used are: F primer: CACCAAGGAAGCCCTCGAGGACGCGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGATCCCGACCTACCGACCCAGCTTTC (SEQ ID NO.1) and R primer: AGGCGAAGACGCGGAAGAGG (SEQ ID NO.2).
After the PCR product is recovered and purified, it is digested with Mlu I + Tfi I and purified to obtain the tag-carrying Insert-Barcode fragment; the cloning backbone is digested with Mlu I + Tfi I and the 4849 bp fragment is recovered as the vector backbone; the Insert-Barcode fragment is then ligated to the library backbone and transformed into E. coli DH10B to obtain the Barcode tag library.
In some embodiments of the present invention, the specific technical route of step S01 is as follows: the MCS and part of the element sequences of the backbone vector are excised with specific restriction enzymes and the large backbone fragment is recovered; the backbone vector is PCR-amplified with primers carrying the random Barcode tag sequence and homology arms at the 5' end to obtain a PCR product carrying the Barcode; this PCR product and the digested, recovered backbone vector are subjected to a Gibson cloning reaction to construct the tag library, also referred to as the Barcode library.
In addition, the diversity of the Barcode tag library can be verified by high-throughput NGS.
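A simple way to estimate Barcode diversity from NGS reads is to extract the 40-base random region by its constant flanks and count distinct sequences. The sketch below is an assumption-laden illustration: the flanks are the bases immediately adjacent to the random region in the F primer (SEQ ID NO.1), reads are taken to be in the same orientation as that primer, and sequencing errors are not collapsed, so the count is an upper-biased estimate. The function names are hypothetical.

```python
import re

UPSTREAM = "GACGCGT"   # last bases before the random region in the F primer
DOWNSTREAM = "GGATCC"  # first bases after the random region in the F primer
BARCODE_RE = re.compile(UPSTREAM + r"([ACGT]{40})" + DOWNSTREAM)


def extract_barcode(read):
    """Return the 40-base Barcode found between the constant flanks, or None."""
    match = BARCODE_RE.search(read)
    return match.group(1) if match else None


def library_diversity(reads):
    """Number of distinct Barcodes observed across the reads."""
    barcodes = (extract_barcode(read) for read in reads)
    return len({bc for bc in barcodes if bc is not None})
```

Reads whose random region is not exactly 40 bases long, or that contain ambiguous base calls in it, are ignored by this pattern.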
进一步地,步骤S02利用所述Y型接头将随机打断的所述功能元件的序列整合入所述载体中。Further, step S02 utilizes the Y-shaped linker to integrate the randomly interrupted sequence of the functional element into the vector.
更具体地,步骤S02的具体操作为:More specifically, the specific operation of step S02 is:
d.将功能元件的核酸片段随机打断,得功能元件随机短片段;d. Randomly interrupt the nucleic acid fragments of functional elements to obtain random short fragments of functional elements;
e.将功能元件随机短片段与Y型接头连接,得含有Y型接头的功能元件随机长片段;e. Connect the random short fragment of the functional element with the Y-shaped linker to obtain a random long fragment of the functional element containing the Y-shaped linker;
f.将含有Y型接头的功能元件随机长片段与步骤S01构建得到的标签文库进行连接,构建功能元件文库。f. Linking the random long fragments of functional elements containing Y-shaped linkers with the tag library constructed in step S01 to construct a functional element library.
Preferably, in step d the functional element fragments are randomly fragmented into pieces smaller than 100 bp.
More preferably, in step d the functional element fragments are randomly fragmented into pieces of about 50 bp.
Preferably, in step d the randomly fragmented nucleic acid is end-repaired to form blunt-ended short fragments of different sizes.
Preferably, the nucleic acid fragments of the functional elements in step d are nucleic acid fragments of multiple functional elements that share a particular function.
Preferably, in step e PCR can be used to increase the yield of the adapter-carrying random long fragments and to convert them into fully double-stranded DNA fragments. The PCR product is purified and then ligated to the Barcode tag library.
The primers used in this step are: F2: CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG (SEQ ID NO. 5);
R2: GTCTAGACCTCGAGGAGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG (SEQ ID NO. 6).
Further, the Y-shaped adapter in step e is prepared by annealing PCR primers.
Further, the downstream primer of the PCR also carries a restriction site.
In some embodiments of the present invention, the Y-shaped adapter is prepared by mixing primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO. 3) with primer B: Phos-GAATGAAGCGATCGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO. 4) and annealing them.
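The Y shape follows from the two oligonucleotides themselves: only the 3' end of primer A and the 5' end of primer B are complementary, so annealing gives a short double-stranded stem with two unpaired arms. The sketch below checks this on the constant parts of SEQ ID NO. 3 and 4. The helper names are hypothetical, the index length of 30 is an assumption read from the listing, and N positions are deliberately treated as unpaired because the two index runs are independent random sequences.

```python
# Minimal sketch; helper names are illustrative and not part of the patent.
# N maps to "-" so that random index positions never count as base-paired.
PAIRING = str.maketrans("ACGTN", "TGCA-")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N becomes '-')."""
    return seq.translate(PAIRING)[::-1]

def stem_length(top: str, bottom: str) -> int:
    """Number of bases at the 3' end of `top` that pair with the 5' end of `bottom`."""
    best = 0
    for k in range(1, min(len(top), len(bottom)) + 1):
        if reverse_complement(top[-k:]) == bottom[:k]:
            best = k
    return best

INDEX_LEN = 30  # assumption: length of each random index run
PRIMER_A = "GGGCTCACCTCAGGCTACGG" + "N" * INDEX_LEN + "GCGATCGCTTCATTC"
PRIMER_B = "GAATGAAGCGATCGC" + "N" * INDEX_LEN + "CCCTGACGTAGGCTGACGGC"

stem = stem_length(PRIMER_A, PRIMER_B)
has_asisi = "GCGATCGC" in PRIMER_B[:stem]  # AsiSI recognition sequence
print(stem, has_asisi)
```

Under these assumptions the paired stem is the 15-nt segment that carries the AsiSI site, while the index runs and the outer 20-nt arms stay single-stranded.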
In some embodiments of the present invention, the technical route of step S02 is as follows. Several functional elements that share the same tissue specificity, or that have specific or unknown functions, are each amplified by PCR and recovered. The elements are digested with DNase I into random fragments smaller than 100 bp and end-repaired, and the band of the target size, for example a small band of about 50 bp, is recovered. The annealed Y-type adaptor is added and ligated to the random blunt-ended short fragments, followed by PCR, to give a cloning fragment with the structure: first homology arm - first index sequence - AsiSI restriction site - functional element fragment - AsiSI restriction site - second index sequence - second homology arm. The cloning fragment is then ligated to the pre-digested Barcode library to give the functional element library. The first index sequence is denoted index1 and the second index sequence is denoted index2.
In some embodiments of the present invention, the Barcode library obtained in step S01 is digested with Xcm I and the 4845 bp fragment is recovered as the library backbone; the random functional element fragments are ligated to the library backbone and transformed into Escherichia coli DH10B to give the promoter library.
Further, the technical route of step S03 is as follows: the functional element library constructed in step S02 is digested to remove the random sequence insertion site, and the vector backbone is recovered and self-ligated to construct the index tag library.
Preferably, the fragments obtained by digesting the functional element library are added to the ligation reaction in several small portions, so that the ligation is intramolecular as far as possible, that is, each linearized fragment circularizes on itself; the ligation product is transformed into Escherichia coli DH10B to give the index tag library.
To determine the one-to-one correspondence between index1, index2 and Barcode, the functional element library is digested to cut out the random functional element fragments, and the backbone is recovered and self-ligated. This allows index1, index2 and Barcode to be sequenced together in a single high-throughput sequencing reaction (the maximum read length of high-throughput NGS sequencing is 1 kb). The resulting index tag library is named the Marriage library by the inventors.
Using the Marriage library plasmid as the template, the index1, index2 and Barcode sequences in the Marriage library are amplified by PCR and sequenced by high-throughput NGS; data analysis then establishes the correspondence between the three.
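The data analysis mentioned above amounts to building a lookup table from each Barcode to its (index1, index2) pair. The following is a minimal sketch of one way to do that, assuming the reads have already been parsed into (index1, index2, barcode) triples; the function name, the `min_fraction` threshold and the toy sequences are all hypothetical and not taken from the patent.

```python
from collections import Counter, defaultdict

def build_marriage_map(triples, min_fraction=0.9):
    """Map each barcode to its dominant (index1, index2) pair.

    Barcodes whose reads disagree too often (dominant pair below
    `min_fraction` of that barcode's reads) are dropped as ambiguous.
    """
    pairs_by_barcode = defaultdict(Counter)
    for index1, index2, barcode in triples:
        pairs_by_barcode[barcode][(index1, index2)] += 1

    mapping = {}
    for barcode, pairs in pairs_by_barcode.items():
        pair, count = pairs.most_common(1)[0]
        if count / sum(pairs.values()) >= min_fraction:
            mapping[barcode] = pair
    return mapping

# Toy reads (made-up short sequences, for illustration only).
reads = [("AAA", "CCC", "B1")] * 3 + [("GGG", "TTT", "B2"), ("GGA", "TTT", "B2")]
marriage_map = build_marriage_map(reads)
print(marriage_map)
```

In this toy input, barcode B1 is consistently linked to one index pair and is kept, while B2 is split between two pairs and is discarded.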
A fifth aspect of the present invention provides the use of the method of the third aspect of the present invention in screening functional elements.
A sixth aspect of the present invention provides a method for screening functional elements, comprising a library construction step, wherein the library is constructed by the method of the third aspect of the present invention.
The method according to the sixth aspect of the present invention comprises the following steps:
S11. Transfect cells or inject experimental animals with the functional element library constructed by the method of the present invention;
S12. Select cells or tissues according to reporter gene expression, extract mRNA and reverse-transcribe it into cDNA;
S13. Sequence the tag and identify functional elements through the correspondence between the tag sequence, the first index sequence and the second index sequence.
In some embodiments of the present invention, the promoter library is first transfected into specific cells or microinjected into experimental animals; if the vector is a viral vector, it must first be packaged into viral particles, which are then used to infect cells or live animals. TurboGFP fluorescence is then observed, and cells or tissues with a suitable fluorescence intensity are selected for mRNA extraction. The mRNA is reverse-transcribed into cDNA and the Barcode tag is sequenced; the correspondence between index1, index2 and Barcode in the Marriage library gives the specific index1 and index2 sequences. Finally, using the now known index1 and index2 sequences as primers and the functional element library as the template, the specific functional element is amplified by PCR, yielding functional element sequences with excellent performance (for example small size, high specificity and strong promoter activity).
The beneficial effects of the present invention are:
Existing shuffling techniques for specific functional elements derive from the in vitro homologous recombination used in protein shuffling. They can usually use only a single functional element as the starting template, so the resulting libraries lack diversity. The present invention provides a library construction method that gives highly diverse libraries. It abandons the in vitro homologous recombination used in protein shuffling and instead ligates a Y-type adaptor to random fragments smaller than 100 bp produced by DNase I digestion, which overcomes the inability to recombine effectively functional elements that have similar properties or that have specific or unknown functions. The method mainly involves constructing three libraries: the Barcode library, the functional element library and the Marriage library. It allows rapid, high-throughput construction of highly diverse promoters or enhancers, and it can also be applied to the construction and screening of other functional elements, laying the foundation for ultimately screening functional elements with excellent performance.
Description of drawings
Figure 1: schematic diagram of the Y-shaped adapter;
Figure 2: map of the original vector (example);
Figure 3: vector map for constructing the Barcode tag library;
Figure 4: vector map for constructing the promoter library;
Figure 5: vector map for constructing the index tag Marriage library;
Figure 6: ratio of random short fragments to adapter; in lane 1 the ratio of random short fragments to Y-type adaptor is 1:3, and in lane 2 the ratio is 1:1;
Figure 7: fluorescence images of retinal sections after subretinal in situ injection in mice; the brighter red fluorescence in panel A shows the distribution of cone cells, panel B shows the distribution and expression of the whole library in the retina, and panel C is the overlay of panels A and B, in which the cells labeled with yellow-orange fluorescence are those that can be used for subsequent promoter sequence identification;
Figure 8: enhancer test expression vector;
Figure 9: enhancer library.
Detailed description of the embodiments
The present invention is described in further detail below with reference to specific examples and the accompanying drawings. It should be understood that these examples serve only to illustrate the present invention and not to limit its scope.
Experimental methods for which no specific conditions are given in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. The common chemical reagents used in the examples are all commercially available products.
An AAV vector is used as the example in the embodiments; its plasmid map is shown in Figure 2. A person of ordinary skill in the art will understand that any vector conventionally used for library construction, such as pUC18 or pBR322, can achieve the purpose of the present invention, and that different vectors can be chosen according to the subsequent screening method and application scenario.
Genomic functional elements are elements involved in the regulation of gene expression, mainly cis-acting elements and trans-acting factors. Common examples include promoters, enhancers, silencers, regulatory regions and sequences, inducible elements, and activators and repressors.
Barcode tag: a tag used in high-throughput sequencing to distinguish different samples.
Index: an index used in high-throughput sequencing to further distinguish different samples that carry the same Barcode tag.
Example 1
A method for constructing a functional element library, comprising the construction of three libraries (the Barcode library, the functional element library and the Marriage library), with the following steps:
S01. Construct the Barcode tag library:
(1) Prepare the Insert-Barcode fragment:
Digest the original vector (Figure 2) with XbaI and recover the 4528 bp fragment as the PCR template for amplifying the multiple cloning site (MCS) together with part of the transcriptional regulatory element WPRE. Amplification uses the F1 primer CACCAAGGAAGCCCTCGAGGACGCGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGATCCCGACCTACCGACCCAGCTTTC (SEQ ID NO. 1) and the R1 primer AGGCGAAGACGCGGAAGAGG (SEQ ID NO. 2);
The 40 N bases in the F1 primer represent the random Barcode sequencing tag.
After recovery and purification, the PCR product is digested with Mlu I + Tfi I and purified to give the Insert-Barcode fragment.
The ACGCGT in the F1 primer (underlined in the original) is the Mlu I restriction site. The Tfi I restriction site lies in the transcriptional regulatory element WPRE of the vector backbone, so the product amplified with primers F1 and R1 contains a Tfi I site.
(2) Prepare the linearized cloning backbone: digest the cloning backbone with Mlu I + Tfi I and recover the 4849 bp fragment as the library backbone.
(3) Ligate the Insert-Barcode fragment to the library backbone (Figure 3) to give the Barcode library, and transform it into Escherichia coli DH10B for storage.
(4) Amplify the Barcode sequences in the Barcode tag library for high-throughput NGS sequencing; data analysis confirms that the diversity of the library is as high as 1×10^8.
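As a rough plausibility check on the numbers above, a 40-nt random Barcode has 4^40 possible sequences, far more than the reported library diversity of 1×10^8, so two independent clones are very unlikely to share a Barcode by chance. The sketch below works this out under the simplifying assumption, not stated in the source, that Barcodes are drawn uniformly and independently; the function names are illustrative.

```python
import math

def barcode_space(n_random: int) -> int:
    """Number of distinct sequences of n_random positions over A/C/G/T."""
    return 4 ** n_random

def expected_distinct(space: int, draws: int) -> float:
    """Expected number of distinct barcodes after `draws` uniform random draws.

    Equals space * (1 - (1 - 1/space) ** draws), evaluated with
    log1p/expm1 to stay accurate when `space` is huge.
    """
    return -space * math.expm1(draws * math.log1p(-1.0 / space))

space = barcode_space(40)           # 40 N positions in the F1 primer
clones = 10 ** 8                    # diversity reported in step (4)
expected_collisions = clones - expected_distinct(space, clones)
print(f"{space:.3e} possible barcodes")
print(f"about {clones ** 2 / (2 * space):.1e} expected barcode collisions")
```

Under the uniform model the expected number of colliding clones is far below one, which is consistent with using the Barcode as a unique identifier. Real oligonucleotide synthesis and PCR are not perfectly uniform, so this is an upper bound on uniqueness rather than a measurement.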
S02. Construct the regulatory sequence library:
(1) Randomly fragment the nucleic acid fragments of several functional elements of one type to obtain random functional element fragments;
Digest the nucleic acid fragments of the functional elements with DNase I (the digestion conditions and time differ for fragments of different lengths), so that the promoter fragments are randomly cut into short fragments of different sizes;
Dilute 1 U of DNase I 25-fold and digest the fragments in the following reaction system, so that the promoter fragments are randomly cut into short fragments of 50-100 bp, then recover them;
[Figure PCTCN2021134329-appb-000001: DNase I digestion reaction system]
Use the [Figure PCTCN2021134329-appb-000002] End Repair Module (E6050S) to end-repair the short fragments, forming blunt-ended short fragments of different sizes; purify and recover them to obtain the random short fragments of the functional elements;
Mix primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO. 3) with primer B: Phos-GAATGAAGCGATCGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO. 4) and anneal them to form the Y-type adaptor (which contains an AsiSI restriction site).
The GCGATCGC in primer B (underlined in the original) is the AsiSI restriction site.
Mix the random short fragments of the functional elements with the Y-type adaptor in a defined ratio and ligate them to generate long functional element fragments carrying the Y-type adaptor;
Select the long functional element fragments by agarose gel electrophoresis, gel excision and recovery, and recover and purify the long fragments within the expected size range;
Using the recovered long fragments as the template, amplify by PCR to increase the yield of long fragments within the expected range and to convert the adaptor-carrying long fragments into fully double-stranded DNA fragments; recover and purify the PCR product to obtain the final random functional element fragments.
The PCR primers are: F2: CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG (SEQ ID NO. 5);
R2: GTCTAGACCTCGAGGAGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG (SEQ ID NO. 6).
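The PCR primers F2 and R2 can be related to the adaptor oligonucleotides by simple sequence comparison: the last 20 nt of F2 are identical to the 5' arm of SEQ ID NO. 3, and the last 20 nt of R2 are the reverse complement of the 3' arm of SEQ ID NO. 4, so both primers anneal on the unpaired arms of the Y-type adaptor and add their own 5' tails to the product. The sketch below checks this; the helper names are hypothetical, and only sequences printed in the text are used.

```python
# Minimal sketch; function names are illustrative.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def split_primer(primer: str, annealing_site: str):
    """Return (5' tail, annealing part) if the primer ends with the site, else None."""
    if primer.endswith(annealing_site):
        return primer[: -len(annealing_site)], annealing_site
    return None

ARM_5P = "GGGCTCACCTCAGGCTACGG"  # 5' arm of SEQ ID NO. 3
ARM_3P = "CCCTGACGTAGGCTGACGGC"  # 3' arm of SEQ ID NO. 4
F2 = "CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG"  # SEQ ID NO. 5
R2 = "GTCTAGACCTCGAGGAGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG"  # SEQ ID NO. 6

f2_parts = split_primer(F2, ARM_5P)                       # same sense as the 5' arm
r2_parts = split_primer(R2, reverse_complement(ARM_3P))   # pairs with the 3' arm
print(f2_parts)
print(r2_parts)
```

Each primer therefore consists of a 30-nt 5' tail followed by a 20-nt adaptor-specific 3' end.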
(2) Prepare the linearized vector backbone: digest the Barcode library obtained in S01 with Xcm I to remove the stuffer sequence, and recover the 4845 bp fragment as the library backbone;
(3) Ligate the random functional element fragments to the vector backbone to give the promoter library (Figure 4), and transform it into Escherichia coli DH10B for storage.
S03. Construct the index tag Marriage library:
(1) Linearization: digest the regulatory sequence library obtained in S02 with AsiS I and recover the 4954 bp linearized fragment with identical cohesive ends;
(2) Ligation and transformation: add the fragment from the previous step to the ligation reaction in several small portions, so that the ligation is intramolecular as far as possible, that is, each linearized fragment circularizes on itself; transform the ligation product into Escherichia coli DH10B for storage, giving the index tag Marriage library (Figure 5).
Using the Marriage library plasmid as the template, amplify the index1, index2 and Barcode sequences in the index tag Marriage library by PCR for high-throughput sequencing, and determine the correspondence between the three by data analysis.
Example 2
To obtain a promoter that targets cone cells with high specificity and expresses the gene of interest efficiently, the inventors chose four photoreceptor-specific promoters, hRO, hRK, mCAR and ProA1, as the starting material for DNA shuffling. The relative strengths of the four promoters are hRO≈hRK>mCAR>ProA1. ProA1 is expressed specifically in cone cells only, but its full length is about 2 kb, which is clearly unsuitable for AAV vectors. The hRK promoter is only about 500 bp long and is expressed in both cone and rod cells, but its specificity falls short of the requirement. The hRO and mCAR promoters are expressed only in rod cells and likewise do not meet the requirement. The inventors therefore randomly recombined these four promoters, selected random recombinant fragments of about 500 bp and cloned them into an AAV vector to form a promoter library. The resulting highly diverse promoter library was packaged into AAV serotype 8 and, with a control virus as the reference for cone cell targeting, injected subretinally in situ in animals. Random recombinant promoters with excellent properties were screened by observing the fluorescence of TurboGFP (green reporter gene) and Tdtomato (red reporter gene). The experimental steps are as follows:
S01. Construct the Barcode tag library:
1.1 Prepare the Insert-Barcode fragment:
1.1.1 Digest the cloning backbone with XbaI and recover the 4528 bp fragment as the PCR template for amplifying the MCS together with part of the WPRE element. Amplification uses the F1 primer CACCAAGGAAGCCCTCGAGGACGCGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGGATCCCGACCTACCGACCCAGCTTTC (SEQ ID NO. 1) and the R1 primer AGGCGAAGACGCGGAAGAGG (SEQ ID NO. 2);
1.1.2 After recovery and purification, digest the PCR product with Mlu I + Tfi I and purify it to give the Insert-Barcode fragment;
1.2 Prepare the linearized cloning backbone: digest the cloning backbone with Mlu I + Tfi I and recover the 4849 bp fragment as the library backbone;
1.3 Ligate the Insert-Barcode fragment to the library backbone and transform into Escherichia coli DH10B to give the Barcode library;
1.4 Amplify the Barcode sequences in the Barcode tag library for NGS sequencing; data analysis confirms that the diversity of the library is as high as 1×10^8.
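The layout of the F1 primer used in step 1.1.1 can be inspected by locating restriction sites in its sequence. The sketch below is illustrative only: the Mlu I site is the one named in the text, whereas the XhoI and BamHI recognition sequences are standard enzyme sites that happen to occur in the primer and are listed here as an observation, not as something the source says is used. The function name is hypothetical.

```python
def find_sites(seq: str, site: str) -> list:
    """Return all 0-based start positions of `site` in `seq` (overlaps allowed)."""
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits

BARCODE_LEN = 40  # the 40 N bases of the Barcode
F1 = "CACCAAGGAAGCCCTCGAGGACGCGT" + "N" * BARCODE_LEN + "GGATCCCGACCTACCGACCCAGCTTTC"

RECOGNITION = {
    "MluI": "ACGCGT",   # named in the text
    "XhoI": "CTCGAG",   # standard site, not discussed in the text
    "BamHI": "GGATCC",  # standard site, not discussed in the text
}
sites = {name: find_sites(F1, motif) for name, motif in RECOGNITION.items()}
barcode_start = F1.index("N")
print(sites, barcode_start)
```

The Mlu I site ends immediately before the first N, so Mlu I digestion of the PCR product cuts directly upstream of the Barcode.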
S02. Construct the promoter library:
2.1 Prepare the random recombinant promoter fragments:
2.1.1 Amplify the hRO, hRK, mCAR and ProA1 promoter fragments separately by PCR;
2.1.2 Digest the promoter fragments with DNase I: dilute 1 U of DNase I 25-fold and digest the fragments in the following reaction system, so that the promoter fragments are randomly cut into short fragments of 50-100 bp, then recover them;
[Figure PCTCN2021134329-appb-000003: DNase I digestion reaction system]
2.1.3 Use the [Figure PCTCN2021134329-appb-000004] End Repair Module (E6050S) to end-repair the above fragments, forming blunt-ended short fragments of 50-100 bp; purify and recover them to obtain the random short promoter fragments;
2.1.4 Mix primer A: GGGCTCACCTCAGGCTACGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCGATCGCTTCATTC (SEQ ID NO. 3) with primer B: Phos-GAATGAAGCGATCGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCCCTGACGTAGGCTGACGGC (SEQ ID NO. 4) and anneal them to form the Y-type adaptor (which contains an AsiSI restriction site);
2.1.5 Mix the random short promoter fragments with the Y-type adaptor at ratios of 1:1 and 1:3 and ligate them to generate long promoter fragments carrying the Y-type adaptor. The effect on size is clear: as shown in Figure 6, the more adaptor is added, the shorter the fragments obtained;
2.1.6 Run the ligation product on an agarose gel, and excise, recover and purify the long promoter fragments of 500 bp;
2.1.7 Using the product of the previous step as the template, amplify by PCR to increase the yield of the 500 bp long promoter fragments and to convert the adaptor-carrying long promoter fragments into fully double-stranded DNA fragments;
The primer sequences are: F2: CGGTGGGCTCTATGGTGAGACGCCAGCCGTGGGCTCACCTCAGGCTACGG (SEQ ID NO. 5);
R2: GTCTAGACCTCGAGGAGAGACGCCACGGCTGCCGTCAGCCTACGTCAGGG (SEQ ID NO. 6).
2.1.8 Recover and purify the PCR product of the previous step to obtain the final shuffled promoter fragments.
2.2 Prepare the linearized cloning backbone: digest the Barcode tag library obtained in the first step with Xcm I and recover the 4845 bp fragment as the library backbone;
2.3 Ligate the random recombinant promoter fragments to the library backbone and transform into Escherichia coli DH10B to give the promoter library;
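The observation in step 2.1.5, that more adaptor gives shorter products, is what a simple chain-capping picture predicts: blunt fragments keep joining end to end until both ends of a chain are capped by an adaptor. The toy model below is an assumption made purely for illustration and is not taken from the patent. It treats every ligation partner as chosen at random among free ends, counts two ends per fragment and one ligatable end per adaptor, and ignores depletion, ligation efficiency and circularization.

```python
def cap_probability(fragments: float, adaptors: float) -> float:
    """Chance that a free chain end is joined to an adaptor rather than a fragment."""
    fragment_ends = 2 * fragments   # each blunt fragment offers two ends
    adaptor_ends = adaptors         # each Y adaptor offers one ligatable end
    return adaptor_ends / (adaptor_ends + fragment_ends)

def mean_fragments_per_chain(fragments: float, adaptors: float) -> float:
    """Expected number of fragments in a chain capped at both ends.

    Each end grows by a geometric number of fragments with mean (1 - p) / p
    before it is capped, so the mean chain holds 1 + 2 * (1 - p) / p fragments.
    """
    p = cap_probability(fragments, adaptors)
    return 1 + 2 * (1 - p) / p

for ratio in [(1, 1), (1, 3)]:      # fragment : adaptor ratios from step 2.1.5
    print(ratio, round(mean_fragments_per_chain(*ratio), 2))
```

In this toy model the 1:1 mix averages about five fragments per capped chain and the 1:3 mix a little over two, which matches the direction of the trend reported for Figure 6 but should not be read as a quantitative prediction.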
S03. Construct the index tag Marriage library:
3.1 Linearization: digest the promoter library obtained in the second step with AsiS I and recover the 4954 bp linearized fragment with identical cohesive ends;
3.2 Ligation and transformation: add the fragment from the previous step to the ligation reaction in several small portions, so that the ligation is intramolecular as far as possible, that is, each linearized fragment circularizes on itself; transform the ligation product into Escherichia coli DH10B to give the index tag Marriage library;
3.3 Using the Marriage library plasmid as the template, amplify the index1, index2 and Barcode sequences in the index tag Marriage library by PCR for NGS sequencing, and determine the correspondence between the three by data analysis. Sequence analysis shows that the diversity of this library reaches 8.5×10^6.
This example also provides a method for screening functional elements, comprising the following steps:
S11. Transfect cells or inject experimental animals with the functional element library constructed in step S02;
S12. Select cells or tissues according to reporter gene expression, extract mRNA and reverse-transcribe it into cDNA;
S13. Sequence the Barcode tag and identify functional elements through the correspondence between the Barcode sequence, the first index sequence and the second index sequence.
Specifically, taking the screening of promoters with excellent properties at the animal level as an example:
4.1 Package the promoter library obtained above and the control ProA1-Tdtomato into AAV serotype 8;
4.2 Mix the viruses and inject them subretinally in situ into mouse eyes;
4.3 Two weeks later, remove the eyeballs, prepare frozen sections and photograph them to observe fluorescence;
4.4 Collect the photoreceptor cells and sort them by flow cytometry to isolate cells with higher fluorescence intensity; the results are shown in Figure 6;
4.5 Extract RNA from the sorted cells with strong fluorescence;
4.6 Reverse-transcribe the RNA into cDNA, amplify the Barcode-containing sequence by PCR for NGS sequencing, and use the data analysis results of the Marriage library to obtain the specific index1 and index2 sequences;
4.7 Using the index1 and index2 sequences obtained as primers and the promoter library as the template, amplify the corresponding promoter fragments by PCR;
4.8 Obtain the specific sequences of the candidate promoters by Sanger sequencing.
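Steps 4.6 and 4.7 can be pictured as a two-stage lookup: count the Barcodes recovered from the sorted cells, then translate the most abundant ones into their index1 and index2 sequences through the Marriage library table. The sketch below assumes that table already exists as a Python dictionary; the function name, the `top_n` cut-off and the toy sequences are hypothetical. The text does not specify primer orientation, so the index sequences are returned as stored.

```python
from collections import Counter

def top_candidates(barcode_reads, marriage_map, top_n=2):
    """Rank barcodes from the cDNA reads and attach their (index1, index2) pair.

    Barcodes absent from the Marriage table are skipped, since no primer
    pair can be derived for them.
    """
    ranked = []
    for barcode, count in Counter(barcode_reads).most_common():
        if barcode in marriage_map:
            ranked.append((barcode, count, marriage_map[barcode]))
        if len(ranked) == top_n:
            break
    return ranked

# Toy data (made-up short sequences, for illustration only).
marriage_map = {"B1": ("AAA", "CCC"), "B2": ("GGG", "TTT"), "B3": ("ACG", "TGC")}
cdna_barcodes = ["B2"] * 5 + ["B9"] * 4 + ["B1"] * 3 + ["B3"]
candidates = top_candidates(cdna_barcodes, marriage_map)
print(candidates)
```

Each returned index pair would then serve as the primer pair for amplifying the matching promoter fragment from the promoter library, followed by Sanger sequencing as in step 4.8.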
Figure 7 shows the fluorescence images obtained after premixing the library virus with the control ProA1-Tdtomato virus and injecting the mixture subretinally in situ in mice. Because the ProA1 promoter targets cone cells only, the brighter red fluorescence in panel A shows the distribution of cone cells; panel B shows the distribution and expression of the whole library in the retina; panel C is the overlay of panels A and B, in which the cells labeled with yellow-orange fluorescence are those that can be used for subsequent promoter sequence identification.
In summary, the library construction method provided by the present invention yields highly diverse libraries and overcomes the inability to recombine effectively functional elements that have similar properties or that have specific or unknown functions. It allows rapid, high-throughput construction of highly diverse promoters, enhancers or other functional elements, from which functional element sequences with excellent performance (for example small size, high specificity and strong promoter activity) can be screened.
Example 3
The CMV_en, HBB_en and SV40_en enhancers were chosen as the starting material for DNA shuffling. All three enhancers can regulate gene expression; the CMV_en enhancer is 300 bp, the HBB_en enhancer is 3 kb and the SV40_en enhancer is 237 bp. To obtain a new, shorter HBB enhancer with a positive regulatory effect, these three enhancers were randomly recombined, and random recombinant fragments of about 800 bp to 1 kb were selected and cloned into a mammalian enhancer test expression vector containing the SCP1_mini promoter (Figure 8) to form an enhancer library. The enhancer library was then transiently transfected into K562 cells, cells with different fluorescence intensities were isolated by flow sorting, and random recombinant enhancers meeting the objective were screened further.
1. Construction of the Barcode tag library:
1.1.1 Digest the cloning backbone with ScaI and recover the 3974 bp fragment as the PCR template for amplifying part of the MCS+SV40 element; amplify with the F3 primer (GAAGCCCTCGAGGACGCGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAAAGGTACCAAAGGATCCCGAC) and the R3 primer (TGGAGCGAACGACCTACACCGA);
1.1.2 Recover and purify the PCR product, digest it with Mlu I+Drd I, and purify to obtain the Insert-barcode fragment;
1.2 Prepare the linearized cloning backbone: digest the cloning backbone with Mlu I+Drd I and recover the 3953 bp fragment as the library backbone;
1.3 Ligate the Insert-barcode fragment to the library backbone and transform the product into Escherichia coli DH10B to obtain the Barcode library;
1.4 Amplify the Barcode sequences in the Barcode tag library for NGS sequencing, and confirm the diversity of the library by data analysis.
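The data analysis in step 1.4 is not detailed in the text. A minimal Python sketch of one way to summarize barcode diversity from NGS reads is given below. The flanking sequences are taken from the F3 primer in step 1.1.1, while the 40-nt barcode length (the length given in claim 3), the read orientation and the function names `extract_barcodes` and `diversity_summary` are assumptions made for illustration only.

```python
import re
from collections import Counter

# Flanks copied from the F3 primer; a 40-nt barcode and forward-strand
# reads are assumptions, not details stated in the example.
BARCODE_RE = re.compile(r"GACGCGT([ACGT]{40})AAAGGTACC")


def extract_barcodes(reads):
    """Return the barcode from every read that contains both flanks."""
    barcodes = []
    for read in reads:
        match = BARCODE_RE.search(read.upper())
        if match:
            barcodes.append(match.group(1))
    return barcodes


def diversity_summary(barcodes):
    """Count distinct barcodes and report how often the commonest one occurs."""
    counts = Counter(barcodes)
    total = sum(counts.values())
    return {
        "reads_with_barcode": total,
        "unique_barcodes": len(counts),
        "unique_fraction": len(counts) / total if total else 0.0,
        "max_copies": max(counts.values(), default=0),
    }
```

A library with high diversity would be expected to show a unique fraction close to 1 at moderate sequencing depth and a low maximum copy number; the thresholds to apply are left to the practitioner.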
2. Construction of the Marriage index tag library:
2.1 Prepare the linearized cloning backbone: digest the Barcode tag library obtained in step 1 with BsmBI and recover the 3404 bp fragment as the library backbone;
2.2 Preparation of the Index1+MCS+Index2 fragment: anneal and extend primer F4 (TGGGGATGCGGTGGGCTCTATGGNNNNNNNNNNNNNNNNNNNNNNNNNCCCAGACCGACTCGGACCACCCAGCCGTGAACTGGAAAGCTTACCACAAGAGCCG) and primer R4 (TTATATAAGTACCCTCGAGGNNNNNNNNNNNNNNNNNNNNNNNNNGGGACAGGCAGTGCCAGGAGCCACGGCTCTTGTGGTAAGCTTTCCAGTTCACGGC) to form a 171 bp double-stranded Index1+MCS+Index2 fragment;
2.3 Join the Index1+MCS+Index2 fragment to the library backbone by Gibson assembly and transform the product into Escherichia coli DH10B to obtain the Marriage index tag library;
2.4 Using plasmid from the Marriage index tag library as template, PCR-amplify the Index1, Index2 and Barcode sequences of the Marriage index tag library for NGS sequencing, and determine the correspondence among the three by data analysis.
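The correspondence described in step 2.4 can be pictured as a lookup table from each Barcode to its Index1/Index2 pair. The Python sketch below is one hedged way to build such a table from per-read triples. The input format, the function name `build_marriage_map` and the `min_fraction` purity threshold are assumptions for illustration; the source does not describe how ambiguous barcodes are handled.

```python
from collections import Counter, defaultdict


def build_marriage_map(triples, min_fraction=0.9):
    """Map each barcode to its dominant (index1, index2) pair.

    triples: iterable of (barcode, index1, index2), one tuple per read.
    A barcode is kept only if its most frequent index pair accounts for
    at least min_fraction of the reads carrying that barcode.
    """
    pairs_by_barcode = defaultdict(Counter)
    for barcode, index1, index2 in triples:
        pairs_by_barcode[barcode][(index1, index2)] += 1

    marriage = {}
    for barcode, pair_counts in pairs_by_barcode.items():
        best_pair, best_count = pair_counts.most_common(1)[0]
        if best_count / sum(pair_counts.values()) >= min_fraction:
            marriage[barcode] = best_pair
    return marriage
```

Dropping barcodes that pair with more than one index combination is a conservative design choice: such barcodes could not identify a single library member unambiguously in the later screening steps.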
3. Construction of the enhancer library:
3.1 Preparation of randomly recombined enhancer fragments:
3.2 Amplify the CMV_en, HBB_en and SV40_en enhancer fragments separately by PCR;
3.3 Fragment the enhancer fragments by Covaris sonication according to the operating instructions, so that they are randomly sheared into short fragments of 150 bp-550 bp, and recover these fragments;
3.4 Blunt the ends of the above fragments with the [Figure PCTCN2021134329-appb-000005] End Repair Module (E6050S);
3.5 Mix primer F5 (gactcggaccacccagccgtnnnnnnnnnnnnnnnnnnnnnnnnnnnnnngcgatcgcttcattc) and primer r5 (phos-gaatgaagcgatcgcnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnacggctgggtggtccgagtc) and anneal them to obtain the small-fragment left ligation adapter; mix primer f6 (phos-cctaggcgcaccaaggaagccnnnnnnnnnnnnnnnnnnnnnnnnnnnnnncctcgagggtacttatataa) and primer r6 (ttatataagtaccctcgaggnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnggcttccttggtgcgcctagg) and anneal them to obtain the small-fragment right ligation adapter;
3.6 Mix the random short enhancer fragments with the two ligation adapters at 1:1 and carry out a ligation reaction to generate long enhancer fragments;
3.7 Run the ligation product on an agarose gel, then excise, recover and purify recombined enhancer fragments of about 800 bp-1 kb to obtain the final randomly recombined enhancer fragments.
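To illustrate how shearing (step 3.3), random ligation (step 3.6) and size selection (step 3.7) combine to give chimeric fragments, the Python sketch below simulates the process in silico. It is a simplified model under stated assumptions: the parent sequences are random placeholders of the stated lengths rather than the real enhancers, shearing is modeled as consecutive cuts, fragment orientation and the adapters are ignored, and the function names `shear` and `shuffle_library` are hypothetical.

```python
import random


def shear(sequence, rng, min_len=150, max_len=550):
    """Cut a sequence into consecutive random-length pieces and keep
    only pieces inside the 150-550 bp window of step 3.3."""
    pieces, pos = [], 0
    while pos < len(sequence):
        size = rng.randint(min_len, max_len)
        piece = sequence[pos:pos + size]
        if min_len <= len(piece) <= max_len:
            pieces.append(piece)
        pos += size
    return pieces


def shuffle_library(parents, n_products, rng, low=800, high=1000,
                    max_tries=100000):
    """Join randomly chosen fragments end to end and keep products whose
    length falls in the 800 bp-1 kb window of step 3.7."""
    pool = [(name, piece)
            for name, seq in parents.items()
            for piece in shear(seq, rng)]
    products = []
    for _ in range(max_tries):
        if len(products) == n_products:
            break
        chosen = [rng.choice(pool) for _ in range(rng.randint(2, 5))]
        joined = "".join(piece for _, piece in chosen)
        if low <= len(joined) <= high:
            products.append(([name for name, _ in chosen], joined))
    return products


rng = random.Random(0)
parents = {  # placeholder sequences of the stated lengths only
    "CMV_en": "".join(rng.choice("ACGT") for _ in range(300)),
    "HBB_en": "".join(rng.choice("ACGT") for _ in range(3000)),
    "SV40_en": "".join(rng.choice("ACGT") for _ in range(237)),
}
library = shuffle_library(parents, n_products=20, rng=rng)
```

Each entry of `library` pairs the list of parent names, in joining order, with the joined sequence, which makes it easy to see how often fragments from different enhancers end up in the same size-selected product.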
3.8 Prepare the linearized cloning backbone: digest the Marriage index tag library with XcmI and recover the 3404 bp fragment as the library backbone;
3.9 Join the randomly recombined enhancer fragments to the library backbone by Gibson assembly and transform the product into Escherichia coli DH10B to obtain the enhancer library, as shown in Figure 9;
3.10 Transiently transfect the enhancer library into K562 cells, select cells with different fluorescence intensities by flow cytometric sorting, extract RNA from the target cells, reverse-transcribe the RNA into cDNA, and PCR-amplify the Barcode-containing sequence for NGS sequencing to obtain the specific Barcode sequences; then use the data analysis results of the Marriage index tag library to obtain the specific index1 and index2 sequences;
3.11 Using the obtained index1 and index2 sequences as primers and the enhancer library as template, PCR-amplify the corresponding enhancer fragments;
3.12 Determine the specific sequence of each candidate enhancer by Sanger sequencing.
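Steps 3.10 and 3.11 amount to a table lookup followed by primer design. A minimal Python sketch is shown below, assuming a Barcode-to-(index1, index2) dictionary is already available from the Marriage library analysis, that index1 lies upstream and index2 downstream of the enhancer on the same strand, and that the reverse primer is therefore the reverse complement of index2. The function names and the orientation are illustrative assumptions, not details confirmed by the source.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def recovery_primers(sorted_barcodes, marriage):
    """Return {barcode: (forward_primer, reverse_primer)} for every
    barcode from sorted cells that is present in the Marriage table."""
    primers = {}
    for barcode in sorted_barcodes:
        pair = marriage.get(barcode)
        if pair is None:
            continue  # barcode not found in the Marriage table
        index1, index2 = pair
        primers[barcode] = (index1, reverse_complement(index2))
    return primers
```

For example, `recovery_primers(["BC1"], {"BC1": ("AACCGG", "TTTGCA")})` pairs the forward primer `AACCGG` with the reverse primer `TGCAAA`; real index sequences would be the full-length random indexes recovered from sequencing.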
The examples described above represent only several embodiments of the present invention. Although they are described specifically and in detail, they should not be construed as limiting the patent scope of the invention. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, all of which fall within the scope of protection of the present invention.

Claims (23)

  1. A vector, comprising a vector backbone and, connected in sequence on the vector backbone, a first terminator, a recombination site, a reporter gene, a multiple cloning site, a post-transcriptional regulatory sequence and a second terminator; the expression product of the reporter gene is one that itself emits light or changes color by catalyzing a substrate reaction, causes the substrate to emit light or change color by catalyzing a substrate reaction, produces emitted light or a color change upon irradiation with excitation light, or confers resistance to the corresponding drug selection.
  2. A plasmid vector, comprising: an index tag, a reporter gene and a barcode tag;
    the barcode tag is a random fragment of 5~200 bp in length;
    the number of index tags is at least 1, each independently selected from random fragments of 5~100 bp in length;
    the expression product of the reporter gene is one that itself emits light or changes color by catalyzing a substrate reaction, causes the substrate to emit light or change color by catalyzing a substrate reaction, produces emitted light or a color change upon irradiation with excitation light, or confers resistance to the corresponding drug selection.
  3. The plasmid vector according to claim 2, characterized in that the barcode tag is a random fragment of 40 bp in length; the number of index tags is 2, wherein index1 is a random fragment of 30 bp in length and index2 is a random fragment of 30 bp in length; the reporter gene is selected from at least one of a fluorescent protein, a luciferase, the LacZ gene or a resistance gene capable of serving for selection, the resistance gene including a puromycin resistance gene.
  4. The plasmid vector according to claim 3, characterized in that it comprises the following elements in order: pUC ori, 5'ITR, BGHpA, index1, index2, reporter gene, barcode tag, WPRE, BGHpA, 3'ITR and a resistance selection marker.
  5. The plasmid vector according to claim 4, characterized in that a restriction site and a randomly recombined regulatory sequence are further included between index1 and index2.
  6. The plasmid vector according to claim 5, characterized in that
    the restriction site is AsiSI;
    the number of restriction sites is 2, located at the two ends of the randomly recombined regulatory sequence;
    the randomly recombined regulatory sequence consists of fragmented promoter fragments or fragmented enhancer fragments.
  7. The plasmid vector according to claim 6, characterized in that the fragmentation is carried out by enzymatic digestion, sonication or artificial synthesis; the promoter is selected from hRO, hRK, mCAR, ProA1, CMV, EF1A, EFS, CAG, CBh, SFFV, MSCV, SV40, mPGK, hPGK, UBC, Nanog, Nes, Tuba1a, Camk2a, SYN1, Hb9, Th, NSE, GFAP, Iba1, hRHO, hBEST1, Prnp, Cnp, K14, BK5, mTyr, cTnT, αMHC, Myog, ACTA1, MHCK7, SM22a, EnSM22a, Runx2, OC, Col1a1, Col2a1, aP2, Adipoq, Tie1, Cd144, CD68, CD11b, Afp, Alb, TBG, MMTV, Wap, HIP, Pdx1, Ins2, Hcn4, NPHS2, SPB, CD144, TERT, TRE, TRE3G, GAL1, MET17, CUP1, AOX1, sCMV, bactin2, Ubi, cmlc2, zK5, 503unc, HSP70, 5×UAS, CaMV35S, Nos, ZmUbi, TEF1, GPD, ADH1, GAP, actin5C, Polyubiquitin, α1-tubulin, Rh2, Mtn, U6, U3, H1, U6-26, TK, RSV, MC1, GAL1, PH, p5, p10, p40, p41, araBAD, cspA or Hsp68; the enhancer is selected from CMV_en, HBB_en or SV40_en.
  8. The plasmid vector according to any one of claims 1 to 7, characterized in that
    it comprises the following elements in order: pUC ori, 5'ITR, BGHpA, index1, AsiSI restriction site, index2, Kozak, TurboGFP gene, barcode tag, WPRE, BGHpA, 3'ITR and an Amp resistance selection marker; wherein the barcode tag is a random fragment of 40 bp in length, index1 is a random fragment of 30 bp in length, and index2 is a random fragment of 30 bp in length;
    or it comprises the following elements in order: pUC ori, 5'ITR, BGHpA, index1, AsiSI restriction site, randomly recombined regulatory sequence, AsiSI restriction site, index2, Kozak, TurboGFP gene, barcode tag, WPRE, BGH pA, 3'ITR and an Amp resistance selection marker; wherein the randomly recombined regulatory sequence has a length of 50~2000 and is a promoter fragment digested with DnaseI or an enzyme-digested enhancer fragment, the barcode tag is a random fragment of 40 bp in length, index1 is a random fragment of 30 bp in length, and index2 is a random fragment of 30 bp in length.
  9. A method for constructing the plasmid vector according to any one of claims 5 to 8, characterized in that a barcode tag, an index tag and a randomly recombined regulatory sequence are inserted into a backbone vector containing a reporter gene.
  10. The construction method according to claim 9, characterized in that the barcode tag is inserted as follows:
    a barcode tag carrying homology arms of the backbone vector is prepared and reacted with the linearized backbone vector by Gibson cloning to construct a tag library.
  11. The construction method according to claim 9, characterized in that the insertion of the index tag and the randomly recombined regulatory sequence comprises:
    preparing the randomly recombined regulatory sequence, then adding homology arms of the backbone vector and index tags to both of its ends to obtain an insert;
    linearizing the tag library prepared by the construction method of claim 10;
    ligating the insert to the linearized tag library to obtain a regulatory sequence library.
  12. The construction method according to claim 11, characterized in that the method for preparing the randomly recombined regulatory sequence comprises digesting a promoter or an enhancer with an enzyme.
  13. The construction method according to claim 11 or 12, characterized in that the preparation of the insert specifically comprises:
    annealing primer F and primer R to form a Y-shaped adaptor; the structure of primer F is homology arm 1-index tag 1-restriction site 1-protection sequence 1; the structure of primer R is protection sequence 2-restriction site 2-index tag 2-homology arm 2; protection sequence 1 and protection sequence 2 are complementary;
    ligating the adaptor to the blunt-ended randomly recombined regulatory sequence to obtain random long fragments of the functional element containing the Y-shaped adapter;
    amplifying the random long fragments of the functional element containing the Y-shaped adapter by PCR to obtain linear fragments;
    ligating the linear fragments to the linearized tag library to construct the regulatory sequence library.
  14. A method for constructing the plasmid vector according to any one of claims 2 to 4, characterized in that the regulatory sequence library constructed by the construction method according to claims 11 to 13 is digested with an enzyme to remove the randomly recombined regulatory sequence, giving an index tag library.
  15. Use of the vector according to any one of claims 1 to 8 in library construction or in screening functional elements.
  16. A method for library construction, characterized in that a randomly fragmented sequence is integrated into the vector according to claim 1 by means of a Y-shaped adapter; the sequence integration site is preferably the recombination site of the vector;
    further, the structure of the Y-shaped adapter comprises, in order, a first homology arm, a first index sequence, a restriction site, a random sequence insertion site, a restriction site, a second index sequence and a second homology arm.
  17. The method according to claim 16, characterized in that the method comprises the following steps:
    S01. constructing a tag library;
    S02. constructing a functional element library;
    S03. constructing an index tag library;
    wherein, in step S02, the randomly fragmented sequences of the functional element are preferably integrated into the vector according to claim 1 by means of the Y-shaped adapter.
  18. The method according to claim 17, characterized in that step S01 is specifically carried out as follows:
    linearizing the vector according to claim 1 and recovering the linearized vector backbone;
    amplifying the vector backbone with primers carrying a random tag sequence and homology arms to obtain a PCR product carrying the random tag sequence;
    ligating the PCR product to the recovered linearized vector backbone to construct the tag library.
  19. The method according to claim 17, characterized in that step S02 is specifically carried out as follows:
    randomly fragmenting the nucleic acid fragments of the functional element to obtain random short fragments of the functional element;
    ligating the random short fragments of the functional element to the Y-shaped adapter to obtain random long fragments of the functional element containing the Y-shaped adapter;
    ligating the random long fragments of the functional element containing the Y-shaped adapter to the tag library constructed in step S01 to construct the functional element library.
  20. The method according to claim 17, characterized in that step S03 is specifically carried out as follows: digesting the functional element library constructed in step S02 with an enzyme to remove the random sequence insertion site, recovering the vector backbone and self-ligating it to construct the index tag library.
  21. Use of the method according to any one of claims 8 to 19 in screening functional elements.
  22. A method for screening functional elements, characterized by comprising a library construction step, wherein the library is constructed by the method for constructing a plasmid vector according to any one of claims 9 to 14, or by the method according to any one of claims 15 to 20.
  23. The method according to claim 22, characterized in that the method comprises the following steps:
    S11. transfecting cells with, or injecting into an experimental animal, the functional element library constructed by the method according to any one of claims 9 to 20;
    S12. selecting cells or tissues according to reporter gene expression, extracting mRNA and reverse-transcribing it into cDNA;
    S13. sequencing the tag, and screening out functional elements through the correspondence among the tag sequence, the first index sequence and the second index sequence.
PCT/CN2021/134329 2020-12-31 2021-11-30 Vector system for screening regulatory sequences and application WO2022142963A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202111247419.3 2020-12-31
CN202111247419.3A CN113957089B (en) 2020-12-31 2020-12-31 Vector system for screening regulatory sequences and application thereof
CN202011630533.X 2020-12-31
CN202011630533.XA CN112725329B (en) 2020-12-31 2020-12-31 Library building method for functional element and application thereof

Publications (1)

Publication Number Publication Date
WO2022142963A1 true WO2022142963A1 (en) 2022-07-07

Family

ID=75608329

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/134329 WO2022142963A1 (en) 2020-12-31 2021-11-30 Vector system for screening regulatory sequences and application

Country Status (2)

Country Link
CN (2) CN113957089B (en)
WO (1) WO2022142963A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116286991A (en) * 2023-02-10 2023-06-23 中国农业科学院农业基因组研究所 Whole genome enhancer screening system, screening method and application

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113957089B (en) * 2020-12-31 2024-02-27 云舟生物科技(广州)股份有限公司 Vector system for screening regulatory sequences and application thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030175724A1 (en) * 2001-04-27 2003-09-18 Wei Zhang Promoter libraries and their use in identifying promoters, transcription initiation sites and transcription factors
US20070161031A1 (en) * 2005-12-16 2007-07-12 The Board Of Trustees Of The Leland Stanford Junior University Functional arrays for high throughput characterization of gene expression regulatory elements
US20130324440A1 (en) * 2011-01-25 2013-12-05 Synpromics Ltd. Method for the construction of specific promoters
CN105603537A (en) * 2016-03-11 2016-05-25 南京工业大学 Construction method and application of promoter library
CN106192021A (en) * 2016-08-02 2016-12-07 中国海洋大学 A kind of construction method in RAD label sequencing library of connecting
CN112725329A (en) * 2020-12-31 2021-04-30 云舟生物科技(广州)有限公司 Library building method for functional element and application thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
RYO KOMURA, WATARU AOKI, KEISUKE MOTONE, ATSUSHI SATOMURA, MITSUYOSHI UEDA, MARK ISALAN: "High-throughput evaluation of T7 promoter variants using biased randomization and DNA barcoding", PLOS ONE, PUBLIC LIBRARY OF SCIENCE, vol. 13, no. 5, 7 May 2018 (2018-05-07), pages e0196905, XP055671168, DOI: 10.1371/journal.pone.0196905 *
ZHOU XUEYING, ET AL.: "Construction of Lentivirus Based Promoter Reporter Random Library for Screening Context Dependent Cis-Elements", MEDICAL JOURNAL OF WEST CHINA, vol. 29, no. 4, 30 April 2017 (2017-04-30), pages 455 - 461, XP055948871, ISSN: 1672-3511, DOI: 10.3969/j.issn.1672-3511.2017.04.003 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116286991A (en) * 2023-02-10 2023-06-23 中国农业科学院农业基因组研究所 Whole genome enhancer screening system, screening method and application
CN116286991B (en) * 2023-02-10 2023-10-13 中国农业科学院农业基因组研究所 Whole genome enhancer screening system, screening method and application

Also Published As

Publication number Publication date
CN113957089B (en) 2024-02-27
CN112725329A (en) 2021-04-30
CN112725329B (en) 2021-11-23
CN113957089A (en) 2022-01-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21913704

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21913704

Country of ref document: EP

Kind code of ref document: A1