WO2021233330A1 - Method for the biosynthesis of protein heterocatenanes - Google Patents

Method for the biosynthesis of protein heterocatenanes

Info

Publication number
WO2021233330A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
p53dim
intein
domain
sequence
Prior art date
Application number
PCT/CN2021/094589
Other languages
English (en)
French (fr)
Inventor
张文彬
刘雅杰
Original Assignee
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学
Priority to US17/999,377 (published as US20230348546A1)
Publication of WO2021233330A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4746: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used p53
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70: Vectors or expression systems specially adapted for E. coli
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00: Preparation of peptides or proteins
    • C12P21/02: Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/20: Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C07K2319/21: Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/50: Fusion polypeptide containing protease site
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/90: Fusion polypeptide containing a motif for post-translational modification
    • C07K2319/92: Fusion polypeptide containing a motif for post-translational modification containing an intein ("protein splicing") domain

Definitions

  • The present invention relates to a method for the biosynthesis of protein heterocatenanes, in particular to a biosynthesis system based on polypeptide-protein reaction pairs and/or split inteins, and to a method for constructing multi-domain protein heterocatenanes with this system through two orthogonal coupling cyclization modes.
  • In nature, many biological macromolecules have specific topological structures that are closely related to their biological functions.
  • The natural topological proteins discovered so far include cyclic, knotted and lasso proteins as well as protein catenanes. Because constructing a cyclic protein only requires a coupling within a single polypeptide chain, cyclic proteins are currently the focus of research on synthetic topological proteins, and they usually show significantly improved thermal stability. Because the protein folding mechanism is complex, it is relatively difficult to control protein topology by controlling the entanglement between polypeptide chains.
  • The simplest [2]catenane is composed of two mechanically interlocked cyclic motifs.
  • The corresponding protein heterocatenane structure can not only combine the advantages of cyclic proteins but also achieve a synergistic effect by regulating the relative position of the two cyclic motifs; such a structure has not yet been discovered in nature. Developing preparation methods for protein heterocatenanes is therefore a very attractive research direction.
  • The first type uses the tetramerization domain p53tet of the tumor suppressor protein p53, or its mutant dimerization domain p53dim, to guide the entanglement of molecular chains, and then closes the rings through efficient and specific native chemical ligation or the SpyTag-SpyCatcher reaction pair, thereby achieving the synthesis of protein homocatenanes.
  • The second type is based on the topological conversion of lasso peptides, which are converted stepwise into higher-order catenanes through enzymatic cleavage and assembly.
  • The third type splits SpyCatcher into BDTag and SpyStapler, rationally rearranges the three motifs based on the folded structure of the SpyTag-SpyCatcher reaction pair, and combines split-intein-mediated cyclization with the autocatalytic formation of isopeptide bonds. This achieved the synthesis of protein heterocatenanes for the first time, but the reaction could not go to completion and the overall purification process was rather cumbersome.
  • Building on the assembly-reaction synergy strategy, further development of biosynthesis methods for protein heterocatenanes will help to study the influence of topology on protein properties and functions, and will lay a foundation for applications in biomedicine.
  • The purpose of the present invention is to provide a biosynthesis strategy for protein heterocatenanes that achieves efficient construction of multi-domain protein heterocatenanes without an additional extracellular reaction step.
  • By mimicking the multi-step post-translational modification process in the synthesis of natural topological proteins, and based on rational gene sequence design that combines in situ assembly, chain cleavage and site-specific cyclization, the present invention develops a synthesis system based on two orthogonal coupling modes. It enables the modular synthesis of protein heterocatenanes with branched or fully main-chain-cyclized structures.
  • The basic structure of the protein precursor sequence designed in the present invention for the preparation of protein heterocatenanes is L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, where:
  • X represents an entanglement motif that can form dimers and is one of the key elements for forming heterocatenanes.
  • The two X motifs can be the same, for example the p53dim domain derived from the tumor suppressor p53 or the HP0242 protein of Helicobacter pylori, forming a homodimeric entanglement motif. The two X motifs can also be different, for example a heterodimeric motif derived from the above dimer motifs by substitution of amino acid residues, or a naturally occurring heterodimeric motif.
  • L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that can undergo orthogonal coupling reactions in the cell and are another key element for forming heterocatenanes.
  • The cyclization motifs can be selected from polypeptide-protein reaction pairs, split inteins, etc.
  • The two cyclization modes should have a certain degree of orthogonality. In some cases an in situ cleavage site must be inserted between L1-2 and L2-1; cleaving this site in situ with a protease co-expressed in the cell then allows the heterocatenane to form. For example, the recognition site of the TVMV protease can be inserted.
  • The choice of the two pairs of cyclization motifs mainly includes the following three modes:
  • (1) Two orthogonal polypeptide-protein reaction pairs, such as the SpyTag-SpyCatcher reaction pair and the SnoopTag-SnoopCatcher reaction pair. In this case an in situ cleavage site must be inserted between the two reaction pairs, so that the single polypeptide chain is cut into two polypeptide chains by the co-expressed protease.
  • (2) A polypeptide-protein reaction pair combined with a split intein. When the polypeptide-protein reaction pair is at the back, that is, L1-1/L1-2 is the split intein and L2-1/L2-2 is the polypeptide-protein reaction pair, in situ cleavage may not be necessary, because split-intein-mediated cyclization is a main-chain coupling and the split intein is released from the precursor protein by self-splicing after cyclization.
  • (3) Two orthogonal split inteins, such as IntC1/IntN1 and IntC2/IntN2 formed by two different split modes of the NpuDnaE split intein, as well as other split inteins such as gp41-1, gp41-8, NrdJ-1 and IMPDH-1. The two split inteins should have a certain degree of orthogonality.
  • The advantage of split-intein-mediated cyclization is that it gives a main-chain-cyclized product and the intein is removed by self-splicing, leaving few residual amino acids. When two orthogonal split inteins are used, it is not necessary to insert the in situ cleavage site.
  • Inserting one or more identical or different target proteins into the basic structure of the aforementioned protein precursor sequence yields a protein heterocatenane containing the target protein(s).
  • The target protein can be inserted within a ring, that is, before and/or after the X domain. Because cyclization mediated by a polypeptide-protein reaction pair is a side-chain coupling, the N-terminus and C-terminus are retained after cyclization. The target protein can therefore also be inserted outside the ring, that is, at the N-terminus and/or C-terminus of the polypeptide-protein reaction pair, thereby constructing branched heterocatenanes.
  • For example, in L1-1-X-POI1-L1-2-(TVMV)-L2-1-X-POI2-L2-2, L1-1/L1-2 and L2-1/L2-2 represent the cyclization motifs of the two orthogonal cyclization modes, X represents the entanglement motif, and POI1 and POI2 represent target protein 1 and target protein 2. The TVMV site is the recognition site of the TVMV protease, which is recognized by the co-expressed TVMV protease and cleaved in situ. A purification tag (such as a histidine tag sequence) is introduced before the second entanglement motif X to facilitate purification of the synthesized heterocatenane.
  • The following examples illustrate the fusion positions of the target proteins:
  • When L1-1/L1-2 and L2-1/L2-2 are orthogonal polypeptide-protein reaction pairs, both cyclizations are side-chain couplings, in situ cleavage is required, and the resulting complexes L1 and L2 remain in the final catenane structure. Therefore, in addition to inserting the target proteins POI1 and POI2 into the two rings to form the heterocatenane cat-L1(X-POI1)-L2(X-POI2), the target proteins POI3, POI4, POI5 and POI6 can be further fused to the N- and C-termini of the polypeptide-protein reaction pairs to construct a branched heterocatenane. The insertion positions are as follows: POI3-L1-1-X-POI1-L1-2-POI4-(TVMV)-POI5-L2-1-X-POI2-L2-2-POI6.
  • When L1-1/L1-2 and L2-1/L2-2 combine a polypeptide-protein reaction pair with a split intein, the complex formed by the split intein is spliced away. If L1-1/L1-2 is a polypeptide-protein reaction pair and L2-1/L2-2 is a split intein, inserting the target proteins POI1 and POI2 into the two rings gives the heterocatenane cat-L1(X-POI1)-(X-POI2), and target proteins (POI3, POI4) can be fused to the N- and C-termini of the L1-1/L1-2 polypeptide-protein reaction pair. The insertion positions are as follows: POI3-L1-1-X-POI1-L1-2-POI4-(TVMV)-L2-1-X-POI2-L2-2.
  • Conversely, if L1-1/L1-2 is a split intein and L2-1/L2-2 is a polypeptide-protein reaction pair, the heterocatenane cat-(X-POI1)-L2(X-POI2) is formed, and further fusing target proteins (POI3, POI4) to the N- and C-termini of the L2-1/L2-2 polypeptide-protein reaction pair gives a branched heterocatenane. The insertion positions are as follows: L1-1-X-POI1-L1-2-(TVMV)-POI3-L2-1-X-POI2-L2-2-POI4.
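The insertion patterns listed above can be reproduced with a small sketch. The Python function below is illustrative only: `precursor_layout` and its boolean flags are hypothetical names that do not appear in the patent, and the rule it encodes (branch POIs flank only polypeptide-protein reaction pairs and are numbered from POI3) is read off the three layouts given in the text. The (TVMV) site is kept in every layout, as in the text's examples.

```python
def precursor_layout(l1_is_pair: bool, l2_is_pair: bool) -> str:
    """Return the N-to-C module order of a precursor with in-ring and branch POIs.

    l1_is_pair / l2_is_pair: True if that cyclization motif is a
    polypeptide-protein reaction pair, False if it is a split intein.
    """
    branch_numbers = iter(range(3, 7))  # branch POIs are numbered POI3..POI6

    def branch(is_pair: bool) -> list:
        # Only reaction pairs keep free termini, so only they can carry a branch POI.
        return [f"POI{next(branch_numbers)}"] if is_pair else []

    modules = []
    modules += branch(l1_is_pair) + ["L1-1", "X", "POI1", "L1-2"] + branch(l1_is_pair)
    modules += ["(TVMV)"]
    modules += branch(l2_is_pair) + ["L2-1", "X", "POI2", "L2-2"] + branch(l2_is_pair)
    return "-".join(modules)


if __name__ == "__main__":
    for flags in [(True, True), (True, False), (False, True)]:
        print(flags, precursor_layout(*flags))
```

Encoding the layouts as a function rather than three hard-coded strings makes the underlying rule explicit: a ring closed by a split intein has no free terminus left to extend.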
  • The strategy of the present invention for the biosynthesis of protein heterocatenanes focuses on the following aspects: (1) using entanglement motifs (X) such as the p53dim domain to achieve mechanical interlocking, and improving the yield of heterocatenanes by converting intermolecular dimerization into intramolecular dimerization; (2) choosing cyclization modes that can occur intracellularly.
  • A split intein usually consists of a larger N-terminal part (IntN) and a relatively small C-terminal part (IntC).
  • The split intein used in the present invention is preferably the NpuDnaE split intein, whose natural split mode gives IntC1, containing 36 amino acids, and IntN1, containing 102 amino acids.
  • IntC2, containing 15 amino acids, and the corresponding IntN2, containing 123 amino acids, obtained by systematically truncating the IntC part, also show good trans-splicing efficiency.
  • IntC1 and IntN2 show a certain degree of reactivity, whereas IntC2 cannot react with IntN1, so the two pairs show a certain degree of orthogonality.
  • The biosynthesis systems for protein heterocatenanes in the present invention all use intramolecular dimerization of entanglement motifs such as the p53dim domain to guide the entanglement of polypeptide chains, but they differ in how orthogonal coupling is achieved.
  • The intracellular cyclization reaction based on a polypeptide-protein reaction pair is a side-chain coupling that leaves the N- and C-termini intact. The complex formed also remains in the final structure, so other target proteins can be further fused to prepare branched protein heterocatenanes.
  • With split inteins, the two ends of the peptide chain are connected by a native peptide bond to achieve main-chain cyclization, and the split intein is released from the precursor protein by self-splicing.
  • The method for the biosynthesis of protein heterocatenanes mainly includes:
  • 1) Design a protein precursor sequence whose basic structure, from N-terminus to C-terminus, is L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, where X represents an entanglement motif that forms a dimer, and L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions in the cell. The two pairs of cyclization motifs can be two orthogonal polypeptide-protein reaction pairs, a combination of a polypeptide-protein reaction pair and a split intein, or two orthogonal split inteins. When L1-1/L1-2 is a polypeptide-protein reaction pair, the in situ cleavage site inserted between L1-2 and L2-1 is an essential element and is cleaved in situ by a protease co-expressed in the cell; otherwise the in situ cleavage site is optional. Target protein sequences are inserted into this basic structure at sites selected from: before the X domain and/or after the X domain, and the N-terminus and/or C-terminus of the polypeptide-protein reaction pair;
  • 2) Construct the coding gene sequence corresponding to the protein precursor sequence of step 1) and introduce it into an expression vector;
  • 3) Transfer the expression vector constructed in step 2) into cells for expression, co-expressing the protease that cleaves the in situ cleavage site if necessary;
  • 4) Purify the fusion protein obtained in step 3) to obtain the corresponding protein heterocatenane.
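The decision logic in step 1) can be summarized in a few lines of code. The following Python sketch is not from the patent; `CyclizationMotif` and `design_summary` are hypothetical names introduced for illustration. It encodes only the rules stated in the text: the in situ cleavage site (and protease co-expression) is essential when the first motif pair is a polypeptide-protein reaction pair, reaction-pair complexes remain in the product, and split inteins splice themselves out to leave main-chain-cyclized rings.

```python
from dataclasses import dataclass

PAIR = "polypeptide-protein reaction pair"
INTEIN = "split intein"


@dataclass(frozen=True)
class CyclizationMotif:
    name: str  # label only, e.g. "SpyCatcher/SpyTag" or "IntC1/IntN1"
    kind: str  # PAIR or INTEIN


def design_summary(first: CyclizationMotif, second: CyclizationMotif) -> dict:
    """Summarize the design consequences of a choice of L1 (first) and L2 (second)."""
    site_required = first.kind == PAIR  # rule stated in step 1)
    return {
        "in_situ_site_required": site_required,
        "coexpress_protease": site_required,
        # Reaction-pair complexes stay in the final catenane; inteins are spliced away.
        "retained_in_product": [m.name for m in (first, second) if m.kind == PAIR],
        "main_chain_cyclized_rings": sum(m.kind == INTEIN for m in (first, second)),
    }


spy = CyclizationMotif("SpyCatcher/SpyTag", PAIR)
int1 = CyclizationMotif("IntC1/IntN1", INTEIN)
int2 = CyclizationMotif("IntC2/IntN2", INTEIN)

if __name__ == "__main__":
    print(design_summary(spy, int1))   # reaction pair first: cleavage site essential
    print(design_summary(int1, int2))  # two split inteins: cleavage site optional
```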
  • The polypeptide-protein reaction pair is preferably the SpyTag-SpyCatcher reaction pair; the amino acid sequences of typical SpyTag and SpyCatcher are as follows, respectively.
  • Reactive SpyTag/SpyCatcher mutants can also be used.
  • A mutant here refers to a peptide chain derived from the above SpyTag/SpyCatcher amino acid sequences by substitution, deletion or addition of amino acid residues, where these changes do not affect the coupling reaction that forms the isopeptide bond.
  • The entanglement motif X is preferably the p53dim domain derived from the tumor suppressor p53.
  • The amino acid sequence of a typical p53dim domain is shown in SEQ ID NO: 3 in the sequence listing. p53dim mutants that form a similar dimerized structure can also be used.
  • A mutant here refers to a peptide chain derived from the above p53dim amino acid sequence by substitution, deletion or addition of amino acid residues, where these changes do not affect the entanglement of the dimer.
  • The N-terminal part (IntN) and the C-terminal part (IntC) of a split intein, preferably NpuDnaE, form the cyclization motif.
  • The amino acid sequences of the split intein fragments, including IntC2 and IntN2, are shown in SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 7 in the sequence listing.
  • Other split inteins that meet these conditions can also be used in the present invention to achieve the biosynthesis of protein heterocatenanes.
  • The in situ cleavage site is preferably the recognition sequence ETVRFQG of the tobacco vein mottling virus (TVMV) protease.
  • The recognition sequence ENLYFQG of the tobacco etch virus (TEV) protease can be introduced before the first entanglement motif X; this sequence can also be used as the in situ cleavage site. Further, to facilitate purification, a histidine tag sequence is introduced before the second entanglement motif X, and the protein is purified by nickel affinity chromatography in step 4).
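As a quick way to check where these two recognition sequences fall in a designed precursor, the sketch below scans a sequence for ETVRFQG and ENLYFQG. It is an illustration, not part of the patent: the function names and the toy sequence are hypothetical, and the cut position (after the Gln, before the final Gly) is an assumption based on the generally reported specificity of TEV and TVMV proteases rather than on anything stated in the text.

```python
RECOGNITION_SITES = {"TVMV": "ETVRFQG", "TEV": "ENLYFQG"}  # sequences from the text


def find_sites(sequence: str) -> list:
    """Return (protease, start, end) for every site, using 1-based inclusive positions."""
    hits = []
    for protease, motif in RECOGNITION_SITES.items():
        index = sequence.find(motif)
        while index != -1:
            hits.append((protease, index + 1, index + len(motif)))
            index = sequence.find(motif, index + 1)
    return sorted(hits, key=lambda hit: hit[1])


def cleave(sequence: str, protease: str) -> list:
    """Split a sequence at every site of one protease.

    Assumption: the cut falls between Q and G of the 7-residue motif.
    """
    motif = RECOGNITION_SITES[protease]
    fragments, position = [], 0
    index = sequence.find(motif)
    while index != -1:
        cut = index + len(motif) - 1  # keep ...Q on the left, G... on the right
        fragments.append(sequence[position:cut])
        position = cut
        index = sequence.find(motif, cut)
    fragments.append(sequence[position:])
    return fragments


if __name__ == "__main__":
    toy = "MKK" + "ENLYFQG" + "SSS" + "ETVRFQG" + "AAA"  # toy sequence, not from the patent
    print(find_sites(toy))
    print(cleave(toy, "TVMV"))
```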
  • In step 3), when L1-1/L1-2 is a polypeptide-protein reaction pair, the precursor must be co-expressed with the protease that cleaves the in situ cleavage site in order to achieve the biosynthesis of protein heterocatenanes; when L1-1/L1-2 is a split intein, there is no need to co-express the protease.
  • The expressed protein is purified by nickel affinity chromatography, which can be combined with gradient elution or size exclusion chromatography to further improve the purity of the protein heterocatenane.
  • IntC1-p53dim(X)-POI1-IntN1-IntC2-p53dim(X)-POI2-IntN2, abbreviated as IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2.
  • The above coding gene was introduced into the expression vector pMCSG19, and the expression vector was then transferred into BL21(DE3) competent cells for expression.
  • Where co-expression is needed, the BL21(DE3) competent cells also contain the pRK1037 plasmid, which encodes TVMV protease. For IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2, the biosynthesis of the protein heterocatenane can be achieved with or without TVMV protease co-expression, with no obvious difference between the two, so the expression vector can be transferred into conventional BL21(DE3) competent cells for expression.
  • The fusion protein obtained is purified to give the corresponding protein heterocatenane.
  • The protein precursor expressed from the BXA-IntC1-X-IntN1 or IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2 recombinant plasmid first forms an intramolecular entangled structure through dimerization of the p53dim domains and then undergoes site-specific cyclization through the two orthogonal coupling modes.
  • On the basis of BXA-IntC1-X-IntN1, other folded proteins, such as the affibody AffiHER2 with high affinity for HER2, can be introduced at the N-terminus of SpyCatcher and the C-terminus of SpyTag, so that branched protein heterocatenanes can be biosynthesized with the same co-expression method. In the IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2 system, the small ubiquitin-like modifier SUMO and the superfolder protein GFP were selected as model proteins to achieve the biosynthesis of the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP.
  • The term “comprising” or “including” means including the stated elements, integers or steps, but does not exclude any other elements, integers or steps.
  • When the term “comprises” or “includes” is used, unless otherwise specified, it also encompasses the situation consisting of the stated elements, integers or steps.
  • The gene sequence corresponding to BXA-IntC1-X-IntN1 is shown in SEQ ID No: 8 in the sequence listing, wherein amino acid residues 8-122 are SpyCatcher, residues 132-138 are the TEV protease recognition sequence, residues 143-180 and 274-311 are the p53dim domains, residues 186-198 are SpyTag, residues 205-211 are the TVMV protease recognition sequence, residues 221-255 are IntC1, residues 261-266 are the 6×His tag, and residues 319-420 are IntN1.
  • the histidine tag sequence is included.
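The residue ranges given for BXA-IntC1-X-IntN1 can be sanity-checked programmatically. The sketch below is an illustration with hypothetical helper names (`span_length`, `find_overlaps`); the ranges themselves are copied from the annotation above. It checks that no two features overlap and that the two protease sites each span seven residues, matching the lengths of ENLYFQG and ETVRFQG.

```python
# (feature, first residue, last residue), 1-based inclusive, as annotated for SEQ ID No: 8
FEATURES = [
    ("SpyCatcher", 8, 122),
    ("TEV site", 132, 138),
    ("p53dim", 143, 180),
    ("SpyTag", 186, 198),
    ("TVMV site", 205, 211),
    ("IntC1", 221, 255),
    ("6xHis", 261, 266),
    ("p53dim", 274, 311),
    ("IntN1", 319, 420),
]


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


def find_overlaps(features: list) -> list:
    """Return pairs of neighbouring features whose ranges overlap."""
    ordered = sorted(features, key=lambda f: f[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]


# Key each length by name and start so the two p53dim copies stay distinct.
lengths = {f"{name}@{start}": span_length(start, end) for name, start, end in FEATURES}

if __name__ == "__main__":
    for key, value in lengths.items():
        print(f"{key:>16}: {value} residues")
    print("overlaps:", find_overlaps(FEATURES))
```

The same check can be applied to the other annotated constructs by swapping in their residue ranges.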
  • The gene sequence corresponding to AffiHER2-BXA-AffiHER2-IntC1-X-IntN1 is shown in SEQ ID No: 9 in the sequence listing, wherein amino acid residues 6-75 and 279-348 are AffiHER2, residues 82-196 are SpyCatcher, residues 206-212 are the TEV protease recognition sequence, residues 217-254 and 424-461 are the p53dim domains, residues 260-272 are SpyTag, residues 355-361 are the TVMV protease recognition sequence, residues 371-405 are IntC1, residues 411-416 are the 6×His tag, and residues 469-570 are IntN1.
  • IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2): from the N-terminus to the C-terminus, the construct consists of the split intein C-terminal part IntC1, the entanglement motif p53dim domain, the target protein SUMO, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entanglement motif p53dim domain and the split intein N-terminal part IntN2. A TEV protease recognition sequence is inserted between IntC1 and the first p53dim domain, a TVMV protease recognition sequence is inserted between IntN1 and IntC2, and a histidine tag sequence is introduced before the second p53dim domain.
  • In this sequence, amino acid residues 62-99 and 358-395 are the p53dim domains, residues 100-195 are the target protein SUMO, residues 203-304 are IntN1, residues 311-317 are the TVMV protease recognition sequence, residues 326-339 are IntC2, residues 345-350 are the 6×His tag, and residues 403-504 are IntN2.
  • IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2): from the N-terminus to the C-terminus, the construct consists of the split intein C-terminal part IntC1, the entanglement motif p53dim domain, the target protein SUMO, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entanglement motif p53dim domain, the target protein GFP and the split intein N-terminal part IntN2. A TEV protease recognition sequence is inserted between IntC1 and the first p53dim domain, a TVMV protease recognition sequence is inserted between IntN1 and IntC2, and a histidine tag sequence is introduced before the second p53dim domain.
  • In this sequence, amino acid residues 62-99 and 358-395 are the p53dim domains, residues 100-195 are the target protein SUMO, residues 203-304 are IntN1, residues 311-317 are the TVMV protease recognition sequence, residues 326-339 are IntC2, residues 345-350 are the 6×His tag, residues 403-640 are the target protein GFP, and the residues up to position 765 are IntN2.
  • The present invention uses conventional characterization methods, such as sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), ultra-performance liquid chromatography-mass spectrometry (LC-MS) and TEV digestion, for basic characterization and topological verification of the prepared protein heterocatenanes.
  • Based on rational gene sequence design combined with in situ assembly, enzymatic cleavage and site-specific cyclization, the present invention develops a biosynthesis system based on orthogonal coupling that is suitable for the intracellular synthesis of heterocatenanes of multiple functional proteins.
  • Figure 1 shows structural schematics of some protein heterocatenanes synthesized by different orthogonal coupling reactions according to the present invention, where L1-1/L1-2 and L2-1/L2-2 represent the cyclization motifs of two orthogonal cyclization modes. When the cyclization motif is a polypeptide-protein reaction pair, side-chain coupling occurs and the resulting complexes, L1 and L2 respectively, remain in the synthesized heterocatenane. When the cyclization motif is a split intein, main-chain coupling occurs and the intein is spliced away after cyclization, so it is absent from the synthesized heterocatenane.
  • Figure 2 shows two representative schemes of the present invention that use orthogonal coupling reactions to synthesize protein heterocatenanes: (a) biosynthesis mediated by in situ cleavage, the SpyTag-SpyCatcher reaction pair and the split intein IntC1/IntN1; (b) biosynthesis mediated by the two orthogonal split inteins IntC1/IntN1 and IntC2/IntN2.
  • Figure 3 shows the size exclusion chromatogram (a) of the protein heterocatenane cat-BXA-X synthesized in the example, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-BXA-X (c).
  • Figure 4 shows the size exclusion chromatogram (a) of the protein heterocatenane cat-(AffiHER2-BXA-AffiHER2)-X synthesized in the example, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-(AffiHER2-BXA-AffiHER2)-X (c).
  • Figure 5 shows the size exclusion chromatogram (a) of the protein heterocatenane cat-XSUMO-X synthesized in the example, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-XSUMO-X (c).
  • Figure 6 shows the size exclusion chromatogram (a) of the protein heterocatenane cat-XSUMO-XGFP synthesized in the example, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-XSUMO-XGFP (c).
  • Figure 7 shows the mass spectra of the TEV digestion products l-BXA (a) and c-X (b) of the protein heterocatenane cat-BXA-X synthesized in the example, and of the TEV digestion products of cat-(AffiHER2-BXA-AffiHER2)-X
  • The folded protein AffiHER2 was further introduced at the N-terminus of SpyCatcher and the C-terminus of SpyTag to construct the AffiHER2-SpyCatcher(B)-p53dim(X)-SpyTag(A)-AffiHER2-IntC1-p53dim(X)-IntN1 (AffiHER2-BXA-AffiHER2-IntC1-X-IntN1) gene sequence.
  • Recombinant genetic engineering was used to construct gene sequences containing the 6×His tag (for protein purification), the p53dim domain, the split intein IntC1/IntN1, the split intein IntC2/IntN2 and the target proteins SUMO/GFP, namely IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2) and IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2). These two gene sequences were inserted into the expression vector pMCSG19 and transferred into BL21(DE3) competent cells for expression. During expression, the biosynthesis of the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP was achieved through in situ assembly and cyclization mediated by the orthogonal split inteins.
  • Example 1: Using the pMCSG19/pRK1037 co-expression system to achieve the biosynthesis of the protein heterocatenanes cat-BXA-X and cat-(AffiHER2-BXA-AffiHER2)-X
  • The seed culture was inoculated at a ratio of 1:100 into 250 mL of 2×YT medium containing the same antibiotics and cultured with shaking at 37 °C until the OD600 reached 0.5-0.7. Isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 0.5 mM, and the culture was then transferred to 16 °C for expression for 20 hours.
  • The cells were collected by centrifugation in a high-speed refrigerated centrifuge (5500 g × 15 min) and the supernatant was discarded.
  • The cells were resuspended in lysis buffer A (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 10 mM imidazole, pH 8.0).
  • The resuspension was disrupted with an ultrasonic cell disruptor in an ice-water bath (5 s on, 5 s off, 30% intensity) and then centrifuged (12000 g × 30 min) to collect the supernatant. The supernatant was mixed evenly with Ni-NTA resin and incubated at 4 °C for 1 h.
  • The resin was rinsed with 5-10 resin volumes of wash buffer B (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 20 mM imidazole, pH 8.0) to reduce non-specific adsorption.
  • The protein heterocatenanes cat-BXA-X, cat-(AffiHER2-BXA-AffiHER2)-X and cat-XSUMO-X can be eluted directly with elution buffer C (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 250 mM imidazole, pH 8.0).
  • For the GFP-containing heterocatenane cat-XSUMO-XGFP, the resin is first eluted with about 10 resin volumes of elution buffer D (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 50 mM imidazole, pH 8.0); the protein eluate collected is essentially the heterocatenane. Elution buffer C is then used to elute the cyclic or catenane by-products of GFP.
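The four buffers differ only in imidazole concentration, so their recipes are easy to tabulate. The following Python sketch converts the stated millimolar concentrations into grams per batch. It is a worked arithmetic aid, not part of the patent: the helper name `grams_needed` is hypothetical, and the molar masses assume anhydrous sodium dihydrogen phosphate, which the text does not specify (a hydrate would need a different molar mass). The pH still has to be adjusted to 8.0 separately.

```python
# Molar masses in g/mol. Assumption: anhydrous NaH2PO4 (the source does not name the hydrate).
MOLAR_MASS = {"NaH2PO4": 119.98, "NaCl": 58.44, "imidazole": 68.08}

# Concentrations in mM, as stated for buffers A to D in the text.
BUFFERS_MM = {
    "A": {"NaH2PO4": 50, "NaCl": 300, "imidazole": 10},   # lysis
    "B": {"NaH2PO4": 50, "NaCl": 300, "imidazole": 20},   # wash
    "C": {"NaH2PO4": 50, "NaCl": 300, "imidazole": 250},  # elution
    "D": {"NaH2PO4": 50, "NaCl": 300, "imidazole": 50},   # mild elution
}


def grams_needed(buffer: str, volume_l: float) -> dict:
    """Mass of each solid (g) for `volume_l` litres of the named buffer."""
    return {
        reagent: round(conc_mm / 1000 * volume_l * MOLAR_MASS[reagent], 3)
        for reagent, conc_mm in BUFFERS_MM[buffer].items()
    }


if __name__ == "__main__":
    for name in BUFFERS_MM:
        print(name, grams_needed(name, 1.0))
```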
  • the protein eluate uses a fast purification liquid chromatography system ( Pure, GE Healthcare) and size exclusion chromatography column (Superdex 200increase 10/300GL, GE Healthcare) for further purification. min, the elution peak of protein was monitored by UV absorption at 280 nm, and samples were collected for characterization.
  • Example 3 For the protein heterogenes purified in Example 3, 5 ⁇ SDS loading buffer was first added, and heated at 98° C. for 10 min, and then SDS-PAGE characterization was performed. After replacing the SEC-purified protein sample into ddH 2 O with an ultrafiltration tube, its molecular weight was characterized by LC-MS. The protein concentration was measured by an ultra-micro spectrophotometer (NanoPhotometer P330, Implen, Inc.). In order to prove the topological structure of heterogeneous cord hydrocarbons, the protein solution (10 ⁇ M) and 10 ⁇ M TEV protease were mixed at a molar ratio of 20:1, and digestion was carried out at 37°C (1, 3, 6 hours, 3 hours is basically sufficient Enzyme digestion is complete).

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Microbiology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Toxicology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Peptides Or Proteins (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

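Provided is a method for the biosynthesis of protein heterocatenanes, in which the basic structure of the protein precursor sequence of the protein heterocatenane comprises, from the N-terminus to the C-terminus: L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein X represents an entangling motif that forms a dimer, and the two X motifs may be the same or different; L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions inside the cell, and these two pairs may be two orthogonal peptide-protein reactive pairs, a combination of a peptide-protein reactive pair and a split intein, or two orthogonal split inteins. When a peptide-protein reactive pair is used together with a split intein, branched protein heterocatenanes can be biosynthesized; when two orthogonal split inteins are used together, protein heterocatenanes with fully cyclized backbones can be obtained.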

Description

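Method for biosynthesis of protein heterocatenane
Priority and related applications
This application claims priority to Chinese patent application No. 202010436910.X, filed on May 21, 2020 and entitled "Method for biosynthesis of protein heterocatenane", the entire contents of which, including the appendices, are incorporated herein by reference.
Technical field
The present invention relates to a method for the biosynthesis of protein heterocatenanes, in particular to a biosynthetic system based on peptide-protein reactive pairs and/or split inteins, and to a method for constructing multi-domain protein heterocatenanes through two orthogonal coupling-cyclization modes based on that system.
Background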
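In nature, many biomacromolecules have specific topologies that are closely related to their biological functions. The natural topological proteins discovered so far include cyclic, knotted and lasso proteins as well as protein catenanes. Because constructing a cyclic protein only requires a coupling within one polypeptide chain, cyclic proteins are currently the focus of research on artificial topological proteins, and they usually show markedly improved thermal stability. Owing to the complexity of protein folding, controlling protein topology by regulating the entanglement between polypeptide chains is comparatively difficult. The simplest catenane, the [2]catenane, consists of two mechanically interlocked rings; a corresponding protein heterocatenane can therefore combine the advantages of cyclic proteins with cooperative effects achieved by adjusting the relative positions of the two rings, yet such a structure has not been found in nature. Developing methods for preparing protein heterocatenanes is therefore a highly attractive research direction.
Reports on artificially synthesized protein catenanes remain relatively few. The synthetic strategies fall into roughly three categories, all of which ultimately rely on the folded structure of proteins to achieve mechanical interlocking. The first uses the tetramerization domain of the tumor suppressor protein p53 (p53tet), or its mutant dimerization domain (p53dim), to direct entanglement between chains, followed by ring closure through efficient and specific native chemical ligation or the SpyTag-SpyCatcher reactive pair, giving protein homocatenanes. The second is based on topological transformation of lasso peptides, which are converted stepwise into higher-order catenanes by enzymatic cleavage and assembly. The third splits SpyCatcher into BDTag and SpyStapler, rationally recombines the three components on the basis of the folded structure of the SpyTag-SpyCatcher pair, and combines split-intein-mediated cyclization with autocatalytic isopeptide bond formation. This gave the first synthesis of a protein heterocatenane, but the reaction does not go to completion and the overall purification is cumbersome. Further developing biosynthetic methods for protein heterocatenanes on the basis of a cooperative assembly-reaction strategy will help to study in greater depth how topology affects protein properties and functions, and will lay a foundation for applications in biomedicine.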
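Summary of the invention
The object of the present invention is to provide a biosynthetic strategy for protein heterocatenanes that allows multi-domain protein heterocatenanes to be constructed efficiently without any additional extracellular reaction.
By mimicking the multistep post-translational modification involved in the synthesis of natural topological proteins, and on the basis of rational gene sequence design combining in situ assembly, chain cleavage and site-specific cyclization, the invention develops synthetic systems based on two orthogonal coupling modes. These systems enable the modular synthesis of protein heterocatenanes with branched or fully backbone-cyclized structures.
The basic structure of the protein precursor sequence designed in the present invention for preparing protein heterocatenanes comprises L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein: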
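(1) X represents an entangling motif capable of forming a dimer and is one of the key elements for forming the heterocatenane. The two X motifs may be the same, for example entangling motifs that form homodimers such as the tumor-suppressor-derived p53dim domain or the Helicobacter pylori protein HP0242; the two X motifs may also be different, for example heterodimeric motifs derived from the above dimeric motifs by substitution of amino acid residues, or naturally occurring heterodimeric entangling motifs.
(2) L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that can undergo orthogonal coupling reactions inside the cell and are the other key element for forming the heterocatenane. The cyclization motifs may be selected from peptide-protein reactive pairs, split inteins and the like; to avoid excessive side reactions, the two cyclization modes should be reasonably orthogonal to each other. In certain cases an in situ cleavage site, for example the recognition site of TVMV protease, must be inserted between L1-2 and L2-1, and the heterocatenane can only be synthesized when a co-expressed protease cleaves this site in situ inside the cell.
The two pairs of cyclization motifs are chosen mainly in one of the following three ways:
① Two orthogonal peptide-protein reactive pairs, for example the SpyTag-SpyCatcher pair and the SnoopTag-SnoopCatcher pair. In this case an in situ cleavage site must be inserted between the two reactive pairs, so that the co-expressed protease cuts the single polypeptide chain into two chains.
② A peptide-protein reactive pair combined with a split intein, for example the SpyTag-SpyCatcher pair with the NpuDnaE split intein (comprising its C-terminal and N-terminal parts). When the peptide-protein pair comes first and the split intein second, that is, L1-1/L1-2 is the peptide-protein pair and L2-1/L2-2 is the split intein, in situ cleavage is needed to trigger the cyclization of L2-1/L2-2, because the intracellular cyclization by the peptide-protein pair is a side-chain coupling and the resulting complex remains in the final structure. When the split intein comes first and the peptide-protein pair second, that is, L1-1/L1-2 is the split intein and L2-1/L2-2 is the peptide-protein pair, in situ cleavage may be dispensable, because split-intein-mediated cyclization is a backbone coupling and the intein releases itself from the precursor protein by self-splicing after cyclization.
③ Two orthogonal split inteins, for example IntC1/IntN1 and IntC2/IntN2 formed by two different split modes of the NpuDnaE intein, or other split inteins such as gp41-1, gp41-8, NrdJ-1 and IMPDH-1; it is sufficient that the two split inteins are reasonably orthogonal. The advantage of split-intein-mediated cyclization is that it produces a backbone-cyclized product and the intein itself is spliced out, leaving few redundant amino acids. When two orthogonal split inteins are used, the in situ cleavage site may be omitted.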
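The three choices above reduce to one rule: the in situ cleavage site is mandatory whenever the first pair (L1-1/L1-2) is a peptide-protein reactive pair, and optional otherwise. The following is a minimal illustrative sketch of that rule, not part of the patent; the class and function names are hypothetical, and the layout string deliberately omits tags and target proteins.

```python
from dataclasses import dataclass

# Labels for the two kinds of cyclization motif described in the text.
PEPTIDE_PROTEIN = "peptide-protein pair"  # side-chain coupling, complex stays in the product
SPLIT_INTEIN = "split intein"             # backbone coupling, intein splices itself out


@dataclass(frozen=True)
class CyclizationPair:
    """One pair of cyclization motifs flanking an entangling motif X (hypothetical helper)."""
    before_x: str  # motif N-terminal to X (L1-1 or L2-1)
    after_x: str   # motif C-terminal to X (L1-2 or L2-2)
    kind: str


def cleavage_site_required(first: CyclizationPair) -> bool:
    """The site is mandatory when the first pair is a peptide-protein pair."""
    return first.kind == PEPTIDE_PROTEIN


def precursor_layout(first: CyclizationPair, second: CyclizationPair, site: str = "TVMV") -> str:
    """Return the N-to-C module order, adding the cleavage site only when required."""
    parts = [first.before_x, "X", first.after_x]
    if cleavage_site_required(first):
        parts.append(f"({site})")
    parts += [second.before_x, "X", second.after_x]
    return "-".join(parts)


spy = CyclizationPair("SpyCatcher", "SpyTag", PEPTIDE_PROTEIN)
npu_1 = CyclizationPair("IntC1", "IntN1", SPLIT_INTEIN)
npu_2 = CyclizationPair("IntC2", "IntN2", SPLIT_INTEIN)

print(precursor_layout(spy, npu_1))    # SpyCatcher-X-SpyTag-(TVMV)-IntC1-X-IntN1
print(precursor_layout(npu_1, npu_2))  # IntC1-X-IntN1-IntC2-X-IntN2
```

An optional site can of course still be included; the dual-intein example constructs in this document keep a TVMV site between IntN1 and IntC2.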
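Inserting one or more identical or different target proteins into the basic structure of the above precursor sequence gives protein heterocatenanes that contain the target proteins. A target protein may be inserted within a ring, that is, before and/or after the X domain. Because cyclization mediated by a peptide-protein reactive pair is a side-chain coupling that leaves the N- and C-termini in place after cyclization, a target protein may also be inserted outside the ring, that is, at the N-terminus and/or C-terminus of the peptide-protein reactive pair, to build branched heterocatenanes.
The gene constructs for the target proteins are shown in Figure 1. In the protein precursor sequence L1-1-X-POI1-L1-2-(TVMV)-L2-1-X-POI2-L2-2, L1-1/L1-2 and L2-1/L2-2 represent the cyclization motifs of two orthogonal cyclization modes, X represents the entangling motif, and POI1 and POI2 represent target protein 1 and target protein 2. The TVMV site is the recognition site of TVMV protease, which can be recognized and cleaved in situ by co-expressed TVMV protease. A purification tag (such as a histidine tag sequence) is introduced before the second entangling motif X to facilitate purification of the synthesized heterocatenane. The positions at which target proteins can be fused are illustrated below:
① When L1-1/L1-2 and L2-1/L2-2 are orthogonal peptide-protein reactive pairs, both undergo side-chain coupling cyclization, in situ cleavage is required, and the resulting complexes L1 and L2 remain in the final catenane structure. Therefore, besides inserting the target proteins POI1 and POI2 into the two rings to form the heterocatenane cat-L1(X-POI1)-L2(X-POI2), target proteins (POI3, POI4, POI5, POI6) can additionally be fused to the N- and C-termini of the peptide-protein reactive pairs to build branched heterocatenanes, with the target proteins inserted as follows: POI3-L1-1-X-POI1-L1-2-POI4-(TVMV)-POI5-L2-1-X-POI2-L2-2-POI6.
② When L1-1/L1-2 and L2-1/L2-2 are a peptide-protein reactive pair combined with a split intein, the complex formed by the split intein is removed by self-splicing. If L1-1/L1-2 is the peptide-protein pair and L2-1/L2-2 is the split intein, inserting POI1 and POI2 into the two rings gives the heterocatenane cat-L1(X-POI1)-(X-POI2); fusing target proteins (POI3, POI4) to the N- and C-termini of the L1-1/L1-2 pair gives branched heterocatenanes, with the target proteins inserted as follows: POI3-L1-1-X-POI1-L1-2-POI4-(TVMV)-L2-1-X-POI2-L2-2. Conversely, if L1-1/L1-2 is the split intein and L2-1/L2-2 is the peptide-protein pair, the heterocatenane cat-(X-POI1)-L2(X-POI2) is formed; fusing target proteins (POI3, POI4) to the N- and C-termini of the L2-1/L2-2 pair gives branched heterocatenanes, with the target proteins inserted as follows: L1-1-X-POI1-L1-2-(TVMV)-POI3-L2-1-X-POI2-L2-2-POI4.
③ When L1-1/L1-2 and L2-1/L2-2 are orthogonal split inteins, the complexes formed by the split inteins are removed by self-splicing while mediating backbone cyclization. Inserting POI1 and POI2 into the two rings gives the heterocatenane cat-(X-POI1)-(X-POI2), in which both cyclic protein units are backbone-cyclized and which contains no redundant components other than the entangling motifs and the target proteins.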
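The naming pattern in the three cases above follows from one observation: a peptide-protein complex (L1 or L2) stays in its ring and offers two extra-ring fusion sites, whereas a split intein leaves no trace. A small sketch of that bookkeeping is given below; the function names and the `PAIR`/`INTEIN` labels are illustrative assumptions rather than anything defined in the patent.

```python
PAIR = "pair"      # peptide-protein reactive pair: complex L1 or L2 is retained
INTEIN = "intein"  # split intein: excised by self-splicing


def ring(index: int, kind: str, poi: str) -> str:
    """Name one ring of the catenane; the L-complex appears only for retained pairs."""
    body = f"X-{poi}"
    return f"L{index}({body})" if kind == PAIR else f"({body})"


def catenane_name(first_kind: str, second_kind: str,
                  poi1: str = "POI1", poi2: str = "POI2") -> str:
    """Compose the product name in the cat-...-... style used in the text."""
    return f"cat-{ring(1, first_kind, poi1)}-{ring(2, second_kind, poi2)}"


def branch_sites(first_kind: str, second_kind: str) -> int:
    """Extra-ring fusion sites: the N- and C-terminus of each retained pair."""
    return 2 * [first_kind, second_kind].count(PAIR)


for kinds in [(PAIR, PAIR), (PAIR, INTEIN), (INTEIN, PAIR), (INTEIN, INTEIN)]:
    print(catenane_name(*kinds), branch_sites(*kinds))
```

The four printed names correspond to cases ①, ② (both orientations) and ③, with 4, 2, 2 and 0 branch sites respectively.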
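The strategy of the invention for the biosynthesis of protein heterocatenanes gives particular consideration to the following aspects: (1) entangling motifs (X) such as the p53dim domain are used to achieve mechanical interlocking, and the heterocatenane yield is raised by converting intermolecular dimerization into intramolecular dimerization; (2) cyclization modes that can take place inside the cell are chosen, the most widely used at present being peptide-protein reactive pairs and split inteins; (3) the two cyclization modes are reasonably orthogonal to each other to avoid excessive side reactions, for example by combining the SpyTag-SpyCatcher pair with a split intein or by choosing two reasonably orthogonal split inteins; (4) a split intein usually comprises a larger N-terminal part (IntN) and a smaller C-terminal part (IntC), and when IntC lies within the chain and the reaction is thereby hindered, a co-expressed protease can cleave the nascent polypeptide chain in situ and so trigger the trans-splicing reaction mediated by that split intein.
The split intein used in the invention is preferably the NpuDnaE split intein, whose natural split gives IntC1 with 36 amino acids and IntN1 with 102 amino acids. IntC2 with 15 amino acids, obtained by systematically truncating the IntC part, and the corresponding IntN2 with 123 amino acids likewise show good trans-splicing efficiency. Although IntC1 shows some reactivity toward IntN2, IntC2 cannot react with IntN1, which provides a degree of orthogonality.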
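The residue counts in this paragraph can be checked with simple arithmetic: both split modes partition the same intein, so the two halves should add up to the same total, and the truncation moves a fixed number of residues from IntC to IntN. The sketch below encodes only what the text states; the reactivity labels are shorthand for the qualitative description and are not measured values.

```python
# Residue counts stated in the text for the two split modes of the NpuDnaE intein.
SPLIT_MODES = {
    "split 1": {"IntC1": 36, "IntN1": 102},
    "split 2": {"IntC2": 15, "IntN2": 123},
}

# Qualitative reactivity as described in the text (labels are shorthand only).
REACTIVITY = {
    ("IntC1", "IntN1"): "efficient",
    ("IntC2", "IntN2"): "efficient",
    ("IntC1", "IntN2"): "some cross-reactivity",
    ("IntC2", "IntN1"): "none",
}


def total_length(mode: str) -> int:
    """Sum of the C- and N-terminal parts for one split mode."""
    return sum(SPLIT_MODES[mode].values())


def residues_moved_to_intn() -> int:
    """Residues that shift from IntC to IntN when going from split 1 to split 2."""
    return SPLIT_MODES["split 1"]["IntC1"] - SPLIT_MODES["split 2"]["IntC2"]


def reactive_mismatches() -> list:
    """Non-cognate pairs (different split index) that still react to some extent."""
    return [pair for pair, level in REACTIVITY.items()
            if pair[0][-1] != pair[1][-1] and level != "none"]


print(total_length("split 1"), total_length("split 2"))  # 138 138
print(residues_moved_to_intn())                           # 21
print(reactive_mismatches())                              # [('IntC1', 'IntN2')]
```

The single reactive mismatch, IntC1 with IntN2, is the reason the text describes the two pairs as only partly orthogonal.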
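All of the biosynthetic systems for protein heterocatenanes in the invention use the intramolecular dimerization of entangling motifs such as the p53dim domain to direct the entanglement of the polypeptide chains, but they differ in how orthogonal coupling is achieved. Intracellular cyclization based on a peptide-protein reactive pair is a side-chain coupling that leaves intact N- and C-termini, and the resulting complex remains in the final structure, so branched protein heterocatenanes can be prepared by fusing further target proteins. In contrast, intracellular cyclization based on a split intein joins the two ends of the peptide chain through a native peptide bond to achieve backbone cyclization, and the split intein releases itself from the precursor protein by self-splicing.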
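The method for the biosynthesis of protein heterocatenanes provided by the invention mainly comprises:
1) designing the protein precursor sequence of the protein heterocatenane, the basic structure of which comprises, from the N-terminus to the C-terminus: L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein X represents an entangling motif that forms a dimer; L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions inside the cell, and these two pairs may be two orthogonal peptide-protein reactive pairs, a combination of a peptide-protein reactive pair and a split intein, or two orthogonal split inteins; when L1-1/L1-2 is a peptide-protein reactive pair, the in situ cleavage site inserted between L1-2 and L2-1 is an essential element and is cleaved in situ by a protease co-expressed in the cell, otherwise the in situ cleavage site is a non-essential element; target protein sequences are inserted into the above basic structure at sites selected from: before and/or after the X domain, and the N-terminus and/or C-terminus of a peptide-protein reactive pair;
2) constructing the coding gene sequence corresponding to the protein precursor sequence of step 1) and introducing it into an expression vector;
3) transforming the expression vector constructed in step 2) into cells for expression and, where necessary, co-expressing in the cells a protease that cleaves the in situ cleavage site;
4) purifying the fusion protein obtained in step 3) to give the corresponding protein heterocatenane.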
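In step 1) above, the peptide-protein reactive pair is preferably the SpyTag-SpyCatcher pair or the SnoopTag-SnoopCatcher pair. Typical amino acid sequences of SpyTag and SpyCatcher are shown as SEQ ID NO: 1 and SEQ ID NO: 2 in the sequence listing, respectively; reactive SpyTag/SpyCatcher mutants may also be used. Such a mutant is a peptide chain derived from the above SpyTag/SpyCatcher amino acid sequences by substitution, deletion or addition of amino acid residues, where the substituted, deleted or added residues do not affect the coupling reaction that forms the isopeptide bond.
In step 1) above, the entangling motif X is preferably the tumor-suppressor-derived p53dim domain. A typical amino acid sequence of the p53dim domain is shown as SEQ ID NO: 3 in the sequence listing; p53dim mutants capable of forming a similar dimeric structure may also be used. Such a mutant is a peptide chain derived from the above p53dim amino acid sequence by substitution, deletion or addition of amino acid residues, where the substituted, deleted or added residues do not affect the formation of the entangled dimer.
In step 1) above, the split intein is preferably the NpuDnaE split intein, whose N-terminal part (IntN) and C-terminal part (IntC) constitute the cyclization motifs. The amino acid sequences of IntC1, IntN1, IntC2 and IntN2 produced by its two split modes are shown as SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 7 in the sequence listing, respectively. Other split inteins that meet the requirements may also be used in the invention to achieve the biosynthesis of protein heterocatenanes.
In step 1) above, the in situ cleavage site is preferably the recognition sequence ETVRFQG of tobacco vein mottling virus (TVMV) protease.
In step 1) above, to allow the topology of the synthesized protein heterocatenane to be proven, the recognition sequence ENLYFQG of tobacco etch virus (TEV) protease may be introduced before the first entangling motif X; this sequence may also serve as the in situ cleavage site. Furthermore, to facilitate purification, a histidine tag sequence is introduced before the second entangling motif X, and the protein is purified in step 4) by nickel affinity chromatography.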
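The two recognition sequences named above, ETVRFQG for TVMV protease and ENLYFQG for TEV protease, can be located in a precursor sequence with plain string search. The sketch below assumes that both proteases cut between the Q and the G of their site, which is the commonly described behaviour but is not stated in this document; the toy sequence and the function names are hypothetical.

```python
# Recognition sequences given in the text.
SITES = {"TEV": "ENLYFQG", "TVMV": "ETVRFQG"}


def cleavage_points(sequence: str, protease: str) -> list:
    """Return 0-based indices at which the chain is cut (assumed: between Q and G)."""
    site = SITES[protease]
    points = []
    start = sequence.find(site)
    while start != -1:
        points.append(start + len(site) - 1)  # index of the G that begins the next fragment
        start = sequence.find(site, start + 1)
    return points


def cleave(sequence: str, protease: str) -> list:
    """Split a linear sequence into the fragments produced by one protease."""
    fragments, previous = [], 0
    for point in cleavage_points(sequence, protease):
        fragments.append(sequence[previous:point])
        previous = point
    fragments.append(sequence[previous:])
    return fragments


# Toy sequence for illustration only; it is not any SEQ ID of the patent.
toy = "MAAA" + SITES["TEV"] + "SSS" + SITES["TVMV"] + "KKK"

print(cleave(toy, "TEV"))   # ['MAAAENLYFQ', 'GSSSETVRFQGKKK']
print(cleave(toy, "TVMV"))  # ['MAAAENLYFQGSSSETVRFQ', 'GKKK']
```

Because the two recognition sequences differ, each protease cuts only its own site, which is what lets the TVMV site serve for in situ cleavage during expression while the TEV site is reserved for the later topology proof.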
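In step 3) above, when L1-1/L1-2 is a peptide-protein reactive pair, the protease for the in situ cleavage site must be co-expressed for the protein heterocatenane to be biosynthesized; when L1-1/L1-2 is a split intein, the protease does not need to be co-expressed.
In step 4) above, protein heterocatenanes carrying a histidine tag sequence are purified from the expressed protein by nickel affinity chromatography, which may be combined with gradient elution or size-exclusion chromatography to further raise the purity of the protein heterocatenane.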
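In the examples of the invention, as shown in Figure 2, the following protein precursor sequences were designed:
SpyCatcher(B)-p53dim(X)-SpyTag(A)-IntC1-p53dim(X)-IntN1, abbreviated as BXA-IntC1-X-IntN1;
IntC1-p53dim(X)-POI1-IntN1-IntC2-p53dim(X)-POI2-IntN2, abbreviated as IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2.
The above coding genes were introduced into the expression vector pMCSG19, and the expression vectors were then transformed into BL21(DE3) competent cells for expression. For the system that requires a co-expressed protease, BXA-IntC1-X-IntN1, the BL21(DE3) competent cells additionally contain the pRK1037 plasmid, which encodes TVMV protease. IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2, by contrast, gives the protein heterocatenane whether it is expressed alone or together with TVMV protease, with no obvious difference between the two, so its expression vector is simply transformed into ordinary BL21(DE3) competent cells. Finally, the fusion proteins obtained are purified to give the corresponding protein heterocatenanes.
The protein precursors expressed from the BXA-IntC1-X-IntN1 or IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2 recombinant plasmids first form an intramolecularly entangled structure through dimerization of the p53dim domains and are then cyclized site-specifically through two orthogonal coupling modes. In the BXA-IntC1-X-IntN1 system, in situ cleavage by co-expressed TVMV protease is indispensable: it triggers the trans-splicing reaction mediated by IntC1/IntN1, which, together with the side-chain cyclization mediated by the SpyTag-SpyCatcher pair, finally gives the protein heterocatenane cat-BXA-X. In the IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2 system, the two split intein pairs can undergo trans-splicing one after the other, cyclizing the two target proteins in turn and finally giving the protein heterocatenane cat-XPOI1-XPOI2.
On the basis of BXA-IntC1-X-IntN1, other folded proteins, for example the affibody AffiHER2, which has high affinity for HER2, can be introduced at the N-terminus of SpyCatcher and the C-terminus of SpyTag, and branched protein heterocatenanes can be biosynthesized by the same co-expression approach. In the IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2 system, the small ubiquitin-like modifier SUMO and superfolder GFP were chosen as model proteins, and the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP were biosynthesized, respectively.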
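Detailed description of the invention
Before the invention is described in detail, it should be understood that the invention is not limited to the particular methods and experimental conditions in this specification, because such methods and conditions may vary. In addition, the terminology used herein is only for describing particular embodiments and is not intended to be limiting.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art. For the purposes of the invention, the following terms are defined below.
The term "and/or", when used to connect two or more options, should be understood to mean any one of the options or any two or more of the options.
As used herein, the term "comprising" or "including" means including the stated elements, integers or steps without excluding any other elements, integers or steps. Herein, when the term "comprising" or "including" is used, unless otherwise indicated, it also covers the case consisting of the stated elements, integers or steps.
Various exemplary embodiments, features and aspects of the invention are described in detail below. The word "exemplary" as used here means "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as superior to or better than other embodiments.
In addition, numerous specific details are given in the embodiments below to better illustrate the invention. Those skilled in the art will understand that the invention can be practiced without some of these details. In other instances, methods, means, equipment and steps well known to those skilled in the art are not described in detail, so as to highlight the gist of the invention.
Unless otherwise stated, the units used in this specification are SI units, and the numerical values and ranges appearing in the invention should be understood to include the systematic errors unavoidable in industrial production.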
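The sequences of the protein precursors used in the biosynthesis of protein heterocatenanes are illustrated below with some specific examples:
(a) SpyCatcher(B)-p53dim(X)-SpyTag(A)-IntC1-p53dim(X)-IntN1 (BXA-IntC1-X-IntN1): from the N-terminus to the C-terminus, the reactive motif SpyCatcher, the entangling motif p53dim domain, the reactive motif SpyTag, the split intein C-terminal part IntC1, the entangling motif p53dim domain and the split intein N-terminal part IntN1. The TEV protease recognition sequence is inserted between SpyCatcher and the first p53dim domain, the TVMV protease recognition sequence is inserted between SpyTag and IntC1, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to BXA-IntC1-X-IntN1 is shown as SEQ ID No: 8 in the sequence listing, in which amino acid residues 8-122 are SpyCatcher, residues 132-138 are the TEV protease recognition sequence, residues 186-198 are SpyTag, residues 143-180 and 274-311 are the p53dim domains, residues 205-211 are the TVMV protease recognition sequence, residues 221-255 are IntC1, residues 261-266 are the 6×His tag, and residues 319-420 are IntN1.
(b) AffiHER2-SpyCatcher(B)-p53dim(X)-SpyTag(A)-AffiHER2-IntC1-p53dim(X)-IntN1 (AffiHER2-BXA-AffiHER2-IntC1-X-IntN1): from the N-terminus to the C-terminus, the target protein AffiHER2, the reactive motif SpyCatcher, the entangling motif p53dim domain, the reactive motif SpyTag, the target protein AffiHER2, the split intein C-terminal part IntC1, the entangling motif p53dim domain and the split intein N-terminal part IntN1. The TEV protease recognition sequence is inserted between SpyCatcher and the first p53dim domain, the TVMV protease recognition sequence is inserted between the second AffiHER2 and IntC1, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to AffiHER2-BXA-AffiHER2-IntC1-X-IntN1 is shown as SEQ ID No: 9 in the sequence listing, in which amino acid residues 6-75 and 279-348 are AffiHER2, residues 82-196 are SpyCatcher, residues 206-212 are the TEV protease recognition sequence, residues 260-272 are SpyTag, residues 217-254 and 424-461 are the p53dim domains, residues 355-361 are the TVMV protease recognition sequence, residues 371-405 are IntC1, residues 411-416 are the 6×His tag, and residues 469-570 are IntN1.
(c) IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2): from the N-terminus to the C-terminus, the split intein C-terminal part IntC1, the entangling motif p53dim domain, the target protein SUMO, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entangling motif p53dim domain and the split intein N-terminal part IntN2. The TEV protease recognition sequence is inserted between IntC1 and the first p53dim domain, the TVMV protease recognition sequence is inserted between IntN1 and IntC2, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to IntC1-X-SUMO-IntN1-IntC2-X-IntN2 is shown as SEQ ID No: 10 in the sequence listing, in which amino acid residues 8-42 are IntC1, residues 48-54 are the TEV protease recognition sequence, residues 62-99 and 358-395 are the p53dim domains, residues 100-195 are the target protein SUMO, residues 203-304 are IntN1, residues 311-317 are the TVMV protease recognition sequence, residues 345-350 are 6×His, residues 326-339 are IntC2, and residues 403-504 are IntN2.
(d) IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2): from the N-terminus to the C-terminus, the split intein C-terminal part IntC1, the entangling motif p53dim domain, the target protein SUMO, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entangling motif p53dim domain, the target protein GFP and the split intein N-terminal part IntN2. The TEV protease recognition sequence is inserted between IntC1 and the first p53dim domain, the TVMV protease recognition sequence is inserted between IntN1 and IntC2, and a histidine tag sequence is introduced before the second p53dim domain. The sequence corresponding to IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2 is shown as SEQ ID No: 11 in the sequence listing, in which amino acid residues 8-42 are IntC1, residues 48-54 are the TEV protease recognition sequence, residues 62-99 and 358-395 are the p53dim domains, residues 100-195 are the target protein SUMO, residues 203-304 are IntN1, residues 311-317 are the TVMV protease recognition sequence, residues 345-350 are 6×His, residues 326-339 are IntC2, residues 403-640 are the target protein GFP, and residues 643-765 are IntN2.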
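Residue annotations like those in (a) to (d) are easy to sanity-check mechanically: the segments should not overlap, and sorting them by start position should reproduce the stated N-to-C module order. The sketch below does this for construct (a) using only the residue ranges quoted above for SEQ ID No: 8; the helper names are hypothetical, and the lengths it reports are simply those implied by the quoted ranges.

```python
# (name, first residue, last residue) for construct (a), as quoted in the text.
BXA_INTC1_X_INTN1 = [
    ("SpyCatcher", 8, 122),
    ("TEV site", 132, 138),
    ("SpyTag", 186, 198),
    ("p53dim #1", 143, 180),
    ("p53dim #2", 274, 311),
    ("TVMV site", 205, 211),
    ("IntC1", 221, 255),
    ("6xHis", 261, 266),
    ("IntN1", 319, 420),
]


def segment_lengths(annotation: list) -> dict:
    """Length of each segment, counting both end residues."""
    return {name: end - start + 1 for name, start, end in annotation}


def module_order(annotation: list) -> list:
    """Segment names sorted from N-terminus to C-terminus."""
    return [name for name, _, _ in sorted(annotation, key=lambda seg: seg[1])]


def overlapping(annotation: list) -> list:
    """Pairs of neighbouring segments whose residue ranges overlap."""
    ordered = sorted(annotation, key=lambda seg: seg[1])
    return [(a[0], b[0]) for a, b in zip(ordered, ordered[1:]) if b[1] <= a[2]]


print(module_order(BXA_INTC1_X_INTN1))
print(segment_lengths(BXA_INTC1_X_INTN1))
print(overlapping(BXA_INTC1_X_INTN1))  # []
```

The sorted order places the TEV site between SpyCatcher and the first p53dim domain, the TVMV site between SpyTag and IntC1, and the 6×His tag just before the second p53dim domain, in agreement with the description of construct (a).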
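The invention uses conventional characterization techniques, such as sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), ultra-performance liquid chromatography-mass spectrometry (LC-MS) and TEV digestion, for basic characterization of the prepared protein heterocatenanes and for proof of their topology.
On the basis of rational gene sequence design combining in situ assembly, enzymatic cleavage and site-specific cyclization, the invention develops biosynthetic systems based on orthogonal coupling that are suitable for the intracellular synthesis of heterocatenanes of various functional proteins. Their main advantages are: 1) genetic encoding enables modular synthesis of heterocatenanes, the intramolecular dimerization of dimeric entangling motifs such as the p53dim domain raises the yield of protein heterocatenanes, and there are many possible choices of entangling motifs and coupling methods; 2) by mimicking the multistep post-translational modification in the synthesis of natural topological proteins, chain entanglement and two orthogonal covalent cyclization reactions are completed directly inside the cell, no additional extracellular reaction is needed, and the protein heterocatenane is obtained directly after expression and purification; 3) in constructs containing a peptide-protein reactive pair, such as BXA-IntC1-X-IntN1, introducing other folded proteins at the N-terminus of SpyCatcher and the C-terminus of SpyTag gives branched protein heterocatenanes, whereas constructs containing two orthogonal split inteins, such as IntC1-X-POI1-IntN1-IntC2-X-POI2-IntN2, give protein heterocatenanes with fully cyclized backbones. Both systems extend the range of available protein heterocatenane structures.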
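Brief description of the drawings
Figure 1 shows schematic structures of some of the protein heterocatenanes synthesized in the invention using different orthogonal coupling reactions, in which L1-1/L1-2 and L2-1/L2-2 represent the cyclization motifs of two orthogonal cyclization modes. When the cyclization motifs are a peptide-protein reactive pair, side-chain coupling takes place, and the resulting complexes, L1 and L2 respectively, remain in the synthesized heterocatenane; when the cyclization motifs are a split intein, backbone coupling takes place, and the intein leaves by self-splicing after cyclization and is absent from the synthesized heterocatenane.
Figure 2 shows schematics of two representative syntheses of protein heterocatenanes by orthogonal coupling reactions in the invention: (a) biosynthesis mediated by in situ cleavage, the SpyTag-SpyCatcher reactive pair and the split intein IntC1/IntN1; (b) biosynthesis mediated by the two orthogonal split inteins IntC1/IntN1 and IntC2/IntN2.
Figure 3 shows the size-exclusion chromatogram (a) of the protein heterocatenane cat-BXA-X synthesized in the examples, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-BXA-X (c).
Figure 4 shows the size-exclusion chromatogram (a) of the protein heterocatenane cat-(AffiHER2-BXA-AffiHER2)-X synthesized in the examples, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-(AffiHER2-BXA-AffiHER2)-X (c).
Figure 5 shows the size-exclusion chromatogram (a) of the protein heterocatenane cat-XSUMO-X synthesized in the examples, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-XSUMO-X (c).
Figure 6 shows the size-exclusion chromatogram (a) of the protein heterocatenane cat-XSUMO-XGFP synthesized in the examples, the SDS-PAGE results before and after TEV digestion (b), and the mass spectrum of cat-XSUMO-XGFP (c).
Figure 7 shows the mass spectra of the TEV digestion products l-BXA (a) and c-X (b) of cat-BXA-X, of the TEV digestion products l-AffiHER2-BXA-AffiHER2 (c) and c-X (d) of cat-(AffiHER2-BXA-AffiHER2)-X, and of the TEV digestion products l-XSUMO (e) and c-X (f) of cat-XSUMO-X, all synthesized in the examples.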
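Detailed embodiments
The invention is further described in detail below by way of examples, which do not limit the scope of the invention in any way.
Specific steps for constructing the protein precursors and the corresponding expression systems for the biosynthesis of protein heterocatenanes:
1) For the system in which the SpyTag-SpyCatcher reactive pair and the split intein IntC1/IntN1 jointly mediate heterocatenane synthesis, recombinant genetic engineering is used to construct a gene sequence containing a 6×His tag (for protein purification), the SpyTag and SpyCatcher reactive pair, p53dim domains and the split intein IntC1/IntN1, namely SpyCatcher(B)-p53dim(X)-SpyTag(A)-IntC1-p53dim(X)-IntN1 (BXA-IntC1-X-IntN1). On the basis of this gene sequence, the folded protein AffiHER2 is further introduced at the N-terminus of SpyCatcher and at the C-terminus of SpyTag to construct the gene sequence AffiHER2-SpyCatcher(B)-p53dim(X)-SpyTag(A)-AffiHER2-IntC1-p53dim(X)-IntN1 (AffiHER2-BXA-AffiHER2-IntC1-X-IntN1). The two gene sequences are each inserted into the expression vector pMCSG19 and transformed into BL21(DE3) competent cells containing the pRK1037 plasmid, which encodes TVMV protease, for expression. During expression, the protein heterocatenanes cat-BXA-X and cat-(AffiHER2-BXA-AffiHER2)-X are biosynthesized through in situ assembly, enzymatic cleavage and site-specific cyclization.
2) For the system in which orthogonal split inteins mediate heterocatenane synthesis, recombinant genetic engineering is used to construct gene sequences containing a 6×His tag (for protein purification), p53dim domains, the split intein IntC1/IntN1, the split intein IntC2/IntN2 and the target proteins SUMO/GFP, namely
IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-IntN2), or
IntC1-p53dim-SUMO-IntN1-IntC2-p53dim-GFP-IntN2 (IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2). The two gene sequences are each inserted into the expression vector pMCSG19 and transformed into BL21(DE3) competent cells for expression. During expression, the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP are biosynthesized through in situ assembly and cyclization mediated by the orthogonal split inteins.
The prepared protein heterocatenanes are characterized, and their topology is proven, by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), ultra-performance liquid chromatography-mass spectrometry (LC-MS) and TEV digestion.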
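Example 1: Biosynthesis of the protein heterocatenanes cat-BXA-X and cat-(AffiHER2-BXA-AffiHER2)-X using the pMCSG19/pRK1037 co-expression system
The gene fragments of BXA-IntC1-X-IntN1 and AffiHER2-BXA-AffiHER2-IntC1-X-IntN1 were each inserted into the expression vector pMCSG19; their sequences are shown as SEQ ID No: 8 and SEQ ID No: 9 in the sequence listing, respectively. After confirmation by sequencing, the constructs were transformed into BL21(DE3) competent cells containing the pRK1037 plasmid and grown overnight at 37°C on dual-antibiotic plates containing 100 μg/mL ampicillin sodium and 50 μg/mL kanamycin. A single colony was then picked and inoculated into 5 mL of 2×YT medium containing the same antibiotics and cultured with shaking at 37°C for 10-12 hours to prepare the seed culture. The seed culture was inoculated at a ratio of 1:100 into 250 mL of 2×YT medium containing the same antibiotics and cultured with shaking at 37°C until the OD600 reached 0.5-0.7. Isopropyl-β-D-thiogalactopyranoside (IPTG) was then added to a final concentration of 0.25 mM, and the culture was transferred to 16°C for 20 hours of expression.
Example 2: Biosynthesis of the protein heterocatenanes cat-XSUMO-X and cat-XSUMO-XGFP
The gene fragments of IntC1-X-SUMO-IntN1-IntC2-X-IntN2 and IntC1-X-SUMO-IntN1-IntC2-X-GFP-IntN2 were each inserted into the expression vector pMCSG19; their sequences are shown as SEQ ID No: 10 and SEQ ID No: 11 in the sequence listing, respectively. After confirmation by sequencing, the constructs were transformed into BL21(DE3) competent cells and grown overnight at 37°C on plates containing 100 μg/mL ampicillin sodium. A single colony was then picked and inoculated into 5 mL of 2×YT medium containing the same antibiotic and cultured with shaking at 37°C for 10-12 hours to prepare the seed culture. The seed culture was inoculated at a ratio of 1:100 into 250 mL of 2×YT medium containing the same antibiotic and cultured with shaking at 37°C until the OD600 reached 0.5-0.7. Isopropyl-β-D-thiogalactopyranoside (IPTG) was then added to a final concentration of 0.5 mM, and the culture was transferred to 16°C for 20 hours of expression.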
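The inoculation and induction volumes in Examples 1 and 2 follow from simple dilution arithmetic (C1·V1 = C2·V2). The sketch below works them out for the 250 mL cultures; the 1 M IPTG stock concentration is an assumption made for illustration and is not given in the source, and the function names are hypothetical.

```python
def inoculum_ml(culture_ml: float, dilution: int = 100) -> float:
    """Seed culture volume for a 1:dilution inoculation."""
    return culture_ml / dilution


def iptg_stock_ul(final_mm: float, culture_ml: float, stock_mm: float = 1000.0) -> float:
    """Volume of IPTG stock (µL) giving the final concentration in the culture.

    stock_mm defaults to 1000 mM (1 M), an assumed stock concentration.
    The small volume added is neglected relative to the culture volume.
    """
    return final_mm * culture_ml / stock_mm * 1000.0


print(inoculum_ml(250))           # 2.5 mL of seed culture
print(iptg_stock_ul(0.25, 250))   # Example 1: 62.5 µL of 1 M stock
print(iptg_stock_ul(0.5, 250))    # Example 2: 125.0 µL of 1 M stock
```

With a different stock concentration, only the `stock_mm` argument changes; the final concentrations of 0.25 mM and 0.5 mM are the values stated in the two examples.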
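Example 3: Purification of the protein heterocatenanes
After protein expression, the cells were collected by centrifugation in a high-speed refrigerated centrifuge (5500 g × 15 min), and the supernatant was discarded. The cells were resuspended in lysis buffer A (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 10 mM imidazole, pH 8.0). The resuspension was sonicated with an ultrasonic cell disruptor in an ice-water bath (5 seconds on, 5 seconds off, 30% intensity) and then centrifuged (12000 g × 30 min) to collect the supernatant. The supernatant was mixed thoroughly with Ni-NTA resin and incubated at 4°C for 1 h. The mixture was poured into an empty PD-10 gravity column for purification; after the lysate had drained, the resin was rinsed with 5-10 resin volumes of wash buffer B (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 20 mM imidazole, pH 8.0) to reduce non-specific adsorption. The protein heterocatenanes cat-BXA-X, cat-(AffiHER2-BXA-AffiHER2)-X and cat-XSUMO-X can be eluted directly with elution buffer C (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 250 mM imidazole, pH 8.0). For the protein heterocatenane cat-XSUMO-XGFP, gradient elution was used to raise its purity: elution buffer D (50 mM sodium dihydrogen phosphate, 300 mM sodium chloride, 50 mM imidazole, pH 8.0) was applied first for about 10 resin volumes, and the protein eluate collected in this step is essentially the heterocatenane; elution buffer C was then used to elute the GFP-containing cyclic or catenane by-products.
The protein eluate was further purified using a fast protein liquid chromatography system (Figure PCTCN2021094589-appb-000004 pure, GE Healthcare) and a size-exclusion column (Superdex 200 Increase 10/300 GL, GE Healthcare). The mobile phase was phosphate-buffered saline (PBS, pH 7.4) filtered through a 0.22 μm membrane, at a flow rate of 0.5 mL/min; the protein elution peak was monitored by UV absorption at 280 nm, and samples were collected for characterization.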
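The Ni-NTA step of Example 3 is essentially a stepwise increase in imidazole concentration, with an extra intermediate step (buffer D) for cat-XSUMO-XGFP. The sketch below records the four buffers and the order in which they are applied; it is only an illustrative summary of the text, with hypothetical names, and the resin volume in the usage line is an arbitrary example value.

```python
# Imidazole concentration (mM) of each buffer. All four also contain 50 mM sodium
# dihydrogen phosphate and 300 mM sodium chloride at pH 8.0, as listed in Example 3.
IMIDAZOLE_MM = {"A": 10, "B": 20, "D": 50, "C": 250}
ROLE = {"A": "lysis", "B": "wash", "D": "mild elution", "C": "elution"}


def buffer_sequence(product: str) -> list:
    """Order in which the buffers are applied for a given product."""
    if product == "cat-XSUMO-XGFP":
        return ["A", "B", "D", "C"]  # D elutes the heterocatenane, C the by-products
    return ["A", "B", "C"]


def is_increasing(sequence: list) -> bool:
    """Check that imidazole concentration rises at every step."""
    levels = [IMIDAZOLE_MM[name] for name in sequence]
    return all(low < high for low, high in zip(levels, levels[1:]))


def wash_volume_ml(resin_ml: float) -> tuple:
    """Lower and upper wash buffer B volume for 5-10 resin volumes."""
    return 5 * resin_ml, 10 * resin_ml


for product in ("cat-BXA-X", "cat-XSUMO-XGFP"):
    steps = buffer_sequence(product)
    print(product, [(name, ROLE[name], IMIDAZOLE_MM[name]) for name in steps])

print(wash_volume_ml(2.0))  # (10.0, 20.0) for an assumed 2 mL resin bed
```

Note that for cat-XSUMO-XGFP the desired heterocatenane comes off in the milder buffer D, while buffer C is used afterwards only to strip the GFP-containing by-products.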
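Example 4: Characterization of the protein heterocatenanes
For the protein heterocatenanes purified in Example 3, 5×SDS loading buffer was first added and the samples were heated at 98°C for 10 min before SDS-PAGE characterization. The SEC-purified protein samples were exchanged into ddH2O with an ultrafiltration tube, and their molecular weights were characterized by LC-MS. Protein concentration was measured with an ultra-micro spectrophotometer (NanoPhotometer P330, Implen, Inc.). To prove the topology of the heterocatenanes, the protein solution (10 μM) was mixed with 10 μM TEV protease at a molar ratio of 20:1 and digested at 37°C (1, 3 and 6 hours; digestion is essentially complete after 3 hours). After digestion, 10 μL was taken, mixed with 5×SDS loading buffer and heated at 98°C for 10 min to stop the reaction, and the composition of the digestion products was characterized by SDS-PAGE. The remaining digestion mixture was exchanged into ddH2O with an ultrafiltration tube, and the molecular weights were confirmed by LC-MS. The SEC profiles after nickel affinity purification, the SDS-PAGE results before and after digestion, and the LC-MS results for cat-BXA-X, cat-(AffiHER2-BXA-AffiHER2)-X, cat-XSUMO-X and cat-XSUMO-XGFP are shown in Figures 3, 4, 5 and 6, respectively. The LC-MS characterization of the TEV digestion products of cat-BXA-X, cat-(AffiHER2-BXA-AffiHER2)-X and cat-XSUMO-X is shown in Figure 7.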
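Because the protein and the TEV protease are both at 10 μM, the 20:1 molar ratio in the digestion translates directly into a 20:1 volume ratio. The sketch below makes that calculation explicit for arbitrary concentrations; the function name and the 100 μL example volume are illustrative assumptions, not values from the source.

```python
# Digestion time points (hours) sampled in Example 4.
TIME_POINTS_H = (1, 3, 6)


def tev_volume_ul(protein_ul: float, protein_um: float = 10.0,
                  tev_um: float = 10.0, molar_ratio: float = 20.0) -> float:
    """Volume of TEV stock giving a protein:TEV molar ratio of molar_ratio:1."""
    protein_pmol = protein_ul * protein_um   # µL × µM = pmol
    tev_pmol = protein_pmol / molar_ratio
    return tev_pmol / tev_um                 # pmol / µM = µL


print(tev_volume_ul(100.0))               # 5.0 µL of 10 µM TEV for 100 µL of 10 µM protein
print(tev_volume_ul(100.0, tev_um=50.0))  # 1.0 µL if the TEV stock were 50 µM
```

Samples taken at each of the listed time points can then be quenched with loading buffer and compared on the same gel, as described in Example 4.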

Claims (10)

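  1. A method for the biosynthesis of a protein heterocatenane, comprising the following steps:
    1) designing the protein precursor sequence of the protein heterocatenane, the basic structure of which comprises, from the N-terminus to the C-terminus: L1-1-X-L1-2-(in situ cleavage site)-L2-1-X-L2-2, wherein X represents an entangling motif that forms a dimer, which may be homodimeric or heterodimeric, that is, the two X motifs may be the same or different; L1-1/L1-2 and L2-1/L2-2 represent two pairs of cyclization motifs that undergo orthogonal coupling reactions inside the cell, and these two pairs may be two orthogonal peptide-protein reactive pairs, a combination of a peptide-protein reactive pair and a split intein, or two orthogonal split inteins; when L1-1/L1-2 is a peptide-protein reactive pair, the in situ cleavage site inserted between L1-2 and L2-1 is an essential element and is cleaved in situ by a protease co-expressed in the cell, otherwise the in situ cleavage site is a non-essential element; target protein sequences are inserted into the above basic structure at sites selected from: before and/or after the X domain, and the N-terminus and/or C-terminus of a peptide-protein reactive pair;
    2) constructing the coding gene sequence corresponding to the protein precursor sequence of step 1) and introducing it into an expression vector;
    3) transforming the expression vector constructed in step 2) into cells for expression and, where necessary, co-expressing in the cells a protease that cleaves the in situ cleavage site;
    4) purifying the fusion protein obtained in step 3) to give the corresponding protein heterocatenane.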
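  2. The method according to claim 1, characterized in that the entangling motif in step 1) is the p53dim domain or a p53dim mutant capable of forming a dimeric structure, wherein the amino acid sequence of the p53dim domain is shown as SEQ ID NO: 3 in the sequence listing.
  3. The method according to claim 1, characterized in that the peptide-protein reactive pair in step 1) is selected from the SpyTag-SpyCatcher reactive pair and the SnoopTag-SnoopCatcher reactive pair.
  4. The method according to claim 3, characterized in that the amino acid sequences of SpyTag and SpyCatcher in the SpyTag-SpyCatcher reactive pair are shown as SEQ ID NO: 1 and SEQ ID NO: 2 in the sequence listing, respectively.
  5. The method according to claim 1, characterized in that the split intein in step 1) is the NpuDnaE split intein, with the cyclization motifs formed by IntC1 and IntN1 or by IntC2 and IntN2, and the amino acid sequences of IntC1, IntN1, IntC2 and IntN2 are shown as SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6 and SEQ ID NO: 7 in the sequence listing, respectively.
  6. The method according to claim 1, characterized in that the in situ cleavage site designed in step 1) is the recognition sequence ETVRFQG of TVMV protease or the recognition sequence ENLYFQG of TEV protease; correspondingly, TVMV protease or TEV protease is co-expressed in step 3).
  7. The method according to claim 1, characterized in that in step 1) a histidine tag sequence is introduced before the second entangling motif X, and in step 4) the protein is purified by nickel affinity chromatography.
  8. The method according to claim 1, characterized in that the basic structure of the protein precursor sequence designed in step 1) is SpyCatcher-p53dim-SpyTag-IntC1-p53dim-IntN1, consisting, from the N-terminus to the C-terminus, of the cyclization reactive motif SpyCatcher, the entangling motif p53dim domain, the cyclization reactive motif SpyTag, the split intein C-terminal part IntC1, the entangling motif p53dim domain and the split intein N-terminal part IntN1; the recognition sequence of TVMV protease is inserted between SpyTag and IntC1, and a histidine tag sequence is introduced before the second p53dim domain; the fusion sites of one or more identical or different target proteins are selected from: before and/or after a p53dim domain, the N-terminus of SpyCatcher, and the C-terminus of SpyTag.
  9. The method according to claim 1, characterized in that the basic structure of the protein precursor sequence designed in step 1) is IntC1-p53dim-IntN1-IntC2-p53dim-IntN2, consisting, from the N-terminus to the C-terminus, of the split intein C-terminal part IntC1, the entangling motif p53dim domain, the split intein N-terminal part IntN1, the split intein C-terminal part IntC2, the entangling motif p53dim domain and the split intein N-terminal part IntN2; a histidine tag sequence is introduced before the second p53dim domain; one or more identical or different target proteins are inserted before and/or after the two p53dim domains.
  10. The method according to claim 1, characterized in that in step 4), protein heterocatenanes carrying a histidine tag sequence are purified from the expressed protein by nickel affinity chromatography, combined with gradient elution or size-exclusion chromatography to further raise the purity of the protein heterocatenane.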
PCT/CN2021/094589 2020-05-21 2021-05-19 Method for biosynthesis of protein heterocatenane WO2021233330A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/999,377 US20230348546A1 (en) 2020-05-21 2021-05-19 Method for biosynthesis of protein heterocatenane

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010436910.XA CN111560391B (zh) 2020-05-21 2020-05-21 Method for biosynthesis of protein heterocatenane
CN202010436910.X 2020-05-21

Publications (1)

Publication Number Publication Date
WO2021233330A1 (zh) 2021-11-25

Family

ID=72072251

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/094589 WO2021233330A1 (zh) 2020-05-21 2021-05-19 Method for biosynthesis of protein heterocatenane

Country Status (3)

Country Link
US (1) US20230348546A1 (zh)
CN (1) CN111560391B (zh)
WO (1) WO2021233330A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
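CN111560391B (zh) * 2020-05-21 2022-02-11 北京大学 Method for biosynthesis of protein heterocatenane
CN113403291A (zh) * 2021-06-22 2021-09-17 华侨大学 Aldehyde-alcohol oxidase dimer and preparation method thereof
WO2023229029A1 (ja) * 2022-05-26 2023-11-30 国立大学法人山形大学 Method for producing heterodimeric protein, dimeric protein, monomeric protein, and method for screening for target-reactive heterodimeric protein
CN117003852A (zh) * 2022-07-07 2023-11-07 北京大学 Topological engineering of interleukin-2 and its use as a drug for autoimmune diseases
CN118155706A (zh) * 2022-12-07 2024-06-07 北京大学 Programmed design method for topological proteins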

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
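CN105061581A (zh) * 2015-09-17 2015-11-18 北京大学 Method for preparing genetically encodable all-protein catenanes
CN110272913A (zh) * 2019-06-12 2019-09-24 北京大学 Protein coupling method based on catenation
CN111073925A (zh) * 2018-10-19 2020-04-28 北京大学 Efficient peptide-peptide coupling system and method based on a disordered protein ligase
CN111560391A (zh) * 2020-05-21 2020-08-21 北京大学 Method for biosynthesis of protein heterocatenane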

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107708720A (zh) * 2015-04-06 2018-02-16 苏伯多曼有限责任公司 Polypeptides containing de novo binding domains and uses thereof
US20200048716A1 (en) * 2017-11-03 2020-02-13 Twister Biotech, Inc Using minivectors to treat ovarian cancer

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DA XIAO‐DI, ZHANG WEN‐BIN: "Active Template Synthesis of Protein Heterocatenanes", ANGEWANDTE CHEMIE INTERNATIONAL EDITION, vol. 58, no. 32, 5 August 2019 (2019-08-05), pages 11097 - 11104, XP055869957, ISSN: 1521-3773, DOI: 10.1002/anie.201904943 *
LIU YAJIE, DUAN ZELIN, FANG JING, ZHANG FAN, XIAO JUNYU, ZHANG WEN‐BIN: "Cellular Synthesis and X-ray Crystal Structure of a Designed Protein Heterocatenane", ANGEWANDTE CHEMIE INTERNATIONAL EDITION, vol. 59, no. 37, 7 September 2020 (2020-09-07), pages 16122 - 16127, XP055869950, ISSN: 1433-7851, DOI: 10.1002/anie.202005490 *
WANG XIAO-WEI, ZHANG WEN-BIN: "Cellular Synthesis of Protein Catenanes", ANGEWANDTE CHEMIE INTERNATIONAL EDITION, vol. 55, no. 10, 1 March 2016 (2016-03-01), pages 3442 - 3446, XP055869963, ISSN: 1521-3773, DOI: 10.1002/anie.201511640 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
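CN114075298A (zh) * 2022-01-07 2022-02-22 广州中科蓝华生物科技有限公司 Catenated VAR2CSA recombinant protein, preparation method and use thereof
WO2024106955A1 (ko) * 2022-11-15 2024-05-23 한양대학교 산학협력단 Method for preparing cyclic protein and application thereof
CN116621947A (zh) * 2023-07-18 2023-08-22 北京智源人工智能研究院 Topological protein based on a catenane scaffold, preparation method and use
CN116621947B (zh) * 2023-07-18 2023-11-07 北京智源人工智能研究院 Topological protein based on a catenane scaffold, preparation method and use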

Also Published As

Publication number Publication date
CN111560391B (zh) 2022-02-11
CN111560391A (zh) 2020-08-21
US20230348546A1 (en) 2023-11-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21809360

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21809360

Country of ref document: EP

Kind code of ref document: A1