WO2020259403A1 - Method for preparing a target polypeptide using a recombinant tandem fusion protein

Publication number
WO2020259403A1
Authority
WO
WIPO (PCT)
Prior art keywords
protease
sequence
recognition site
fusion protein
target protein
Prior art date
Application number
PCT/CN2020/097058
Other languages
English (en)
French (fr)
Inventor
陈清
曾鑫
彭永亮
覃晓兰
杨辉
范开
Original Assignee
重庆派金生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 重庆派金生物科技有限公司
Priority to EP20831844.4A (EP3992212A4)
Publication of WO2020259403A1
Priority to US17/558,767 (US20220195004A1)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70 Vectors or expression systems specially adapted for E. coli
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/575 Hormones
    • C07K14/605 Glucagons
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00 Preparation of peptides or proteins
    • C12P21/06 Preparation of peptides or proteins produced by the hydrolysis of a peptide bond, e.g. hydrolysate products
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/21 Serine endopeptidases (3.4.21)
    • C12Y304/21061 Kexin (3.4.21.61), i.e. proprotein convertase subtilisin/kexin type 9
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/22 Cysteine endopeptidases (3.4.22)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/20 Fusion polypeptide containing a tag with affinity for a non-protein ligand
    • C07K2319/21 Fusion polypeptide containing a tag with affinity for a non-protein ligand containing a His-tag
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/50 Fusion polypeptide containing protease site
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00 Microorganisms; Processes using microorganisms
    • C12R2001/01 Bacteria or Actinomycetales; using bacteria or Actinomycetales
    • C12R2001/185 Escherichia
    • C12R2001/19 Escherichia coli

Definitions

  • The present invention relates to the field of biomedicine. Specifically, it relates to fusion proteins, methods and systems for preparing fusion proteins, nucleic acids, constructs, and recombinant cells.
  • Polypeptides generally refer to active compounds composed of fewer than 100 amino acids.
  • Polypeptide drugs are polypeptides or modified polypeptides used for disease prevention, diagnosis and treatment, and they are widely used in many disease fields. The FDA has approved about 70 peptide drugs. Polypeptide drugs have significant effects on diabetes, osteoporosis, intestinal diseases, thrombocytopenia, tumors, cardiovascular diseases, viral infections, immune diseases, etc.
  • Preproglucagon is a 158-amino-acid precursor polypeptide that is differentially processed in tissues to form a variety of structurally related peptides, including glucagon, glucagon-like peptide-1 (GLP-1) and glucagon-like peptide-2 (GLP-2). These molecules are involved in a variety of physiological functions, including glucose homeostasis, insulin secretion, gastric emptying, intestinal growth, and regulation of food intake.
  • Glucagon is mainly used to treat severe hypoglycemia in diabetic patients receiving insulin treatment; the marketed drug is GlucaGen.
  • Glucagon-like peptide-1 (GLP-1) is mainly used for type 2 diabetes.
  • GLP-1 receptor agonist drugs include Exenatide, Exenatide QW, Liraglutide, Albiglutide, Dulaglutide, Lixisenatide and Semaglutide.
  • Glucagon-like peptide-2 (GLP-2) is mainly used for short bowel syndrome, and Teduglutide is a marketed drug.
  • Human glucagon-like peptide-1 (GLP-1) is a peptide hormone secreted by the intestinal mucosa that promotes insulin secretion. It regulates blood glucose metabolism by increasing insulin secretion and inhibiting glucagon release. It can also reduce intestinal peristalsis, induce satiety and suppress appetite. GLP-1 can promote the proliferation of pancreatic β cells and inhibit their apoptosis, increasing the number and function of pancreatic β cells. Most importantly, its hypoglycemic effect only occurs at higher blood glucose concentrations, thus avoiding hypoglycemia caused by excessive insulin secretion.
  • GLP-1 can also improve the sensitivity of receptor cells to insulin, which helps in treating insulin resistance. Long-term treatment can significantly improve medium- and long-term indicators such as glycosylated hemoglobin. For type 2 diabetes caused by obesity, it inhibits gastric emptying, which helps patients control their diet and lose weight.
  • GLP-1 drugs such as liraglutide and semaglutide have cardiovascular benefits. Insulin therapy usually has the disadvantages of weight gain and a risk of hypoglycemia, and GLP-1 receptor agonist drugs address exactly these clinical needs.
  • The mechanism of GLP-1 drugs, represented by liraglutide, in the treatment of diabetes includes: stimulating insulin secretion in a physiological, glucose-dependent manner; reducing glucagon secretion; inhibiting gastric emptying; reducing appetite; and promoting the growth and recovery of pancreatic β-cells.
  • When the blood glucose concentration exceeds the normal level, GLP-1 stimulates insulin secretion through the above mechanism to lower blood glucose. GLP-1 is therefore a highly effective, glucose-dependent hypoglycemic drug. In view of these characteristics and many years of clinical experience with GLP-1 drugs, GLP-1 is a suitable candidate for the treatment of type 2 diabetes, and subjects with type 1 diabetes obtain a better curative effect when GLP-1 is combined with insulin. The hypoglycemic effect of GLP-1 is retained even in patients in whom sulfonylurea therapy has failed, and it does not cause severe hypoglycemia.
  • GLP-1 can also increase the rate of insulin biosynthesis and restore the rapid response of rat pancreatic β-cells to increased blood glucose (i.e., primed insulin release). It has been reported in the literature that GLP-1 can stimulate the growth and proliferation of pancreatic β-cells and promote the conversion of ductal cells into new pancreatic β-cells. A number of human trials have shown that GLP-1 is also involved in the preservation and repair of pancreatic β-cell populations.
  • The main points of competition among GLP-1 drugs include dosing frequency, hypoglycemic effect, weight-reduction effect and immunogenicity.
  • The main disadvantages of Exenatide are its short dosing interval and strong immunogenicity.
  • Albiglutide is markedly weak at lowering blood glucose and reducing weight.
  • Albiglutide was the first once-weekly long-acting GLP-1 drug, but its performance is far inferior to that of the later-marketed Dulaglutide.
  • The cardiovascular risk of GLP-1 drugs has also attracted much attention.
  • Insulin degludec, which has been marketed in Japan, the European Union and the United States, was delayed by the US FDA over cardiovascular risk concerns. Liraglutide and Semaglutide have been confirmed to have cardiovascular benefits in the past two years, greatly improving the overall market competitiveness of GLP-1 receptor agonist drugs.
  • Glucagon-like peptides and their analogs are mostly prepared by extraction from natural sources, chemical synthesis, or genetic engineering.
  • Peptide drugs are mainly produced by chemical synthesis, but the cost of solid-phase synthesis is relatively high.
  • The large amounts of organic solvent used may affect the activity of the peptides, and the analysis of related substances in peptides made by solid-phase synthesis is difficult.
  • Related substances such as epimers and chiral isomers are strictly controlled.
  • The published patent CN201210369966 prepares liraglutide by a fully chemical synthesis method.
  • Liraglutide and Semaglutide are recombinantly expressed in a yeast system.
  • When a yeast system is used to recombinantly express foreign proteins, the multiple protease families contained in yeast may degrade the foreign proteins. Small peptides with simple structures are especially easily degraded, the degradation products increase as fermentation time is extended, and they are difficult to separate effectively by purification.
  • Degradation during fermentation is caused by digestion of the polypeptide by proteases contained in the yeast. Changing the expression host strain and the fermentation conditions can partially reduce the degree of degradation, but not enough to meet the requirements of industrialization.
  • The E. coli expression system is also commonly used to express recombinant foreign proteins.
  • Polypeptide drugs have a simple structure, with no complicated higher-order structure and no glycosylation sites.
  • Escherichia coli contains fewer proteases, and an active, intact polypeptide can be obtained by recombinant expression in an Escherichia coli system.
  • A conventional E. coli fusion expression system can yield the target polypeptide by enzymatic cleavage, but the yield and recovery of the polypeptide after cleavage are significantly reduced, which severely restricts the industrialization of peptide drugs.
  • CN201610753093.4 uses a chaperone protein-enterokinase fusion to express Arg34-GLP-1(7-37). Although the expression level of the fusion protein is relatively high, the Arg34-GLP-1(7-37) obtained by enzymatic cleavage accounts for only one tenth of the total fusion protein, so the yield of the target protein is low.
  • chaperone proteins (TrxA, DsbA) are suitable for the fusion expression of macromolecular proteins that require renaturation.
  • Arg34-GLP-1(7-37) has a simple spatial structure and does not require conformational renaturation.
  • CN201610857663.4 uses a SUMO-GLP-1(7-37) fusion protein to recombinantly express GLP-1(7-37).
  • The peptide may be broken down into fragmented peptides.
  • Prolonged exposure to the acid lysis solution may introduce deamidation-related substances of the peptide, which seriously affects product quality and hinders subsequent purification.
  • Translation of all proteins starts from an N-terminal methionine, so the first position of the expression product is the non-target amino acid methionine. Only when the radius of gyration of the first amino acid of the target protein is 1.22 angstroms or less (such as Gly and Ala) can the N-terminal methionine be effectively removed by methionine aminopeptidase.
  • The methionine aminopeptidase is saturated and lacks cofactors, so the methionine is usually not removed. As a result, the N-terminus may be heterogeneous (with or without Met), and the amino acid sequence of the expressed protein is inconsistent with that of the target protein (the first position contains Met), which may cause immunotoxicity.
  • the present invention aims to solve one of the technical problems in the related art at least to a certain extent.
  • the present invention proposes a fusion protein.
  • The fusion protein includes multiple target protein sequences connected in series, and two adjacent target protein sequences are connected by a linking sequence. The linking sequence is suitable for being cleaved by a protease to form multiple free target proteins, the multiple target protein sequences are not cleaved by the protease, and neither the C-terminus nor the N-terminus of the free target protein contains additional residues.
  • the "said multiple target protein sequences are not cleaved by the protease” in this application means that the target protein sequence cannot be cleaved internally by the protease, that is, the protease cannot cleave the internal peptide bond of the target protein sequence.
  • the "extra residues” mentioned in this application refer to amino acid residues other than the target protein sequence.
  • The fusion protein according to the embodiment of the present invention can form multiple free target proteins under the action of proteases, and neither the C-terminus nor the N-terminus of the target protein contains additional residues. The quality of the target protein is significantly improved, purification of subsequent products is greatly facilitated, the safety of the target protein as a pharmaceutical polypeptide is significantly improved, and the immunotoxicity is significantly reduced.
  • the aforementioned fusion protein may further include at least one of the following additional technical features:
  • At least a part of the linking sequence constitutes a part of the C-terminus of the target protein sequence.
  • the connecting sequence is composed of at least one protease recognition site.
  • the linking sequence constitutes the C-terminus of the target protein sequence.
  • The C-terminus of the target protein sequence is the consecutive residues KR, and the protease is Kex2.
  • The consecutive KR at the C-terminus of each target protein sequence in the fusion protein is recognized by the protease Kex2, which cleaves the peptide bond after the R to form multiple free target proteins.
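The single-protease embodiment above can be illustrated with a minimal Python sketch. It assumes a toy target sequence (`TARGET` is hypothetical and is not one of the SEQ ID NO sequences) that ends in KR and has no internal KR, and it models Kex2 simply as cutting after every KR. Cleavage kinetics and enzyme specificity are not represented.

```python
def cleave_after_site(sequence: str, site: str = "KR") -> list:
    """Cut `sequence` after every occurrence of `site` (cleavage after the R)."""
    fragments = []
    start = 0
    pos = sequence.find(site, start)
    while pos != -1:
        end = pos + len(site)
        fragments.append(sequence[start:end])
        start = end
        pos = sequence.find(site, start)
    if start < len(sequence):
        # Keep any trailing fragment that does not end in the site.
        fragments.append(sequence[start:])
    return fragments


# Hypothetical target whose own C-terminus is KR (assumption for illustration).
TARGET = "AGSTFVEQLKR"
fusion = TARGET * 4  # four tandem copies, no extra linker residues
free_targets = cleave_after_site(fusion)
```

Because the KR belongs to the target sequence itself, every fragment in `free_targets` is an exact copy of `TARGET` with no extra residues at either end.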
  • The linking sequence contains a first protease recognition site and a second protease recognition site, and the multiple target protein sequences do not contain the second protease recognition site. The first protease recognition site is suitable for being recognized and cleaved by a first protease to form a first protease cleavage product whose N-terminus does not carry residues of the linking sequence. The second protease recognition site is suitable for being recognized and cleaved by a second protease, which is suitable for cleaving the C-terminus of the first protease cleavage product to form multiple free target proteins. Neither the C-terminus nor the N-terminus of the free target protein contains residues of the linking sequence.
  • The efficiency with which the first protease recognizes a first protease recognition site inside the target protein sequence (the internal first protease recognition site) is lower than the efficiency with which it recognizes the first protease recognition site in the linking sequence. This difference is used to achieve cleavage at the first protease recognition site of the linking sequence while the internal peptide bonds of the target protein sequence are not cleaved by the first protease.
  • The first protease is Kex2, the internal first protease recognition site is at least one of KK and RK, and the first protease recognition site in the linking sequence is KR, RR or RKR.
  • The inventors found that the protease Kex2 can recognize KR, RR, KK and RK, but that its ability to cleave KR or RR is significantly greater than its ability to cleave KK or RK. Therefore, by adjusting the amount of Kex2, KR, RR or RKR in the linking sequence can be recognized and cleaved while the peptide bond after the K in KK or RK is left uncleaved. For example, this selective cleavage can be achieved when the mass ratio of the fusion protein to Kex2 is 2000:1.
  • The inventors discovered that consecutive acidic amino acids adjacent to a first protease recognition site in the target protein sequence can mask that site, so that the first protease cannot recognize and cleave it.
  • the length of the continuous acidic amino acid sequence is 1 to 2 amino acids.
  • the acidic amino acid is aspartic acid or glutamic acid, and preferably, the acidic amino acid is aspartic acid.
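The masking effect described above can be sketched as a simple sequence scan. This is only an illustration under stated assumptions: the acidic residue (D or E) is taken to sit immediately before the dibasic site, as in the DKR and DRR examples given elsewhere in the text, and the function name and toy sequences are hypothetical.

```python
import re

# KR or RR that is NOT immediately preceded by an acidic residue (D or E).
EXPOSED_SITE = re.compile(r"(?<![DE])(?:KR|RR)")


def exposed_kex2_sites(sequence: str) -> list:
    """Return the cut positions (index just after the R) of unmasked KR/RR sites."""
    return [match.end() for match in EXPOSED_SITE.finditer(sequence)]


# Toy sequence: the internal DKR is treated as masked, the C-terminal KR is not.
cut_positions = exposed_kex2_sites("AGDKRTFVKR")
```

A target sequence whose only internal KR or RR is preceded by D or E would, under this simplified model, be left intact while the linking sequence is still cut.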
  • the recognition sites of the first protease and the second protease are the same or different.
  • The first protease recognition site and the second protease recognition site meet one of the following conditions: (1) the amino acid sequence of the target protein sequence has no consecutive KR or RR and may or may not have consecutive KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or (2) the amino acid sequence of the target protein sequence has no K but has R, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or (3) the amino acid sequence of the target protein sequence has neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or trypsin, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB.
  • In this way, specific cleavage of the fusion protein at the first protease recognition site is achieved, and the N-terminus of the resulting first protease cleavage product does not contain residues of the linking sequence.
  • The linking sequence residues at the C-terminus of the first protease cleavage product are then removed sequentially.
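The two-step scheme (first protease cuts after the linking sequence, second protease trims the basic residues left on the C-terminus) can be sketched as follows. The sequences `AUX` and `TARGET` are hypothetical toy sequences, the linker is assumed to be KR, and the enzymes are modelled as idealized string operations rather than real kinetics.

```python
import re


def first_protease_cleave(fusion: str, site_pattern: str = r"RKR|KR|RR") -> list:
    """Cut after every linker site (idealized Kex2-like step)."""
    pieces = []
    start = 0
    for match in re.finditer(site_pattern, fusion):
        pieces.append(fusion[start:match.end()])
        start = match.end()
    if start < len(fusion):
        pieces.append(fusion[start:])
    return pieces


def second_protease_trim(fragment: str, basic: str = "KR") -> str:
    """Remove C-terminal K/R residues one after another (idealized CPB-like step).

    Note: this would also trim a target whose native C-terminus is K or R.
    """
    return fragment.rstrip(basic)


AUX = "MHHHHHHEEAEAEA"   # hypothetical auxiliary peptide segment
TARGET = "AGSTFVEQLG"    # hypothetical target without K or R
fusion = AUX + "KR" + (TARGET + "KR") * 3

intermediates = first_protease_cleave(fusion)
free = [second_protease_trim(piece) for piece in intermediates]
```

After the first step each piece still carries KR at its C-terminus but has a clean N-terminus; after the second step `free[1:]` contains three copies of `TARGET` with no linker residues at either end.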
  • the fusion protein includes a plurality of linking sequences, and the plurality of linking sequences are the same or different.
  • the length of the connecting sequence is 1-10 amino acids.
  • The linking sequence may include 1 to 5 of the first protease recognition site and the second protease recognition site, which ensures that protease cleavage proceeds effectively.
  • The fusion protein further includes an auxiliary peptide segment, the carboxyl terminus of which is connected through the linking sequence to the N-terminus of the multiple target protein sequences in series.
  • The auxiliary peptide segment can be cleaved from the fusion protein under the action of the protease, and the N-terminus of the target protein sequence after cleavage does not contain residues of the linking sequence.
  • The auxiliary peptide segment includes a tag sequence and an optional expression promoting sequence.
  • The tag sequence facilitates subsequent identification or purification of the fusion protein, and the expression promoting sequence greatly improves the expression efficiency of the fusion protein.
  • the amino acid sequence of the tag sequence is a repeated His sequence.
  • the amino acid sequence of the expression promoting sequence is EEAEAEA, EEAEAEAGG or EEAEAEARG.
  • the first amino acid of the auxiliary peptide segment is methionine.
  • The methionine is removed together with the auxiliary peptide segment during the subsequent enzymatic digestion, which avoids the prior-art problem that the methionine in the polypeptide is not easily removed, leaving a heterogeneous N-terminus and a risk of immunotoxicity.
  • the length of the target protein sequence is 10-100 amino acids.
  • the fusion protein includes 4 to 16 target protein sequences in series.
  • The inventors found that when the fusion protein includes 4 to 16 target protein sequences in series, the plasmid loss rate within 80 generations is not higher than 10% and the target protein expression level is not affected, which enables industrial-scale fermentation with high protein density and high expression.
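The layout described so far (auxiliary peptide, linking sequence, then 4 to 16 tandem target copies joined by linking sequences) can be assembled with a short helper. This is a sketch under assumptions: the function name, the six-residue His tag, the KR linker default and the toy target are all hypothetical; only the numeric ranges and the EEAEAEA sequence come from the text.

```python
def build_fusion(target: str, linker: str = "KR", copies: int = 4,
                 his_repeats: int = 6, expression_seq: str = "EEAEAEA") -> str:
    """Assemble Met + His tag + expression promoting sequence + linker + tandem targets."""
    if not 4 <= copies <= 16:
        raise ValueError("copies must be between 4 and 16")
    if not 10 <= len(target) <= 100:
        raise ValueError("target length must be 10 to 100 amino acids")
    if not 1 <= len(linker) <= 10:
        raise ValueError("linker length must be 1 to 10 amino acids")
    auxiliary = "M" + "H" * his_repeats + expression_seq
    # Adjacent target copies are joined by the linker; the auxiliary
    # peptide is attached to the first copy through the same linker.
    return auxiliary + linker + linker.join([target] * copies)


fusion = build_fusion("AGSTFVEQLG", copies=4)  # hypothetical 10-residue target
```

The range checks mirror the stated limits, so a construct outside them is rejected before any sequence is built.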
  • amino acid sequence of the target protein sequence is shown in SEQ ID NO: 1 to 6.
  • the present invention proposes a method for obtaining free target protein.
  • The method includes: providing the aforementioned fusion protein; and contacting the fusion protein with a protease, where the protease is determined based on the linking sequence and the multiple target protein sequences are not cleaved by the protease, to obtain multiple free target proteins whose C-terminus and N-terminus do not contain additional residues.
  • The free target protein obtained according to the method of the embodiment of the present invention does not contain additional residues at the C-terminus or N-terminus. The quality of the target protein is significantly improved, purification of subsequent products is greatly facilitated, the safety of the target protein as a pharmaceutical polypeptide is significantly improved, and the immunotoxicity is significantly reduced.
  • the above method may further include at least one of the following additional technical features:
  • The linking sequence constitutes the C-terminus of the target protein sequence, the C-terminus of the target protein sequence is the consecutive residues KR, and the protease is Kex2.
  • The consecutive KR at the C-terminus of each target protein sequence in the fusion protein is recognized by the protease Kex2, which cleaves the peptide bond after the R to form multiple free target proteins.
  • The linking sequence contains a first protease recognition site and a second protease recognition site, and the multiple target protein sequences do not contain the second protease recognition site. Contacting the fusion protein with the protease then further includes: contacting the fusion protein with the first protease to obtain a first protease cleavage product whose N-terminus does not carry residues of the linking sequence; and contacting the first protease cleavage product with the second protease, the second protease being suitable for cleaving the C-terminus of the first protease cleavage product, so as to obtain multiple free target proteins.
  • The efficiency with which the first protease recognizes a first protease recognition site inside the target protein sequence (the internal first protease recognition site) is lower than the efficiency with which it recognizes the first protease recognition site in the linking sequence. This difference is used to achieve cleavage at the first protease recognition site of the linking sequence while the internal peptide bonds of the target protein sequence are not cleaved by the first protease.
  • The internal first protease recognition site is at least one of KK and RK, the first protease recognition site in the linking sequence is KR, RR or RKR, the second protease recognition site is a carboxy-terminal R or K, the first protease is Kex2, the second protease is CPB, and the mass ratio of the fusion protein to the first protease is 2000:1.
  • The inventors found that the protease Kex2 can recognize KR, RR, KK and RK, but that its ability to cleave KR or RR is significantly greater than its ability to cleave KK or RK. By adjusting the amount of Kex2, the inventors can achieve recognition and cleavage of KR, RR or RKR in the linking sequence without cleaving the peptide bond after the K in KK or RK.
  • the length of the continuous acidic amino acid sequence is 1 to 2 amino acids.
  • the acidic amino acid is aspartic acid or glutamic acid, and preferably, the acidic amino acid is aspartic acid.
  • The first protease recognition site is KR, RR or RKR, the second protease recognition site is a carboxy-terminal R or K, the first protease is Kex2, and the second protease is CPB.
  • The first protease Kex2 can only recognize and cleave the first protease recognition site in the linking sequence; it cannot recognize and cleave consecutive DKR, DRR, DKK or DRK in the target protein sequence.
  • Under the action of the second protease, the linking sequence residues at the C-terminus of the first protease cleavage product are removed sequentially.
  • the plurality of target protein sequences do not contain the first protease recognition site and the second protease recognition site.
  • The target protein sequence, the first protease and the second protease meet one of the following conditions: (1) the amino acid sequence of the target protein sequence has no consecutive KR, RR, KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or (2) the amino acid sequence of the target protein sequence has no K but has R, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or (3) the amino acid sequence of the target protein sequence has neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or trypsin, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB.
  • In this way, specific cleavage of the fusion protein at the first protease recognition site of the linking sequence is achieved, and the N-terminus of the resulting first protease cleavage product does not contain residues of the linking sequence.
  • The linking sequence residues at the C-terminus of the first protease cleavage product are then removed sequentially.
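The three alternative conditions above amount to a lookup on the amino acid composition of the target sequence, which can be sketched in Python. The function name, the order in which the conditions are checked, and the reading of the source's "Trp" as trypsin are assumptions made for this illustration.

```python
from typing import Optional, Tuple


def choose_proteases(target: str) -> Optional[Tuple[str, str]]:
    """Suggest a (first protease, second protease) pair from target composition."""
    has_k = "K" in target
    has_r = "R" in target
    if not has_k and not has_r:
        # Neither K nor R in the target: a K/R-specific first protease is usable.
        return ("Lys-C or trypsin", "CPB")
    if not has_k and has_r:
        # No K but some R: cut at K in the linker only.
        return ("Lys-C", "CPB")
    dibasic_pairs = ("KR", "RR", "KK", "RK")
    if not any(pair in target for pair in dibasic_pairs):
        # K present but no consecutive dibasic pair inside the target.
        return ("Kex2", "CPB")
    return None  # none of the listed conditions applies


scheme = choose_proteases("AGSTRVEQLG")  # hypothetical target with R but no K
```

A return value of `None` marks a target that falls outside these three conditions, for example one with an internal dibasic pair, where the masking or efficiency-based embodiments described earlier would have to be considered instead.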
  • the mass ratio of the fusion protein to the first protease is 250:1 to 2000:1.
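As a worked example of the stated mass ratio range, the amount of first protease needed for a given amount of fusion protein is a single division. The function name and the 1000 mg batch size are hypothetical; only the 250:1 to 2000:1 range comes from the text.

```python
def first_protease_mass_mg(fusion_mass_mg: float, ratio: float) -> float:
    """Mass of first protease for a fusion-protein : protease mass ratio of `ratio`:1."""
    if not 250 <= ratio <= 2000:
        raise ValueError("ratio outside the stated 250:1 to 2000:1 range")
    return fusion_mass_mg / ratio


# For a hypothetical 1000 mg of fusion protein:
low_enzyme = first_protease_mass_mg(1000, 2000)   # 0.5 mg at 2000:1
high_enzyme = first_protease_mass_mg(1000, 250)   # 4.0 mg at 250:1
```

Across the stated range the enzyme requirement therefore varies eightfold, from 0.5 mg to 4 mg per gram of fusion protein.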
  • The fusion protein is obtained by fermenting a microorganism that carries a nucleic acid encoding the fusion protein. This overcomes the high synthesis cost of chemically synthesized peptides and the effect of organic solvents on peptide activity.
  • the microorganism is Escherichia coli.
  • The inventors found that if a yeast system is used to recombinantly express foreign proteins, the multiple protease families contained in yeast may degrade the foreign proteins, and small peptides with simple structures are especially likely to be degraded.
  • The polypeptide to be obtained in this application has a simple structure, no complicated higher-order structure, and no glycosylation sites, so it is better suited to E. coli.
  • Escherichia coli contains fewer proteases, so an active, intact intermediate product can be obtained by recombinant expression in an Escherichia coli system. The fermentation cycle of Escherichia coli is also short, which greatly reduces the production cost.
  • The method further includes disrupting and dissolving the microbial fermentation product, the dissolution being performed in the presence of a detergent, to obtain the fusion protein.
  • The detergent is a surfactant, which can increase the solubility of the fusion protein and improve the efficiency of protease digestion.
  • The detergents include: (1) nonionic surfactants, such as PEG2000, Tween, sorbitol, urea, Triton X-100 and guanidine hydrochloride; (2) anionic surfactants, such as sodium lauryl sulfate, sodium lauryl sulfonate and stearic acid; (3) amphoteric surfactants, such as 3-sulfopropyl tetradecyl dimethyl betaine, dodecyl dimethyl betaine and lecithin; and (4) cationic surfactants, such as quaternary ammonium compounds.
  • Detergents can achieve high-efficiency dissolution of the fusion protein without destroying the activity of the target protein, and will not affect the subsequent activities of the first protease and the second protease.
  • the present invention provides a nucleic acid.
  • the nucleic acid encodes the aforementioned fusion protein.
  • the aforementioned nucleic acid may further include at least one of the following additional technical features:
  • the nucleic acid has a nucleotide sequence shown in any one of SEQ ID NO: 7-12.
  • the present invention proposes a construct.
  • the construct carries the aforementioned nucleic acid.
  • the expression of the aforementioned fusion protein can be realized under conditions suitable for protein expression.
  • the present invention proposes a recombinant cell.
  • the recombinant cell comprises the aforementioned nucleic acid or the aforementioned construct or expresses the aforementioned fusion protein.
  • the aforementioned recombinant cell may further include at least one of the following additional technical features:
  • the recombinant cell is an E. coli cell.
  • the present invention proposes a system for obtaining free target protein.
  • The system includes: a fusion protein preparation device, which is used to provide the aforementioned fusion protein; and a digestion device, which is connected to the fusion protein preparation device so that the fusion protein is brought into contact with a protease. The protease is determined based on the linking sequence, and the multiple target protein sequences are not cleaved by the protease, so that multiple free target proteins are obtained. Neither the C-terminus nor the N-terminus of the free target protein contains additional residues.
  • The system according to the embodiment of the present invention is suitable for performing the aforementioned method for obtaining free target protein. The free target protein obtained does not contain additional residues at the C-terminus or N-terminus, the quality of the target protein is significantly improved, purification of subsequent products is greatly facilitated, the safety of the target protein as a pharmaceutical polypeptide is significantly improved, and the immunotoxicity is significantly reduced.
  • the above system may further include at least one of the following additional technical features:
  • the digestion device is provided with a first protease digestion unit and a second protease digestion unit, and the first protease digestion unit is connected to the second protease digestion unit.
  • the fusion protein can be digested in the first protease digestion unit, and the first protease digestion product enters the second protease digestion unit for further digestion.
  • the protease can be added manually to the first protease digestion unit and the second protease digestion unit, or the first protease and the second protease can be immobilized in those units, so that digestion of the fusion protein can be industrialized and automated.
  • when the linking sequence constitutes the C-terminus of the target protein sequence and that C-terminus is a consecutive KR, the first protease digestion unit and the second protease digestion unit are both immobilized with protease Kex2. The free target protein can be obtained once the fusion protein has been digested in the first protease digestion unit. The first protease digestion product can then either enter the second protease digestion unit, where undigested or incompletely digested fusion protein is digested further, or be discharged directly as free target protein.
  • the linking sequence contains a first protease recognition site and a second protease recognition site, the plurality of target protein sequences does not contain the second protease recognition site, the first protease digestion unit is immobilized with a first protease, and the second protease digestion unit is immobilized with a second protease. The fusion protein contacts the first protease in the first protease digestion unit to obtain the first protease cleavage product, whose N-terminus does not carry residues of the linking sequence; the first protease cleavage product then contacts the second protease in the second protease digestion unit, and the second protease is suitable for cleaving the C-terminus of the first protease cleavage product to obtain a plurality of the target proteins.
  • the recognition sites and proteases satisfy one of the following conditions: the amino acid sequence of the target protein sequence has no consecutive KR or RR and may or may not have consecutive KK or RK, the first protease recognition site is KR or RR or RKR, the first protease is Kex2, the second protease recognition site is the carboxy-terminal R or K, and the second protease is CPB; or the amino acid sequence of the target protein sequence has no K but has R, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is the carboxy-terminal K, and the second protease is CPB; or the amino acid sequence of the target protein sequence has neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is the carboxy-terminal K or R, and the second protease is CPB; or the amino acid sequence of the target protein sequence has consecutive KR or RR or KK or RK adjacent to 1 or 2 consecutive acidic amino acids, the first protease recognition site is KR or RR or RKR, the first protease is Kex2, the second protease recognition site is the carboxy-terminal R or K, and the second protease is CPB.
  • the device for preparing a fusion protein includes a fermentation unit adapted to ferment a microorganism that carries a nucleic acid encoding the fusion protein; preferably, the microorganism is Escherichia coli.
  • the device for preparing a fusion protein further includes a dissolving unit connected to the fermentation unit for disrupting and dissolving the product of microbial fermentation; the dissolving treatment is carried out in the presence of a detergent to obtain the fusion protein.
  • the digestion device further includes an adjustment unit for adjusting the dosage of the protease so that the mass ratio of the fusion protein to the protease is 250:1 to 2000:1.
  • the regulating unit is used to regulate the dosage of the protease to realize the specific cleavage of the fusion protein at the restriction site of the linking sequence.
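The dosage window handled by the adjustment unit can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the function names and the choice of milligrams are illustrative and do not come from the specification, which gives only the fusion protein to protease mass ratio of 250:1 to 2000:1.

```python
def protease_mass_bounds(fusion_mass_mg, min_ratio=250.0, max_ratio=2000.0):
    """Return (lowest, highest) protease mass in mg for the given ratio window.

    The ratio is fusion protein mass divided by protease mass, so the largest
    ratio corresponds to the smallest amount of protease.
    """
    if fusion_mass_mg <= 0:
        raise ValueError("fusion protein mass must be positive")
    return fusion_mass_mg / max_ratio, fusion_mass_mg / min_ratio


def ratio_is_in_window(fusion_mass_mg, protease_mass_mg,
                       min_ratio=250.0, max_ratio=2000.0):
    """Check whether a planned dosage falls inside the 250:1 to 2000:1 window."""
    if protease_mass_mg <= 0:
        raise ValueError("protease mass must be positive")
    ratio = fusion_mass_mg / protease_mass_mg
    return min_ratio <= ratio <= max_ratio


if __name__ == "__main__":
    low, high = protease_mass_bounds(1000.0)
    print(f"1000 mg fusion protein: {low} mg to {high} mg protease")
```

For 1000 mg of fusion protein the window corresponds to 0.5 mg to 4 mg of protease.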
  • Figure 1 is a schematic structural diagram of a system for obtaining free target protein according to an embodiment of the present invention.
  • Figure 2 is a schematic structural diagram of a digestion device according to an embodiment of the present invention.
  • Figure 3 is a schematic structural diagram of a device for preparing a fusion protein according to an embodiment of the present invention.
  • Figure 4 is another schematic structural diagram of a device for preparing a fusion protein according to an embodiment of the present invention.
  • Figure 5 is another schematic structural diagram of a digestion device according to an embodiment of the present invention.
  • Figure 6 is a schematic diagram of the construction of the pET-30a-Arg34-GLP-1(7-37) recombinant plasmid according to an embodiment of the present invention.
  • Figure 7 is a restriction digestion identification diagram of pET-30a-Arg34-GLP-1(7-37) according to an embodiment of the present invention.
  • Figure 8 is a schematic diagram of the construction of the pET-30a-Arg34-GLP-1(9-37) recombinant plasmid according to an embodiment of the present invention.
  • Figure 9 is a restriction digestion identification diagram of pET-30a-Arg34-GLP-1(9-37) according to an embodiment of the present invention.
  • Figure 10 is a schematic diagram of the construction of the pET-30a-Arg34-GLP-1(11-37) recombinant plasmid according to an embodiment of the present invention.
  • Figure 11 is a restriction digestion identification diagram of pET-30a-Arg34-GLP-1(11-37) according to an embodiment of the present invention.
  • Figure 12 is a schematic diagram of the construction of the pET-30a-GLP-2 recombinant plasmid according to an embodiment of the present invention.
  • Figure 13 is a restriction digestion identification diagram of pET-30a-GLP-2 according to an embodiment of the present invention.
  • Figure 14 is a schematic diagram of the construction of the pET-30a-Glucagon recombinant plasmid according to an embodiment of the present invention.
  • Figure 15 is a restriction digestion identification diagram of pET-30a-Glucagon according to an embodiment of the present invention.
  • Figure 16 is a schematic diagram of the construction of the pET-30a-TB4 recombinant plasmid according to an embodiment of the present invention.
  • Figure 17 is a restriction digestion identification diagram of pET-30a-TB4 according to an embodiment of the present invention.
  • Figure 18 is an SDS-PAGE diagram of induced expression in the pET-30a-Arg34-GLP-1(9-37)/BL21(DE3) recombinant engineered bacteria according to an embodiment of the present invention.
  • Figure 19 is a mass spectrometry molecular weight map of Arg34-GLP-1(9-37) after digestion according to an embodiment of the present invention.
  • Figure 20 shows the in vitro cellular biological activity of Arg34-GLP-1(9-37) according to an embodiment of the present invention.
  • Figure 21 shows the in vitro cellular biological activity of GLP-2 according to an embodiment of the present invention.
  • Figure 22 compares the induced expression levels of fusion proteins with and without the expression-promoting short peptide EEAEAEARG according to an embodiment of the present invention.
  • Figure 23 compares the fusion protein content in the supernatant for fusion proteins with and without the expression-promoting short peptide EEAEAEARG according to an embodiment of the present invention.
  • Figure 24 compares the enzymatic cleavage efficiency of fusion proteins with and without the expression-promoting short peptide EEAEAEARG according to an embodiment of the present invention.
  • one aspect of the present invention provides a fusion protein and a novel method for end-to-end tandem recombinant expression of a polypeptide.
  • the process of the novel head-to-tail tandem recombinant polypeptide expression method specifically includes the following steps:
  • auxiliary peptide segment-(restriction site-polypeptide-restriction site-polypeptide) n where n is 2-8.
  • the recombinant expression vector is transferred into the host cell to obtain the recombinant genetically engineered bacteria expressing the polypeptide;
  • the expression vector in step 2) refers to an E. coli expression vector containing expression promoters including T7, Tac, Trp, and lac, or a yeast expression vector containing an alpha secretion factor and an AOX or GAP expression promoter.
  • the host cell of step 3) may be Pichia pastoris or E. coli, preferably E. coli, more specifically BL21, BL21(DE3) or BL21(DE3)pLysS, and preferably BL21(DE3).
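The arrangement auxiliary peptide segment-(restriction site-polypeptide-restriction site-polypeptide)n described in the steps above can be sketched as a simple string assembly. This is a minimal illustration, not the sequence design actually used: the function name is hypothetical, and the short auxiliary peptide and polypeptide in the usage lines are placeholders rather than sequences from the specification.

```python
def build_tandem_fusion(aux, site, peptide, n):
    """Assemble aux-(site-peptide-site-peptide)n as a one-letter amino acid string.

    Each repeat unit carries two copies of the polypeptide, so the fusion
    contains 2 * n copies in total.
    """
    if not 2 <= n <= 8:
        raise ValueError("n is expected to be between 2 and 8")
    unit = (site + peptide) * 2
    return aux + unit * n


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the layout.
    fusion = build_tandem_fusion(aux="MHHHHHH", site="KR", peptide="HAEG", n=2)
    print(fusion)
    print("copies of the polypeptide:", fusion.count("KRHAEG"))
```

With n between 2 and 8 the fusion carries 4 to 16 copies of the polypeptide, each preceded by a cleavage site.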
  • Recombinant dibasic amino acid endopeptidase (recombinant Kex2 protease, Kex2 for short) is a Kex2-like proteolytic enzyme on the yeast cell membrane that specifically hydrolyzes carboxy-terminal peptide bonds in the alpha factor precursor, that is, the peptide bond on the carboxy-terminal side of two consecutive basic amino acids (such as Lys-Arg, Lys-Lys and Arg-Arg, of which Lys-Arg is digested most efficiently). Its optimal pH for activity is 9.0-9.5.
  • the enzyme digestion buffer can be Tris-HCl buffer, phosphate buffer, or borate buffer, preferably Tris-HCl buffer.
  • Recombinant carboxypeptidase B (CPB) selectively hydrolyzes arginine (Arg, R) and lysine (Lys, K) at the carboxy terminus of a protein or polypeptide, cutting basic amino acids preferentially.
  • the optimal pH for activity is 8.5-9.5.
  • the enzyme digestion buffer can be Tris-HCl buffer, phosphate buffer, or borate buffer, preferably Tris-HCl buffer.
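The combined action of the two enzymes can be sketched as a toy in silico digestion: Kex2 cuts after each dibasic site, and CPB then removes basic residues from the new C-termini. This is an idealized model written for illustration only; it ignores kinetics, pH and the lower efficiency at Lys-Lys, treats only KR and RR as Kex2 sites, and the auxiliary peptide in the usage lines is an assumed placeholder (Met, six His, EEAEAEA). The target is SEQ ID NO: 1 in one-letter code.

```python
import re


def kex2_cleave(seq, sites=("KR", "RR")):
    """Cut on the carboxy-terminal side of each dibasic site."""
    pattern = "|".join(sites)
    fragments, start = [], 0
    for match in re.finditer(pattern, seq):
        fragments.append(seq[start:match.end()])
        start = match.end()
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def cpb_trim(fragment):
    """Remove K or R one residue at a time from the C-terminus until a
    non-basic residue is reached."""
    return fragment.rstrip("KR")


def digest(fusion):
    """Apply the idealized Kex2 step followed by the idealized CPB step."""
    return [cpb_trim(fragment) for fragment in kex2_cleave(fusion)]


if __name__ == "__main__":
    aux = "MHHHHHHEEAEAEA"                      # assumed auxiliary peptide
    target = "HAEGTFTSDVSSYLEGQAAKEFIAWLVRGRG"  # SEQ ID NO: 1
    fusion = aux + ("KR" + target) * 4
    for fragment in digest(fusion):
        print(fragment)
```

In this model the Met-bearing auxiliary peptide is released as a separate fragment and every copy of the target is left without linker residues at either end. A target whose own C-terminus is K or R would be over-trimmed by this simple CPB model.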
  • the present invention has the following advantages:
  • with the newly designed head-to-tail tandem polypeptide, the recombinant engineered bacteria keep the plasmid loss rate within 80 generations at no more than 10%, with essentially no effect on target protein expression, so high-density, high-level expression can be achieved in industrial-scale fermentation;
  • with the conventional method of expressing a polypeptide as a fusion, the expression level of the fusion protein is relatively high, but the irrelevant protein portion must be removed after enzymatic cleavage, and only the molar equivalent of target polypeptide is obtained. With the tandem design, digestion of the glucagon-like peptides and analogs yields intact target peptides throughout.
  • the design of the present invention completely overcomes the heterogeneity caused by the N-terminal Met: the N-terminal Met is removed completely during digestion, giving target polypeptides with a completely uniform N-terminus.
  • Kex2 and recombinant carboxypeptidase B digest with high specificity and produce no related substances from non-specific cleavage. Digestion yields target peptides that all have the correct structure, which greatly reduces the difficulty of subsequent purification and separation, gives a target polypeptide of very high purity, improves the recovery of the recombinant polypeptide, and lowers the cost of producing polypeptides by genetic engineering;
  • Reversed-phase purification has high separation effect and high recovery rate.
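The yield argument made in the advantages above can be made concrete by comparing how much of each fusion protein is target polypeptide. The following is a rough sketch that uses residue counts as a stand-in for mass; the function name, the 14-residue auxiliary peptide and the 109-residue carrier with a 5-residue cleavage site are illustrative assumptions, not figures from the specification.

```python
def target_residue_fraction(n_copies, target_len, linker_len, aux_len):
    """Fraction of residues in the fusion that belong to the target polypeptide.

    The fusion is modeled as one auxiliary or carrier segment followed by
    n_copies of (linker + target).
    """
    total = aux_len + n_copies * (linker_len + target_len)
    return n_copies * target_len / total


if __name__ == "__main__":
    # Tandem design: 8 copies of a 31-residue target with 2-residue linkers.
    tandem = target_residue_fraction(8, 31, 2, 14)
    # Single-copy carrier fusion with a hypothetical 109-residue carrier.
    single = target_residue_fraction(1, 31, 5, 109)
    print(f"tandem: {tandem:.1%}, single-copy carrier fusion: {single:.1%}")
```

Under these assumptions roughly nine tenths of the tandem fusion is target polypeptide, against about one fifth for the single-copy carrier fusion.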
  • the present invention proposes a system for obtaining free target protein.
  • the system includes: a fusion protein preparation device 100, which provides the aforementioned fusion protein; and a digestion device 200, which is connected to the fusion protein preparation device 100 so that the fusion protein is brought into contact with a protease. The protease is determined based on the linking sequence, and the multiple target protein sequences are not cleaved by the protease, so that multiple free target proteins are obtained. Neither the C-terminus nor the N-terminus of the free target protein contains additional residues.
  • the system according to the embodiment of the present invention is suitable for performing the aforementioned method for obtaining free target protein. The free target protein obtained contains no additional residues at the C-terminus or N-terminus, so the quality of the target protein is significantly improved, purification of subsequent products is greatly facilitated, the safety of the target protein as a pharmaceutical polypeptide is significantly improved, and its immunotoxicity is significantly reduced.
  • the digestion device is provided with a first protease digestion unit 201 and a second protease digestion unit 202, and the first protease digestion unit 201 is connected to the second protease digestion unit 202.
  • the fusion protein can be digested in the first protease digestion unit, and the first protease digestion product enters the second protease digestion unit for further digestion.
  • the protease can be added manually to the first protease digestion unit and the second protease digestion unit, or the first protease and the second protease can be immobilized in those units, so that digestion of the fusion protein can be industrialized and automated.
  • when the linking sequence constitutes the C-terminus of the target protein sequence and that C-terminus is a consecutive KR, the first protease digestion unit and the second protease digestion unit are both immobilized with protease Kex2.
  • the free target protein can be obtained after the fusion protein is digested in the first protease digestion unit. The first protease digestion product can then either enter the second protease digestion unit, where undigested or incompletely digested fusion protein is digested further, or be discharged directly as free target protein.
  • the linking sequence contains a first protease recognition site and a second protease recognition site, the plurality of target protein sequences do not contain the second protease recognition site, the first protease digestion unit 201 is immobilized with a first protease, and the second protease digestion unit 202 is immobilized with a second protease. The fusion protein contacts the first protease in the first protease digestion unit to obtain the first protease cleavage product, whose N-terminus does not carry residues of the linking sequence; the first protease cleavage product then contacts the second protease in the second protease digestion unit, and the second protease is suitable for cleaving the C-terminus of the first protease cleavage product to obtain a plurality of the target proteins.
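  • the recognition sites and proteases satisfy one of the following conditions:
  • the amino acid sequence of the target protein sequence has no consecutive KR or RR and may or may not have consecutive KK or RK; the first protease recognition site is KR or RR or RKR, the first protease is Kex2, the second protease recognition site is the carboxy-terminal R or K, and the second protease is CPB; or
  • the amino acid sequence of the target protein sequence has no K but has R; the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is the carboxy-terminal K, and the second protease is CPB; or
  • the amino acid sequence of the target protein sequence has no K and no R; the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is the carboxy-terminal K or R, and the second protease is CPB; or
  • the amino acid sequence of the target protein sequence has consecutive KR or RR or KK or RK adjacent to 1 or 2 consecutive acidic amino acids; the first protease recognition site is KR or RR or RKR, the first protease is Kex2, the second protease recognition site is the carboxy-terminal R or K, and the second protease is CPB.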
  • digestion of the fusion protein in the first protease digestion unit cleaves the carboxy-terminal peptide bond at the first protease recognition site of the linking sequence, giving a first protease digestion product with no linking sequence residues at its N-terminus.
  • in the second protease digestion unit, the linking sequence residues at the carboxy terminus of the first protease digestion product are removed one after another, giving a free target protein with no linking sequence residues at its C-terminus.
  • the first protease and the second protease can also be added to a single system to digest the fusion protein simultaneously; the activities of the first protease and the second protease selected according to the embodiments of the present application do not interfere with each other.
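The choice of protease pair from the composition of the target sequence can be expressed as a small rule table. This is a sketch for illustration only: the function name and the returned tuples are assumptions, and the case that relies on acidic residues next to an internal dibasic site is deliberately left out.

```python
def candidate_strategies(target):
    """Return (first protease, second protease, first recognition site) tuples
    whose stated sequence condition is met by the one-letter target sequence."""
    options = []
    has_k = "K" in target
    has_r = "R" in target
    has_kr_or_rr = "KR" in target or "RR" in target
    if not has_kr_or_rr:
        # No consecutive KR or RR in the target (KK or RK may be present).
        options.append(("Kex2", "CPB", "KR/RR/RKR"))
    if not has_k and has_r:
        # No K but R present.
        options.append(("Lys-C", "CPB", "K"))
    if not has_k and not has_r:
        # Neither K nor R present.
        options.append(("Lys-C or Trp", "CPB", "K or R"))
    return options


if __name__ == "__main__":
    seq1 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVRGRG"  # SEQ ID NO: 1
    print(candidate_strategies(seq1))
```

SEQ ID NO: 1 contains both K and R but no consecutive KR or RR, so only the Kex2 and CPB pair is returned.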
  • the fusion protein preparation device includes a fermentation unit 101 adapted to ferment a microorganism that carries a nucleic acid encoding the fusion protein; preferably, the microorganism is Escherichia coli.
  • the device for preparing a fusion protein further includes a dissolving unit 102, which is connected to the fermentation unit and used for disrupting and dissolving the product of microbial fermentation; the dissolving treatment is carried out in the presence of a detergent to obtain the fusion protein.
  • the digestion device further includes an adjustment unit 203 for adjusting the amount of protease, so that the mass ratio of the fusion protein to the protease is 250:1 to 2000:1.
  • the regulating unit is used to regulate the amount of protease to realize the specific cleavage of the fusion protein at the restriction site of the linking sequence.
  • Arg34-GLP-1(7-37) (SEQ ID NO: 1) was arranged according to auxiliary peptide-(restriction site-polypeptide-restriction site-polypeptide)4, repeated in tandem into one sequence (SEQ ID NO: 13), using E. coli codon preference.
  • the PUC-57-Arg34-GLP-1(7-37) plasmid was double-digested with Nde I and BamH I endonucleases and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment of plasmid pET-30a (purchased from Novagen) recovered after the same Nde I and BamH I digestion and transformed into the E. coli cloning host strain Top10, and the recombinant plasmid pET-30a-Arg34-GLP-1(7-37) was screened.
  • the PUC-57-Arg34-GLP-1(9-37) plasmid was double-digested with Nde I and BamH I endonucleases and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment of plasmid pET-30a (purchased from Novagen) recovered after the same Nde I and BamH I digestion and transformed into the E. coli cloning host strain Top10, and the recombinant plasmid pET-30a-Arg34-GLP-1(9-37) was screened.
  • Arg34-GLP-1(11-37) (SEQ ID NO: 3) was arranged according to auxiliary peptide-(restriction site-polypeptide-restriction site-polypeptide)4, repeated in tandem into one sequence (SEQ ID NO: 15), using E. coli codon preference.
  • the PUC-57-Arg34-GLP-1(11-37) plasmid was double-digested with Nde I and BamH I endonucleases and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment of plasmid pET-30a (purchased from Novagen) recovered after the same Nde I and BamH I digestion and transformed into the E. coli cloning host strain Top10, and the recombinant plasmid pET-30a-Arg34-GLP-1(11-37) was screened.
  • GLP-2 (SEQ ID NO: 4) was arranged according to auxiliary peptide-(restriction site-polypeptide-restriction site-polypeptide)4, repeated in tandem into one sequence (SEQ ID NO: 16). Using E. coli codon preference, the Nde I restriction site CAT ATG was added at the 5' end of the gene, and the double stop codon TAA TGA and the BamH I restriction site GGA TCC were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 10). Synthesis of the full gene was commissioned, and the gene was constructed on the PUC-57 vector to obtain the recombinant plasmid PUC-57-GLP-2, which was stored as a glycerol stock of strain Top10.
  • the PUC-57-GLP-2 plasmid was double-digested with Nde I and BamH I endonucleases and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment of plasmid pET-30a (purchased from Novagen) recovered after the same Nde I and BamH I digestion and transformed into the E. coli cloning host strain Top10, and the recombinant plasmid pET-30a-GLP-2 was screened by restriction digestion and PCR verification.
  • Glucagon (SEQ ID NO: 5) was arranged according to auxiliary peptide-(restriction site-polypeptide-restriction site-polypeptide)8, repeated in tandem into one sequence (SEQ ID NO: 17). Using E. coli codon preference, the Nde I restriction site CAT ATG was added at the 5' end of the gene, and the double stop codon TAA TGA and the BamH I restriction site GGA TCC were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 11). Synthesis of the full gene was commissioned, and the gene was constructed on the PUC-57 vector to obtain the recombinant plasmid PUC-57-Glucagon, which was stored as a glycerol stock of strain Top10.
  • the PUC-57-Glucagon plasmid was double-digested with Nde I and BamH I endonucleases and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment of plasmid pET-30a (purchased from Novagen) recovered after the same Nde I and BamH I digestion and transformed into the E. coli cloning host strain Top10, and the recombinant plasmid pET-30a-Glucagon was screened by restriction digestion and PCR verification.
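The flanking of the designed coding sequence with cloning sites can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the ATG inside the Nde I site is assumed to serve as the start codon (so the input holds the codons after the initiating Met), and the internal-site check is an added precaution that the text does not mention.

```python
NDE_I = "CATATG"      # Nde I site added at the 5' end
BAMH_I = "GGATCC"     # BamH I site added at the 3' end
DOUBLE_STOP = "TAATGA"


def build_insert(coding_seq):
    """Return Nde I site + coding sequence + TAA TGA + BamH I site.

    coding_seq holds the codons that follow the initiating Met (assumption).
    """
    coding_seq = coding_seq.upper().replace(" ", "")
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for name, site in (("Nde I", NDE_I), ("BamH I", BAMH_I)):
        if site in coding_seq:
            # An internal site would be cut during the double digestion.
            raise ValueError(f"internal {name} site found in coding sequence")
    return NDE_I + coding_seq + DOUBLE_STOP + BAMH_I


if __name__ == "__main__":
    # Three His codons as a placeholder coding sequence.
    print(build_insert("CAT CAT CAT"))
```

The same double digestion with Nde I and BamH I then releases the insert from the PUC-57 construct and opens pET-30a for ligation.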
  • TB4 (SEQ ID NO: 6) was arranged according to auxiliary peptide-(restriction site-polypeptide-restriction site-polypeptide)4, repeated in tandem into one sequence (SEQ ID NO: 18), using E. coli codon preference.
  • the PUC-57-TB4 plasmid was double-digested with Nde I and BamH I endonucleases and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment of plasmid pET-30a (purchased from Novagen) recovered after the same Nde I and BamH I digestion and transformed into the E. coli cloning host strain Top10, and the recombinant plasmid pET-30a-TB4 was screened by restriction digestion and PCR verification. After DNA sequencing proved that the cDNA sequence of TB4 in the recombinant plasmid was correct, the E.
  • FIG. 16. The plasmid restriction digestion identification map is shown in Figure 17: after digestion, the recombinant plasmid gives bands of about 5000 bp and 600 bp, corresponding to pET-30a and TB4 respectively and consistent with the theoretical values, indicating that TB4 is correctly ligated into the vector pET-30a.
  • Example 7 Fermentation culture of the pET-30a-Arg34-GLP-1(7-37)/BL21(DE3), pET-30a-Arg34-GLP-1(9-37)/BL21(DE3), pET-30a-Arg34-GLP-1(11-37)/BL21(DE3), pET-30a-GLP-2/BL21(DE3), pET-30a-Glucagon/BL21(DE3) and pET-30a-TB4/BL21(DE3) recombinant engineered strains
  • The fermentation cells of the pET-30a-Arg34-GLP-1(9-37)/BL21(DE3) recombinant engineered bacteria are resuspended in disruption buffer, homogenized three times at high pressure (600-700 bar), stirred and centrifuged at room temperature, and the precipitate is collected. The precipitate is resuspended in washing solution according to the mass-to-volume ratio and homogenized until no visible particles remain, stirred at room temperature for 30 minutes, and centrifuged to collect the precipitate.
  • The precipitate is dissolved in surfactant-containing digestion buffer at 3-5% (mass-to-volume ratio, g/mL), the pH is adjusted to 10.5, the mixture is stirred for 30 minutes at 28°C to 32°C, the supernatant is collected by centrifugation, and a sample is taken to determine the content by UV absorbance at OD280. The pH of the dissolved sample is adjusted to 8.0-9.0, recombinant protease Kex2 and recombinant protease CPB are added at a mass ratio of 1:1000, the digest is stirred overnight at 25-35°C, and a sample is taken for RP-HPLC detection.
  • The Q anion chromatography column is cleaned and regenerated routinely and equilibrated with 2 CV of equilibration buffer. The digested sample is adjusted to pH 9.5 to 9.8, filtered and loaded onto the column (conductivity below 5 mS/cm). The column is re-equilibrated with 1 CV, eluted with eluent 1 until the UV absorbance returns to baseline, re-equilibrated with 2 CV of equilibration buffer, and eluted in one step with eluent 2, and the target peak is collected.
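The 3-5% mass-to-volume dissolution step above translates directly into a buffer volume range. The following is a small sketch of that arithmetic; the function name and the 30 g pellet in the usage lines are illustrative assumptions, not values from the example.

```python
def dissolution_volume_ml(precipitate_mass_g, percent_w_v):
    """Buffer volume in mL that gives the requested mass-to-volume percentage
    (grams of precipitate per 100 mL of buffer)."""
    if precipitate_mass_g <= 0 or percent_w_v <= 0:
        raise ValueError("mass and percentage must be positive")
    return precipitate_mass_g * 100.0 / percent_w_v


if __name__ == "__main__":
    mass_g = 30.0  # hypothetical amount of washed precipitate
    largest = dissolution_volume_ml(mass_g, 3.0)
    smallest = dissolution_volume_ml(mass_g, 5.0)
    print(f"{mass_g} g precipitate: {smallest} mL to {largest} mL of digestion buffer")
```

A lower percentage means a more dilute solution, so 3% corresponds to the upper end of the volume range.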
  • CHO-K1–CRE–GLP1R cells transfected with GLP-1R receptor of our company were used to determine the in vitro biological activity.
  • CHO-K1–CRE–GLP1R cells were plated overnight, the recombinant CHO-K1–CRE–GLP1R cells were stimulated with Arg34-GLP-1(9-37) polypeptide, and the reaction was carried out at 37°C with 5% CO2 for 4 hours ± 15 minutes.
  • Add Promega kit chemiluminescent substrate Cat.
  • Recombinant CHO-K1–CRE–GLP2R cells transfected with GLP-2R receptor of our company were used to determine in vitro biological activity.
  • the specific method is: CHO-K1–CRE–GLP2R cells are plated overnight, GLP-2 protein is used to stimulate the recombinant CHO-K1–CRE–GLP2R cells, and the reaction is carried out at 37°C with 5% CO2 for 4 hours ± 15 minutes.
  • Promega kit chemiluminescent substrate (Cat. No.: E2510) is added at 100 µL/well, the plate is shaken gently on a shaker for 40 minutes ± 10 minutes at room temperature, and the plate is read on the fluorescence microplate reader (measurement time 1 second/well) to determine the RLU.
  • "Sigmaplot" software is used to fit a four-parameter regression curve and calculate the half-maximal effective dose (EC50) of GLP-2. The result is shown in Figure 21.
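The four-parameter regression used for the EC50 can be written out explicitly. This sketch shows only the standard four-parameter logistic model, not the Sigmaplot fitting procedure; the parameter names and the numbers in the usage lines are placeholders, not measured results.

```python
def four_pl(dose, bottom, top, ec50, hill):
    """Four-parameter logistic response (for example RLU) at a given dose.

    bottom and top are the lower and upper plateaus, ec50 is the dose giving
    the half-maximal response, and hill is the slope factor.
    """
    if dose <= 0 or ec50 <= 0:
        raise ValueError("dose and ec50 must be positive")
    return bottom + (top - bottom) / (1.0 + (ec50 / dose) ** hill)


if __name__ == "__main__":
    # Placeholder parameters chosen only to show the shape of the curve.
    params = dict(bottom=100.0, top=900.0, ec50=2.0, hill=1.5)
    for dose in (0.2, 2.0, 20.0):
        print(dose, round(four_pl(dose, **params), 1))
```

By construction the response at a dose equal to the EC50 lies midway between the two plateaus, which is what the fitted EC50 reports.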
  • fusion proteins with and without the expression-promoting short peptide EEAEAEARG were designed, and comparison experiments on fermentation-induced expression, fusion protein solubilization and enzymatic cleavage of the fusion protein were performed. The results are as follows.
  • the content of the target protein in the supernatant containing EEAEAEARG promoting peptide is significantly higher than that in the supernatant without EEAEAEARG structure.
  • the digestion efficiency with EEAEAEARG is 96.6%, and the digestion efficiency without EEAEAEARG is 62.3%.
  • the digestion efficiency with EEAEAEARG is higher than that without EEAEAEARG.

Abstract

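The present invention provides a fusion protein comprising multiple target protein sequences in tandem, in which every two adjacent target protein sequences are joined by a linking sequence. The linking sequence is suitable for cleavage by a protease to form multiple free target proteins, the multiple target protein sequences are not cleaved by the protease, and neither the C-terminus nor the N-terminus of the free target proteins contains additional residues.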

Description

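Method for preparing target polypeptides using a recombinant tandem fusion protein
Technical Field
The present invention relates to the field of biomedicine. Specifically, it relates to a fusion protein and to a method and system for preparing a fusion protein; more specifically, it relates to a fusion protein, a method and system for preparing a fusion protein, a nucleic acid, a construct and a recombinant cell.
Background Art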
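Polypeptides generally refer to active compounds composed of fewer than 100 amino acids. Polypeptide drugs are polypeptides or modified forms thereof used for the prevention, diagnosis and treatment of disease. They are now widely used in many disease areas, and the FDA has so far approved about 70 polypeptide drugs. Polypeptide drugs show significant efficacy in diabetes, osteoporosis, intestinal diseases, thrombocytopenia, tumors, cardiovascular diseases, antiviral therapy, immune diseases and other areas.
Preproglucagon is a precursor polypeptide of 158 amino acids that is processed differentially in tissues to form a variety of structurally related glucagon-like peptides, including glucagon, glucagon-like peptide-1 (GLP-1) and glucagon-like peptide-2 (GLP-2). These molecules take part in many physiological functions, including glucose homeostasis, insulin secretion, gastric emptying, intestinal growth and regulation of food intake.
Glucagon is mainly used to treat severe hypoglycemic reactions in diabetic patients receiving insulin therapy; the marketed drug is GlucaGen. Glucagon-like peptide-1 (GLP-1) is mainly used for type II diabetes; current GLP-1 receptor agonist drugs include the marketed Exenatide, Exenatide QW, Liraglutide, Albiglutide, Dulaglutide, Lixisenatide and Semaglutide. Glucagon-like peptide-2 (GLP-2) is mainly used for short bowel syndrome; the marketed drug is Teduglutide.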
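Human glucagon-like peptide-1 (GLP-1) is an insulinotropic polypeptide hormone secreted by the intestinal mucosa. It regulates glucose metabolism by increasing insulin secretion and inhibiting glucagon release, and it also reduces intestinal motility, induces satiety and suppresses appetite. GLP-1 promotes the proliferation of pancreatic islet β cells and inhibits their apoptosis, thereby increasing the number and function of pancreatic β cells. Most importantly, its glucose-lowering effect occurs only at elevated blood glucose concentrations, which avoids hypoglycemia caused by excessive insulin secretion. GLP-1 also improves the insulin sensitivity of receptor cells and helps treat insulin resistance; long-term treatment significantly improves medium- and long-term indicators such as glycated hemoglobin; and for type II diabetes caused by obesity, it helps patients control their diet and lose weight by inhibiting gastric emptying. In the past two years, GLP-1 drugs such as liraglutide and semaglutide have successively been shown to provide cardiovascular benefit. Insulin therapy usually has the disadvantages of weight gain and a risk of hypoglycemia, and GLP-1 receptor agonist drugs meet precisely these clinical needs.
The mechanisms by which GLP-1 drugs, represented by liraglutide, treat diabetes include: stimulating insulin secretion in a physiological and glucose-dependent manner; reducing glucagon secretion; inhibiting gastric emptying; reducing appetite; and promoting the growth and recovery of pancreatic β cells.
When the blood glucose concentration exceeds the normal level, GLP-1 stimulates insulin secretion through the mechanisms above and thereby lowers blood glucose, so GLP-1 is a highly effective, glucose-dependent hypoglycemic drug. In view of these properties, and of analyses of the clinical efficacy of GLP-1 drugs over many years, GLP-1 is a suitable drug candidate for treating type II diabetes, and subjects obtain better efficacy when GLP-1 is combined with insulin to treat type I diabetes. GLP-1 has a potential glucose-lowering effect that is seen even in patients in whom sulfonylurea treatment has failed, and it does not cause a risk of severe hypoglycemia. In addition, GLP-1 increases the rate of insulin biosynthesis and restores the ability of rat pancreatic β-cells to respond rapidly to a rise in blood glucose (that is, first-phase insulin release). It has been reported in the literature that GLP-1 can stimulate the growth and proliferation of pancreatic β-cells and promote the conversion of ductal cells into new pancreatic β-cells. Several human trials have shown that GLP-1 likewise takes part in the preservation and repair of the pancreatic β-cell population.
The main points of competition among GLP-1 drugs include dosing frequency, glucose-lowering effect, weight-reducing effect and immunogenicity. The main disadvantages of Exenatide are its short dosing interval and strong immunogenicity. Albiglutide is clearly weaker in both glucose-lowering and weight-reducing effect; although it was the first once-weekly long-acting GLP-1, it has performed far worse than Dulaglutide, which was launched later. In addition, the cardiovascular risk of GLP-1 drugs has attracted much attention; insulin degludec, already marketed in Japan, the European Union and the United States, had its approval delayed by the US FDA because of concerns about cardiovascular risk. Liraglutide and Semaglutide have successively been shown in the past two years to provide cardiovascular benefit, which has greatly increased the competitiveness of GLP-1 receptor agonist drugs in the overall market.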
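Glucagon-like peptides and analogs are mostly prepared in three ways: natural extraction, artificial chemical synthesis and genetic engineering. At present polypeptide drugs are mainly made by chemical synthesis, but solid-phase synthesis is costly, the heavy use of organic solvents may affect the activity of the polypeptide, and analysis of the related substances of solid-phase-synthesized polypeptides is difficult, since related substances such as truncated peptides, epimers and chiral isomers must be strictly controlled. The published patent CN201210369966 prepares liraglutide by fully artificial chemical synthesis.
With the development of molecular biology techniques, more and more marketed polypeptide drugs are prepared by genetic engineering; for example, Liraglutide and Semaglutide are expressed recombinantly in a yeast system. When a yeast system is used to express a foreign protein recombinantly, the several protease families present in yeast may degrade the foreign protein. Small peptides of simple structure are especially prone to degradation, the degradation products increase as fermentation time is extended, and they are difficult to separate effectively by purification. Studies have found that degradation during fermentation is caused by cleavage of the polypeptide by proteases contained in the yeast. Changing the expression host strain, altering fermentation conditions and similar measures can partly reduce degradation but cannot meet the requirements of industrialization. Knocking out or inactivating specific protease genes in the yeast host by molecular biology techniques can partly prevent degradation of the polypeptide, but it is technically difficult and cannot fully overcome the problem. Novo Nordisk uses Saccharomyces cerevisiae strain YES2085 (YPS1 and PEP4 knocked out to prevent degradation) to express Arg34-GLP-1(7-37) efficiently (US20100317057).
The E. coli expression system is also a commonly used means of expressing recombinant foreign proteins. Polypeptide drugs are structurally simple, have no complex higher-order structure and contain no glycosylation sites. E. coli contains relatively few proteases, and active intact polypeptides can be obtained by recombinant expression in an E. coli system. With a conventional E. coli fusion expression system, the target polypeptide can be obtained by enzymatic cleavage, but the yield and recovery of the polypeptide after cleavage are markedly reduced, which seriously restricts the industrialization of polypeptide drugs.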
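Among published invention patents on the preparation of GLP-1 polypeptides, CN201610753093.4 expresses Arg34-GLP-1(7-37) as a chaperone protein-enterokinase fusion. Although the expression level of the fusion protein is relatively high, the Arg34-GLP-1(7-37) obtained by enzymatic cleavage accounts for only one tenth of the total fusion protein, so the yield of the target protein is low. Moreover, chaperone proteins (TrxA, DsbA) are suited to the fusion expression of large proteins that need refolding, whereas Arg34-GLP-1(7-37) has a simple spatial structure and needs no conformational refolding. The purification process must strictly control the residual amount of chaperone protein introduced by cleavage to prevent the safety risk it poses. CN201610857663.4 expresses GLP-1(7-37) recombinantly as a SUMO-GLP-1(7-37) fusion protein.
Among published invention patents on the preparation of GLP-2 polypeptides, CN104072604B, CN101171262 and CN102659938A all prepare GLP-2 analog polypeptides by solid-phase or liquid-phase synthesis; CN103159848A prepares a two-copy tandem GLP-2 polypeptide by tandem expression; CN103945861A prepares a fusion polypeptide in which a recombinant polypeptide is linked to GLP-2; and the Shanghai Institute of Pharmaceutical Industry prepares GLP-2 by enterokinase cleavage and acid cleavage (CN201610537328.6). In the last of these methods, obtaining intact GLP-2 requires a strong acid to cleave the aspartic acid-proline (D-P) bond at the acid cleavage site. The acid cleavage may break the polypeptide into truncated peptides, and prolonged exposure to the acid cleavage solution may introduce deamidation-related substances, seriously affecting product quality and restricting subsequent purification.
In addition, in conventional recombinant expression in prokaryotic and eukaryotic cells, translation of every protein starts with an N-terminal methionine, so the first residue of the expression product is the non-target amino acid methionine. Only when the radius of gyration of the first amino acid of the target protein is 1.22 Å or smaller (as for Gly and Ala) can the N-terminal methionine be removed effectively by the methionine-removing enzyme. When the target protein is expressed at a high level, however, the enzyme that removes methionine becomes saturated and lacks cofactors, so the methionine is usually not removed. This can lead to a heterogeneous N-terminus (with or without Met) and to an expressed protein whose amino acid sequence differs from that of the target protein (Met at the first position), which may trigger immunotoxicity.
At present, preparing polypeptides by genetic engineering, obtaining polypeptide drugs that meet pharmaceutical standards efficiently, and minimizing the toxic and side effects caused by polypeptide drugs are key problems that biomedical researchers continue to work on.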
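Summary of the Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, in a first aspect, the present invention proposes a fusion protein. According to an embodiment of the invention, the fusion protein comprises multiple target protein sequences in tandem, and every two adjacent target protein sequences are joined by a linking sequence, wherein the linking sequence is suitable for cleavage by a protease to form multiple free target proteins, the multiple target protein sequences are not cleaved by the protease, and neither the C-terminus nor the N-terminus of the free target proteins contains additional residues. It should be noted that "the multiple target protein sequences are not cleaved by the protease" in this application means that the target protein sequence cannot be cleaved internally by the protease, that is, the protease cannot cleave internal peptide bonds of the target protein sequence; "additional residues" in this application means amino acid residues other than those of the target protein sequence. Under the action of the protease, the fusion protein according to the embodiment of the invention forms multiple free target proteins whose C-terminus and N-terminus contain no additional residues. The quality of the target protein is significantly improved, purification of subsequent products is greatly facilitated, the safety of the target protein as a pharmaceutical polypeptide is significantly improved, and its immunotoxicity is significantly reduced.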
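According to embodiments of the invention, the above fusion protein may further include at least one of the following additional technical features:
According to an embodiment of the invention, at least part of the linking sequence constitutes part of the C-terminus of the target protein sequence.
According to an embodiment of the invention, the linking sequence consists of at least one protease recognition site.
According to an embodiment of the invention, the linking sequence constitutes the C-terminus of the target protein sequence. Specifically, the C-terminus of the target protein sequence is a consecutive KR and the protease is Kex2. The consecutive KR at the C-terminus in the fusion protein is then recognized by protease Kex2, which cleaves the peptide bond after R to form multiple free target proteins.
According to an embodiment of the invention, the linking sequence contains a first protease recognition site and a second protease recognition site, and the multiple target protein sequences do not contain the second protease recognition site. The first protease recognition site is suitable for recognition and cleavage by a first protease to form a first protease cleavage product, and the second protease recognition site is suitable for recognition and cleavage by a second protease. The N-terminus of the first protease cleavage product carries no residues of the linking sequence, and the second protease is suitable for cleaving the C-terminus of the first protease cleavage product to form multiple free target proteins, neither the C-terminus nor the N-terminus of which contains residues of the linking sequence.
According to an embodiment of the invention, at least one internal first protease recognition site is present in the multiple target protein sequences, wherein the first protease recognizes the internal first protease recognition site less efficiently than it recognizes the first protease recognition site in the linking sequence. Under suitable conditions, the first protease can therefore cleave at the first protease recognition site of the linking sequence while the internal peptide bonds of the target protein sequence are not cleaved by the first protease.
According to an embodiment of the invention, the first protease is Kex2, the internal first protease recognition site is at least one of KK and RK, and the first protease recognition site in the linking sequence is KR or RR or RKR. The inventors found that protease Kex2 can recognize KR, RR, KK and RK, but that its ability to cleave KR or RR is significantly greater than its ability to cleave KK and RK. By adjusting the amount of Kex2, the inventors can therefore achieve recognition and cleavage of KR or RR or RKR in the linking sequence without cleaving the peptide bond after K in KK or RK in the linking sequence; for example, this cleavage pattern is achieved when the mass ratio of fusion protein to Kex2 is 2000:1.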
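According to an embodiment of the invention, a consecutive acidic amino acid sequence adjacent to the internal first protease recognition site is present upstream or downstream of that site. The inventors found that the adjacent consecutive acidic amino acid sequence can mask the first protease recognition site in the target protein sequence, so that the first protease cannot recognize and cleave it.
According to an embodiment of the invention, the consecutive acidic amino acid sequence is 1-2 amino acids long. The inventors found that a length of 1-2 amino acids masks the first protease recognition site in the target protein sequence more effectively.
According to an embodiment of the invention, the acidic amino acid is aspartic acid or glutamic acid, preferably aspartic acid. The inventors found that the masking of the first protease recognition site in the target protein sequence is more pronounced when the adjacent consecutive acidic amino acid sequence is aspartic acid.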
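Whether an internal dibasic site in a target sequence has an acidic neighbor can be checked with a short scan. This is a simplified sketch, not the inventors' screening procedure: the function name is hypothetical, only the residue immediately before or after the site is inspected, and the sequences in the usage lines are made-up placeholders.

```python
import re

ACIDIC = {"D", "E"}


def internal_dibasic_sites(target):
    """List (position, site, masked) for each KK, RK, KR or RR pair.

    position is the 0-based index of the first residue of the pair, and masked
    is True when the residue directly before or after the pair is D or E.
    """
    results = []
    # The lookahead lets overlapping pairs such as the two in "KKR" be found.
    for match in re.finditer(r"(?=([KR][KR]))", target):
        i = match.start()
        before = target[i - 1] if i > 0 else ""
        after = target[i + 2] if i + 2 < len(target) else ""
        masked = before in ACIDIC or after in ACIDIC
        results.append((i, match.group(1), masked))
    return results


if __name__ == "__main__":
    print(internal_dibasic_sites("GADKRAG"))  # placeholder with D before KR
    print(internal_dibasic_sites("GAKKTG"))   # placeholder without acidic neighbor
```

A site reported as not masked would have to rely on the lower cleavage efficiency at KK or RK, or on a different protease pair.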
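According to an embodiment of the invention, the first protease recognition site and the second protease recognition site have an overlapping region.
According to an embodiment of the invention, the recognition sites of the first protease and the second protease are the same or different.
According to an embodiment of the invention, the first protease recognition site and the second protease recognition site satisfy one of the following conditions: the amino acid sequence of the target protein sequence has no consecutive KR or RR and may or may not have consecutive KK or RK, the first protease recognition site is KR or RR or RKR, the first protease is Kex2, the second protease recognition site is the carboxy-terminal R or K, and the second protease is CPB; or the amino acid sequence of the target protein sequence has no K but has R, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is the carboxy-terminal K, and the second protease is CPB; or the amino acid sequence of the target protein sequence has no K and no R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is the carboxy-terminal K or R, and the second protease is CPB; or the amino acid sequence of the target protein sequence has consecutive KR or RR or KK or RK adjacent to 1 or 2 consecutive acidic amino acids, the first protease recognition site is KR or RR or RKR, the first protease is Kex2, the second protease recognition site is the carboxy-terminal R or K, and the second protease is CPB. With the first and second protease recognition sites under these conditions, the fusion protein is cleaved specifically at the first protease recognition site, the N-terminus of the resulting first protease cleavage product contains no residues of the linking sequence, and under the action of the second protease the linking sequence residues at the C-terminus of the first protease cleavage product are removed one after another.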
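According to an embodiment of the invention, the fusion protein contains multiple linking sequences, which are the same or different.
According to an embodiment of the invention, the linking sequence is 1-10 amino acids long. According to a specific embodiment of the invention, the linking sequence may contain 1-5 of the first protease recognition sites and second protease recognition sites, which ensures that protease cleavage proceeds effectively.
According to an embodiment of the invention, the fusion protein further includes an auxiliary peptide segment whose carboxy terminus is joined through the linking sequence to the N-terminus of the tandem target protein sequences. Under the action of the above protease, the auxiliary peptide segment can be cut out of the fusion protein, and the N-terminus of the target protein sequence after cleavage contains no residues of the linking sequence.
According to an embodiment of the invention, the auxiliary peptide segment includes a tag sequence and, optionally, an expression-promoting sequence. The linking sequence facilitates subsequent recognition or purification of the fusion protein, and the expression-promoting sequence greatly increases the expression efficiency of the fusion protein.
According to an embodiment of the invention, the amino acid sequence of the tag sequence is a repeated His sequence.
According to an embodiment of the invention, the amino acid sequence of the expression-promoting sequence is EEAEAEA, EEAEAEAGG or EEAEAEARG. The inventors found that with these sequences the expression level and expression efficiency of the fusion protein are further increased.
According to an embodiment of the invention, the first amino acid of the auxiliary peptide segment is methionine. According to an embodiment of the invention, the methionine is removed along with the auxiliary peptide segment during subsequent enzymatic cleavage, which avoids the prior-art problems that methionine in the polypeptide is not readily removed, that the N-terminus is heterogeneous, and that the product is immunotoxic.
According to an embodiment of the invention, the target protein sequence is 10-100 amino acids long.
According to an embodiment of the invention, the fusion protein includes 4-16 target protein sequences in tandem. The inventors found that with 4-16 tandem target protein sequences the plasmid loss rate stays at no more than 10% within 80 doublings and the expression level of the target protein is unaffected, allowing industrial-scale fermentation with high cell density and high protein expression.
According to an embodiment of the invention, the amino acid sequences of the target protein sequences are as shown in SEQ ID NO: 1-6.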
His-Ala-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly(SEQ ID NO:1)。
Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly(SEQ ID NO:2)。
Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly(SEQ ID NO:3)。
His-Gly-Asp-Gly-Ser-Phe-Ser-Asp-Glu-Met-Asn-Thr-Ile-Leu-Asp-Asn-Leu-Ala-Ala-Arg-Asp-Phe-Ile-Asn-Trp-Leu-Ile-Gln-Thr-Lys-Ile-Thr-Asp(SEQ ID NO:4)。
His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr(SEQ ID NO:5)。
Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser(SEQ ID NO:6)。
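Because the cleavage rules in this document are stated in one-letter codes (KR, RR, KK, RK) while the sequences are listed in three-letter codes, a small conversion helper is convenient when checking a target for internal dibasic pairs. This is a minimal sketch; the helper names are illustrative assumptions, and only SEQ ID NO: 1 and SEQ ID NO: 6 are transcribed here.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

SEQ_ID_NO_1 = ("His-Ala-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-"
               "Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly")
SEQ_ID_NO_6 = ("Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-"
               "Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-"
               "Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser")


def to_one_letter(three_letter):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split("-"))


def dibasic_pairs(seq):
    """Return (index, pair) for each contiguous KR, RR, KK or RK in seq."""
    pairs = {"KR", "RR", "KK", "RK"}
    return [(i, seq[i:i + 2]) for i in range(len(seq) - 1) if seq[i:i + 2] in pairs]


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO: 1", SEQ_ID_NO_1), ("SEQ ID NO: 6", SEQ_ID_NO_6)):
        one = to_one_letter(seq)
        print(name, one, len(one), dibasic_pairs(one))
```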
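In a second aspect, the present invention provides a method for obtaining free target proteins. According to an embodiment of the present invention, the method comprises: providing the fusion protein described above; and contacting the fusion protein with a protease, the protease being determined on the basis of the linker sequence and the plurality of target protein sequences not being cleaved by the protease, so as to obtain a plurality of free target proteins whose C-terminus and N-terminus contain no additional residues. Because the free target proteins obtained by the method according to embodiments of the present invention contain no additional residues at either terminus, the quality of the target protein is significantly improved and subsequent purification of the product is greatly simplified; for use as a pharmaceutical polypeptide, the safety of the target protein is significantly improved and its immunotoxicity significantly reduced.
According to embodiments of the present invention, the above method may further include at least one of the following additional technical features:
According to an embodiment of the present invention, the linker sequence constitutes the C-terminus of the target protein sequence, the C-terminus of the target protein sequence is a contiguous KR, and the protease is Kex2. The contiguous KR at the C-terminus is thus recognized by the protease Kex2, which cleaves the peptide bond after R to form a plurality of free target proteins.
According to an embodiment of the present invention, the linker sequence contains a first protease recognition site and a second protease recognition site, the plurality of target protein sequences do not contain the second protease recognition site, and contacting the fusion protein with the protease further comprises: contacting the fusion protein with the first protease to obtain a first protease cleavage product whose N-terminus carries no residues of the linker sequence; and contacting the first protease cleavage product with the second protease, the second protease being adapted to cleave the C-terminus of the first protease cleavage product, so as to obtain a plurality of the free target proteins.
According to an embodiment of the present invention, at least one internal first protease recognition site is present in the plurality of target protein sequences, wherein the efficiency with which the first protease recognizes the internal first protease recognition site is lower than the efficiency with which it recognizes the first protease recognition site in the linker sequence. Under suitable conditions, the first protease can therefore cleave at the first protease recognition site of the linker sequence while the peptide bonds inside the target protein sequences are not cleaved by the first protease.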
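According to an embodiment of the present invention, the internal first protease recognition site is at least one of KK and RK, the first protease recognition site in the linker sequence is KR, RR or RKR, the second protease recognition site is a carboxy-terminal R or K, the first protease is Kex2, the second protease is CPB, and the mass ratio of the fusion protein to the first protease is 2000:1. The inventors found that the protease Kex2 can recognize KR, RR, KK and RK, but that its cleavage activity toward KR or RR is significantly greater than toward KK and RK. By adjusting the amount of Kex2 so that the mass ratio of fusion protein to Kex2 is 2000:1, the inventors can achieve recognition and cleavage of KR, RR or RKR in the linker sequence without cleaving the peptide bond after the K in KK or RK within the target protein sequences.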
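According to an embodiment of the present invention, a contiguous acidic amino acid sequence adjacent to the internal first protease recognition site is present upstream or downstream of that site. The inventors found that the adjacent contiguous acidic amino acid sequence masks the first protease recognition site in the target protein sequence, so that the first protease cannot recognize or cleave it.
According to an embodiment of the present invention, the contiguous acidic amino acid sequence is 1 to 2 amino acids long. The inventors found that a length of 1 to 2 amino acids masks the first protease recognition site in the target protein sequence more effectively.
According to an embodiment of the present invention, the acidic amino acid is aspartic acid or glutamic acid, preferably aspartic acid. The inventors found that the masking of the first protease recognition site in the target protein sequence is more pronounced when the adjacent contiguous acidic amino acid sequence consists of aspartic acid.
According to a specific embodiment of the present invention, contiguous DKR, DRR, DKK or DRK is present in the plurality of target protein sequences, the first protease recognition site is KR, RR or RKR, the second protease recognition site is a carboxy-terminal R or K, the first protease is Kex2, and the second protease is CPB. The first protease Kex2 then recognizes and cleaves only the first protease recognition site in the linker sequence and cannot recognize or cleave the contiguous DKR, DRR, DKK or DRK in the target protein sequences; the linker residues at the C-terminus of the first protease cleavage product are removed one by one by the second protease.
According to an embodiment of the present invention, the plurality of target protein sequences contain neither the first protease recognition site nor the second protease recognition site.
According to a specific embodiment of the present invention, the first protease and the second protease are as follows: the amino acid sequence of the target protein sequence has no contiguous KR, RR, KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or the amino acid sequence of the target protein sequence contains R but no K, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or the amino acid sequence of the target protein sequence contains neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB. Under these conditions, the first and second protease recognition sites according to embodiments of the present invention allow the fusion protein to be cleaved specifically at the first protease recognition site of the linker sequence; the N-terminus of the resulting first protease cleavage product contains no linker residues, and the linker residues at its C-terminus are removed one by one by the second protease.
According to an embodiment of the present invention, the mass ratio of the fusion protein to the first protease is 250:1 to 2000:1. The inventors found that at any mass ratio within this range the fusion protein is cleaved effectively, with high specificity, complete cleavage and few non-specific cleavage products.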
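As a worked illustration of the 250:1 to 2000:1 mass ratio, the protease requirement for a given batch follows by simple division. The sketch below is only arithmetic; the function name and the batch mass of 1000 mg are hypothetical values chosen for the example.

```python
def protease_mass_mg(fusion_mass_mg, ratio):
    """Mass of first protease (mg) for a fusion-protein:protease mass ratio of ratio:1."""
    if not 250 <= ratio <= 2000:
        raise ValueError("ratio outside the 250:1 to 2000:1 range stated in the text")
    return fusion_mass_mg / ratio


if __name__ == "__main__":
    batch_mg = 1000.0  # hypothetical amount of dissolved fusion protein
    for ratio in (250, 1000, 2000):
        print(f"{ratio}:1 -> {protease_mass_mg(batch_mg, ratio):.2f} mg protease")
```

For a 1000 mg batch this gives 4 mg of protease at 250:1 and 0.5 mg at 2000:1.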
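According to an embodiment of the present invention, the fusion protein is obtained by fermenting a microorganism that carries a nucleic acid encoding the fusion protein. This overcomes the drawbacks of chemically synthesized polypeptides, such as high synthesis cost and the effect of organic solvents on peptide activity.
According to an embodiment of the present invention, the microorganism is Escherichia coli. The inventors found that when a yeast system is used for recombinant expression of a foreign protein, the multiple protease families present in yeast may degrade the foreign protein, and small peptides of simple structure are especially prone to degradation. The polypeptides to be obtained in the present application have a simple structure, no complex higher-order structure and no glycosylation sites, and are therefore better suited to E. coli. E. coli contains fewer proteases, so recombinant expression in E. coli yields an active, intact intermediate product; in addition, the E. coli fermentation cycle is short and production cost is greatly reduced.
According to an embodiment of the present invention, the method further comprises disrupting and dissolving the microbial fermentation product, the dissolution being carried out in the presence of a detergent, so as to obtain the fusion protein. The detergent is a surfactant, which increases the solubility of the fusion protein and improves the efficiency of protease cleavage.
The choice of detergent in the present application is not particularly limited; different detergents, or combinations of detergents, may be selected according to the properties of the protease used. According to specific embodiments of the present invention, the detergent surfactants include (1) nonionic surfactants, such as PEG2000, Tween, sorbitol, urea, Triton X-100 and guanidine hydrochloride; (2) anionic surfactants, such as sodium dodecyl sulfate, sodium dodecyl sulfonate and stearic acid; (3) zwitterionic surfactants, such as 3-sulfopropyl tetradecyl dimethyl betaine, dodecyl dimethyl betaine and lecithin; and (4) cationic surfactants, such as quaternary ammonium compounds. The detergent dissolves the fusion protein efficiently without destroying the activity of the target protein and without affecting the activity of the first and second proteases used subsequently.
In a third aspect, the present invention provides a nucleic acid. According to an embodiment of the present invention, the nucleic acid encodes the fusion protein described above.
According to embodiments of the present invention, the above nucleic acid may further include at least one of the following additional technical features:
According to an embodiment of the present invention, the nucleic acid has the nucleotide sequence shown in any one of SEQ ID NO: 7 to 12.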
Figure PCTCN2020097058-appb-000001
Figure PCTCN2020097058-appb-000002
Figure PCTCN2020097058-appb-000003
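In a fourth aspect, the present invention provides a construct. According to an embodiment of the present invention, the construct carries the nucleic acid described above. After the construct according to embodiments of the present invention is introduced into a recipient cell, the fusion protein described above is expressed under conditions suitable for protein expression.
In a fifth aspect, the present invention provides a recombinant cell. According to an embodiment of the present invention, the recombinant cell contains the nucleic acid described above or the construct described above, or expresses the fusion protein described above.
According to embodiments of the present invention, the above recombinant cell may further include at least one of the following additional technical features:
According to an embodiment of the present invention, the recombinant cell is an E. coli cell.
In a sixth aspect, the present invention provides a system for obtaining free target proteins. According to an embodiment of the present invention, the system comprises: a fusion protein preparation device for providing the fusion protein described above; and a digestion device connected to the fusion protein preparation device, in which the fusion protein is contacted with a protease, the protease being determined on the basis of the linker sequence and the plurality of target protein sequences not being cleaved by the protease, so as to obtain a plurality of free target proteins whose C-terminus and N-terminus contain no additional residues. The system according to embodiments of the present invention is suitable for carrying out the method for obtaining free target proteins described above. The free target proteins obtained contain no additional residues at either terminus, so the quality of the target protein is significantly improved and subsequent purification of the product is greatly simplified; for use as a pharmaceutical polypeptide, the safety of the target protein is significantly improved and its immunotoxicity significantly reduced.
According to embodiments of the present invention, the above system may further include at least one of the following additional technical features:
According to an embodiment of the present invention, the digestion device is provided with a first protease digestion unit and a second protease digestion unit, which are connected to each other. The fusion protein can thus be digested in the first protease digestion unit, and the first protease digestion product then enters the second protease digestion unit for further digestion. The proteases may be added manually to the first and second protease digestion units, or the first and second proteases may be immobilized, enabling industrial-scale, automated digestion of the fusion protein.
According to an embodiment of the present invention, the linker sequence constitutes the C-terminus of the target protein sequence, the C-terminus of the target protein sequence is a contiguous KR, and the protease Kex2 is immobilized in both the first protease digestion unit and the second protease digestion unit. Free target protein is then obtained once the fusion protein has been digested in the first protease digestion unit. The first protease digestion product may either enter the second protease digestion unit, so that fusion protein that was undigested or incompletely digested is digested further, or be discharged directly to give the free target protein.
According to an embodiment of the present invention, the linker sequence contains a first protease recognition site and a second protease recognition site, the plurality of target protein sequences do not contain the second protease recognition site, the first protease is immobilized in the first protease digestion unit, and the second protease is immobilized in the second protease digestion unit: the fusion protein is contacted with the first protease in the first protease digestion unit to obtain a first protease cleavage product whose N-terminus carries no residues of the linker sequence; and the first protease cleavage product is contacted with the second protease in the second protease digestion unit, the second protease being adapted to cleave the C-terminus of the first protease cleavage product, so as to obtain a plurality of the target proteins.
According to an embodiment of the present invention, the amino acid sequence of the target protein sequence has no contiguous KR or RR and may or may not have contiguous KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or the amino acid sequence of the target protein sequence contains R but no K, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or the amino acid sequence of the target protein sequence contains neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB; or the amino acid sequence of the target protein sequence has contiguous KR, RR, KK or RK that is adjacent to 1 or 2 contiguous acidic amino acids, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB.
According to an embodiment of the present invention, the fusion protein preparation device comprises a fermentation unit adapted to ferment a microorganism that carries a nucleic acid encoding the fusion protein; preferably, the microorganism is E. coli.
According to an embodiment of the present invention, the fusion protein preparation device further comprises a dissolution unit connected to the fermentation unit and used to disrupt and dissolve the microbial fermentation product, the dissolution being carried out in the presence of a detergent, so as to obtain the fusion protein.
According to an embodiment of the present invention, the digestion device further comprises a regulating unit for adjusting the amount of protease so that the mass ratio of the fusion protein to the protease is 250:1 to 2000:1. By adjusting the amount of protease, the regulating unit achieves specific cleavage of the fusion protein at the cleavage sites of the linker sequence.
The advantages and effects of the additional technical features of the above system for obtaining free target proteins are similar to those of the above method for obtaining free target proteins and are not repeated here.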
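Brief Description of the Drawings
Figure 1 is a schematic structural diagram of a system for obtaining free target proteins according to an embodiment of the present invention;
Figure 2 is a schematic structural diagram of a digestion device according to an embodiment of the present invention;
Figure 3 is a schematic structural diagram of a fusion protein preparation device according to an embodiment of the present invention;
Figure 4 is another schematic structural diagram of a fusion protein preparation device according to an embodiment of the present invention;
Figure 5 is another schematic structural diagram of a digestion device according to an embodiment of the present invention;
Figure 6 is a schematic diagram of the construction of the pET-30a-Arg34-GLP-1(7-37) recombinant plasmid according to an embodiment of the present invention;
Figure 7 is a restriction analysis of pET-30a-Arg34-GLP-1(7-37) according to an embodiment of the present invention;
Figure 8 is a schematic diagram of the construction of the pET-30a-Arg34-GLP-1(9-37) recombinant plasmid according to an embodiment of the present invention;
Figure 9 is a restriction analysis of pET-30a-Arg34-GLP-1(9-37) according to an embodiment of the present invention;
Figure 10 is a schematic diagram of the construction of the pET-30a-Arg34-GLP-1(11-37) recombinant plasmid according to an embodiment of the present invention;
Figure 11 is a restriction analysis of pET-30a-Arg34-GLP-1(11-37) according to an embodiment of the present invention;
Figure 12 is a schematic diagram of the construction of the pET-30a-GLP-2 recombinant plasmid according to an embodiment of the present invention;
Figure 13 is a restriction analysis of pET-30a-GLP-2 according to an embodiment of the present invention;
Figure 14 is a schematic diagram of the construction of the pET-30a-Glucagon recombinant plasmid according to an embodiment of the present invention;
Figure 15 is a restriction analysis of pET-30a-Glucagon according to an embodiment of the present invention;
Figure 16 is a schematic diagram of the construction of the pET-30a-TB4 recombinant plasmid according to an embodiment of the present invention;
Figure 17 is a restriction analysis of pET-30a-TB4 according to an embodiment of the present invention;
Figure 18 is an SDS-PAGE analysis of induced expression in the pET-30a-Arg34-GLP-1(9-37)/BL21(DE3) recombinant engineered strain according to an embodiment of the present invention;
Figure 19 is a mass spectrum showing the molecular weight of Arg34-GLP-1(9-37) after enzymatic cleavage according to an embodiment of the present invention;
Figure 20 shows the in vitro cell-based biological activity of Arg34-GLP-1(9-37) according to an embodiment of the present invention;
Figure 21 shows the in vitro cell-based biological activity of GLP-2 according to an embodiment of the present invention;
Figure 22 compares the induced expression levels of fusion proteins with and without the expression-promoting short peptide EEAEAEARG according to an embodiment of the present invention;
Figure 23 compares the fusion protein content of the cell lysate supernatant with and without the expression-promoting short peptide EEAEAEARG according to an embodiment of the present invention;
Figure 24 compares the enzymatic cleavage efficiency of fusion proteins with and without the expression-promoting short peptide EEAEAEARG according to an embodiment of the present invention.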
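Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings. The embodiments described below with reference to the drawings are exemplary, are intended to explain the present invention, and should not be construed as limiting it.
To address the shortcomings of current genetic engineering techniques for recombinant expression of polypeptides, the present invention provides, in one aspect, a fusion protein and a novel method for recombinant expression of polypeptides in head-to-tail tandem.
In the present invention, the process of the novel head-to-tail tandem recombinant expression method specifically comprises the following steps:
1) The novel head-to-tail tandem polypeptide has the structure: auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)n, where n is 2 to 8. A DNA sequence encoding the amino acid sequence of this polypeptide is designed and produced by total gene synthesis;
2) An expression vector, namely a recombinant plasmid containing the DNA sequence encoding the amino acid sequence of this polypeptide, is constructed;
3) The recombinant expression vector is transferred into a host cell to obtain a recombinant genetically engineered strain expressing the polypeptide;
4) The recombinant genetically engineered strain is cultured by high-density fermentation;
5) The analog polypeptide precursor is double-digested with recombinant proteases specific for basic residues to obtain all of the target polypeptide;
6) High-purity polypeptide is obtained by reversed-phase chromatography purification.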
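The tandem layout of step 1) can be sketched as simple string assembly. The example below rebuilds the four-copy Arg34-GLP-1(7-37) fusion from its parts, assuming the auxiliary peptide segment is Met, a His4 tag and EEAEAEARG, with KR as the linker throughout, which is how SEQ ID NO: 13 reads; the function and parameter names are illustrative.

```python
ARG34_GLP1_7_37 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVRGRG"  # SEQ ID NO: 1, one-letter code


def build_tandem_fusion(tag, enhancer, target, copies, first_linker="KR", linker="KR"):
    """Assemble Met-tag-enhancer-linker-target-(linker-target)... as a one-letter string.

    The last target copy carries no trailing linker, so its C-terminus is native.
    """
    if copies < 1:
        raise ValueError("at least one target copy is required")
    auxiliary = "M" + tag + enhancer
    return auxiliary + first_linker + target + (linker + target) * (copies - 1)


if __name__ == "__main__":
    fusion = build_tandem_fusion("HHHH", "EEAEAEARG", ARG34_GLP1_7_37, copies=4)
    print(fusion)
    print(len(fusion), "residues,", fusion.count(ARG34_GLP1_7_37), "target copies")
```

The first linker is kept as a separate parameter because some of the fusion sequences in the examples use KR after the auxiliary peptide segment and RKR between target copies.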
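In the present invention,
the expression vector of step 2) is an E. coli expression vector containing an expression promoter such as T7, Tac, Trp or lac, or a yeast expression vector containing the α secretion factor and an AOX or GAP expression promoter.
The host cell of step 3) may be Pichia pastoris or E. coli, preferably E. coli, more specifically BL21, BL21(DE3) or BL21(DE3)pLysS, and preferably BL21(DE3).
In step 5), the recombinant dibasic amino acid endopeptidase (Recombinant Kex2 Protease, Kex2 for short) is a Kex2-type proteolytic enzyme found on the yeast cell membrane. It specifically hydrolyzes the carboxy-terminal peptide bond in the α-factor precursor, that is, the peptide bond on the carboxy side of two consecutive basic amino acids (such as Lys-Arg, Lys-Lys or Arg-Arg, of which Lys-Arg is cleaved most efficiently), and its optimum pH is 9.0 to 9.5. The digestion buffer may be Tris-HCl buffer, phosphate buffer or borate buffer, preferably Tris-HCl buffer. Recombinant carboxypeptidase B (Carboxypeptidase B, CPB for short) selectively hydrolyzes arginine (Arg, R) and lysine (Lys, K) at the carboxy terminus of a protein or polypeptide, preferentially cleaving basic amino acids, and its optimum pH is 8.5 to 9.5. The digestion buffer may be Tris-HCl buffer, phosphate buffer or borate buffer, preferably Tris-HCl buffer.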
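The two-enzyme logic of step 5) can be illustrated with an idealized in silico digestion: Kex2 cuts after each KR or RR, and CPB then trims basic residues from each new C-terminus. This is a sketch under strong simplifying assumptions (complete digestion, no cleavage at KK or RK, no masking effects, no kinetics); the function names are illustrative.

```python
GLP1 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVRGRG"   # SEQ ID NO: 1, one-letter code
AUXILIARY = "MHHHHEEAEAEARG"               # Met + His4 + EEAEAEARG


def kex2_cleave(seq, sites=("KR", "RR")):
    """Split seq after every occurrence of a dibasic site (idealized Kex2)."""
    fragments, start = [], 0
    for i in range(1, len(seq)):
        if seq[i - 1:i + 1] in sites:
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


def cpb_trim(fragment):
    """Remove C-terminal K and R residues one by one (idealized carboxypeptidase B)."""
    return fragment.rstrip("KR")


def digest(fusion):
    """Apply the idealized Kex2 step followed by the idealized CPB step."""
    return [cpb_trim(fragment) for fragment in kex2_cleave(fusion)]


if __name__ == "__main__":
    fusion = AUXILIARY + "KR" + GLP1 + ("KR" + GLP1) * 3
    for product in digest(fusion):
        print(product)
```

In this model the auxiliary peptide segment, including the N-terminal Met, is released as one fragment and every target copy is recovered with its native termini, because the target ends in Gly and the CPB trimming step stops there.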
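Compared with the prior art, the present invention has the following advantages:
(1) For the newly designed head-to-tail tandem polypeptide, the recombinant engineered strain shows a plasmid loss rate of no more than 10% within 80 generations, with essentially no effect on expression of the target protein, so that high-density, high-expression fermentation on an industrial scale can be achieved;
(2) With conventional fusion expression of polypeptides, the expression level of the fusion protein is high, but the unrelated protein portion must be removed after cleavage, so only the corresponding molar fraction of target polypeptide is obtained; with the head-to-tail tandem glucagon-like peptides and analogs designed in the present invention, all of the intact target polypeptide is obtained after cleavage;
(3) The design of the present invention completely overcomes the heterogeneity caused by the N-terminal Met: with the unique leader peptide and cleavage scheme of the present invention, the N-terminal Met is completely removed, giving a target polypeptide with a fully homogeneous N-terminus;
(4) Kex2 and recombinant carboxypeptidase B cleave with high specificity and produce no related substances from non-specific cleavage, so all of the target polypeptide is obtained with the correct structure. This greatly reduces the difficulty of subsequent purification and separation, gives target polypeptide of very high purity, improves recovery of the recombinant polypeptide, and lowers the cost of producing polypeptides by recombinant expression;
(5) Reversed-phase polishing gives good separation together with high recovery.
In another aspect, the present invention provides a system for obtaining free target proteins. According to an embodiment of the present invention, and with reference to Figure 1, the system comprises: a fusion protein preparation device 100 for providing the fusion protein described above; and a digestion device 200 connected to the fusion protein preparation device 100, in which the fusion protein is contacted with a protease, the protease being determined on the basis of the linker sequence and the plurality of target protein sequences not being cleaved by the protease, so as to obtain a plurality of free target proteins whose C-terminus and N-terminus contain no additional residues. The system according to embodiments of the present invention is suitable for carrying out the method for obtaining free target proteins described above. The free target proteins obtained contain no additional residues at either terminus, so the quality of the target protein is significantly improved and subsequent purification of the product is greatly simplified; for use as a pharmaceutical polypeptide, the safety of the target protein is significantly improved and its immunotoxicity significantly reduced.
According to a specific embodiment of the present invention, and with reference to Figure 2, the digestion device is provided with a first protease digestion unit 201 and a second protease digestion unit 202, which are connected to each other. The fusion protein can thus be digested in the first protease digestion unit, and the first protease digestion product then enters the second protease digestion unit for further digestion. The proteases may be added manually to the first and second protease digestion units, or the first and second proteases may be immobilized, enabling industrial-scale, automated digestion of the fusion protein.
Specifically, when the linker sequence constitutes the C-terminus of the target protein sequence and the C-terminus of the target protein sequence is a contiguous KR, the protease Kex2 is immobilized in both the first protease digestion unit and the second protease digestion unit. In this case free target protein is obtained once the fusion protein has been digested in the first protease digestion unit. The first protease digestion product may either enter the second protease digestion unit, so that fusion protein that was undigested or incompletely digested is digested further, or be discharged directly to give the free target protein.
Specifically, when the linker sequence contains a first protease recognition site and a second protease recognition site and the plurality of target protein sequences do not contain the second protease recognition site, the first protease is immobilized in the first protease digestion unit 201 and the second protease is immobilized in the second protease digestion unit 202: the fusion protein is contacted with the first protease in the first protease digestion unit to obtain a first protease cleavage product whose N-terminus carries no residues of the linker sequence; and the first protease cleavage product is contacted with the second protease in the second protease digestion unit, the second protease being adapted to cleave the C-terminus of the first protease cleavage product, so as to obtain a plurality of the target proteins.
When the amino acid sequence of the target protein sequence has no contiguous KR or RR and may or may not have contiguous KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB. When the amino acid sequence of the target protein sequence contains R but no K, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB. When the amino acid sequence of the target protein sequence contains neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB. When the amino acid sequence of the target protein sequence has contiguous KR, RR, KK or RK that is adjacent to 1 or 2 contiguous acidic amino acids, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB. After digestion in the first protease digestion unit, the fusion protein is thus cleaved at the carboxy-terminal peptide bond of the first protease cleavage site in the linker sequence, giving a first protease digestion product with no linker residues at its N-terminus; after this product enters the second protease digestion unit, the linker residues at its carboxy terminus are removed one by one, giving free target protein with no linker residues at its C-terminus.
According to a specific embodiment of the present invention, the first protease and the second protease may also be added simultaneously to a single reaction to digest the fusion protein; the activities of the first and second proteases selected in the embodiments of the present application do not interfere with each other.
According to an embodiment of the present invention, and with reference to Figure 3, the fusion protein preparation device comprises a fermentation unit 101 adapted to ferment a microorganism that carries a nucleic acid encoding the fusion protein; preferably, the microorganism is E. coli.
According to an embodiment of the present invention, and with reference to Figure 4, the fusion protein preparation device further comprises a dissolution unit 102 connected to the fermentation unit and used to disrupt and dissolve the microbial fermentation product, the dissolution being carried out in the presence of a detergent, so as to obtain the fusion protein.
According to an embodiment of the present invention, and with reference to Figure 5, the digestion device further comprises a regulating unit 203 for adjusting the amount of protease so that the mass ratio of the fusion protein to the protease is 250:1 to 2000:1. By adjusting the amount of protease, the regulating unit achieves specific cleavage of the fusion protein at the cleavage sites of the linker sequence.
The present invention is further described below with reference to specific examples, from which its advantages and features will become clearer. These examples are illustrative only and do not limit the scope of the invention in any way. Those skilled in the art will understand that the details and form of the technical solutions of the present invention may be modified or substituted without departing from its spirit and scope, and that all such modifications and substitutions fall within the scope of protection of the present invention.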
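Example 1. Construction of the pET-30a-Arg34-GLP-1(7-37) recombinant plasmid and engineered strain
Arg34-GLP-1(7-37) (SEQ ID NO: 1) was arranged as a repeated tandem sequence (SEQ ID NO: 13) according to the scheme auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)4. Using E. coli codon preference, an Nde I restriction site (CAT ATG) was added at the 5' end of the gene, and a double stop codon (TAA TGA) and a BamH I restriction site (GGA TCC) were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 7). The nucleotide sequence was produced by commissioned total gene synthesis and constructed into the PUC-57 vector to give the recombinant plasmid PUC-57-Arg34-GLP-1(7-37), which was stored as a glycerol stock in Top10.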
NH 2-Met-His-His-His-His-Glu-Glu-Ala-Glu-Ala-Glu-Ala-Arg-Gly-Lys-Arg-His-Ala-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-His-Ala-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-His-Ala-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-His-Ala-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-COOH(SEQ ID NO:13)。
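The PUC-57-Arg34-GLP-1(7-37) plasmid was double-digested with the endonucleases Nde I and BamH I and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment recovered from plasmid pET-30a (purchased from Novagen) that had likewise been digested with Nde I and BamH I, and transformed into the E. coli cloning host Top10. The recombinant plasmid pET-30a-Arg34-GLP-1(7-37) was screened by restriction digestion and PCR verification. After DNA sequencing confirmed that the Arg34-GLP-1(7-37) cDNA sequence in the recombinant plasmid was correct, the plasmid was transformed into the E. coli expression host BL21(DE3), and a recombinant expression strain was obtained by expression screening. The construction scheme is shown in Figure 6. The plasmid restriction analysis is shown in Figure 7: after digestion, plasmids 1 to 3 all gave bands of about 5000 bp and 450 bp, corresponding to pET-30a and Arg34-GLP-1(7-37) respectively and in agreement with the theoretical values, indicating that Arg34-GLP-1(7-37) was correctly ligated into the vector pET-30a.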
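The band size expected for an insert can be estimated from the fusion layout. The sketch below counts residues for a fusion with a 14-residue auxiliary peptide segment and converts to base pairs, including the two stop codons; the small contribution of the restriction-site overhangs is ignored, and the function names are illustrative.

```python
def fusion_length_aa(aux_len, target_len, copies, first_linker_len=2, linker_len=2):
    """Residue count for auxiliary-linker-target-(linker-target) repeated to `copies` targets."""
    return aux_len + first_linker_len + target_len + (copies - 1) * (linker_len + target_len)


def orf_length_bp(aa_len, stop_codons=2):
    """Length in bp of the coding sequence plus stop codons."""
    return 3 * (aa_len + stop_codons)


if __name__ == "__main__":
    # Arg34-GLP-1(7-37): 31 residues, 4 copies, KR linkers (SEQ ID NO: 13).
    aa = fusion_length_aa(aux_len=14, target_len=31, copies=4)
    print(aa, "aa ->", orf_length_bp(aa), "bp")
    # GLP-2: 33 residues, 4 copies, KR after the auxiliary peptide and RKR between copies.
    aa = fusion_length_aa(aux_len=14, target_len=33, copies=4, linker_len=3)
    print(aa, "aa ->", orf_length_bp(aa), "bp")
```

The 444 bp estimate for the Arg34-GLP-1(7-37) construct is in line with the band of about 450 bp described for Figure 7.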
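Example 2. Construction of the pET-30a-Arg34-GLP-1(9-37) recombinant plasmid and engineered strain
Arg34-GLP-1(9-37) (SEQ ID NO: 2) was arranged as a repeated tandem sequence (SEQ ID NO: 14) according to the scheme auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)4. Using E. coli codon preference, an Nde I restriction site (CAT ATG) was added at the 5' end of the gene, and a double stop codon (TAA TGA) and a BamH I restriction site (GGA TCC) were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 8). The nucleotide sequence was produced by commissioned total gene synthesis and constructed into the PUC-57 vector to give the recombinant plasmid PUC-57-Arg34-GLP-1(9-37), which was stored as a glycerol stock in Top10.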
NH 2-Met-His-His-His-His-Glu-Glu-Ala-Glu-Ala-Glu-Ala-Arg-Gly-Lys-Arg-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-COOH(SEQ ID NO:14)。
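The PUC-57-Arg34-GLP-1(9-37) plasmid was double-digested with the endonucleases Nde I and BamH I and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment recovered from plasmid pET-30a (purchased from Novagen) that had likewise been digested with Nde I and BamH I, and transformed into the E. coli cloning host Top10. The recombinant plasmid pET-30a-Arg34-GLP-1(9-37) was screened by restriction digestion and PCR verification. After DNA sequencing confirmed that the Arg34-GLP-1(9-37) cDNA sequence in the recombinant plasmid was correct, the plasmid was transformed into the E. coli expression host BL21(DE3), and a recombinant expression strain was obtained by expression screening. The construction scheme is shown in Figure 8. The plasmid restriction analysis is shown in Figure 9: after digestion, plasmids 1 to 3 all gave bands of about 5000 bp and 400 bp, corresponding to pET-30a and Arg34-GLP-1(9-37) respectively and in agreement with the theoretical values, indicating that Arg34-GLP-1(9-37) was correctly ligated into the vector pET-30a.
Example 3. Construction of the pET-30a-Arg34-GLP-1(11-37) recombinant plasmid and engineered strain
Arg34-GLP-1(11-37) (SEQ ID NO: 3) was arranged as a repeated tandem sequence (SEQ ID NO: 15) according to the scheme auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)4. Using E. coli codon preference, an Nde I restriction site (CAT ATG) was added at the 5' end of the gene, and a double stop codon (TAA TGA) and a BamH I restriction site (GGA TCC) were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 9). The nucleotide sequence was produced by commissioned total gene synthesis and constructed into the PUC-57 vector to give the recombinant plasmid PUC-57-Arg34-GLP-1(11-37), which was stored as a glycerol stock in Top10.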
NH 2-Met-His-His-His-His-Glu-Glu-Ala-Glu-Ala-Glu-Ala-Arg-Gly-Lys-Arg-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-Lys-Arg-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Gly-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Arg-Gly-Arg-Gly-COOH(SEQ ID NO:15)。
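The PUC-57-Arg34-GLP-1(11-37) plasmid was double-digested with the endonucleases Nde I and BamH I and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment recovered from plasmid pET-30a (purchased from Novagen) that had likewise been digested with Nde I and BamH I, and transformed into the E. coli cloning host Top10. The recombinant plasmid pET-30a-Arg34-GLP-1(11-37) was screened by restriction digestion and PCR verification. After DNA sequencing confirmed that the Arg34-GLP-1(11-37) cDNA sequence in the recombinant plasmid was correct, the plasmid was transformed into the E. coli expression host BL21(DE3), and a recombinant expression strain was obtained by expression screening. The construction scheme is shown in Figure 10. The plasmid restriction analysis is shown in Figure 11: after digestion, plasmids 1 to 3 all gave bands of about 5000 bp and 400 bp, corresponding to pET-30a and Arg34-GLP-1(11-37) respectively and in agreement with the theoretical values, indicating that Arg34-GLP-1(11-37) was correctly ligated into the vector pET-30a.
Example 4. Construction of the pET-30a-GLP-2 recombinant plasmid and engineered strain
GLP-2 (SEQ ID NO: 4) was arranged as a repeated tandem sequence (SEQ ID NO: 16) according to the scheme auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)4. Using E. coli codon preference, an Nde I restriction site (CAT ATG) was added at the 5' end of the gene, and a double stop codon (TAA TGA) and a BamH I restriction site (GGA TCC) were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 10). The nucleotide sequence was produced by commissioned total gene synthesis and constructed into the PUC-57 vector to give the recombinant plasmid PUC-57-GLP-2, which was stored as a glycerol stock in Top10.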
NH 2-Met-His-His-His-His-Glu-Glu-Ala-Glu-Ala-Glu-Ala-Arg-Gly-Lys-Arg-His-Gly-Asp-Gly-Ser-Phe-Ser-Asp-Glu-Met-Asn-Thr-Ile-Leu-Asp-Asn-Leu-Ala-Ala-Arg-Asp-Phe-Ile-Asn-Trp-Leu-Ile-Gln-Thr-Lys-Ile-Thr-Asp-Arg-Lys-Arg-His-Gly-Asp-Gly-Ser-Phe-Ser-Asp-Glu-Met-Asn-Thr-Ile-Leu-Asp-Asn-Leu-Ala-Ala-Arg-Asp-Phe-Ile-Asn-Trp-Leu-Ile-Gln-Thr-Lys-Ile-Thr-Asp-Arg-Lys-Arg-His-Gly-Asp-Gly-Ser-Phe-Ser-Asp-Glu-Met-Asn-Thr-Ile-Leu-Asp-Asn-Leu-Ala-Ala-Arg-Asp-Phe-Ile-Asn-Trp-Leu-Ile-Gln-Thr-Lys-Ile-Thr-Asp-Arg-Lys-Arg-His-Gly-Asp-Gly-Ser-Phe-Ser-Asp-Glu-Met-Asn-Thr-Ile-Leu-Asp-Asn-Leu-Ala-Ala-Arg-Asp-Phe-Ile-Asn-Trp-Leu-Ile-Gln-Thr-Lys-Ile-Thr-Asp-COOH(SEQ ID NO:16)
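The PUC-57-GLP-2 plasmid was double-digested with the endonucleases Nde I and BamH I and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment recovered from plasmid pET-30a (purchased from Novagen) that had likewise been digested with Nde I and BamH I, and transformed into the E. coli cloning host Top10. The recombinant plasmid pET-30a-GLP-2 was screened by restriction digestion and PCR verification. After DNA sequencing confirmed that the GLP-2 cDNA sequence in the recombinant plasmid was correct, the plasmid was transformed into the E. coli expression host BL21(DE3), and a recombinant expression strain was obtained by expression screening. The construction scheme is shown in Figure 12. The plasmid restriction analysis is shown in Figure 13: after digestion, plasmids 1 to 3 all gave bands of about 5000 bp and 480 bp, corresponding to pET-30a and GLP-2 respectively and in agreement with the theoretical values, indicating that GLP-2 was correctly ligated into the vector pET-30a.
Example 5. Construction of the pET-30a-Glucagon recombinant plasmid and engineered strain
Glucagon (SEQ ID NO: 5) was arranged as a repeated tandem sequence (SEQ ID NO: 17) according to the scheme auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)8. Using E. coli codon preference, an Nde I restriction site (CAT ATG) was added at the 5' end of the gene, and a double stop codon (TAA TGA) and a BamH I restriction site (GGA TCC) were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 11). The nucleotide sequence was produced by commissioned total gene synthesis and constructed into the PUC-57 vector to give the recombinant plasmid PUC-57-Glucagon, which was stored as a glycerol stock in Top10.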
NH 2-Met-His-His-His-His-Glu-Glu-Ala-Glu-Ala-Glu-Ala-Arg-Gly-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-Arg-Lys-Arg-His-Ser-Gln-Gly-Thr-Phe-Thr-Ser-Asp-Tyr-Ser-Lys-Tyr-Leu-Asp-Ser-Arg-Arg-Ala-Gln-Asp-Phe-Val-Gln-Trp-Leu-Met-Asn-Thr-COOH(SEQ ID NO:17)。
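The PUC-57-Glucagon plasmid was double-digested with the endonucleases Nde I and BamH I and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment recovered from plasmid pET-30a (purchased from Novagen) that had likewise been digested with Nde I and BamH I, and transformed into the E. coli cloning host Top10. The recombinant plasmid pET-30a-Glucagon was screened by restriction digestion and PCR verification. After DNA sequencing confirmed that the Glucagon cDNA sequence in the recombinant plasmid was correct, the plasmid was transformed into the E. coli expression host BL21(DE3), and a recombinant expression strain was obtained by expression screening. The construction scheme is shown in Figure 14. The plasmid restriction analysis is shown in Figure 15: after digestion, plasmids 1 to 3 all gave bands of about 5000 bp and 800 bp, corresponding to pET-30a and Glucagon respectively and in agreement with the theoretical values, indicating that Glucagon was correctly ligated into the vector pET-30a.
Example 6. Construction of the pET-30a-TB4 recombinant plasmid and engineered strain
TB4 (SEQ ID NO: 6) was arranged as a repeated tandem sequence (SEQ ID NO: 18) according to the scheme auxiliary peptide segment-(cleavage site-polypeptide-cleavage site-polypeptide)4. Using E. coli codon preference, an Nde I restriction site (CAT ATG) was added at the 5' end of the gene, and a double stop codon (TAA TGA) and a BamH I restriction site (GGA TCC) were added at the 3' end, to design its cDNA sequence (SEQ ID NO: 12). The nucleotide sequence was produced by commissioned total gene synthesis and constructed into the PUC-57 vector to give the recombinant plasmid PUC-57-TB4, which was stored as a glycerol stock in Top10.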
NH 2-Met-His-His-His-His-Glu-Glu-Ala-Glu-Ala-Glu-Ala-Arg-Gly-Lys-Arg-Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser-Arg-Lys-Arg-Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser-Arg-Lys-Arg-Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser-Arg-Lys-Arg-Ser-Asp-Lys-Pro-Asp-Met-Ala-Glu-Ile-Glu-Lys-Phe-Asp-Lys-Ser-Lys-Leu-Lys-Lys-Thr-Glu-Thr-Gln-Glu-Lys-Asn-Pro-Leu-Pro-Ser-Lys-Glu-Thr-Ile-Glu-Gln-Glu-Lys-Gln-Ala-Gly-Glu-Ser-COOH(SEQ ID NO:18)。
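The PUC-57-TB4 plasmid was double-digested with the endonucleases Nde I and BamH I and the target fragment was recovered. It was then ligated with T4 DNA ligase to the fragment recovered from plasmid pET-30a (purchased from Novagen) that had likewise been digested with Nde I and BamH I, and transformed into the E. coli cloning host Top10. The recombinant plasmid pET-30a-TB4 was screened by restriction digestion and PCR verification. After DNA sequencing confirmed that the TB4 cDNA sequence in the recombinant plasmid was correct, the plasmid was transformed into the E. coli expression host BL21(DE3), and a recombinant expression strain was obtained by expression screening. The construction scheme is shown in Figure 16. The plasmid restriction analysis is shown in Figure 17: after digestion, the recombinant plasmid gave bands of about 5000 bp and 600 bp, corresponding to pET-30a and TB4 respectively and in agreement with the theoretical values, indicating that TB4 was correctly ligated into the vector pET-30a.
Example 7. Fermentation culture of the recombinant engineered strains pET-30a-Arg34-GLP-1(7-37)/BL21(DE3), pET-30a-Arg34-GLP-1(9-37)/BL21(DE3), pET-30a-Arg34-GLP-1(11-37)/BL21(DE3), pET-30a-GLP-2/BL21(DE3), pET-30a-Glucagon/BL21(DE3) and pET-30a-TB4/BL21(DE3)
The recombinant engineered strains pET-30a-Arg34-GLP-1(7-37)/BL21(DE3), pET-30a-Arg34-GLP-1(9-37)/BL21(DE3), pET-30a-Arg34-GLP-1(11-37)/BL21(DE3), pET-30a-GLP-2/BL21(DE3), pET-30a-Glucagon/BL21(DE3) and pET-30a-TB4/BL21(DE3) were each streaked onto LA agar plates and cultured overnight at 37°C. Lawn from the overnight LA plates was inoculated into LB liquid medium and cultured at 37°C for 12 hours, then transferred at 1% into 1000 ml conical flasks containing 200 ml of LB medium and cultured overnight at 37°C to give the fermenter seed culture. The seed culture was inoculated at 5% into a 30 L fermenter containing YT medium and cultured at 37°C; dissolved oxygen was kept above 25% by adjusting the stirring speed, the air flow and the pure oxygen flow, and the pH was adjusted with aqueous ammonia and held at 6.5. When the OD600 of the culture reached 50 to 80, isopropyl-β-D-thiogalactoside was added to a final concentration of 0.2 mM, and after a further 3 hours of culture the fermentation was stopped. The culture was collected and centrifuged at 8000 rpm for 10 minutes, the supernatant was discarded, and the cells were collected and stored at -20°C until use.
The SDS-PAGE analysis of induced expression in the pET-30a-Arg34-GLP-1(9-37)/BL21(DE3) recombinant engineered strain is shown in Figure 18.
Example 8. Pretreatment, enzymatic cleavage and purification of Arg34-GLP-1(9-37)
The fermentation cell mass of the pET-30a-Arg34-GLP-1(9-37)/BL21(DE3) recombinant engineered strain was resuspended in lysis buffer and homogenized three times at high pressure (600 to 700 bar), then stirred at room temperature and centrifuged, and the pellet was collected. The pellet was resuspended in wash solution at a fixed mass-to-volume ratio and homogenized until no visible particles remained, stirred at room temperature for 30 minutes, and centrifuged to collect the pellet. The pellet was dissolved at 3 to 5% (mass/volume, g/mL) in digestion buffer containing surfactant, adjusted to pH 10.5 and stirred at 28°C to 32°C for 30 minutes; the supernatant was collected by centrifugation and its protein content was determined by UV absorbance at 280 nm. The dissolved sample was adjusted to pH 8.0 to 9.0, recombinant protease Kex2 and recombinant protease CPB were added at a mass ratio of 1:1000, and the digestion was stirred overnight at 25°C to 35°C; a sample was taken for RP-HPLC analysis. A Q anion-exchange column was cleaned and regenerated by the usual procedure and equilibrated with 2 CV of equilibration buffer. The digested sample was adjusted to pH 9.5 to 9.8, filtered and loaded onto the column (conductivity below 5 mS/cm). The column was re-equilibrated with 1 CV, eluted with elution buffer 1 until the UV absorbance returned to baseline, re-equilibrated with 2 CV of equilibration buffer, and eluted in one step with elution buffer 2, collecting the target peak. The target protein was then loaded onto a C4 reversed-phase column, equilibrated and collected by gradient elution; its purity was not less than 99%.
The mass spectrum showing the molecular weight of Arg34-GLP-1(9-37) after enzymatic cleavage is shown in Figure 19.
Example 9. In vitro activity assay of Arg34-GLP-1(9-37)
In vitro biological activity was measured with the company's own recombinant CHO-K1-CRE-GLP1R cells, which are transfected with the GLP-1R receptor. CHO-K1-CRE-GLP1R cells were plated overnight and then stimulated with Arg34-GLP-1(9-37) polypeptide at 37°C and 5% CO2 for 4 hours ± 15 minutes. Promega kit chemiluminescent substrate (catalog number E2510) was added at 100 μL per well, the plate was shaken gently on a shaker at room temperature for 40 minutes ± 10 minutes, and RLU was read on a microplate reader with a suitable measurement time per well (1 second per well). A four-parameter regression curve was fitted with the Sigmaplot software and the half-maximal effective dose (EC50) of Arg34-GLP-1(9-37) was calculated. The results are shown in Figure 20.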
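The source does not give the parameterization that Sigmaplot used for the four-parameter fit. A common form of the four-parameter logistic model, shown here only as an assumption, expresses the response as a function of dose with a lower plateau, an upper plateau, the EC50 and a Hill slope; all names and numeric values below are illustrative.

```python
def four_parameter_logistic(dose, bottom, top, ec50, hill):
    """Response predicted by a standard 4PL curve; dose and ec50 share the same units."""
    if dose <= 0 or ec50 <= 0:
        raise ValueError("dose and ec50 must be positive")
    return bottom + (top - bottom) / (1.0 + (ec50 / dose) ** hill)


if __name__ == "__main__":
    # Hypothetical parameters, not taken from Figure 20.
    params = dict(bottom=100.0, top=900.0, ec50=2.0, hill=1.5)
    for dose in (0.02, 0.2, 2.0, 20.0, 200.0):
        rlu = four_parameter_logistic(dose, **params)
        print(f"dose {dose:>6}: predicted RLU {rlu:.1f}")
```

By construction, the predicted response at a dose equal to the EC50 lies exactly midway between the two plateaus, which is what the fitted EC50 reports.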
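Example 10. In vitro activity assay of GLP-2
In vitro biological activity was measured with the company's own recombinant CHO-K1-CRE-GLP2R cells, which are transfected with the GLP-2R receptor. Specifically, CHO-K1-CRE-GLP2R cells were plated overnight and then stimulated with GLP-2 protein at 37°C and 5% CO2 for 4 hours ± 15 minutes. Promega kit chemiluminescent substrate (catalog number E2510) was added at 100 μL per well, the plate was shaken gently on a shaker at room temperature for 40 minutes ± 10 minutes, and RLU was read on a microplate reader with a suitable measurement time per well (1 second per well). A four-parameter regression curve was fitted with the Sigmaplot software and the half-maximal effective dose (EC50) of GLP-2 was calculated. The results are shown in Figure 21.
To demonstrate the inventive step of the method of the present application, the inventors set out in the following examples the procedures and conclusions of some optimization experiments carried out during early development of the method. These experiments show that the fusion protein and the method for obtaining free target proteins claimed in the present application required inventive experimental work, and that under certain optimized experimental conditions the claimed method gives a better technical effect.
Method Development Example 1
During early development of the method, the inventors tried incorporating different expression-promoting sequences into the auxiliary peptide segment in order to further increase the expression level of the fusion protein (polypeptide precursor). The screening experiment is described in detail below:
Fusion proteins with and without the expression-promoting short peptide EEAEAEARG were designed, and comparative experiments were carried out on induced expression level in fermentation, dissolution of the fusion protein, and enzymatic cleavage of the fusion protein. The results are as follows.
(1) The expression levels are shown in Figure 22.
Conclusion: after 4 h of induction, the construct containing EEAEAEARG was expressed at a higher level than the construct without EEAEAEARG.
(2) The solubility is shown in Figure 23.
Conclusion: the content of target protein in the cell lysate supernatant was markedly higher for the construct containing the promoting peptide EEAEAEARG than for the construct without EEAEAEARG.
(3) The cleavage comparison is shown in Figure 24.
Conclusion: the cleavage efficiency was 96.6% with EEAEAEARG and 62.3% without it, so cleavage was more efficient with EEAEAEARG than without.
In summary, the acidic amino acids (E) in the expression-promoting peptide EEAEAEARG balance the isoelectric point of the fusion protein. The introduced KR cleavage sites consist of the basic amino acids K and R, which greatly raise the isoelectric point of the fusion protein, and an excessively high isoelectric point is unfavorable both for expression of the fusion protein and for its solubility during subsequent purification. The promoting peptide therefore increases the expression level of the fusion protein, and at the same time improves its cleavage efficiency and the yield of the target polypeptide.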
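The charge-balancing argument can be illustrated with a very crude residue count: basic residues (K, R) minus acidic residues (D, E). This is only a rough proxy, not an isoelectric point calculation; it ignores His, the termini and all pKa values, and the function name is an assumption.

```python
def basic_minus_acidic(seq):
    """Count of K and R minus count of D and E in a one-letter sequence."""
    basic = sum(seq.count(residue) for residue in "KR")
    acidic = sum(seq.count(residue) for residue in "DE")
    return basic - acidic


if __name__ == "__main__":
    print("KR linker:", basic_minus_acidic("KR"))
    print("RKR linker:", basic_minus_acidic("RKR"))
    print("EEAEAEARG:", basic_minus_acidic("EEAEAEARG"))
    print("EEAEAEARG + KR:", basic_minus_acidic("EEAEAEARG" + "KR"))
```

Under this count, each KR linker adds two basic residues and one copy of EEAEAEARG offsets three of them.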
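In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.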

Claims (50)

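  1. A fusion protein, characterized by comprising: a plurality of target protein sequences in tandem, two adjacent target protein sequences being connected by a linker sequence, wherein the linker sequence is adapted to be cleaved by a protease to form a plurality of free target proteins, the plurality of target protein sequences are not cleaved by the protease, and neither the C-terminus nor the N-terminus of the free target proteins contains additional residues.
  2. The fusion protein according to claim 1, characterized in that at least part of the linker sequence constitutes part of the C-terminus of the target protein sequence.
  3. The fusion protein according to claim 2, characterized in that the linker sequence consists of at least one protease recognition site.
  4. The fusion protein according to claim 2, characterized in that the linker sequence constitutes the C-terminus of the target protein sequence.
  5. The fusion protein according to claim 4, characterized in that the C-terminus of the target protein sequence is a contiguous KR.
  6. The fusion protein according to claim 5, characterized in that the protease is Kex2.
  7. The fusion protein according to claim 1, characterized in that the linker sequence contains a first protease recognition site and a second protease recognition site, and the plurality of target protein sequences do not contain the second protease recognition site,
    the first protease recognition site is adapted to be recognized and cleaved by a first protease so as to form a first protease cleavage product,
    the second protease recognition site is adapted to be recognized and cleaved by a second protease,
    the N-terminus of the first protease cleavage product carries no residues of the linker sequence,
    the second protease is adapted to cleave the C-terminus of the first protease cleavage product so as to form a plurality of free target proteins, neither the C-terminus nor the N-terminus of which contains residues of the linker sequence.
  8. The fusion protein according to claim 7, characterized in that at least one internal first protease recognition site is present in the plurality of target protein sequences, wherein the efficiency with which the first protease recognizes the internal first protease recognition site is lower than the efficiency with which the first protease recognizes the first protease recognition site in the linker sequence.
  9. The fusion protein according to claim 8, characterized in that the first protease is Kex2, the internal first protease recognition site is at least one of KK and RK, and the first protease recognition site in the linker sequence is KR, RR or RKR.
  10. The fusion protein according to claim 8, characterized in that a contiguous acidic amino acid sequence adjacent to the internal first protease recognition site is present upstream or downstream of the internal first protease recognition site.
  11. The fusion protein according to claim 10, characterized in that the contiguous acidic amino acid sequence is 1 to 2 amino acids long.
  12. The fusion protein according to claim 10, characterized in that the acidic amino acid is aspartic acid or glutamic acid, and preferably the acidic amino acid is aspartic acid.
  13. The fusion protein according to claim 7, characterized in that the first protease recognition site and the second protease recognition site overlap.
  14. The fusion protein according to claim 7, characterized in that the recognition sites of the first protease and the second protease are the same or different.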
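  15. The fusion protein according to claim 7, characterized in that the first protease recognition site and the second protease recognition site satisfy the following conditions:
    the amino acid sequence of the target protein sequence has no contiguous KR or RR and may or may not have contiguous KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence contains R but no K, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence contains neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence has contiguous KR, RR, KK or RK that is adjacent to 1 or 2 contiguous acidic amino acids, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB.
  16. The fusion protein according to claim 1, characterized in that the fusion protein comprises a plurality of linker sequences, the plurality of linker sequences being the same or different.
  17. The fusion protein according to claim 1, characterized in that the linker sequence is 1 to 10 amino acids long.
  18. The fusion protein according to claim 1, characterized by further comprising an auxiliary peptide segment, the carboxy terminus of which is connected through the linker sequence to the N-terminus of the plurality of target protein sequences in tandem.
  19. The fusion protein according to claim 18, characterized in that the auxiliary peptide segment comprises a tag sequence and, optionally, an expression-promoting sequence.
  20. The fusion protein according to claim 19, characterized in that the amino acid sequence of the tag sequence is a repeated His sequence;
    optionally, the amino acid sequence of the expression-promoting sequence is EEAEAEA, EEAEAEAGG or EEAEAEARG;
    optionally, the first amino acid of the auxiliary peptide segment is methionine.
  21. The fusion protein according to claim 1, characterized in that the target protein sequence is 10 to 100 amino acids long;
    preferably, the fusion protein comprises 4 to 16 target protein sequences in tandem.
  22. The fusion protein according to claim 21, characterized in that the amino acid sequences of the target protein sequences are as shown in SEQ ID NO: 1 to 6.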
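  23. A method for obtaining free target proteins, characterized by comprising:
    providing the fusion protein according to any one of claims 1 to 22;
    contacting the fusion protein with a protease, the protease being determined on the basis of the linker sequence and the plurality of target protein sequences not being cleaved by the protease, so as to obtain a plurality of free target proteins, neither the C-terminus nor the N-terminus of which contains additional residues.
  24. The method according to claim 23, characterized in that the linker sequence constitutes the C-terminus of the target protein sequence, the C-terminus of the target protein sequence is a contiguous KR, and the protease is Kex2.
  25. The method according to claim 23, characterized in that the linker sequence contains a first protease recognition site and a second protease recognition site, the plurality of target protein sequences do not contain the second protease recognition site, and contacting the fusion protein with the protease further comprises:
    contacting the fusion protein with the first protease so as to obtain a first protease cleavage product, the N-terminus of which carries no residues of the linker sequence;
    contacting the first protease cleavage product with the second protease, the second protease being adapted to cleave the C-terminus of the first protease cleavage product, so as to obtain a plurality of the free target proteins.
  26. The method according to claim 25, characterized in that at least one internal first protease recognition site is present in the plurality of target protein sequences, wherein the efficiency with which the first protease recognizes the internal first protease recognition site is lower than the efficiency with which the first protease recognizes the first protease recognition site in the linker sequence.
  27. The method according to claim 26, characterized in that the internal first protease recognition site is at least one of KK and RK, the first protease recognition site in the linker sequence is KR, RR or RKR, the second protease recognition site is a carboxy-terminal R or K, the first protease is Kex2, the second protease is CPB, and the mass ratio of the fusion protein to the first protease is 2000:1.
  28. The method according to claim 26, characterized in that a contiguous acidic amino acid sequence adjacent to the internal first protease recognition site is present upstream or downstream of the internal first protease recognition site.
  29. The method according to claim 28, characterized in that the contiguous acidic amino acid sequence is 1 to 2 amino acids long.
  30. The method according to claim 29, characterized in that the acidic amino acid is aspartic acid or glutamic acid, and preferably the acidic amino acid is aspartic acid.
  31. The method according to claim 30, characterized in that contiguous DKR, DRR, DKK or DRK is present in the plurality of target protein sequences, the first protease recognition site is KR, RR or RKR, the second protease recognition site is a carboxy-terminal R or K, the first protease is Kex2, and the second protease is CPB.
  32. The method according to claim 25, characterized in that the plurality of target protein sequences contain neither the first protease recognition site nor the second protease recognition site.
  33. The method according to claim 32, characterized in that the first protease and the second protease are respectively:
    the amino acid sequence of the target protein sequence has no contiguous KR, RR, KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence contains R but no K, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence contains neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB.
  34. The method according to any one of claims 28 to 33, characterized in that the mass ratio of the fusion protein to the first protease is 250:1 to 2000:1.
  35. The method according to claim 23, characterized in that the fusion protein is obtained by fermenting a microorganism that carries a nucleic acid encoding the fusion protein.
  36. The method according to claim 35, characterized in that the microorganism is E. coli.
  37. The method according to claim 24, characterized by further comprising disrupting and dissolving the microbial fermentation product, the dissolution being carried out in the presence of a detergent, so as to obtain the fusion protein.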
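  38. A nucleic acid, characterized by encoding the fusion protein according to any one of claims 1 to 22.
  39. The nucleic acid according to claim 38, characterized in that the nucleic acid has the nucleotide sequence shown in any one of SEQ ID NO: 7 to 12.
  40. A construct, characterized by carrying the nucleic acid according to any one of claims 38 to 39.
  41. A recombinant cell, characterized by containing the nucleic acid according to any one of claims 38 to 39 or the construct according to claim 40, or expressing the fusion protein according to any one of claims 1 to 22.
  42. The recombinant cell according to claim 41, characterized in that the recombinant cell is an E. coli cell.
  43. A system for obtaining free target proteins, characterized by comprising:
    a fusion protein preparation device for providing the fusion protein according to any one of claims 1 to 22;
    a digestion device connected to the fusion protein preparation device, in which the fusion protein is contacted with a protease, the protease being determined on the basis of the linker sequence and the plurality of target protein sequences not being cleaved by the protease, so as to obtain a plurality of free target proteins, neither the C-terminus nor the N-terminus of which contains additional residues.
  44. The system according to claim 43, characterized in that the digestion device is provided with a first protease digestion unit and a second protease digestion unit, the first protease digestion unit being connected to the second protease digestion unit.
  45. The system according to claim 44, characterized in that the linker sequence constitutes the C-terminus of the target protein sequence, the C-terminus of the target protein sequence is a contiguous KR, and the protease Kex2 is immobilized in the first protease digestion unit and the second protease digestion unit.
  46. The system according to claim 44, characterized in that the linker sequence contains a first protease recognition site and a second protease recognition site, the plurality of target protein sequences do not contain the second protease recognition site, the first protease is immobilized in the first protease digestion unit, and the second protease is immobilized in the second protease digestion unit:
    the fusion protein is contacted with the first protease in the first protease digestion unit so as to obtain a first protease cleavage product, the N-terminus of which carries no residues of the linker sequence;
    the first protease cleavage product is contacted with the second protease in the second protease digestion unit, the second protease being adapted to cleave the C-terminus of the first protease cleavage product, so as to obtain a plurality of the target proteins.
  47. The system according to claim 46, characterized in that the amino acid sequence of the target protein sequence has no contiguous KR or RR and may or may not have contiguous KK or RK, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence contains R but no K, the first protease recognition site is K, the first protease is Lys-C, the second protease recognition site is a carboxy-terminal K, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence contains neither K nor R, the first protease recognition site is K or R, the first protease is Lys-C or Trp, the second protease recognition site is a carboxy-terminal K or R, and the second protease is CPB; or
    the amino acid sequence of the target protein sequence has contiguous KR, RR, KK or RK that is adjacent to 1 or 2 contiguous acidic amino acids, the first protease recognition site is KR, RR or RKR, the first protease is Kex2, the second protease recognition site is a carboxy-terminal R or K, and the second protease is CPB.
  48. The system according to claim 43, characterized in that the fusion protein preparation device comprises a fermentation unit adapted to ferment a microorganism that carries a nucleic acid encoding the fusion protein,
    preferably, the microorganism is E. coli.
  49. The system according to claim 48, characterized in that the fusion protein preparation device further comprises a dissolution unit connected to the fermentation unit and used to disrupt and dissolve the microbial fermentation product, the dissolution being carried out in the presence of a detergent, so as to obtain the fusion protein.
  50. The system according to claim 43, characterized in that the digestion device further comprises a regulating unit for adjusting the amount of protease so that the mass ratio of the fusion protein to the protease is 250:1 to 2000:1.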
PCT/CN2020/097058 2019-06-26 2020-06-19 重组串联融合蛋白制备目标多肽的方法 WO2020259403A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20831844.4A EP3992212A4 (en) 2019-06-26 2020-06-19 METHOD OF PRODUCTION OF A TARGET POLYPEPTIDE BY RECOMBINATION AND SERIES CONFIGURATION OF FUSIONED PROTEINS
US17/558,767 US20220195004A1 (en) 2019-06-26 2021-12-22 Method for preparing target polypeptide by means of recombination and series connection of fused proteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910563692.3A CN110305223B (zh) 2019-06-26 2019-06-26 重组串联融合蛋白制备目标多肽的方法
CN201910563692.3 2019-06-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/558,767 Continuation US20220195004A1 (en) 2019-06-26 2021-12-22 Method for preparing target polypeptide by means of recombination and series connection of fused proteins

Publications (1)

Publication Number Publication Date
WO2020259403A1 true WO2020259403A1 (zh) 2020-12-30

Family

ID=68077497

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/097058 WO2020259403A1 (zh) 2019-06-26 2020-06-19 重组串联融合蛋白制备目标多肽的方法

Country Status (4)

Country Link
US (1) US20220195004A1 (zh)
EP (1) EP3992212A4 (zh)
CN (1) CN110305223B (zh)
WO (1) WO2020259403A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022064517A1 (en) * 2020-09-23 2022-03-31 Dr. Reddy's Laboratories Limited A process for the preparation of semaglutide and semapeptide

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110305223B (zh) * 2019-06-26 2022-05-13 重庆派金生物科技有限公司 重组串联融合蛋白制备目标多肽的方法
CN111072783B (zh) * 2019-12-27 2021-09-28 万新医药科技(苏州)有限公司 一种采用大肠杆菌表达串联序列制备glp-1或其类似物多肽的方法
CN114057886B (zh) * 2020-07-24 2024-03-01 宁波鲲鹏生物科技有限公司 一种索玛鲁肽衍生物及其制备方法
CN113502296B (zh) * 2021-09-10 2021-11-30 北京惠之衡生物科技有限公司 一种表达司美鲁肽前体的重组工程菌及其构建方法
CN113861266A (zh) * 2021-09-30 2021-12-31 天津科技大学 一种芽孢杆菌碱性蛋白酶抑制肽的生物合成方法
CN115028740A (zh) * 2022-06-16 2022-09-09 重庆派金生物科技有限公司 一种融合表达制备人甲状旁腺激素1-34的方法
CN115975047A (zh) * 2022-10-24 2023-04-18 扬州奥锐特药业有限公司 一种重组融合蛋白生产多肽的方法及其应用
CN116948013B (zh) * 2023-04-28 2024-04-09 江苏创健医疗科技股份有限公司 重组小分子胶原蛋白及其表达系统与制备方法
CN117801124A (zh) * 2024-02-29 2024-04-02 天津凯莱英生物科技有限公司 利西那肽前体的融合蛋白及其应用

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1167155A (zh) * 1996-03-04 1997-12-10 三得利公司 使用加工酶的嵌合蛋白质的切断方法
CN101171262A (zh) 2005-05-04 2008-04-30 西兰制药公司 胰高血糖素样肽-2(glp-2)类似物
CN101172996A (zh) * 2006-09-29 2008-05-07 上海新生源医药研究有限公司 用于多肽融合表达的连接肽及多肽融合表达方法
US20100317057A1 (en) 2007-12-28 2010-12-16 Novo Nordisk A/S Semi-recombinant preparation of glp-1 analogues
CN103159848A (zh) 2013-01-06 2013-06-19 中国人民解放军第四军医大学 人胰高血糖素样肽-2二串体蛋白及其制备方法
CN103945861A (zh) 2011-09-12 2014-07-23 阿穆尼克斯运营公司 胰高血糖素样肽-2组合物及其制备和使用方法
CN104072604A (zh) 2013-03-27 2014-10-01 深圳翰宇药业股份有限公司 一种替度鲁肽的聚乙二醇偶合物及其固相制备方法
CN110305223A (zh) * 2019-06-26 2019-10-08 重庆派金生物科技有限公司 重组串联融合蛋白制备目标多肽的方法

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016089782A1 (en) * 2014-12-01 2016-06-09 Pfenex Inc. Fusion partners for peptide production

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1167155A (zh) * 1996-03-04 1997-12-10 三得利公司 使用加工酶的嵌合蛋白质的切断方法
CN101171262A (zh) 2005-05-04 2008-04-30 西兰制药公司 胰高血糖素样肽-2(glp-2)类似物
CN102659938A (zh) 2005-05-04 2012-09-12 西兰制药公司 胰高血糖素样肽-2(glp-2)类似物
CN101172996A (zh) * 2006-09-29 2008-05-07 上海新生源医药研究有限公司 用于多肽融合表达的连接肽及多肽融合表达方法
US20100317057A1 (en) 2007-12-28 2010-12-16 Novo Nordisk A/S Semi-recombinant preparation of glp-1 analogues
CN103945861A (zh) 2011-09-12 2014-07-23 阿穆尼克斯运营公司 胰高血糖素样肽-2组合物及其制备和使用方法
CN103159848A (zh) 2013-01-06 2013-06-19 中国人民解放军第四军医大学 人胰高血糖素样肽-2二串体蛋白及其制备方法
CN104072604A (zh) 2013-03-27 2014-10-01 深圳翰宇药业股份有限公司 一种替度鲁肽的聚乙二醇偶合物及其固相制备方法
CN110305223A (zh) * 2019-06-26 2019-10-08 重庆派金生物科技有限公司 重组串联融合蛋白制备目标多肽的方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3992212A4

Also Published As

Publication number Publication date
EP3992212A4 (en) 2023-07-12
CN110305223A (zh) 2019-10-08
EP3992212A1 (en) 2022-05-04
CN110305223B (zh) 2022-05-13
US20220195004A1 (en) 2022-06-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20831844

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020831844

Country of ref document: EP

Effective date: 20220126