CN113773392B

CN113773392B - Preparation method of insulin glargine

Info

Publication number: CN113773392B
Application number: CN202010520197.7A
Authority: CN
Inventors: 于歌; 唐亚连; 陈卫
Original assignee: Ningbo Kunpeng Biotech Co Ltd
Current assignee: Ningbo Kunpeng Biotech Co Ltd
Priority date: 2020-06-09
Filing date: 2020-06-09
Publication date: 2023-04-07
Anticipated expiration: 2040-06-09
Also published as: CN113773392A

Abstract

The invention provides an insulin glargine derivative and a preparation method thereof. Specifically, the method expresses an insulin glargine fusion protein containing a green fluorescent protein folding unit in Escherichia coli at high density, then digests the fusion protein enzymatically and purifies the product to obtain insulin glargine. The method does not need to be carried out in an organic system, has fewer process steps, causes little environmental pollution, costs less, and is suitable for wide adoption.

Description

Preparation method of insulin glargine

Technical Field

The invention relates to the technical field of biology, and particularly relates to a preparation method of insulin glargine.

Background

Diabetes is a major disease threatening human health worldwide. In China, changing lifestyles and an accelerating aging population are driving a rapid rise in the prevalence of diabetes mellitus. The acute and chronic complications of diabetes, especially the chronic ones, affect multiple organs and cause high rates of disability and death; they seriously harm patients' physical and mental health and place a heavy burden on individuals, families and society.

Insulin glargine achieves a prolonged duration of action through changes to the amino acid sequence of recombinant human insulin and a slight adjustment of the formulation. In insulin glargine, the asparagine at position 21 of the human insulin A chain is replaced by the charge-neutral glycine, which makes the hexamer more stable. Two arginines are added at the C-terminus of the B chain, which raises the isoelectric point from 5.4 to 6.7, so that insulin glargine is a clear solution in a weakly acidic environment but its solubility drops sharply under physiological conditions, causing precipitation. A small amount of zinc is added to the formulation so that crystals form under the skin after subcutaneous injection, which delays absorption and provides a long-lasting blood-glucose-lowering effect.

Insulin glargine is generally prepared by producing a precursor through genetic engineering. The originator company, Sanofi (US 5663291), uses Escherichia coli for recombinant fermentation; Pichia pastoris has also been used for fermentative expression (Biocon, IN2008CHE 000310). The classical processes for converting the glargine precursor into insulin glargine all rely on transpeptidation. For example, Sanofi (US 2009/0192073 A1) first prepares the glargine precursor B-Arg(B31)Arg(B32)-A-Gly(A21), obtains insulin glargine and the glargine analogue B-Arg(B31)-A-Gly(A21) by trypsin digestion, and then converts the analogue into insulin glargine through transpeptidation and deprotection. This process is complex, gives a low yield, and has an extremely high production cost.

Therefore, there is an urgent need in the field for a method of preparing insulin glargine that is simple, environmentally friendly and high-yielding.

Disclosure of Invention

The invention aims to provide a preparation method of insulin glargine.

In a first aspect of the invention, there is provided a recombinant insulin glargine fusion protein having the structure shown in formula I:

A-FP-TEV-R-G (I)

wherein,

"-" represents a peptide bond;

A is absent or a leader peptide;

FP is a green fluorescent protein folding unit;

TEV is a cleavage site, preferably a TEV protease cleavage site;

R is arginine or lysine, used for enzymatic cleavage;

G is insulin glargine or an active fragment thereof;

wherein said green fluorescent protein folding unit comprises 2-6, preferably 2-3, β-sheet units selected from the group consisting of:

β-sheet unit	Amino acid sequence
u1	VPILVELDGDVNG (SEQ ID NO: 11)
u2	HKFSVRGEGEGDAT (SEQ ID NO: 12)
u3	KLTLKFICTT (SEQ ID NO: 13)
u4	YVQERTISFKD (SEQ ID NO: 14)
u5	TYKTRAEVKFEGD (SEQ ID NO: 15)
u6	TLVNRIELKGIDF (SEQ ID NO: 16)
u7	HNVYITADKQ (SEQ ID NO: 17)
u8	GIKANFKIRHNVED (SEQ ID NO: 18)
u9	VQLADHYQQNTPIG (SEQ ID NO: 19)
u10	HYLSTQSVLSKD (SEQ ID NO: 20)
u11	HMVLLEFVTAAGI (SEQ ID NO: 21)

In another preferred embodiment, the green fluorescent protein folding unit is u2-u3, u4-u5, u1-u2-u3, u3-u4-u5 or u4-u5-u6.
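As an illustration of how a folding unit such as u3-u4-u5 could be written out from the β-sheet units listed above, the following minimal Python sketch concatenates the unit sequences in the stated order. It assumes the units are joined directly with no intervening residues, which this text does not state explicitly; the function name `build_folding_unit` is hypothetical.

```python
# β-sheet units u1-u11 as listed in the table above (SEQ ID NO: 11-21).
BETA_SHEET_UNITS = {
    "u1": "VPILVELDGDVNG",
    "u2": "HKFSVRGEGEGDAT",
    "u3": "KLTLKFICTT",
    "u4": "YVQERTISFKD",
    "u5": "TYKTRAEVKFEGD",
    "u6": "TLVNRIELKGIDF",
    "u7": "HNVYITADKQ",
    "u8": "GIKANFKIRHNVED",
    "u9": "VQLADHYQQNTPIG",
    "u10": "HYLSTQSVLSKD",
    "u11": "HMVLLEFVTAAGI",
}


def build_folding_unit(spec):
    """Return the sequence for a spec such as 'u3-u4-u5'.

    Assumption: units are concatenated directly, with no linker residues.
    """
    names = spec.split("-")
    if not 2 <= len(names) <= 6:
        raise ValueError("a folding unit comprises 2-6 beta-sheet units")
    return "".join(BETA_SHEET_UNITS[name] for name in names)


fp = build_folding_unit("u3-u4-u5")
print(fp, len(fp))
```

The lookup raises `KeyError` for any name outside u1-u11, so a mistyped unit is caught rather than silently skipped.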

In another preferred embodiment, G is Boc-modified insulin glargine precursor having the structure of formula II:

GB-X-GA (II)

wherein,

GB is the Boc-modified insulin glargine B chain, whose amino acid sequence is shown at positions 1-32 of SEQ ID NO: 5;

X is absent or a linker peptide; preferably, the amino acid sequence of the linker peptide is R, or is as shown in SEQ ID NO: 6-9 (GSKR, AAKR, YPGDVKR or EAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR);

GA is the insulin glargine A chain, whose amino acid sequence is shown at positions 33-53 of SEQ ID NO: 5.

In another preferred embodiment, the R is used for trypsin digestion.
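The layout of formula I can be sketched in a few lines of Python: the parts are concatenated in the order A-FP-TEV-R-G, and the index just after R marks where trypsin cleavage would release G. This is only a sketch under stated assumptions: the A and B chain sequences are the commonly cited insulin glargine sequences rather than a copy of SEQ ID NO: 5, the leader peptide is taken as absent, the TEV site ENLYFQG is the one this document gives for SEQ ID NO: 4, and the result is not claimed to equal SEQ ID NO: 1. The name `build_fusion` is hypothetical.

```python
# Illustrative assumptions, not copied from the sequence listing.
TEV_SITE = "ENLYFQG"  # TEV site as given for SEQ ID NO: 4 in this document
FP_U3_U4_U5 = "KLTLKFICTT" + "YVQERTISFKD" + "TYKTRAEVKFEGD"  # u3-u4-u5
GLARGINE_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRR"  # assumed B chain, 32 residues
GLARGINE_A = "GIVEQCCTSICSLYQLENYCG"  # assumed A chain, 21 residues


def build_fusion(leader="", fp=FP_U3_U4_U5, cleavage_residue="R", linker=""):
    """Concatenate A-FP-TEV-R-G; return (fusion, start index of G)."""
    if cleavage_residue not in ("R", "K"):
        raise ValueError("R must be arginine (R) or lysine (K)")
    g = GLARGINE_B + linker + GLARGINE_A  # G = GB-X-GA; X absent by default
    prefix = leader + fp + TEV_SITE + cleavage_residue
    return prefix + g, len(prefix)


fusion, g_start = build_fusion()
tag, g = fusion[:g_start], fusion[g_start:]
print(tag)  # leader/FP/TEV/R portion removed by cleavage after R
print(g)    # single-chain GB-X-GA portion
```

The letter K at position 29 of the B chain stands in for the Boc-protected lysine; a plain one-letter sequence cannot show the protecting group.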

In another preferred embodiment, G is Boc modified insulin glargine with the sequence shown in SEQ ID NO. 5.

In another preferred embodiment, intrachain disulfide bonds are present between GB-X-GA.

In another preferred embodiment, X is absent.

In another preferred embodiment, the sequence of the recombinant insulin glargine fusion protein is shown in SEQ ID NO. 1.

In another preferred embodiment, the position 7 of the B chain of insulin glargine forms an interchain disulfide bond with the position 7 of the A chain (A7-B7), and the position 19 of the B chain forms an interchain disulfide bond with the position 20 of the A chain (A20-B19).

In another preferred embodiment, an intrachain disulfide bond is formed between the 6 th position of the A chain and the 11 th position of the A chain (A6-A11) of the insulin glargine.
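A quick consistency check on the disulfide pattern described above (A7-B7, A20-B19 and A6-A11) is to confirm that each named position is a cysteine. The sketch below does this in Python, assuming the commonly cited insulin glargine A and B chain sequences; they are not copied from SEQ ID NO: 5, and the helper names are hypothetical.

```python
# Assumed insulin glargine chains (A21 = Gly; B31 and B32 = Arg).
CHAINS = {
    "A": "GIVEQCCTSICSLYQLENYCG",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRR",
}

# Disulfide bonds as described in the text, using 1-based positions.
DISULFIDES = [
    (("A", 7), ("B", 7)),    # interchain A7-B7
    (("A", 20), ("B", 19)),  # interchain A20-B19
    (("A", 6), ("A", 11)),   # intrachain A6-A11
]


def residue(chain, position):
    """Return the residue at a 1-based position of the named chain."""
    return CHAINS[chain][position - 1]


def all_bonds_join_cysteines():
    """True if both ends of every listed bond are cysteine (C)."""
    return all(
        residue(*first) == "C" and residue(*second) == "C"
        for first, second in DISULFIDES
    )


print(all_bonds_join_cysteines())
```

The six positions in the three bonds account for every cysteine in the two assumed chains.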

In a second aspect of the invention, there is provided a double-chain insulin glargine fusion protein having the structure shown in formula III:

A-FP-TEV-R-D (III)

wherein,

"-" represents a peptide bond;

A is absent or a leader peptide, preferably a leader peptide having the sequence shown in SEQ ID NO: 2;

FP is a green fluorescent protein folding unit;

TEV is a cleavage site, preferably a TEV protease cleavage site (with the sequence ENLYFQG, SEQ ID NO: 4);

R is arginine or lysine, used for enzymatic cleavage;

D is Boc-modified double-chain insulin glargine, whose backbone has the structure shown in formula IV below;

wherein,

"║" represents a disulfide bond;

GA is the insulin glargine A chain, whose amino acid sequence is shown at positions 33-53 of SEQ ID NO: 5;

X is absent or a linker peptide;

GB is the insulin glargine B chain with Boc modification at position 29, whose amino acid sequence is shown at positions 1-32 of SEQ ID NO: 5.

In another preferred embodiment, the C-terminus of the insulin glargine B chain is linked to the N-terminus of the insulin glargine A chain via a linker peptide.

In another preferred embodiment, X is absent or a linker peptide; preferably, the amino acid sequence of the linker peptide is R, or is as shown in SEQ ID NO: 6-9 (GSKR, AAKR, YPGDVKR or EAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR).
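The linker options can be compared side by side with a short Python sketch that builds GB-X-GA for each one and checks a property visible in the list itself: every linker ends in arginine, so a trypsin-type cut after that residue would expose the N-terminus of the A chain. The chain sequences are the commonly cited insulin glargine sequences (an assumption, not a copy of SEQ ID NO: 5), and pairing SEQ ID NO: 6-9 with the four listed peptides in order is also an assumption.

```python
GLARGINE_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRR"  # assumed B chain
GLARGINE_A = "GIVEQCCTSICSLYQLENYCG"  # assumed A chain

# Linker peptides named in the text; SEQ ID numbering assumed to follow list order.
LINKERS = {
    "R": "R",
    "SEQ ID NO: 6": "GSKR",
    "SEQ ID NO: 7": "AAKR",
    "SEQ ID NO: 8": "YPGDVKR",
    "SEQ ID NO: 9": "EAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR",
}


def build_precursor(linker=""):
    """Return the single-chain precursor GB-X-GA; an empty linker means X is absent."""
    return GLARGINE_B + linker + GLARGINE_A


def ends_with_basic_residue(linker):
    """True if the linker's last residue is Arg (R) or Lys (K)."""
    return linker[-1] in "RK"


for name, linker in LINKERS.items():
    precursor = build_precursor(linker)
    print(name, len(precursor), ends_with_basic_residue(linker))
```

When X is absent, the residue preceding the A chain is B32, which is itself an arginine in the assumed B chain.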

In a third aspect of the invention, there is provided a Boc-modified insulin glargine precursor having a structure according to formula II:

GB-X-GA (II)

wherein,

GB is the insulin glargine B chain with Boc modification at position 29, whose amino acid sequence is shown at positions 1-32 of SEQ ID NO: 5;

X is absent or a linker peptide; preferably, the amino acid sequence of the linker peptide is R, or is as shown in SEQ ID NO: 6-9 (GSKR, AAKR, YPGDVKR or EAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR).

In another preferred embodiment, the protected lysine is Nε-(tert-butyloxycarbonyl)-lysine.

In a fourth aspect of the present invention, there is provided a Boc-modified insulin glargine having a structure represented by formula IV:

wherein,

"║" represents a disulfide bond;

GB is the insulin glargine B chain, whose amino acid sequence is shown at positions 1-32 of SEQ ID NO: 5, and the lysine at position 29 of the B chain is Nε-(tert-butyloxycarbonyl)-lysine.

In a fifth aspect of the present invention, there is provided a method for preparing insulin glargine, the method comprising the steps of:

(i) Fermenting a recombinant bacterium to obtain a fermentation broth containing recombinant insulin glargine fusion protein (first protein);

(ii) Carrying out enzyme digestion on the recombinant insulin glargine fusion protein (first protein) to obtain a mixed solution I containing Boc-modified insulin glargine (second protein);

(iii) Carrying out deprotection treatment on the Boc-modified insulin glargine (second protein) to obtain a mixed solution II containing the deprotected insulin glargine (third protein);

(iv) Purifying the mixed solution II to obtain the insulin glargine.

In another preferred embodiment, the purification treatment comprises the steps of:

(I) Carrying out a first cation chromatography on the mixed solution II to obtain an eluent I containing recombinant insulin glargine (third protein);

(II) Carrying out a second cation chromatography on the eluent I to obtain an eluent II containing recombinant insulin glargine (third protein);

(III) Carrying out reverse-phase chromatography on the eluent II to obtain the insulin glargine.

In another preferred embodiment, in step (ii), the temperature of the enzyme cleavage is 15-25 ℃, preferably 18-22 ℃.

In another preferred example, succinic acid and L-lysine are added as digestion aids during the enzyme digestion in step (ii).

In another preferred embodiment, in step (ii), the digestion system contains succinic acid at a concentration of 10-50 mmol/L, preferably 15-30 mmol/L, and more preferably 30 mmol/L.

In another preferred embodiment, in step (ii), the digestion system contains L-lysine at a concentration of 10-50 mmol/L, preferably 15-30 mmol/L, and more preferably 30 mmol/L.

In another preferred embodiment, the purity of the prepared insulin glargine is higher than 99%.

In another preferred embodiment, the prepared insulin glargine has insulin activity.

In another preferred embodiment, the recombinant Boc-insulin glargine is insulin glargine with protected lysine at B29 (position 29 of insulin B chain).

In another preferred embodiment, the protected lysine is lysine with a protecting group.

In another preferred embodiment, the protected lysine is Nε-(tert-butyloxycarbonyl)-lysine.

In another preferred embodiment, in step (i), the recombinant bacterium is used for the fermentative production of recombinant insulin glargine fusion protein.

In another preferred embodiment, the recombinant bacterium comprises or integrates an expression cassette for expressing the recombinant insulin glargine fusion protein.

In another preferred example, in step (i), the recombinant insulin glargine fusion protein inclusion bodies are obtained by isolation from the fermentation broth of the recombinant bacteria.

In another preferred embodiment, in step (i), the inclusion body is denatured and renatured to obtain the recombinant insulin glargine fusion protein (first protein) with correct protein folding.

In another preferred embodiment, the recombinant insulin glargine fusion protein with correctly folded protein comprises an intrachain disulfide bond between the A chain and the B chain of insulin glargine.

In another preferred embodiment, the recombinant insulin glargine fusion protein is according to the first aspect of the present invention.

In another preferred embodiment, in step (ii), the trypsin is recombinant porcine trypsin.

In another preferred embodiment, in step (ii), the mass ratio of trypsin to recombinant insulin glargine precursor is 1.

In another preferred example, in step (ii), the enzyme digestion assisting agent contained in the enzyme digestion system can effectively improve the enzyme digestion yield.

In another preferred embodiment, in step (ii), the digestion time is 10-25 h, preferably 14-20 h.

In another preferred embodiment, in step (ii), the pH of the recombinant insulin glargine precursor solution is 7.5-9.0.

In another preferred example, in step (iii), deprotection treatment is performed using hydrochloric acid.

In another preferred embodiment, in step (iii), the temperature of the deprotection reaction is 25 to 40 ℃, preferably 36 to 38 ℃.

In another preferred embodiment, in step (iii), the deprotection reaction time is 2-6h, preferably 4-5h.
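The operating windows quoted above for enzyme digestion (step (ii)) and deprotection (step (iii)) can be collected into a small lookup and checked against a planned run. The Python sketch below uses only the broad ranges stated in these embodiments; the parameter names and the helper `out_of_range` are hypothetical, and the check makes no claim about which values give the best yield.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high


# Broad ranges stated for step (ii), enzyme digestion.
DIGESTION_RANGES = {
    "temperature_c": Range(15, 25),
    "succinic_acid_mmol_per_l": Range(10, 50),
    "l_lysine_mmol_per_l": Range(10, 50),
    "time_h": Range(10, 25),
    "ph": Range(7.5, 9.0),
}

# Broad ranges stated for step (iii), deprotection.
DEPROTECTION_RANGES = {
    "temperature_c": Range(25, 40),
    "time_h": Range(2, 6),
}


def out_of_range(params, ranges):
    """Return the names of parameters that fall outside their stated range."""
    return [name for name, rng in ranges.items() if not rng.contains(params[name])]


digestion = {
    "temperature_c": 20,
    "succinic_acid_mmol_per_l": 30,
    "l_lysine_mmol_per_l": 30,
    "time_h": 16,
    "ph": 8.0,
}
print(out_of_range(digestion, DIGESTION_RANGES))
print(out_of_range({"temperature_c": 45, "time_h": 4}, DEPROTECTION_RANGES))
```

A missing parameter raises `KeyError`, which keeps an incomplete run description from passing the check unnoticed.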

In another preferred embodiment, said Boc-insulin glargine is Nε-(tert-butyloxycarbonyl)-lysine insulin glargine.

In another preferred embodiment, in step (I), cation chromatography is performed using a weak cation exchange packing.

In another preferred embodiment, in step (I), an acetic acid 40-60mmol/L counter ion column is used.

In another preferred embodiment, in step (I), the loading amount of recombinant insulin glargine (third protein) is ≤12 mg/ml.

In another preferred embodiment, in step (I), a linear gradient elution is performed using isopropanol-containing ammonium acetate.

In another preferred example, the step (I) further comprises a desalting treatment step.

In another preferred example, in the desalting treatment, the target protein is precipitated by isoelectric precipitation.

In another preferred example, in the desalting treatment, 1 to 3 mmol/L zinc acetate solution is added under stirring, sodium hydroxide is then added dropwise to adjust the pH to 6.0-7.0, and the mixture is stirred and then allowed to stand.

In another preferred embodiment, the temperature of the standing is 2-8 ℃.

In another preferred embodiment, the standing time is 1-5 h.

In another preferred example, after the standing, a microfiltration membrane with the pore diameter of 0.1-0.4 μm is selected for filtration.

In another preferred embodiment, after the microfiltration membrane filtration, buffer exchange is performed with an ammonium acetate solution.

In another preferred embodiment, in step (II), a sodium chloride solution containing isopropanol is used as the mobile phase.

In another preferred embodiment, in step (II), the concentration of the sodium chloride solution is 0.1-0.5mol/L.

In another preferred example, in step (II), linear elution is performed using a mobile phase.

In another preferred embodiment, in step (II), the loading amount of insulin glargine in the eluent I is ≤5 mg/ml; preferably, the loading amount is ≤4 mg/ml.

In another preferred embodiment, in step (III), an acetonitrile solution containing sodium citrate is used as the mobile phase.

In another preferred embodiment, in step (III), the concentration of sodium citrate in the mobile phase is 80-120 mmol/L, preferably 90-110 mmol/L.

In another preferred embodiment, in step (III), the pH of the mobile phase is 4.0 to 4.5, preferably 4.1 to 4.2.

In another preferred embodiment, in step (III), a gradient elution is performed using a mobile phase.

In another preferred embodiment, in step (III), the loading amount of insulin glargine in the eluent II is less than or equal to 6mg/ml, and preferably, the loading amount is less than or equal to 5mg/ml.
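The loading limits given for the three chromatography steps translate directly into a minimum column size for a given batch. The Python sketch below shows the arithmetic, assuming that "mg/ml" means milligrams of protein per millilitre of column packing, which this text does not spell out. The step keys and the function `min_column_volume_ml` are hypothetical names.

```python
# Loading limits from the text, assumed to be mg protein per ml of packing.
MAX_LOADING_MG_PER_ML = {
    "cation_1": 12.0,       # step (I)
    "cation_2": 5.0,        # step (II)
    "reverse_phase": 6.0,   # step (III)
}

# Preferred (tighter) limits, where the text states one.
PREFERRED_LOADING_MG_PER_ML = {
    "cation_2": 4.0,
    "reverse_phase": 5.0,
}


def min_column_volume_ml(protein_mg, step, preferred=False):
    """Smallest packing volume (ml) that keeps the load within the limit."""
    limit = MAX_LOADING_MG_PER_ML[step]
    if preferred:
        limit = PREFERRED_LOADING_MG_PER_ML.get(step, limit)
    return protein_mg / limit


print(min_column_volume_ml(1200, "cation_1"))
print(min_column_volume_ml(1000, "cation_2", preferred=True))
```

Step (I) has no separate preferred limit in the text, so `preferred=True` falls back to the general limit for that step.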

In another preferred embodiment, after the step (III), the step of precipitating and lyophilizing the prepared glargine is further included, so as to obtain a lyophilized product.

In a sixth aspect of the invention there is provided an insulin glargine formulation prepared using the method of the fifth aspect of the invention.

In another preferred embodiment, the purity of insulin glargine contained in the insulin glargine preparation is higher than 99%.

In another preferred embodiment, the insulin glargine contained in the insulin glargine preparation has biological activity.

In a seventh aspect of the invention, there is provided an isolated polynucleotide encoding the recombinant insulin glargine fusion protein of the first aspect of the invention, the insulin glargine backbone fusion protein of the second aspect of the invention, the Boc-modified insulin glargine precursor of the third aspect of the invention, or the Boc-modified insulin glargine backbone of the fourth aspect of the invention.

In an eighth aspect of the invention, there is provided a vector comprising a polynucleotide according to the seventh aspect of the invention.

In another preferred embodiment, the vector is selected from the group consisting of: DNA, RNA, plasmids, lentiviral vectors, adenoviral vectors, retroviral vectors, transposons, or combinations thereof.

In a ninth aspect of the present invention, there is provided a host cell comprising the vector of the eighth aspect of the present invention, or a polynucleotide of the seventh aspect of the present invention integrated into a chromosome, or expressing the recombinant insulin glargine fusion protein of the first aspect of the present invention, the insulin glargine backbone fusion protein of the second aspect of the present invention, the Boc-modified insulin glargine precursor of the third aspect of the present invention, or the Boc-modified insulin glargine backbone of the fourth aspect of the present invention.

In another preferred embodiment, the host cell is Escherichia coli, bacillus subtilis, a yeast cell, an insect cell, a mammalian cell, or a combination thereof.

In a tenth aspect of the invention, there is provided a formulation or pharmaceutical composition comprising the recombinant insulin glargine fusion protein according to the first aspect of the invention, the insulin glargine backbone fusion protein according to the second aspect of the invention, the Boc-modified insulin glargine precursor according to the third aspect of the invention, or the Boc-modified insulin glargine backbone according to the fourth aspect of the invention, and a pharmaceutically acceptable carrier.

It is to be understood that within the scope of the present invention, the above-described features of the present invention and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, these combinations are not described one by one here.

Drawings

FIG. 1 shows a map of plasmid pBAD-FP-TEV-R-G.

FIG. 2 shows a map of the plasmid pEvol-pylRs-pylT.

FIG. 3 shows an SDS-PAGE of insulin glargine after the first chromatography.

FIG. 4 shows the HPLC profile of insulin glargine after the second chromatography.

FIG. 5 shows the HPLC profile of insulin glargine after the third chromatography.

Detailed Description

After extensive and intensive research, the present inventors have developed an insulin glargine derivative and a method for preparing it. Specifically, the method expresses an insulin glargine fusion protein containing a green fluorescent protein folding unit in Escherichia coli at high density, then digests the fusion protein enzymatically and purifies the product to obtain insulin glargine. The method does not need to be carried out in an organic system, has fewer process steps, causes little environmental pollution, costs less, and is suitable for wide adoption.

Terms

In order that the disclosure may be more readily understood, certain terms are first defined. As used in this application, each of the following terms shall have the meaning given below, unless explicitly specified otherwise herein. Other definitions are set forth throughout the application.

The term "about" can refer to a value or composition that is within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined.

Insulin glargine

Insulin products are the leading drug class in the diabetes market, with about 53% of the market share, and consist mainly of third-generation recombinant insulins. Insulin glargine is a third-generation recombinant insulin and a long-acting insulin. It has no pronounced peak, and so avoids peak-related risks such as hypoglycemia and sudden death, making it both safe and long-acting. The insulin glargine product Lantus has for years held the largest share of the insulin market, accounting for more than 30% of the whole market.

Insulin glargine achieves a prolonged duration of action through changes to the amino acid sequence of recombinant human insulin and a slight adjustment of the formulation. In insulin glargine, the asparagine at position 21 of the human insulin A chain is replaced by the charge-neutral glycine, which makes the hexamer more stable. Two arginines are added at the C-terminus of the B chain, which raises the isoelectric point from 5.4 to 6.7, so that insulin glargine is a clear solution in a weakly acidic environment but its solubility drops sharply under physiological conditions, producing a precipitate. A small amount of zinc is added to the formulation so that crystals form under the skin after subcutaneous injection, which delays absorption and provides a long-lasting blood-glucose-lowering effect.
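The two sequence changes described above can be written out concretely. The Python sketch below starts from the human insulin A and B chains as commonly reported in public sequence databases (stated here as an assumption, since they are not reproduced in this text), replaces Asn with Gly at A21, and appends two arginines to the B chain. The function name `to_glargine` is hypothetical.

```python
# Human insulin chains as commonly reported (assumption; not quoted in this text).
HUMAN_INSULIN_A = "GIVEQCCTSICSLYQLENYCN"            # 21 residues, A21 = Asn
HUMAN_INSULIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"   # 30 residues


def to_glargine(a_chain, b_chain):
    """Apply the two changes described in the text.

    1. Replace asparagine (N) at A21 with glycine (G).
    2. Append two arginines (RR) to the C-terminus of the B chain.
    """
    if len(a_chain) != 21 or a_chain[20] != "N":
        raise ValueError("expected a 21-residue A chain ending in Asn")
    return a_chain[:20] + "G", b_chain + "RR"


glargine_a, glargine_b = to_glargine(HUMAN_INSULIN_A, HUMAN_INSULIN_B)
print(glargine_a)
print(glargine_b, len(glargine_b))
```

The resulting B chain has 32 residues, which matches the "positions 1-32" used for the B chain elsewhere in this document.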

Fusion proteins

Using the green fluorescent protein folding unit, two fusion proteins are constructed: the recombinant insulin glargine fusion protein comprising the single-chain insulin glargine precursor according to the first aspect of the invention, and the double-chain insulin glargine fusion protein comprising double-chain insulin glargine according to the second aspect of the invention. In fact, the scopes of protection of the two fusion proteins may partially overlap. For example, in the double-chain insulin glargine contained in the fusion protein, the C-terminus of the B chain may also be linked to the N-terminus of the A chain by a linker peptide, in which case the molecule may equally be regarded as a single chain containing intrachain disulfide bonds.

The green fluorescent protein folding unit FP comprised in the fusion protein of the invention comprises 2 to 6, preferably 2 to 3, β-sheet units selected from the units u1-u11 listed above (SEQ ID NO: 11-21).

In another preferred embodiment, the green fluorescent protein folding unit FP can be selected from: u8, u9, u2-u3, u4-u5, u8-u9, u1-u2-u3, u2-u3-u4, u3-u4-u5, u5-u6-u7, u8-u9-u10, u9-u10-u11, u3-u5-u7, u3-u4-u6, u4-u7-u10, u6-u8-u10, u1-u2-u3-u4, u2-u3-u4-u5, u3-u4-u3-u4, u3-u5-u7-u9, u5-u6-u7-u8, u1-u3-u7-u9, u2-u2-u7-u8, u7-u2-u5-u11, u3-u4-u7-u10, u1-I-u2, u1-I-u5, u2-I-u4, u3-I-u8, u5-I-u6, or u10-I-u11.

In another preferred embodiment, the green fluorescent protein folding unit is u3-u4-u5.

The term "fusion protein" as used herein also includes variants that retain the activity described above. These variants include (but are not limited to): deletion, insertion and/or substitution of 1 to 3 (usually 1 to 2, more preferably 1) amino acids, and addition or deletion of one or several (usually up to 3, preferably up to 2, more preferably up to 1) amino acids at the C-terminus and/or N-terminus. For example, in the art, substitution with amino acids of similar or analogous properties generally does not alter the function of the protein. Likewise, the addition or deletion of one or several amino acids at the C-terminus and/or N-terminus generally does not alter the structure and function of the protein. In addition, the term also includes monomeric and multimeric forms of the polypeptides of the invention. The term also includes linear as well as non-linear polypeptides (e.g., cyclic peptides).

The invention also includes active fragments, derivatives and analogs of the above fusion proteins. As used herein, the terms "fragment," "derivative," and "analog" refer to a polypeptide that substantially retains the function or activity of a fusion protein of the invention. The polypeptide fragment, derivative or analogue of the present invention may be (i) a polypeptide in which one or more conserved or non-conserved amino acid residues (preferably conserved amino acid residues) are substituted, or (ii) a polypeptide having a substituent group in one or more amino acid residues, or (iii) a polypeptide in which a polypeptide is fused with another compound (such as a compound for increasing the half-life of the polypeptide, e.g., polyethylene glycol), or (iv) a polypeptide in which an additional amino acid sequence is fused with the polypeptide sequence (a fusion protein in which a tag sequence such as a leader sequence, a secretory sequence or 6His is fused). Such fragments, derivatives and analogs are within the purview of those skilled in the art in view of the teachings herein.

A preferred class of active derivatives refers to polypeptides in which up to 3, preferably up to 2, more preferably up to 1 amino acid is replaced by an amino acid of similar or analogous properties, compared with the amino acid sequence of the present invention. These conservative variants are preferably produced by amino acid substitutions according to Table A.

TABLE A

The invention also provides analogs of the fusion proteins of the invention. These analogs may differ from the polypeptides of the invention by amino acid sequence differences, by modifications that do not affect the sequence, or by both. Analogs also include analogs having residues other than the natural L-amino acids (e.g., D-amino acids), as well as analogs having non-naturally occurring or synthetic amino acids (e.g., beta, gamma-amino acids). It is to be understood that the polypeptides of the present invention are not limited to the representative polypeptides exemplified above.

In addition, modifications may be made to the fusion proteins of the invention. Modified (generally without altering primary structure) forms include: chemically derivatized forms of the polypeptide, such as acetylation or carboxylation, in vivo or in vitro. Modifications also include glycosylation, such as those resulting from glycosylation modifications in the synthesis and processing of the polypeptide or in further processing steps. Such modification may be accomplished by exposing the polypeptide to an enzyme that performs glycosylation, such as a mammalian glycosylase or deglycosylase. Modified forms also include sequences having phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are polypeptides modified to increase their resistance to proteolysis or to optimize solubility.

The term "polynucleotide encoding a fusion protein of the present invention" may include a polynucleotide encoding a fusion protein of the present invention, and may also include polynucleotides that additionally include coding and/or non-coding sequences.

The invention also relates to variants of the above polynucleotides which encode fragments, analogs and derivatives of the polypeptides or fusion proteins having the same amino acid sequence as the present invention. These nucleotide variants include substitution variants, deletion variants and insertion variants. As is known in the art, an allelic variant is an alternative form of a polynucleotide; it may involve a substitution, deletion, or insertion of one or more nucleotides, without substantially altering the function of the fusion protein it encodes.

The present invention also relates to polynucleotides which hybridize to the sequences described above and which have at least 50%, preferably at least 70%, and more preferably at least 80% identity between the two sequences. The present invention particularly relates to polynucleotides hybridizable under stringent conditions with the polynucleotides of the present invention. In the present invention, "stringent conditions" mean: (1) hybridization and elution at lower ionic strength and higher temperature, e.g., 0.2× SSC, 0.1% SDS, 60 ℃; or (2) addition of a denaturing agent during hybridization, e.g., 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42 ℃; or (3) hybridization occurring only when the identity between the two sequences is at least 90%, preferably at least 95%.
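The identity thresholds mentioned above (50%, 70%, 80%, and 90% or 95% under stringent conditions) can be illustrated with a simple percent-identity calculation. The Python sketch below assumes the two sequences are already aligned, have equal length and contain no gaps, and defines identity as matching positions divided by aligned length; this text does not specify an alignment method, so real comparisons would normally use a dedicated alignment tool. Both function names are hypothetical.

```python
def percent_identity(seq1, seq2):
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq1.upper(), seq2.upper()))
    return 100.0 * matches / len(seq1)


def meets_identity_threshold(seq1, seq2, threshold=90.0):
    """True if identity reaches the threshold (90% by default, as in condition (3))."""
    return percent_identity(seq1, seq2) >= threshold


reference = "ATGCATGCAT"
candidate = "ATGCATGCAA"  # differs from the reference at the last position only
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate, threshold=95.0))
```

Passing a different `threshold` lets the same check be reused for the 50%, 70% and 80% levels named in the text.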

The fusion proteins and polynucleotides of the invention are preferably provided in isolated form, and more preferably, purified to homogeneity.

The full-length sequence of the polynucleotide of the present invention can be obtained by PCR amplification, recombination, or artificial synthesis. For PCR amplification, primers can be designed based on the nucleotide sequences disclosed herein, particularly open reading frame sequences, and the sequences can be amplified using commercially available cDNA libraries or cDNA libraries prepared by conventional methods known to those skilled in the art as templates. When the sequence is long, two or more PCR amplifications are often required, and then the amplified fragments are spliced together in the correct order.

Once the sequence of interest has been obtained, it can be obtained in large quantities by recombinant methods. This is usually done by cloning it into a vector, transferring it into a cell, and isolating the relevant sequence from the propagated host cell by conventional methods.

In addition, the sequence can be synthesized by artificial synthesis, especially when the fragment length is short. Generally, fragments with long sequences are obtained by first synthesizing a plurality of small fragments and then ligating them.

At present, DNA sequences encoding the proteins of the present invention (or fragments or derivatives thereof) have been obtained completely by chemical synthesis. The DNA sequence can then be introduced into various existing DNA molecules (or e.g., vectors) and cells known in the art.

Methods for amplifying DNA/RNA using PCR techniques are preferably used to obtain the polynucleotides of the invention. In particular, when it is difficult to obtain a full-length cDNA from a library, it is preferable to use the RACE method (rapid amplification of cDNA ends); primers used for PCR can be appropriately selected based on the sequence information of the present invention disclosed herein and synthesized by a conventional method. The amplified DNA/RNA fragments can be isolated and purified by conventional methods, such as gel electrophoresis.

Expression vector

The invention also relates to vectors comprising the polynucleotides of the invention, as well as genetically engineered host cells transformed with the vectors of the invention or the coding sequences of the fusion proteins of the invention, and methods for producing the polypeptides of the invention by recombinant techniques.

The polynucleotide sequences of the present invention may be used to express or produce recombinant fusion proteins by conventional recombinant DNA techniques. Generally, the following steps are provided:

(1) Transforming or transducing a suitable host cell with a polynucleotide (or variant) of the invention encoding a fusion protein of the invention, or with a recombinant expression vector comprising the polynucleotide;

(2) Culturing the host cell in a suitable medium;

(3) Isolating and purifying the protein from the culture medium or the cells.

In the present invention, the polynucleotide sequence encoding the fusion protein may be inserted into a recombinant expression vector. The term "recombinant expression vector" refers to a bacterial plasmid, bacteriophage, yeast plasmid, plant cell virus, mammalian cell virus such as adenovirus, retrovirus, or other vectors well known in the art. Any plasmid or vector may be used as long as it can replicate and is stable in the host. An important feature of expression vectors is that they generally contain an origin of replication, a promoter, a marker gene and translation control elements.

Methods well known to those skilled in the art can be used to construct expression vectors containing the DNA sequences encoding the fusion proteins of the present invention and appropriate transcription/translation control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like. The DNA sequence may be operably linked to a suitable promoter in an expression vector to direct mRNA synthesis. Representative examples of such promoters are: lac or trp promoter of E.coli; a lambda phage PL promoter; eukaryotic promoters include CMV immediate early promoter, HSV thymidine kinase promoter, early and late SV40 promoter, LTRs of retrovirus, and other known promoters which can control the expression of genes in prokaryotic or eukaryotic cells or viruses. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.

Furthermore, the expression vector preferably comprises one or more selectable marker genes to provide phenotypic traits for selection of transformed host cells, such as dihydrofolate reductase, neomycin resistance and Green Fluorescent Protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E.coli.

Vectors comprising the appropriate DNA sequences described above, together with appropriate promoter or control sequences, may be used to transform appropriate host cells to enable expression of the protein.

The host cell may be a prokaryotic cell, such as a bacterial cell; a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples are: bacterial cells such as Escherichia coli, Streptomyces and Salmonella typhimurium; fungal cells such as yeast; and plant cells (e.g., ginseng cells).

When the polynucleotide of the present invention is expressed in higher eukaryotic cells, transcription will be enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs, that act on a promoter to increase transcription of a gene. Examples include the SV40 enhancer, which is 100 to 270 bp on the late side of the replication origin, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers, among others.

It will be clear to one of ordinary skill in the art how to select appropriate vectors, promoters, enhancers and host cells.

Transformation of a host cell with recombinant DNA can be carried out using conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of DNA uptake can be harvested after the exponential growth phase and treated by the CaCl₂ method, the steps of which are well known in the art. Another method is to use MgCl₂. If desired, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate coprecipitation, conventional mechanical methods such as microinjection, electroporation, liposome encapsulation, etc.

The obtained transformant can be cultured by a conventional method to express the polypeptide encoded by the gene of the present invention. The medium used in the culture may be selected from various conventional media depending on the host cell used. The culturing is performed under conditions suitable for growth of the host cell. After the host cells have been grown to an appropriate cell density, the selected promoter is induced by suitable means (e.g., temperature shift or chemical induction) and the cells are cultured for an additional period of time.

The recombinant polypeptide in the above method may be expressed intracellularly or on the cell membrane, or secreted extracellularly. If necessary, the recombinant protein can be isolated and purified by various separation methods that exploit its physical, chemical and other properties. These methods are well known to those skilled in the art. Examples include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (such as salting out), centrifugation, osmotic cell lysis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high performance liquid chromatography (HPLC), other liquid chromatography techniques, and combinations thereof.

The main advantages of the invention include:

(1) The method does not require removal of excess inorganic salts from the fermentation broth supernatant by dilution, ultrafiltration, buffer exchange or similar methods. The inclusion bodies obtained have higher purity and less pigment, which reduces the substances to be separated and the cost of subsequent purification. In addition, the cation chromatography in the method is used to separate insulin glargine, with a single-step yield of more than 80%.

(2) Because the lysine at position B29 is protected by Boc, trypsin does not recognize the B29 lysine during digestion, so the des(B30) by-product is not produced. This improves the digestion yield, reduces insulin analog impurities, and facilitates subsequent purification and separation.

(3) In the enzyme digestion process, the enzyme digestion yield is improved by optimizing the proportion of trypsin, controlling the enzyme digestion temperature and adding an enzyme digestion auxiliary agent.

(4) In the deprotection step, Boc-insulin glargine is converted into insulin glargine without the use of an organic system, which reduces the process steps, the environmental pollution and the cost.

(5) In the desalting step, the target protein is precipitated by the isoelectric point method, which also purifies it to some extent by removing part of the impurity proteins. In addition, a filter column with a membrane pore size of 0.1-0.2 μm is used in place of a small-pore ultrafiltration membrane, which greatly shortens the filtration time.

(6) The invention adopts two-step ion exchange chromatography and one-step reverse phase chromatography for separation and purification, replaces the conventional four-step chromatography, reduces the production period, reduces the use of organic solvent and saves the cost.

(7) In the fusion protein of the invention, insulin glargine accounts for a high proportion of the sequence (an increased fusion ratio). The green fluorescent protein folding unit in the fusion protein contains arginine and lysine and can be digested into small fragments by protease; these fragments differ greatly in molecular weight from the target protein and are easy to separate.
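Advantage (7) can be illustrated with a small in-silico digest. The following is a minimal sketch, assuming the simplified textbook rule that trypsin cleaves after every lysine (K) or arginine (R) unless the next residue is proline. The function name `tryptic_fragments` is hypothetical; the sequences are SEQ ID NO: 2, 3, 4 and 5 of this document. Only the tag upstream of insulin glargine is digested here, and real digestion efficiency depends on conditions that this rule ignores.

```python
# Simplified in-silico tryptic digest of the fusion tag (A-FP-TEV-R).
# Assumption: cleavage after every K or R unless the next residue is P.

LEADER = "MVSKGEELFTGV"                      # SEQ ID NO: 2
FP = "KLTLKFICTTYVQERTISFKDTYKTRAEVKFEGD"    # SEQ ID NO: 3
TEV_SITE = "ENLYFQG"                         # SEQ ID NO: 4
GLARGINE = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRRGIVEQCCTSICSLYQLENYCG"  # SEQ ID NO: 5


def tryptic_fragments(seq: str) -> list[str]:
    """Split seq after each K/R that is not followed by P."""
    fragments, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])  # C-terminal remainder, if any
    return fragments


tag = LEADER + FP + TEV_SITE + "R"  # everything upstream of insulin glargine
fragments = tryptic_fragments(tag)
longest = max(len(f) for f in fragments)

print(fragments)
print(f"longest tag fragment: {longest} residues")
print(f"insulin glargine (B and A chains together): {len(GLARGINE)} residues")
```

Under this rule the 54-residue tag is cut into peptides of at most 12 residues, against 53 residues for insulin glargine, which is the size difference the paragraph relies on.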

The invention is further illustrated below with reference to specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the invention. Experimental procedures for which no specific conditions are noted in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise indicated, percentages and parts are by weight.

Example 1 Construction and expression of the insulin glargine-expressing strain

The insulin glargine expression plasmid was constructed by methods known in the art; see specifically the description of the examples in patent application No. 201910210102.9. The DNA fragment encoding the fusion protein FP-TEV-R-G was cloned into the NcoI-XhoI sites downstream of the araBAD promoter of the expression vector plasmid pBAD/His A (purchased from NTCC, kanamycin resistant) to obtain plasmid pBAD-FP-TEV-R-G. The plasmid map is shown in FIG. 1.

The DNA sequence of pylRs was then cloned into the SpeI-SalI sites downstream of the araBAD promoter of the expression vector plasmid pEvol-pBpF (purchased from NTCC, chloramphenicol resistant), while the DNA sequence of the tRNA of the lysyl-tRNA synthetase (pylTcua) was inserted by PCR downstream of the proK promoter. This plasmid was designated pEvol-pylRs-pylT. The plasmid map is shown in FIG. 2.

The constructed insulin glargine expression vectors were transformed into an Escherichia coli strain, and screening gave a recombinant strain expressing the recombinant insulin glargine precursor. The amino acid sequence of the recombinant insulin glargine precursor is shown in SEQ ID NO: 1, and position 73 of the precursor sequence (position 29 of insulin glargine) is a Boc-protected lysine.

MVSKGEELFTGVKLTLKFICTTYVQERTISFKDTYKTRAEVKFEGDENLYFQGRFVNQHLCGSHLVEALYLVCGERGFFYTPK(Boc)TRRGIVEQCCTSICSLYQLENYCG(SEQ ID NO:1)

The structure of the recombinant insulin glargine precursor is as follows:

A-FP-TEV-R-G

wherein,

"-" represents a peptide bond;

A is a leader peptide with the sequence MVSKGEELFTGV (SEQ ID NO: 2);

FP is a green fluorescent protein folding unit with the sequence KLTLKFICTTYVQERTISFKDTYKTRAEVKFEGD (SEQ ID NO: 3);

TEV is a TEV enzyme cleavage site with the sequence ENLYFQG (SEQ ID NO: 4);

R is arginine or lysine, used for trypsin digestion;

G is insulin glargine modified with Boc at position 29, with the sequence FVNQHLCGSHLVEALYLVCGERGFFYTPK(Boc)TRRGIVEQCCTSICSLYQLENYCG (SEQ ID NO: 5).
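The structure A-FP-TEV-R-G can be checked against the sequence listing by simple string concatenation. The sketch below is illustrative only: the variable names are hypothetical, the Boc group is omitted because it is a side-chain modification rather than a residue, and the folding unit is written as the three β-sheet unit sequences given as SEQ ID NO: 13, 14 and 15 in the sequence listing.

```python
# Assemble A-FP-TEV-R-G from the listed parts and compare with SEQ ID NO: 1.
# The Boc group is a side-chain modification, so it does not appear in the
# one-letter strings below.

A = "MVSKGEELFTGV"                                   # leader peptide, SEQ ID NO: 2
FP = "KLTLKFICTT" + "YVQERTISFKD" + "TYKTRAEVKFEGD"  # SEQ ID NO: 13 + 14 + 15
TEV = "ENLYFQG"                                      # TEV cleavage site, SEQ ID NO: 4
R = "R"                                              # arginine, as in SEQ ID NO: 1
G = "FVNQHLCGSHLVEALYLVCGERGFFYTPKTRRGIVEQCCTSICSLYQLENYCG"  # SEQ ID NO: 5

SEQ_ID_NO_3 = "KLTLKFICTTYVQERTISFKDTYKTRAEVKFEGD"
SEQ_ID_NO_1 = (
    "MVSKGEELFTGVKLTLKFICTTYVQERTISFKDTYKTRAEVKFEGD"
    "ENLYFQGRFVNQHLCGSHLVEALYLVCGERGFFYTPKTRRGIVEQCCTSICSLYQLENYCG"
)

precursor = A + FP + TEV + R + G

fp_matches = FP == SEQ_ID_NO_3
precursor_matches = precursor == SEQ_ID_NO_1
part_lengths = {"A": len(A), "FP": len(FP), "TEV": len(TEV), "R": len(R), "G": len(G)}

print(part_lengths, "total:", len(precursor))
print("FP matches SEQ ID NO: 3:", fp_matches)
print("precursor matches SEQ ID NO: 1:", precursor_matches)
```

The part lengths (12, 34, 7, 1 and 53 residues) add up to 107, which agrees with the lengths given in the sequence listing for SEQ ID NO: 1 and SEQ ID NO: 5.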

A seed culture medium was prepared and inoculated, and a secondary seed culture was prepared by two-stage culture. After 20 h of culture the OD600 reached about 180; about 3 L of fermentation broth was obtained at the end of fermentation, and centrifugation gave about 150 g/L of wet cells. After the fermentation broth was centrifuged, a disruption buffer was added and the cells were disrupted twice with a high-pressure homogenizer. After centrifugation, Tween 80 and EDTA-2Na at a certain concentration were added, and the pellet was washed once and centrifuged; the precipitate was collected to give the inclusion bodies. About 43 g of wet inclusion bodies per liter of fermentation broth was finally obtained.

Example 2 Solubilization and renaturation of inclusion bodies

An 8 mol/L urea solution was added to the inclusion bodies obtained, the pH was adjusted to 8.0-9.0 with sodium hydroxide, and the mixture was stirred at room temperature for 1-3 h with the protein concentration controlled at 10-20 mg/mL. β-Mercaptoethanol was then added to a final concentration of 10-20 mmol/L, and stirring was continued for 0.5-1.0 h.

The inclusion body solution was added dropwise to the renaturation buffer for renaturation by 5- to 10-fold dilution. The renaturation solution was maintained at pH 9.0-10.5 and 2-8 °C, and stirred for a renaturation time of 10-20 h.
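The dilution arithmetic in this example can be sketched as follows. This is only an illustration, assuming that "5- to 10-fold dilution" means the final volume is 5 to 10 times the volume of the inclusion body solution; the function names are hypothetical.

```python
# Dilution renaturation arithmetic.
# Assumption: an N-fold dilution means final volume = N x starting volume.

def diluted_concentration(stock_mg_per_ml: float, dilution_factor: float) -> float:
    """Protein concentration after diluting the solubilized inclusion bodies."""
    return stock_mg_per_ml / dilution_factor


def buffer_volume_needed(solution_volume_ml: float, dilution_factor: float) -> float:
    """Renaturation buffer volume that brings the total to dilution_factor x the start."""
    return solution_volume_ml * (dilution_factor - 1)


# Extremes of the stated ranges: 10-20 mg/mL protein, 5- to 10-fold dilution.
lowest = diluted_concentration(10.0, 10.0)   # most dilute case
highest = diluted_concentration(20.0, 5.0)   # most concentrated case

print(f"protein during renaturation: {lowest:.1f}-{highest:.1f} mg/mL")
print(f"buffer for 100 mL at 5-fold: {buffer_volume_needed(100.0, 5.0):.0f} mL")
```

Under that assumption, the stated ranges correspond to a protein concentration of 1 to 4 mg/mL during renaturation.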

Example 3 Cleavage of the fusion protein

The renaturation solution was concentrated 8- to 10-fold using a 10 kDa ultrafiltration membrane. Dilute hydrochloric acid was added to the renaturation solution to adjust the pH to 7.5-9.0. The protein concentration of the renaturation concentrate was measured by the Bradford method, and the total protein amount was calculated. Recombinant trypsin was added with stirring, wherein the mass ratio of the recombinant trypsin to the total protein of the renaturation solution is 1.

After 10 h of digestion, the Boc-insulin glargine content of the digestion solution was measured by HPLC, and digestion was ended when the Boc-insulin glargine concentrations measured in two consecutive hours differed by less than 3%. The final concentration of Boc-insulin glargine in the digestion solution was 0.9-1.3 mg/mL, and the digestion rate was 30-40%.
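The stopping rule can be written as a small check. The sketch below is a minimal illustration, assuming the 3% difference is taken relative to the earlier of two readings one hour apart; the function name `digestion_endpoint` and the HPLC readings are hypothetical and are not data from this example.

```python
from typing import Optional

# Stopping rule: end digestion when two hourly Boc-insulin glargine readings
# differ by less than 3%. Assumption: the difference is relative to the
# earlier reading.

def digestion_endpoint(readings: dict[int, float], threshold: float = 0.03) -> Optional[int]:
    """Return the hour at which the rule is first met, or None if it never is."""
    hours = sorted(readings)
    for earlier, later in zip(hours, hours[1:]):
        if later - earlier != 1:
            continue  # only compare readings taken one hour apart
        previous, current = readings[earlier], readings[later]
        if previous > 0 and abs(current - previous) / previous < threshold:
            return later
    return None


# Hypothetical HPLC readings (hour of digestion -> mg/mL), for illustration only.
example_readings = {10: 0.95, 11: 1.05, 12: 1.10, 13: 1.12}
endpoint = digestion_endpoint(example_readings)
print("digestion ended at hour:", endpoint)
```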

Example 4 Deprotection

Hydrochloric acid was added to the digestion solution and the reaction was carried out at 25-40 °C for 4-5 h to remove the Boc protecting group; sodium hydroxide was then added to adjust the pH to 3.0-3.5 and terminate the deprotection reaction. The deprotection yield was about 75-80%, and the purity of insulin glargine was about 20%. (Deprotection yield = (insulin glargine concentration in the deprotected solution after termination × final volume of the deprotected solution) / (Boc-insulin glargine concentration before deprotection × volume before deprotection) × 100%)
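The yield formula in parentheses translates directly into code. The following sketch uses a hypothetical function name and hypothetical input values chosen only to show the arithmetic; like the formula itself, it compares mass concentrations and does not correct for the mass of the Boc group.

```python
# Deprotection yield as defined in this example:
# (glargine conc. after x final volume) / (Boc-glargine conc. before x volume before) x 100%

def deprotection_yield_percent(
    glargine_conc_after: float,       # mg/mL, after the reaction is terminated
    volume_after_ml: float,           # final volume of the deprotected solution
    boc_glargine_conc_before: float,  # mg/mL, before deprotection
    volume_before_ml: float,          # volume before deprotection
) -> float:
    mass_after = glargine_conc_after * volume_after_ml
    mass_before = boc_glargine_conc_before * volume_before_ml
    return mass_after / mass_before * 100.0


# Hypothetical values, for illustration only.
example_yield = deprotection_yield_percent(0.72, 1080.0, 1.0, 1000.0)
print(f"deprotection yield: {example_yield:.2f}%")
```

The volume terms matter because adding acid and base changes the volume, so a concentration ratio alone would understate the yield.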

Example 5 First chromatography

The starting protein mixture contains a large amount of residual bacterial protein, as well as digestion by-products generated during enzyme digestion and hydrolyzed protein generated by deprotection. Based on differences in protein isoelectric point, a cation exchange packing was selected for crude purification of insulin glargine. The ion exchange column was equilibrated with 3-5 column volumes of buffer containing 50 mmol/L acetic acid, pH 3.0-3.5, and the insulin glargine was bound to the cationic packing with the insulin glargine loading kept below 12 mg/mL. After loading, the column was eluted with a linear gradient of 1 mol/L ammonium acetate containing isopropanol over 20 column volumes, and the eluted target protein peak was collected. The SDS protein gel electrophoresis result for the collected peak is shown in FIG. 3. The yield of insulin glargine from the first chromatography was 80-85% and the purity was about 30%; this step removes most of the bacterial protein and part of the digestion by-products.
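The loading limit and column-volume figures determine the scale of the column and buffers. The sketch below is an illustration under the assumption that the 12 mg/mL limit refers to milligrams of insulin glargine per milliliter of packed resin; the function names and the 2400 mg example load are hypothetical.

```python
# Column sizing for the first cation exchange step.
# Assumption: the loading limit is mg insulin glargine per mL of packed resin.

def min_column_volume_ml(glargine_mg: float, max_load_mg_per_ml: float = 12.0) -> float:
    """Resin volume at which the load equals the limit; the column must be larger."""
    return glargine_mg / max_load_mg_per_ml


def buffer_volumes_ml(column_volume_ml: float) -> dict[str, float]:
    """Buffer volumes implied by 3-5 CV of equilibration and a 20 CV gradient."""
    return {
        "equilibration_min": 3 * column_volume_ml,
        "equilibration_max": 5 * column_volume_ml,
        "gradient": 20 * column_volume_ml,
    }


# Hypothetical load of 2400 mg insulin glargine, for illustration only.
column_ml = min_column_volume_ml(2400.0)
volumes = buffer_volumes_ml(column_ml)
print(f"column volume must exceed {column_ml:.0f} mL")
print(volumes)
```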

Desalting treatment was then carried out.

Zinc acetate solution (2 mmol/L) was added to the eluate collected from the first chromatography, and the mixture was stirred for 2-5 min. Sodium hydroxide was added dropwise with stirring to adjust the pH to 6.0-7.0, stirring was continued for 2-10 min, and the mixture was left to stand at 2-8 °C for 1-5 h. A microfiltration membrane with a pore size of 0.1-0.4 μm was used to concentrate the sample more than 10-fold, followed by exchange with 6 volumes of ammonium acetate solution at pH 6.0-7.0.
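The effect of the 6-volume exchange on residual salt can be estimated with the standard constant-volume diafiltration relation, residual fraction = exp(-N × (1 - rejection)). This is a sketch under assumptions the example does not state: that the exchange is run at constant volume and that the salt passes the membrane freely (rejection of 0) while the precipitated protein is retained. The function name is hypothetical.

```python
import math

# Residual fraction of a small solute after constant-volume diafiltration.
# Assumptions (not stated in the example): constant-volume mode, and the salt
# passes the 0.1-0.4 um membrane freely (rejection = 0).

def residual_fraction(diavolumes: float, rejection: float = 0.0) -> float:
    """Fraction of the original solute left after the given number of diavolumes."""
    return math.exp(-diavolumes * (1.0 - rejection))


after_six = residual_fraction(6.0)
print(f"salt remaining after 6 diavolumes: {after_six:.4%}")
```

If the exchange were instead run batchwise (dilute, then reconcentrate), the residual fraction would follow a different relation, so the figure is indicative only.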

The results show that the yield of insulin glargine was higher than 95%, and the purity was improved to about 40%.

Example 6 Second chromatography

The purity of insulin glargine in the starting mixture of this step is about 40%. The insulin glargine analogue lacking the B32 arginine has a structure very similar to that of insulin glargine and is difficult to remove. Based on differences in charge, the insulin glargine was purified by high-resolution cation chromatography to remove part of the impurities.

The insulin glargine solution was clarified, and acetic acid was added to adjust the pH to 2.5-4.5. The ion exchange column was equilibrated with 3-5 column volumes of buffer containing 75 mmol/L glycine and 30% isopropanol, pH 3.4. The insulin glargine protein solution was bound to the cationic packing with the insulin glargine loading not exceeding 4 mg/mL, linear elution was carried out with 0.3 mol/L sodium chloride containing isopropanol, and the insulin glargine sample was collected. Insulin glargine with a purity higher than 97% was finally obtained in a yield of 75.6%, with the content of insulin glargine lacking the B32 arginine controlled within 0.5%. The HPLC result is shown in FIG. 4.

Example 7 Third chromatography

Based on differences in hydrophobicity, reversed-phase chromatography was used for fine purification of insulin glargine, mainly to remove hydrolysis products of insulin glargine. The insulin glargine solution obtained from the second chromatography was diluted more than 4-fold with pure water and bound to C8 reversed-phase packing. The insulin glargine loading was kept at no more than 5 mg/mL, and gradient elution was carried out with an acetonitrile solution containing 100 mmol/L sodium citrate, pH 4.2. The insulin glargine elution peak was collected, finally giving insulin glargine in a yield of 77.3% and a purity of 99.18%. The HPLC result is shown in FIG. 5.
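The step yields reported in Examples 5 to 7 can be combined into a rough overall figure for the chromatographic purification. The sketch below assumes that the steps are sequential, that each yield is expressed relative to the input of that step, and that the desalting yield ("higher than 95%") lies between 95% and 100%. The patent does not state an overall yield; this product is only an estimate.

```python
import math

# Step yields reported in Examples 5-7 as (low, high) fractions.
# Assumption: yields multiply because each is relative to that step's input.
STEP_YIELDS = {
    "first chromatography": (0.80, 0.85),
    "desalting": (0.95, 1.00),            # reported as "higher than 95%"
    "second chromatography": (0.756, 0.756),
    "third chromatography": (0.773, 0.773),
}

overall_low = math.prod(low for low, _ in STEP_YIELDS.values())
overall_high = math.prod(high for _, high in STEP_YIELDS.values())

print(f"estimated overall purification yield: {overall_low:.1%} to {overall_high:.1%}")
```

With the figures above, the estimate comes out at roughly 44 to 50% from the deprotected solution to the reversed-phase eluate, before precipitation and drying.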

Example 8 Precipitation and lyophilization

Water for injection was added to the eluate from the third chromatography until the acetonitrile content was no more than 15% (v/v). Zinc acetate was added to a concentration of 2 mmol/L, the pH was adjusted to 6.8-7.1 with sodium hydroxide, and the mixture was left to stand at 4-8 °C for precipitation. The precipitate was collected and washed with more than a 100-fold amount of water for injection; the washed precipitate was collected and dried to give insulin glargine.
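The dilution and zinc addition can be sketched numerically. This is an illustration only, assuming that volumes are additive on mixing (acetonitrile and water do not mix ideally, so the real volume differs slightly); the function names and the 500 mL, 30% eluate are hypothetical.

```python
# Dilution of the reversed-phase eluate to at most 15% (v/v) acetonitrile,
# then zinc acetate to 2 mmol/L. Assumption: volumes are additive on mixing.

def water_to_add_ml(eluate_ml: float, acetonitrile_fraction: float,
                    target_fraction: float = 0.15) -> float:
    """Minimum water volume that brings acetonitrile down to the target fraction."""
    if acetonitrile_fraction <= target_fraction:
        return 0.0
    return eluate_ml * (acetonitrile_fraction / target_fraction - 1.0)


def zinc_acetate_mmol(total_volume_ml: float, target_mmol_per_l: float = 2.0) -> float:
    """Amount of zinc acetate needed for the target concentration."""
    return target_mmol_per_l * total_volume_ml / 1000.0


# Hypothetical eluate: 500 mL at 30% (v/v) acetonitrile.
water_ml = water_to_add_ml(500.0, 0.30)
total_ml = 500.0 + water_ml
print(f"add at least {water_ml:.0f} mL water; total {total_ml:.0f} mL")
print(f"zinc acetate needed: {zinc_acetate_mmol(total_ml):.2f} mmol")
```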

Comparative example

Construction and expression of the fusion protein expression strain were carried out in a manner similar to Example 1, except that the amino acid sequence of the fusion protein expressed is shown in SEQ ID NO: 10.

MKKLLFAIPLVVPFYSHSTMELEICSWYHMGIRSFLEQKLISEEDLNSAVDRFVNQHLCGSHLVEALYLVCGERGFFYTPK(Boc)TRRGIVEQCCTSICSLYQLENYCG(SEQ ID NO:10)

The fusion protein comprises a B chain and an A chain of insulin glargine and also comprises a gIII signal peptide.

The results showed that after 20 h of culture the OD600 was about 140; about 3 L of fermentation broth was obtained at the end of fermentation, and centrifugation gave about 105 g/L of wet cells. After the fermentation broth was centrifuged, a disruption buffer was added, the cells were disrupted twice with a high-pressure homogenizer, and the precipitate was collected by centrifugation to give the inclusion bodies. About 30 g of wet inclusion bodies per liter of fermentation broth was finally obtained.

The results show that, compared with a fusion protein of conventional structure, the expression level of the fusion protein of the invention is significantly increased, and the insulin glargine protein in the fusion protein is correctly folded and biologically active.
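The fermentation figures reported for Example 1 and for the comparative example can be compared directly. The sketch below uses only the wet cell and wet inclusion body masses stated in the text; the dictionary layout and names are hypothetical, and wet mass is only an indirect indicator of fusion protein expression.

```python
# Reported fermentation results (g per liter of fermentation broth).
RESULTS = {
    "Example 1 (A-FP-TEV-R-G)": {"wet_cells": 150.0, "inclusion_bodies": 43.0},
    "Comparative example (gIII signal peptide)": {"wet_cells": 105.0, "inclusion_bodies": 30.0},
}

invention = RESULTS["Example 1 (A-FP-TEV-R-G)"]
comparative = RESULTS["Comparative example (gIII signal peptide)"]

# Ratio of the invention's result to the comparative result for each measure.
cell_ratio = invention["wet_cells"] / comparative["wet_cells"]
inclusion_body_ratio = invention["inclusion_bodies"] / comparative["inclusion_bodies"]

print(f"wet cells: {cell_ratio:.2f}x the comparative example")
print(f"wet inclusion bodies: {inclusion_body_ratio:.2f}x the comparative example")
```

Both measures are about 1.4 times higher for the construct of Example 1 than for the comparative construct.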

All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the present invention as defined by the appended claims.

Sequence listing

<110> Ningbo Kunpeng Biotech Co Ltd

<120> preparation method of insulin glargine

<130> P2020-0721

<160> 21

<170> PatentIn version 3.5

<210> 1

<211> 107

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 1

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Lys Leu Thr Leu

1 5 10 15

Lys Phe Ile Cys Thr Thr Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys

20 25 30

Asp Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Glu Asn

35 40 45

Leu Tyr Phe Gln Gly Arg Phe Val Asn Gln His Leu Cys Gly Ser His

50 55 60

Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr

65 70 75 80

Thr Pro Lys Thr Arg Arg Gly Ile Val Glu Gln Cys Cys Thr Ser Ile

85 90 95

Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Gly

100 105

<210> 2

<211> 12

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 2

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val

1 5 10

<210> 3

<211> 34

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 3

Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Tyr Val Gln Glu Arg Thr

1 5 10 15

Ile Ser Phe Lys Asp Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu

20 25 30

Gly Asp

<210> 4

<211> 7

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 4

Glu Asn Leu Tyr Phe Gln Gly

1 5

<210> 5

<211> 53

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 5

Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr

1 5 10 15

Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Thr Arg Arg

20 25 30

Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu

35 40 45

Glu Asn Tyr Cys Gly

50

<210> 6

<211> 4

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 6

Gly Ser Lys Arg

1

<210> 7

<211> 4

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 7

Ala Ala Lys Arg

1

<210> 8

<211> 7

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 8

Tyr Pro Gly Asp Val Lys Arg

1 5

<210> 9

<211> 33

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 9

Glu Ala Glu Asp Leu Gln Val Gly Gln Val Glu Leu Gly Gly Gly Pro

1 5 10 15

Gly Ala Gly Ser Leu Gln Pro Leu Ala Leu Glu Gly Ser Leu Gln Lys

20 25 30

Arg

<210> 10

<211> 105

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 10

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser

1 5 10 15

His Ser Thr Met Glu Leu Glu Ile Cys Ser Trp Tyr His Met Gly Ile

20 25 30

Arg Ser Phe Leu Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Ser

35 40 45

Ala Val Asp Arg Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val

50 55 60

Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro

65 70 75 80

Lys Thr Arg Arg Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser

85 90 95

Leu Tyr Gln Leu Glu Asn Tyr Cys Gly

100 105

<210> 11

<211> 13

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 11

Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly

1 5 10

<210> 12

<211> 14

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 12

His Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Thr

1 5 10

<210> 13

<211> 10

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 13

Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr

1 5 10

<210> 14

<211> 11

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 14

Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys Asp

1 5 10

<210> 15

<211> 13

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 15

Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp

1 5 10

<210> 16

<211> 13

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 16

Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe

1 5 10

<210> 17

<211> 10

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 17

His Asn Val Tyr Ile Thr Ala Asp Lys Gln

1 5 10

<210> 18

<211> 14

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 18

Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp

1 5 10

<210> 19

<211> 14

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 19

Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly

1 5 10

<210> 20

<211> 12

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 20

His Tyr Leu Ser Thr Gln Ser Val Leu Ser Lys Asp

1 5 10

<210> 21

<211> 13

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 21

His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile

1 5 10

Claims

1. A method of preparing insulin glargine, comprising the steps of:

(i) Fermenting by using recombinant bacteria to prepare recombinant insulin glargine fusion protein fermentation liquor;

(ii) Carrying out enzyme digestion on the recombinant insulin glargine fusion protein to obtain mixed liquor I containing Boc modified insulin glargine;

(iii) Carrying out deprotection treatment on the Boc modified insulin glargine to obtain a mixed solution II containing the deprotected insulin glargine;

(iv) Purifying the mixed solution II to obtain insulin glargine;

wherein, the recombinant insulin glargine fusion protein has a structure shown in a formula I:

A-FP-TEV-R-G (I)

wherein,

"-" represents a peptide bond;

A is absent or a leader peptide;

FP is a green fluorescent protein folding unit;

TEV is an enzyme cleavage site;

R is arginine or lysine, used for enzyme digestion;

G is insulin glargine or an active fragment thereof;

wherein the green fluorescent protein folding unit consists of 2-3 β-sheet units selected from the following group:

β-sheet unit    Amino acid sequence
u1     VPILVELDGDVNG (SEQ ID NO: 11)
u2     HKFSVRGEGEGDAT (SEQ ID NO: 12)
u3     KLTLKFICTT (SEQ ID NO: 13)
u4     YVQERTISFKD (SEQ ID NO: 14)
u5     TYKTRAEVKFEGD (SEQ ID NO: 15)
u6     TLVNRIELKGIDF (SEQ ID NO: 16)
u7     HNVYITADKQ (SEQ ID NO: 17)
u8     GIKANFKIRHNVED (SEQ ID NO: 18)
u9     VQLADHYQQNTPIG (SEQ ID NO: 19)
u10    HYLSTQSVLSKD (SEQ ID NO: 20)
u11    HMVLLEFVTAAGI (SEQ ID NO: 21);

and the green fluorescent protein folding unit is u2-u3, u4-u5, u1-u2-u3, u3-u4-u5 or u4-u5-u6.

2. The method of claim 1, wherein said purification process comprises the steps of:

(I) Carrying out first cation chromatography on the mixed solution II to obtain an eluent I containing the recombinant insulin glargine;

(II) Subjecting the eluate I to a second cation chromatography to obtain an eluate II containing the recombinant insulin glargine;

3. The method of claim 2, wherein after step (III), further comprising the steps of precipitating and lyophilizing the prepared insulin glargine to produce a lyophilized product.

4. The method of claim 2, wherein step (I) further comprises a desalting step.

5. The method of claim 1, wherein in step (i), recombinant insulin glargine fusion protein inclusion bodies are isolated from the fermentation broth of the recombinant bacteria.

6. The method of claim 5, wherein in step (i), further comprising the step of denaturing and renaturing the inclusion bodies to obtain a recombinant insulin glargine fusion protein with correct protein folding.

7. The method of claim 1, wherein in step (ii), succinic acid and L-lysine are added as digestion aids during said digestion.

8. The method of claim 1, wherein in step (iii), the deprotection treatment is performed using hydrochloric acid.

9. The method of claim 1, wherein said TEV is a TEV enzymatic cleavage site.

10. The method of claim 1, wherein G is Boc-modified insulin glargine precursor having the structure of formula II:

GB-X-GA (II)

wherein,

X is absent or a connecting peptide;

11. The method of claim 1, wherein the recombinant insulin glargine fusion protein has the sequence as set forth in SEQ ID NO: 1.

12. A method of preparing an insulin glargine formulation comprising the steps of:

(iv) Purifying the mixed solution II to obtain insulin glargine, and performing precipitation and freeze-drying on the prepared insulin glargine to obtain a freeze-dried product, namely an insulin glargine preparation;

A-FP-TEV-R-G (I)

wherein,

"-" represents a peptide bond;

A is absent or a leader peptide;

FP is a green fluorescent protein folding unit;

TEV is an enzyme cleavage site;

R is arginine or lysine, used for enzyme digestion;

G is insulin glargine or an active fragment thereof;

β-sheet unit    Amino acid sequence
u1     VPILVELDGDVNG (SEQ ID NO: 11)
u2     HKFSVRGEGEGDAT (SEQ ID NO: 12)
u3     KLTLKFICTT (SEQ ID NO: 13)
u4     YVQERTISFKD (SEQ ID NO: 14)
u5     TYKTRAEVKFEGD (SEQ ID NO: 15)
u6     TLVNRIELKGIDF (SEQ ID NO: 16)
u7     HNVYITADKQ (SEQ ID NO: 17)
u8     GIKANFKIRHNVED (SEQ ID NO: 18)
u9     VQLADHYQQNTPIG (SEQ ID NO: 19)
u10    HYLSTQSVLSKD (SEQ ID NO: 20)
u11    HMVLLEFVTAAGI (SEQ ID NO: 21);

The green fluorescent protein folding unit is u2-u3, u4-u5, u1-u2-u3, u3-u4-u5 or u4-u5-u6.