CN110092835A

CN110092835A - A GLP-1 analog-COL3A1 fusion protein

Info

Publication number: CN110092835A
Application number: CN201810089639.XA
Authority: CN
Inventors: 陈国友
Original assignee: SHANGHAI HUIDUN BIOTECHNOLOGY Co Ltd
Current assignee: Shanghai Huidun Yintai Biotechnology Co ltd
Priority date: 2018-01-30
Filing date: 2018-01-30
Publication date: 2019-08-06

Abstract

The present invention relates to a GLP-1 analog-COL3A1 fusion protein. Specifically, the fusion protein comprises a glucagon-like peptide 1 (GLP-1) analog fused to the human type III collagen α1 chain. The fusion protein retains the blood-glucose-lowering effect of glucagon-like peptide 1 and has an extended in vivo half-life.

Description

A GLP-1 analog-COL3A1 fusion protein

Technical field

The present invention relates to the field of medicine, and more particularly to a GLP-1 analog-COL3A1 fusion protein.

Background art

Diabetes is a serious chronic disease caused mainly by hyperglycemia together with insufficient secretion and loss of function of endogenous insulin. It can lead to multiple complications affecting, for example, the vascular system, kidneys, retina, lens, peripheral nerves, and skin, and thereby reduces life expectancy and quality of life. With social and economic development worldwide and rising living standards, the incidence and prevalence of diabetes are increasing year by year, and diabetes has become the third most serious chronic disease threatening human health, after tumors and cardiovascular and cerebrovascular diseases. Diabetes is broadly divided into four categories: type I (insulin-dependent), type II (non-insulin-dependent), other types, and gestational diabetes. Type II diabetes is currently the most common form, accounting for about 90% of diabetic patients. It is a metabolic disease of complex etiology characterized by elevated blood glucose, and the number of type II diabetes patients in China has grown explosively over the past 20-plus years.

At present, type II diabetes is treated mainly with oral hypoglycemic agents, including sulfonylureas such as glibenclamide and gliclazide, as well as metformin and insulin. However, long-term use of these drugs can lead to tolerance, so that blood glucose and the disorder of cell function cannot be controlled over the long term. It is therefore particularly important to develop a safer and more effective new drug that targets the pathogenesis of diabetes.

Glucagon-like peptide 1 (GLP-1) is an incretin secreted by L cells of the gastrointestinal mucosa. The insulinotropic action of GLP-1 is glucose-dependent: when the blood glucose concentration is above normal, GLP-1 promotes insulin secretion, and when the blood glucose concentration is normal, this effect weakens. Exogenous GLP-1 treatment therefore does not increase the risk of hypoglycemia. In addition, GLP-1 binds to receptors on islet β cells, increasing insulin secretion and biosynthesis and effectively lowering blood glucose; it promotes islet β-cell proliferation, inhibits β-cell apoptosis, and increases glucose-dependent insulin secretion; it reduces gastrointestinal motility, delays gastric emptying, and reduces food intake; and it acts on the hypothalamus to reduce appetite and thereby body weight. Because of these properties, GLP-1 has become a focus in the development of new therapeutic agents for type II diabetes.

Native GLP-1 is extremely unstable in vivo: after release it is rapidly degraded by dipeptidyl peptidase 4 (DPP-4), and its in vivo half-life is only 1-2 min, so it is not druggable. Exenatide, the world's first GLP-1 receptor agonist drug, was launched in 2005; it is a GLP-1 analog derived from lizard saliva and shares 50% homology with human GLP-1. Liraglutide, launched later, is a synthetic GLP-1 analog with 97% homology to human GLP-1; compared with exenatide, its efficacy is improved and its side effects are substantially reduced. Like exenatide, however, liraglutide requires daily subcutaneous injection, which remains a drawback in terms of ease of use. Structural modification of GLP-1 to extend its in vivo half-life while retaining its biological effect has therefore become a research and development direction for GLP-1 drugs. Dulaglutide, a modified human GLP-1 fused to the Fc portion of human immunoglobulin IgG, exploits the long circulating half-life of IgG; its efficacy is not inferior to that of liraglutide, it can be administered once weekly, and it is the best product among current drugs of this class. However, the Fc segment of IgG in dulaglutide may have an ADCC (antibody-dependent cell-mediated cytotoxicity) effect, with potential immunogenicity and side reactions.

Therefore, there is a need in the art to develop a novel, safe drug that remains effective over the long term for the treatment of diabetes.

Summary of the invention

The object of the present invention is to provide a novel, safe drug that remains effective over the long term for the treatment of diabetes.

A first aspect of the present invention provides a fusion protein having the structure shown in formula I:

A-L-B (I)

In the formula, A is a GLP-1 analog, B is the human type III collagen α1 chain, L is absent or a linker peptide, and each "-" is independently a linker peptide or a peptide bond; and

the GLP-1 analog is a polypeptide having the amino acid sequence shown in SEQ ID NO.:1,

His-Xaa₈-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Xaa₂₂-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Lys-Gly-Arg-Xaa₃₇

wherein Xaa₈ is Gly or Ala, Xaa₂₂ is Glu or Gly, and Xaa₃₇ is Gly or absent.
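As an illustration of the sequence definition above, the three variable positions of SEQ ID NO.:1 can be expanded into the eight sequences they allow. The following is a minimal Python sketch and not part of the original disclosure; the one-letter template string and the name enumerate_analogs are assumptions introduced here for illustration only.

```python
from itertools import product

# SEQ ID NO.:1 in one-letter code; the three placeholders mark Xaa8, Xaa22 and Xaa37.
# (Hypothetical template written for this sketch.)
TEMPLATE = "H{x8}EGTFTSDVSSYLE{x22}QAAKEFIAWLVKGR{x37}"

# Allowed residues at each variable position; "" means the residue is absent.
OPTIONS = {
    "x8": ("G", "A"),
    "x22": ("E", "G"),
    "x37": ("G", ""),
}


def enumerate_analogs():
    """Return every sequence permitted by SEQ ID NO.:1 as a list of strings."""
    variants = []
    for x8, x22, x37 in product(OPTIONS["x8"], OPTIONS["x22"], OPTIONS["x37"]):
        variants.append(TEMPLATE.format(x8=x8, x22=x22, x37=x37))
    return variants


if __name__ == "__main__":
    for seq in enumerate_analogs():
        print(len(seq), seq)
```

Under these assumptions the list contains both the native GLP-1 (7-37) sequence (Ala, Gly, Gly at the three positions) and the mutated analog (Gly, Glu, absent).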

In another preferred example, the GLP-1 analog has, at positions corresponding to the sequence shown in SEQ ID NO.:6, a mutation of an amino acid selected from the group consisting of: alanine (Ala) at position 2, glycine (Gly) at position 16, glycine (Gly) at position 31, or a combination thereof.

In another preferred example, the GLP-1 analog has, relative to the sequence shown in SEQ ID NO.:6, a mutation selected from the group consisting of: A2G, G16E, deletion of G31, or a combination thereof.

In another preferred example, apart from said mutations (e.g., at amino acid positions 2, 16, and 31), the remaining amino acids of the GLP-1 analog are identical or substantially identical to the sequence shown in SEQ ID NO.:6.
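The mutations named above (A2G, G16E, deletion of G31) can be applied mechanically to SEQ ID NO.:6. The sketch below is illustrative only and assumes 1-based positions counted along SEQ ID NO.:6; the helper name apply_mutations and its argument format are hypothetical.

```python
# SEQ ID NO.:6 (native GLP-1 (7-37)) in one-letter code.
SEQ_ID_6 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"


def apply_mutations(seq, substitutions=(), deletions=()):
    """Apply point substitutions, then deletions, using 1-based positions.

    substitutions: iterable of (old_residue, position, new_residue)
    deletions:     iterable of (old_residue, position)
    All positions refer to the unmodified input sequence.
    """
    residues = list(seq)
    for old, pos, new in substitutions:
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}")
        residues[pos - 1] = new
    # Delete from the highest position down so earlier indices stay valid.
    for old, pos in sorted(deletions, key=lambda d: d[1], reverse=True):
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}")
        del residues[pos - 1]
    return "".join(residues)


mutated = apply_mutations(
    SEQ_ID_6,
    substitutions=[("A", 2, "G"), ("G", 16, "E")],
    deletions=[("G", 31)],
)
print(mutated)
```

With all three mutations applied, the result is the 30-residue sequence that the description gives as SEQ ID NO.:12.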

In another preferred example, the GLP-1 analog has the amino acid sequence as shown in SEQ ID NO.:6 or 12.

In another preferred example, the human type III collagen α1 chain (i.e., COL3A1) is the full-length COL3A1 protein or a fragment thereof, wherein the fragment is selected from the group consisting of: the COL3A1_598-896 fragment, the COL3A1_733-896 fragment, or a combination thereof.

In another preferred example, the COL3A1 full-length protein has the amino acid sequence shown in SEQ ID NO.:2.

In another preferred example, the COL3A1_598-896 fragment is the sequence of amino acids 598-896 of the COL3A1 protein.

In another preferred example, the COL3A1_598-896 fragment has the amino acid sequence shown in SEQ ID NO.:3.

In another preferred example, the COL3A1_733-896 fragment is the sequence of amino acids 733-896 of the COL3A1 protein.

In another preferred example, the COL3A1_733-896 fragment has the amino acid sequence shown in SEQ ID NO.:4.

In another preferred example, the human type III collagen α1 chain has the amino acid sequence shown in SEQ ID NO.:2, 3, or 4.

In another preferred example, the linker peptide is a polypeptide having n repeats of the sequence shown in SEQ ID NO.:5 (Gly-Gly-Gly-Gly-Ser) with an additional alanine (Ala) attached at its C-terminus, wherein n is 2-6, preferably 2.

In another preferred example, the linker peptide has the amino acid sequence shown in SEQ ID NO.:7.
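The linker definition, n repeats of Gly-Gly-Gly-Gly-Ser followed by a C-terminal Ala with n between 2 and 6, can be written as a short helper. This is a sketch under that stated definition only; the function name build_linker is an assumption made for illustration.

```python
REPEAT_UNIT = "GGGGS"  # SEQ ID NO.:5 in one-letter code
C_TERMINAL_CAP = "A"   # the additional alanine at the C-terminus


def build_linker(n=2):
    """Return the linker (GGGGS)n-A in one-letter code for 2 <= n <= 6."""
    if not 2 <= n <= 6:
        raise ValueError("n must be between 2 and 6")
    return REPEAT_UNIT * n + C_TERMINAL_CAP


if __name__ == "__main__":
    for n in range(2, 7):
        linker = build_linker(n)
        print(n, len(linker), linker)
```

For the preferred value n = 2 the helper returns the 11-residue sequence given in the description as SEQ ID NO.:7.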

In another preferred example, the fusion protein has amino acid sequence shown in SEQ ID NO.:8,9 or 13.

A second aspect of the present invention provides an oligomer comprising the fusion protein of the first aspect of the present invention.

In another preferred example, the oligomer is a dimer, trimer, tetramer, or pentamer.

In another preferred example, the oligomer is a dimer of the fusion protein of the first aspect of the present invention.

A third aspect of the present invention provides an isolated polynucleotide encoding the fusion protein of the first aspect of the present invention.

In another preferred example, the coding sequence of the GLP-1 analog is as shown in positions 1-90 of SEQ ID NO.:10.

In another preferred example, the coding sequence of the COL3A1_598-896 fragment is as shown in positions 139-1035 of SEQ ID NO.:10.

In another preferred example, the coding sequence of the COL3A1_733-896 fragment is as shown in positions 139-630 of SEQ ID NO.:11.
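The nucleotide ranges stated above can be cross-checked against the residue ranges of the encoded parts with simple arithmetic. The sketch below assumes that each stated range consists of whole codons and contains no stop codon; the function names are hypothetical.

```python
def residues_encoded(first_nt, last_nt):
    """Number of codons in an inclusive, 1-based nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


def fragment_length(first_residue, last_residue):
    """Number of residues in an inclusive, 1-based residue range."""
    return last_residue - first_residue + 1


if __name__ == "__main__":
    # GLP-1 analog: nucleotides 1-90 of SEQ ID NO.:10
    print(residues_encoded(1, 90))
    # COL3A1_598-896: nucleotides 139-1035 of SEQ ID NO.:10
    print(residues_encoded(139, 1035), fragment_length(598, 896))
    # COL3A1_733-896: nucleotides 139-630 of SEQ ID NO.:11
    print(residues_encoded(139, 630), fragment_length(733, 896))
```

Under these assumptions the codon counts agree with the residue ranges: 30 codons for the 30-residue mutated analog, 299 for residues 598-896, and 164 for residues 733-896.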

In another preferred example, the polynucleotides have the sequence as shown in SEQ ID NO.:10 or 11.

A fourth aspect of the present invention provides a vector comprising the polynucleotide of the third aspect of the present invention.

In another preferred example, the vector is selected from the group consisting of: DNA, RNA, a plasmid, a lentiviral vector, an adenoviral vector, a retroviral vector, a transposon, or a combination thereof.

In another preferred example, the vector is a plasmid, preferably the pUC57 plasmid.

A fifth aspect of the present invention provides a host cell, wherein the host cell contains the vector of the fourth aspect of the present invention, or has the exogenous polynucleotide of the third aspect of the present invention integrated into its chromosome, or expresses the fusion protein of the first aspect of the present invention or the oligomer of the second aspect of the present invention.

In another preferred example, the host cell is a yeast, preferably Pichia pastoris, more preferably Pichia pastoris GS115 cells.

A sixth aspect of the present invention provides a pharmaceutical composition comprising the fusion protein of the first aspect of the present invention or the oligomer of the second aspect of the present invention, and a pharmaceutically acceptable carrier or excipient.

In another preferred example, the pharmaceutical composition is used for treating non-insulin-dependent diabetes mellitus or a related disease.

A seventh aspect of the present invention provides use of the fusion protein of the first aspect of the present invention, the oligomer of the second aspect, the polynucleotide of the third aspect, the vector of the fourth aspect, or the host cell of the fifth aspect in the preparation of a drug or preparation for preventing and/or treating diabetes.

In another preferred example, the diabetes is non-insulin-dependent diabetes mellitus or a related disease.

An eighth aspect of the present invention provides use of the fusion protein of the first aspect of the present invention, the oligomer of the second aspect, or the pharmaceutical composition of the sixth aspect for preventing and/or treating diabetes, preferably non-insulin-dependent diabetes mellitus or a related disease.

A ninth aspect of the present invention provides a method for treating a disease, comprising administering to a subject in need of treatment an appropriate amount of the fusion protein of the first aspect of the present invention, the oligomer of the second aspect, or the pharmaceutical composition of the sixth aspect.

In another preferred example, the disease is non-insulin-dependent diabetes mellitus or a related disease.

It should be understood that, within the scope of the present invention, the technical features described above and the technical features specifically described below (e.g., in the examples) may be combined with one another to form new or preferred technical solutions. For reasons of space, these combinations are not described one by one here.

Brief description of the drawings

Fig. 1 shows the construction map of the pPic9m-GLP-COL-1 expression plasmid.

Fig. 2 shows the construction map of the pPic9m-GLP-COL-2 plasmid.

Fig. 3 shows the electrophoretic purity analysis of the purified GLP-1-2L-COL_598-896 and GLP-1-2L-COL_733-896 fusion proteins.

Fig. 4 shows the GLP-1R receptor activation activity analysis of the GLP-1-2L-COL_598-896 and GLP-1-2L-COL_733-896 fusion proteins.

Fig. 5 shows the in vivo half-life analysis of the GLP-1-2L-COL_598-896 and GLP-1-2L-COL_733-896 fusion proteins.

Detailed description of the embodiments

After extensive and in-depth research, the present inventors unexpectedly obtained a safe drug that is effective over the long term for treating non-insulin-dependent diabetes. The drug is a GLP-1 analog-COL3A1 fusion protein comprising a glucagon-like peptide 1 analog fused to the human type III collagen α1 chain; the fusion protein retains the blood-glucose-lowering effect of glucagon-like peptide 1 and has an extended in vivo half-life. The present invention also mutates the glucagon-like peptide 1 analog, which reduces its sensitivity to DPP-4 hydrolysis, improves its activity, and reduces its immunogenicity. The present invention further identifies two human COL3A1 fragments (shown in SEQ ID NO.:3 and SEQ ID NO.:4) that retain the ability of human COL3A1 to form homotrimers while facilitating recombinant expression of the heterologous fusion protein, avoiding the expression difficulties caused by an excessively long amino acid sequence. The fusion protein of the present invention can be used to treat type II diabetes and various related conditions. The present invention was completed on this basis.

Fusion protein

As used herein, "fusion protein of the present invention", "recombinant fusion protein", and "polypeptide" all refer to the fusion protein of the first aspect of the present invention. The structure of the fusion protein of the present invention is shown in formula I:

A-L-B (I)

As used herein, the term "fusion protein" also includes variant forms of the sequences of SEQ ID NO.:8, 9, or 13 that have the above activity. These variant forms include (but are not limited to): deletion, insertion, and/or substitution of 1-3 (usually 1-2, more preferably 1) amino acids, and addition or deletion of one or several (usually up to 3, preferably up to 2, more preferably up to 1) amino acids at the C-terminus and/or N-terminus. For example, in the art, substitution with amino acids of similar properties usually does not change the function of a protein. Likewise, addition or deletion of one or several amino acids at the C-terminus and/or N-terminus generally does not change the structure or function of a protein. In addition, the term includes the polypeptide of the present invention in monomeric and multimeric forms, as well as linear and nonlinear polypeptides (e.g., cyclic peptides).

The present invention also includes active fragments, derivatives, and analogs of the above fusion protein. As used herein, the terms "fragment", "derivative", and "analog" refer to a polypeptide that substantially retains the function or activity of the fusion protein of the present invention. A polypeptide fragment, derivative, or analog of the present invention may be (i) a polypeptide in which one or several conservative or non-conservative amino acid residues (preferably conservative amino acid residues) are substituted, or (ii) a polypeptide having a substituent group on one or more amino acid residues, or (iii) a polypeptide formed by fusing the polypeptide with another compound (such as a compound that extends the half-life of the polypeptide, for example polyethylene glycol), or (iv) a polypeptide formed by fusing an additional amino acid sequence to the polypeptide sequence (a fusion protein formed with a leader sequence, a secretory sequence, or a tag sequence such as 6His). In light of the teachings herein, such fragments, derivatives, and analogs are within the purview of those skilled in the art.

A preferred class of active derivatives refers to polypeptides in which, compared with the amino acid sequence of formula I, at most 3, preferably at most 2, more preferably at most 1 amino acid is replaced by an amino acid of similar or analogous properties. These conservative variant polypeptides are preferably produced by amino acid substitution according to Table A.

Table A

Initial residue	Representative substitutions	Preferred substitution
Ala (A)	Val; Leu; Ile	Val
Arg (R)	Lys; Gln; Asn	Lys
Asn (N)	Gln; His; Lys; Arg	Gln
Asp (D)	Glu	Glu
Cys (C)	Ser	Ser
Gln (Q)	Asn	Asn
Glu (E)	Asp	Asp
Gly (G)	Pro; Ala	Ala
His (H)	Asn; Gln; Lys; Arg	Arg
Ile (I)	Leu; Val; Met; Ala; Phe	Leu
Leu (L)	Ile; Val; Met; Ala; Phe	Ile
Lys (K)	Arg; Gln; Asn	Arg
Met (M)	Leu; Phe; Ile	Leu
Phe (F)	Leu; Val; Ile; Ala; Tyr	Leu
Pro (P)	Ala	Ala
Ser (S)	Thr	Thr
Thr (T)	Ser	Ser
Trp (W)	Tyr; Phe	Tyr
Tyr (Y)	Trp; Phe; Thr; Ser	Phe
Val (V)	Ile; Leu; Met; Phe; Ala	Leu
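Table A can be encoded as a lookup so that a proposed substitution is checked against it. The following Python sketch transcribes the table into one-letter codes; the names TABLE_A, is_representative, and preferred_substitution are assumptions introduced for illustration and do not appear in the original.

```python
# Table A in one-letter code: residue -> (representative substitutions, preferred).
TABLE_A = {
    "A": ("VLI", "V"),
    "R": ("KQN", "K"),
    "N": ("QHKR", "Q"),
    "D": ("E", "E"),
    "C": ("S", "S"),
    "Q": ("N", "N"),
    "E": ("D", "D"),
    "G": ("PA", "A"),
    "H": ("NQKR", "R"),
    "I": ("LVMAF", "L"),
    "L": ("IVMAF", "I"),
    "K": ("RQN", "R"),
    "M": ("LFI", "L"),
    "F": ("LVIAY", "L"),
    "P": ("A", "A"),
    "S": ("T", "T"),
    "T": ("S", "S"),
    "W": ("YF", "Y"),
    "Y": ("WFTS", "F"),
    "V": ("ILMFA", "L"),
}


def is_representative(original, replacement):
    """True if Table A lists `replacement` as a substitution for `original`."""
    return replacement in TABLE_A[original][0]


def preferred_substitution(original):
    """Return the preferred substitution that Table A gives for `original`."""
    return TABLE_A[original][1]


if __name__ == "__main__":
    print(is_representative("G", "A"), preferred_substitution("G"))
```

Every preferred substitution in the table is also one of the representative substitutions for the same residue, which the lookup makes easy to verify.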

The present invention also provides analogs of the fusion protein of the present invention. These analogs may differ from the polypeptides shown in SEQ ID NO.:8, 9, or 13 in amino acid sequence, in modifications that do not affect the sequence, or in both. Analogs also include those having residues other than natural L-amino acids (e.g., D-amino acids), and those having non-naturally occurring or synthetic amino acids (e.g., β- or γ-amino acids). It should be understood that the polypeptides of the present invention are not limited to the representative polypeptides listed above.

Modified forms (which usually do not change the primary structure) include chemically derivatized forms of the polypeptide in vivo or in vitro, such as acetylation or carboxylation. Modifications also include glycosylation, such as polypeptides produced by glycosylation modification during synthesis and processing of the polypeptide or in further processing steps. Such modification can be accomplished by exposing the polypeptide to an enzyme that performs glycosylation (such as a mammalian glycosylase or deglycosylase). Modified forms also include sequences having phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are polypeptides modified to improve their resistance to proteolysis or to optimize their solubility.

The compound of the present invention comprises a heterologous fusion protein in which the first polypeptide is a GLP-1 analog whose sequence is selected from SEQ ID NO.:1,

wherein Xaa₈ is Gly or Ala;

wherein Xaa₂₂ is Glu or Gly;

wherein Xaa₃₇ is Gly or is deleted.

The second polypeptide is the full-length human type III collagen α1 chain (i.e., COL3A1) or a fragment thereof, whose sequence is selected from

(a) overall length COL3A1 (SEQ ID NO.:2)

Met-Met-Ser-Phe-Val-Gln-Lys-Gly-Ser--Trp-Leu-Leu-Leu-Ala-Leu-Leu-His- Pro--Thr-Ile-Ile-Leu-Ala-Gln-Gln-Glu-Ala-Val-Glu-Gly-Gly-Cys-Ser-His-Leu-Gly- Gln-Ser--Tyr-Ala-Asp-Arg-Asp-Val--Trp-Lys-Pro-Glu-Pro-Cys-Gln-Ile-Cys-Val- Cys-Asp-Ser-Gly-Ser-Val-Leu-Cys-Asp-Asp-Ile-Ile-Cys-Asp-Asp-Gln-Glu-Leu-Asp- Cys-Pro-Asn-Pro-Glu-Ile-Pro-Phe-Gly-Glu-Cys-Cys-Ala-Val-Cys-Pro-Gln-Pro-Pro-- Thr-Ala-Pro--Thr-Arg-Pro-Pro-Asn-Gly-Gln-Gly-Pro-Gln-Gly-Pro-Lys-Gly-Asp-Pro- Gly-Pro-Pro-Gly-Ile-Pro-Gly-Arg-Asn-Gly-Asp-Pro-Gly-Ile-Pro-Gly-Gln-Pro-Gly- Ser-Pro-Gly-Ser-Pro-Gly-Pro-Pro-Gly-Ile-Cys-Glu-Ser-Cys-Pro--Thr-Gly-Pro-Gln- Asn--Tyr-Ser-Pro-Gln--Tyr-Asp-Ser--Tyr-Asp-Val-Lys-Ser-Gly-Val-Ala-Val-Gly- Gly-Leu-Ala-Gly--Tyr-Pro-Gly-Pro-Ala-Gly-Pro-Pro-Gly-Pro-Pro-Gly-Pro-Pro- Gly--Thr-Ser-Gly-His-Pro-Gly-Ser-Pro-Gly-Ser-Pro-Gly--Tyr-Gln-Gly-Pro-Pro- Gly-Glu-Pro-Gly-Gln-Ala-Gly-Pro-Ser-Gly-Pro-Pro-Gly-Pro-Pro-Gly-Ala-Ile-Gly- Pro-Ser-Gly-Pro-Ala-Gly-Lys-Asp-Gly-Glu-Ser-Gly-Arg-Pro-Gly-Arg-Pro-Gly-Glu- Arg-Gly-Leu-Pro-Gly-Pro-Pro-Gly-Ile-Lys-Gly-Pro-Ala-Gly-Ile-Pro-Gly-Phe-Pro- Gly-Met-Lys-Gly-His-Arg-Gly-Phe-Asp-Gly-Arg-Asn-Gly-Glu-Lys-Gly-Glu--Thr-Gly- Ala-Pro-Gly-Leu-Lys-Gly-Glu-Asn-Gly-Leu-Pro-Gly-Glu-Asn-Gly-Ala-Pro-Gly-Pro- Met-Gly-Pro-Arg-Gly-Ala-Pro-Gly-Glu-Arg-Gly-Arg-Pro-Gly-Leu-Pro-Gly-Ala-Ala- Gly-Ala-Arg-Gly-Asn-Asp-Gly-Ala-Arg-Gly-Ser-Asp-Gly-Gln-Pro-Gly-Pro-Pro-Gly- Pro-Pro-Gly--Thr-Ala-Gly-Phe-Pro-Gly-Ser-Pro-Gly-Ala-Lys-Gly-Glu-Val-Gly-Pro- Ala-Gly-Ser-Pro-Gly-Ser-Asn-Gly-Ala-Pro-Gly-Gln-Arg-Gly-Glu-Pro-Gly-Pro-Gln- Gly-His-Ala-Gly-Ala-Gln-Gly-Pro-Pro-Gly-Pro-Pro-Gly-Ile-Asn-Gly-Ser-Pro-Gly- Gly-Lys-Gly-Glu-Met-Gly-Pro-Ala-Gly-Ile-Pro-Gly-Ala-Pro-Gly-Leu-Met-Gly-Ala- Arg-Gly-Pro-Pro-Gly-Pro-Ala-Gly-Ala-Asn-Gly-Ala-Pro-Gly-Leu-Arg-Gly-Gly-Ala- Gly-Glu-Pro-Gly-Lys-Asn-Gly-Ala-Lys-Gly-Glu-Pro-Gly-Pro-Arg-Gly-Glu-Arg-Gly- Glu-Ala-Gly-Ile-Pro-Gly-Val-Pro-Gly-Ala-Lys-Gly-Glu-Asp-Gly-Lys-Asp-Gly-Ser- Pro-Gly-Glu-Pro-Gly-Ala-Asn-Gly-Leu-Pro-Gly-Ala-Ala-Gly-Glu-Arg-Gly-Ala-Pro- 
Gly-Phe-Arg-Gly-Pro-Ala-Gly-Pro-Asn-Gly-Ile-Pro-Gly-Glu-Lys-Gly-Pro-Ala-Gly- Glu-Arg-Gly-Ala-Pro-Gly-Pro-Ala-Gly-Pro-Arg-Gly-Ala-Ala-Gly-Glu-Pro-Gly-Arg- Asp-Gly-Val-Pro-Gly-Gly-Pro-Gly-Met-Arg-Gly-Met-Pro-Gly-Ser-Pro-Gly-Gly-Pro- Gly-Ser-Asp-Gly-Lys-Pro-Gly-Pro-Pro-Gly-Ser-Gln-Gly-Glu-Ser-Gly-Arg-Pro-Gly- Pro-Pro-Gly-Pro-Ser-Gly-Pro-Arg-Gly-Gln-Pro-Gly-Val-Met-Gly-Phe-Pro-Gly-Pro- Lys-Gly-Asn-Asp-Gly-Ala-Pro-Gly-Lys-Asn-Gly-Glu-Arg-Gly-Gly-Pro-Gly-Gly-Pro- Gly-Pro-Gln-Gly-Pro-Pro-Gly-Lys-Asn-Gly-Glu--Thr-Gly-Pro-Gln-Gly-Pro-Pro-Gly- Pro--Thr-Gly-Pro-Gly-Gly-Asp-Lys-Gly-Asp--Thr-Gly-Pro-Pro-Gly-Pro-Gln-Gly- Leu-Gln-Gly-Leu-Pro-Gly--Thr-Gly-Gly-Pro-Pro-Gly-Glu-Asn-Gly-Lys-Pro-Gly-Glu- Pro-Gly-Pro-Lys-Gly-Asp-Ala-Gly-Ala-Pro-Gly-Ala-Pro-Gly-Gly-Lys-Gly-Asp-Ala- Gly-Ala-Pro-Gly-Glu-Arg-Gly-Pro-Pro-Gly-Leu-Ala-Gly-Ala-Pro-Gly-Leu-Arg-Gly- Gly-Ala-Gly-Pro-Pro-Gly-Pro-Glu-Gly-Gly-Lys-Gly-Ala-Ala-Gly-Pro-Pro-Gly-Pro- Pro-Gly-Ala-Ala-Gly--Thr-Pro-Gly-Leu-Gln-Gly-Met-Pro-Gly-Glu-Arg-Gly-Gly-Leu- Gly-Ser-Pro-Gly-Pro-Lys-Gly-Asp-Lys-Gly-Glu-Pro-Gly-Gly-Pro-Gly-Ala-Asp-Gly- Val-Pro-Gly-Lys-Asp-Gly-Pro-Arg-Gly-Pro--Thr-Gly-Pro-Ile-Gly-Pro-Pro-Gly-Pro- Ala-Gly-Gln-Pro-Gly-Asp-Lys-Gly-Glu-Gly-Gly-Ala-Pro-Gly-Leu-Pro-Gly-Ile-Ala- Gly-Pro-Arg-Gly-Ser-Pro-Gly-Glu-Arg-Gly-Glu--Thr-Gly-Pro-Pro-Gly-Pro-Ala-Gly- Phe-Pro-Gly-Ala-Pro-Gly-Gln-Asn-Gly-Glu-Pro-Gly-Gly-Lys-Gly-Glu-Arg-Gly-Ala- Pro-Gly-Glu-Lys-Gly-Glu-Gly-Gly-Pro-Pro-Gly-Val-Ala-Gly-Pro-Pro-Gly-Lys-Asp- Gly--Thr-Ser-Gly-His-Pro-Gly-Pro-Ile-Gly-Pro-Pro-Gly-Pro-Arg-Gly-Asn-Arg-Gly- Glu-Arg-Gly-Ser-Glu-Gly-Ser-Pro-Gly-His-Pro-Gly-Gln-Pro-Gly-Pro-Pro-Gly-Pro- Pro-Gly-Ala-Pro-Gly-Pro-Cys-Cys-Gly-Gly-Val-Gly-Ala-Ala-Ala-Ile-Ala-Gly-Ile- Gly-Gly-Glu-Lys-Ala-Gly-Gly-Phe-Ala-Pro--Tyr--Tyr-Gly-Asp-Glu-Pro-Met-Asp- Phe-Lys-Ile-Asn--Thr-Asp-Glu-Ile-Met--Thr-Ser-Leu-Lys-Ser-Val-Asn-Gly-Gln- Ile-Glu-Ser-Leu-Ile-Ser-Pro-Asp-Gly-Ser-Arg-Lys-Asn-Pro-Ala-Arg-Asn-Cys-Arg- Asp-Leu-Lys-Phe-Cys-His-Pro-Glu-Leu-Lys-Ser-Gly-Glu--Tyr--Trp-Val-Asp-Pro- 
Asn-Gln-Gly-Cys-Lys-Leu-Asp-Ala-Ile-Lys-Val-Phe-Cys-Asn-Met-Glu--Thr-Gly- Glu--Thr-Cys-Ile-Ser-Ala-Asn-Pro-Leu-Asn-Val-Pro-Arg-Lys-His--Trp--Trp--Thr- Asp-Ser-Ser-Ala-Glu-Lys-Lys-His-Val--Trp-Phe-Gly-Glu-Ser-Met-Asp-Gly-Gly-Phe- Gln-Phe-Ser--Tyr-Gly-Asn-Pro-Glu-Leu-Pro-Glu-Asp-Val-Leu-Asp-Val-Gln-Leu-Ala- Phe-Leu-Arg-Leu-Leu-Ser-Ser-Arg-Ala-Ser-Gln-Asn-Ile--Thr--Tyr-His-Cys-Lys- Asn-Ser-Ile-Ala--Tyr-Met-Asp-Gln-Ala-Ser-Gly-Asn-Val-Lys-Lys-Ala-Leu-Lys-Leu- Met-Gly-Ser-Asn-Glu-Gly-Glu-Phe-Lys-Ala-Glu-Gly-Asn-Ser-Lys-Phe--Thr--Tyr-- Thr-Val-Leu-Glu-Asp-Gly-Cys--Thr-Lys-His--Thr-Gly-Glu--Trp-Ser-Lys--Thr-Val- Phe-Glu--Tyr-Arg--Thr-Arg-Lys-Ala-Val-Arg-Leu-Pro-Ile-Val-Asp-Ile-Ala-Pro-- Tyr-Asp-Ile-Gly-Gly-Pro-Asp-Gln-Glu-Phe-Gly-Val-Asp-Val-Gly-Pro-Val-Cys-Phe- Leu

(b) COL3A1_598-896 (SEQ ID NO.:3)

Gly-Pro-Gly-Gly-Pro-Gly-Pro-Gln-Gly-Pro-Pro-Gly-Lys-Asn-Gly-Glu-Thr-Gly-Pro-Gln-Gly-Pro-Pro-Gly-Pro-Thr-Gly-Pro-Gly-Gly-Asp-Lys-Gly-Asp-Thr-Gly-Pro-Pro-Gly-Pro-Gln-Gly-Leu-Gln-Gly-Leu-Pro-Gly-Thr-Gly-Gly-Pro-Pro-Gly-Glu-Asn-Gly-Lys-Pro-Gly-Glu-Pro-Gly-Pro-Lys-Gly-Asp-Ala-Gly-Ala-Pro-Gly-Ala-Pro-Gly-Gly-Lys-Gly-Asp-Ala-Gly-Ala-Pro-Gly-Glu-Arg-Gly-Pro-Pro-Gly-Leu-Ala-Gly-Ala-Pro-Gly-Leu-Arg-Gly-Gly-Ala-Gly-Pro-Pro-Gly-Pro-Glu-Gly-Gly-Lys-Gly-Ala-Ala-Gly-Pro-Pro-Gly-Pro-Pro-Gly-Ala-Ala-Gly-Thr-Pro-Gly-Leu-Gln-Gly-Met-Pro-Gly-Glu-Arg-Gly-Gly-Leu-Gly-Ser-Pro-Lys-Gly-Asp-Lys-Gly-Glu-Pro-Gly-Gly-Pro-Gly-Ala-Asp-Gly-Val-Pro-Gly-Lys-Asp-Gly-Pro-Arg-Gly-Pro-Thr-Gly-Pro-Ile-Gly-Pro-Pro-Gly-Pro-Ala-Gly-Gln-Pro-Gly-Asp-Lys-Gly-Glu-Gly-Gly-Ala-Pro-Gly-Leu-Pro-Gly-Ile-Ala-Gly-Pro-Arg-Gly-Ser-Pro-Gly-Glu-Arg-Gly-Glu-Thr-Gly-Pro-Pro-Gly-Pro-Ala-Gly-Phe-Pro-Gly-Ala-Pro-Gly-Gln-Asn-Gly-Glu-Pro-Gly-Gly-Lys-Gly-Glu-Arg-Gly-Ala-Pro-Gly-Glu-Lys-Gly-Glu-Gly-Gly-Pro-Pro-Gly-Val-Ala-Gly-Pro-Pro-Gly-Lys-Asp-Gly-Thr-Ser-Gly-His-Pro-Gly-Pro-Ile-Gly-Pro-Pro-Gly-Pro-Arg-Gly-Asn-Arg-Gly-Glu-Arg-Gly-Ser-Glu-Gly-Ser-Pro-Gly-His-Pro-Gly-Gln-Pro-Gly-Pro-Pro-Gly-Pro-Pro-Gly-Ala-Pro-Gly-Pro-Cys-Cys-Gly-Gly

(c) COL3A1_733-896 (SEQ ID NO.:4)

Gly-Leu-Gly-Ser-Pro-Lys-Gly-Asp-Lys-Gly-Glu-Pro-Gly-Gly-Pro-Gly-Ala-Asp-Gly-Val-Pro-Gly-Lys-Asp-Gly-Pro-Arg-Gly-Pro-Thr-Gly-Pro-Ile-Gly-Pro-Pro-Gly-Pro-Ala-Gly-Gln-Pro-Gly-Asp-Lys-Gly-Glu-Gly-Gly-Ala-Pro-Gly-Leu-Pro-Gly-Ile-Ala-Gly-Pro-Arg-Gly-Ser-Pro-Gly-Glu-Arg-Gly-Glu-Thr-Gly-Pro-Pro-Gly-Pro-Ala-Gly-Phe-Pro-Gly-Ala-Pro-Gly-Gln-Asn-Gly-Glu-Pro-Gly-Gly-Lys-Gly-Glu-Arg-Gly-Ala-Pro-Gly-Glu-Lys-Gly-Glu-Gly-Gly-Pro-Pro-Gly-Val-Ala-Gly-Pro-Pro-Gly-Lys-Asp-Gly-Thr-Ser-Gly-His-Pro-Gly-Pro-Ile-Gly-Pro-Pro-Gly-Pro-Arg-Gly-Asn-Arg-Gly-Glu-Arg-Gly-Ser-Glu-Gly-Ser-Pro-Gly-His-Pro-Gly-Gln-Pro-Gly-Pro-Pro-Gly-Pro-Pro-Gly-Ala-Pro-Gly-Pro-Cys-Cys-Gly-Gly

In the heterologous fusion protein of the present invention, the C-terminus of the GLP-1 analog polypeptide and the N-terminus of the human COL3A1 fragment are preferably fused together via a glycine-rich peptide linker (i.e., linker peptide), wherein the peptide linker has the sequence [Gly-Gly-Gly-Gly-Ser (SEQ ID NO.:5)]_n-Ala, where n is 2-6, preferably 2.

The heterologous fusion protein of the present invention comprises a GLP-1 analog portion and a human COL3A1 fragment portion. Through partial substitution of the native GLP-1 sequence and fusion with a human COL3A1 fragment, the fusion protein retains native GLP-1 activity while forming stable oligomers via the human COL3A1 region, which increases its in vivo stability.

Native GLP-1 is processed in vivo into the active fragment AA7-AA37. Therefore, following the convention in the art, the amino terminus of GLP-1 is designated position 7 and the carboxyl terminus position 37. The other amino acids in the polypeptide are numbered consecutively, as shown in SEQ ID NO.:6.

⁷His-Ala-Glu-¹⁰Gly-Thr-Phe-Thr-Ser-¹⁵Asp-Val-Ser-Ser-Tyr-²⁰Leu-Glu-Gly-Gln-Ala-²⁵Ala-Lys-Glu-Phe-Ile-³⁰Ala-Trp-Leu-Val-Lys-³⁵Gly-Arg-³⁷Gly

(SEQ ID NO.:6)
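Because the text uses two numbering schemes for the same residues (positions 7-37 by convention, and positions 1-31 counted along SEQ ID NO.:6), a small conversion helper may be useful. This is an illustrative sketch; the function names are hypothetical.

```python
# SEQ ID NO.:6 (native GLP-1 (7-37)) in one-letter code.
SEQ_ID_6 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"
FIRST_POSITION = 7   # conventional number of the N-terminal His
LAST_POSITION = 37   # conventional number of the C-terminal Gly


def to_sequence_index(glp1_position):
    """Convert conventional GLP-1 numbering (7-37) to a 1-based index in SEQ ID NO.:6."""
    if not FIRST_POSITION <= glp1_position <= LAST_POSITION:
        raise ValueError("position must be between 7 and 37")
    return glp1_position - FIRST_POSITION + 1


def residue_at(glp1_position):
    """Return the one-letter residue found at a conventional GLP-1 position."""
    return SEQ_ID_6[to_sequence_index(glp1_position) - 1]


if __name__ == "__main__":
    for position in (8, 22, 37):
        print(position, to_sequence_index(position), residue_at(position))
```

Conventional positions 8, 22, and 37 correspond to positions 2, 16, and 31 of SEQ ID NO.:6, which is how the mutation names A2G, G16E, and deletion of G31 relate to this numbering.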

Relative to native GLP-1 (7-37), the GLP-1 analog portion of the heterologous fusion protein contains three primary modifications, at positions 8, 22, and 37. Endogenous dipeptidyl peptidase 4 (DPP-4) cleaves native GLP-1 between Ala at position 8 and Glu at position 9, producing the inactive GLP-1 (9-37) fragment; replacing position 8 with Gly reduces the sensitivity of the GLP-1 analog to DPP-4 hydrolysis. The substitution at position 22 improves the activity of the GLP-1 analog. Removal of position 37 reduces the immunogenicity of the fusion protein. The mutated sequence is shown in SEQ ID NO.:12.

⁷His-Gly-Glu-¹⁰Gly-Thr-Phe-Thr-Ser-¹⁵Asp-Val-Ser-Ser-Tyr-²⁰Leu-Glu-Glu-Gln-Ala-²⁵Ala-Lys-Glu-Phe-Ile-³⁰Ala-Trp-Leu-Val-Lys-³⁵Gly-Arg (SEQ ID NO.:12)

The heterologous fusion protein of the present invention contains human COL3A1 or a fragment thereof. In molecular structure, type III collagen is composed of parallel linear chains, each of which is an extremely strong right-handed triple helix formed by three left-handed twisted α1 chains bound tightly through interchain interactions. Each type III collagen α1 chain consists of more than 300 Gly-X-Y triplet repeats, and this triplet structure is the key to homotrimer formation by the type III collagen α1 chain. Therefore, in addition to full-length human COL3A1 (SEQ ID NO.:2), and in order to avoid the recombinant expression difficulties caused by an excessively long amino acid sequence, the present invention prefers two human COL3A1 fragments, whose sequences are shown in SEQ ID NO.:3 and SEQ ID NO.:4, respectively. The two fragments contain Gly-X-Y triplet domains of different lengths; they retain the ability of human COL3A1 to form homotrimers while facilitating recombinant expression of the heterologous fusion protein of the present invention.
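The Gly-X-Y periodicity described above can be checked on a sequence segment by counting how many consecutive triplets begin with glycine. The sketch below is illustrative only; the 36-residue test segment is taken from the stretch ending just before the C-terminal Gly-Pro-Cys-Cys-Gly-Gly of SEQ ID NO.:4, and the function names are assumptions.

```python
# A 36-residue stretch from the C-terminal region of SEQ ID NO.:4 (one-letter code).
SEGMENT = "GPIGPPGPRGNRGERGSEGSPGHPGQPGPPGPPGAP"


def gly_first_fraction(seq, frame=0):
    """Fraction of complete triplets, read from `frame`, whose first residue is Gly."""
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not triplets:
        return 0.0
    return sum(t[0] == "G" for t in triplets) / len(triplets)


def best_frame(seq):
    """Return the reading frame (0, 1 or 2) with the highest Gly-first fraction."""
    return max(range(3), key=lambda frame: gly_first_fraction(seq, frame))


if __name__ == "__main__":
    print(gly_first_fraction(SEGMENT))
    print(best_frame("A" + SEGMENT))
```

Scanning all three frames matters because a fragment need not start on the first residue of a triplet; the frame with the highest fraction indicates where the repeat is in register.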

The C-terminal amino acid of the GLP-1 analog portion in the present invention is preferably fused to the N-terminus of the human COL3A1 fragment via the glycine-rich linker peptide [Gly-Gly-Gly-Gly-Ser (SEQ ID NO.:5)]_n-Ala. Adding a peptide linker prevents potential mutual interference between the two domains and improves the stability of the heterologous fusion protein. In addition, the glycine-rich linker provides a degree of structural flexibility, allowing the GLP-1 analog portion to interact effectively with GLP-1 receptor molecules on target cells such as pancreatic β cells, which favors its biological activity. In the linker peptide [Gly-Gly-Gly-Gly-Ser (SEQ ID NO.:5)]_n-Ala of the present invention, the repeat number n is ≥ 2, but an excessively long linker peptide is detrimental to the stability of the fusion protein and may increase potential immunogenicity. The preferred linker peptide therefore comprises the sequence:

Gly-Gly-Gly-Gly-Ser-Gly-Gly-Gly-Gly-Ser-Ala (SEQ ID NO.:7)

Accordingly, preferred GLP-1-COL3A1 heterologous fusion proteins of the present invention include the following proteins:

(a)GLP-1-2L-COL_598-896(SEQ ID NO.:8)

HGEGTFTSDVSSYLEEQAAKEFIAWLVKGRGGGGSGGGGSGGGGSAGPGGPGPQGPPGKNGETGPQGPP GPTGPGGDKGDTGPPGPQGLQGLPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGG AGPPGPEGGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPKGDKGEPGGPGADGVPGKDGPRGPTGPIGPPGPAG QPGDKGEGGAPGLPGIAGPRGSPGERGETGPPGPAGFPGAPGQNGEPGGKGERGAPGEKGEGGPPGVAGPPGKDGTS GHPGPIGPPGPRGNRGERGSEGSPGHPGQPGPPGPPGAPGPCCGG

(b)GLP-1-2L-COL_733-896(SEQ ID NO.:9)

HGEGTFTSDVSSYLEEQAAKEFIAWLVKGRGGGGSGGGGSGGGGSAGLGSPGPKGDKGEPGGPGADGVP GKDGPRGPTGPIGPPGPAGQPGDKGEGGAPGLPGIAGPRGSPGERGETGPPGPAGFPGAPGQNGEPGGKGERGAPGE KGEGGPPGVAGPPGKDGTSGHPGPIGPPGPRGNRGERGSEGSPGHPGQPGPPGPPGAPGPCCGG

(c)GLP-1-2L-COL(SEQ ID NO.:13)

HGEGTFTSDVSSYLEEQAAKEFIAWLVKGRGGGGSGGGGSGGGGSAAMMSFVQKGSWLLLALLHPTI ILAQQEAVEGGCSHLGQSYADRDVWKPEPCQICVCDSGSVLCDDIICDDQELDCPNPEIPFGECCAVCPQPPTAPTR PPNGQGPQGPKGDPGPPGIPGRNGDPGIPGQPGSPGSPGPPGICESCPTGPQNYSPQYDSYDVKSGVAVGGLAGYPG PAGPPGPPGPPGTSGHPGSPGSPGYQGPPGEPGQAGPSGPPGPPGAIGPSGPAGKDGESGRPGRPGERGLPGPPGIK GPAGIPGFPGMKGHRGFDGRNGEKGETGAPGLKGENGLPGENGAPGPMGPRGAPGERGRPGLPGAAGARGNDGARGS DGQPGPPGPPGTAGFPGSPGAKGEVGPAGSPGSNGAPGQRGEPGPQGHAGAQGPPGPPGINGSPGGKGEMGPAGIPG APGLMGARGPPGPAGANGAPGLRGGAGEPGKNGAKGEPGPRGERGEAGIPGVPGAKGEDGKDGSPGEPGANGLPGAA GERGAPGFRGPAGPNGIPGEKGPAGERGAPGPAGPRGAAGEPGRDGVPGGPGMRGMPGSPGGPGSDGKPGPPGSQGE SGRPGPPGPSGPRGQPGVMGFPGPKGNDGAPGKNGERGGPGGPGPQGPPGKNGETGPQGPPGPTGPGGDKGDTGPPG PQGLQGLPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGGAGPPGPEGGKGAAGPP GPPGAAGTPGLQGMPGERGGLGSPGPKGDKGEPGGPGADGVPGKDGPRGPTGPIGPPGPAGQPGDKGEGGAPGLPGI AGPRGSPGERGETGPPGPAGFPGAPGQNGEPGGKGERGAPGEKGEGGPPGVAGPPGKDGTSGHPGPIGPPGPRGNRG ERGSEGSPGHPGQPGPPGPPGAPGPCCGGVGAAAIAGIGGEKAGGFAPYYGDEPMDFKINTDEIMTSLKSVNGQIES LISPDGSRKNPARNCRDLKFCHPELKSGEYWVDPNQGCKLDAIKVFCNMETGETCISANPLNVPRKHWWTDSSAEKK HVWFGESMDGGFQFSYGNPELPEDVLDVQLAFLRLLSSRASQNITYHCKNSIAYMDQASGNVKKALKLMGSNEGEFK AEGNSKFTYTVLEDGCTKHTGEWSKTVFEYRTRKAVRLPIVDIAPYDIGGPDQEFGVDVGPVCFL

The nomenclature used herein for specific heterologous fusion proteins is defined as follows. The GLP-1 portion of the fusion protein refers specifically to an analog of mature GLP-1 (7-37) in which Ala at position 8 is mutated to Gly, Gly at position 22 is mutated to Glu, and Gly at position 37 is removed. L refers to a linker of the sequence [Gly-Gly-Gly-Gly-Ser (SEQ ID NO.:5)]_n-Ala. The number immediately preceding L gives the number of repeats n in the linker peptide. The linker peptide designated 2L has the sequence Gly-Gly-Gly-Gly-Ser-Gly-Gly-Gly-Gly-Ser-Ala (SEQ ID NO.:7). The COL3A1 fragment of the fusion protein is abbreviated COL, and the start and end positions of its amino acid sequence are indicated by residue numbers. COL_598-896 indicates that the COL3A1 portion of the mature fusion protein begins with Gly at position 598 and ends with Gly at position 896; COL_733-896 indicates that the COL3A1 portion of the mature fusion protein begins with Gly at position 733 and ends with Gly at position 896.
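The naming convention can be sketched in Python as follows. This is only an illustration, not part of the disclosed method: the function names `make_glp1_analog` and `make_linker` are hypothetical, the native GLP-1 (7-37) sequence is taken from SEQ ID NO.:6 of the sequence listing, and positions follow the GLP-1 numbering in which the first residue is position 7.

```python
# Illustrative sketch of the nomenclature; helper names are hypothetical.
NATIVE_GLP1_7_37 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"  # SEQ ID NO.:6
FIRST_POSITION = 7  # GLP-1 (7-37) numbering starts at residue 7


def make_glp1_analog(native: str = NATIVE_GLP1_7_37) -> str:
    """Apply Ala8->Gly and Gly22->Glu, then remove the C-terminal Gly37."""
    residues = list(native)
    for position, old, new in ((8, "A", "G"), (22, "G", "E")):
        index = position - FIRST_POSITION
        assert residues[index] == old, f"expected {old} at position {position}"
        residues[index] = new
    last_index = 37 - FIRST_POSITION
    # Gly37 must be the final residue before it is dropped.
    assert residues[last_index] == "G" and last_index == len(residues) - 1
    return "".join(residues[:last_index])


def make_linker(n: int) -> str:
    """Return [Gly-Gly-Gly-Gly-Ser]_n-Ala in one-letter code."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "GGGGS" * n + "A"


if __name__ == "__main__":
    print(make_glp1_analog())  # 30-residue analog
    print(make_linker(2))      # linker as defined for 2L (SEQ ID NO.:7)
```

The analog produced this way matches the first 30 residues of the fusion proteins listed above.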

The present invention relates to fusion proteins in which a human GLP-1 analog is combined with a fragment of the human type III collagen α1 chain. The fusion protein differs from dulaglutide in that the Fc segment of IgG used in dulaglutide may have an ADCC (antibody-dependent cell-mediated cytotoxicity) effect, with potential immunogenicity and side reactions. In contrast, collagen is the most abundant protein in the body, accounting for about 25%-33% of total protein. It is widely present in human bone, tendon, cartilage, skin and other connective tissues, is the main component of the extracellular matrix (ECM), and has good biocompatibility and bioabsorbability. Type III collagen accounts for only 10% of total collagen and is found mainly in blood vessels. In terms of molecular structure, type III collagen is composed of parallel linear chains: three left-handed α1 chains bind tightly through interchain interactions to form a very stable right-handed triple helix. Each type III collagen α1 chain consists of 300 or more Gly-X-Y triplet repeats, and this repeat structure is key to the formation of homotrimers by type III collagen α1 chains.
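The Gly-X-Y repeat pattern can be checked on a sequence with a few lines of Python. The sketch below is a simple heuristic under stated assumptions: the function name `best_gxy_frame` is hypothetical, a triplet is counted whenever its first residue is Gly regardless of X and Y, and the example string is residues 598-624 of SEQ ID NO.:2 in one-letter code.

```python
from typing import Tuple


def best_gxy_frame(sequence: str) -> Tuple[int, int]:
    """Return (offset, count) for the frame with the most Gly-X-Y triplets.

    A complete triplet counts when its first residue is Gly (G); X and Y
    may be any residue. Offsets 0, 1 and 2 are tried in turn.
    """
    best = (0, 0)
    for offset in range(3):
        count = sum(
            1
            for i in range(offset, len(sequence) - 2, 3)
            if sequence[i] == "G"
        )
        if count > best[1]:
            best = (offset, count)
    return best


if __name__ == "__main__":
    # Residues 598-624 of the human COL3A1 chain (SEQ ID NO.:2).
    fragment_start = "GPGGPGPQGPPGKNGETGPQGPPGPTG"
    print(best_gxy_frame(fragment_start))
```

A dominant frame in which every triplet starts with Gly is what one would expect for a collagen helical region; a sequence without such a frame would score low in all three offsets.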

Nucleic acid coding sequence

The invention also includes polynucleotides encoding the heterologous fusion proteins of the invention, as well as vectors and host cells comprising these polynucleotides. The invention also includes methods of treating patients with non-insulin-dependent diabetes, obesity and various other diseases and conditions, comprising administering the heterologous fusion proteins discussed herein.

The invention further relates to polynucleotides encoding the fusion protein according to the invention.

In a preferred embodiment of the invention, the nucleotide sequence is as shown in SEQ ID NO.:10 or 11.

Polynucleotides of the invention may be in DNA or RNA form. DNA forms include cDNA, genomic DNA and synthetic DNA. The DNA may be single-stranded or double-stranded, and may be the coding strand or the non-coding strand. The coding region sequence encoding the mature polypeptide may be identical to the coding sequence of the polypeptide shown in SEQ ID NO.:8, 9 or 13, or may be a degenerate variant thereof. As used herein, a "degenerate variant" refers to a nucleic acid sequence that encodes a polypeptide shown in SEQ ID NO.:8, 9 or 13 but differs from the corresponding coding region sequence.
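The idea of a degenerate variant can be illustrated with a short Python sketch that translates two different DNA sequences with the standard genetic code and shows that they encode the same peptide. The first sequence is the first 21 nucleotides of SEQ ID NO.:10; the second is a hypothetical synonymous-codon variant written for this example and does not appear in the patent.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by enumerating codons in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence whose length is a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


original = "CACGGTGAGGGTACTTTTACC"  # first 21 nt of SEQ ID NO.:10
variant = "CATGGAGAAGGCACATTCACT"   # hypothetical degenerate variant

if __name__ == "__main__":
    print(translate(original), translate(variant), original != variant)
```

Both sequences translate to the first seven residues of the GLP-1 analog even though they differ at the nucleotide level.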

The full-length nucleotide sequence encoding the polypeptide of the invention, or a fragment thereof, can usually be obtained by PCR amplification, recombination or artificial synthesis. For PCR amplification, primers can be designed according to published nucleotide sequences, in particular open reading frame sequences, and a commercially available cDNA library, or a cDNA library prepared by conventional methods known to those skilled in the art, can be used as the template to amplify the relevant sequence. When the sequence is long, two or more PCR amplifications are often required, after which the amplified fragments are spliced together in the correct order. At present, the DNA sequence encoding the polypeptide of the invention (or a fragment or derivative thereof) can be obtained entirely by chemical synthesis. The DNA sequence can then be introduced into various existing DNA molecules (such as vectors) and cells known in the art.

The invention also relates to vectors comprising the polynucleotides of the invention, and to host cells genetically engineered with the vectors or polypeptide coding sequences of the invention. The above polynucleotides, vectors or host cells may be isolated.

As used herein, "isolated" means that a substance has been separated from its original environment (if the substance is natural, the original environment is the natural environment). For example, polynucleotides and polypeptides in their native state in living cells are not isolated or purified, but the same polynucleotides or polypeptides are isolated and purified once they have been separated from the other substances with which they coexist in the native state.

Once the relevant sequence has been obtained, it can be produced in large quantities by recombination. This usually involves cloning it into a vector, transferring the vector into cells, and then isolating the relevant sequence from the proliferated host cells by conventional methods.

In addition, the relevant sequence can be synthesized artificially, especially when the fragment is short. In general, a long sequence can be obtained by first synthesizing multiple small fragments and then ligating them.

Methods that amplify DNA/RNA by PCR are preferably used to obtain the gene of the invention. Primers for PCR can be appropriately selected according to the sequence information disclosed herein and synthesized by conventional methods. The amplified DNA/RNA fragments can be isolated and purified by conventional methods such as gel electrophoresis.

The invention also relates to vectors comprising the polynucleotides of the invention, to host cells genetically engineered with the vectors or protein coding sequences of the invention, and to methods of expressing the fusion protein of the invention in such host cells by recombinant techniques.

Host cells expressing the fusion protein of the invention can be obtained from the polynucleotide sequences of the invention by conventional recombinant DNA techniques. In general, this comprises the step of introducing the polynucleotide of the third aspect of the invention, or the vector of the fourth aspect of the invention, into a host cell.

Methods well known to those skilled in the art can be used to construct expression vectors containing the DNA sequence encoding the protein of the invention and suitable transcription/translation control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques and in vivo recombination techniques. The DNA sequence can be operably linked to an appropriate promoter in the expression vector to direct mRNA synthesis. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.

In addition, the expression vector preferably contains one or more selectable marker genes to provide a phenotypic trait for selecting transformed host cells, such as dihydrofolate reductase, neomycin resistance and green fluorescent protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for Escherichia coli.

A vector comprising the appropriate DNA sequence described above together with an appropriate promoter or control sequence can be used to transform a suitable host cell so that it can express the protein.

The host cell may be a prokaryotic cell, such as a bacterial cell; a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples include: bacterial cells such as Escherichia coli, Bacillus subtilis and Streptomyces; fungal cells such as Pichia pastoris and Saccharomyces cerevisiae; plant cells; insect cells such as Drosophila S2 or Sf9; and animal cells such as CHO, NS0, COS7 or 293 cells. In another preferred embodiment, the host cell is Pichia pastoris.

Transformation of host cells with recombinant DNA can be carried out by conventional techniques well known to those skilled in the art. When the host is a prokaryote such as Escherichia coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl₂ method, using steps well known in the art. Another method uses MgCl₂. If desired, transformation can also be performed by electroporation. When the host is a eukaryote, the following DNA transfection methods can be used: calcium phosphate co-precipitation, conventional mechanical methods such as microinjection, electroporation, liposome packaging, and the like.

The resulting transformants can be cultured by conventional methods to express the protein encoded by the gene of the invention. Depending on the host cell used, the culture medium can be selected from various conventional media. Culture is carried out under conditions suitable for host cell growth. After the host cells have grown to an appropriate cell density, the selected promoter is induced by a suitable method (such as a temperature shift or chemical induction), and the cells are cultured for a further period.

In the above method, the protein may be expressed within the cell or on the cell membrane, or secreted outside the cell. If desired, the protein can be isolated and purified by various separation methods that exploit its physical, chemical and other properties. These methods are well known to those skilled in the art. Examples include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (salting out), centrifugation, osmotic lysis, ultrasonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high performance liquid chromatography (HPLC), various other liquid chromatography techniques, and combinations of these methods.

DNA encoding the GLP-1 analog of the invention can be generated by a variety of methods. Primers can be designed on the basis of the native sequence to generate DNA encoding the GLP-1 analog described herein, and DNA encoding wild-type GLP-1 can be mutated either before ligation or within the cDNA encoding the entire fusion protein. The full-length wild-type sequence cloned from a specific library can usually serve as the template for generating the COL3A1 fragment of the invention, and DNA encoding the COL3A1 fragment described herein can be generated with suitably designed primers. Using PCR and primer design, the gene encoding the GLP-1 analog and the gene encoding the COL3A1 fragment can be joined in frame through DNA encoding a G-rich linker peptide. Chemical synthesis of the complete sequence is also feasible. PCR can be used to generate the fragment with primers designed to hybridize to sequences corresponding to the desired ends of the COL3A1 fragment. PCR primers can also be designed to introduce restriction sites to facilitate cloning into an expression vector.
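Primer design with added restriction sites can be sketched as below. This is a simplified illustration rather than the procedure used in the patent: the function names, the 18-nt annealing length and the 4-nt 5' clamp are assumptions made for the example, and melting temperature, secondary structure and the reading frame relative to any vector signal sequence are not considered. The XhoI (CTCGAG) and NotI (GCGGCCGC) recognition sequences and the TAA stop codon correspond to the sites named in Embodiment 1.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")
XHOI_SITE = "CTCGAG"    # XhoI recognition sequence
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_primers(coding: str, anneal_len: int = 18, clamp: str = "ATAT",
                   stop: str = "TAA") -> Tuple[str, str]:
    """Return (forward, reverse) primers, both written 5'->3'.

    The forward primer adds an XhoI site before the coding sequence; the
    reverse primer adds a stop codon and a NotI site after it.
    """
    coding = coding.upper()
    forward = clamp + XHOI_SITE + coding[:anneal_len]
    # Top-strand order at the 3' end: coding tail, stop codon, NotI site.
    reverse = clamp + reverse_complement(coding[-anneal_len:] + stop + NOTI_SITE)
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical short template: first 21 and last 15 nt of SEQ ID NO.:10.
    template = "CACGGTGAGGGTACTTTTACC" + "CCATGCTGTGGTGGC"
    fwd, rev = design_primers(template)
    print(fwd)
    print(rev)
```

Because the NotI site is palindromic, it reads the same in the reverse primer as on the top strand, while the stop codon appears as its reverse complement (TTA).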

SEQ ID NO.:10 provides a preferred DNA sequence encoding GLP-1-2L-COL_598-896, one of the preferred heterologous fusion proteins of the invention:

CACGGTGAGGGTACTTTTACCTCTGATGTTTCCTCATACTTGGAAGAACAAGCTGCTAAGGAATTCATTGCCTGGCT GGTCAAAGGCAGAGGAGGTGGCGGATCCGGTGGCGGTGGGTCCGGAGGAGGTGGTTCAGCTGGTCCAGGTGGTCCAG GTCCTCAAGGTCCTCCAGGTAAGAATGGTGAAACTGGTCCTCAGGGACCTCCAGGCCCAACCGGTCCTGGAGGTGAT AAGGGTGATACCGGACCACCTGGCCCACAAGGCTTGCAGGGTCTGCCAGGTACAGGGGGTCCACCCGGTGAAAACGG CAAGCCTGGTGAACCAGGCCCAAAAGGTGACGCTGGAGCTCCAGGAGCCCCAGGAGGTAAGGGTGATGCTGGTGCCC CCGGTGAGAGAGGCCCACCAGGTTTGGCCGGTGCTCCCGGTCTGAGAGGGGGAGCTGGTCCACCAGGACCTGAAGGC GGAAAAGGTGCTGCTGGTCCACCTGGACCACCTGGTGCTGCCGGAACTCCAGGACTGCAGGGAATGCCTGGTGAAAG AGGCGGATTGGGATCTCCTGGCCCAAAAGGAGACAAGGGAGAGCCTGGTGGACCAGGGGCAGATGGAGTTCCTGGAA AAGATGGTCCTCGTGGTCCAACAGGACCTATCGGTCCCCCAGGACCTGCTGGTCAACCTGGAGATAAAGGTGAAGGC GGGGCTCCAGGATTGCCTGGTATTGCCGGCCCTAGAGGTTCTCCCGGTGAAAGAGGTGAGACCGGCCCACCTGGTCC AGCTGGCTTCCCTGGAGCACCAGGTCAGAATGGTGAGCCAGGTGGTAAGGGTGAGAGAGGAGCTCCAGGTGAGAAGG GGGAAGGTGGTCCACCTGGTGTTGCTGGTCCACCAGGTAAGGATGGTACATCCGGTCATCCTGGACCAATTGGACCT CCAGGGCCTAGAGGTAACAGGGGTGAAAGGGGATCTGAAGGATCTCCTGGACATCCAGGTCAGCCCGGTCCTCCTGG TCCACCCGGAGCTCCTGGGCCATGCTGTGGTGGC(SEQ ID NO.:10)

SEQ ID NO.:11 provides a preferred DNA sequence encoding GLP-1-2L-COL_733-896, one of the preferred heterologous fusion proteins of the invention:

CACGGTGAGGGTACTTTTACCTCTGATGTTTCCTCATACTTGGAAGAACAAGCTGCTAAGGAATTCATTGCCTGGCT GGTCAAAGGCAGAGGAGGTGGCGGATCCGGTGGCGGTGGGTCCGGAGGAGGTGGTTCAGCTGGTTTGGGATCTCCTG GCCCAAAAGGAGACAAGGGAGAGCCTGGTGGACCAGGGGCAGATGGAGTTCCTGGAAAAGATGGTCCTCGTGGTCCA ACAGGACCTATCGGTCCCCCAGGACCTGCTGGTCAACCTGGAGATAAAGGTGAAGGCGGGGCTCCAGGATTGCCTGG TATTGCCGGCCCTAGAGGTTCTCCCGGTGAAAGAGGTGAGACCGGCCCACCTGGTCCAGCTGGCTTCCCTGGAGCAC CAGGTCAGAATGGTGAGCCAGGTGGTAAGGGTGAGAGAGGAGCTCCAGGTGAGAAGGGGGAAGGTGGTCCACCTGGT GTTGCTGGTCCACCAGGTAAGGATGGTACATCCGGTCATCCTGGACCAATTGGACCTCCAGGGCCTAGAGGTAACAG GGGTGAAAGGGGATCTGAAGGATCTCCTGGACATCCAGGTCAGCCCGGTCCTCCTGGTCCACCCGGAGCTCCTGGGC CATGCTGTGGTGGC (SEQ ID NO.:11)

Expression vector and host cell

The present invention also provides an expression vector for the fusion protein of the invention.

The host cell used to clone or express the nucleic acid of the invention may be a prokaryotic cell; more preferably, the host cell is a yeast or higher eukaryotic cell. After expression in the host cell, the fusion protein is isolated and purified from the expression product and can be used to prepare therapeutic agents for diabetes and related diseases. The related diseases include type II diabetes, type I diabetes, obesity, major cardiovascular events in patients with type II diabetes, and other serious complications.

Compared with the prior art, the present invention has the following main advantages:

With the aim of retaining the hypoglycemic activity of GLP-1 while significantly extending its in vivo half-life, the GLP-1-COL3A1 fusion protein of the invention was designed as a novel molecule distinct from other GLP-1 analog drugs. It has GLP-1R agonist activity, forms stable multimers, and has an extended in vivo half-life.

(a) Relative to native GLP-1 (7-37), the GLP-1 analog portion of the fusion protein of the invention contains modifications at three positions: 8, 22 and 37. Endogenous dipeptidyl peptidase 4 (DPP-4) cleaves native GLP-1 between Ala at position 8 and Glu at position 9, producing the inactive GLP-1 (9-37) fragment; replacing position 8 with Gly reduces the sensitivity of the GLP-1 analog to DPP-4 hydrolysis. The substitution at position 22 improves the activity of the GLP-1 analog. Removal of position 37 reduces the immunogenicity of the fusion protein. The fusion protein of the invention therefore has low sensitivity to DPP-4 hydrolysis, high activity and low immunogenicity.

(b) The fusion protein of the invention contains the human type III collagen α1 chain; this is the first use of a COL3A1 fragment together with a GLP-1 analog to construct a long-acting GLP-1 analog. It has good biocompatibility, can form oligomers and is easy to express. Because the preferred COL3A1 fragments favor expression, the fusion protein is suitable for the Pichia pastoris expression system, whose production cost is lower than that of the higher eukaryotic cell systems used for other GLP-1 drugs. With further optimization and scale-up of the preparation process, a less expensive long-acting therapeutic agent for type 2 diabetes is expected. Moreover, while retaining GLP-1 biological activity, the fusion protein achieves a significantly extended in vivo half-life through the trimer-forming property of the α1 chain.

(c) To avoid the difficulty in recombinant expression caused by an overly long amino acid sequence, the invention prefers two human COL3A1 fragments (COL3A1_598-896 and COL3A1_733-896), whose sequences are shown in SEQ ID NO.:3 and SEQ ID NO.:4, respectively. The two fragments contain Gly-X-Y triplet domains of different lengths. They retain the ability of human COL3A1 to form homotrimers while favoring recombinant expression of the heterologous fusion protein, and the fusion protein of the invention is highly stable.

The invention is further described below with reference to specific embodiments. It should be understood that these embodiments are intended only to illustrate the invention and not to limit its scope. Experimental methods for which specific conditions are not given in the following embodiments are generally carried out under conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or under the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight.

Embodiment 1: Construction of DNA encoding the GLP-1-COL3A1 fusion protein

The gene encoding the fusion protein GLP-1-2L-COL_598-896 (SEQ ID NO:8) of the invention was synthesized by Nanjing GenScript Biotechnology Co., Ltd. and cloned into the pUC57 plasmid. The 5' end of the fusion gene contains an XhoI restriction site, and the 3' end contains a TAA stop codon and a NotI restriction site. The resulting pUC57 plasmid was named pUC57-GLP-COL-1.

pUC57-GLP-COL-1 was double-digested with the restriction endonucleases XhoI and NotI (purchased from Fermentas) according to the manufacturer's instructions, and the resulting gene fragment of about 1050 bp encoding the GLP-1-2L-COL_598-896 fusion protein was recovered from the gel (gel recovery kit purchased from Axygen). pPic9m was double-digested with XhoI and NotI at the same time, and the 9000 bp plasmid band obtained after digestion was recovered from the gel.

The fusion protein gene fragment and the pPic9m plasmid fragment obtained by the above digestion were ligated with T4 DNA ligase (purchased from Fermentas). The ligation product was transformed into competent Escherichia coli DH5α cells by heat shock, and the transformation product was spread on selective LB solid medium (kanamycin/chloramphenicol resistance). Single colonies were picked for sequencing to confirm that the inserted gene sequence was correct, and the resulting plasmid was named pPic9m-GLP-COL-1 (as shown in Figure 1).

Similarly, an expression vector containing the gene encoding the GLP-1-2L-COL_733-896 fusion protein was constructed and named pPic9m-GLP-COL-2 (Fig. 2).
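As a rough consistency check on the fragment of about 1050 bp recovered in this embodiment, the expected insert length can be estimated from the protein length. The sketch below assumes the 345-residue length given for SEQ ID NO.:8 in the sequence listing, a single TAA stop codon, and the full XhoI (6 bp) and NotI (8 bp) recognition sites. Any additional flanking bases in the synthesized gene are not described in the text and are left out, and the function name is hypothetical.

```python
from typing import Tuple


def expected_insert_bp(n_residues: int, stop_codons: int = 1,
                       site_lengths: Tuple[int, ...] = (6, 8)) -> int:
    """Estimate insert length: coding region + stop codon(s) + flanking sites."""
    coding_bp = 3 * n_residues
    return coding_bp + 3 * stop_codons + sum(site_lengths)


if __name__ == "__main__":
    # 345 residues -> 1035 bp coding, + 3 bp stop, + 14 bp of sites = 1052 bp
    print(expected_insert_bp(345))
```

The estimate of 1052 bp is in line with the fragment size reported above; the digested fragment is slightly shorter because both enzymes cut within their recognition sites.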

Embodiment 2: Expression of the heterologous fusion proteins

The vectors from Embodiment 1 containing the inserted fusion protein genes were extracted and transferred into competent Pichia pastoris GS115 cells by electroporation. After recombinants were screened on nutrient-deficient medium, high-copy recombinants were screened using G418 resistance. Recombinant Pichia pastoris cells containing the heterologous fusion protein gene and suitable for recombinant expression were finally obtained.

After seed expansion, the recombinant Pichia pastoris cells were inoculated into a 5 L fermenter for small-scale preparation. Fermentation lasted 5 days, with methanol used to induce expression of the heterologous fusion protein for 36 h. After fermentation, the broth was collected for protein purification.

Table 1: Composition of the Pichia pastoris GS115 seed expansion medium

Component	Content
Yeast extract	10.0 g/L
Peptone	20.0 g/L
KH₂PO₄	11.8 g/L
K₂HPO₄	3.0 g/L
Glycerol	10.0 mL/L

Table 2: Composition of the Pichia pastoris GS115 fermentation medium

Component	Content
YNB	0.67 g/L
CaCl₂	0.4 g/L
K₂SO₄	10.0 g/L
MgSO₄·7H₂O	8.0 g/L
(NH₄)₂SO₄	8.0 g/L
Citric acid	5.0 g/L
K₂HPO₄·3H₂O	18.0 g/L
Glycerol	40.0 mL/L

Embodiment 3: Purification of the heterologous fusion proteins

The two preferred heterologous fusion proteins, GLP-1-2L-COL_598-896 and GLP-1-2L-COL_733-896, were purified by similar steps.

The supernatant of the 4 L fermentation culture was collected using a PALL 0.2 μm hollow-fiber filtration system, giving about 4 L of supernatant. The supernatant was then ultrafiltered using a PALL ultrafiltration system with a 50 kDa membrane to remove some of the contaminating proteins. The ultrafiltered liquid was finally purified by Source 30Q anion exchange chromatography, with the sample eluted using a 0-500 mM NaCl linear gradient, to give the purified heterologous fusion protein. The purity and molecular weight of the heterologous fusion proteins were determined by SDS-PAGE; purity was >90% and the molecular weight was as expected (Fig. 3).

Embodiment 4: Bioactivity and pharmacokinetic studies of the heterologous fusion proteins

Embodiment 4a: Bioactivity study of the fusion proteins

The activity of the heterologous fusion proteins was determined using human osteosarcoma U2OS cells stably transfected with GLP-1R. Binding of the GLP-1 analog portion of the fusion protein to GLP-1R on U2OS cells stimulates the cells to produce cAMP, and the GLP-1 activity of the fusion protein was characterized by measuring cAMP with an enzyme-linked cAMP assay. U2OS cells were seeded in 96-well plates at 1.2 × 10⁵ cells/well and cultured in DMEM containing 10% FBS at 37 °C, 5% CO₂ for 24 h. The medium was removed and basal medium was added overnight. The basal medium was then removed and test samples at various concentrations were added, including the two fusion proteins GLP-1-2L-COL_598-896 and GLP-1-2L-COL_733-896 and a synthetic GLP-1 (7-37) reference, followed by incubation at 37 °C, 5% CO₂ for 0.5 h. Cellular cAMP content was measured using a cAMP kit (purchased from R&D); the results are shown in Figure 4.

The results in Fig. 4 show that the cAMP content produced by U2OS cells stimulated with GLP-1-2L-COL_598-896 or GLP-1-2L-COL_733-896 exhibits a clear dose-response relationship comparable to that of GLP-1, confirming that the GLP-1-2L-COL_598-896 and GLP-1-2L-COL_733-896 fusion proteins have GLP-1R activating activity similar to that of GLP-1.
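Dose-response data of this kind are commonly summarized with a four-parameter logistic (4PL) curve, from which an EC50 can be compared between samples. The sketch below is only an illustration: the source does not state how the cAMP data were analyzed, the function name is hypothetical, and all parameter values are invented for the example rather than taken from Fig. 4.

```python
def four_parameter_logistic(conc: float, bottom: float, top: float,
                            ec50: float, hill: float = 1.0) -> float:
    """Response at agonist concentration `conc` (same units as `ec50`)."""
    if conc <= 0:
        return bottom  # no agonist: baseline response
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)


if __name__ == "__main__":
    # Hypothetical parameters, not measured values from the patent.
    params = dict(bottom=2.0, top=100.0, ec50=0.5, hill=1.0)
    for conc in (0.005, 0.05, 0.5, 5.0, 50.0):
        print(conc, round(four_parameter_logistic(conc, **params), 1))
```

At a concentration equal to the EC50 the curve sits halfway between the bottom and top plateaus, which is why two samples with similar activity give overlapping curves with similar EC50 values.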

Embodiment 4b: Pharmacokinetic study of the fusion proteins

Male SD rats were used in the pharmacokinetic study, with GLP-1-2L-COL_598-896, GLP-1-2L-COL_733-896 and chemically synthesized GLP-1 (7-37) control groups of 8 animals each. The animals were injected intravenously at a dose of 1 mg/kg, and blood samples were collected before injection and at the following time points after injection: 0 h, 0.5 h, 1 h, 2 h, 4 h, 6 h, 10 h, 24 h, 2 d, 4 d, 6 d, 8 d, 10 d, 14 d and 21 d. The collected serum was stored at -80 °C. The amount of fusion protein in serum was measured using a GLP-1 kit (Fig. 5). The results in Fig. 5 show that the fusion proteins of the invention significantly extend the in vivo circulating half-life of the GLP-1 analog.
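A circulating half-life is typically estimated from the terminal, log-linear part of the concentration-time curve. The sketch below fits ln(concentration) against time by ordinary least squares and converts the slope to a half-life via t½ = ln 2 / k. The source does not describe its analysis method, the function name is hypothetical, and the concentrations are synthetic values generated from an assumed 48 h half-life, not data from Fig. 5.

```python
import math


def terminal_half_life(times_h, concentrations):
    """Estimate half-life (h) from a log-linear least-squares fit."""
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matched time/concentration points")
    logs = [math.log(c) for c in concentrations]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx  # equals -k for first-order elimination
    if slope >= 0:
        raise ValueError("concentrations do not decline over time")
    return math.log(2) / -slope


# Synthetic example: assumed half-life of 48 h, sampled at 1, 2, 4, 6 and 8 days.
times = [24, 48, 96, 144, 192]
concs = [100.0 * 0.5 ** (t / 48) for t in times]

if __name__ == "__main__":
    print(round(terminal_half_life(times, concs), 1))
```

With real serum data only the time points in the terminal elimination phase would be passed to the function, and values below the assay's quantification limit would be excluded first.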

The above are only preferred embodiments of the invention and do not limit the invention in any form. It should be noted that improvements and additions made to the invention by those of ordinary skill in the art should also be regarded as falling within the scope of protection of the invention.

All documents mentioned in the present invention are incorporated herein by reference, as if each document were individually incorporated by reference. In addition, it should be understood that, after reading the above teachings of the invention, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.

Sequence listing

<110> Shanghai Huidun Biotechnology Co., Ltd.

<120> A GLP-1 analog-COL3A1 fusion protein

<130> P2018-0197

<160> 13

<170> PatentIn version 3.5

<210> 1

<211> 31

<212> PRT

<213>artificial sequence (artificial sequence)

<400> 1

His Xaa Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Xaa

1 5 10 15

Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg Xaa

20 25 30

<210> 2

<211> 1163

<212> PRT

<213>homo sapiens (Homo sapiens)

<400> 2

Met Met Ser Phe Val Gln Lys Gly Ser Trp Leu Leu Leu Ala Leu Leu

1 5 10 15

His Pro Thr Ile Ile Leu Ala Gln Gln Glu Ala Val Glu Gly Gly Cys

20 25 30

Ser His Leu Gly Gln Ser Tyr Ala Asp Arg Asp Val Trp Lys Pro Glu

35 40 45

Pro Cys Gln Ile Cys Val Cys Asp Ser Gly Ser Val Leu Cys Asp Asp

50 55 60

Ile Ile Cys Asp Asp Gln Glu Leu Asp Cys Pro Asn Pro Glu Ile Pro

65 70 75 80

Phe Gly Glu Cys Cys Ala Val Cys Pro Gln Pro Pro Thr Ala Pro Thr

85 90 95

Arg Pro Pro Asn Gly Gln Gly Pro Gln Gly Pro Lys Gly Asp Pro Gly

100 105 110

Pro Pro Gly Ile Pro Gly Arg Asn Gly Asp Pro Gly Ile Pro Gly Gln

115 120 125

Pro Gly Ser Pro Gly Ser Pro Gly Pro Pro Gly Ile Cys Glu Ser Cys

130 135 140

Pro Thr Gly Pro Gln Asn Tyr Ser Pro Gln Tyr Asp Ser Tyr Asp Val

145 150 155 160

Lys Ser Gly Val Ala Val Gly Gly Leu Ala Gly Tyr Pro Gly Pro Ala

165 170 175

Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Thr Ser Gly His Pro Gly

180 185 190

Ser Pro Gly Ser Pro Gly Tyr Gln Gly Pro Pro Gly Glu Pro Gly Gln

195 200 205

Ala Gly Pro Ser Gly Pro Pro Gly Pro Pro Gly Ala Ile Gly Pro Ser

210 215 220

Gly Pro Ala Gly Lys Asp Gly Glu Ser Gly Arg Pro Gly Arg Pro Gly

225 230 235 240

Glu Arg Gly Leu Pro Gly Pro Pro Gly Ile Lys Gly Pro Ala Gly Ile

245 250 255

Pro Gly Phe Pro Gly Met Lys Gly His Arg Gly Phe Asp Gly Arg Asn

260 265 270

Gly Glu Lys Gly Glu Thr Gly Ala Pro Gly Leu Lys Gly Glu Asn Gly

275 280 285

Leu Pro Gly Glu Asn Gly Ala Pro Gly Pro Met Gly Pro Arg Gly Ala

290 295 300

Pro Gly Glu Arg Gly Arg Pro Gly Leu Pro Gly Ala Ala Gly Ala Arg

305 310 315 320

Gly Asn Asp Gly Ala Arg Gly Ser Asp Gly Gln Pro Gly Pro Pro Gly

325 330 335

Pro Pro Gly Thr Ala Gly Phe Pro Gly Ser Pro Gly Ala Lys Gly Glu

340 345 350

Val Gly Pro Ala Gly Ser Pro Gly Ser Asn Gly Ala Pro Gly Gln Arg

355 360 365

Gly Glu Pro Gly Pro Gln Gly His Ala Gly Ala Gln Gly Pro Pro Gly

370 375 380

Pro Pro Gly Ile Asn Gly Ser Pro Gly Gly Lys Gly Glu Met Gly Pro

385 390 395 400

Ala Gly Ile Pro Gly Ala Pro Gly Leu Met Gly Ala Arg Gly Pro Pro

405 410 415

Gly Pro Ala Gly Ala Asn Gly Ala Pro Gly Leu Arg Gly Gly Ala Gly

420 425 430

Glu Pro Gly Lys Asn Gly Ala Lys Gly Glu Pro Gly Pro Arg Gly Glu

435 440 445

Arg Gly Glu Ala Gly Ile Pro Gly Val Pro Gly Ala Lys Gly Glu Asp

450 455 460

Gly Lys Asp Gly Ser Pro Gly Glu Pro Gly Ala Asn Gly Leu Pro Gly

465 470 475 480

Ala Ala Gly Glu Arg Gly Ala Pro Gly Phe Arg Gly Pro Ala Gly Pro

485 490 495

Asn Gly Ile Pro Gly Glu Lys Gly Pro Ala Gly Glu Arg Gly Ala Pro

500 505 510

Gly Pro Ala Gly Pro Arg Gly Ala Ala Gly Glu Pro Gly Arg Asp Gly

515 520 525

Val Pro Gly Gly Pro Gly Met Arg Gly Met Pro Gly Ser Pro Gly Gly

530 535 540

Pro Gly Ser Asp Gly Lys Pro Gly Pro Pro Gly Ser Gln Gly Glu Ser

545 550 555 560

Gly Arg Pro Gly Pro Pro Gly Pro Ser Gly Pro Arg Gly Gln Pro Gly

565 570 575

Val Met Gly Phe Pro Gly Pro Lys Gly Asn Asp Gly Ala Pro Gly Lys

580 585 590

Asn Gly Glu Arg Gly Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Pro

595 600 605

Gly Lys Asn Gly Glu Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly

610 615 620

Pro Gly Gly Asp Lys Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu

625 630 635 640

Gln Gly Leu Pro Gly Thr Gly Gly Pro Pro Gly Glu Asn Gly Lys Pro

645 650 655

Gly Glu Pro Gly Pro Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly

660 665 670

Gly Lys Gly Asp Ala Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Leu

675 680 685

Ala Gly Ala Pro Gly Leu Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu

690 695 700

Gly Gly Lys Gly Ala Ala Gly Pro Pro Gly Pro Pro Gly Ala Ala Gly

705 710 715 720

Thr Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Gly Leu Gly Ser

725 730 735

Pro Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Gly Pro Gly Ala Asp

740 745 750

Gly Val Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro Ile Gly

755 760 765

Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Gly Gly Ala

770 775 780

Pro Gly Leu Pro Gly Ile Ala Gly Pro Arg Gly Ser Pro Gly Glu Arg

785 790 795 800

Gly Glu Thr Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala Pro Gly

805 810 815

Gln Asn Gly Glu Pro Gly Gly Lys Gly Glu Arg Gly Ala Pro Gly Glu

820 825 830

Lys Gly Glu Gly Gly Pro Pro Gly Val Ala Gly Pro Pro Gly Lys Asp

835 840 845

Gly Thr Ser Gly His Pro Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly

850 855 860

Asn Arg Gly Glu Arg Gly Ser Glu Gly Ser Pro Gly His Pro Gly Gln

865 870 875 880

Pro Gly Pro Pro Gly Pro Pro Gly Ala Pro Gly Pro Cys Cys Gly Gly

885 890 895

Val Gly Ala Ala Ala Ile Ala Gly Ile Gly Gly Glu Lys Ala Gly Gly

900 905 910

Phe Ala Pro Tyr Tyr Gly Asp Glu Pro Met Asp Phe Lys Ile Asn Thr

915 920 925

Asp Glu Ile Met Thr Ser Leu Lys Ser Val Asn Gly Gln Ile Glu Ser

930 935 940

Leu Ile Ser Pro Asp Gly Ser Arg Lys Asn Pro Ala Arg Asn Cys Arg

945 950 955 960

Asp Leu Lys Phe Cys His Pro Glu Leu Lys Ser Gly Glu Tyr Trp Val

965 970 975

Asp Pro Asn Gln Gly Cys Lys Leu Asp Ala Ile Lys Val Phe Cys Asn

980 985 990

Met Glu Thr Gly Glu Thr Cys Ile Ser Ala Asn Pro Leu Asn Val Pro

995 1000 1005

Arg Lys His Trp Trp Thr Asp Ser Ser Ala Glu Lys Lys His Val

1010 1015 1020

Trp Phe Gly Glu Ser Met Asp Gly Gly Phe Gln Phe Ser Tyr Gly

1025 1030 1035

Asn Pro Glu Leu Pro Glu Asp Val Leu Asp Val Gln Leu Ala Phe

1040 1045 1050

Leu Arg Leu Leu Ser Ser Arg Ala Ser Gln Asn Ile Thr Tyr His

1055 1060 1065

Cys Lys Asn Ser Ile Ala Tyr Met Asp Gln Ala Ser Gly Asn Val

1070 1075 1080

Lys Lys Ala Leu Lys Leu Met Gly Ser Asn Glu Gly Glu Phe Lys

1085 1090 1095

Ala Glu Gly Asn Ser Lys Phe Thr Tyr Thr Val Leu Glu Asp Gly

1100 1105 1110

Cys Thr Lys His Thr Gly Glu Trp Ser Lys Thr Val Phe Glu Tyr

1115 1120 1125

Arg Thr Arg Lys Ala Val Arg Leu Pro Ile Val Asp Ile Ala Pro

1130 1135 1140

Tyr Asp Ile Gly Gly Pro Asp Gln Glu Phe Gly Val Asp Val Gly

1145 1150 1155

Pro Val Cys Phe Leu

1160

<210> 3

<211> 297

<212> PRT

<213>homo sapiens (Homo sapiens)

<400> 3

Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Pro Gly Lys Asn Gly Glu

1 5 10 15

Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly Pro Gly Gly Asp Lys

20 25 30

Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu Gln Gly Leu Pro Gly

35 40 45

Thr Gly Gly Pro Pro Gly Glu Asn Gly Lys Pro Gly Glu Pro Gly Pro

50 55 60

Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly Gly Lys Gly Asp Ala

65 70 75 80

Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Leu Ala Gly Ala Pro Gly

85 90 95

Leu Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu Gly Gly Lys Gly Ala

100 105 110

Ala Gly Pro Pro Gly Pro Pro Gly Ala Ala Gly Thr Pro Gly Leu Gln

115 120 125

Gly Met Pro Gly Glu Arg Gly Gly Leu Gly Ser Pro Lys Gly Asp Lys

130 135 140

Gly Glu Pro Gly Gly Pro Gly Ala Asp Gly Val Pro Gly Lys Asp Gly

145 150 155 160

Pro Arg Gly Pro Thr Gly Pro Ile Gly Pro Pro Gly Pro Ala Gly Gln

165 170 175

Pro Gly Asp Lys Gly Glu Gly Gly Ala Pro Gly Leu Pro Gly Ile Ala

180 185 190

Gly Pro Arg Gly Ser Pro Gly Glu Arg Gly Glu Thr Gly Pro Pro Gly

195 200 205

Pro Ala Gly Phe Pro Gly Ala Pro Gly Gln Asn Gly Glu Pro Gly Gly

210 215 220

Lys Gly Glu Arg Gly Ala Pro Gly Glu Lys Gly Glu Gly Gly Pro Pro

225 230 235 240

Gly Val Ala Gly Pro Pro Gly Lys Asp Gly Thr Ser Gly His Pro Gly

245 250 255

Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn Arg Gly Glu Arg Gly Ser

260 265 270

Glu Gly Ser Pro Gly His Pro Gly Gln Pro Gly Pro Pro Gly Pro Pro

275 280 285

Gly Ala Pro Gly Pro Cys Cys Gly Gly

290 295

<210> 4

<211> 162

<212> PRT

<213> Homo sapiens

<400> 4

Gly Leu Gly Ser Pro Lys Gly Asp Lys Gly Glu Pro Gly Gly Pro Gly

1 5 10 15

Ala Asp Gly Val Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro

20 25 30

Ile Gly Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Gly

35 40 45

Gly Ala Pro Gly Leu Pro Gly Ile Ala Gly Pro Arg Gly Ser Pro Gly

50 55 60

Glu Arg Gly Glu Thr Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala

65 70 75 80

Pro Gly Gln Asn Gly Glu Pro Gly Gly Lys Gly Glu Arg Gly Ala Pro

85 90 95

Gly Glu Lys Gly Glu Gly Gly Pro Pro Gly Val Ala Gly Pro Pro Gly

100 105 110

Lys Asp Gly Thr Ser Gly His Pro Gly Pro Ile Gly Pro Pro Gly Pro

115 120 125

Arg Gly Asn Arg Gly Glu Arg Gly Ser Glu Gly Ser Pro Gly His Pro

130 135 140

Gly Gln Pro Gly Pro Pro Gly Pro Pro Gly Ala Pro Gly Pro Cys Cys

145 150 155 160

Gly Gly

<210> 5

<211> 5

<212> PRT

<213> Artificial sequence

<400> 5

Gly Gly Gly Gly Ser

1 5

<210> 6

<211> 31

<212> PRT

<213> Artificial sequence

<400> 6

His Ala Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Gly

1 5 10 15

Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg Gly

20 25 30

<210> 7

<211> 11

<212> PRT

<213> Artificial sequence

<400> 7

Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala

1 5 10

<210> 8

<211> 345

<212> PRT

<213> Artificial sequence

<400> 8

His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu

1 5 10 15

Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg Gly Gly

20 25 30

Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Gly Pro

35 40 45

Gly Gly Pro Gly Pro Gln Gly Pro Pro Gly Lys Asn Gly Glu Thr Gly

50 55 60

Pro Gln Gly Pro Pro Gly Pro Thr Gly Pro Gly Gly Asp Lys Gly Asp

65 70 75 80

Thr Gly Pro Pro Gly Pro Gln Gly Leu Gln Gly Leu Pro Gly Thr Gly

85 90 95

Gly Pro Pro Gly Glu Asn Gly Lys Pro Gly Glu Pro Gly Pro Lys Gly

100 105 110

Asp Ala Gly Ala Pro Gly Ala Pro Gly Gly Lys Gly Asp Ala Gly Ala

115 120 125

Pro Gly Glu Arg Gly Pro Pro Gly Leu Ala Gly Ala Pro Gly Leu Arg

130 135 140

Gly Gly Ala Gly Pro Pro Gly Pro Glu Gly Gly Lys Gly Ala Ala Gly

145 150 155 160

Pro Pro Gly Pro Pro Gly Ala Ala Gly Thr Pro Gly Leu Gln Gly Met

165 170 175

Pro Gly Glu Arg Gly Gly Leu Gly Ser Pro Gly Pro Lys Gly Asp Lys

180 185 190

Gly Glu Pro Gly Gly Pro Gly Ala Asp Gly Val Pro Gly Lys Asp Gly

195 200 205

Pro Arg Gly Pro Thr Gly Pro Ile Gly Pro Pro Gly Pro Ala Gly Gln

210 215 220

Pro Gly Asp Lys Gly Glu Gly Gly Ala Pro Gly Leu Pro Gly Ile Ala

225 230 235 240

Gly Pro Arg Gly Ser Pro Gly Glu Arg Gly Glu Thr Gly Pro Pro Gly

245 250 255

Pro Ala Gly Phe Pro Gly Ala Pro Gly Gln Asn Gly Glu Pro Gly Gly

260 265 270

Lys Gly Glu Arg Gly Ala Pro Gly Glu Lys Gly Glu Gly Gly Pro Pro

275 280 285

Gly Val Ala Gly Pro Pro Gly Lys Asp Gly Thr Ser Gly His Pro Gly

290 295 300

Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn Arg Gly Glu Arg Gly Ser

305 310 315 320

Glu Gly Ser Pro Gly His Pro Gly Gln Pro Gly Pro Pro Gly Pro Pro

325 330 335

Gly Ala Pro Gly Pro Cys Cys Gly Gly

340 345

<210> 9

<211> 210

<212> PRT

<213> Artificial sequence

<400> 9

His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu

1 5 10 15

Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg Gly Gly

20 25 30

Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Gly Leu

35 40 45

Gly Ser Pro Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Gly Pro Gly

50 55 60

Ala Asp Gly Val Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro

65 70 75 80

Ile Gly Pro Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Gly

85 90 95

Gly Ala Pro Gly Leu Pro Gly Ile Ala Gly Pro Arg Gly Ser Pro Gly

100 105 110

Glu Arg Gly Glu Thr Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala

115 120 125

Pro Gly Gln Asn Gly Glu Pro Gly Gly Lys Gly Glu Arg Gly Ala Pro

130 135 140

Gly Glu Lys Gly Glu Gly Gly Pro Pro Gly Val Ala Gly Pro Pro Gly

145 150 155 160

Lys Asp Gly Thr Ser Gly His Pro Gly Pro Ile Gly Pro Pro Gly Pro

165 170 175

Arg Gly Asn Arg Gly Glu Arg Gly Ser Glu Gly Ser Pro Gly His Pro

180 185 190

Gly Gln Pro Gly Pro Pro Gly Pro Pro Gly Ala Pro Gly Pro Cys Cys

195 200 205

Gly Gly

210

<210> 10

<211> 1035

<212> DNA

<213> Artificial sequence

<400> 10

cacggtgagg gtacttttac ctctgatgtt tcctcatact tggaagaaca agctgctaag 60

gaattcattg cctggctggt caaaggcaga ggaggtggcg gatccggtgg cggtgggtcc 120

ggaggaggtg gttcagctgg tccaggtggt ccaggtcctc aaggtcctcc aggtaagaat 180

ggtgaaactg gtcctcaggg acctccaggc ccaaccggtc ctggaggtga taagggtgat 240

accggaccac ctggcccaca aggcttgcag ggtctgccag gtacaggggg tccacccggt 300

gaaaacggca agcctggtga accaggccca aaaggtgacg ctggagctcc aggagcccca 360

ggaggtaagg gtgatgctgg tgcccccggt gagagaggcc caccaggttt ggccggtgct 420

cccggtctga gagggggagc tggtccacca ggacctgaag gcggaaaagg tgctgctggt 480

ccacctggac cacctggtgc tgccggaact ccaggactgc agggaatgcc tggtgaaaga 540

ggcggattgg gatctcctgg cccaaaagga gacaagggag agcctggtgg accaggggca 600

gatggagttc ctggaaaaga tggtcctcgt ggtccaacag gacctatcgg tcccccagga 660

cctgctggtc aacctggaga taaaggtgaa ggcggggctc caggattgcc tggtattgcc 720

ggccctagag gttctcccgg tgaaagaggt gagaccggcc cacctggtcc agctggcttc 780

cctggagcac caggtcagaa tggtgagcca ggtggtaagg gtgagagagg agctccaggt 840

gagaaggggg aaggtggtcc acctggtgtt gctggtccac caggtaagga tggtacatcc 900

ggtcatcctg gaccaattgg acctccaggg cctagaggta acaggggtga aaggggatct 960

gaaggatctc ctggacatcc aggtcagccc ggtcctcctg gtccacccgg agctcctggg 1020

ccatgctgtg gtggc 1035

<210> 11

<211> 630

<212> DNA

<213> Artificial sequence

<400> 11

cacggtgagg gtacttttac ctctgatgtt tcctcatact tggaagaaca agctgctaag 60

gaattcattg cctggctggt caaaggcaga ggaggtggcg gatccggtgg cggtgggtcc 120

ggaggaggtg gttcagctgg tttgggatct cctggcccaa aaggagacaa gggagagcct 180

ggtggaccag gggcagatgg agttcctgga aaagatggtc ctcgtggtcc aacaggacct 240

atcggtcccc caggacctgc tggtcaacct ggagataaag gtgaaggcgg ggctccagga 300

ttgcctggta ttgccggccc tagaggttct cccggtgaaa gaggtgagac cggcccacct 360

ggtccagctg gcttccctgg agcaccaggt cagaatggtg agccaggtgg taagggtgag 420

agaggagctc caggtgagaa gggggaaggt ggtccacctg gtgttgctgg tccaccaggt 480

aaggatggta catccggtca tcctggacca attggacctc cagggcctag aggtaacagg 540

ggtgaaaggg gatctgaagg atctcctgga catccaggtc agcccggtcc tcctggtcca 600

cccggagctc ctgggccatg ctgtggtggc 630

<210> 12

<211> 30

<212> PRT

<213> Artificial sequence

<400> 12

His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu

1 5 10 15

Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg

20 25 30

<210> 13

<211> 1210

<212> PRT

<213> Artificial sequence

<400> 13

His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu

1 5 10 15

Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg Gly Gly

20 25 30

Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Ala Met

35 40 45

Met Ser Phe Val Gln Lys Gly Ser Trp Leu Leu Leu Ala Leu Leu His

50 55 60

Pro Thr Ile Ile Leu Ala Gln Gln Glu Ala Val Glu Gly Gly Cys Ser

65 70 75 80

His Leu Gly Gln Ser Tyr Ala Asp Arg Asp Val Trp Lys Pro Glu Pro

85 90 95

Cys Gln Ile Cys Val Cys Asp Ser Gly Ser Val Leu Cys Asp Asp Ile

100 105 110

Ile Cys Asp Asp Gln Glu Leu Asp Cys Pro Asn Pro Glu Ile Pro Phe

115 120 125

Gly Glu Cys Cys Ala Val Cys Pro Gln Pro Pro Thr Ala Pro Thr Arg

130 135 140

Pro Pro Asn Gly Gln Gly Pro Gln Gly Pro Lys Gly Asp Pro Gly Pro

145 150 155 160

Pro Gly Ile Pro Gly Arg Asn Gly Asp Pro Gly Ile Pro Gly Gln Pro

165 170 175

Gly Ser Pro Gly Ser Pro Gly Pro Pro Gly Ile Cys Glu Ser Cys Pro

180 185 190

Thr Gly Pro Gln Asn Tyr Ser Pro Gln Tyr Asp Ser Tyr Asp Val Lys

195 200 205

Ser Gly Val Ala Val Gly Gly Leu Ala Gly Tyr Pro Gly Pro Ala Gly

210 215 220

Pro Pro Gly Pro Pro Gly Pro Pro Gly Thr Ser Gly His Pro Gly Ser

225 230 235 240

Pro Gly Ser Pro Gly Tyr Gln Gly Pro Pro Gly Glu Pro Gly Gln Ala

245 250 255

Gly Pro Ser Gly Pro Pro Gly Pro Pro Gly Ala Ile Gly Pro Ser Gly

260 265 270

Pro Ala Gly Lys Asp Gly Glu Ser Gly Arg Pro Gly Arg Pro Gly Glu

275 280 285

Arg Gly Leu Pro Gly Pro Pro Gly Ile Lys Gly Pro Ala Gly Ile Pro

290 295 300

Gly Phe Pro Gly Met Lys Gly His Arg Gly Phe Asp Gly Arg Asn Gly

305 310 315 320

Glu Lys Gly Glu Thr Gly Ala Pro Gly Leu Lys Gly Glu Asn Gly Leu

325 330 335

Pro Gly Glu Asn Gly Ala Pro Gly Pro Met Gly Pro Arg Gly Ala Pro

340 345 350

Gly Glu Arg Gly Arg Pro Gly Leu Pro Gly Ala Ala Gly Ala Arg Gly

355 360 365

Asn Asp Gly Ala Arg Gly Ser Asp Gly Gln Pro Gly Pro Pro Gly Pro

370 375 380

Pro Gly Thr Ala Gly Phe Pro Gly Ser Pro Gly Ala Lys Gly Glu Val

385 390 395 400

Gly Pro Ala Gly Ser Pro Gly Ser Asn Gly Ala Pro Gly Gln Arg Gly

405 410 415

Glu Pro Gly Pro Gln Gly His Ala Gly Ala Gln Gly Pro Pro Gly Pro

420 425 430

Pro Gly Ile Asn Gly Ser Pro Gly Gly Lys Gly Glu Met Gly Pro Ala

435 440 445

Gly Ile Pro Gly Ala Pro Gly Leu Met Gly Ala Arg Gly Pro Pro Gly

450 455 460

Pro Ala Gly Ala Asn Gly Ala Pro Gly Leu Arg Gly Gly Ala Gly Glu

465 470 475 480

Pro Gly Lys Asn Gly Ala Lys Gly Glu Pro Gly Pro Arg Gly Glu Arg

485 490 495

Gly Glu Ala Gly Ile Pro Gly Val Pro Gly Ala Lys Gly Glu Asp Gly

500 505 510

Lys Asp Gly Ser Pro Gly Glu Pro Gly Ala Asn Gly Leu Pro Gly Ala

515 520 525

Ala Gly Glu Arg Gly Ala Pro Gly Phe Arg Gly Pro Ala Gly Pro Asn

530 535 540

Gly Ile Pro Gly Glu Lys Gly Pro Ala Gly Glu Arg Gly Ala Pro Gly

545 550 555 560

Pro Ala Gly Pro Arg Gly Ala Ala Gly Glu Pro Gly Arg Asp Gly Val

565 570 575

Pro Gly Gly Pro Gly Met Arg Gly Met Pro Gly Ser Pro Gly Gly Pro

580 585 590

Gly Ser Asp Gly Lys Pro Gly Pro Pro Gly Ser Gln Gly Glu Ser Gly

595 600 605

Arg Pro Gly Pro Pro Gly Pro Ser Gly Pro Arg Gly Gln Pro Gly Val

610 615 620

Met Gly Phe Pro Gly Pro Lys Gly Asn Asp Gly Ala Pro Gly Lys Asn

625 630 635 640

Gly Glu Arg Gly Gly Pro Gly Gly Pro Gly Pro Gln Gly Pro Pro Gly

645 650 655

Lys Asn Gly Glu Thr Gly Pro Gln Gly Pro Pro Gly Pro Thr Gly Pro

660 665 670

Gly Gly Asp Lys Gly Asp Thr Gly Pro Pro Gly Pro Gln Gly Leu Gln

675 680 685

Gly Leu Pro Gly Thr Gly Gly Pro Pro Gly Glu Asn Gly Lys Pro Gly

690 695 700

Glu Pro Gly Pro Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly Gly

705 710 715 720

Lys Gly Asp Ala Gly Ala Pro Gly Glu Arg Gly Pro Pro Gly Leu Ala

725 730 735

Gly Ala Pro Gly Leu Arg Gly Gly Ala Gly Pro Pro Gly Pro Glu Gly

740 745 750

Gly Lys Gly Ala Ala Gly Pro Pro Gly Pro Pro Gly Ala Ala Gly Thr

755 760 765

Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Gly Leu Gly Ser Pro

770 775 780

Gly Pro Lys Gly Asp Lys Gly Glu Pro Gly Gly Pro Gly Ala Asp Gly

785 790 795 800

Val Pro Gly Lys Asp Gly Pro Arg Gly Pro Thr Gly Pro Ile Gly Pro

805 810 815

Pro Gly Pro Ala Gly Gln Pro Gly Asp Lys Gly Glu Gly Gly Ala Pro

820 825 830

Gly Leu Pro Gly Ile Ala Gly Pro Arg Gly Ser Pro Gly Glu Arg Gly

835 840 845

Glu Thr Gly Pro Pro Gly Pro Ala Gly Phe Pro Gly Ala Pro Gly Gln

850 855 860

Asn Gly Glu Pro Gly Gly Lys Gly Glu Arg Gly Ala Pro Gly Glu Lys

865 870 875 880

Gly Glu Gly Gly Pro Pro Gly Val Ala Gly Pro Pro Gly Lys Asp Gly

885 890 895

Thr Ser Gly His Pro Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly Asn

900 905 910

Arg Gly Glu Arg Gly Ser Glu Gly Ser Pro Gly His Pro Gly Gln Pro

915 920 925

Gly Pro Pro Gly Pro Pro Gly Ala Pro Gly Pro Cys Cys Gly Gly Val

930 935 940

Gly Ala Ala Ala Ile Ala Gly Ile Gly Gly Glu Lys Ala Gly Gly Phe

945 950 955 960

Ala Pro Tyr Tyr Gly Asp Glu Pro Met Asp Phe Lys Ile Asn Thr Asp

965 970 975

Glu Ile Met Thr Ser Leu Lys Ser Val Asn Gly Gln Ile Glu Ser Leu

980 985 990

Ile Ser Pro Asp Gly Ser Arg Lys Asn Pro Ala Arg Asn Cys Arg Asp

995 1000 1005

Leu Lys Phe Cys His Pro Glu Leu Lys Ser Gly Glu Tyr Trp Val

1010 1015 1020

Asp Pro Asn Gln Gly Cys Lys Leu Asp Ala Ile Lys Val Phe Cys

1025 1030 1035

Asn Met Glu Thr Gly Glu Thr Cys Ile Ser Ala Asn Pro Leu Asn

1040 1045 1050

Val Pro Arg Lys His Trp Trp Thr Asp Ser Ser Ala Glu Lys Lys

1055 1060 1065

His Val Trp Phe Gly Glu Ser Met Asp Gly Gly Phe Gln Phe Ser

1070 1075 1080

Tyr Gly Asn Pro Glu Leu Pro Glu Asp Val Leu Asp Val Gln Leu

1085 1090 1095

Ala Phe Leu Arg Leu Leu Ser Ser Arg Ala Ser Gln Asn Ile Thr

1100 1105 1110

Tyr His Cys Lys Asn Ser Ile Ala Tyr Met Asp Gln Ala Ser Gly

1115 1120 1125

Asn Val Lys Lys Ala Leu Lys Leu Met Gly Ser Asn Glu Gly Glu

1130 1135 1140

Phe Lys Ala Glu Gly Asn Ser Lys Phe Thr Tyr Thr Val Leu Glu

1145 1150 1155

Asp Gly Cys Thr Lys His Thr Gly Glu Trp Ser Lys Thr Val Phe

1160 1165 1170

Glu Tyr Arg Thr Arg Lys Ala Val Arg Leu Pro Ile Val Asp Ile

1175 1180 1185

Ala Pro Tyr Asp Ile Gly Gly Pro Asp Gln Glu Phe Gly Val Asp

1190 1195 1200

Val Gly Pro Val Cys Phe Leu

1205 1210
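As a consistency check on the listing, the opening of the DNA sequence in SEQ ID NO: 10 can be translated and compared with the GLP-1 analog in SEQ ID NO: 12. The following is a minimal Python sketch and not part of the filing: the function and variable names are illustrative, the standard genetic code is assumed, and only the first 120 nucleotides of SEQ ID NO: 10 are copied in. The stated lengths also agree with each other, since 1035 nt is 345 codons (the length given for SEQ ID NO: 8) and 630 nt is 210 codons (the length given for SEQ ID NO: 9).

```python
from itertools import product

# Standard genetic code (assumed), bases ordered T, C, A, G; "*" is a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring whitespace."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


# First two rows (120 nt) of SEQ ID NO: 10 as printed in the listing.
SEQ10_FIRST_120_NT = (
    "cacggtgagg gtacttttac ctctgatgtt tcctcatact tggaagaaca agctgctaag "
    "gaattcattg cctggctggt caaaggcaga ggaggtggcg gatccggtgg cggtgggtcc"
)

# SEQ ID NO: 12 as printed in the listing.
SEQ12 = to_one_letter(
    "His Gly Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Glu "
    "Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Lys Gly Arg"
)

protein_start = translate(SEQ10_FIRST_120_NT)
print(protein_start)                    # HGEGTFTSDVSSYLEEQAAKEFIAWLVKGRGGGGSGGGGS
print(protein_start.startswith(SEQ12))  # True
```

Under these assumptions, the first 30 codons reproduce SEQ ID NO: 12 and the next ten encode two GGGGS repeats.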

Claims

1. A fusion protein, characterized in that the structure of the fusion protein is as shown in formula I:

A-L-B (I)

wherein A is a GLP-1 analog, B is the human type III collagen α1 chain, L is absent or a linker peptide, and each "-" is independently a linker peptide or a peptide bond; and the GLP-1 analog has the following sequence:

His-Xaa₈-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Val-Ser-Ser-Tyr-Leu-Glu-Xaa₂₂-Gln-Ala-Ala-Lys-Glu-Phe-Ile-Ala-Trp-Leu-Val-Lys-Gly-Arg-Xaa₃₇

2. The fusion protein of claim 1, characterized in that the GLP-1 analog has the amino acid sequence shown in SEQ ID NO.: 6 or 12.

3. The fusion protein of claim 1, characterized in that the human type III collagen α1 chain has the amino acid sequence shown in SEQ ID NO.: 2, 3 or 4.

4. The fusion protein of claim 1, characterized in that the fusion protein has the amino acid sequence shown in SEQ ID NO.: 8, 9 or 13.

5. An oligomer, characterized in that the oligomer comprises the fusion protein of claim 1.

6. An isolated polynucleotide, characterized in that the polynucleotide encodes the fusion protein of claim 1.

7. A vector, characterized in that the vector comprises the polynucleotide of claim 6.

8. A host cell, characterized in that the host cell contains the vector of claim 7, or has the exogenous polynucleotide of claim 6 integrated into its chromosome, or expresses the fusion protein of claim 1, or expresses the oligomer of claim 5.

9. A pharmaceutical composition, characterized in that the pharmaceutical composition comprises the fusion protein of claim 1 or the oligomer of claim 5, and a pharmaceutically acceptable carrier or excipient.

10. Use of the fusion protein of claim 1, the oligomer of claim 5, the polynucleotide of claim 6, the vector of claim 7, or the host cell of claim 8, characterized in that it is used for preparing a drug or preparation for preventing and/or treating diabetes.
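The relationship between the formula in claim 1 and the sequences named in claims 2 and 4 can be sketched in Python. This is an illustrative check only, not part of the claims, and all names in it are hypothetical. The text above does not define the Xaa positions, so the sketch treats them as free positions and assumes that Xaa₃₇ may be absent (SEQ ID NO.: 12 has 30 residues). Reading the Gly/Ser-rich stretch after the GLP-1 analog as the linker L is likewise an interpretation of the listing.

```python
# Claim 1 formula in one-letter code. "X" marks Xaa8, Xaa22 and Xaa37;
# numbering follows GLP-1(7-37), so string index 0 corresponds to His7.
FORMULA = "HXEGTFTSDVSSYLEXQAAKEFIAWLVKGRX"

# Sequences copied from the listing, converted to one-letter code.
SEQ6 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGRG"   # SEQ ID NO.: 6, 31 residues
SEQ12 = "HGEGTFTSDVSSYLEEQAAKEFIAWLVKGR"   # SEQ ID NO.: 12, 30 residues
# First 48 residues of SEQ ID NO.: 9 (first three rows of the listing).
SEQ9_START = "HGEGTFTSDVSSYLEEQAAKEFIAWLVKGRGGGGSGGGGSGGGGSAGL"


def variable_residues(seq):
    """Return {position: residue} for the Xaa positions, or None on mismatch.

    Assumption: a 30-residue sequence is accepted, with Xaa37 reported as "-".
    """
    if len(seq) not in (len(FORMULA) - 1, len(FORMULA)):
        return None
    padded = seq.ljust(len(FORMULA), "-")
    found = {}
    for offset, (expected, actual) in enumerate(zip(FORMULA, padded)):
        if expected == "X":
            found[offset + 7] = actual
        elif expected != actual:
            return None  # a fixed position of the formula differs
    return found


print(variable_residues(SEQ6))    # {8: 'A', 22: 'G', 37: 'G'}
print(variable_residues(SEQ12))   # {8: 'G', 22: 'E', 37: '-'}

# A-L-B reading of the start of SEQ ID NO.: 9: A is SEQ ID NO.: 12, and the
# residues that follow begin with three GGGGS repeats and one Ala.
a_part, rest = SEQ9_START[:len(SEQ12)], SEQ9_START[len(SEQ12):]
print(a_part == SEQ12)                         # True
print(rest.startswith("GGGGS" * 3 + "A"))      # True
```

On this reading, SEQ ID NO.: 6 and SEQ ID NO.: 12 differ only at positions 8, 22 and 37, which are exactly the variable positions of the formula.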