WO2016161983A1 - Fusion carrier protein and its use in promoting expression of a target protein or polypeptide - Google Patents
- Publication number
- WO2016161983A1 (application PCT/CN2016/078938)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- protein
- fusion
- seq
- expression
- amino acid
- Prior art date
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 213
- 102000004169 proteins and genes Human genes 0.000 title claims abstract description 158
- 230000014509 gene expression Effects 0.000 title claims abstract description 141
- 230000004927 fusion Effects 0.000 title claims abstract description 138
- 108090000765 processed proteins & peptides Proteins 0.000 title claims abstract description 89
- 102000004196 processed proteins & peptides Human genes 0.000 title claims abstract description 82
- 229920001184 polypeptide Polymers 0.000 title claims abstract description 78
- 108010078791 Carrier Proteins Proteins 0.000 title claims abstract description 62
- 102000014914 Carrier Proteins Human genes 0.000 title claims abstract description 61
- 230000001737 promoting effect Effects 0.000 title claims abstract description 6
- 210000003000 inclusion body Anatomy 0.000 claims abstract description 91
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 claims abstract description 86
- 239000013604 expression vector Substances 0.000 claims abstract description 74
- 210000004027 cell Anatomy 0.000 claims abstract description 65
- 102000004877 Insulin Human genes 0.000 claims abstract description 48
- 108090001061 Insulin Proteins 0.000 claims abstract description 48
- 229940125396 insulin Drugs 0.000 claims abstract description 42
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 35
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 33
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 33
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract 19
- 108020001507 fusion proteins Proteins 0.000 claims description 124
- 102000037865 fusion proteins Human genes 0.000 claims description 123
- 239000013598 vector Substances 0.000 claims description 67
- 150000001413 amino acids Chemical class 0.000 claims description 32
- 108010024409 linaclotide Proteins 0.000 claims description 29
- KXGCNMMJRFDFNR-WDRJZQOASA-N linaclotide Chemical compound C([C@H](NC(=O)[C@@H]1CSSC[C@H]2C(=O)N[C@H]3CSSC[C@H](N)C(=O)N[C@H](C(N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=4C=CC(O)=CC=4)C(=O)N2)=O)CSSC[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]2CCCN2C(=O)[C@H](CC(N)=O)NC3=O)C(=O)N[C@H](C(NCC(=O)N1)=O)[C@H](O)C)C(O)=O)C1=CC=C(O)C=C1 KXGCNMMJRFDFNR-WDRJZQOASA-N 0.000 claims description 26
- 229960000812 linaclotide Drugs 0.000 claims description 25
- 230000004048 modification Effects 0.000 claims description 21
- 238000012986 modification Methods 0.000 claims description 21
- DTHNMHAUYICORS-KTKZVXAJSA-N Glucagon-like peptide 1 Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(N)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC=1N=CNC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=CC=C1 DTHNMHAUYICORS-KTKZVXAJSA-N 0.000 claims description 20
- 102400000322 Glucagon-like peptide 1 Human genes 0.000 claims description 20
- 125000000539 amino acid group Chemical group 0.000 claims description 19
- PBGKTOXHQIOBKM-FHFVDXKLSA-N insulin (human) Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@H]1CSSC[C@H]2C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@H](C(=O)N[C@H](C(N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=3C=CC(O)=CC=3)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=3C=CC(O)=CC=3)C(=O)N[C@@H](CSSC[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=3C=CC(O)=CC=3)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=3NC=NC=3)NC(=O)[C@H](CO)NC(=O)CNC1=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O)=O)CSSC[C@@H](C(N2)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](NC(=O)CN)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC=1C=CC=CC=1)C(C)C)C1=CN=CN1 PBGKTOXHQIOBKM-FHFVDXKLSA-N 0.000 claims description 18
- 238000003776 cleavage reaction Methods 0.000 claims description 17
- 230000007017 scission Effects 0.000 claims description 17
- 101000976075 Homo sapiens Insulin Proteins 0.000 claims description 16
- 238000000746 purification Methods 0.000 claims description 11
- PEASPLKKXBYDKL-FXEVSJAOSA-N enfuvirtide Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(C)=O)[C@@H](C)O)[C@@H](C)CC)C1=CN=CN1 PEASPLKKXBYDKL-FXEVSJAOSA-N 0.000 claims description 10
- 102400000319 Oxyntomodulin Human genes 0.000 claims description 8
- 101800001388 Oxyntomodulin Proteins 0.000 claims description 8
- 230000001965 increasing effect Effects 0.000 claims description 8
- 101000772194 Homo sapiens Transthyretin Proteins 0.000 claims description 7
- 102000056556 human TTR Human genes 0.000 claims description 7
- PXZWGQLGAKCNKD-DPNMSELWSA-N molport-023-276-326 Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O)[C@@H](C)O)C(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C1=CC=CC=C1 PXZWGQLGAKCNKD-DPNMSELWSA-N 0.000 claims description 7
- 108010032976 Enfuvirtide Proteins 0.000 claims description 6
- 230000009435 amidation Effects 0.000 claims description 5
- 238000007112 amidation reaction Methods 0.000 claims description 5
- 229960002062 enfuvirtide Drugs 0.000 claims description 5
- 101710135898 Myc proto-oncogene protein Proteins 0.000 claims description 4
- 102100038895 Myc proto-oncogene protein Human genes 0.000 claims description 4
- 101710150448 Transcriptional regulator Myc Proteins 0.000 claims description 4
- 230000021736 acetylation Effects 0.000 claims description 4
- 238000006640 acetylation reaction Methods 0.000 claims description 4
- 230000029936 alkylation Effects 0.000 claims description 4
- 238000005804 alkylation reaction Methods 0.000 claims description 4
- 230000006287 biotinylation Effects 0.000 claims description 4
- 238000007413 biotinylation Methods 0.000 claims description 4
- 108010048367 enhanced green fluorescent protein Proteins 0.000 claims description 4
- 230000013595 glycosylation Effects 0.000 claims description 4
- 238000006206 glycosylation reaction Methods 0.000 claims description 4
- 230000026731 phosphorylation Effects 0.000 claims description 4
- 238000006366 phosphorylation reaction Methods 0.000 claims description 4
- 238000007363 ring formation reaction Methods 0.000 claims description 4
- 238000012217 deletion Methods 0.000 claims description 3
- 230000037430 deletion Effects 0.000 claims description 3
- 238000006467 substitution reaction Methods 0.000 claims description 3
- -1 polyethylene Polymers 0.000 claims description 2
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 claims 2
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 claims 1
- 101710198884 GATA-type zinc finger protein 1 Proteins 0.000 claims 1
- 239000004698 Polyethylene Substances 0.000 claims 1
- WGCNASOHLSPBMP-UHFFFAOYSA-N hydroxyacetaldehyde Natural products OCC=O WGCNASOHLSPBMP-UHFFFAOYSA-N 0.000 claims 1
- 229920000573 polyethylene Polymers 0.000 claims 1
- 239000000523 sample Substances 0.000 description 63
- 241000588724 Escherichia coli Species 0.000 description 46
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 37
- 230000001580 bacterial effect Effects 0.000 description 32
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 28
- 238000001262 western blot Methods 0.000 description 27
- 238000001962 electrophoresis Methods 0.000 description 26
- 239000000047 product Substances 0.000 description 25
- 101800000224 Glucagon-like peptide 1 Proteins 0.000 description 19
- 102000035195 Peptidases Human genes 0.000 description 17
- 108091005804 Peptidases Proteins 0.000 description 17
- 239000004365 Protease Substances 0.000 description 17
- 238000000034 method Methods 0.000 description 17
- 239000013595 supernatant sample Substances 0.000 description 16
- 230000003698 anagen phase Effects 0.000 description 14
- 239000012634 fragment Substances 0.000 description 14
- 235000019419 proteases Nutrition 0.000 description 14
- 108010076181 Proinsulin Proteins 0.000 description 13
- 239000003550 marker Substances 0.000 description 13
- 108090000631 Trypsin Proteins 0.000 description 12
- 102000004142 Trypsin Human genes 0.000 description 12
- 230000015556 catabolic process Effects 0.000 description 12
- 238000006731 degradation reaction Methods 0.000 description 12
- 238000004519 manufacturing process Methods 0.000 description 12
- 239000012588 trypsin Substances 0.000 description 12
- 239000007857 degradation product Substances 0.000 description 11
- 150000003384 small molecules Chemical class 0.000 description 10
- 238000001514 detection method Methods 0.000 description 9
- 125000006850 spacer group Chemical group 0.000 description 9
- 239000006228 supernatant Substances 0.000 description 9
- 241000894006 Bacteria Species 0.000 description 8
- 241000282414 Homo sapiens Species 0.000 description 8
- 238000003259 recombinant expression Methods 0.000 description 8
- OMLWNBVRVJYMBQ-YUMQZZPRSA-N Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OMLWNBVRVJYMBQ-YUMQZZPRSA-N 0.000 description 7
- 108010042653 IgA receptor Proteins 0.000 description 7
- 241001465754 Metazoa Species 0.000 description 7
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 7
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 7
- 108010068380 arginylarginine Proteins 0.000 description 7
- 230000008901 benefit Effects 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 108010075254 C-Peptide Proteins 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 6
- 102100034014 Prolyl 3-hydroxylase 3 Human genes 0.000 description 6
- 238000004949 mass spectrometry Methods 0.000 description 6
- 102000003670 Carboxypeptidase B Human genes 0.000 description 5
- 108090000087 Carboxypeptidase B Proteins 0.000 description 5
- 241001198387 Escherichia coli BL21(DE3) Species 0.000 description 5
- 108091028043 Nucleic acid sequence Proteins 0.000 description 5
- 108010076504 Protein Sorting Signals Proteins 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 230000015572 biosynthetic process Effects 0.000 description 5
- UQLDLKMNUJERMK-UHFFFAOYSA-L di(octadecanoyloxy)lead Chemical compound [Pb+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O UQLDLKMNUJERMK-UHFFFAOYSA-L 0.000 description 5
- 230000029087 digestion Effects 0.000 description 5
- 239000000499 gel Substances 0.000 description 5
- 239000000203 mixture Substances 0.000 description 5
- 230000009465 prokaryotic expression Effects 0.000 description 5
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 4
- 102000002933 Thioredoxin Human genes 0.000 description 4
- 229940098773 bovine serum albumin Drugs 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 229920001223 polyethylene glycol Polymers 0.000 description 4
- 230000010473 stable expression Effects 0.000 description 4
- 108060008226 thioredoxin Proteins 0.000 description 4
- 229940094937 thioredoxin Drugs 0.000 description 4
- 231100000331 toxic Toxicity 0.000 description 4
- 230000002588 toxic effect Effects 0.000 description 4
- 238000011282 treatment Methods 0.000 description 4
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 3
- 101150076489 B gene Proteins 0.000 description 3
- 108020004414 DNA Proteins 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 3
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 3
- 238000009835 boiling Methods 0.000 description 3
- 210000004899 c-terminal region Anatomy 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 3
- 238000001976 enzyme digestion Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 238000010353 genetic engineering Methods 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 230000006698 induction Effects 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 150000002500 ions Chemical class 0.000 description 3
- 239000012160 loading buffer Substances 0.000 description 3
- 238000001819 mass spectrum Methods 0.000 description 3
- 108020004999 messenger RNA Proteins 0.000 description 3
- 239000000843 powder Substances 0.000 description 3
- 230000002829 reductive effect Effects 0.000 description 3
- 238000004153 renaturation Methods 0.000 description 3
- 108091008146 restriction endonucleases Proteins 0.000 description 3
- 230000002441 reversible effect Effects 0.000 description 3
- 235000020183 skimmed milk Nutrition 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N Ammonia Chemical compound N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- 206010010774 Constipation Diseases 0.000 description 2
- 108010011459 Exenatide Proteins 0.000 description 2
- 102000005720 Glutathione transferase Human genes 0.000 description 2
- 108010070675 Glutathione transferase Proteins 0.000 description 2
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 2
- SEQKRHFRPICQDD-UHFFFAOYSA-N N-tris(hydroxymethyl)methylglycine Chemical compound OCC(CO)(CO)[NH2+]CC([O-])=O SEQKRHFRPICQDD-UHFFFAOYSA-N 0.000 description 2
- 239000002033 PVDF binder Substances 0.000 description 2
- 239000002202 Polyethylene glycol Substances 0.000 description 2
- 108010071690 Prealbumin Proteins 0.000 description 2
- 108010076818 TEV protease Proteins 0.000 description 2
- 101150091380 TTR gene Proteins 0.000 description 2
- 102000009190 Transthyretin Human genes 0.000 description 2
- DTQVDTLACAAQTR-UHFFFAOYSA-N Trifluoroacetic acid Chemical compound OC(=O)C(F)(F)F DTQVDTLACAAQTR-UHFFFAOYSA-N 0.000 description 2
- 239000013504 Triton X-100 Substances 0.000 description 2
- 229920004890 Triton X-100 Polymers 0.000 description 2
- 108090000848 Ubiquitin Proteins 0.000 description 2
- 102000044159 Ubiquitin Human genes 0.000 description 2
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 2
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 108091007433 antigens Proteins 0.000 description 2
- 102000036639 antigens Human genes 0.000 description 2
- 210000000227 basophil cell of anterior lobe of hypophysis Anatomy 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- NKLPQNGYXWVELD-UHFFFAOYSA-M coomassie brilliant blue Chemical compound [Na+].C1=CC(OCC)=CC=C1NC1=CC=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C=CC(=CC=2)N(CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=C1 NKLPQNGYXWVELD-UHFFFAOYSA-M 0.000 description 2
- 238000005520 cutting process Methods 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- 206010012601 diabetes mellitus Diseases 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 229960001519 exenatide Drugs 0.000 description 2
- 238000012215 gene cloning Methods 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 230000017730 intein-mediated protein splicing Effects 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 2
- 108010013359 miniproinsulin Proteins 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000002018 overexpression Effects 0.000 description 2
- 230000001590 oxidative effect Effects 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 239000013612 plasmid Substances 0.000 description 2
- 229920002981 polyvinylidene fluoride Polymers 0.000 description 2
- GCYXWQUSHADNBF-AAEALURTSA-N preproglucagon 78-108 Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC=1N=CNC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=CC=C1 GCYXWQUSHADNBF-AAEALURTSA-N 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 235000019624 protein content Nutrition 0.000 description 2
- 230000012743 protein tagging Effects 0.000 description 2
- 238000011002 quantification Methods 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 230000003248 secreting effect Effects 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- DEQANNDTNATYII-OULOTJBUSA-N (4r,7s,10s,13r,16s,19r)-10-(4-aminobutyl)-19-[[(2r)-2-amino-3-phenylpropanoyl]amino]-16-benzyl-n-[(2r,3r)-1,3-dihydroxybutan-2-yl]-7-[(1r)-1-hydroxyethyl]-13-(1h-indol-3-ylmethyl)-6,9,12,15,18-pentaoxo-1,2-dithia-5,8,11,14,17-pentazacycloicosane-4-carboxa Chemical compound C([C@@H](N)C(=O)N[C@H]1CSSC[C@H](NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](CC=2C3=CC=CC=C3NC=2)NC(=O)[C@H](CC=2C=CC=CC=2)NC1=O)C(=O)N[C@H](CO)[C@H](O)C)C1=CC=CC=C1 DEQANNDTNATYII-OULOTJBUSA-N 0.000 description 1
- YQAXFVHNHSPUPO-RNJOBUHISA-N 2-[[(2s)-2-[[2-[[(2s,4r)-1-[(2s)-1-(2-aminoacetyl)pyrrolidine-2-carbonyl]-4-hydroxypyrrolidine-2-carbonyl]amino]acetyl]amino]propanoyl]amino]acetic acid Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1C[C@@H](O)CN1C(=O)[C@H]1N(C(=O)CN)CCC1 YQAXFVHNHSPUPO-RNJOBUHISA-N 0.000 description 1
- VOUAQYXWVJDEQY-QENPJCQMSA-N 33017-11-7 Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)NCC(=O)NCC(=O)N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N1[C@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)CCC1 VOUAQYXWVJDEQY-QENPJCQMSA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- 102000044503 Antimicrobial Peptides Human genes 0.000 description 1
- 108700042778 Antimicrobial Peptides Proteins 0.000 description 1
- 240000006439 Aspergillus oryzae Species 0.000 description 1
- 235000002247 Aspergillus oryzae Nutrition 0.000 description 1
- 230000006974 Aβ toxicity Effects 0.000 description 1
- 101100489084 Bacillus subtilis (strain 168) yrbF gene Proteins 0.000 description 1
- 102100026189 Beta-galactosidase Human genes 0.000 description 1
- 239000004135 Bone phosphate Substances 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 101100337060 Caenorhabditis elegans glp-1 gene Proteins 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 102000008186 Collagen Human genes 0.000 description 1
- 108010035532 Collagen Proteins 0.000 description 1
- 101710200374 Crotamine Proteins 0.000 description 1
- NZNMSOFKMUBTKW-UHFFFAOYSA-N Cyclohexanecarboxylic acid Natural products OC(=O)C1CCCCC1 NZNMSOFKMUBTKW-UHFFFAOYSA-N 0.000 description 1
- 108090000695 Cytokines Proteins 0.000 description 1
- 102000004127 Cytokines Human genes 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 108010082990 Dermatophagoides farinae antigen f 7 Proteins 0.000 description 1
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 1
- 102100025012 Dipeptidyl peptidase 4 Human genes 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 101710194146 Ecotin Proteins 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 102000004190 Enzymes Human genes 0.000 description 1
- 108090000790 Enzymes Proteins 0.000 description 1
- 101100457353 Escherichia coli (strain K12) mlaF gene Proteins 0.000 description 1
- HTQBXNHDCUEHJF-XWLPCZSASA-N Exenatide Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(=O)NCC(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CO)C(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)CNC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=CC=C1 HTQBXNHDCUEHJF-XWLPCZSASA-N 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 108010074860 Factor Xa Proteins 0.000 description 1
- 241000710198 Foot-and-mouth disease virus Species 0.000 description 1
- 108090000698 Formate Dehydrogenases Proteins 0.000 description 1
- 241000027294 Fusi Species 0.000 description 1
- 108010010803 Gelatin Proteins 0.000 description 1
- 102000051325 Glucagon Human genes 0.000 description 1
- 108060003199 Glucagon Proteins 0.000 description 1
- 101800004266 Glucagon-like peptide 1(7-37) Proteins 0.000 description 1
- SXRSQZLOMIGNAQ-UHFFFAOYSA-N Glutaraldehyde Chemical compound O=CCCCC=O SXRSQZLOMIGNAQ-UHFFFAOYSA-N 0.000 description 1
- 229920002527 Glycogen Polymers 0.000 description 1
- 108010078321 Guanylate Cyclase Proteins 0.000 description 1
- 102000014469 Guanylate cyclase Human genes 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- 101001135770 Homo sapiens Parathyroid hormone Proteins 0.000 description 1
- 101001135995 Homo sapiens Probable peptidyl-tRNA hydrolase Proteins 0.000 description 1
- 101000684208 Homo sapiens Prolyl endopeptidase FAP Proteins 0.000 description 1
- 102000002265 Human Growth Hormone Human genes 0.000 description 1
- 108010000521 Human Growth Hormone Proteins 0.000 description 1
- 239000000854 Human Growth Hormone Substances 0.000 description 1
- 241000235058 Komagataella pastoris Species 0.000 description 1
- 108010092217 Long-Acting Insulin Proteins 0.000 description 1
- 102000016261 Long-Acting Insulin Human genes 0.000 description 1
- 229940100066 Long-acting insulin Drugs 0.000 description 1
- NPBGTPKLVJEOBE-IUCAKERBSA-N Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCNC(N)=N NPBGTPKLVJEOBE-IUCAKERBSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 101710157759 Metchnikowin Proteins 0.000 description 1
- 108010085220 Multiprotein Complexes Proteins 0.000 description 1
- 102000007474 Multiprotein Complexes Human genes 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 241000186359 Mycobacterium Species 0.000 description 1
- 208000008589 Obesity Diseases 0.000 description 1
- 108010016076 Octreotide Proteins 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 238000012220 PCR site-directed mutagenesis Methods 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 108010033276 Peptide Fragments Proteins 0.000 description 1
- 102000007079 Peptide Fragments Human genes 0.000 description 1
- 229920001213 Polysorbate 20 Polymers 0.000 description 1
- 108010005991 Pork Regular Insulin Proteins 0.000 description 1
- 101001045444 Proteus vulgaris Endoribonuclease HigB Proteins 0.000 description 1
- 101001100822 Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) Pyocin-S2 Proteins 0.000 description 1
- 101001100831 Pseudomonas aeruginosa Pyocin-S1 Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 241000235343 Saccharomycetales Species 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- 102000005806 Serine Peptidase Inhibitor Kazal-Type 5 Human genes 0.000 description 1
- 108010005020 Serine Peptidase Inhibitor Kazal-Type 5 Proteins 0.000 description 1
- 108010088160 Staphylococcal Protein A Proteins 0.000 description 1
- 101710172711 Structural protein Proteins 0.000 description 1
- 108010056079 Subtilisins Proteins 0.000 description 1
- 102000005158 Subtilisins Human genes 0.000 description 1
- 241000282898 Sus scrofa Species 0.000 description 1
- UZMAPBJVXOGOFT-UHFFFAOYSA-N Syringetin Natural products COC1=C(O)C(OC)=CC(C2=C(C(=O)C3=C(O)C=C(O)C=C3O2)O)=C1 UZMAPBJVXOGOFT-UHFFFAOYSA-N 0.000 description 1
- 239000007997 Tricine buffer Substances 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 239000000556 agonist Substances 0.000 description 1
- AFVLVVWMAFSXCK-VMPITWQZSA-N alpha-cyano-4-hydroxycinnamic acid Chemical compound OC(=O)C(\C#N)=C\C1=CC=C(O)C=C1 AFVLVVWMAFSXCK-VMPITWQZSA-N 0.000 description 1
- 229910021529 ammonia Inorganic materials 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 230000006933 amyloid-beta aggregation Effects 0.000 description 1
- 230000000844 anti-bacterial effect Effects 0.000 description 1
- 230000000840 anti-viral effect Effects 0.000 description 1
- 108010082685 antiarrhythmic peptide Proteins 0.000 description 1
- 125000000637 arginyl group Chemical group N[C@@H](CCCNC(N)=N)C(=O)* 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 108010005774 beta-Galactosidase Proteins 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 239000004202 carbamide Substances 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- JUFFVKRROAPVBI-PVOYSMBESA-N chembl1210015 Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N[C@H]1[C@@H]([C@@H](O)[C@H](O[C@H]2[C@@H]([C@@H](O)[C@@H](O)[C@@H](CO[C@]3(O[C@@H](C[C@H](O)[C@H](O)CO)[C@H](NC(C)=O)[C@@H](O)C3)C(O)=O)O2)O)[C@@H](CO)O1)NC(C)=O)C(=O)NCC(=O)NCC(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CO)C(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)CNC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=CC=C1 JUFFVKRROAPVBI-PVOYSMBESA-N 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- PEFQQQGFYPMQLH-WFQFKEFWSA-N crotamin Chemical compound C([C@H]1C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(N[C@@H]2C(=O)N[C@@H](CC(C)C)C(=O)N3CCC[C@H]3C(=O)N3CCC[C@H]3C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=3C=CC=CC=3)C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H]3CSSC[C@H](NC(=O)[C@H](CC=4NC=NC=4)NC(=O)CNC(=O)CNC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC=4NC=NC=4)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC=4C=CC(O)=CC=4)CSSC[C@@H](C(=O)N[C@@H](CSSC2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC=2C4=CC=CC=C4NC=2)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CC=2C4=CC=CC=C4NC=2)NC(=O)[C@H](CCCNC(N)=N)NC3=O)C(=O)N1)=O)[C@@H](C)CC)C1=CC=CC=C1 PEFQQQGFYPMQLH-WFQFKEFWSA-N 0.000 description 1
- ATDGTVJJHBUTRL-UHFFFAOYSA-N cyanogen bromide Chemical compound BrC#N ATDGTVJJHBUTRL-UHFFFAOYSA-N 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 230000003013 cytotoxicity Effects 0.000 description 1
- 231100000135 cytotoxicity Toxicity 0.000 description 1
- 230000034994 death Effects 0.000 description 1
- 238000004042 decolorization Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- KCFYHBSOLOXZIF-UHFFFAOYSA-N dihydrochrysin Natural products COC1=C(O)C(OC)=CC(C2OC3=CC(O)=CC(O)=C3C(=O)C2)=C1 KCFYHBSOLOXZIF-UHFFFAOYSA-N 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 238000010230 functional analysis Methods 0.000 description 1
- 210000004211 gastric acid Anatomy 0.000 description 1
- 102000054767 gene variant Human genes 0.000 description 1
- 101150102822 glp-1 gene Proteins 0.000 description 1
- MASNOZXLGMXCHN-ZLPAWPGGSA-N glucagon Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)C(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C1=CC=CC=C1 MASNOZXLGMXCHN-ZLPAWPGGSA-N 0.000 description 1
- 229960004666 glucagon Drugs 0.000 description 1
- 229940096919 glycogen Drugs 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 102000058004 human PTH Human genes 0.000 description 1
- 239000005457 ice water Substances 0.000 description 1
- 238000003119 immunoblot Methods 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 238000009776 industrial production Methods 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 210000004153 islets of langerhan Anatomy 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 239000006210 lotion Substances 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 238000000074 matrix-assisted laser desorption--ionisation tandem time-of-flight detection Methods 0.000 description 1
- 238000001840 matrix-assisted laser desorption--ionisation time-of-flight mass spectrometry Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 235000020824 obesity Nutrition 0.000 description 1
- 229960002700 octreotide Drugs 0.000 description 1
- 238000012261 overproduction Methods 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 230000020477 pH reduction Effects 0.000 description 1
- 210000004923 pancreatic tissue Anatomy 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 210000001322 periplasm Anatomy 0.000 description 1
- 238000002264 polyacrylamide gel electrophoresis Methods 0.000 description 1
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 1
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 1
- 239000003910 polypeptide antibiotic agent Substances 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 238000011321 prophylaxis Methods 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000017854 proteolysis Effects 0.000 description 1
- 238000004451 qualitative analysis Methods 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 230000003014 reinforcing effect Effects 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000010008 shearing Methods 0.000 description 1
- TXBNDGDMWKVRQW-UHFFFAOYSA-M sodium;2-[[1,3-dihydroxy-2-(hydroxymethyl)propan-2-yl]azaniumyl]acetate;dodecyl sulfate Chemical compound [Na+].OCC(CO)(CO)NCC(O)=O.CCCCCCCCCCCCOS([O-])(=O)=O TXBNDGDMWKVRQW-UHFFFAOYSA-M 0.000 description 1
- 238000010186 staining Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 208000011580 syndromic disease Diseases 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 229940126585 therapeutic drug Drugs 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 238000001269 time-of-flight mass spectrometry Methods 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 201000008827 tuberculosis Diseases 0.000 description 1
- 208000001072 type 2 diabetes mellitus Diseases 0.000 description 1
- 239000002023 wood Substances 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/575—Hormones
- C07K14/62—Insulins
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K19/00—Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
Definitions
- The invention belongs to the field of bioengineering and particularly relates to a novel fusion carrier protein, a nucleic acid molecule encoding the fusion carrier protein, an expression vector and a host cell containing the nucleic acid molecule, and the use of the same in promoting the expression of a heterologous protein or polypeptide.
- The present invention also relates to a fusion protein comprising the fusion carrier protein, a nucleic acid molecule encoding the fusion protein, and an expression vector and a host cell containing that nucleic acid molecule.
- Gene recombination technology clones the gene of interest into an expression vector and expresses the protein or polypeptide of interest in a host cell. This is currently the most common method for producing heterologous proteins or polypeptides.
- A variety of recombinant protein expression systems have been successfully established in different hosts such as prokaryotic cells, yeast cells, plant cells, insect cells and mammalian cells. Each expression system has its own advantages and limitations. Owing to its clear genetic background, mature technology, high production efficiency and simple operation, the E. coli prokaryotic expression system became the earliest and most widely used classical expression system in recombinant protein technology.
- The recombinant expression efficiency of a heterologous protein or polypeptide is affected by many factors. Common factors are: the composition of expression elements when constructing an expression vector (Kim KJ, Kim HE, Lee KH, et al. Two-promoter vector is highly efficient for overproduction of protein complexes. Protein Sci. 2004; 13: 1698-703), the stability of the target gene mRNA (Tanaka M, Tokuoka M, Shintani T, et al. Transcripts of a heterologous gene encoding mite allergen Der f 7 are stabilized by codon optimization in Aspergillus oryzae. Appl Microbiol Biotechnol.
- Some expression strategies can be adopted, such as codon optimization by replacing rare codons, or modifying host cells (Lakey DL, Voladri RK, Edwards KM, et al. Enhanced production of recombinant Mycobacterium tuberculosis antigens in Escherichia coli by replacement of low-usage codons. Infect Immun. 2000; 68: 233-8; Brinkmann U, Mattes RE, Buckel P. High-level expression of recombinant genes in Escherichia coli is dependent on the availability of the dnaY gene product.
- Protein fusion technology fuses a fusion carrier protein or polypeptide tag to a protein or polypeptide of interest so that the target is expressed as a fusion protein in the host cell. Fusion expression can increase the expression level of the target protein or polypeptide, improve the physicochemical characteristics of the target protein, or add special markers to the protein or polypeptide of interest to facilitate subsequent purification or detection, and it is widely used.
- The fusion protein can then be specifically cleaved by enzymatic digestion or chemical cleavage to remove the fusion carrier protein or fusion tag, and further isolated and purified to obtain the protein or polypeptide of interest.
- Expression-promoting fusion carrier proteins are mainly classified into four categories: (1) fusion carrier proteins that promote soluble expression, such as thioredoxin (Levarski Z, A, Krahulec J, et al. High-level expression and purification of recombinant human growth hormone produced in soluble form in Escherichia coli. Protein Expr Purif. 2014; 100: 40-7), SUMO (Malakhov MP, Mattern MR, Malakhova OA, et al. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics.
- Such fusion carrier proteins are highly water-soluble and can effectively prevent the aggregation of folding intermediates of the target protein or polypeptide, thereby facilitating the formation of the correct spatial structure.
- A fusion carrier protein such as thioredoxin may also play a very good role (Lauber T, Marx UC, Schulz A, et al. Accurate disulfide formation in Escherichia coli: overexpression and characterization of the first domain (HF6478) of the multiple Kazal-type inhibitor LEKTI. Protein Expr Purif. 2001; 22: 108-12).
- (2) Fusion carrier proteins that promote inclusion body expression, such as ThiS (Yuan S, Xu J, Ge Y, Yan Z, et al. Prokaryotic ubiquitin-like ThiS fusion enhances the heterologous protein overexpression and aggregation in Escherichia coli.), MoaD (Yuan S, Wang X, Xu J, et al. Ubiquitin-like prokaryotic MoaD as a fusion tag for expression of heterologous proteins in Escherichia coli. BMC Biotechnol. 2014; 14: 5) and PurF fragments (Lee JH, Kim JH, Hwang SW, et al.).
- Inclusion body expression has the advantages of high yield, strong stability and easy isolation and purification, and it alleviates the toxic effect of the target protein or polypeptide on host cells. However, inclusion body proteins must go through a complex renaturation process to yield correctly folded functional proteins, which to some extent offsets the advantage of high inclusion body expression. (3) Self-cleaving fusion carrier proteins, such as inteins (Xie YG, Han FF, Luan C, et al.
- (4) Fusion carrier proteins that direct secretory expression, such as PelB (Wu D, Lu Y, Huang H, et al. High-level secretory expression of metchnikowin in Escherichia coli. Protein Expr Purif. 2013; 91: 49-53) and the Protein A signal peptide (A, Blingsmo OR, Saether O, et al. Expression and characterization of a recombinant human parathyroid hormone secreted by Escherichia coli employing the staphylococcal protein A promoter and signal sequence. J Biol Chem. 1990; 265: 7338-44).
- Guided by the signal sequence, the fusion protein can be secreted into the periplasm of the cell, which protects it from degradation by intracellular proteases; the periplasm is more oxidizing than the cytoplasm, which helps proteins or polypeptides containing disulfide bonds to fold correctly. However, periplasmic expression usually gives lower expression levels.
- Because existing fusion carrier proteins such as glutathione transferase (27 kD) and maltose binding protein (50 kD) have large molecular weights, the target protein, especially a low molecular weight polypeptide, makes up only a small proportion of the fusion protein. The expression level of the fusion protein is improved, but the expression efficiency of the protein or polypeptide of interest in the host cell remains relatively low.
- The development of a novel low molecular weight fusion carrier protein could effectively increase both the expression level of the fusion protein and the expression efficiency of the target protein or polypeptide, and would have a cost advantage in industrial production.
- Insulin is a hormone that promotes the synthesis of glycogen, fat and protein in animals and lowers blood sugar. It has long been used mainly for the treatment of diabetes. Insulin is synthesized and secreted by islet β cells and consists of two subunits, B and A. The B subunit has 30 amino acid residues and the A subunit has 21 amino acid residues; the two subunits are connected by two pairs of disulfide bonds, and the A subunit additionally contains one intrachain disulfide bond.
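As a quick check of the figures above, the following minimal sketch counts residues and cysteines in the two human insulin chains. The sequences are the commonly cited mature human insulin A and B chains, used here as an assumption; they are not copied from the sequence listing of this application (SEQ ID No: 2 and SEQ ID No: 4), which should be consulted directly.

```python
# Commonly cited mature human insulin chains (assumed for illustration;
# verify against SEQ ID No: 2 and SEQ ID No: 4 of the sequence listing).
INSULIN_A = "GIVEQCCTSICSLYQLENYCN"
INSULIN_B = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"


def chain_summary(seq):
    """Return (length, number of cysteines) for a one-letter sequence."""
    return len(seq), seq.count("C")


a_len, a_cys = chain_summary(INSULIN_A)
b_len, b_cys = chain_summary(INSULIN_B)

# Three disulfide bonds need six cysteines in total: the two interchain
# bonds use two cysteines from each chain, and the intrachain bond uses
# two more from the A chain.
print("A chain:", a_len, "residues,", a_cys, "cysteines")
print("B chain:", b_len, "residues,", b_cys, "cysteines")
```

The cysteine counts only show that the residue totals are compatible with the described disulfide pattern; they do not say which cysteines pair with which.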
- In vivo, islet β cells first synthesize a single-chain preproinsulin containing a leader peptide; the proinsulin produced by processing is a single-chain protein molecule in which the B subunit and the A subunit are linked by the spacer C peptide. The C peptide must be excised by proteases to form mature double-chain insulin composed of the B subunit and the A subunit.
- The amino acid sequences and structures of insulin differ slightly among animal species (human, bovine, sheep, pig, etc.).
- Methods for producing recombinant insulin fall mainly into two types, according to how the insulin B and A subunits are expressed.
- (1) Single subunit synthesis method: the insulin B and A subunits are expressed separately by genetic engineering and then mixed so that they oxidatively renature under suitable conditions and form the correct disulfide bonds, giving mature insulin.
- The insulin B and A subunits have low molecular weights and cannot be expressed directly by recombinant means; they need to be expressed fused to a fusion carrier protein, for example as β-galactosidase fusion proteins (Goeddel DV, Kleid DG, Bolivar F, et al.
- (2) Proinsulin method: a proinsulin single-chain molecule consisting of the B subunit, C peptide and A subunit is first expressed by genetic engineering and oxidatively refolded, and the C peptide is then excised with trypsin and carboxypeptidase B to give mature insulin.
- Proinsulin and small proinsulin can be expressed directly in prokaryotes in the form of inclusion bodies by fusing a short leader peptide sequence containing a His purification tag to the N-terminus, which effectively avoids proteolysis and toxic damage to host cells (Tikhonov RV, Pechenov SE, Belacheu IA, et al. Recombinant human insulin IX.
- Insulin-related sequences may be potential small molecular weight fusion carrier proteins that can effectively promote the recombinant expression of heterologous proteins or polypeptides.
- The technical problem to be solved by the present invention is to provide a novel fusion carrier protein, a gene sequence encoding the fusion carrier protein, an expression vector and/or host cell containing the gene sequence, and the use of the fusion carrier protein, its gene sequence, expression vector and host cell for promoting expression of a protein or polypeptide of interest. In another aspect, a fusion protein comprising the fusion carrier protein, a gene sequence encoding the fusion protein, an expression vector and/or host cell containing that gene sequence, and a method of expressing the fusion protein are provided.
- A first aspect of the present invention provides a fusion carrier protein for expressing a protein or polypeptide of interest, wherein the amino acid sequence of the fusion carrier protein is derived from the amino acid sequence of insulin, or is an amino acid sequence obtained from it by substitution, deletion and/or addition of one or several amino acids, or an amino acid sequence formed by conventional modification of the above amino acid sequences, or an amino acid sequence formed by adding a tag to the above amino acid sequences; wherein the conventional modification includes acetylation, amidation, cyclization, glycosylation, phosphorylation, alkylation, biotinylation, fluorophore modification, polyethylene glycol (PEG) modification and immobilization modification; and the tag includes 6×His, GST, EGFP, MBP, Nus, HA, IgG, FLAG, c-Myc and Profinity eXact.
- The above fusion carrier protein is characterized in that it comprises: (1) the human insulin A subunit, that is, the amino acid sequence represented by SEQ ID No: 2; or (2) an amino acid sequence derived from the sequence in (1) by substitution, deletion and/or addition of one or several amino acids that can still be used for fusion expression.
- The above fusion carrier protein is characterized in that it comprises: (1) the human insulin B subunit, that is, the amino acid sequence represented by SEQ ID No: 4; or (2) an amino acid sequence derived from the sequence in (1) by substitution, deletion and/or addition of one or several amino acids that can still be used for fusion expression, such as the amino acid sequence represented by SEQ ID No: 6.
- The above fusion carrier protein is characterized in that it comprises: (1) a single-chain protein molecule containing both a human insulin A subunit and a B subunit, preferably the amino acid sequence shown in SEQ ID No: 8 or SEQ ID No: 9; or (2) a variant of the single-chain molecule in (1) comprising a human insulin A subunit and/or B subunit, obtained by substitution, deletion and/or addition of one or more amino acids, that can still be used for fusion expression, preferably the amino acid sequence shown in SEQ ID No: 25.
- A second aspect of the present invention provides a nucleic acid molecule encoding the fusion carrier protein of the first aspect, and an expression vector or host cell containing the nucleic acid molecule.
- The above nucleic acid molecule is characterized in that it comprises a gene sequence encoding the fusion carrier protein of the first aspect of the invention, preferably SEQ ID No: 1, SEQ ID No: 3, SEQ ID No: 5 or SEQ ID No: 7.
- The above expression vector is characterized in that it comprises the nucleic acid molecule of this aspect, and a promoter linked in the vector is used for expression of the protein encoded by the nucleic acid molecule.
- The above host cell is characterized in that it comprises the nucleic acid molecule or expression vector of this aspect.
- A third aspect of the present invention provides use of the fusion carrier protein of the first aspect, and of the nucleic acid molecule, expression vector or host cell of the second aspect, for promoting expression of a protein or polypeptide of interest.
- A fourth aspect of the present invention provides a fusion protein comprising the fusion carrier protein of the first aspect and at least one protein or polypeptide of interest, wherein the protein or polypeptide of interest is not insulin.
- The above fusion protein may be subjected to conventional modification or to the addition of an expression or purification tag;
- the conventional modifications include acetylation, amidation, cyclization, glycosylation, phosphorylation, alkylation, biotinylation, fluorophore modification, polyethylene glycol (PEG) modification and immobilization modification;
- the tag includes 6×His, GST, EGFP, MBP, Nus, HA, IgG, FLAG, c-Myc and Profinity eXact.
- The above fusion protein may contain 1, 2, 3 or 4 proteins or polypeptides of interest.
- The above fusion protein is characterized in that it contains a specific polypeptide cleavage site or sequence between the fusion carrier protein and the protein or polypeptide of interest.
- The above fusion protein is characterized in that the protein or polypeptide of interest is selected from the group consisting of GLP-1, oxyntomodulin, enfuvirtide, linaclotide, human transthyretin, and variants thereof.
- The preferred amino acid sequence of the above fusion protein is selected from the group consisting of SEQ ID No: 12, SEQ ID No: 13, SEQ ID No: 14, SEQ ID No: 17, SEQ ID No: 18, SEQ ID No: 21 and SEQ ID No: 27.
- A fifth aspect of the present invention provides a nucleic acid molecule encoding the fusion protein of the fourth aspect, and an expression vector or host cell containing the nucleic acid molecule.
- The above nucleic acid molecule is characterized in that it comprises a gene sequence encoding the fusion protein of the fourth aspect of the invention.
- The above expression vector is characterized in that it comprises the nucleic acid molecule of this aspect, and a promoter linked in the vector is used for expression of the protein encoded by the nucleic acid molecule.
- The above host cell is characterized in that it comprises the nucleic acid molecule or expression vector of this aspect.
- A sixth aspect of the present invention provides a method for producing a protein or polypeptide of interest by the following steps: (1) culturing and expanding a suitable host cell; (2) inducing expression of the fusion protein in the host cell; (3) separating and preparing the fusion protein; and/or (4) chemically or enzymatically cleaving the specific polypeptide cleavage site or sequence, and separating the fusion carrier protein from the protein or polypeptide of interest to obtain the protein or polypeptide of interest;
- wherein the host cell in (1) is the host cell of the second aspect or the fifth aspect of the invention.
- Many small molecular weight heterologous polypeptides are easily degraded by proteases when expressed in host cells, and some expressed protein molecules are toxic to host cells, leading to growth arrest or death; direct expression of the protein or polypeptide of interest therefore often encounters difficulties.
- The present invention finds that the single-chain insulin B and A subunits, the protein fragments formed by fusing the B and A subunit chains, and their various variants can, as novel fusion carrier proteins, effectively increase the expression level of the target protein or polypeptide. Particularly in bacterial hosts, such fusion carrier proteins can facilitate the expression of a variety of proteins of interest in the form of inclusion bodies. Increasing the stability of the protein of interest may be an important factor in effectively increasing the expression level of the protein or polypeptide of interest.
- Genes encoding the insulin B and A subunits can be cloned directly by reverse transcription PCR from RNA extracted from pancreatic tissue or islet cells of humans or various animals; more simply, a DNA sequence identical to the natural coding sequence can be synthesized; gene sequences with different codon preferences can also be synthesized from the amino acid sequence according to the codon preference of the host; and gene clones of the insulin B and A subunits are also readily obtained from commercial sources.
- A fusion protein expression vector encoding the fusion carrier protein of the present invention and a predetermined protein of interest can be constructed in a known manner. Using conventional molecular biology methods, the gene encoding the intended protein of interest is first joined directly, in the same reading frame, to the gene encoding the insulin B subunit, the gene encoding the A subunit, or the gene encoding the protein fragment formed by fusing the B and A subunits.
- The order of linkage can be adjusted as needed to obtain various forms of fusion protein product such as BT, TB, AT, TA, ABT, BAT, TAB, TBA, etc. The B and A contained in the fusion protein can be the human insulin B subunit single-chain sequence (SEQ ID No: 4), the human insulin A subunit single-chain sequence (SEQ ID No: 2), or the insulin B chain or A chain sequences of other animal species; they may also be variants of human insulin with substituted, deleted or added amino acid residues that still maintain at least 70%, or at least 75%, at least 80%, at least 85%, at least 90% or at least 95%, homology with the human insulin sequence and can still be used effectively for fusion expression. The protein of interest T may be present as one or more identical or different molecules.
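The linkage orders listed above (BT, TB, ABT, TAB and so on) can be sketched as a small helper that joins parts in a requested order. This is only an illustration at the amino acid level: the function name `build_fusion`, the placeholder target sequence and the optional `linker` argument are hypothetical, and the insulin chain sequences are the commonly cited human ones rather than text copied from the sequence listing.

```python
def build_fusion(order, parts, linker=""):
    """Join parts in the order given by a string such as "BT" or "TAB".

    order  -- letters naming the parts, N-terminus first
    parts  -- dict mapping each letter to a one-letter amino acid sequence
    linker -- optional spacer inserted between every pair of adjacent parts
    """
    missing = [key for key in order if key not in parts]
    if missing:
        raise KeyError(f"no sequence supplied for: {missing}")
    return linker.join(parts[key] for key in order)


parts = {
    "A": "GIVEQCCTSICSLYQLENYCN",           # assumed human insulin A chain
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",  # assumed human insulin B chain
    "T": "TARGETPEPTIDE",                   # hypothetical placeholder target
}

for order in ("BT", "TB", "AT", "TA", "ABT", "BAT", "TAB", "TBA"):
    fusion = build_fusion(order, parts)
    print(order, len(fusion), "residues")
```

Note that `linker` is placed between every adjacent pair, including between A and B; a real construct would more likely place a cleavable spacer only between the carrier and the target.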
- The above gene fragments may also be joined through a gene fragment encoding a spacer sequence. The spacer between the fusion carrier protein and the target protein may contain a specific sequence that can be specifically cleaved chemically or enzymatically: a Met (methionine) residue can be cleaved by CNBr; Lys (lysine) or Arg (arginine) residues can be cleaved by trypsin; LysArg or ArgArg can be cleaved by dibasic proteases such as Kex2; GluAsnLeuTyrPheGln can be recognized and cleaved by tobacco etch virus protease (TEV protease); IleGluGlyArg can be recognized and cleaved by Factor Xa (Xa protease); other suitable protease cleavage sites can also be used.
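The cleavage options listed above can be illustrated with a short site finder. This is a simplified sketch, not a full specificity model: each agent is assumed to cut immediately after its motif, trypsin's reduced activity before proline is ignored, and the names `CLEAVAGE_MOTIFS`, `cut_positions` and `cleave` as well as the demo sequence are hypothetical.

```python
import re

# Motifs taken from the text, in one-letter code. Each agent is modelled
# as cutting directly after the last residue of its motif (a simplification).
CLEAVAGE_MOTIFS = {
    "CNBr": "M",
    "trypsin": "[KR]",
    "Kex2": "KR|RR",
    "TEV protease": "ENLYFQ",
    "Factor Xa": "IEGR",
}


def cut_positions(seq, agent):
    """Return 1-based indices of the residues after which `agent` cuts."""
    # A lookahead with a capture group also finds overlapping motifs.
    pattern = re.compile(f"(?=({CLEAVAGE_MOTIFS[agent]}))")
    return [m.start() + len(m.group(1)) for m in pattern.finditer(seq)]


def cleave(seq, agent):
    """Split `seq` into the fragments produced by complete digestion."""
    bounds = [0] + cut_positions(seq, agent) + [len(seq)]
    return [seq[i:j] for i, j in zip(bounds, bounds[1:]) if i < j]


# Hypothetical construct: carrier fragment + Factor Xa site + target fragment.
demo = "GIVEQ" + "IEGR" + "HAEGTF"
print(cleave(demo, "Factor Xa"))
print(cut_positions(demo, "trypsin"))
```

Running the same check on the carrier and target sequences themselves shows whether a chosen site also occurs internally, in which case a different cleavage method would be needed.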
- The fusion protein-encoding gene obtained above is inserted into a suitable expression vector by a known method to obtain a recombinant vector expressing the above fusion protein.
- The desired gene fragments can also be ligated into the expression vector one after another in the appropriate order.
- Such expression vectors include, but are not limited to, prokaryotic expression vectors and various eukaryotic expression vectors.
- A variety of commercial expression vectors are available, such as the pQE series, the pET series and the like.
- The recombinant vector is introduced into a host cell by well-known methods; host cells include, but are not limited to, E. coli, other prokaryotic cells and various eukaryotic cells.
- Induced expression is carried out under suitable conditions, and the host cells are then disrupted and lysed to obtain the fusion protein product.
- The expression product can be purified by known methods; the insulin B or A subunit sequence serving as the fusion carrier protein can be removed from the fusion protein by suitable chemical or protease cleavage.
- The invention can be used for fusion expression of a variety of proteins or polypeptides of interest for use in the diagnosis, treatment or prevention of disease or in other fields.
- Protein molecules containing 50 or more amino acid residues are generally referred to as proteins, and protein molecules of 50 or less amino acid residues are often referred to as polypeptides; however, the names of the two are often mixed.
- Target proteins can include growth factors, cytokines, ligands, receptors, transporters, antigens, antibodies, and fragments thereof.
- The present invention is particularly advantageous for target polypeptides; common target polypeptides include glucagon-like peptide 1 (GLP-1), exenatide (Exendin-4), oxyntomodulin, various antibacterial peptides, antiviral peptides such as enfuvirtide (T-20), atrial natriuretic peptide, octreotide, linaclotide and the like.
- An advantage of the present invention is that the fusion carrier protein and the recombinant expression system constructed with it promote the expression of many different proteins or polypeptides. The carrier has a small molecular weight: the insulin B or A subunit single chain consists of only 30 or 20 amino acid residues, and a protein fragment formed by fusing the B and A single chains contains only about 50 amino acid residues. Compared with known fusion carrier proteins, the carrier therefore makes up a significantly smaller proportion of the fusion protein relative to the target protein or polypeptide, so that the target protein, particularly a small-molecular-weight polypeptide, can be recombinantly expressed efficiently and its production cost effectively reduced.
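The size argument above can be made concrete with a short calculation. The sketch below is only an illustration: the 50-residue carrier length and the 31-residue target (GLP-1 (7-37)) are taken from this description, while the 220-residue "large carrier" is a hypothetical comparison value, and the function name is an assumption.

```python
def target_fraction(carrier_len, target_len, spacer_len=0):
    """Fraction of the residues in a fusion protein that belong to the target."""
    total = carrier_len + spacer_len + target_len
    return target_len / total

# About 50 residues for the BA carrier, 31 residues for GLP-1 (7-37).
small_carrier = target_fraction(50, 31)
# Hypothetical large carrier of 220 residues, used only for comparison.
large_carrier = target_fraction(220, 31)
```

Under these assumptions, the target makes up roughly 38% of the residues with the small carrier but only about 12% with the hypothetical large carrier, so the same mass of fusion protein yields about three times as much target polypeptide.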
- Insulin was the first bioengineered product in human history; the structure, sequence variation, and recombinant expression of insulin and of the numerous structural variants derived from it are well known.
- As a fusion carrier protein it does not need to have the biological activity of insulin, so variants with substitution, deletion, or addition of various amino acid residues can also serve as fusion carrier proteins and thereby provide additional physical and chemical characteristics, such as ion binding characteristics.
- Insulin is well tolerated in vivo in clinical practice and does not cause significant toxicity to the body; an active fusion protein product formed by the insulin-derived fusion carrier protein and the protein or polypeptide of interest may therefore be applied directly to the body without cleaving and removing the carrier.
- Figure 1 Prokaryotic recombinant expression of the fusion vector protein BA.
- The fusion carrier protein BA was expressed from the expression vector pQE80L and detected by Tricine-SDS-PAGE and Western blot.
- The uninduced whole bacterial sample (-) and the IPTG-induced whole bacterial sample (+) were analyzed by 16.5% Tricine-SDS-PAGE; polypeptides below about 10 kD were rarely present in the uninduced host bacteria;
- the uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced soluble supernatant sample (S), and the insoluble inclusion body sample (I) were subjected to 13% Tricine-SDS-PAGE (left).
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the expression product BA band.
- The fusion carrier protein BA was fused to the N-terminus of GLP-E, and the expression vector BA-GLP-E/pQE80L was expressed in E. coli TG1. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were detected by 13% Tricine-SDS-PAGE (left) and Western blot (right).
- The fusion protein shows no obvious degradation and is highly expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker with its molecular weights (kD); the arrow indicates the position of the fusion protein BA-GLP-E band.
- The fusion carrier protein B was fused to the N-terminus of GLP-E, and the expression vector B-GLP-E/pQE80L was expressed in E. coli TG1. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were detected by 13% Tricine-SDS-PAGE (left) and Western blot (right).
- The fusion protein shows no obvious degradation and is effectively expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker with its molecular weights (kD); the arrow indicates the position of the fusion protein B-GLP-E band.
- The fusion carrier protein A was fused to the N-terminus of GLP-E, and the expression vector A-GLP-E/pQE80L was expressed in E. coli TG1. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were detected by 13% Tricine-SDS-PAGE (left) and Western blot (right).
- The fusion protein shows no obvious degradation and is effectively expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker with its molecular weights (kD); the arrow indicates the position of the fusion protein A-GLP-E band.
- The fusion carrier protein BA was fused to the N-terminus of GLP-1, and the expression vector BA-GLP/pQE80L was expressed in E. coli TG1.
- The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were subjected to 13% Tricine-SDS-PAGE (left) and Western blot (right).
- The fusion protein shows no obvious degradation and is expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein BA-GLP band.
- The fusion carrier protein B was fused to the N-terminus of GLP-1, and the expression vector B-GLP/pQE80L was expressed in E. coli TG1.
- The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were subjected to 13% Tricine-SDS-PAGE (left) and Western blot (right).
- the fusion protein has no obvious degradation and is effectively expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein B-GLP band.
- Figure 7 Fusion expression of fusion carrier protein BA with enfuvirtide T-20.
- The fusion carrier protein BA was fused to the N-terminus of enfuvirtide T-20, and the expression vector BA-T/pQE80L was expressed in E. coli TG1. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were detected by 13% Tricine-SDS-PAGE (left) and Western blot (right).
- the fusion protein has no obvious degradation and is effectively expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein BA-T band.
- Figure 8 Fusion expression of inclusion body tag BA variants with oxyntomodulin.
- The inclusion body tag BA variant B'A' was fused to the N-terminus of oxyntomodulin (OXN), and the expression vector B'A'-OXN/pQE80L was expressed in E. coli TG1. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were detected by 16.5% Tricine-SDS-PAGE (left) and Western blot (right).
- the fusion protein has no obvious degradation and is highly expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein B'A'-OXN band.
- Figure 9 Fusion expression of fusion vector protein BA with single copy linaclotide.
- The fusion carrier protein BA was fused to the N-terminus of single-copy linaclotide, and the expression vector BA-LN1/pQE80L was expressed in E. coli BL21(DE3)pLysS. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were detected by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is highly expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein BA-LN1 band.
- Figure 10 Fusion expression of fusion vector protein BA with three copies of linaclotide.
- The fusion carrier protein BA was fused to the N-terminus of three-copy linaclotide, and the expression vector BA-LN3/pQE80L was expressed in E. coli BL21(DE3)pLysS.
- The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were subjected to 13% Tricine-SDS-PAGE (left) and Western blot (right).
- the fusion protein has no obvious degradation and is effectively expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein BA-LN3 band.
- Figure 11 Fusion expression of the fusion carrier protein B variant with double-copy linaclotide.
- The fusion carrier protein B variant was fused to the N-terminus of double-copy linaclotide, and the expression vector B'-LN2/pQE80L was expressed in E. coli BL21(DE3)pLysS. The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were subjected to 13% Tricine-SDS-PAGE (left) and Western blot (right).
- the fusion protein has no obvious degradation and is highly expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the fusion protein B'-LN2 band.
- Figure 12A shows the mass spectrometry results for the B'-LN2 inclusion bodies.
- Figure 12B shows the results of trypsin digestion of the B'-LN2 inclusion bodies.
- Figure 12C shows the detection results for the reduced trypsin-digested sample of the B'-LN2 inclusion bodies.
- Figure 12D shows the results of trypsin-carboxypeptidase B double digestion of the B'-LN2 inclusion bodies.
- Figure 12E shows the secondary mass spectrum of the linaclotide-containing peptide (amino acid sequence CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg) in the reduced trypsin-digested sample of the B'-LN2 inclusion bodies.
- The peak assignments are in line with theoretical expectations.
- M+H represents the quasi-molecular ion;
- b and y represent the b-series and y-series characteristic fragment ions generated in the secondary mass spectrum, respectively.
- Figure 13 Fusion expression of fusion vector protein BA and human transthyretin.
- The fusion carrier protein BA was fused to the N-terminus of human transthyretin; the expression vector BA-TT/pQE80L and the non-fusion expression vector TT/pQE80L were each expressed in E. coli TG1.
- The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were subjected to 15% SDS-PAGE.
- The expression level of the fusion protein BA-TT was significantly higher than that of TT.
- The non-fusion protein TT was mainly soluble, but when the soluble protein was extracted from the supernatant, the recovery was significantly lower than the expected yield.
- The fusion protein BA-TT was mainly expressed as inclusion bodies, with no obvious loss.
- TT and BA-TT are the expression products of the vectors TT/pQE80L and BA-TT/pQE80L, respectively;
- M is the protein electrophoresis marker and its molecular weight (kD); the arrow indicates the position of the corresponding expression product.
- FIG. 14 Fusion expression of fusion vector protein BA with random polypeptide X.
- The fusion carrier protein BA was fused to the N-terminus of the random polypeptide X, and the expression vector BA-X/pQE80L was expressed in E. coli TG1.
- The uninduced whole bacterial sample (-), the IPTG-induced whole bacterial sample (+), the induced supernatant sample (S), and the induced inclusion body sample (I) were subjected to 13% Tricine-SDS-PAGE (left) and Western blot (right).
- the fusion protein has no obvious degradation and is highly expressed in the form of inclusion bodies.
- M is the protein electrophoresis marker and its molecular weight (kD): the arrow indicates the position of the fusion protein BA-X band.
- The target protein or polypeptide genes were obtained by artificial synthesis and PCR amplification, digested, and inserted into the corresponding restriction sites of the expression vector pQE80L to construct the recombinant expression vectors; the N-terminus of each expressed product is fused to a His-6 sequence. The target gene sequences of all vectors were verified by nucleic acid sequencing.
- The recombinant plasmids were transformed into competent E. coli TG1 or BL21(DE3)pLysS cells to obtain the engineered strains. The specific examples were expressed at laboratory scale in shake flasks. The strains were activated overnight with shaking at 37 °C in LB liquid medium containing ampicillin, after which the overnight culture was transferred at a ratio of 1:100 into fresh LB medium containing ampicillin (20-30 mL), incubated at 37 °C to the appropriate logarithmic growth phase, and induced with 0.1 mM or 1 mM IPTG for a certain period of time.
- A 200 μL volume of the supernatant was taken, 50 μL of 5× SDS loading buffer was added, and the mixture was incubated in a boiling water bath for 20 min and stored at -20 °C as the supernatant sample.
- The pellet was resuspended in 1 mL of inclusion body wash solution (PBS containing 1% Triton X-100 and 5 mM EDTA) and centrifuged at 12000 rpm for 15 min at 4 °C, and the supernatant was discarded; this wash was performed three times, and the pellet was then resuspended in 1 mL of PBS and centrifuged at 12000 rpm.
- The inclusion body pellet was resuspended in 960 μL of PBS; 80 μL of this suspension was added to 120 μL of PBS and 50 μL of 5× SDS loading buffer, incubated in a boiling water bath for 20 min, and stored at -20 °C as the induced inclusion body sample.
- the above various samples have the same dilution in electrophoresis detection, and their protein contents are directly comparable.
- SDS-PAGE gels were prepared by the conventional method, and Tricine-SDS-PAGE gels were prepared according to the literature (H. Tricine-SDS-PAGE. Nat Protoc. 2006; 1:16-22). Equal volumes of the prepared uninduced whole bacterial, induced whole bacterial, induced supernatant, and induced inclusion body samples were analyzed by 15% SDS-PAGE or 13% (or another specified concentration) Tricine-SDS-PAGE. Electrophoresis was run at a constant voltage of 50 V for 50 min and then at a constant voltage of 150 V, entirely in an ice-water bath.
- Gels were stained with Coomassie Brilliant Blue R-250 (after Tricine-SDS-PAGE, the gel was first fixed with 5% glutaraldehyde for 30 min and then stained), and the electrophoresis results were obtained after destaining.
- Bovine serum albumin (BSA) was used as a standard: 0.5, 1, 2, and 4 μg of BSA and an appropriate amount of the induced inclusion body sample were loaded and electrophoresed together. After destaining, the protein bands on the gel were scanned in grayscale with the QuantiScan software, and the fusion protein content was calculated from the BSA standard curve.
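The standard-curve calculation described above can be sketched as a simple linear fit. This is only an illustration of the principle, not the QuantiScan procedure itself: the band intensities below are hypothetical placeholder values, the function names are assumptions, and a linear relation between band intensity and protein amount is assumed over the 0.5-4 μg range.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def protein_from_intensity(intensity, slope, intercept):
    """Invert the standard curve to estimate the protein amount in a band (in ug)."""
    return (intensity - intercept) / slope

# BSA amounts loaded on the gel (ug), as given in the text.
bsa_ug = [0.5, 1, 2, 4]
# Hypothetical grayscale band intensities for the BSA lanes (arbitrary units).
bsa_intensity = [1000.0, 2000.0, 4000.0, 8000.0]

slope, intercept = fit_line(bsa_ug, bsa_intensity)
# Hypothetical intensity of the fusion protein band in the inclusion body lane.
sample_ug = protein_from_intensity(5000.0, slope, intercept)
```

The estimated amount in the band would then be scaled by the loading volume and the sample dilution to give the fusion protein content of the culture.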
- The separating gel portion was taken for Western blot.
- The proteins in the separating gel were wet-transferred onto a PVDF membrane, and the membrane was then placed in PBST (0.1% Tween-20 in PBS) containing 5% skim milk powder and blocked with shaking at room temperature for 2 h.
- The anti-His-tag mouse monoclonal antibody was diluted 1:2000 in PBST containing 2.5% skim milk powder and incubated with the membrane overnight at 4 °C. After rinsing 4 times (15 min each) with PBST, the membrane was incubated with HRP-labeled goat anti-mouse IgG, diluted 1:2000 in PBST containing 2.5% skim milk powder, for 1.5 h with shaking at room temperature, and rinsed 4 times with PBST (15 min each). 1 mL of ECL developer was spread evenly over the PVDF membrane, which was then exposed.
- The inclusion body sample was dissolved in inclusion body dissolving solution (20 mM Tris-HCl containing 8 M urea, pH 8.0), DTT was added to a final concentration of 5 mM, and the sample was dissolved with shaking, left at room temperature for 20 min, and centrifuged at 12000 rpm for 10 min at 4 °C. The supernatant was diluted 20-fold with 20 mM Tris-HCl (pH 8.0) buffer and incubated with trypsin (final concentration 4 mg/L) at 37 °C overnight, or with carboxypeptidase B (final concentration 0.5 mg/L) at 37 °C for 30 min; the digestion was terminated by acidification.
- The acidified digested sample was mixed with an equal volume of CHCA matrix (30 g/L; 70% acetonitrile/30% methanol/0.1% trifluoroacetic acid), and 1 μL of the mixture was spotted on a mass spectrometry target plate, dried naturally, and analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS).
- the fusion vector protein B gene (SEQ ID No: 3) was amplified with the forward primer atagatctatgtttgtgaaccagcatctgtg and the reverse primer atactcgagttaggttttcggggtataaaaaag, and the fusion vector protein A gene (SEQ ID No: 1) was amplified with the forward primer ataggacatcatgggcattgtggaagaggtgc and the reverse primer atagagtctgttgcaatagttttccagctg;
- The inclusion body tag sequence BA gene (SEQ ID No: 7) was obtained by mixing the PCR products of the B gene and the A gene with the primer cttttttttataccccgaaacccgccgcggcattgtggaacagtgc and amplifying with the overlap primer atatagtctatgtttgtgaaccagcatctgtg and the reverse primer atactcga
- This example demonstrates that the insulin BA single chain is stably expressed in the form of inclusion bodies in prokaryotic host cells.
- Glucagon-like peptide-1 is a polypeptide drug effective for the treatment of type 2 diabetes.
- GLP-1 (7-37) is one of the active forms of GLP-1 in vivo and contains 31 amino acid residues; the GLP-1 (A2G) variant (SEQ ID No: 16), in which the Ala at position 2 is mutated to Gly, resists the degradation of GLP-1 by DPP-IV.
- A gene sequence (SEQ ID No: 10) was designed on the basis of the GLP-1 (A2G) gene to encode the polypeptide GLP-E (SEQ ID No: 11), which carries a flexible extension of 10 amino acid residues at the C-terminus; the final residue is a Cys that can be used to link other molecules. The gene was synthesized by overlap PCR using the following primers.
- the inclusion body tag BA gene was amplified with primers ataagatctatgtttgtgaaccagcatctgtg and ataggagtccttgcaatagttttccagctg, and digested with BglII and BamHI, inserted into the BamHI restriction site of the expression vector pQE80L to construct a BA-/pQE80L vector;
- The GLP-E gene was amplified with the primers atagattctcgccgccacggtgaaggtac and ataaagcttagcaagaaccaccaccaccaccagaac, digested with BglII and HindIII, and inserted into the BamHI and HindIII restriction sites of the expression vector BA-/pQE80L to construct a BA-GLP-E/pQE80L vector encoding a fusion protein sequence of SEQ ID No: 12.
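The cloning steps above rely on restriction sites carried in the primers and on the fact that BglII and BamHI leave identical GATC overhangs, which is why a BglII-digested insert can be ligated into a BamHI-digested vector. The following Python sketch is a minimal illustration of both points; the dictionaries and function names are assumptions, and only the recognition sequences and overhangs of these well-known enzymes are used. Such a scan can also help flag primers whose printed sequence does not contain the expected site.

```python
# Recognition sequences (5'->3') of the enzymes named in the examples.
RESTRICTION_SITES = {
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
}

# 5' single-stranded overhangs left after cutting (PstI omitted: it leaves a 3' overhang).
FIVE_PRIME_OVERHANGS = {
    "BglII": "GATC",
    "BamHI": "GATC",
    "HindIII": "AGCT",
}

def find_sites(primer):
    """Return {enzyme: index of first occurrence} for sites present in the primer."""
    primer = primer.upper()
    return {name: primer.find(site)
            for name, site in RESTRICTION_SITES.items() if site in primer}

def compatible_ends(enzyme_a, enzyme_b):
    """True if the two enzymes leave identical 5' overhangs and can be cross-ligated."""
    return FIVE_PRIME_OVERHANGS[enzyme_a] == FIVE_PRIME_OVERHANGS[enzyme_b]

# Primers quoted in the examples of this description.
forward_sites = find_sites("ataagatctatgtttgtgaaccagcatctgtg")
reverse_sites = find_sites("ataaagcttagcaagaaccaccaccaccaccagaac")
```

Note that a BglII end ligated to a BamHI end produces a hybrid junction that neither enzyme recognizes any longer, which is consistent with re-using the remaining BamHI site of a BA-/pQE80L type vector for the next insertion.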
- The spacer sequence between BA and GLP-E contains ArgArg, which can be recognized and cleaved by a dibasic protease, releasing the intact GLP-E molecule.
- Tricine-SDS-PAGE (Fig. 2 left) showed that the apparent molecular weight of the fusion protein BA-GLP-E was consistent with the theoretical molecular weight (12094.5Da), and there was no obvious degradation product.
- BA-GLP-E was highly expressed in the form of inclusion bodies.
- Western blot (Fig. 2 right) showed that the expression band was the target protein containing His Tag.
- This example demonstrates that the insulin BA single chain is fused to the small molecule polypeptide GLP-E and is capable of efficient and stable expression in the form of inclusion bodies in prokaryotic host cells.
- The GLP-E gene was amplified with the primers accccgaaaacccgccgccacggtgaaggtacctctc and ataaagcttagcaagaaccaccaccaccaccagaac; the product was mixed with the amplification product of the inclusion body tag B gene, amplified with the primers atagagtctatgtttgtgaaccagcatctgtg and ataaagcttagcaagaaccaccaccaccaccagaac, digested with BglII and HindIII, and inserted into the BamHI and HindIII restriction sites of the expression vector pQE80L to construct a B-GLP-E/pQE80L vector encoding a fusion protein sequence of SEQ ID No: 13.
- The fusion carrier protein B is fused to the N-terminus of GLP-E; the spacer sequence is ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-E molecule.
- This example demonstrates that the insulin B chain is fused to the small molecule polypeptide GLP-E and is stably expressed in the form of inclusion bodies in prokaryotic host cells.
- The fusion protein A-GLP-E gene was amplified from the vector BA-GLP-E/pQE80L with the primers ataagatctatgggcattgtggaacagtgctgcac and ataaagcttagcaagaaccaccaccaccagaac, digested with BglII and HindIII, and inserted into the BamHI and HindIII restriction sites of the expression vector pQE80L to construct an A-GLP-E/pQE80L vector encoding a fusion protein sequence of SEQ ID No: 14.
- The fusion carrier protein A is fused to the N-terminus of GLP-E; the spacer sequence contains ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-E molecule.
- A-GLP-E was effectively expressed in the form of inclusion bodies.
- Western blot (Fig. 4 right) showed that the expression band was the target protein containing His Tag.
- This example demonstrates that the insulin A chain is fused to the small molecule polypeptide GLP-E and is stably expressed in the form of inclusion bodies in prokaryotic host cells.
- This example directly expresses the fusion protein BA-GLP of GLP-1.
- The GLP-1 gene sequence (SEQ ID No: 15) was amplified from the vector BA-GLP-E/pQE80L with the primers ataagatctcgccgccacggtgaaggtac and ataaagcttaaccacgacctttaaccagc, digested with BglII and HindIII, and inserted into the BamHI and HindIII restriction sites of the expression vector BA-/pQE80L to construct a BA-GLP/pQE80L vector encoding a fusion protein sequence of SEQ ID No: 17.
- The spacer sequence between BA and GLP-1 contains ArgArg, which can be recognized and cleaved by a dibasic protease, releasing the intact GLP-1 molecule.
- BA-GLP was highly expressed in the form of inclusion bodies; the Western blot results (Fig. 5 right) showed that the expression band was the target protein containing the His tag.
- This example demonstrates that the insulin BA single chain is fused to the small molecule polypeptide GLP and is capable of efficient and stable expression in the form of inclusion bodies in prokaryotic host cells.
- Amplification was carried out with the primers ataagatctatgtttgtgaaccagcatctgtg and ataaagcttaaccacgacctttaaccagc; the product was digested with BglII and HindIII and inserted into the BamHI and HindIII restriction sites of the expression vector pQE80L, so that the inclusion body tag B is fused to the N-terminus of GLP-1, constructing a B-GLP/pQE80L vector encoding a fusion protein sequence of SEQ ID No: 18. The spacer between B and GLP-1 in the expression product is ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-1 molecule.
- the results of Tricine-SDS-PAGE (Fig.
- This example demonstrates that the insulin B chain is fused to the small molecule polypeptide GLP-E and is stably expressed in the form of inclusion bodies in prokaryotic host cells.
- Example 7 Fusion expression of inclusion body tag BA and enfuvirtide
- The HIV fusion inhibitory polypeptide enfuvirtide (T-20, SEQ ID No: 20) is an artificial polypeptide consisting of 36 amino acid residues and is an effective anti-AIDS therapeutic.
- a gene sequence encoding the enfuvirtide (SEQ ID No: 19) was synthesized by the following primers.
- The PCR product contains the specific restriction sites BglII and HindIII; it was digested with BglII and HindIII and inserted into the BamHI and HindIII restriction sites of the expression vector BA-/pQE80L to construct a BA-T/pQE80L vector encoding the fusion protein sequence SEQ ID No: 21. The spacer sequence between BA and T-20 in the expression product contains ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact T-20 molecule.
- Tricine-SDS-PAGE (Fig. 7 left) showed that the apparent molecular weight of the fusion protein BA-T was consistent with the theoretical molecular weight (12626.2 Da), and there was no obvious degradation product.
- BA-T was efficiently expressed in the form of inclusion bodies; the Western blot results (Fig. 7 right) showed that the expression band was the target protein containing the His tag.
- This example demonstrates that the insulin BA single chain is fused to the small molecule polypeptide enfuvirtide and can be stably expressed in the form of inclusion bodies in prokaryotic host cells.
- Example 8 Fusion expression of fusion vector protein BA variant with oxyntomodulin
- Oxyntomodulin (OXN, SEQ ID No: 22) is a polypeptide containing 37 amino acid residues; it has both GLP-1 and glucagon effects and may be a therapeutic drug for diabetes and obesity.
- a gene sequence encoding OXN (SEQ ID No: 23) was designed.
- A DNA fragment containing the gene was synthesized by overlap PCR using the following primers:
- Its 5' end contains an extension encoding part of the insulin A chain and the linker sequence, recognized by the protease, that precedes the polypeptide.
- The vector BA-/pQE80L carrying the fusion carrier protein BA gene was subjected to PCR site-directed mutagenesis using the primers cgaacgcggcttttgttataccccgaaaacac and ggttttcggggtataacaaaagccgcgttcg to introduce the F25C mutation into the B chain; the product was used as a template and amplified with the primers atagattctatgtttgtgaaccagcatctgtg and gttgcaatagttttcccccgggtaggggctgtgaatgctggtgcagtgctgtcca to obtain the carrier protein BA variant gene B'A' (SEQ ID No: 24). The B chain of the variant B'A' contains the F25C substitution, and the A chain contains the C6H and C11H substitutions.
- the two amplified PCR products were mixed and amplified with primers ataggadccattttgtgaaccagcatctgtg and ataaagctttaagcgatgttgttacggttac.
- The PCR fragment was digested with BamHI and HindIII and inserted into the BamHI and HindIII restriction sites of the expression vector pQE80L to construct a B'A'-OXN/pQE80L vector, whose ORF sequence (SEQ ID No: 26) encodes a fusion protein sequence of SEQ ID No: 27.
- The spacer sequence between the BA variant and OXN contains LysThrLysArg, which can be recognized and cleaved by the tribasic-specific protease Furilisin (Ballinger MD, Tom J, Wells JA (1996) Furilisin: a variant of subtilisin BPN' engineered for cleaving tribasic substrates. Biochemistry 35: 13579-13585) to release intact OXN molecules.
- This example demonstrates that the insulin BA single chain, when substituted for its amino acid sequence, is fused to the small molecule polypeptide oxyntomodulin, and is still efficiently and stably expressed in the form of inclusion bodies in prokaryotic host cells.
- Example 9 Fusion expression of inclusion body tag BA and single copy linaclotide
- Linaclotide (SEQ ID No: 28) consists of 14 amino acid residues and is rich in Cys. It is the first guanylate cyclase agonist for the treatment of chronic idiopathic constipation and constipation-predominant irritable bowel syndrome in adults.
- the gene sequence encoding linaclotide (SEQ ID No: 29) was synthesized by the following primers.
- The PCR product contains the specific restriction sites BamHI, PstI and BglII; it was digested with PstI and BamHI and inserted into the BamHI and PstI restriction sites of the expression vector BA-/pQE80L to construct a BA-LN1/pQE80L vector encoding a fusion protein sequence of SEQ ID No: 30. The spacer sequence between BA and the single-copy linaclotide in the expression product contains Arg, which can be recognized and cleaved by trypsin, releasing a linaclotide molecule that carries an extra Arg residue at its C-terminus.
- the constructed expression vector BA-LN1/pQE80L was transformed into E.
- This example demonstrates that the insulin BA single chain is fused to the small molecule peptide linaclotide and can be efficiently and stably expressed in the form of inclusion bodies in prokaryotic host cells.
- Example 10 Fusion expression of inclusion body tag BA and three copies of linaclotide
- The PCR product of the linaclotide gene sequence contains specific restriction sites; it was digested with BamHI and BglII, self-ligated, and amplified with the primers ataggadcccccggggggaatactgctgcaacccggcttgc and atactgcagatctgtagcaaccggtgcaagccgggttgcag to obtain a multicopy gene, which was digested and inserted into the BA-/pQE80L vector according to the method of Example 9 to give the three-copy linaclotide expression vector BA-LN3/pQE80L, encoding a fusion protein sequence of SEQ ID No: 31. In the expression product, the spacer sequences between BA and the three-copy linaclotide and between the individual copies contain Arg, which can be recognized and cleaved by trypsin, releasing linaclotide molecules that carry an additional Arg residue at the C-terminus.
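The release of single copies from a multicopy construct by trypsin can be sketched with a simple in-silico digest. The example below is a simplified illustration only: the function name is an assumption, the rule "cut after every Lys or Arg" ignores known exceptions, and the input is a hypothetical toy construct (a short carrier tail ending in Arg followed by three copies of linaclotide with a C-terminal Arg), not the actual sequence SEQ ID No: 31.

```python
def trypsin_digest(sequence):
    """Split a one-letter protein sequence after every K or R (simplified rule)."""
    sequence = sequence.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder when the sequence does not end in K or R.
        peptides.append(sequence[start:])
    return peptides

# Linaclotide followed by the extra Arg, as in the peptide CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg.
LN_ARG = "CCEYCCNPACTGCYR"
# Hypothetical toy construct: carrier tail "GER" plus three Arg-separated copies.
toy_fusion = "GER" + LN_ARG * 3
fragments = trypsin_digest(toy_fusion)
```

Each copy is released as the same 15-residue peptide carrying the extra C-terminal Arg, which can then be trimmed with carboxypeptidase B.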
- the results of Tricine-SDS-PAGE (Fig. 10 left) showed that the apparent molecular weight of the fusion protein BA-LN3 was consistent with the theoretical molecular weight (14121.1Da), and there was no obvious degradation product.
- BA-LN3 was efficiently expressed in the form of inclusion bodies; the Western blot results (Fig. 10 right) showed that the expression band was the target protein containing the His tag.
- This example demonstrates that the insulin BA single chain is fused to multiple copies of the small molecule polypeptide linaclotide and can be stably expressed in the form of inclusion bodies in prokaryotic host cells.
- Example 11 Fusion expression of inclusion body tag B' and double copy linaclotide
- the inclusion body tag B gene variant (SEQ ID No: 5) amplified with the primers ataagatctatgtttgtgaaccagcatctgtg and ataagatctcggggtataaaaaaagccgcgttc encodes a protein sequence B' (SEQ ID No: 6), which lacks KT at the C-terminus.
- the results of Tricine-SDS-PAGE (Fig. 11 left) showed that the apparent molecular weight of the fusion protein B'-LN2 was consistent with the theoretical molecular weight (9299.6Da), and there was no obvious degradation product.
- B'-LN2 was highly expressed in the form of inclusion bodies; the Western blot results (Fig. 11 right) showed that the expression band was the target protein containing the His tag.
- This example demonstrates that the insulin B chain with amino acid residues deleted, when fused to the multicopy small molecule polypeptide linaclotide, can also be efficiently and stably expressed in the form of inclusion bodies in prokaryotic host cells.
- the fusion protein B'-LN2 was detected by mass spectrometry, and its 9299.0840 peak was consistent with the predicted theoretical molecular weight of B'-LN2 (9299.6). 4647.9067 is likely to be the double-charged peak of B'-LN2, as shown in Fig. 12A.
- The B'-LN2 inclusion body sample was digested with trypsin to remove the carrier protein sequence and to split the double-copy linaclotide sequence into single-copy sequences, each carrying an additional Arg residue at the C-terminus. The mass spectrum of the digested sample (Fig. 12B) showed that the peak at 3728.7019 corresponds to the cleavage peptide GlySerHisHisHisHisHisGlySerMetPheValAsnGlnHisLeuCysGlySerHisLeuValGluAlaLeuTyrLeuValCysGlyGluArg (theoretical molecular weight 3730.1, containing 2 Cys), and the peak at 1682.4679 corresponds to the cleavage peptide CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg (theoretical molecular weight 1688.9, containing 6 Cys).
- Secondary mass spectrometry was performed on the peak at 1688.5183, which contains the polypeptide of interest; the results are shown in Fig. 12E, and the amino acid sequence assignment was as expected. Further cleavage with carboxypeptidase B can remove the extra Arg residue at the C-terminus of the above linaclotide fragment: as shown in the mass spectrum in Figure 12C, after carboxypeptidase B excises the C-terminal Arg residue of the peptide fragment (CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg), the target polypeptide linaclotide (theoretical molecular weight 1532.7) is obtained, with a corresponding peak at 1526.3960.
- Example 13 Fusion expression of inclusion body tag BA and human transthyretin
- The human transthyretin (TTR) monomer (SEQ ID No: 34) contains 127 amino acid residues. When expressed in bacteria it may be toxic to the host cells, resulting in very low expression levels (Murrell JR, Schoner RG, Liepnieks JJ, et al. Production and functional analysis of normal and variant recombinant human transthyretin proteins. J Biol Chem. 1992; 267: 16595-600), so other strategies are needed to improve expression (Liu L, Hou J, Du J, et al. Differential modification of Cys10 alters transthyretin's effect on beta-amyloid aggregation and toxicity. Protein Eng Des Sel. 2009; 22: 479-88). We attempted to express it as a fusion with the novel inclusion body tag.
- The three exons of the TTR gene (SEQ ID No: 33) were amplified from human HeLa cell genomic DNA with the primer pairs ggcaccggtgaatccaag and ctccagactcactggttttcccagaggcaaatggctcc, ggagccatttgcctctgggaaaaccagtgagtctggag and cgttggctgtgaataccacctctgcatgctcatggaatg, and cattccatgagcatgcagaggtggtattcacagccaacg and ataaagcttaagatctttccttgggattggtgacg; the products were mixed and joined by overlap PCR with primers ataggatccggccctacgggcaccggtgaatccaag and ataaagcttaagatctttccttgggattggtgacg.
- Figure 13 shows that the apparent molecular weight of the non-fusion protein TT is consistent with its theoretical molecular weight (15403.1 Da), and that the apparent molecular weight of the fusion protein BA-TT is consistent with its theoretical molecular weight (28145.5 Da).
- Comparing TT and BA-TT in the induced whole-cell samples, expression of the fusion protein BA-TT was significantly higher than that of TT.
- The non-fusion protein TT was expressed mainly in soluble form, consistent with the literature.
- Example 14 Fusion expression of inclusion body tag BA and random polypeptide X
- the results of Tricine-SDS-PAGE (Fig. 14 left) showed that the apparent molecular weight of the fusion protein BA-X was consistent with the theoretical molecular weight (12069.7Da), and there was no obvious degradation product.
- BA-X was highly expressed as inclusion bodies; Western blot (Fig. 14, right) showed that the expressed band was the His-tagged target protein.
- This example shows that the novel fusion carrier protein also promotes the expression of a wide variety of polypeptides, including non-natural peptides.
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Toxicology (AREA)
- Diabetes (AREA)
- General Chemical & Material Sciences (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Endocrinology (AREA)
- Gastroenterology & Hepatology (AREA)
- Preparation Of Compounds By Using Micro-Organisms (AREA)
- Peptides Or Proteins (AREA)
Abstract
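Disclosed is a fusion carrier protein that serves as an inclusion body tag, the amino acid sequence of which is derived from insulin. Also disclosed are a nucleic acid molecule encoding the fusion carrier protein, an expression vector and a host cell containing the nucleic acid molecule, and their use in promoting the expression of heterologous proteins or polypeptides.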
Description
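The present invention belongs to the field of bioengineering. It relates to a novel fusion carrier protein, a nucleic acid molecule encoding it, an expression vector and a host cell containing that nucleic acid molecule, and their use in promoting the expression of heterologous proteins or polypeptides. It further relates to a fusion protein containing the fusion carrier protein, a nucleic acid molecule encoding the fusion protein, and an expression vector and a host cell containing that nucleic acid molecule.
Recombinant DNA technology clones a target gene into an expression vector and expresses the target protein or polypeptide in a host cell. It is currently the most common way to produce heterologous proteins or polypeptides. Many recombinant protein expression systems have been applied successfully in different hosts, including prokaryotic cells, yeast cells, plant cells, insect cells and mammalian cells; each has its own advantages and limitations. Because of its well-characterized genetic background, mature technology, high productivity and simple operation, the Escherichia coli prokaryotic expression system was the earliest developed and remains the most widely used classical system in recombinant protein technology.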
The efficiency of recombinant expression of heterologous proteins or polypeptides is affected by many factors. Common factors include: the composition of expression elements in the expression vector (Kim KJ, Kim HE, Lee KH, et al. Two-promoter vector is highly efficient for overproduction of protein complexes. Protein Sci. 2004;13:1698-703); the stability of the target gene mRNA (Tanaka M, Tokuoka M, Shintani T, et al. Transcripts of a heterologous gene encoding mite allergen Der f 7 are stabilized by codon optimization in Aspergillus oryzae. Appl Microbiol Biotechnol. 2012;96:1275-82); codon bias and rare codons in the host cell (Kane JF. Effects of rare codon clusters on high-level expression of heterologous proteins in Escherichia coli. Curr Opin Biotechnol. 1995;6:494-500); the stability of the recombinant protein or polypeptide (Gottesman S, Zipser D. Deg phenotype of Escherichia coli lon mutants. J Bacteriol. 1978;133:844-51); the cytotoxicity of the recombinant protein ( M, J, G, et al. High-yield expression in Escherichia coli, purification and application of budding yeast K2 killer protein. Mol Biotechnol. 2014;56:644-52); the localization of the expressed protein (Choi JH, Lee SY. Secretory and extracellular production of recombinant proteins using Escherichia coli. Appl Microbiol Biotechnol. 2004;64:625-35); and host cell culture conditions (País-Chanfrau JM, García Y, Licor L, et al. Improving the expression of mini-proinsulin in Pichia pastoris. Biotechnol Lett. 2004;26:1269-72). To raise the expression level of a recombinant protein or polypeptide, several strategies can be used, such as codon optimization by replacing rare codons or engineering the host cell (Lakey DL, Voladri RK, Edwards KM, et al. Enhanced production of recombinant Mycobacterium tuberculosis antigens in Escherichia coli by replacement of low-usage codons. Infect Immun. 2000;68:233-8; Brinkmann U, Mattes RE, Buckel P. High-level expression of recombinant genes in Escherichia coli is dependent on the availability of the dnaY gene product. Gene. 1989;85:109-14), optimization of mRNA secondary structure (Punginelli C, Ize B, Stanley NR, et al. mRNA secondary structure modulates translation of Tat-dependent formate dehydrogenase N. J Bacteriol. 2004;186:6311-5), use of protease-deficient host cells to improve recombinant protein stability (Gottesman S, Zipser D. Deg phenotype of Escherichia coli lon mutants. J Bacteriol. 1978;133:844-51), and protein fusion technology (LaVallie ER, Lu Z, Diblasio-Smith EA, et al. Thioredoxin as a fusion partner for production of soluble recombinant proteins in Escherichia coli. Methods Enzymol. 2000;326:322-40).
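In protein fusion technology, a fusion carrier protein or polypeptide tag is genetically fused to a target protein or polypeptide so that the target is expressed in the host cell as a fusion protein. Fusion expression can raise the expression level of the target protein or polypeptide, improve its physicochemical properties, or attach a special label that facilitates subsequent purification or detection, and it is very widely used. The fusion carrier protein or fusion tag can be removed by specific cleavage, either enzymatic or chemical, and the target protein or polypeptide is then obtained by further separation and purification.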
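Commonly used expression-promoting fusion carrier proteins fall into four main classes. (1) Carriers that promote soluble expression, such as thioredoxin (Levarski Z, A, Krahulec J, et al. High-level expression and purification of recombinant human growth hormone produced in soluble form in Escherichia coli. Protein Expr Purif. 2014;100:40-7), SUMO (Malakhov MP, Mattern MR, Malakhova OA, et al. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics. 2004;5:75-86), glutathione S-transferase (Jung JG, Lee YJ, Velmurugan N, et al. High-yield production of the VP1 structural protein epitope from serotype O foot-and-mouth disease virus in Escherichia coli. J Ind Microbiol Biotechnol. 2013;40:705-13) and maltose-binding protein (Vu TT, Jeong B, Yu J, et al. Soluble prokaryotic expression and purification of crotamine using an N-terminal maltose-binding protein tag. Toxicon. 2014;92:157-65). These carriers are highly water-soluble and can effectively prevent folding intermediates of the target protein or polypeptide from aggregating, which favors formation of the correct three-dimensional structure. For proteins or polypeptides containing disulfide bonds, carriers of this class such as thioredoxin may also work well (Lauber T, Marx UC, Schulz A, et al. Accurate disulfide formation in Escherichia coli: overexpression and characterization of the first domain (HF6478) of the multiple Kazal-type inhibitor LEKTI. Protein Expr Purif. 2001;22:108-12). (2) Carriers that promote inclusion body expression, such as ThiS (Yuan S, Xu J, Ge Y, Yan Z, et al. Prokaryotic ubiquitin-like ThiS fusion enhances the heterologous protein overexpression and aggregation in Escherichia coli. PLoS One. 2013;8:e62529), MoaD (Yuan S, Wang X, Xu J, et al. Ubiquitin-like prokaryotic MoaD as a fusion tag for expression of heterologous proteins in Escherichia coli. BMC Biotechnol. 2014;14:5) and a PurF fragment (Lee JH, Kim JH, Hwang SW, et al. High-level expression of antimicrobial peptide mediated by a fusion partner reinforcing formation of inclusion bodies. Biochem Biophys Res Commun. 2000;277:575-80). Compared with soluble expression, inclusion body expression offers high yield, high stability and easy separation and purification, and it reduces the toxicity of the target protein or polypeptide to the host cell. However, inclusion body proteins must go through a complex refolding process to give correctly folded functional protein, which partly offsets the advantage of high expression. (3) Self-cleaving fusion expression, such as inteins (Xie YG, Han FF, Luan C, et al. High-yield soluble expression and simple purification of the antimicrobial peptide OG2 using the intein system in Escherichia coli. Biomed Res Int. 2013;2013:754319). After the fusion protein is expressed, intein self-cleavage is induced by changing conditions such as temperature or pH, which avoids the use of exogenous proteases and chemical reagents and simplifies purification of the target protein or polypeptide (Mee C, Banki MR, Wood DW. Towards the elimination of chromatography in protein purification: expressing proteins engineered to purify themselves. Chem Eng J. 2008;135:56-62). (4) Carriers that direct secretory expression, such as PelB (Wu D, Lu Y, Huang H, et al. High-level secretory expression of metchnikowin in Escherichia coli. Protein Expr Purif. 2013;91:49-53) and the protein A signal peptide ( A, Blingsmo OR, Saether O, et al. Expression and characterization of a recombinant human parathyroid hormone secreted by Escherichia coli employing the staphylococcal protein A promoter and signal sequence. J Biol Chem. 1990;265:7338-44). In the E. coli expression system, the expressed fusion protein can be directed by the signal sequence into the periplasm, which protects it from degradation by intracellular proteases; the periplasm is more oxidizing than the cytoplasm, which favors correct folding of target proteins or polypeptides containing disulfide bonds. However, periplasmic expression usually gives low yields.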
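Most existing fusion carrier proteins, such as glutathione S-transferase (27 kD) and maltose-binding protein (50 kD), have large molecular weights, so the target protein, especially a low-molecular-weight polypeptide, makes up only a small proportion of the fusion protein. Although the expression level of the fusion protein is increased, the expression efficiency of the target protein or polypeptide in the host cell remains relatively low. Developing novel low-molecular-weight fusion carrier proteins can effectively raise both the expression level of the fusion protein and the expression efficiency of the target protein or polypeptide, giving a cost advantage in industrial production.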
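The size argument above can be made concrete with a small calculation. The following is only an illustrative sketch: the GST and MBP sizes are the ones quoted in the text, while the 7.6 kDa "small carrier" and the 3.5 kDa target peptide are assumed values chosen for illustration, and linkers and tags are ignored.

```python
# Carrier sizes in kDa. GST and MBP values are quoted in the text above;
# the 7.6 kDa "small carrier" and the 3.5 kDa target are hypothetical.
CARRIERS_KDA = {"GST": 27.0, "MBP": 50.0, "small carrier (assumed)": 7.6}


def target_fraction(carrier_kda, target_kda):
    """Mass fraction of the fusion protein contributed by the target."""
    return target_kda / (carrier_kda + target_kda)


if __name__ == "__main__":
    target_kda = 3.5  # hypothetical small peptide
    for name, carrier_kda in CARRIERS_KDA.items():
        share = target_fraction(carrier_kda, target_kda)
        print(f"{name}: target is {share:.1%} of the fusion mass")
```

For a fixed amount of fusion protein, a smaller carrier leaves a larger share of the mass for the target, which is the point the paragraph makes.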
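Insulin is a hormone that promotes the synthesis of glycogen, fat and protein in animals and lowers blood glucose; it has long been used mainly to treat diabetes. Insulin is synthesized and secreted by pancreatic islet β cells and consists of two subunits, B and A; the B subunit has 30 amino acid residues and the A subunit has 21. The two subunits are linked by two disulfide bonds, and the A subunit contains one further intrachain disulfide bond. In vivo, islet β cells first synthesize single-chain preproinsulin, which carries a leader peptide. Processing gives proinsulin, a single-chain protein in which the B and A subunits are joined by the intervening C-peptide; the C-peptide must be removed by proteases to give mature two-chain insulin composed of the B and A subunits. The amino acid sequence and structure of insulin differ slightly between animal species (human, cattle, sheep, pig and so on).
Early pharmaceutical insulin was mainly animal insulin, such as porcine insulin. With the development of genetic engineering, animal insulin was gradually replaced by recombinant human insulin. Besides producing standard human insulin, recombinant technology has been used to change one or several amino acid residues of the human insulin sequence, giving human insulin analogues suited to various clinical needs. For example, changing Ser at B9 to Asp and Thr at B27 to Glu gives a rapid-acting insulin, and changing Thr at B27 to Arg, amidating the C-terminus of the B subunit and changing Asn at A21 to Gly gives a long-acting insulin (Bristow AF. Recombinant-DNA-derived insulin analogues as potentially useful therapeutic agents. Trends Biotechnol. 1993;11:301-5).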
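Methods for producing recombinant insulin fall into two main types, according to how the B and A subunits are expressed. (1) Separate-subunit synthesis: the B and A subunits are expressed separately by genetic engineering and then mixed, so that they refold oxidatively under suitable conditions and form the correct disulfide bonds, giving mature insulin. However, the B and A subunits are too small to be expressed directly and must be expressed as fusions with a carrier protein such as β-galactosidase (Goeddel DV, Kleid DG, Bolivar F, et al. Expression in Escherichia coli of chemically synthesized genes for human insulin. Proc Natl Acad Sci U S A. 1979;76:106-10). Because the carrier protein is very large, the direct yield of the B and A subunits is extremely low; together with the low refolding efficiency in later steps, this makes the insulin yield too low and the cost too high, and the method has been abandoned. (2) The proinsulin method: a single-chain proinsulin molecule composed of the B subunit, C-peptide and A subunit is first expressed by genetic engineering, refolded oxidatively, and then digested with trypsin and carboxypeptidase B to remove the C-peptide, giving mature insulin. In the similar mini-proinsulin method, the C-peptide is shortened or removed and the B and A subunits are joined directly by a few amino acid residues; after recombinant expression and correct oxidative refolding, the mini-proinsulin is digested with trypsin and carboxypeptidase B to give insulin. It is generally considered, however, that proinsulin and mini-proinsulin are unstable and easily degraded by the host, which lowers the expression yield, so that fusion expression with various carrier proteins is required (Castellanos-Serra LR, Hardy E, Ubieta R, Vispo NS, et al. Expression and folding of an interleukin-2-proinsulin fusion protein and its conversion into insulin by a single step enzymatic removal of the C-peptide and the N-terminal fused sequence. FEBS Lett. 1996;378:171-6; Wetzel R, Kleid DG, Crea R, et al. Expression in Escherichia coli of a chemically synthesized gene for a "mini-C" analog of human proinsulin. Gene. 1981;16:63-71; Trabucchi A, Guerra LL, Faccinetti NI, et al. Expression and characterization of human proinsulin fused to thioredoxin in Escherichia coli. Appl Microbiol Biotechnol. 2012;94:1565-76; Malik A, Jenzsch M, Lübbert A, et al. Periplasmic production of native human proinsulin as a fusion to E. coli ecotin. Protein Expr Purif. 2007;55:100-11). It should be noted that, as small polypeptides, proinsulin and mini-proinsulin can also be expressed directly in prokaryotes as inclusion bodies when only a short leader peptide containing a His-tag purification tag is fused to the N-terminus, which effectively avoids proteolysis and toxic damage to the host cell (Tikhonov RV, Pechenov SE, Belacheu IA, et al. Recombinant human insulin IX. Investigation of factors, influencing the folding of fusion protein-S-sulfonates, biotechnological precursors of human insulin. Protein Expr Purif. 2002;26:187-93; Shin CS, Hong MS, Bae CS, et al. Enhanced production of human mini-proinsulin in fed-batch cultures at high cell density of Escherichia coli BL21(DE3)[pET-3aT2M2]. Biotechnol Prog. 1997;13:249-57). This suggests that insulin-related sequences may be potential low-molecular-weight fusion carrier proteins that can effectively promote the recombinant expression of heterologous proteins or polypeptides.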
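Summary of the Invention
Technical problem solved by the invention: in one aspect, the invention provides a novel fusion carrier protein, a gene sequence encoding it, an expression vector and/or host cell containing that gene sequence, and the use of the fusion carrier protein, its gene sequence, expression vector or host cell in promoting the expression of a target protein or polypeptide. In another aspect, it provides a fusion protein comprising the fusion carrier protein, a gene sequence encoding the fusion protein, an expression vector and/or host cell containing that gene sequence, and a method for expressing the fusion protein.
In a first aspect, the technical solution of the invention provides a fusion carrier protein for expressing a target protein or polypeptide, characterized in that the amino acid sequence of the fusion carrier protein is derived from the amino acid sequence of insulin, or from that sequence with one or several amino acids substituted, deleted and/or added, or is an amino acid sequence formed by conventional modification of any of the above sequences, or an amino acid sequence formed by adding a tag to any of the above sequences; wherein the conventional modifications include acetylation, amidation, cyclization, glycosylation, phosphorylation, alkylation, biotinylation, fluorophore modification, polyethylene glycol (PEG) modification and immobilization, and the tags include 6×His, GST, EGFP, MBP, Nus, HA, IgG, FLAG, c-Myc and Profinity eXact.
The above fusion carrier protein, characterized in that it comprises: (1) the human insulin A subunit, that is, the amino acid sequence shown in SEQ ID No: 2; or (2) the amino acid sequence of (1) with one or several amino acids substituted, deleted and/or added, which remains capable of fusion expression.
The above fusion carrier protein, characterized in that it comprises: (1) the human insulin B subunit, that is, the amino acid sequence shown in SEQ ID No: 4; or (2) the amino acid sequence of (1) with one or several amino acids substituted, deleted and/or added, which remains capable of fusion expression, preferably the amino acid sequence shown in SEQ ID No: 6.
The above fusion carrier protein, characterized in that it comprises: (1) a single-chain protein molecule containing both the human insulin A subunit and B subunit, preferably the amino acid sequence shown in SEQ ID No: 8 or SEQ ID No: 9; or (2) a variant of the single-chain molecule of (1), containing the human insulin A subunit and/or B subunit with one or several amino acids substituted, deleted and/or added, which remains capable of fusion expression, preferably the amino acid sequence shown in SEQ ID No: 25.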
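In a second aspect, the technical solution of the invention provides a nucleic acid molecule encoding the fusion carrier protein of the first aspect, and an expression vector or host cell containing the nucleic acid molecule.
The above nucleic acid molecule, characterized in that it contains a gene sequence encoding the fusion carrier protein of the first aspect of the invention, preferably the gene sequence shown in SEQ ID No: 1, SEQ ID No: 3, SEQ ID No: 5, SEQ ID No: 7 or SEQ ID No: 24.
The above expression vector, characterized in that it comprises the nucleic acid molecule of this aspect, operably linked to a promoter of the vector for expression of the protein encoded by the nucleic acid molecule.
The above host cell, characterized in that it contains the nucleic acid molecule or expression vector of this aspect.
In a third aspect, the technical solution of the invention provides the use of the fusion carrier protein of the first aspect, and of the nucleic acid molecule, expression vector or host cell of the second aspect, in promoting the expression of a target protein or polypeptide.
In a fourth aspect, the technical solution of the invention provides a fusion protein, characterized in that it contains the fusion carrier protein of the first aspect of the invention and at least one target protein or polypeptide, and the target protein or polypeptide is not insulin.
The above fusion protein may be conventionally modified or may carry an added expression or purification tag; the conventional modifications include acetylation, amidation, cyclization, glycosylation, phosphorylation, alkylation, biotinylation, fluorophore modification, polyethylene glycol (PEG) modification and immobilization; the tags include 6×His, GST, EGFP, MBP, Nus, HA, IgG, FLAG, c-Myc and Profinity eXact.
The above fusion protein may contain 1, 2, 3 or 4 target proteins or polypeptides.
The above fusion protein, characterized in that it contains a specific polypeptide cleavage site or sequence between the fusion carrier protein and the target protein or polypeptide.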
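The above fusion protein, characterized in that the target protein or polypeptide in the fusion protein contains 5-1000 amino acid residues.
The above fusion protein, characterized in that the target protein or polypeptide in the fusion protein is selected from GLP-1, oxyntomodulin, enfuvirtide, linaclotide, human transthyretin, and variants thereof.
For the above fusion protein, the preferred amino acid sequence is selected from the amino acid sequences shown in SEQ ID No: 12, SEQ ID No: 13, SEQ ID No: 14, SEQ ID No: 17, SEQ ID No: 18, SEQ ID No: 21, SEQ ID No: 27, SEQ ID No: 30, SEQ ID No: 31, SEQ ID No: 32, SEQ ID No: 36 and SEQ ID No: 39.
In a fifth aspect, the technical solution of the invention provides a nucleic acid molecule encoding the fusion protein of the fourth aspect, and an expression vector or host cell containing the nucleic acid molecule.
The above nucleic acid molecule, characterized in that it contains a gene sequence encoding the fusion protein of the fourth aspect of the invention.
The above expression vector, characterized in that it comprises the nucleic acid molecule of this aspect, operably linked to a promoter of the vector for expression of the protein encoded by the nucleic acid molecule.
The above host cell, characterized in that it contains the nucleic acid molecule or expression vector of this aspect.
In a sixth aspect, the technical solution of the invention provides a method for producing a target protein or polypeptide by the following steps: (1) expanding a suitable host cell; (2) inducing the host cell to express the fusion protein; (3) isolating and preparing the fusion protein; and/or (4) chemically cleaving or enzymatically digesting the specific polypeptide cleavage site or sequence to separate the fusion carrier protein from the target protein or polypeptide and obtain the target protein or polypeptide; characterized in that the host cell in step (1) is the host cell of the second or fifth aspect of the invention.
The protein names corresponding to the sequences in the sequence listing of the invention are shown in the following table: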
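Detailed Description of the Invention:
Many low-molecular-weight heterologous polypeptides are very easily degraded by proteases when expressed in host cells, and some expressed proteins are toxic to the host cell, causing growth arrest or death. Direct expression of a target protein or polypeptide therefore often runs into difficulty. The present invention finds that the single-chain insulin B and A subunits, protein fragments formed by fusing the B and A single chains, and their various variants can serve as novel fusion carrier proteins that effectively raise the expression level of a target protein or polypeptide. In bacterial hosts in particular, these carriers promote the expression of many target proteins as inclusion bodies. Improved stability of the target protein may be an important reason for the increase in expression.
To practice the invention, a gene clone encoding the insulin B and/or A subunit must first be obtained. Typically, PCR primers are designed from the known gene sequence and a cDNA clone is obtained directly by reverse-transcription PCR from RNA extracted from pancreatic tissue or islet cells of humans or various animals. More simply, a gene sequence identical to the natural coding DNA sequence can be synthesized; alternatively, gene sequences with different codon preferences can be synthesized from the amino acid sequence by well-known design methods, according to the codon preference of the host. Gene clones containing the insulin B and A subunits are also readily available from commercial sources.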
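An expression vector encoding a fusion protein of the fusion carrier protein of the invention and a given target protein can be constructed by well-known methods. Using conventional molecular biology techniques, the gene encoding the target protein is first joined directly and in frame to the gene encoding the insulin B subunit, the gene encoding the A subunit, or the gene encoding a protein fragment formed by fusing the B and A single chains. The order of joining can be adjusted as needed to give fusion protein products of various forms, such as B-T, T-B, A-T, T-A, A-B-T, B-A-T, T-A-B and T-B-A. The B and A contained in the fusion protein may be the single-chain human insulin B subunit sequence (SEQ ID No: 4) and the single-chain human insulin A subunit sequence (SEQ ID No: 2), or insulin B-chain and A-chain sequences from other animal species, or variants of human insulin with amino acid residues substituted, deleted or added, provided that they retain at least 70% homology with the human insulin sequence, or at least 75%, at least 80%, at least 85%, at least 90% or at least 95% homology, and can be effectively expressed as fusions. The target protein T may be present as one or more identical or different molecules. The gene fragments may also be joined through a gene fragment encoding a spacer sequence. The spacer between the fusion carrier protein and the target protein may contain a specific sequence that can be cleaved chemically or enzymatically: for example, a Met (methionine) residue can be cleaved chemically by CNBr; Lys (lysine) or Arg (arginine) residues can be cleaved by trypsin; LysArg or ArgArg can be cleaved by dibasic proteases such as Kex2; GluAsnLeuTyrPheGln can be recognized and cleaved by tobacco etch virus protease (TEV protease); IleGluGlyArg can be recognized and cleaved by factor Xa (Xa protease); and other suitable protease cleavage sites may be used. The fusion protein coding gene obtained in this way is inserted into a suitable expression vector by well-known methods to give a recombinant vector that expresses the fusion protein. Alternatively, the required gene fragments can be ligated into the expression vector one after another in the appropriate order. The expression vectors include, but are not limited to, prokaryotic expression vectors and various eukaryotic expression vectors. Various commercial expression vectors are available, such as the pQE series and the pET series. The recombinant vector is introduced into a host cell by well-known methods; host cells include, but are not limited to, E. coli, other prokaryotic cells and various eukaryotic cells. Expression is induced under suitable conditions, the expressing host cells are disrupted and lysed, and the fusion protein product is collected. The expression product can be purified by well-known methods; the insulin B and A subunit sequences that serve as the fusion carrier protein can be removed from the fusion protein by suitable chemical or protease cleavage.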
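The cleavage rules listed above can be sketched as simple pattern matching on a one-letter amino acid sequence. This is a minimal illustration, not a validated digestion predictor: the rule table only restates the sites named in the text, the trypsin rule ignores known exceptions (such as a following proline), and the tandem-repeat test sequence is a simplified assumption based on the linaclotide-plus-Arg unit described in the examples.

```python
import re
from typing import List

# Cleavage rules restated from the text (one-letter codes). Each cut is made
# on the C-terminal side of the matched residue or motif.
CLEAVAGE_RULES = {
    "CNBr": r"M",
    "trypsin": r"[KR]",
    "dibasic protease (e.g. Kex2)": r"KR|RR",
    "TEV protease": r"ENLYFQ",
    "factor Xa": r"IEGR",
}


def cut_positions(sequence: str, pattern: str) -> List[int]:
    """Return 1-based positions of the residues after which cleavage occurs."""
    return [m.end() for m in re.finditer(pattern, sequence)]


def digest(sequence: str, pattern: str) -> List[str]:
    """Split a sequence after every match of the cleavage pattern."""
    fragments, start = [], 0
    for pos in cut_positions(sequence, pattern):
        fragments.append(sequence[start:pos])
        start = pos
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    # Assumed simplification: two linaclotide copies, each followed by Arg.
    tandem = "CCEYCCNPACTGCYR" * 2
    print(digest(tandem, CLEAVAGE_RULES["trypsin"]))
```

Choosing a spacer whose motif does not occur inside the target peptide keeps the target intact after cleavage, which is why the examples match each spacer to the protease used.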
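The invention can be used for fusion expression of many target proteins or polypeptides used in disease diagnosis, treatment, prevention or other fields. Molecules with 50 or more amino acid residues are generally called proteins and those with fewer than 50 residues are usually called polypeptides, although the two terms are often used interchangeably. Target proteins may include growth factors, cytokines, ligands, receptors, transporters, antigens, antibodies and their fragments. The invention is especially advantageous for target polypeptides; common target polypeptides include glucagon-like peptide 1 (GLP-1), exenatide (Exendin-4), oxyntomodulin, various antimicrobial peptides, antiviral peptides such as enfuvirtide (T-20), atrial natriuretic peptide, octreotide and linaclotide.
Beneficial Technical Effects:
One advantage of the invention is that the fusion carrier protein and the recombinant expression system built on it promote the expression of many different proteins or polypeptides. The carrier is small: the single-chain insulin B and A subunits consist of only about 30 and about 20 amino acid residues respectively, and a protein fragment formed by fusing the B and A single chains may contain only about 50 residues. Compared with known fusion carrier proteins, these carriers account for a markedly smaller share of the fusion protein relative to the target protein or polypeptide. Target proteins, and especially low-molecular-weight polypeptides, can therefore be expressed recombinantly with high efficiency, which effectively lowers production cost.
Another advantage of the invention is that insulin was the first bioengineered product in human history, and the structure, sequence variation and recombinant expression of insulin and its many structural variants are well understood. As a fusion carrier protein it does not need insulin's biological activity, which leaves many possibilities for constructing variant carriers with amino acid residues substituted, deleted or added so that they gain additional physicochemical properties, such as ion-binding properties.
A further advantage of the invention is that insulin is very well tolerated in vivo in clinical practice and causes no obvious toxicity. A fusion carrier protein derived from insulin may therefore not need to be cleaved off from an active fusion product with the target protein or polypeptide, and the fusion protein might be used directly in vivo.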
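Figure 1. Prokaryotic recombinant expression of the fusion carrier protein BA.
The product of the fusion carrier protein BA expressed in fusion with the His tag of the expression vector pQE80L was analyzed by Tricine-SDS-PAGE and Western blot. Panel A (left): uninduced whole-cell sample (-) and IPTG-induced whole-cell sample (+) on 16.5% Tricine-SDS-PAGE; few polypeptides below about 10 kD are present in the uninduced host. Panel B (right): uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), soluble supernatant of induced cells (S) and insoluble inclusion body sample (I) on 13% Tricine-SDS-PAGE (left) and the corresponding Western blot (right); after induction the fusion protein accumulates efficiently in the host as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); arrows indicate the position of the BA product band.
Figure 2. Fusion expression of the fusion carrier protein BA with GLP-E.
The fusion carrier protein BA is fused to the N-terminus of GLP-E. Samples of the expression vector BA-GLP-E/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed at high levels as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the BA-GLP-E fusion protein band.
Figure 3. Fusion expression of the fusion carrier protein B with GLP-E.
The fusion carrier protein B is fused to the N-terminus of GLP-E. Samples of the expression vector B-GLP-E/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed effectively as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the B-GLP-E fusion protein band.
Figure 4. Fusion expression of the fusion carrier protein A with GLP-E.
The fusion carrier protein A is fused to the N-terminus of GLP-E. Samples of the expression vector A-GLP-E/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed effectively as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the A-GLP-E fusion protein band.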
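Figure 5. Fusion expression of the fusion carrier protein BA with GLP-1.
The fusion carrier protein BA is fused to the N-terminus of GLP-1. Samples of the expression vector BA-GLP/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed at high levels as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the BA-GLP fusion protein band.
Figure 6. Fusion expression of the fusion carrier protein B with GLP-1.
The fusion carrier protein B is fused to the N-terminus of GLP-1. Samples of the expression vector B-GLP/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed effectively as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the B-GLP fusion protein band.
Figure 7. Fusion expression of the fusion carrier protein BA with enfuvirtide (T-20).
The fusion carrier protein BA is fused to the N-terminus of enfuvirtide (T-20). Samples of the expression vector BA-T/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed effectively as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the BA-T fusion protein band.
Figure 8. Fusion expression of an inclusion body tag BA variant with oxyntomodulin.
The inclusion body tag BA variant B′A′ is fused to the N-terminus of oxyntomodulin (OXN). Samples of the expression vector B′A′-OXN/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 16.5% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed at high levels as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the B′A′-OXN fusion protein band.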
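Figure 9. Fusion expression of the fusion carrier protein BA with single-copy linaclotide.
The fusion carrier protein BA is fused to the N-terminus of single-copy linaclotide. Samples of the expression vector BA-LN1/pQE80L expressed in E. coli BL21(DE3)pLysS: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed at high levels as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the BA-LN1 fusion protein band.
Figure 10. Fusion expression of the fusion carrier protein BA with triple-copy linaclotide.
The fusion carrier protein BA is fused to the N-terminus of triple-copy linaclotide. Samples of the expression vector BA-LN3/pQE80L expressed in E. coli BL21(DE3)pLysS: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed effectively as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the BA-LN3 fusion protein band.
Figure 11. Fusion expression of a fusion carrier protein B variant with double-copy linaclotide.
The fusion carrier protein B variant is fused to the N-terminus of double-copy linaclotide. Samples of the expression vector B′-LN2/pQE80L expressed in E. coli BL21(DE3)pLysS: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed at high levels as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the B′-LN2 fusion protein band.
Figure 12. Mass spectrometric analysis of B′-LN2 inclusion bodies.
Panel A: primary mass spectrum of B′-LN2 inclusion bodies. Panel B: trypsin-digested sample of B′-LN2 inclusion bodies. Panel C: trypsin-digested and reduced sample of B′-LN2 inclusion bodies. Panel D: sample of B′-LN2 inclusion bodies digested with both trypsin and carboxypeptidase B. Panel E: tandem mass spectrum (MS/MS) of the linaclotide-containing tryptic peptide (amino acid sequence CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg) from the trypsin-digested and reduced sample. All peak assignments agree with theoretical expectations. In panel E, M+H denotes the quasi-molecular ion, and b and y denote the characteristic b-series and y-series fragment ions produced in MS/MS.
Figure 13. Fusion expression of the fusion carrier protein BA with human transthyretin.
The fusion carrier protein BA is fused to the N-terminus of human transthyretin. Samples of the expression vector BA-TT/pQE80L and of the non-fusion expression vector TT/pQE80L, each expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 15% SDS-PAGE. Expression of the fusion protein BA-TT is clearly higher than that of TT. The non-fusion protein TT is expressed mainly in soluble form, but when soluble protein is extracted from the supernatant the recovery is clearly lower than the expected yield; the fusion protein BA-TT is expressed mainly as inclusion bodies, with no obvious loss. TT and BA-TT are the expression products of vectors TT/pQE80L and BA-TT/pQE80L respectively; M, protein electrophoresis marker with molecular weights (kD); arrows indicate the positions of the corresponding product bands.
Figure 14. Fusion expression of the fusion carrier protein BA with random polypeptide X.
The fusion carrier protein BA is fused to the N-terminus of random polypeptide X. Samples of the expression vector BA-X/pQE80L expressed in E. coli TG1: uninduced whole-cell sample (-), IPTG-induced whole-cell sample (+), induced supernatant sample (S) and induced inclusion body sample (I), analyzed by 13% Tricine-SDS-PAGE (left) and Western blot (right). The fusion protein shows no obvious degradation and is expressed at high levels as inclusion bodies. M, protein electrophoresis marker with molecular weights (kD); the arrow indicates the position of the BA-X fusion protein band.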
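The invention is described and illustrated below through examples, with reference to the figures. The examples are intended to give a better understanding of the technical content of the invention and to show how the novel fusion carrier protein is used and how the corresponding fusion expression system performs; they do not limit the scope of protection of the invention.
Experimental Methods:
The PCR, restriction digestion and ligation steps used in routine plasmid construction, and the transformation and bacterial culture steps used in protein expression, are familiar to researchers in the field, so the experimental details are not given in full; see the standard conditions described in Molecular Cloning: A Laboratory Manual (J. Sambrook, E.F. Fritsch, T. Maniatis).
1. Gene cloning and expression vector construction:
Genes encoding the target proteins or polypeptides were obtained by synthesis and PCR amplification, digested, and inserted into the corresponding restriction sites of the expression vector pQE80L to construct the recombinant expression vectors; the expression products carry a His-6 sequence fused at the N-terminus. The target gene sequence in every vector was verified by nucleic acid sequencing.
2. Induced expression of fusion proteins:
The recombinant plasmids were transformed into competent E. coli TG1 or BL21(DE3)pLysS cells to give the engineered strains. All examples used small-scale laboratory shake-flask expression. Cells were activated overnight with shaking at 37°C in LB liquid medium containing ampicillin; the overnight culture was then diluted 1:100 into fresh LB medium containing ampicillin (20-30 mL), grown with shaking at 37°C to a suitable point in log phase, and induced with 0.1 mM or 1 mM IPTG for a set time.
For the whole-cell samples, 1 mL each of uninduced and induced culture was centrifuged at 7000 rpm and 4°C for 5 min, the supernatant was discarded, and the cells were resuspended in 200 μL PBS. 50 μL of 5×SDS loading buffer (300 mM Tris-HCl (pH 6.8), 20% β-mercaptoethanol, 20% SDS, 25% glycerol, 0.05% bromophenol blue) was added directly, and the samples were incubated in a boiling water bath for 20 min and stored at -20°C as the uninduced and induced whole-cell samples. For fractionation, 14 mL of induced culture was centrifuged at 7000 rpm and 4°C for 5 min, the supernatant was discarded, and the cells were resuspended in 2.66 mL PBS, mixed with 140 μL of 20% Triton X-100, frozen and thawed three times, and sonicated for 10 min. 2.4 mL of the whole-cell lysate was centrifuged at 12000 rpm and 4°C for 15 min to separate supernatant and pellet. 200 μL of supernatant was mixed with 50 μL of 5×SDS loading buffer, incubated in a boiling water bath for 20 min and stored at -20°C as the induced supernatant sample. The pellet was resuspended in 1 mL of inclusion body wash solution (PBS containing 1% Triton X-100 and 5 mM EDTA) and centrifuged at 12000 rpm and 4°C for 15 min, and the supernatant was discarded; this wash was repeated three times. The pellet was then resuspended in 1 mL PBS and centrifuged at 12000 rpm and 4°C for 15 min, and the supernatant was discarded, giving fairly pure inclusion bodies. The inclusion body pellet was resuspended in 960 μL PBS; 80 μL of this suspension was mixed with 120 μL PBS and 50 μL of 5×SDS loading buffer, incubated in a boiling water bath for 20 min and stored at -20°C as the induced inclusion body sample. All of these samples have the same dilution for electrophoresis, so their protein contents are directly comparable.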
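The claim that all samples share the same dilution can be checked from the volumes given above. The sketch below only redoes that arithmetic; the function name is hypothetical, and it assumes that pellet volumes are negligible and that the supernatant keeps the concentration of the lysate it came from.

```python
import math


def culture_ml_per_ul(culture_ml, suspension_ul, aliquot_ul, final_ul):
    """mL of original culture represented by 1 uL of the final gel sample."""
    return culture_ml * (aliquot_ul / suspension_ul) / final_ul


# Whole cells: 1 mL culture -> 200 uL PBS, all of it + 50 uL loading buffer.
whole_cell = culture_ml_per_ul(1.0, 200, 200, 250)

# Supernatant: 14 mL culture -> 2660 uL PBS + 140 uL Triton X-100
# (2800 uL lysate); 200 uL of supernatant + 50 uL loading buffer.
supernatant = culture_ml_per_ul(14.0, 2800, 200, 250)

# Inclusion bodies: pellet from 2400 of the 2800 uL lysate (12 mL culture),
# resuspended in 960 uL PBS; 80 uL + 120 uL PBS + 50 uL loading buffer.
inclusion_body = culture_ml_per_ul(14.0 * 2400 / 2800, 960, 80, 250)

if __name__ == "__main__":
    samples = [("whole cell", whole_cell),
               ("supernatant", supernatant),
               ("inclusion body", inclusion_body)]
    for name, value in samples:
        print(f"{name}: {value * 1000:.1f} uL culture per uL sample")
```

Under these assumptions each sample type works out to 1 mL of culture per 250 μL of final sample, which is why equal loading volumes are comparable across lanes.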
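3. Protein electrophoresis (SDS-PAGE/Tricine-SDS-PAGE) and quantification:
SDS-PAGE gels were prepared by the standard method, and Tricine-SDS-PAGE gels were prepared as described in the literature ( H. Tricine-SDS-PAGE. Nat Protoc. 2006;1:16-22). Equal volumes of the prepared uninduced whole-cell, induced whole-cell, induced supernatant and induced inclusion body samples were loaded and analyzed by 15% SDS-PAGE or 13% (or another specified concentration) Tricine-SDS-PAGE. Electrophoresis was run at a constant 50 V for 50 min and then at a constant 150 V, in an ice-water bath throughout. After electrophoresis, gels were stained with Coomassie Brilliant Blue R-250 (Tricine-SDS-PAGE gels were first fixed in 5% glutaraldehyde for 30 min and then stained with Coomassie Brilliant Blue R-250) and destained to give the electrophoresis result.
Bovine serum albumin (BSA) was used as the standard: 0.5, 1, 2 and 4 μg of BSA and a suitable amount of induced inclusion body sample were loaded and run on the same gel. The protein bands on the destained gel were scanned densitometrically with QuantiScan software, and the fusion protein content was calculated from the BSA standard curve.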
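The standard-curve calculation described above can be sketched as a straight-line fit of band intensity against the loaded BSA amount. This is a minimal sketch, not the QuantiScan procedure: the intensities are made-up numbers, the helper names are hypothetical, and the conversion factor of 0.004 mL culture per μL of sample is derived from the sample-preparation volumes in section 2.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def band_amount_ug(intensity, slope, intercept):
    """Protein amount (ug) read off the standard curve for one band."""
    return (intensity - intercept) / slope


def yield_mg_per_l(band_ug, loaded_ul, culture_ml_per_ul=0.004):
    """Convert ug in a lane to mg per litre of culture (ug/mL equals mg/L)."""
    return band_ug / (loaded_ul * culture_ml_per_ul)


if __name__ == "__main__":
    bsa_ug = [0.5, 1.0, 2.0, 4.0]             # amounts stated in the text
    bsa_intensity = [1000, 2000, 4000, 8000]  # hypothetical scan values
    slope, intercept = fit_line(bsa_ug, bsa_intensity)
    amount = band_amount_ug(3000, slope, intercept)  # hypothetical band
    print(amount, yield_mg_per_l(amount, loaded_ul=5))
```

The fit is only meaningful while the sample band falls inside the range spanned by the BSA standards, which is presumably why "a suitable amount" of sample is loaded.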
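4. Western blot detection:
After electrophoresis, the resolving gel was used for Western blot. Proteins were transferred from the resolving gel to a PVDF membrane by wet transfer, and the membrane was blocked in PBST (PBS containing 0.1% Tween-20) with 5% skimmed milk powder, with shaking at room temperature for 2 h. Mouse monoclonal anti-His-tag antibody was diluted 1:2000 in PBST containing 2.5% skimmed milk powder and incubated with the membrane overnight at 4°C. After four washes in PBST (15 min each), HRP-labeled goat anti-mouse IgG diluted 1:2000 in PBST containing 2.5% skimmed milk powder was added and incubated with shaking at room temperature for 1.5 h, followed by four washes in PBST (15 min each). 1 mL of ECL developing solution was prepared and applied evenly to the membrane, which was then exposed.
5. Enzymatic digestion and mass spectrometric analysis of protein samples:
Inclusion body samples were dissolved in inclusion body solubilization buffer (20 mM Tris-HCl, pH 8.0, containing 8 M urea), DTT was added to a final concentration of 5 mM, and the samples were shaken to dissolve, left at room temperature for 20 min and centrifuged at 12000 rpm and 4°C for 10 min. The supernatant was diluted 20-fold with 20 mM Tris-HCl (pH 8.0) buffer and incubated with trypsin (final concentration 4 mg/L) at 37°C overnight, or with carboxypeptidase B (final concentration 0.5 mg/L) at 37°C for 30 min; the reaction was stopped by acidification with acetic acid. The acidified digest was mixed with an equal volume of CHCA matrix (30 g/L; 70% acetonitrile/30% methanol/0.1% trifluoroacetic acid), and 1 μL of the mixture was spotted on the target plate and air-dried. Samples were analyzed by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) (MALDI-TOF/TOF Analyzer 4800plus, Applied Biosystem) at a laser intensity of 4800 W/cm2, in linear or reflector mode, for qualitative analysis of the molecular weights of the digestion fragments.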
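Example 1: Prokaryotic recombinant expression of the inclusion body tag BA
The fusion carrier protein B gene (SEQ ID No: 3) was amplified with forward primer ataagatctatgtttgtgaaccagcatctgtg and reverse primer atactcgagttaggttttcggggtataaaaaaag, and the fusion carrier protein A gene (SEQ ID No: 1) was amplified with forward primer ataggatccatgggcattgtggaacagtgc and reverse primer ataagatctgttgcaatagttttccagctg. The PCR products of the B and A genes were mixed with primer ctttttttataccccgaaaacccgccgcggcattgtggaacagtgc, and overlap PCR with forward primer ataagatctatgtttgtgaaccagcatctgtg and reverse primer atactcgagttagttgcaatagttttccagctg gave the inclusion body tag BA gene (SEQ ID No: 7), which encodes the protein sequence SEQ ID No: 8. After double digestion with BglII and XhoI, the gene was inserted into the BamHI and SalI sites of the expression vector pQE80L to construct the vector BA/pQE80L, which encodes the protein sequence SEQ ID No: 9.
The expression vector BA/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain. In log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 20 h, and samples were collected for Tricine-SDS-PAGE and Western blot. 16.5% Tricine-SDS-PAGE resolves proteins in the range 3-70 kD; in Figure 1A, polypeptides below about 10 kD are essentially absent from the uninduced host, whereas the apparent molecular weight of the expression product BA is clearly below 10 kD. 13% Tricine-SDS-PAGE (Figure 1B, left) showed that the apparent molecular weight of BA agrees with its theoretical molecular weight (7637.7 Da), with no obvious degradation products, and that BA is expressed effectively as inclusion bodies. Western blot (Figure 1B, right) showed that the expressed band is the His-tagged target protein. The expression level of BA was 16.3±1.3 mg/L (n=3).
This example shows that the insulin BA single chain can be stably expressed as inclusion bodies in prokaryotic host cells.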
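The cloning step in this example relies on BglII and BamHI (and likewise XhoI and SalI) producing matching cohesive ends. A minimal sketch of that check follows: the recognition sites and 5' overhangs are standard enzyme properties, the primers are the two outer primers quoted in Example 1, and the helper names are hypothetical.

```python
# Recognition sites and 4-nt 5' overhangs of the enzymes used in Example 1.
SITES = {
    "BglII": "AGATCT",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
}
OVERHANGS = {
    "BglII": "GATC",
    "BamHI": "GATC",
    "XhoI": "TCGA",
    "SalI": "TCGA",
    "HindIII": "AGCT",
}


def sites_in(primer):
    """Names of the enzymes whose recognition site occurs in the primer."""
    seq = primer.upper()
    return [name for name, site in SITES.items() if site in seq]


def compatible(enzyme_a, enzyme_b):
    """True if the two enzymes leave identical cohesive ends."""
    return OVERHANGS[enzyme_a] == OVERHANGS[enzyme_b]


if __name__ == "__main__":
    forward = "ataagatctatgtttgtgaaccagcatctgtg"
    reverse = "atactcgagttagttgcaatagttttccagctg"
    print(sites_in(forward), sites_in(reverse))
    print(compatible("BglII", "BamHI"), compatible("XhoI", "SalI"))
```

Ligating a BglII end to a BamHI end gives a hybrid junction that neither enzyme recuts, which may be why this pairing is reused in later examples to add inserts to BA-/pQE80L.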
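Example 2: Fusion expression of the inclusion body tag BA with GLP-E
Glucagon-like peptide-1 (GLP-1) is a polypeptide drug that is effective in treating type 2 diabetes. GLP-1(7-37), one of the active forms of GLP-1 in vivo, contains 31 amino acid residues; its variant GLP-1(A2G) (SEQ ID No: 16), in which Ala at position 2 is mutated to Gly, resists degradation of GLP-1 by DPPIV. A gene sequence (SEQ ID No: 10) containing the coding sequence of GLP-1(A2G) was designed. The polypeptide it encodes, GLP-E (SEQ ID No: 11), has a flexible 10-residue extension at its C-terminus that ends in a Cys residue, which can be used for coupling to other molecules. The gene was synthesized by overlap PCR using the following primers.
The inclusion body tag BA gene was amplified with primers ataagatctatgtttgtgaaccagcatctgtg and ataggatccgttgcaatagttttccagctg, double-digested with BglII and BamHI, and inserted into the BamHI site of the expression vector pQE80L to construct the vector BA-/pQE80L. The GLP-E gene was amplified with primers ataagatctcgccgccacggtgaaggtac and ataaagcttagcaagaaccaccaccaccagaac, double-digested with BglII and HindIII, and inserted into the BamHI and HindIII sites of BA-/pQE80L to construct the vector BA-GLP-E/pQE80L, which encodes the fusion protein sequence SEQ ID No: 12. The spacer between BA and GLP-E contains ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-E molecule. The expression vector BA-GLP-E/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.8), expression was induced with 0.1 mM IPTG for 8 h. Tricine-SDS-PAGE (Figure 2, left) showed that the apparent molecular weight of the fusion protein BA-GLP-E agrees with its theoretical molecular weight (12094.5 Da), with no obvious degradation products, and that BA-GLP-E is expressed at high levels as inclusion bodies. Western blot (Figure 2, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein BA-GLP-E was 131.8±2.2 mg/L (n=3).
This example shows that the insulin BA single chain fused to the small polypeptide GLP-E can be expressed efficiently and stably as inclusion bodies in prokaryotic host cells.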
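Example 3: Fusion expression of the inclusion body tag B with GLP-E
The GLP-E gene was amplified with primers accccgaaaacccgccgccacggtgaaggtaccttc and ataaagcttagcaagaaccaccaccaccagaac. The product was mixed with the amplification product of the inclusion body tag B gene and amplified by overlap PCR with primers ataagatctatgtttgtgaaccagcatctgtg and ataaagcttagcaagaaccaccaccaccagaac, double-digested with BglII and HindIII, and inserted into the BamHI and HindIII sites of the expression vector pQE80L to construct the vector B-GLP-E/pQE80L, which encodes the fusion protein sequence SEQ ID No: 13. The fusion carrier protein B is fused to the N-terminus of GLP-E; the spacer is ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-E molecule. The expression vector B-GLP-E/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 8 h. Tricine-SDS-PAGE (Figure 3, left) showed that the apparent molecular weight of the fusion protein B-GLP-E agrees with its theoretical molecular weight (9272.3 Da), with no obvious degradation products, and that B-GLP-E is expressed effectively as inclusion bodies. Western blot (Figure 3, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein B-GLP-E was 22.8±0.3 mg/L (n=3).
This example shows that the insulin B chain fused to the small polypeptide GLP-E can be stably expressed as inclusion bodies in prokaryotic host cells.
Example 4: Fusion expression of the inclusion body tag A with GLP-E
The gene for the fusion protein A-GLP-E was amplified from the vector BA-GLP-E/pQE80L with primers ataagatctatgggcattgtggaacagtgctgcac and ataaagcttagcaagaaccaccaccaccagaac, double-digested with BglII and HindIII, and inserted into the BamHI and HindIII sites of the expression vector pQE80L to construct the vector A-GLP-E/pQE80L, which encodes the fusion protein sequence SEQ ID No: 14. The fusion carrier protein A is fused to the N-terminus of GLP-E; the spacer contains ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-E molecule. The expression vector A-GLP-E/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 8 h and samples were collected. Tricine-SDS-PAGE (Figure 4, left) showed that the apparent molecular weight of the fusion protein A-GLP-E agrees with its theoretical molecular weight (8370.2 Da), with no obvious degradation products, and that A-GLP-E is expressed effectively as inclusion bodies. Western blot (Figure 4, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein A-GLP-E was 22.6±0.1 mg/L (n=3).
This example shows that the insulin A chain fused to the small polypeptide GLP-E can be stably expressed as inclusion bodies in prokaryotic host cells.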
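Example 5: Fusion expression of the inclusion body tag BA with GLP-1
In this example GLP-1 itself is expressed as the fusion protein BA-GLP. The GLP-1 gene sequence (SEQ ID No: 15) was amplified from the vector BA-GLP-E/pQE80L with primers ataagatctcgccgccacggtgaaggtac and ataaagcttaaccacgacctttaaccagc, double-digested with BglII and HindIII, and inserted into the BamHI and HindIII sites of the expression vector BA-/pQE80L to construct the vector BA-GLP/pQE80L, which encodes the fusion protein sequence SEQ ID No: 17. The spacer between BA and GLP-1 contains ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-1 molecule. The expression vector BA-GLP/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 20 h and samples were collected. Tricine-SDS-PAGE (Figure 5, left) showed that the apparent molecular weight of the fusion protein BA-GLP agrees with its theoretical molecular weight (11417.8 Da), with no obvious degradation products, and that BA-GLP is expressed at high levels as inclusion bodies. Western blot (Figure 5, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein BA-GLP was 162.9±3.5 mg/L (n=3).
This example shows that the insulin BA single chain fused to the small polypeptide GLP-1 can be expressed efficiently and stably as inclusion bodies in prokaryotic host cells.
Example 6: Fusion expression of the inclusion body tag B with GLP-1
The insert was amplified from the vector B-GLP-E/pQE80L with primers ataagatctatgtttgtgaaccagcatctgtg and ataaagcttaaccacgacctttaaccagc, double-digested with BglII and HindIII, and inserted into the BamHI and HindIII sites of the expression vector pQE80L, fusing the inclusion body tag B to the N-terminus of GLP-1 and giving the vector B-GLP/pQE80L, which encodes the fusion protein sequence SEQ ID No: 18. In the expression product, the spacer between B and GLP-1 is ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact GLP-1 molecule. The expression vector B-GLP/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 20 h. Tricine-SDS-PAGE (Figure 6, left) showed that the apparent molecular weight of the fusion protein B-GLP agrees with its theoretical molecular weight (8595.6 Da), with no obvious degradation products, and that B-GLP is expressed effectively as inclusion bodies. Western blot (Figure 6, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein B-GLP was 20.1±2.0 mg/L (n=3).
This example shows that the insulin B chain fused to the small polypeptide GLP-1 can be stably expressed as inclusion bodies in prokaryotic host cells.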
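Example 7: Fusion expression of the inclusion body tag BA with enfuvirtide
The HIV fusion inhibitor peptide enfuvirtide, T-20 (SEQ ID No: 20), is an artificial polypeptide of 36 amino acid residues and an effective drug for the treatment of AIDS.
A gene sequence containing the coding sequence of enfuvirtide (SEQ ID No: 19) was synthesized using the following primers.
The PCR product contains the specific restriction sites BglII and HindIII. It was double-digested with BglII and HindIII and inserted into the BamHI and HindIII sites of the expression vector BA-/pQE80L to construct the vector BA-T/pQE80L, which encodes the fusion protein sequence SEQ ID No: 21. In the expression product, the spacer between BA and T-20 contains ArgArg, which can be recognized and cleaved by a dibasic protease to release the intact T-20 molecule. The expression vector BA-T/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 20 h and samples were collected. Tricine-SDS-PAGE (Figure 7, left) showed that the apparent molecular weight of the fusion protein BA-T agrees with its theoretical molecular weight (12626.2 Da), with no obvious degradation products, and that BA-T is expressed effectively as inclusion bodies. Western blot (Figure 7, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein BA-T was 37.6±0.9 mg/L (n=3).
This example shows that the insulin BA single chain fused to the small polypeptide enfuvirtide can be stably expressed as inclusion bodies in prokaryotic host cells.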
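Example 8: Fusion expression of a fusion carrier protein BA variant with oxyntomodulin
Oxyntomodulin (OXN, SEQ ID No: 22) is a polypeptide of 37 amino acid residues. It has the actions of both GLP-1 and glucagon and may be useful as a drug for treating diabetes and obesity. A gene sequence encoding OXN (SEQ ID No: 23) was designed. A DNA fragment containing this gene was synthesized by overlap PCR using the following primers:
Its 5′ end carries an extension that encodes part of the insulin A chain and an inter-polypeptide linker sequence recognized by a protease.
The vector BA-/pQE80L, carrying the fusion carrier protein BA gene, was subjected to PCR site-directed mutagenesis with primers cgaacgcggcttttgttataccccgaaaacc and ggttttcggggtataacaaaagccgcgttcg to introduce the F25C mutation into the B chain. Using this as template, amplification with primers ataagatctatgtttgtgaaccagcatctgtg and gttgcaatagttttccagctgatacaggctgtgaatgctggtgcagtgctgttcca gave the carrier protein BA variant gene B′A′ (SEQ ID No: 24). The variant B′A′ (SEQ ID No: 25) carries the F25C mutation in the B chain and the C6H and C11H mutations in the A chain.
The two PCR products described above were mixed and amplified with primers ataggatccatgtttgtgaaccagcatctgtg and ataaagctttaagcgatgttgttacggttac. The PCR fragment was double-digested with BamHI and HindIII and inserted into the BamHI and HindIII sites of the expression vector pQE80L to construct the vector B′A′-OXN/pQE80L, whose ORF sequence (SEQ ID No: 26) encodes the fusion protein sequence SEQ ID No: 27. The spacer between BA and OXN contains LysThrLysArg, which can be recognized and cleaved by the tribasic protease Furilisin (Ballinger MD, Tom J, Wells JA (1996) Furilisin: a variant of subtilisin BPN′ engineered for cleaving tribasic substrates. Biochemistry 35:13579-13585) to release the intact OXN molecule. The expression vector B′A′-OXN/pQE80L was transformed into competent E. coli TG1 cells to give the engineered strain; in log phase (OD600 = 0.6), expression was induced with 1 mM IPTG for 8 h. Tricine-SDS-PAGE (Figure 8, left) showed that the apparent molecular weight of the fusion protein B′A′-OXN agrees with its theoretical molecular weight (12577.1 Da) and that it is expressed at high levels as inclusion bodies. Western blot (Figure 8, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein B′A′-OXN was 157.7±2.9 mg/L (n=3).
This example shows that the insulin BA single chain carrying substitution mutations in its amino acid sequence, fused to the small polypeptide oxyntomodulin, can still be expressed efficiently and stably as inclusion bodies in prokaryotic host cells.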
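Example 9: Fusion expression of the inclusion body tag BA with single-copy linaclotide
Linaclotide (SEQ ID No: 28) consists of 14 amino acid residues and is rich in Cys. It is the first guanylate cyclase agonist drug and is used to treat chronic idiopathic constipation and constipation-predominant irritable bowel syndrome in adults.
A gene sequence containing the coding sequence of linaclotide (SEQ ID No: 29) was synthesized using the following primers.
The PCR product contains the specific restriction sites BamHI, PstI and BglII. It was double-digested with PstI and BamHI and inserted into the BamHI and PstI sites of the expression vector BA-/pQE80L to construct the vector BA-LN1/pQE80L, which encodes the fusion protein sequence SEQ ID No: 30. In the expression product, the spacer between BA and single-copy linaclotide contains Arg, which can be recognized and cleaved by trypsin to release linaclotide carrying an extra Arg residue at its C-terminus. The expression vector BA-LN1/pQE80L was transformed into competent E. coli BL21(DE3)pLysS cells to give the engineered strain; in log phase (OD600 = 0.8), expression was induced with 0.1 mM IPTG for 20 h. Tricine-SDS-PAGE (Figure 9, left) showed that the apparent molecular weight of the fusion protein BA-LN1 agrees with its theoretical molecular weight (10292.7 Da), with no obvious degradation products, and that BA-LN1 is expressed at high levels as inclusion bodies. Western blot (Figure 9, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein BA-LN1 was 85.0±4.8 mg/L (n=3).
This example shows that the insulin BA single chain fused to the small polypeptide linaclotide can be expressed efficiently and stably as inclusion bodies in prokaryotic host cells.
Example 10: Fusion expression of the inclusion body tag BA with triple-copy linaclotide
The PCR product of the linaclotide gene sequence, which contains the specific restriction sites, was digested with BamHI and BglII and self-ligated, and the multi-copy gene was obtained by amplification with primers ataggatcccgctgctgcgaatactgctgcaacccggcttgc and atactgcagatctgtagcaaccggtgcaagccgggttgcagcag. It was digested and inserted into the vector BA-/pQE80L as in Example 9 to give the triple-copy linaclotide expression vector BA-LN3/pQE80L, which encodes the fusion protein sequence SEQ ID No: 31. In the expression product, the spacers between BA and the triple-copy linaclotide and between the copies contain Arg, which can be recognized and cleaved by trypsin to release linaclotide carrying an extra Arg residue at its C-terminus. The expression vector BA-LN3/pQE80L was transformed into competent E. coli BL21(DE3)pLysS cells to give the engineered strain; in log phase (OD600 = 0.8), expression was induced with 0.1 mM IPTG for 8 h, and samples were collected for 13% Tricine-SDS-PAGE and Western blot. Tricine-SDS-PAGE (Figure 10, left) showed that the apparent molecular weight of the fusion protein BA-LN3 agrees with its theoretical molecular weight (14121.1 Da), with no obvious degradation products, and that BA-LN3 is expressed effectively as inclusion bodies. Western blot (Figure 10, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein BA-LN3 was 36.1±0.6 mg/L (n=3).
This example shows that the insulin BA single chain fused to the multi-copy small polypeptide linaclotide can also be stably expressed as inclusion bodies in prokaryotic host cells.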
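Example 11: Fusion expression of the inclusion body tag B′ with double-copy linaclotide
The inclusion body tag B gene variant (SEQ ID No: 5), amplified with primers ataagatctatgtttgtgaaccagcatctgtg and ataagatctcggggtataaaaaaagccgcgttc, encodes the protein sequence B′ (SEQ ID No: 6), which lacks the C-terminal residues KT. It was digested with BglII and ligated to the BamHI-digested fragment of the multi-copy gene obtained in Example 10, then amplified with primers ataagatctatgtttgtgaaccagcatctgtg and atactgcagatctgtagcaaccggtgcaagccgggttgcagcag, double-digested with BglII and PstI and inserted into the BamHI and PstI sites of the vector pQE80L to give the double-copy linaclotide expression vector B′-LN2/pQE80L, which encodes the fusion protein sequence SEQ ID No: 32. In the expression product, the spacers between B′ and the double-copy linaclotide and between the copies contain an Arg residue, which can be recognized and cleaved by trypsin to release linaclotide carrying an extra Arg residue at its C-terminus. The expression vector B′-LN2/pQE80L was transformed into competent E. coli BL21(DE3)pLysS cells to give the engineered strain; in log phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 20 h and samples were collected. Tricine-SDS-PAGE (Figure 11, left) showed that the apparent molecular weight of the fusion protein B′-LN2 agrees with its theoretical molecular weight (9299.6 Da), with no obvious degradation products, and that B′-LN2 is expressed at high levels as inclusion bodies. Western blot (Figure 11, right) showed that the expressed band is the His-tagged target protein. The expression level of the fusion protein B′-LN2 was 70.9±0.3 mg/L (n=3).
This example shows that the insulin B chain with amino acid residues deleted, fused to the multi-copy small polypeptide linaclotide, can also be expressed efficiently and stably as inclusion bodies in prokaryotic host cells.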
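The reported fusion yields for the three linaclotide constructs can be converted into the share of each fusion that is target peptide. The following is a rough sketch only: it uses the fusion molecular weights and yields reported in Examples 9-11 and the theoretical linaclotide molecular weight (1532.7) given in Example 12, and it gives a theoretical upper bound that ignores the extra C-terminal Arg and all losses during cleavage, refolding and purification.

```python
LINACLOTIDE_DA = 1532.7  # theoretical molecular weight given in Example 12

# name: (fusion MW in Da, reported fusion yield in mg/L, linaclotide copies)
FUSIONS = {
    "BA-LN1": (10292.7, 85.0, 1),
    "B'-LN2": (9299.6, 70.9, 2),
    "BA-LN3": (14121.1, 36.1, 3),
}


def target_yield(fusion_da, fusion_mg_per_l, copies, target_da=LINACLOTIDE_DA):
    """Return (mass fraction of target, upper-bound target yield in mg/L)."""
    fraction = copies * target_da / fusion_da
    return fraction, fusion_mg_per_l * fraction


if __name__ == "__main__":
    for name, (mw, fusion_yield, copies) in FUSIONS.items():
        fraction, mg_per_l = target_yield(mw, fusion_yield, copies)
        print(f"{name}: {fraction:.1%} target, up to {mg_per_l:.1f} mg/L")
```

On these numbers the double-copy construct on the shortened B′ carrier would carry the most linaclotide per litre of culture, even though its fusion yield is not the highest.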
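Example 12. Processing of the linaclotide fusion protein
By mass spectrometry, the fusion protein B′-LN2 gave a peak at 9299.0840, consistent with the predicted theoretical molecular weight of B′-LN2 (9299.6); the peak at 4647.9067 is very likely the doubly charged ion of B′-LN2, as shown in Figure 12A. Trypsin digestion of the B′-LN2 inclusion body sample removes the carrier protein sequence and splits the double-copy linaclotide sequence into single copies, each carrying an extra Arg residue at its C-terminus. In the mass spectrum of the digested sample (Figure 12B), the peak at 3728.7019 corresponds to the tryptic peptide GlySerHisHisHisHisHisHisGlySerMetPheValAsnGlnHisLeuCysGlySerHisLeuValGluAlaLeuTyrLeuValCysGlyGluArg (theoretical molecular weight 3730.1, containing 2 Cys), and the peak at 1682.4679 corresponds to the tryptic peptide CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg (theoretical molecular weight 1688.9, containing 6 Cys). Tandem mass spectrometry (MS/MS) of the peak at 1688.5183, which contains the target polypeptide, is shown in Figure 12E; the amino acid sequence assignment was as expected. Further digestion with carboxypeptidase B removes the extra Arg residue at the C-terminus of the linaclotide fragment: as shown in Figure 12C, removing the C-terminal Arg from the tryptic peptide (CysCysGluTyrCysCysAsnProAlaCysThrGlyCysTyrArg) gives the target polypeptide linaclotide (theoretical molecular weight 1532.7), with the corresponding peak at 1526.3960. We infer that air oxidation during trypsin digestion causes the fairly large differences between these mass spectrometry results and the predicted theoretical values. As shown in Figure 12D, further reduction of the trypsin-digested sample with DTT gave the predicted theoretical peak at 1688.5183, which confirms this inference.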
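The theoretical molecular weights quoted above can be reproduced from average residue masses. The sketch below is illustrative rather than the authors' calculation: it uses standard average residue masses, treats each disulfide bond as the loss of two hydrogen atoms, and assumes that full oxidation of the six Cys residues (three disulfides) is what the text means by air oxidation.

```python
# Standard average residue masses (Da) for the residues that occur here.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "N": 114.1038,
    "Q": 128.1307, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760,
}
WATER = 18.0153
HYDROGEN = 1.00794


def average_mass(sequence, disulfides=0):
    """Average mass of a linear peptide, minus 2 H per disulfide bond."""
    residues = sum(AVG_RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER - 2 * HYDROGEN * disulfides


if __name__ == "__main__":
    tryptic = "CCEYCCNPACTGCYR"   # linaclotide with the extra C-terminal Arg
    linaclotide = "CCEYCCNPACTGCY"
    for name, seq in [("tryptic peptide", tryptic), ("linaclotide", linaclotide)]:
        reduced = average_mass(seq)
        oxidized = average_mass(seq, disulfides=3)
        print(f"{name}: reduced {reduced:.1f}, three disulfides {oxidized:.1f}")
```

With three disulfide bonds the calculated masses fall about 6 Da below the reduced values, close to the observed peaks at 1682.4679 and 1526.3960, which is consistent with the oxidation explanation given in the text.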
实施例13:包涵体标签BA与人转甲状腺素蛋白的融合表达
人转甲状腺素蛋白(transthyretin,TTR)单体(SEQ ID No:34),含有127个氨基酸残基,该蛋白由细菌表达时可能存在对宿主细胞的毒性作用,导致其表达水平很低(Murrell JR,Schoner RG,Liepnieks JJ,et al.Production and functional analysis of normal and variant recombinant human transthyretin proteins.J Biol Chem.1992;267:16595-600),需要其它策略提高表达(Liu L,Hou J,Du J,et al.Differential modification of Cys10 alters transthyretin’s effect on beta-amyloid aggregation and toxicity.Protein Eng Des Sel.2009;22:479-88),我们尝试使用新型包涵体标签对其进行融合表达。
TTR基因(SEQ ID No:33)的三个外显子,以人HeLa细胞基因组DNA为模板,分别用引物ggcaccggtgaatccaag与ctccagactcactggttttcccagaggcaaatggctcc、ggagccatttgcctctgggaaaaccagtgagtctggag与cgttggctgtgaataccacctctgcatgctcatggaatg、cattccatgagcatgcagaggtggtattcacagccaacg与ataaagcttaagatctttccttgggattggtgacg扩增,并混合后用引物ataggatccggccctacgggcaccggtgaatccaag和ataaagcttaagatctttccttgggattggtgacg进行overlap PCR,获得的TTR基因的PCR产物经BamHI和HindIII双酶切,插入表达载体pQE80L或BA-/pQE80L的BamHI和HindIII酶切位点,获得TT/pQE80L和BA-TT/pQE80L表达载体,分别编码蛋白序列SEQ ID No:35和SEQ ID No:36。
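The expression vectors TT/pQE80L and BA-TT/pQE80L were transformed into E. coli TG1 competent cells to obtain the target engineered strains; in logarithmic growth phase (OD600 = 0.8), expression was induced with 1 mM IPTG for 21 h and samples were collected. Figure 13 shows that the apparent molecular weight of the non-fusion protein TT agreed with its theoretical molecular weight (15403.1 Da), and that of the fusion protein BA-TT agreed with its theoretical molecular weight (28145.5 Da). Comparing TT and BA-TT in the induced whole-cell samples, the fusion protein BA-TT was expressed at a clearly higher level than TT. The non-fusion protein TT was expressed mainly in soluble form, consistent with the literature; however, when soluble protein was extracted from the supernatant, the recovery was clearly lower than the expected yield, as shown by the TT induced supernatant sample (S) in Figure 13. The fusion protein BA-TT, by contrast, was expressed mainly as inclusion bodies, with no obvious loss.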
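Example 14: Fusion expression of inclusion body tag BA with random polypeptide X
While cloning fusion proteins, we obtained an expression vector, BA-X/pQE80L, in which BA is fused to an unknown DNA sequence; it encodes the fusion protein sequence SEQ ID No:39. Sequencing showed that the unknown DNA is a BamHI-HindIII fragment of the E. coli yrbF coding sequence, inserted into the vector in reverse orientation and expressed in frame with BA. Its DNA sequence (SEQ ID No:37) encodes random polypeptide X (SEQ ID No:38), which consists of 39 amino acid residues with several Pro residues interspersed. Computational secondary-structure prediction indicates that this random polypeptide is a non-natural, unstructured polypeptide.
The expression vector BA-X/pQE80L was transformed into E. coli TG1 competent cells to obtain the target engineered strain; in logarithmic growth phase (OD600 = 0.4), expression was induced with 0.1 mM IPTG for 20 h. Tricine-SDS-PAGE (Figure 14, left) showed that the apparent molecular weight of the fusion protein BA-X agreed with its theoretical molecular weight (12069.7 Da), with no obvious degradation products, and that BA-X was expressed efficiently as inclusion bodies; Western blot (Figure 14, right) showed that the expressed band was the His-tagged target protein. The expression level of the fusion protein BA-X was 88.0 ± 3.2 mg/L (n = 3).
This example shows that the new fusion carrier protein also promotes expression well for a wide variety of polypeptides, including non-natural peptides.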
Claims (23)
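- A fusion carrier protein usable as an inclusion body tag for expressing a target protein or polypeptide, characterized in that the amino acid sequence of the fusion carrier protein is derived from the amino acid sequence of insulin, or from that sequence with one or several amino acids substituted, deleted and/or added, or from an amino acid sequence formed by conventional modification of the foregoing amino acid sequences, or from an amino acid sequence formed by adding a tag to the foregoing amino acid sequences.
- The fusion carrier protein according to claim 1, characterized in that the conventional modification includes acetylation, amidation, cyclization, glycosylation, phosphorylation, alkylation, biotinylation, fluorophore modification, polyethylene glycol (PEG) modification, and immobilization modification; and the tag includes 6×His, GST, EGFP, MBP, Nus, HA, IgG, FLAG, c-Myc, and Profinity eXact.
- The fusion carrier protein according to claim 1, characterized in that the fusion carrier protein comprises: (1) the human insulin A subunit, i.e. the amino acid sequence shown in SEQ ID No:2; or (2) the amino acid sequence of (1) with one or several amino acids substituted, deleted and/or added, and encoding an amino acid sequence capable of fusion expression.
- The fusion carrier protein according to claim 1, characterized in that the fusion carrier protein comprises: (1) the human insulin B subunit, i.e. the amino acid sequence shown in SEQ ID No:4; or (2) the amino acid sequence of (1) with one or several amino acids substituted, deleted and/or added, and encoding an amino acid sequence capable of fusion expression.
- The fusion carrier protein according to claim 1, characterized in that the fusion carrier protein comprises: (1) a single-chain protein molecule containing both the human insulin A subunit and B subunit; or (2) a variant of the single-chain molecule of (1), containing the human insulin A subunit and/or B subunit with one or several amino acids substituted, deleted and/or added, and encoding an amino acid sequence capable of fusion expression.
- The fusion carrier protein according to claim 4, characterized in that the fusion carrier protein comprises the amino acid sequence shown in SEQ ID No:6.
- The fusion carrier protein according to claim 5, characterized in that the fusion carrier protein comprises the amino acid sequence shown in SEQ ID No:8, SEQ ID No:9 or SEQ ID No:25.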
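- A fusion protein, characterized in that the fusion protein contains the fusion carrier protein of claim 1 and at least one target protein or polypeptide, and the target protein or polypeptide is not insulin.
- The fusion protein according to claim 8, characterized in that a specific polypeptide cleavage site or sequence may be present between the fusion carrier protein and the target protein or polypeptide in the fusion protein.
- The fusion protein according to claim 8, characterized in that the fusion protein may be conventionally modified or may have an expression or purification tag added.
- The fusion protein according to claim 10, characterized in that the conventional modification includes acetylation, amidation, cyclization, glycosylation, phosphorylation, alkylation, biotinylation, fluorophore modification, polyethylene glycol (PEG) modification, and immobilization modification; and the tag includes 6×His, GST, EGFP, MBP, Nus, HA, IgG, FLAG, c-Myc, and Profinity eXact.
- The fusion protein according to claim 8, characterized in that the target protein or polypeptide in the fusion protein contains 5-1000 amino acid residues.
- The fusion protein according to claim 8, characterized in that the fusion protein may contain 1, 2, 3, or 4 target proteins or polypeptides.
- The fusion protein according to claim 8, characterized in that the target protein or polypeptide in the fusion protein is selected from GLP-1, oxyntomodulin, enfuvirtide, linaclotide, human transthyretin, and variants thereof.
- The fusion protein according to claim 14, characterized in that the fusion protein is selected from the amino acid sequences shown in SEQ ID No:12, SEQ ID No:13, SEQ ID No:14, SEQ ID No:17, SEQ ID No:18, SEQ ID No:21, SEQ ID No:27, SEQ ID No:30, SEQ ID No:31, SEQ ID No:32, SEQ ID No:36, and SEQ ID No:39.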
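- A nucleic acid molecule, characterized in that the nucleic acid molecule contains a gene sequence encoding the fusion carrier protein of any one of claims 1-7.
- The nucleic acid molecule according to claim 16, characterized in that the nucleic acid molecule comprises the gene sequence shown in SEQ ID No:1, SEQ ID No:3, SEQ ID No:5, SEQ ID No:7, or SEQ ID No:24.
- A nucleic acid molecule, characterized in that the nucleic acid molecule contains a gene sequence encoding the fusion protein of any one of claims 8-15.
- An expression vector, characterized in that the expression vector comprises the nucleic acid molecule of any one of claims 16-17, linked to a promoter of the vector for expression of the protein encoded by the nucleic acid molecule.
- An expression vector, characterized in that the expression vector comprises the nucleic acid molecule of claim 18, linked to a promoter of the vector for expression of the protein encoded by the nucleic acid molecule.
- A host cell, characterized in that it contains the nucleic acid molecule of any one of claims 16-17 or the expression vector of claim 19.
- A host cell, characterized in that it contains the nucleic acid molecule of claim 18 or the expression vector of claim 20.
- Use of the fusion carrier protein of claims 1-7, or the nucleic acid molecule of any one of claims 16-17, or the expression vector of claim 19, or the host cell of claim 21, in promoting expression of a target protein or polypeptide.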
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510176724 | 2015-04-10 | ||
CN201510176724.6 | 2015-04-10 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016161983A1 (zh) | 2016-10-13 |
Family
ID=57071715
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/078938 WO2016161983A1 (zh) | 2015-04-10 | 2016-04-11 | A fusion carrier protein and its use in promoting expression of a target protein or polypeptide |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2016161983A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
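CN107929718A (zh) * | 2017-10-19 | 2018-04-20 | 南京星银药业集团有限公司 | Sustained-release formulation of a GLP-1 analogue and GC-C receptor agonist composition, and preparation method thereof |
CN114350587A (zh) * | 2022-01-24 | 2022-04-15 | 修实生物医药(南通)有限公司 | Engineered bacterium for recombinant tandem expression of linaclotide |
CN114507293A (zh) * | 2022-01-24 | 2022-05-17 | 修实生物医药(南通)有限公司 | Fusion protein for recombinant tandem expression of linaclotide, and method for expressing linaclotide |
CN115850385B (zh) * | 2022-07-04 | 2023-08-11 | 北京惠之衡生物科技有限公司 | Expression-promoting peptide and use thereof |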
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999032136A1 (en) * | 1997-12-23 | 1999-07-01 | Alexion Pharmaceuticals, Inc. | Chimeric proteins for the treatment of diabetes |
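CN1740325A (zh) * | 2004-08-25 | 2006-03-01 | 浙江大学 | Fusion gene, protein expressed therefrom, and preparation method thereof |
CN101280020A (zh) * | 2008-05-23 | 2008-10-08 | 江南大学 | Fusion protein of human serum albumin and human insulin C-peptide, and preparation thereof |
CN103509118A (zh) * | 2012-06-15 | 2014-01-15 | 郭怀祖 | Insulin-Fc fusion protein |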
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
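CN107929718A (zh) * | 2017-10-19 | 2018-04-20 | 南京星银药业集团有限公司 | Sustained-release formulation of a GLP-1 analogue and GC-C receptor agonist composition, and preparation method thereof |
CN114350587A (zh) * | 2022-01-24 | 2022-04-15 | 修实生物医药(南通)有限公司 | Engineered bacterium for recombinant tandem expression of linaclotide |
CN114507293A (zh) * | 2022-01-24 | 2022-05-17 | 修实生物医药(南通)有限公司 | Fusion protein for recombinant tandem expression of linaclotide, and method for expressing linaclotide |
CN114350587B (zh) * | 2022-01-24 | 2023-10-31 | 修实生物医药(南通)有限公司 | Engineered bacterium for recombinant tandem expression of linaclotide |
CN114507293B (zh) * | 2022-01-24 | 2023-11-07 | 修实生物医药(南通)有限公司 | Fusion protein for recombinant tandem expression of linaclotide, and method for expressing linaclotide |
CN115850385B (zh) * | 2022-07-04 | 2023-08-11 | 北京惠之衡生物科技有限公司 | Expression-promoting peptide and use thereof |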
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16776158; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 16776158; Country of ref document: EP; Kind code of ref document: A1 |