NZ622174B2 - Glucagon-like peptide-2 compositions and methods of making and using same
- Publication number
- NZ622174B2 NZ622174A NZ62217412A
- Authority
- NZ
- New Zealand
- Prior art keywords
- xten
- glp
- sequence
- fusion protein
- amino acid
- Prior art date
Links
- 239000000203 mixture Substances 0.000 title claims abstract description 180
- TWSALRJGPBVBQU-PKQQPRCHSA-N Glucagon-like peptide 2 Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O)[C@@H](C)CC)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)CC)C1=CC=CC=C1 TWSALRJGPBVBQU-PKQQPRCHSA-N 0.000 title description 460
- 102100003818 GCG Human genes 0.000 title description 456
- 101710042131 GCG Proteins 0.000 title description 455
- 102000037240 fusion proteins Human genes 0.000 claims abstract description 352
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 352
- 235000001014 amino acid Nutrition 0.000 claims abstract description 174
- 150000001413 amino acids Chemical class 0.000 claims abstract description 165
- 229920001184 polypeptide Polymers 0.000 claims abstract description 113
- DHMQDGOQFOQNFH-UHFFFAOYSA-N glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 claims abstract description 97
- 125000000539 amino acid group Chemical group 0.000 claims abstract description 95
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 claims abstract description 55
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 claims abstract description 55
- 230000001134 intestinotrophic Effects 0.000 claims abstract description 55
- 239000004471 Glycine Substances 0.000 claims abstract description 50
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 claims abstract description 42
- 230000003252 repetitive Effects 0.000 claims abstract description 42
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 40
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 claims abstract description 35
- 239000004473 Threonine Substances 0.000 claims abstract description 34
- 235000004279 alanine Nutrition 0.000 claims abstract description 34
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 claims abstract description 32
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 claims abstract description 30
- 210000001744 T-Lymphocytes Anatomy 0.000 claims abstract description 15
- 230000015572 biosynthetic process Effects 0.000 claims abstract description 14
- 230000001747 exhibiting Effects 0.000 claims abstract description 13
- 238000005755 formation reaction Methods 0.000 claims abstract description 12
- MTCFGRXMJLQNBG-UWTATZPHSA-N D-serine Chemical compound OC[C@@H](N)C(O)=O MTCFGRXMJLQNBG-UWTATZPHSA-N 0.000 claims abstract 12
- 230000000875 corresponding Effects 0.000 claims description 122
- 230000001965 increased Effects 0.000 claims description 100
- 230000000968 intestinal Effects 0.000 claims description 88
- 230000000694 effects Effects 0.000 claims description 87
- 238000003776 cleavage reaction Methods 0.000 claims description 86
- 210000004027 cells Anatomy 0.000 claims description 80
- 230000003247 decreasing Effects 0.000 claims description 70
- 201000010099 disease Diseases 0.000 claims description 66
- 210000000813 small intestine Anatomy 0.000 claims description 53
- 230000004927 fusion Effects 0.000 claims description 51
- 206010017943 Gastrointestinal conditions Diseases 0.000 claims description 46
- 208000004232 Enteritis Diseases 0.000 claims description 45
- 230000002829 reduced Effects 0.000 claims description 45
- 238000006467 substitution reaction Methods 0.000 claims description 44
- 241000282414 Homo sapiens Species 0.000 claims description 43
- 102000015626 Glucagon-Like Peptide-2 Receptor Human genes 0.000 claims description 39
- 108010024044 Glucagon-Like Peptide-2 Receptor Proteins 0.000 claims description 39
- 206010049416 Short-bowel syndrome Diseases 0.000 claims description 38
- 241000700159 Rattus Species 0.000 claims description 35
- 206010011401 Crohn's disease Diseases 0.000 claims description 34
- 210000000936 Intestines Anatomy 0.000 claims description 33
- 238000004166 bioassay Methods 0.000 claims description 33
- 230000002496 gastric Effects 0.000 claims description 33
- 230000037034 TERMINAL HALF LIFE Effects 0.000 claims description 31
- 238000002512 chemotherapy Methods 0.000 claims description 29
- 230000014509 gene expression Effects 0.000 claims description 28
- 150000007523 nucleic acids Chemical class 0.000 claims description 28
- 102000005962 receptors Human genes 0.000 claims description 28
- 108020003175 receptors Proteins 0.000 claims description 28
- 150000002500 ions Chemical class 0.000 claims description 27
- 239000008194 pharmaceutical composition Substances 0.000 claims description 27
- 206010009839 Coeliac disease Diseases 0.000 claims description 26
- 206010022114 Injury Diseases 0.000 claims description 26
- 108020004707 nucleic acids Proteins 0.000 claims description 26
- 206010025476 Malabsorption Diseases 0.000 claims description 25
- 208000008589 Obesity Diseases 0.000 claims description 24
- 235000020824 obesity Nutrition 0.000 claims description 24
- 206010021972 Inflammatory bowel disease Diseases 0.000 claims description 23
- 206010022680 Intestinal ischaemia Diseases 0.000 claims description 23
- 239000002253 acid Substances 0.000 claims description 23
- 235000016709 nutrition Nutrition 0.000 claims description 23
- 241000699666 Mus <mouse, genus> Species 0.000 claims description 21
- 239000003814 drug Substances 0.000 claims description 19
- 229940035295 Ting Drugs 0.000 claims description 18
- 230000001603 reducing Effects 0.000 claims description 18
- 241000282693 Cercopithecidae Species 0.000 claims description 17
- 210000000981 Epithelium Anatomy 0.000 claims description 17
- 210000001035 Gastrointestinal Tract Anatomy 0.000 claims description 16
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 15
- 238000006722 reduction reaction Methods 0.000 claims description 15
- 238000007920 subcutaneous administration Methods 0.000 claims description 15
- 201000010874 syndrome Diseases 0.000 claims description 15
- 208000002551 Irritable Bowel Syndrome Diseases 0.000 claims description 14
- 230000000295 complement Effects 0.000 claims description 14
- 230000029087 digestion Effects 0.000 claims description 14
- 235000016236 parenteral nutrition Nutrition 0.000 claims description 14
- 210000001519 tissues Anatomy 0.000 claims description 14
- 230000037396 body weight Effects 0.000 claims description 13
- 238000000338 in vitro Methods 0.000 claims description 13
- 238000001990 intravenous administration Methods 0.000 claims description 13
- 235000019786 weight gain Nutrition 0.000 claims description 13
- 206010009900 Colitis ulcerative Diseases 0.000 claims description 12
- 206010012601 Diabetes mellitus Diseases 0.000 claims description 12
- 208000002720 Malnutrition Diseases 0.000 claims description 12
- 208000004535 Mesenteric Ischemia Diseases 0.000 claims description 12
- 206010028116 Mucosal inflammation Diseases 0.000 claims description 12
- 208000008425 Protein Deficiency Diseases 0.000 claims description 12
- 208000001162 Steatorrhea Diseases 0.000 claims description 12
- 206010041969 Steatorrhoea Diseases 0.000 claims description 12
- 241000282898 Sus scrofa Species 0.000 claims description 12
- 206010044697 Tropical sprue Diseases 0.000 claims description 12
- 125000003630 glycyl group Chemical group [H]N([H])C([H])([H])C(*)=O 0.000 claims description 12
- 230000001071 malnutrition Effects 0.000 claims description 12
- 235000000824 malnutrition Nutrition 0.000 claims description 12
- 201000010927 mucositis Diseases 0.000 claims description 12
- 239000000041 non-steroidal anti-inflammatory agent Substances 0.000 claims description 12
- 229940021182 non-steroidal anti-inflammatory drugs Drugs 0.000 claims description 12
- 201000006704 ulcerative colitis Diseases 0.000 claims description 12
- 206010003816 Autoimmune disease Diseases 0.000 claims description 11
- 208000004262 Food Hypersensitivity Diseases 0.000 claims description 11
- 208000007882 Gastritis Diseases 0.000 claims description 11
- 206010064919 Hypospermia Diseases 0.000 claims description 11
- 208000002389 Pouchitis Diseases 0.000 claims description 11
- 230000001580 bacterial Effects 0.000 claims description 11
- 230000001925 catabolic Effects 0.000 claims description 11
- 235000020932 food allergy Nutrition 0.000 claims description 11
- 230000012010 growth Effects 0.000 claims description 11
- 238000007918 intramuscular administration Methods 0.000 claims description 11
- 206010003997 Bacteraemia Diseases 0.000 claims description 10
- 241000282465 Canis Species 0.000 claims description 10
- 210000003717 Douglas' Pouch Anatomy 0.000 claims description 10
- 206010051606 Necrotising colitis Diseases 0.000 claims description 10
- 230000000741 diarrhetic Effects 0.000 claims description 10
- 238000004519 manufacturing process Methods 0.000 claims description 10
- 206010061172 Gastrointestinal injury Diseases 0.000 claims description 9
- 206010020993 Hypoglycaemia Diseases 0.000 claims description 9
- 208000004155 Malabsorption Syndromes Diseases 0.000 claims description 9
- 206010033654 Pancreatitis necrotising Diseases 0.000 claims description 9
- 206010034674 Peritonitis Diseases 0.000 claims description 9
- 230000035876 healing Effects 0.000 claims description 9
- 230000002218 hypoglycaemic Effects 0.000 claims description 9
- 206010059512 Apoptosis Diseases 0.000 claims description 8
- 208000002633 Febrile Neutropenia Diseases 0.000 claims description 8
- 108060003199 Glucagon Proteins 0.000 claims description 8
- 229960004666 Glucagon Drugs 0.000 claims description 8
- MASNOZXLGMXCHN-ZLPAWPGGSA-N Glucagonum Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)C(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C1=CC=CC=C1 MASNOZXLGMXCHN-ZLPAWPGGSA-N 0.000 claims description 8
- 208000004995 Necrotizing Enterocolitis Diseases 0.000 claims description 8
- 230000006907 apoptotic process Effects 0.000 claims description 8
- 230000004663 cell proliferation Effects 0.000 claims description 8
- 210000001100 crypt cell Anatomy 0.000 claims description 8
- 230000005176 gastrointestinal motility Effects 0.000 claims description 8
- 200000000021 intestinal injury Diseases 0.000 claims description 8
- 201000006195 perinatal necrotizing enterocolitis Diseases 0.000 claims description 8
- 206010020718 Hyperplasia Diseases 0.000 claims description 7
- 101500016455 bovine Glucagon-like peptide 2 Proteins 0.000 claims description 7
- 101500016480 chicken Glucagon-like peptide 2 Proteins 0.000 claims description 7
- 239000003937 drug carrier Substances 0.000 claims description 7
- 230000000297 inotrophic Effects 0.000 claims description 7
- 230000003886 intestinal anastomosis Effects 0.000 claims description 7
- 101500014034 sheep Glucagon-like peptide 2 Proteins 0.000 claims description 7
- 206010059028 Gastrointestinal ischaemia Diseases 0.000 claims description 6
- 206010067994 Mucosal atrophy Diseases 0.000 claims description 6
- 241000283690 Bos taurus Species 0.000 claims description 5
- 241000283898 Ovis Species 0.000 claims description 5
- 206010062065 Perforated ulcer Diseases 0.000 claims description 5
- 239000000463 material Substances 0.000 claims description 4
- 210000000805 Cytoplasm Anatomy 0.000 claims description 3
- 208000004235 Neutropenia Diseases 0.000 claims description 2
- 239000005022 packaging material Substances 0.000 claims description 2
- 102000000852 Tumor Necrosis Factor-alpha Human genes 0.000 claims 4
- 108010001801 Tumor Necrosis Factor-alpha Proteins 0.000 claims 4
- 206010003694 Atrophy Diseases 0.000 claims 1
- 241001631434 Ermia Species 0.000 claims 1
- 235000018102 proteins Nutrition 0.000 description 126
- 102000004169 proteins and genes Human genes 0.000 description 126
- 108090000623 proteins and genes Proteins 0.000 description 126
- 230000035492 administration Effects 0.000 description 83
- 230000001225 therapeutic Effects 0.000 description 72
- 125000003275 alpha amino acid group Chemical group 0.000 description 50
- 229960002449 Glycine Drugs 0.000 description 39
- 229920000023 polynucleotide Polymers 0.000 description 39
- 239000002157 polynucleotide Substances 0.000 description 39
- 102000035443 Peptidases Human genes 0.000 description 38
- 108091005771 Peptidases Proteins 0.000 description 38
- 239000004365 Protease Substances 0.000 description 38
- MTCFGRXMJLQNBG-REOHCLBHSA-N L-serine Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 35
- 229960001153 serine Drugs 0.000 description 34
- 229940049906 Glutamate Drugs 0.000 description 27
- 230000027455 binding Effects 0.000 description 27
- 235000019833 protease Nutrition 0.000 description 26
- 235000008521 threonine Nutrition 0.000 description 26
- 230000036868 Blood Concentration Effects 0.000 description 25
- 102000004965 antibodies Human genes 0.000 description 24
- 108090001123 antibodies Proteins 0.000 description 24
- 230000036499 Half-life Effects 0.000 description 23
- 230000002708 enhancing Effects 0.000 description 21
- 230000000275 pharmacokinetic Effects 0.000 description 21
- CGIGDMFJXJATDK-UHFFFAOYSA-N Indometacin Chemical compound CC1=C(CC(O)=O)C2=CC(OC)=CC=C2N1C(=O)C1=CC=C(Cl)C=C1 CGIGDMFJXJATDK-UHFFFAOYSA-N 0.000 description 18
- 230000004054 inflammatory process Effects 0.000 description 18
- 206010061218 Inflammation Diseases 0.000 description 17
- 230000017531 blood circulation Effects 0.000 description 16
- 125000003729 nucleotide group Chemical group 0.000 description 16
- 230000035639 Blood Levels Effects 0.000 description 14
- 230000037250 Clearance Effects 0.000 description 13
- 230000035512 clearance Effects 0.000 description 13
- 238000000034 method Methods 0.000 description 13
- 239000002773 nucleotide Substances 0.000 description 13
- 230000003405 preventing Effects 0.000 description 13
- 239000000047 product Substances 0.000 description 13
- 210000002381 Plasma Anatomy 0.000 description 12
- 210000002966 Serum Anatomy 0.000 description 12
- 239000003795 chemical substances by application Substances 0.000 description 12
- 238000001542 size-exclusion chromatography Methods 0.000 description 12
- 239000000126 substance Substances 0.000 description 12
- 239000004475 Arginine Substances 0.000 description 11
- 210000004369 Blood Anatomy 0.000 description 11
- 241000282412 Homo Species 0.000 description 11
- 235000009697 arginine Nutrition 0.000 description 11
- 239000008280 blood Substances 0.000 description 11
- 230000002550 fecal Effects 0.000 description 11
- KEAYESYHFKHZAL-UHFFFAOYSA-N sodium Chemical compound [Na] KEAYESYHFKHZAL-UHFFFAOYSA-N 0.000 description 11
- 239000011734 sodium Substances 0.000 description 11
- 229910052708 sodium Inorganic materials 0.000 description 11
- 206010012735 Diarrhoea Diseases 0.000 description 10
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 10
- CILIXQOJUNDIDU-ASQIGDHWSA-N Teduglutide Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O)[C@@H](C)CC)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H](CO)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)CC)C1=CC=CC=C1 CILIXQOJUNDIDU-ASQIGDHWSA-N 0.000 description 10
- 229920003013 deoxyribonucleic acid Polymers 0.000 description 10
- 201000008286 diarrhea Diseases 0.000 description 10
- 229940079593 drugs Drugs 0.000 description 10
- 238000003780 insertion Methods 0.000 description 10
- 230000000144 pharmacologic effect Effects 0.000 description 10
- 229960002444 teduglutide Drugs 0.000 description 10
- 230000029663 wound healing Effects 0.000 description 10
- 229960000905 Indomethacin Drugs 0.000 description 9
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 9
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 9
- 241000124008 Mammalia Species 0.000 description 9
- 230000001186 cumulative Effects 0.000 description 9
- 239000005090 green fluorescent protein Substances 0.000 description 9
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 9
- 102000027675 major histocompatibility complex family Human genes 0.000 description 9
- 238000011084 recovery Methods 0.000 description 9
- 230000037242 Cmax Effects 0.000 description 8
- 102100012353 DPP4 Human genes 0.000 description 8
- 238000002965 ELISA Methods 0.000 description 8
- RHGKLRLOHDJJDR-BYPYZUCNSA-N L-citrulline zwitterion Chemical compound NC(=O)NCCC[C@H]([NH3+])C([O-])=O RHGKLRLOHDJJDR-BYPYZUCNSA-N 0.000 description 8
- 230000015556 catabolic process Effects 0.000 description 8
- 229960002173 citrulline Drugs 0.000 description 8
- 235000013477 citrulline Nutrition 0.000 description 8
- 101500013972 human Glucagon-like peptide 2 Proteins 0.000 description 8
- 230000002209 hydrophobic Effects 0.000 description 8
- 230000001976 improved Effects 0.000 description 8
- 230000003871 intestinal function Effects 0.000 description 8
- 230000000670 limiting Effects 0.000 description 8
- 230000004044 response Effects 0.000 description 8
- 230000004936 stimulating Effects 0.000 description 8
- 238000001356 surgical procedure Methods 0.000 description 8
- 108010073046 teduglutide Proteins 0.000 description 8
- WYWHKKSPHMUBEB-UHFFFAOYSA-N tioguanine Chemical compound N1C(N)=NC(=S)C2=C1N=CN2 WYWHKKSPHMUBEB-UHFFFAOYSA-N 0.000 description 8
- 230000004584 weight gain Effects 0.000 description 8
- 206010070545 Bacterial translocation Diseases 0.000 description 7
- 206010009887 Colitis Diseases 0.000 description 7
- 241000229754 Iva xanthiifolia Species 0.000 description 7
- 125000003295 alanine group Chemical group N[C@@H](C)C(=O)* 0.000 description 7
- 230000004059 degradation Effects 0.000 description 7
- 238000006731 degradation reaction Methods 0.000 description 7
- 230000019439 energy homeostasis Effects 0.000 description 7
- 230000002163 immunogen Effects 0.000 description 7
- 238000010348 incorporation Methods 0.000 description 7
- 230000003993 interaction Effects 0.000 description 7
- 235000018977 lysine Nutrition 0.000 description 7
- 230000004678 mucosal integrity Effects 0.000 description 7
- 230000002797 proteolytic Effects 0.000 description 7
- 150000003431 steroids Chemical class 0.000 description 7
- 238000002560 therapeutic procedure Methods 0.000 description 7
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 7
- NKCXQMYPWXSLIZ-PSRDDEIFSA-N (2S)-2-[[(2S)-1-[(2S)-5-amino-2-[[2-[[(2S)-6-amino-2-[[2-[[(2S)-2-[[(2S)-4-amino-2-[[(2S)-2-[[(2S,3R)-2-[[(2S)-2-[[(2S,3R)-2-amino-3-hydroxybutanoyl]amino]-3-(1H-indol-3-yl)propanoyl]amino]-3-hydroxybutanoyl]amino]propanoyl]amino]-4-oxobutanoyl]amino]-3-m Chemical compound O=C([C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCCCN)NC(=O)CNC(=O)[C@@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@@H](N)[C@@H](C)O)[C@@H](C)O)C(C)C)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O NKCXQMYPWXSLIZ-PSRDDEIFSA-N 0.000 description 6
- 210000003719 B-Lymphocytes Anatomy 0.000 description 6
- 101700062901 DPP Proteins 0.000 description 6
- 241000282619 Hylobates lar Species 0.000 description 6
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 6
- XUJNEKJLAYXESH-REOHCLBHSA-N L-cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 6
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 6
- 239000004472 Lysine Substances 0.000 description 6
- 108090000028 MMP12 Proteins 0.000 description 6
- 102100004961 MMP12 Human genes 0.000 description 6
- 102100004962 MMP13 Human genes 0.000 description 6
- 101700084657 MMP13 Proteins 0.000 description 6
- 108090000190 Thrombin Proteins 0.000 description 6
- 241000723792 Tobacco etch virus Species 0.000 description 6
- 239000000556 agonist Substances 0.000 description 6
- 235000018417 cysteine Nutrition 0.000 description 6
- 238000010790 dilution Methods 0.000 description 6
- 239000000543 intermediate Substances 0.000 description 6
- 230000001404 mediated Effects 0.000 description 6
- 230000036231 pharmacokinetics Effects 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 102000004196 processed proteins & peptides Human genes 0.000 description 6
- 108090000765 processed proteins & peptides Proteins 0.000 description 6
- 239000011780 sodium chloride Substances 0.000 description 6
- 230000002459 sustained Effects 0.000 description 6
- 229960004072 thrombin Drugs 0.000 description 6
- 239000003981 vehicle Substances 0.000 description 6
- 230000037094 Cmin Effects 0.000 description 5
- 229960002989 Glutamic Acid Drugs 0.000 description 5
- 102000004851 Immunoglobulin G Human genes 0.000 description 5
- 108090001095 Immunoglobulin G Proteins 0.000 description 5
- 206010061255 Ischaemia Diseases 0.000 description 5
- 108060005987 Kallikreins Proteins 0.000 description 5
- 102000001399 Kallikreins Human genes 0.000 description 5
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 5
- 230000036740 Metabolism Effects 0.000 description 5
- 239000002202 Polyethylene glycol Substances 0.000 description 5
- 230000036045 Renal clearance Effects 0.000 description 5
- 230000004913 activation Effects 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 230000003042 antagonistic Effects 0.000 description 5
- 239000005557 antagonist Substances 0.000 description 5
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 5
- 239000000969 carrier Substances 0.000 description 5
- 230000004087 circulation Effects 0.000 description 5
- 230000001809 detectable Effects 0.000 description 5
- 239000000499 gel Substances 0.000 description 5
- 235000013922 glutamic acid Nutrition 0.000 description 5
- 239000004220 glutamic acid Substances 0.000 description 5
- 230000004060 metabolic process Effects 0.000 description 5
- 230000035786 metabolism Effects 0.000 description 5
- 229920001223 polyethylene glycol Polymers 0.000 description 5
- 230000002265 prevention Effects 0.000 description 5
- 150000003839 salts Chemical class 0.000 description 5
- 230000036269 ulceration Effects 0.000 description 5
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 4
- AOJJSUZBOXZQNB-TZSSRYMLSA-N ADRIAMYCIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-TZSSRYMLSA-N 0.000 description 4
- 229960001230 Asparagine Drugs 0.000 description 4
- 230000036912 Bioavailability Effects 0.000 description 4
- 229920001405 Coding region Polymers 0.000 description 4
- 102000004127 Cytokines Human genes 0.000 description 4
- 108090000695 Cytokines Proteins 0.000 description 4
- QIVBCDIJIAJPQS-SECBINFHSA-N D-tryptophane Chemical compound C1=CC=C2C(C[C@@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-SECBINFHSA-N 0.000 description 4
- 229960004679 Doxorubicin Drugs 0.000 description 4
- 108010048049 Factor IXa Proteins 0.000 description 4
- 108010054265 Factor VIIa Proteins 0.000 description 4
- 229940012414 Factor VIIa Drugs 0.000 description 4
- 108010071241 Factor XIIa Proteins 0.000 description 4
- 108010080805 Factor XIa Proteins 0.000 description 4
- 108010074860 Factor Xa Proteins 0.000 description 4
- 206010018987 Haemorrhage Diseases 0.000 description 4
- 229940088597 Hormone Drugs 0.000 description 4
- 229960000310 ISOLEUCINE Drugs 0.000 description 4
- 150000008575 L-amino acids Chemical class 0.000 description 4
- ODKSFYDXXFIFQN-BYPYZUCNSA-N L-arginine Chemical group OC(=O)[C@@H](N)CCCN=C(N)N ODKSFYDXXFIFQN-BYPYZUCNSA-N 0.000 description 4
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 4
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 4
- 102100018201 MMP17 Human genes 0.000 description 4
- 101700031813 MMP17 Proteins 0.000 description 4
- 241000282567 Macaca fascicularis Species 0.000 description 4
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 4
- 229960005454 Thioguanine Drugs 0.000 description 4
- 235000009582 asparagine Nutrition 0.000 description 4
- 230000035514 bioavailability Effects 0.000 description 4
- 230000000740 bleeding Effects 0.000 description 4
- 231100000319 bleeding Toxicity 0.000 description 4
- 230000036765 blood level Effects 0.000 description 4
- 230000021615 conjugation Effects 0.000 description 4
- 238000003379 elimination reaction Methods 0.000 description 4
- 230000002124 endocrine Effects 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- 125000000291 glutamic acid group Chemical group N[C@@H](CCC(O)=O)C(=O)* 0.000 description 4
- 235000004554 glutamine Nutrition 0.000 description 4
- 239000005556 hormone Substances 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 230000002757 inflammatory Effects 0.000 description 4
- 230000002401 inhibitory effect Effects 0.000 description 4
- 230000014759 maintenance of location Effects 0.000 description 4
- 230000035772 mutation Effects 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 238000000746 purification Methods 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 239000004474 valine Substances 0.000 description 4
- GHASVSINZRGABV-UHFFFAOYSA-N 5-fluorouracil Chemical compound FC1=CNC(=O)NC1=O GHASVSINZRGABV-UHFFFAOYSA-N 0.000 description 3
- 206010000269 Abscess Diseases 0.000 description 3
- 210000000612 Antigen-Presenting Cells Anatomy 0.000 description 3
- 229940009098 Aspartate Drugs 0.000 description 3
- NLZUEZXRPGMBCV-UHFFFAOYSA-N Butylhydroxytoluene Chemical compound CC1=CC(C(C)(C)C)=C(O)C(C(C)(C)C)=C1 NLZUEZXRPGMBCV-UHFFFAOYSA-N 0.000 description 3
- 229920002676 Complementary DNA Polymers 0.000 description 3
- CKLJMWTZIZZHCS-UHFFFAOYSA-N DL-aspartic acid Chemical compound OC(=O)C(N)CC(O)=O CKLJMWTZIZZHCS-UHFFFAOYSA-N 0.000 description 3
- 108010013369 EC 3.4.21.9 Proteins 0.000 description 3
- 102100003966 ELANE Human genes 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- 108010029144 Factor IIa Proteins 0.000 description 3
- 229960002949 Fluorouracil Drugs 0.000 description 3
- 208000003243 Intestinal Obstruction Diseases 0.000 description 3
- 206010022714 Intestinal ulcer Diseases 0.000 description 3
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 3
- 102100012746 MMP20 Human genes 0.000 description 3
- 101700010802 MMP20 Proteins 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 108010061543 Neutralizing Antibodies Proteins 0.000 description 3
- 108090000155 Pancreatic elastase II Proteins 0.000 description 3
- 229960005190 Phenylalanine Drugs 0.000 description 3
- 206010054048 Postoperative ileus Diseases 0.000 description 3
- 241000283984 Rodentia Species 0.000 description 3
- 208000006011 Stroke Diseases 0.000 description 3
- 102100009508 TMPRSS15 Human genes 0.000 description 3
- 206010068760 Ulcers Diseases 0.000 description 3
- 238000010521 absorption reaction Methods 0.000 description 3
- 150000007513 acids Chemical class 0.000 description 3
- 238000007792 addition Methods 0.000 description 3
- 231100000494 adverse effect Toxicity 0.000 description 3
- 238000010171 animal model Methods 0.000 description 3
- 230000003110 anti-inflammatory Effects 0.000 description 3
- 108091007172 antigens Proteins 0.000 description 3
- 102000038129 antigens Human genes 0.000 description 3
- 239000002246 antineoplastic agent Substances 0.000 description 3
- 125000000613 asparagine group Chemical group N[C@@H](CC(N)=O)C(=O)* 0.000 description 3
- 238000001574 biopsy Methods 0.000 description 3
- 201000011510 cancer Diseases 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 229920003045 dextran sodium sulfate Polymers 0.000 description 3
- 101710038873 glc-1 Proteins 0.000 description 3
- 108091005889 globular proteins Proteins 0.000 description 3
- 102000034327 globular proteins Human genes 0.000 description 3
- 125000000404 glutamine group Chemical group N[C@@H](CCC(N)=O)C(=O)* 0.000 description 3
- 238000005259 measurement Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000006011 modification reaction Methods 0.000 description 3
- 210000002569 neurons Anatomy 0.000 description 3
- 235000015816 nutrient absorption Nutrition 0.000 description 3
- 235000015097 nutrients Nutrition 0.000 description 3
- 230000002035 prolonged Effects 0.000 description 3
- 238000003118 sandwich ELISA Methods 0.000 description 3
- 241000894007 species Species 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 231100000397 ulcer Toxicity 0.000 description 3
- 230000004580 weight loss Effects 0.000 description 3
- OMJKFYKNWZZKTK-POHAHGRESA-N (5Z)-5-(dimethylaminohydrazinylidene)imidazole-4-carboxamide Chemical compound CN(C)N\N=C1/N=CN=C1C(N)=O OMJKFYKNWZZKTK-POHAHGRESA-N 0.000 description 2
- COVZYZSDYWQREU-UHFFFAOYSA-N 1,4-Butanediol, dimethanesulfonate Chemical compound CS(=O)(=O)OCCCCOS(C)(=O)=O COVZYZSDYWQREU-UHFFFAOYSA-N 0.000 description 2
- DHMYGZIEILLVNR-UHFFFAOYSA-N 5-fluoro-1-(oxolan-2-yl)pyrimidine-2,4-dione;1H-pyrimidine-2,4-dione Chemical compound O=C1C=CNC(=O)N1.O=C1NC(=O)C(F)=CN1C1OCCC1 DHMYGZIEILLVNR-UHFFFAOYSA-N 0.000 description 2
- 108010021810 ALX-0600 Proteins 0.000 description 2
- 101700037792 AURKB Proteins 0.000 description 2
- UUVWYPNAQBNQJQ-UHFFFAOYSA-N Altretamine Chemical compound CN(C)C1=NC(N(C)C)=NC(N(C)C)=N1 UUVWYPNAQBNQJQ-UHFFFAOYSA-N 0.000 description 2
- 229960005261 Aspartic Acid Drugs 0.000 description 2
- 229960001561 Bleomycin Drugs 0.000 description 2
- 108010006654 Bleomycin Proteins 0.000 description 2
- 210000004204 Blood Vessels Anatomy 0.000 description 2
- GAGWJHPBXLXJQN-UORFTKCHSA-N Capecitabine Chemical compound C1=C(F)C(NC(=O)OCCCCC)=NC(=O)N1[C@H]1[C@H](O)[C@H](O)[C@@H](C)O1 GAGWJHPBXLXJQN-UORFTKCHSA-N 0.000 description 2
- 229960004117 Capecitabine Drugs 0.000 description 2
- 229960004562 Carboplatin Drugs 0.000 description 2
- OLESAACUTLOWQZ-UHFFFAOYSA-L Carboplatin Chemical compound O=C1O[Pt]([N]([H])([H])[H])([N]([H])([H])[H])OC(=O)C11CCC1 OLESAACUTLOWQZ-UHFFFAOYSA-L 0.000 description 2
- PTOAARAWEBMLNO-KVQBGUIXSA-N Cladribine Chemical compound C1=NC=2C(N)=NC(Cl)=NC=2N1[C@H]1C[C@H](O)[C@@H](CO)O1 PTOAARAWEBMLNO-KVQBGUIXSA-N 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- 210000001072 Colon Anatomy 0.000 description 2
- 229950006799 Crisantaspase Drugs 0.000 description 2
- 229960004397 Cyclophosphamide Drugs 0.000 description 2
- CMSMOCZEIVJLDB-UHFFFAOYSA-N Cyclophosphamide Chemical compound ClCCN(CCCl)P1(=O)NCCCO1 CMSMOCZEIVJLDB-UHFFFAOYSA-N 0.000 description 2
- 229960000640 Dactinomycin Drugs 0.000 description 2
- 108010092160 Dactinomycin Proteins 0.000 description 2
- 108010067722 Dipeptidyl Peptidase 4 Proteins 0.000 description 2
- 230000036947 Dissociation constant Effects 0.000 description 2
- ZDZOTLJHXYCWBA-VCVYQWHSSA-N Docetaxel Chemical compound O([C@H]1[C@H]2[C@@](C([C@H](O)C3=C(C)[C@@H](OC(=O)[C@H](O)[C@@H](NC(=O)OC(C)(C)C)C=4C=CC=CC=4)C[C@]1(O)C3(C)C)=O)(C)[C@@H](O)C[C@H]1OC[C@]12OC(=O)C)C(=O)C1=CC=CC=C1 ZDZOTLJHXYCWBA-VCVYQWHSSA-N 0.000 description 2
- 108030007212 EC 3.4.21.79 Proteins 0.000 description 2
- 229940110715 ENZYMES FOR TREATMENT OF WOUNDS AND ULCERS Drugs 0.000 description 2
- AOJJSUZBOXZQNB-VTZDEGQISA-N EPIRUBICIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-VTZDEGQISA-N 0.000 description 2
- 229960001904 EPIRUBICIN Drugs 0.000 description 2
- 108010042407 Endonucleases Proteins 0.000 description 2
- 208000010227 Enterocolitis Diseases 0.000 description 2
- 241000709661 Enterovirus Species 0.000 description 2
- 229940114721 Enzymes FOR DISORDERS OF THE MUSCULO-SKELETAL SYSTEM Drugs 0.000 description 2
- 229940093738 Enzymes for ALIMENTARY TRACT AND METABOLISM Drugs 0.000 description 2
- 206010016165 Failure to thrive Diseases 0.000 description 2
- 102100004391 GZMB Human genes 0.000 description 2
- 241000287828 Gallus gallus Species 0.000 description 2
- SDUQYLNIPVEERB-QPPQHZFASA-N Gemcitabine Chemical compound O=C1N=C(N)C=CN1[C@H]1C(F)(F)[C@H](O)[C@@H](CO)O1 SDUQYLNIPVEERB-QPPQHZFASA-N 0.000 description 2
- 210000001511 Glucagon-secreting cell Anatomy 0.000 description 2
- 240000006600 Humulus lupulus Species 0.000 description 2
- 235000008694 Humulus lupulus Nutrition 0.000 description 2
- 229960000908 Idarubicin Drugs 0.000 description 2
- XDXDZDZNSLXDNA-TZNDIEGXSA-N Idarubicin hydrochloride Chemical compound C1[C@H](N)[C@H](O)[C@H](C)O[C@H]1O[C@@H]1C2=C(O)C(C(=O)C3=CC=CC=C3C3=O)=C3C(O)=C2C[C@@](O)(C(C)=O)C1 XDXDZDZNSLXDNA-TZNDIEGXSA-N 0.000 description 2
- 229960001101 Ifosfamide Drugs 0.000 description 2
- HOMGKSMUEGBAAB-UHFFFAOYSA-N Ifosfamide Chemical compound ClCCNP1(=O)OCCCN1CCCl HOMGKSMUEGBAAB-UHFFFAOYSA-N 0.000 description 2
- 210000003405 Ileum Anatomy 0.000 description 2
- 210000000987 Immune System Anatomy 0.000 description 2
- 210000003000 Inclusion Bodies Anatomy 0.000 description 2
- UWKQSNNFCGGAFS-XIFFEERXSA-N Irinotecan Chemical compound C1=C2C(CC)=C3CN(C(C4=C([C@@](C(=O)OC4)(O)CC)C=4)=O)C=4C3=NC2=CC=C1OC(=O)N(CC1)CCC1N1CCCCC1 UWKQSNNFCGGAFS-XIFFEERXSA-N 0.000 description 2
- 210000001630 Jejunum Anatomy 0.000 description 2
- 210000003734 Kidney Anatomy 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 2
- 102100005410 LINE-1 retrotransposable element ORF2 protein Human genes 0.000 description 2
- SGDBTWWWUNNDEQ-LBPRGKRZSA-N Melphalan Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N(CCCl)CCCl)C=C1 SGDBTWWWUNNDEQ-LBPRGKRZSA-N 0.000 description 2
- 229960004635 Mesna Drugs 0.000 description 2
- 108020004999 Messenger RNA Proteins 0.000 description 2
- 229960004857 Mitomycin Drugs 0.000 description 2
- KKZJGLLVHKMTCM-UHFFFAOYSA-N Mitoxantrone Chemical compound O=C1C2=C(O)C=CC(O)=C2C(=O)C2=C1C(NCCNCCO)=CC=C2NCCNCCO KKZJGLLVHKMTCM-UHFFFAOYSA-N 0.000 description 2
- 229960001156 Mitoxantrone Drugs 0.000 description 2
- NWIBSHFKIJFRCO-WUDYKRTCSA-N Mytomycin Chemical compound C1N2C(C(C(C)=C(N)C3=O)=O)=C3[C@@H](COC(N)=O)[C@@]2(OC)[C@@H]2[C@H]1N2 NWIBSHFKIJFRCO-WUDYKRTCSA-N 0.000 description 2
- DLGOEMSEDOSKAD-UHFFFAOYSA-N Nitrumon Chemical compound ClCCNC(=O)N(N=O)CCCl DLGOEMSEDOSKAD-UHFFFAOYSA-N 0.000 description 2
- 229920000272 Oligonucleotide Polymers 0.000 description 2
- 102100016785 PCSK2 Human genes 0.000 description 2
- 241000634212 Penia Species 0.000 description 2
- CPTBDICYNRMXFX-UHFFFAOYSA-N Procarbazine Chemical compound CNNCC1=CC=C(C(=O)NC(C)C)C=C1 CPTBDICYNRMXFX-UHFFFAOYSA-N 0.000 description 2
- 102000035853 Proglucagon Human genes 0.000 description 2
- 108010058003 Proglucagon Proteins 0.000 description 2
- 210000002784 Stomach Anatomy 0.000 description 2
- 229960001052 Streptozocin Drugs 0.000 description 2
- ZSJLQEPLLKMAKR-GKHCUFPYSA-N Streptozotocin Chemical compound O=NN(C)C(=O)N[C@H]1[C@@H](O)O[C@H](CO)[C@@H](O)[C@@H]1O ZSJLQEPLLKMAKR-GKHCUFPYSA-N 0.000 description 2
- 108091008153 T cell receptors Proteins 0.000 description 2
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 description 2
- 101710040537 TNF Proteins 0.000 description 2
- 229960003087 Tioguanine Drugs 0.000 description 2
- IVTVGDXNLFLDRM-HNNXBMFYSA-N Tomudex Chemical compound C=1C=C2NC(C)=NC(=O)C2=CC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)S1 IVTVGDXNLFLDRM-HNNXBMFYSA-N 0.000 description 2
- UCFGDBYHRUNTLO-QHCPKHFHSA-N Topotecan Chemical compound C1=C(O)C(CN(C)C)=C2C=C(CN3C4=CC5=C(C3=O)COC(=O)[C@]5(O)CC)C4=NC2=C1 UCFGDBYHRUNTLO-QHCPKHFHSA-N 0.000 description 2
- 229960003048 Vinblastine Drugs 0.000 description 2
- HOFQVRTUGATRFI-XQKSVPLYSA-N Vinblastine Chemical compound C([C@@H](C[C@]1(C(=O)OC)C=2C(=CC3=C([C@]45[C@H]([C@@]([C@H](OC(C)=O)[C@]6(CC)C=CCN([C@H]56)CC4)(O)C(=O)OC)N3C)C=2)OC)C[C@@](C2)(O)CC)N2CCC2=C1N=C1[C]2C=CC=C1 HOFQVRTUGATRFI-XQKSVPLYSA-N 0.000 description 2
- JXLYSJRDGCGARV-WWYNWVTFSA-N Vinblastine Natural products O=C(O[C@H]1[C@](O)(C(=O)OC)[C@@H]2N(C)c3c(cc(c(OC)c3)[C@]3(C(=O)OC)c4[nH]c5c(c4CCN4C[C@](O)(CC)C[C@H](C3)C4)cccc5)[C@@]32[C@H]2[C@@]1(CC)C=CCN2CC3)C JXLYSJRDGCGARV-WWYNWVTFSA-N 0.000 description 2
- 229960004528 Vincristine Drugs 0.000 description 2
- YCPOZVAOBBQLRI-PHDIDXHHSA-N [(2R,3R)-2,3-dihydroxy-4-methylsulfonyloxybutyl] methanesulfonate Chemical compound CS(=O)(=O)OC[C@@H](O)[C@H](O)COS(C)(=O)=O YCPOZVAOBBQLRI-PHDIDXHHSA-N 0.000 description 2
- 230000002159 abnormal effect Effects 0.000 description 2
- RJURFGZVJUQBHK-IIXSONLDSA-N actinomycin D Chemical compound C[C@H]1OC(=O)[C@H](C(C)C)N(C)C(=O)CN(C)C(=O)[C@@H]2CCCN2C(=O)[C@@H](C(C)C)NC(=O)[C@H]1NC(=O)C1=C(N)C(=O)C(C)=C2OC(C(C)=CC=C3C(=O)N[C@@H]4C(=O)N[C@@H](C(N5CCC[C@H]5C(=O)N(C)CC(=O)N(C)[C@@H](C(C)C)C(=O)O[C@@H]4C)=O)C(C)C)=C3N=C21 RJURFGZVJUQBHK-IIXSONLDSA-N 0.000 description 2
- 230000003213 activating Effects 0.000 description 2
- 230000002730 additional Effects 0.000 description 2
- 229960000473 altretamine Drugs 0.000 description 2
- 125000003277 amino group Chemical group 0.000 description 2
- 239000000427 antigen Substances 0.000 description 2
- 229940019336 antithrombotic Enzymes Drugs 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- IVRMZWNICZWHMI-UHFFFAOYSA-N azide group Chemical group [N-]=[N+]=[N-] IVRMZWNICZWHMI-UHFFFAOYSA-N 0.000 description 2
- 229910052788 barium Inorganic materials 0.000 description 2
- DSAJWYNOEDNPEQ-UHFFFAOYSA-N barium(0) Chemical compound [Ba] DSAJWYNOEDNPEQ-UHFFFAOYSA-N 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- 230000031018 biological processes and functions Effects 0.000 description 2
- OYVAGSVQBOHSSS-UAPAGMARSA-O bleomycin A2 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC=C(N=1)C=1SC=C(N=1)C(=O)NCCC[S+](C)C)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C OYVAGSVQBOHSSS-UAPAGMARSA-O 0.000 description 2
- UIIMBOGNXHQVGW-UHFFFAOYSA-M buffer Substances [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 2
- 229960002092 busulfan Drugs 0.000 description 2
- 150000001720 carbohydrates Chemical class 0.000 description 2
- 235000014633 carbohydrates Nutrition 0.000 description 2
- 229960005243 carmustine Drugs 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 238000001142 circular dichroism spectrum Methods 0.000 description 2
- 229960004316 cisplatin Drugs 0.000 description 2
- 229960002436 cladribine Drugs 0.000 description 2
- ZNEWHQLOPFWXOF-UHFFFAOYSA-N coenzyme M Chemical compound OS(=O)(=O)CCS ZNEWHQLOPFWXOF-UHFFFAOYSA-N 0.000 description 2
- 230000000112 colonic Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 239000000356 contaminant Substances 0.000 description 2
- 230000002596 correlated Effects 0.000 description 2
- 238000004132 cross linking Methods 0.000 description 2
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 2
- 229960003901 dacarbazine Drugs 0.000 description 2
- 230000001419 dependent Effects 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000003745 diagnosis Methods 0.000 description 2
- 230000035510 distribution Effects 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 229960003668 docetaxel Drugs 0.000 description 2
- 210000003890 endocrine cells Anatomy 0.000 description 2
- 238000001839 endoscopy Methods 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 229960005277 gemcitabine Drugs 0.000 description 2
- 230000002068 genetic Effects 0.000 description 2
- 229940121355 glucagon like peptide 2 (GLP-2) analogues Drugs 0.000 description 2
- LYCAIKOWRPUZTN-UHFFFAOYSA-N glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 2
- 229940020899 hematological Enzymes Drugs 0.000 description 2
- 238000004128 high performance liquid chromatography Methods 0.000 description 2
- 125000001165 hydrophobic group Chemical group 0.000 description 2
- 229960001330 hydroxycarbamide Drugs 0.000 description 2
- VSNHCAURESNICA-UHFFFAOYSA-N hydroxyurea Chemical compound NC(=O)NO VSNHCAURESNICA-UHFFFAOYSA-N 0.000 description 2
- 230000028993 immune response Effects 0.000 description 2
- 230000016784 immunoglobulin production Effects 0.000 description 2
- 238000000099 in vitro assay Methods 0.000 description 2
- 230000003870 intestinal permeability Effects 0.000 description 2
- 238000005342 ion exchange Methods 0.000 description 2
- 229960004768 irinotecan Drugs 0.000 description 2
- 238000001155 isoelectric focusing Methods 0.000 description 2
- 238000005304 joining Methods 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 230000004301 light adaptation Effects 0.000 description 2
- 235000012054 meals Nutrition 0.000 description 2
- 229960001924 melphalan Drugs 0.000 description 2
- 238000002844 melting Methods 0.000 description 2
- GLVAUDGFNGKCSF-UHFFFAOYSA-N mercaptopurine Chemical compound S=C1NC=NC2=C1NC=N2 GLVAUDGFNGKCSF-UHFFFAOYSA-N 0.000 description 2
- 229960001428 mercaptopurine Drugs 0.000 description 2
- 229920002106 messenger RNA Polymers 0.000 description 2
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 2
- 229960000485 methotrexate Drugs 0.000 description 2
- 230000003278 mimic Effects 0.000 description 2
- 230000000116 mitigating Effects 0.000 description 2
- 230000000051 modifying Effects 0.000 description 2
- 230000004899 motility Effects 0.000 description 2
- 210000000651 myofibroblasts Anatomy 0.000 description 2
- 230000002956 necrotizing Effects 0.000 description 2
- 230000035764 nutrition Effects 0.000 description 2
- 230000036961 partial Effects 0.000 description 2
- WBXPDJSOTKVWSJ-ZDUSSCGKSA-N pemetrexed Chemical compound C=1NC=2NC(N)=NC(=O)C=2C=1CCC1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 WBXPDJSOTKVWSJ-ZDUSSCGKSA-N 0.000 description 2
- 229960005079 pemetrexed Drugs 0.000 description 2
- 229940083249 peripheral vasodilators Enzymes Drugs 0.000 description 2
- 239000012071 phase Substances 0.000 description 2
- 230000004962 physiological condition Effects 0.000 description 2
- 230000036470 plasma concentration Effects 0.000 description 2
- 108091008117 polyclonal antibodies Proteins 0.000 description 2
- 230000035494 population pharmacokinetics Effects 0.000 description 2
- 230000003389 potentiating Effects 0.000 description 2
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 2
- 230000000770 pro-inflammatory Effects 0.000 description 2
- 229960000624 procarbazine Drugs 0.000 description 2
- 239000000651 prodrug Substances 0.000 description 2
- 229940002612 prodrugs Drugs 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 235000004252 protein component Nutrition 0.000 description 2
- 230000017854 proteolysis Effects 0.000 description 2
- 229960004432 raltitrexed Drugs 0.000 description 2
- 238000001525 receptor binding assay Methods 0.000 description 2
- 125000003616 serine group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 2
- 231100000486 side effect Toxicity 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 2
- 229910001415 sodium ion Inorganic materials 0.000 description 2
- 230000002194 synthesizing Effects 0.000 description 2
- 229960000303 topotecan Drugs 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 230000035897 transcription Effects 0.000 description 2
- 230000001131 transforming Effects 0.000 description 2
- LXZZYRPGZAFOLE-UHFFFAOYSA-L transplatin Chemical compound [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H] LXZZYRPGZAFOLE-UHFFFAOYSA-L 0.000 description 2
- 229960003181 treosulfan Drugs 0.000 description 2
- 125000000430 tryptophan group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C2=C([H])C([H])=C([H])C([H])=C12 0.000 description 2
- 230000001515 vagal Effects 0.000 description 2
- OGWKCGZFUXNPDA-XQKSVPLYSA-N vincristine Chemical compound C([N@]1C[C@@H](C[C@]2(C(=O)OC)C=3C(=CC4=C([C@]56[C@H]([C@@]([C@H](OC(C)=O)[C@]7(CC)C=CCN([C@H]67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)C[C@@](C1)(O)CC)CC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-XQKSVPLYSA-N 0.000 description 2
- 229960002066 vinorelbine Drugs 0.000 description 2
- GBABOYUKABKIAF-IELIFDKJSA-N vinorelbine Chemical compound C1N(CC=2C3=CC=CC=C3NC=22)CC(CC)=C[C@H]1C[C@]2(C(=O)OC)C1=CC([C@]23[C@H]([C@@]([C@H](OC(C)=O)[C@]4(CC)C=CCN([C@H]34)CC2)(O)C(=O)OC)N2C)=C2C=C1OC GBABOYUKABKIAF-IELIFDKJSA-N 0.000 description 2
- 230000036642 wellbeing Effects 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- ZROHGHOFXNOHSO-BNTLRKBRSA-L (1R,2R)-cyclohexane-1,2-diamine;oxalate;platinum(2+) Chemical compound [H][N]([C@@H]1CCCC[C@H]1[N]1([H])[H])([H])[Pt]11OC(=O)C(=O)O1 ZROHGHOFXNOHSO-BNTLRKBRSA-L 0.000 description 1
- DTHNMHAUYICORS-KTKZVXAJSA-N 107444-51-9 Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(N)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC=1N=CNC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=CC=C1 DTHNMHAUYICORS-KTKZVXAJSA-N 0.000 description 1
- 125000004042 4-aminobutyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])C([H])([H])N([H])[H] 0.000 description 1
- 206010000050 Abdominal adhesion Diseases 0.000 description 1
- 206010059837 Adhesion Diseases 0.000 description 1
- 244000105975 Antidesma platyphyllum Species 0.000 description 1
- IQTUDDBANZYMAR-UHFFFAOYSA-N Asparaginyl-Methionine Chemical compound CSCCC(C(O)=O)NC(=O)C(N)CC(N)=O IQTUDDBANZYMAR-UHFFFAOYSA-N 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 108091008154 B cell receptors Proteins 0.000 description 1
- 230000003844 B-cell-activation Effects 0.000 description 1
- GQYIWUVLTXOXAJ-UHFFFAOYSA-N Belustine Chemical compound ClCCN(N=O)C(=O)NC1CCCCC1 GQYIWUVLTXOXAJ-UHFFFAOYSA-N 0.000 description 1
- 230000037177 Biodistribution Effects 0.000 description 1
- 108010039209 Blood Coagulation Factors Proteins 0.000 description 1
- 102000015081 Blood Coagulation Factors Human genes 0.000 description 1
- 229940019700 Blood coagulation factors Drugs 0.000 description 1
- 210000001124 Body Fluids Anatomy 0.000 description 1
- 210000000988 Bone and Bones Anatomy 0.000 description 1
- 102100013077 CD4 Human genes 0.000 description 1
- 108010041397 CD4 Antigens Proteins 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- JCKYGMPEJWAADB-UHFFFAOYSA-N Chlorambucil Chemical compound OC(=O)CCCC1=CC=C(N(CCCl)CCCl)C=C1 JCKYGMPEJWAADB-UHFFFAOYSA-N 0.000 description 1
- 229960004630 Chlorambucil Drugs 0.000 description 1
- 230000035700 Clearance Rate Effects 0.000 description 1
- 229960000684 Cytarabine Drugs 0.000 description 1
- UHDGCWIWMRVCDJ-CCXZUQQUSA-N Cytosar Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@@H](O)[C@H](O)[C@@H](CO)O1 UHDGCWIWMRVCDJ-CCXZUQQUSA-N 0.000 description 1
- STQGQHZAVUOBTE-VGBVRHCVSA-N DAUNOMYCIN Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(C)=O)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 STQGQHZAVUOBTE-VGBVRHCVSA-N 0.000 description 1
- 229960000975 Daunorubicin Drugs 0.000 description 1
- 229960000633 Dextran Sulfate Drugs 0.000 description 1
- 230000037217 Elimination half-life Effects 0.000 description 1
- 229940095399 Enema Drugs 0.000 description 1
- 241000792859 Enema Species 0.000 description 1
- 229940088598 Enzyme Drugs 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 210000003238 Esophagus Anatomy 0.000 description 1
- 229920000181 Ethylene propylene rubber Polymers 0.000 description 1
- 229960005420 Etoposide Drugs 0.000 description 1
- VJJPUSNTGOMMGY-MRVIYFEKSA-N Etoposide Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@H](C)OC[C@H]4O3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 VJJPUSNTGOMMGY-MRVIYFEKSA-N 0.000 description 1
- 229920000665 Exon Polymers 0.000 description 1
- 101700053597 FCER2 Proteins 0.000 description 1
- 102100014608 FCER2 Human genes 0.000 description 1
- 101710003435 FCGRT Proteins 0.000 description 1
- 108010079356 FIIa Proteins 0.000 description 1
- 102100004626 GSAP Human genes 0.000 description 1
- 108060003415 GSAP Proteins 0.000 description 1
- 206010064147 Gastrointestinal inflammation Diseases 0.000 description 1
- 108010072447 Glicentin Proteins 0.000 description 1
- 102400000320 Glicentin Human genes 0.000 description 1
- 108010088406 Glucagon-Like Peptides Proteins 0.000 description 1
- RWSXRVCMGQZWBV-WDSKDSINSA-N Glutathione Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@@H](CS)C(=O)NCC(O)=O RWSXRVCMGQZWBV-WDSKDSINSA-N 0.000 description 1
- 102000006354 HLA-DR Antigens Human genes 0.000 description 1
- 108010058597 HLA-DR Antigens Proteins 0.000 description 1
- 229940037467 Helicobacter pylori Drugs 0.000 description 1
- 241000590002 Helicobacter pylori Species 0.000 description 1
- 230000036938 INCREASE IN AUC Effects 0.000 description 1
- WMDZARSFSMZOQO-DRZSPHRISA-N Ile-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WMDZARSFSMZOQO-DRZSPHRISA-N 0.000 description 1
- 102000004218 Insulin-like growth factor I Human genes 0.000 description 1
- 108090000723 Insulin-like growth factor I Proteins 0.000 description 1
- RCINICONZNJXQF-MZXODVADSA-N Intaxel Chemical compound O([C@@H]1[C@@]2(C[C@@H](C(C)=C(C2(C)C)[C@H](C([C@]2(C)[C@@H](O)C[C@H]3OC[C@]3([C@H]21)OC(C)=O)=O)OC(=O)C)OC(=O)[C@H](O)[C@@H](NC(=O)C=1C=CC=CC=1)C=1C=CC=CC=1)O)C(=O)C1=CC=CC=C1 RCINICONZNJXQF-MZXODVADSA-N 0.000 description 1
- 210000004347 Intestinal Mucosa Anatomy 0.000 description 1
- 229920002459 Intron Polymers 0.000 description 1
- 108020004391 Introns Proteins 0.000 description 1
- 241001520820 Joinvillea ascendens Species 0.000 description 1
- LRQKBLKVPFOOQJ-YFKPBYRVSA-N L-2-aminohexanoic acid zwitterion Chemical group CCCC[C@H]([NH3+])C([O-])=O LRQKBLKVPFOOQJ-YFKPBYRVSA-N 0.000 description 1
- OWMZNFCDEHGFEP-NFBCVYDUSA-N L-Histidyl-L-seryl-L-a-aspartylglycyl-L-threonyl-L-phenylalanyl-L-threonyl-L-seryl-L-a-glutamyl-L-leucyl-L-seryl-L-arginyl-L-leucyl-L-arginyl-L-a-glutamylglycyl -L-alanyl-L-arginyl-L-leucyl-L-glutaminyl-L-arginyl-L-leucyl-L-leucyl-L-glutaminylglycyl-L-leu Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(N)=O)[C@@H](C)O)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)C1=CC=CC=C1 OWMZNFCDEHGFEP-NFBCVYDUSA-N 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- 125000000241 L-isoleucino group Chemical group [H]OC(=O)[C@@]([H])(N([H])[*])[C@@](C([H])([H])[H])(C(C([H])([H])[H])([H])[H])[H] 0.000 description 1
- 210000004185 Liver Anatomy 0.000 description 1
- 210000004698 Lymphocytes Anatomy 0.000 description 1
- 210000003712 Lysosomes Anatomy 0.000 description 1
- 101710028361 MARVELD2 Proteins 0.000 description 1
- 230000035683 MEAN RESIDENCE TIME Effects 0.000 description 1
- 229920002521 Macromolecule Polymers 0.000 description 1
- 102000002274 Matrix Metalloproteinases Human genes 0.000 description 1
- 108010000684 Matrix Metalloproteinases Proteins 0.000 description 1
- 230000036650 Metabolic stability Effects 0.000 description 1
- 230000035633 Metabolized Effects 0.000 description 1
- OSWPMRLSEDHDFF-UHFFFAOYSA-N Methyl salicylate Chemical compound COC(=O)C1=CC=CC=C1O OSWPMRLSEDHDFF-UHFFFAOYSA-N 0.000 description 1
- PXZWGQLGAKCNKD-DPNMSELWSA-N MolPort-023-276-326 Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O)[C@@H](C)O)C(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C1=CC=CC=C1 PXZWGQLGAKCNKD-DPNMSELWSA-N 0.000 description 1
- 241001182492 Nes Species 0.000 description 1
- 210000000118 Neural Pathways Anatomy 0.000 description 1
- 229920002332 Noncoding DNA Polymers 0.000 description 1
- 229940099990 Ogen Drugs 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 208000001132 Osteoporosis Diseases 0.000 description 1
- 101800001388 Oxyntomodulin Proteins 0.000 description 1
- 102400000319 Oxyntomodulin Human genes 0.000 description 1
- 101710007828 PCSK2 Proteins 0.000 description 1
- 229960001592 Paclitaxel Drugs 0.000 description 1
- 206010033645 Pancreatitis Diseases 0.000 description 1
- 229960002340 Pentostatin Drugs 0.000 description 1
- FPVKHBSQESCIEP-JQCXWYLXSA-N Pentostatin Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC[C@H]2O)=C2N=C1 FPVKHBSQESCIEP-JQCXWYLXSA-N 0.000 description 1
- 230000036823 Plasma Levels Effects 0.000 description 1
- 230000036908 Plasma Stability Effects 0.000 description 1
- 239000004698 Polyethylene (PE) Substances 0.000 description 1
- BEPSGCXDIVACBU-UHFFFAOYSA-N Prolyl-Histidine Chemical compound C1CCNC1C(=O)NC(C(=O)O)CC1=CN=CN1 BEPSGCXDIVACBU-UHFFFAOYSA-N 0.000 description 1
- 238000001069 Raman spectroscopy Methods 0.000 description 1
- 241000700157 Rattus norvegicus Species 0.000 description 1
- 102000007312 Recombinant Proteins Human genes 0.000 description 1
- 108010033725 Recombinant Proteins Proteins 0.000 description 1
- 229920001914 Ribonucleotide Polymers 0.000 description 1
- 108020004418 Ribosomal RNA Proteins 0.000 description 1
- 102100001186 SCT Human genes 0.000 description 1
- 230000036141 SERUM STABILITY Effects 0.000 description 1
- 229960002101 Secretin Drugs 0.000 description 1
- 108010086019 Secretin Proteins 0.000 description 1
- UJTZHGHXJKIAOS-WHFBIAKZSA-N Ser-Gln Chemical compound OC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O UJTZHGHXJKIAOS-WHFBIAKZSA-N 0.000 description 1
- 102000005632 Single-Chain Antibodies Human genes 0.000 description 1
- 108010070144 Single-Chain Antibodies Proteins 0.000 description 1
- 108090000250 Sortase A Proteins 0.000 description 1
- 230000035551 Systemic clearance Effects 0.000 description 1
- 230000024932 T cell mediated immunity Effects 0.000 description 1
- 108010076818 TEV protease Proteins 0.000 description 1
- BPEGJWRSRHCHSN-UHFFFAOYSA-N Temodal Chemical compound O=C1N(C)N=NC2=C(C(N)=O)N=CN21 BPEGJWRSRHCHSN-UHFFFAOYSA-N 0.000 description 1
- 230000035672 Terminal elimination rate constant Effects 0.000 description 1
- 231100000765 Toxin Toxicity 0.000 description 1
- 229920001949 Transfer RNA Polymers 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- 206010046851 Uveitis Diseases 0.000 description 1
- 229960004355 Vindesine Drugs 0.000 description 1
- UGGWPQSBPIFKDZ-KOTLKJBCSA-N Vindesine Chemical compound C([C@@H](C[C@]1(C(=O)OC)C=2C(=CC3=C([C@]45[C@H]([C@@]([C@H](O)[C@]6(CC)C=CCN([C@H]56)CC4)(O)C(N)=O)N3C)C=2)OC)C[C@@](C2)(O)CC)N2CCC2=C1N=C1[C]2C=CC=C1 UGGWPQSBPIFKDZ-KOTLKJBCSA-N 0.000 description 1
- 241000269368 Xenopus laevis Species 0.000 description 1
- ZCBJDQBSLZREAA-UHFFFAOYSA-N [4-[2-(4-acetyloxyphenyl)-3-oxo-4H-1,4-benzoxazin-2-yl]phenyl] acetate Chemical compound C1=CC(OC(=O)C)=CC=C1C1(C=2C=CC(OC(C)=O)=CC=2)C(=O)NC2=CC=CC=C2O1 ZCBJDQBSLZREAA-UHFFFAOYSA-N 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 230000003044 adaptive Effects 0.000 description 1
- 238000009098 adjuvant therapy Methods 0.000 description 1
- 230000001058 adult Effects 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 125000003172 aldehyde group Chemical group 0.000 description 1
- 150000001299 aldehydes Chemical class 0.000 description 1
- 230000001668 ameliorated Effects 0.000 description 1
- 150000001412 amines Chemical class 0.000 description 1
- 238000001286 analytical centrifugation Methods 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 230000001028 anti-proliferant Effects 0.000 description 1
- 230000000840 anti-viral Effects 0.000 description 1
- 230000030741 antigen processing and presentation Effects 0.000 description 1
- 238000000149 argon plasma sintering Methods 0.000 description 1
- 230000000975 bioactive Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 230000000903 blocking Effects 0.000 description 1
- 239000003114 blood coagulation factor Substances 0.000 description 1
- 238000009534 blood test Methods 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 238000005251 capillary electrophoresis Methods 0.000 description 1
- 150000001718 carbodiimides Chemical class 0.000 description 1
- 230000020411 cell activation Effects 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000001413 cellular Effects 0.000 description 1
- 229910052729 chemical element Inorganic materials 0.000 description 1
- 108091006028 chimera Proteins 0.000 description 1
- 230000002759 chromosomal Effects 0.000 description 1
- 238000002983 circular dichroism Methods 0.000 description 1
- 238000000978 circular dichroism spectroscopy Methods 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 201000011231 colorectal cancer Diseases 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 230000001268 conjugating Effects 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000001808 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 230000002354 daily Effects 0.000 description 1
- 239000003398 denaturant Substances 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 238000000113 differential scanning calorimetry Methods 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- 201000009910 diseases by infectious agent Diseases 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000002330 electrospray ionisation mass spectrometry Methods 0.000 description 1
- 239000007920 enema Substances 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 210000001842 enterocyte Anatomy 0.000 description 1
- 210000003158 enteroendocrine cell Anatomy 0.000 description 1
- 230000002255 enzymatic Effects 0.000 description 1
- 239000002532 enzyme inhibitor Substances 0.000 description 1
- 230000001586 eradicative Effects 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 230000002349 favourable Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 230000004992 fission Effects 0.000 description 1
- 229960000390 fludarabine Drugs 0.000 description 1
- GIUYCYHIANZCFB-FJFJXFQQSA-N fludarabine phosphate Chemical compound C1=NC=2C(N)=NC(F)=NC=2N1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@@H]1O GIUYCYHIANZCFB-FJFJXFQQSA-N 0.000 description 1
- VVIAGPKUTFNRDU-ABLWVSNPSA-N folinic acid Chemical compound C1NC=2NC(N)=NC(=O)C=2N(C=O)C1CNC1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 VVIAGPKUTFNRDU-ABLWVSNPSA-N 0.000 description 1
- 235000008191 folinic acid Nutrition 0.000 description 1
- 239000011672 folinic acid Substances 0.000 description 1
- 238000004108 freeze drying Methods 0.000 description 1
- 230000027119 gastric acid secretion Effects 0.000 description 1
- 230000030136 gastric emptying Effects 0.000 description 1
- 230000030135 gastric motility Effects 0.000 description 1
- 230000003899 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 235000009424 haa Nutrition 0.000 description 1
- 150000002402 hexoses Chemical class 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 238000000760 immunoelectrophoresis Methods 0.000 description 1
- 230000001771 impaired Effects 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 238000001764 infiltration Methods 0.000 description 1
- 230000028709 inflammatory response Effects 0.000 description 1
- 239000003978 infusion fluid Substances 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 238000002347 injection Methods 0.000 description 1
- 239000007924 injection Substances 0.000 description 1
- 230000031891 intestinal absorption Effects 0.000 description 1
- 210000002490 intestinal epithelial cell Anatomy 0.000 description 1
- 230000004609 intestinal homeostasis Effects 0.000 description 1
- 238000010255 intramuscular injection Methods 0.000 description 1
- 239000007927 intramuscular injection Substances 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 230000036012 kel Effects 0.000 description 1
- 230000002147 killing Effects 0.000 description 1
- 210000002429 large intestine Anatomy 0.000 description 1
- 229960001691 leucovorin Drugs 0.000 description 1
- 230000021633 leukocyte mediated immunity Effects 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000004807 localization Effects 0.000 description 1
- 229960002247 lomustine Drugs 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 230000001868 lysosomic Effects 0.000 description 1
- 230000002101 lytic Effects 0.000 description 1
- 210000004962 mammalian cells Anatomy 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 230000003340 mental Effects 0.000 description 1
- 230000002503 metabolic Effects 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 210000004877 mucosa Anatomy 0.000 description 1
- 238000004848 nephelometry Methods 0.000 description 1
- 230000001264 neutralization Effects 0.000 description 1
- 235000006286 nutrient intake Nutrition 0.000 description 1
- 230000000414 obstructive Effects 0.000 description 1
- 230000003287 optical Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 239000003960 organic solvent Substances 0.000 description 1
- 229960001756 oxaliplatin Drugs 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 230000037361 pathway Effects 0.000 description 1
- 239000000816 peptidomimetic Substances 0.000 description 1
- 230000002688 persistence Effects 0.000 description 1
- 210000001539 phagocyte Anatomy 0.000 description 1
- 239000000546 pharmaceutic aid Substances 0.000 description 1
- 239000000825 pharmaceutical preparation Substances 0.000 description 1
- 230000003285 pharmacodynamic Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 230000000865 phosphorylative Effects 0.000 description 1
- 230000004983 pleiotropic Effects 0.000 description 1
- 229920000232 polyglycine polymer Polymers 0.000 description 1
- 238000003752 polymerase chain reaction Methods 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 239000011148 porous material Substances 0.000 description 1
- 230000001323 posttranslational Effects 0.000 description 1
- 230000036515 potency Effects 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 230000003449 preventive Effects 0.000 description 1
- 230000002062 proliferating Effects 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 230000001737 promoting Effects 0.000 description 1
- 230000000069 prophylaxis Effects 0.000 description 1
- 230000004224 protection Effects 0.000 description 1
- 230000004845 protein aggregation Effects 0.000 description 1
- 230000002685 pulmonary Effects 0.000 description 1
- 238000003127 radioimmunoassay Methods 0.000 description 1
- 239000000018 receptor agonist Substances 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000000268 renotropic Effects 0.000 description 1
- 238000002271 resection Methods 0.000 description 1
- 108091007521 restriction endonucleases Proteins 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 229920002973 ribosomal RNA Polymers 0.000 description 1
- 229920002033 ribozyme Polymers 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 238000007423 screening assay Methods 0.000 description 1
- 229950002350 secretin human Drugs 0.000 description 1
- 230000003248 secreting Effects 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- KISFEBPWFCGRGN-UHFFFAOYSA-M sodium;2-(2,4-dichlorophenoxy)ethyl sulfate Chemical compound [Na+].[O-]S(=O)(=O)OCCOC1=CC=C(Cl)C=C1Cl KISFEBPWFCGRGN-UHFFFAOYSA-M 0.000 description 1
- 230000003595 spectral Effects 0.000 description 1
- 230000000638 stimulation Effects 0.000 description 1
- 238000002563 stool test Methods 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 238000010254 subcutaneous injection Methods 0.000 description 1
- 239000007929 subcutaneous injection Substances 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 230000004083 survival Effects 0.000 description 1
- 229930003347 taxol Natural products 0.000 description 1
- 229960004964 temozolomide Drugs 0.000 description 1
- 125000003396 thiol group Chemical group [H]S* 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 108020003112 toxins Proteins 0.000 description 1
- 230000032258 transport Effects 0.000 description 1
- 230000001960 triggered Effects 0.000 description 1
- 230000001228 trophic Effects 0.000 description 1
- 238000004450 types of analysis Methods 0.000 description 1
- 238000010798 ubiquitination Methods 0.000 description 1
- 238000009827 uniform distribution Methods 0.000 description 1
- 230000003827 upregulation Effects 0.000 description 1
- 230000002227 vasoactive Effects 0.000 description 1
- 230000035513 volume of distribution Effects 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P1/00—Drugs for disorders of the alimentary tract or the digestive system
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P1/00—Drugs for disorders of the alimentary tract or the digestive system
- A61P1/04—Drugs for disorders of the alimentary tract or the digestive system for ulcers, gastritis or reflux esophagitis, e.g. antacids, inhibitors of acid secretion, mucosal protectants
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P1/00—Drugs for disorders of the alimentary tract or the digestive system
- A61P1/12—Antidiarrhoeals
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P1/00—Drugs for disorders of the alimentary tract or the digestive system
- A61P1/14—Prodigestives, e.g. acids, enzymes, appetite stimulants, antidyspeptics, tonics, antiflatulents
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P1/00—Drugs for disorders of the alimentary tract or the digestive system
- A61P1/18—Drugs for disorders of the alimentary tract or the digestive system for pancreatic disorders, e.g. pancreatic enzymes
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P15/00—Drugs for genital or sexual disorders; Contraceptives
- A61P15/08—Drugs for genital or sexual disorders; Contraceptives for gonadal disorders or for enhancing fertility, e.g. inducers of ovulation or of spermatogenesis
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P3/00—Drugs for disorders of the metabolism
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P3/00—Drugs for disorders of the metabolism
- A61P3/02—Nutrients, e.g. vitamins, minerals
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P3/00—Drugs for disorders of the metabolism
- A61P3/04—Anorexiants; Antiobesity agents
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P3/00—Drugs for disorders of the metabolism
- A61P3/08—Drugs for disorders of the metabolism for glucose homeostasis
- A61P3/10—Drugs for disorders of the metabolism for glucose homeostasis for hyperglycaemia, e.g. antidiabetics
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/04—Antibacterial agents
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P37/00—Drugs for immunological or allergic disorders
- A61P37/02—Immunomodulators
- A61P37/04—Immunostimulants
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P37/00—Drugs for immunological or allergic disorders
- A61P37/02—Immunomodulators
- A61P37/06—Immunosuppressants, e.g. drugs for graft rejection
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P37/00—Drugs for immunological or allergic disorders
- A61P37/08—Antiallergic agents
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/575—Hormones
- C07K14/605—Glucagons
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/31—Fusion polypeptide fusions, other than Fc, for prolonged plasma life, e.g. albumin
Abstract
Disclosed is a composition for use in achieving an intestinotrophic effect in a subject comprising a recombinant fusion protein comprising (i) a glucagon-like protein-2 (GLP-2) sequence selected from the group consisting of the sequences of SEQ ID NOS: 1 and 3-23, and (ii) an extended recombinant polypeptide (XTEN), wherein the XTEN is a sequence exhibiting at least 90% sequence identity to a sequence selected from the group consisting of the sequences in Table 4, and wherein the XTEN is further characterized in that: (a) the XTEN comprises at least 36 amino acid residues; (b) the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) residues constitutes more than 80% of the total amino acid residues of the XTEN; (c) the XTEN is substantially non-repetitive such that (i) the XTEN contains no three contiguous amino acids that are identical unless the amino acids are serine; (ii) at least 80% of the XTEN sequence consists of non-overlapping sequence motifs, each of the sequence motifs comprising 9 to 14 amino acid residues consisting of four to six amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), wherein any two contiguous amino acid residues do not occur more than twice in each of the non-overlapping sequence motifs; or (iii) the XTEN sequence has a subsequence score of less than 10; (d) the XTEN has greater than 90% random coil formation as determined by GOR algorithm; (e) the XTEN has less than 2% alpha helices and 2% beta-sheets as determined by Chou-Fasman algorithm; and (f) the XTEN lacks a predicted T-cell epitope when analysed by TEPITOPE algorithm, wherein the TEPITOPE threshold score for the prediction by the algorithm has a threshold of –9, wherein the fusion protein exhibits an apparent molecular weight factor of at least 4 and is capable of achieving an intestinotrophic effect in a subject using a dosage of 2.5 nmol/kg to 6250 nmol/kg, or 25 nmol/kg to 3750 
nmol/kg, or 75 nmol/kg/dose to 1250 nmol/kg/dose, or 125 nmol/kg/dose to 750 nmol/kg/dose.
Description
GLUCAGON-LIKE PEPTIDE-2 COMPOSITIONS
AND METHODS OF MAKING AND USING SAME
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority benefit to U.S. Provisional Application Serial No. 61/573,748
filed September 12, 2011, which application is incorporated herein by reference in its entirety.
BACKGROUND OF THE INVENTION
Glucagon-like peptide-2 (GLP-2) is an endocrine peptide that, in humans, is generated as a 33
amino acid peptide by post-translational proteolytic cleavage of proglucagon; a process that also liberates
the related glucagon-like peptide-1 (GLP-1). GLP-2 is produced and secreted in a nutrient-dependent
fashion by the intestinal endocrine L cells. GLP-2 is trophic to the intestinal mucosal epithelium via
stimulation of crypt cell proliferation and reduction of enterocyte apoptosis. GLP-2 exerts its effects
through specific GLP-2 receptors but the responses in the intestine are mediated by indirect pathways in
that the receptor is not expressed on the epithelium but on enteric neurons (Redstone, HA, et al. The
Effect of Glucagon-Like Peptide-2 Receptor Agonists on Colonic Anastomotic Wound Healing.
Gastroenterol Res Pract. (2010); 2010: Art. ID: ).
The effects of GLP-2 are multiple, including intestinotrophic effects resulting in an increase in
intestinal absorption and nutrient assimilation (Lovshin, J. and D.J. Drucker, Synthesis, secretion and
biological actions of the glucagon-like peptides. Ped. Diabetes (2000) 1(1):49-57); anti-inflammatory
activities; mucosal healing and repair; decreasing intestinal permeability; and an increase in mesenteric
blood flow (Bremholm, L. et al. Glucagon-like peptide-2 increases mesenteric blood flow in humans.
Scan. J. Gastro. (2009) 44(3):314-319). Exogenously administered GLP-2 produces a number of effects
in humans and rodents, including slowing gastric emptying, increasing intestinal blood flow and
intestinal growth/mucosal surface area, enhancement of intestinal function, reduction in bone breakdown
and neuroprotection. GLP-2 may act in an endocrine fashion to link intestinal growth and metabolism
with nutrient intake. In inflamed mucosa, however, GLP-2 action is antiproliferative, decreasing the
expression of proinflammatory cytokines while increasing the expression of IGF-1, promoting healing of
inflamed mucosa.
Many patients require surgical removal of the small or large bowel for a wide range of conditions,
including colorectal cancer, inflammatory bowel disease, irritable bowel syndrome, and trauma. Short
bowel syndrome (SBS) patients with end jejunostomy and no colon have reduced release of GLP-2 in
response to a meal due to the removal of secreting L cells. Patients with active Crohn’s Disease or
ulcerative colitis have endogenous serum GLP-2 concentrations that are increased, suggesting the
possibility of a normal adaptive response to mucosal injury (Buchman, A. L., et al. Teduglutide, a novel
mucosally active analog of glucagon-like peptide-2 (GLP-2) for the treatment of moderate to severe
Crohn's disease. Inflammatory Bowel Diseases, (2010) 16:962-973).
Exogenously administered GLP-2 and GLP-2 analogues have been demonstrated in animal
models to promote the growth and repair of the intestinal epithelium, including enhanced nutrient
absorption following small bowel resection and alleviation of total parenteral nutrition-induced
hypoplasia in rodents, as well as demonstration of decreased mortality and improvement of disease-
related histopathology in animal models such as indomethacin-induced enteritis, dextran sulfate-induced
colitis and chemotherapy-induced mucositis. Accordingly, GLP-2 and related analogs may be treatments
for short bowel syndrome, irritable bowel syndrome, Crohn's disease, and other diseases of the intestines
(Moor, BA, et al. GLP-2 receptor agonism ameliorates inflammation and gastrointestinal stasis in murine
post-operative ileus. J Pharmacol Exp Ther. (2010) 333(2):574-583). However, native GLP-2 has a half-
life of approximately seven minutes due to cleavage by dipeptidyl peptidase IV (DPP-IV) (Jeppesen PB,
et al., Teduglutide (ALX-0600), a dipeptidyl peptidase IV resistant glucagon-like peptide 2 analogue,
improves intestinal function in short bowel syndrome patients. Gut. (2005) 54(9):1224-1231; Hartmann
B, et al. (2000) Dipeptidyl peptidase IV inhibition enhances the intestinotrophic effect of glucagon-like
peptide-2 in rats and mice. Endocrinology 141:4013-4020). It has been determined that modification of
the GLP-2 sequence by replacement of alanine with glycine in position 2 blocks degradation by DPP-IV,
extending the half-life of the analog called teduglutide to 0.9-2.3 hours (Marier JF, Population
pharmacokinetics of teduglutide following repeated subcutaneous administrations in healthy participants
and in patients with short bowel syndrome and Crohn's disease. J Clin Pharmacol. (2010) 50(1):36-49).
However, recent clinical trials utilizing teduglutide in patients with short bowel syndrome required daily
administration of the GLP-2 analog to achieve a clinical benefit (Jeppesen PB, Randomized placebo-
controlled trial of teduglutide in reducing parenteral nutrition and/or intravenous fluid requirements in
patients with short bowel syndrome. Gut (2011) 902-914).
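The half-life figures above can be put on a common scale with a short calculation. The sketch below is illustrative only and assumes simple first-order (single-compartment) elimination, which the text does not state; the function name fraction_remaining is hypothetical. It estimates what fraction of a dose would remain 24 hours after administration for the half-lives quoted above.

```python
def fraction_remaining(hours_elapsed: float, half_life_hours: float) -> float:
    """Fraction of the starting amount left, assuming first-order elimination."""
    return 0.5 ** (hours_elapsed / half_life_hours)


# Half-lives quoted in the text, converted to hours.
half_lives_hours = {
    "native GLP-2 (about 7 minutes)": 7 / 60,
    "teduglutide, lower bound (0.9 h)": 0.9,
    "teduglutide, upper bound (2.3 h)": 2.3,
}

for label, t_half in half_lives_hours.items():
    remaining = fraction_remaining(24, t_half)
    print(f"{label}: {remaining:.2e} of the dose remains after 24 h")
```

Under this assumption, even the longest quoted teduglutide half-life leaves well under 1% of a dose after one day, which is consistent with the daily administration described above.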
Chemical modifications to a therapeutic protein can modify its in vivo clearance rate and
subsequent half-life. One example of a common modification is the addition of a polyethylene glycol
(PEG) moiety, typically coupled to the protein via an aldehyde or N-hydroxysuccinimide (NHS) group
on the PEG reacting with an amine group (e.g. lysine side chain or the N-terminus). However, the
conjugation step can result in the formation of heterogeneous product mixtures that need to be separated,
leading to significant product loss and complexity of manufacturing and does not result in a completely
chemically-uniform product. Also, the pharmacologic function of pharmacologically-active proteins may
be hampered if amino acid side chains in the vicinity of its binding site become modified by the
PEGylation process. Other approaches include the genetic fusion of an Fc domain to the therapeutic
protein, which increases the size of the therapeutic protein, hence reducing the rate of clearance through
the kidney. Additionally, the Fc domain confers the ability to bind to, and be recycled from lysosomes
by, the FcRn receptor, which results in increased pharmacokinetic half-life. A form of GLP-2 fused to Fc
has been evaluated in a murine model of gastrointestinal inflammation associated with postoperative ileus
(Moor, BA, et al. GLP-2 receptor agonism ameliorates inflammation and gastrointestinal stasis in murine
post-operative ileus. J Pharmacol Exp Ther. (2010) 333(2):574-583). Unfortunately, the Fc domain does
not fold efficiently during recombinant expression, and tends to form insoluble precipitates known as
inclusion bodies. These inclusion bodies must be solubilized and functional protein must be renatured
from the misfolded aggregate, a time-consuming, inefficient, and expensive process.
SUMMARY OF THE INVENTION
Accordingly, there remains a considerable need for GLP-2 compositions and formulations with
increased half-life and retention of activity and bioavailability when administered as part of a preventive
and/or therapeutic regimen for GLP-2 associated conditions and diseases that can be administered less
frequently, and are safer and less complicated and costly to produce. The present invention addresses
this need and provides related advantages as well. The present invention relates to novel GLP-2
compositions and uses thereof. Specifically, the compositions provided herein are particularly used for
the treatment or improvement of a gastrointestinal condition. In one aspect, the present invention
provides compositions of fusion proteins comprising a recombinant glucagon-like protein-2 (“GLP-2”)
and one or more extended recombinant polypeptides (“XTEN”). A subject XTEN is typically a
polypeptide with a non-repetitive sequence and unstructured conformation that is useful as a fusion
partner to GLP-2 peptides in that it confers enhanced properties to the resulting fusion protein. In one
embodiment, one or more XTEN is linked to a GLP-2 or sequence variants thereof, resulting in a
GLP-2-XTEN fusion protein (“GLP2-XTEN”). The present disclosure also provides pharmaceutical
compositions comprising the fusion proteins and the uses thereof for treating GLP-2-related conditions.
In one aspect, the GLP2-XTEN compositions have enhanced pharmacokinetic and/or physicochemical
properties compared to recombinant GLP-2 not linked to the XTEN, which permit more convenient
dosing and result in improvement in one or more parameters associated with the gastrointestinal
condition. The GLP2-XTEN fusion proteins of the embodiments disclosed herein exhibit one or more or
any combination of the improved properties and/or the embodiments as detailed herein. In some
embodiments, the GLP2-XTEN compositions of the invention do not have a component selected from the
group consisting of: polyethylene glycol (PEG), albumin, antibody, and an antibody fragment.
In one embodiment, the invention provides a recombinant GLP-2 fusion protein comprising an
XTEN, wherein the XTEN is characterized in that a) the XTEN comprises at least 36, or at least 72, or at
least 96, or at least 120, or at least 144, or at least 288, or at least 576, or at least 864, or at least 1000, or
at least 2000, or at least 3000 amino acid residues; b) the sum of glycine (G), alanine (A), serine (S),
threonine (T), glutamate (E) and proline (P) residues constitutes at least about 80%, or at least about 90%,
or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about
99%, of the total amino acid residues of the XTEN; c) the XTEN is substantially non-repetitive such that
(i) the XTEN contains no three contiguous amino acids that are identical unless the amino acids are
serine; (ii) at least about 80%, or at least about 90%, or at least about 91%, or at least about 92%, or at
least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%,
or at least about 98%, or at least about 99%, of the XTEN sequence consists of non-overlapping sequence
motifs, each of the sequence motifs comprising about 9 to about 14, or about 12 amino acid residues
consisting of three, four, five or six types of amino acids selected from glycine (G), alanine (A), serine
(S), threonine (T), glutamate (E) and proline (P), wherein any two contiguous amino acid residues do not
occur more than twice in each of the non-overlapping sequence motifs; or (iii) the XTEN sequence has a
subsequence score of less than 10; d) the XTEN has greater than 90%, or greater than 95%, or greater
than 99%, random coil formation as determined by GOR algorithm; e) the XTEN has less than 2% alpha
helices and 2% beta-sheets as determined by Chou-Fasman algorithm; f) the XTEN lacks a predicted T-
cell epitope when analyzed by TEPITOPE algorithm, wherein the TEPITOPE threshold score for said
prediction by said algorithm has a threshold of -9; wherein said fusion protein exhibits an apparent
molecular weight factor of at least about 4, or at least about 5, or at least about 6, or at least about 7, or at
least about 8, or at least about 9, or at least about 10, or at least about 11, or at least about 12, or at least
about 15, or at least about 20 when measured by size exclusion chromatography or comparable method
and exhibits an intestinotrophic effect when administered to a subject using a therapeutically effective
amount. In the foregoing embodiment, the XTEN can have any one of elements (a)-(d) or any
combination of (a)-(d). In another embodiment of the foregoing, the fusion protein exhibits an apparent
molecular weight of at least about 200 kDa, or at least about 400 kDa, or at least about 500 kDa, or at
least about 700 kDa, or at least about 1000 kDa, or at least about 1400 kDa, or at least about 1600 kDa,
or at least about 1800 kDa, or at least about 2000 kDa, or at least about 3000 kDa. In another
embodiment of the foregoing, the fusion protein exhibits a terminal half-life that is longer than about 24,
or about 30, or about 48, or about 72, or about 96, or about 120, or about 144 hours when administered
to a subject, wherein the subject is selected from mouse, rat, monkey and man. In one embodiment, the
XTEN of the fusion protein is characterized in that at least about 80%, or at least about 90%, or at least
about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or
at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% of the XTEN
sequence consists of non-overlapping sequence motifs wherein the motifs are selected from Table 3. In
some embodiments, the XTEN of the fusion proteins are further characterized in that the sum of
asparagine and glutamine residues is less than 10%, or less than 5%, or less than 2% of the total amino
acid sequence of the XTEN. In other embodiments, the XTEN of the fusion proteins are further
characterized in that the sum of methionine and tryptophan residues is less than 2% of the total amino
acid sequence of the XTEN. In still other embodiments, the XTEN of the fusion proteins are further
characterized in that the XTEN has less than 5% amino acid residues with a positive charge. In one
embodiment, the intestinotrophic effect of the administered fusion protein is at least about 30%, or at
least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about
80%, or at least about 90%, or at least about 100% or at least about 120% or at least about 150% or at
least about 200% of the intestinotrophic effect compared to the corresponding GLP-2 not linked to
XTEN and administered to a subject using a comparable dose. In one embodiment, the intestinotrophic
effect is manifest in a subject selected from the group consisting of mouse, rat, monkey, and human. In
the foregoing embodiments, said administration is subcutaneous, intramuscular, or intravenous. In
another embodiment, the intestinotrophic effect is determined after administration of 1 dose, or 3 doses,
or 6 doses, or 10 doses, or 12 or more doses of the fusion protein. In another embodiment, the
intestinotrophic effect is selected from the group consisting of intestinal growth, increased hyperplasia of
the villus epithelium, increased crypt cell proliferation, increased height of the crypt and villus axis,
increased healing after intestinal anastomosis, increased small bowel weight, increased small bowel
length, decreased small bowel epithelium apoptosis, reduced ulceration, reduced intestinal adhesions, and
enhancement of intestinal function.
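Several of the sequence-level criteria above are simple composition tests that can be checked mechanically. The following is a minimal sketch, not the applicants' method: the function names, the toy sequence, and the choice of lysine and arginine as the positively charged residues are assumptions made for illustration. It covers only criteria (a), (b), (c)(i) and the asparagine/glutamine, methionine/tryptophan and positive-charge limits; the motif, subsequence score, GOR, Chou-Fasman and TEPITOPE criteria are not implemented.

```python
from itertools import groupby

GASTEP = set("GASTEP")  # glycine, alanine, serine, threonine, glutamate, proline


def residue_fraction(seq: str, residues) -> float:
    """Fraction of positions in seq occupied by any of the given residues."""
    return sum(aa in residues for aa in seq) / len(seq)


def has_forbidden_triplet(seq: str) -> bool:
    """True if a residue other than serine occurs three or more times in a row."""
    return any(aa != "S" and len(list(run)) >= 3 for aa, run in groupby(seq))


def check_xten_like(seq: str) -> dict:
    """Evaluate the composition criteria; thresholds use the broadest values in the text."""
    return {
        "length_at_least_36": len(seq) >= 36,
        "gastep_at_least_80pct": residue_fraction(seq, GASTEP) >= 0.80,
        "no_identical_triplet_except_serine": not has_forbidden_triplet(seq),
        "asn_gln_below_10pct": residue_fraction(seq, "NQ") < 0.10,
        "met_trp_below_2pct": residue_fraction(seq, "MW") < 0.02,
        # Assumption: "positive charge" is taken to mean lysine (K) and arginine (R).
        "positive_charge_below_5pct": residue_fraction(seq, "KR") < 0.05,
    }


# Toy 36-residue sequence for illustration only; it is not an XTEN from the tables.
TOY_SEQUENCE = "GSEPTAGSPETA" * 3

for criterion, passed in check_xten_like(TOY_SEQUENCE).items():
    print(f"{criterion}: {passed}")
```

A sequence that passes these checks is not thereby shown to be an XTEN; the sketch only screens out sequences that fail the stated composition limits.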
In one embodiment, the administration of the GLP2-XTEN fusion protein results in an increase
in small intestine weight of at least about 10%, or at least about 20%, or at least about 30%. In another
embodiment, the administration results in an increase in small intestine length of at least about 5%, or at
least about 6%, or at least about 7%, or at least about 8%, or at least about 9%, or at least about 10%, or
at least about 20%, or at least about 30%.
In one embodiment, the GLP-2 sequence of the fusion protein has at least 90%, or at least about
91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99%, or about 100% sequence
identity to a sequence selected from the group consisting of the sequences in Table 1, when optimally
aligned. In another embodiment, the GLP-2 of the fusion protein comprises human GLP-2. In another
embodiment, the GLP-2 of the fusion protein comprises a GLP-2 of a species origin other than human,
such as bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2. In some
embodiments, the GLP-2 of the fusion protein has an amino acid substitution in place of Ala2, wherein
the substitution is glycine. In yet another embodiment, the GLP-2 of the fusion protein has the sequence
HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
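The Ala2-to-glycine substitution recited above can be checked with a minimal Python sketch that compares the recited sequence with native human GLP-2 position by position. The native sequence and the helper names below are supplied for illustration and are not quoted from Table 1; the identity figure is a simple ungapped comparison of equal-length sequences, which is an assumption and not the optimal alignment referred to in the text.

```python
# Native human GLP-2 (33 residues); supplied for illustration, not quoted from Table 1.
NATIVE_HUMAN_GLP2 = "HADGSFSDEMNTILDNLAARDFINWLIQTKITD"
# Sequence recited in the text, carrying glycine in place of Ala2.
GLY2_GLP2 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"


def substitutions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference residue, variant residue) per mismatch."""
    if len(reference) != len(variant):
        raise ValueError("ungapped comparison requires equal-length sequences")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


def percent_identity(reference: str, variant: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    mismatches = len(substitutions(reference, variant))
    return 100.0 * (len(reference) - mismatches) / len(reference)


diffs = substitutions(NATIVE_HUMAN_GLP2, GLY2_GLP2)
identity = percent_identity(NATIVE_HUMAN_GLP2, GLY2_GLP2)
print(diffs, round(identity, 1))
```

Sequences of differing length would need a gapped alignment, which this sketch deliberately does not attempt.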
In one embodiment of the GLP2-XTEN fusion protein, the XTEN is linked to the C-terminus of
the GLP-2. In another embodiment of the GLP2-XTEN fusion protein wherein the XTEN is linked to the
C-terminus of the GLP-2, the fusion protein further comprises a spacer sequence of 1 to about 50 amino
acid residues linking the GLP-2 and XTEN components. In one embodiment, the spacer sequence is a
single glycine residue.
In one embodiment of the GLP2-XTEN fusion protein, the XTEN is characterized in that: (a) the
total XTEN amino acid residues is at least 36 to about 3000, or about 144 to about 2000, or about 288 to
about 1000 amino acid residues; and (b) the sum of glycine (G), alanine (A), serine (S), threonine (T),
glutamate (E) and proline (P) residues constitutes at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99%, of the total amino acid
residues of the XTEN.
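Criteria (a) and (b) above lend themselves to a short computational check. The following Python sketch is illustrative only: the function names, the default thresholds (the broadest recited ranges, with "about" treated as exact), and the toy sequences are assumptions made for the example and are not XTEN sequences from Table 4.

```python
# The six residue types named in criterion (b).
XTEN_RESIDUES = frozenset("GASTEP")


def xten_residue_fraction(sequence: str) -> float:
    """Fraction of residues that are G, A, S, T, E or P."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(residue in XTEN_RESIDUES for residue in sequence) / len(sequence)


def meets_xten_criteria(
    sequence: str,
    min_length: int = 36,        # lower bound of the broadest range in (a)
    max_length: int = 3000,      # upper bound of the broadest range in (a)
    min_fraction: float = 0.90,  # lowest threshold recited in (b)
) -> bool:
    """Check length criterion (a) and composition criterion (b)."""
    length_ok = min_length <= len(sequence) <= max_length
    return length_ok and xten_residue_fraction(sequence) >= min_fraction


# Hypothetical toy sequences, not taken from any table in the document.
toy_xten = "GSEPAT" * 6                  # 36 residues, all from the six types
toy_with_lysine = "GSEPAT" * 5 + "K" * 6  # 36 residues, 30 of 36 from the six types

print(meets_xten_criteria(toy_xten), meets_xten_criteria(toy_with_lysine))
```

Tightening `min_fraction` to 0.95 or higher corresponds to the narrower alternatives recited in (b).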
In one embodiment of the GLP2-XTEN fusion protein, the fusion protein comprises one or more
XTEN having at least 80%, or at least about 90%, or at least about 95%, or at least about 96%, or at least
about 97%, or at least about 98%, or at least about 99% sequence identity compared to a sequence of
comparable length selected from any one of Table 4, Table 8, Table 9, Table 10, Table 11, and Table 12,
when optimally aligned. In another embodiment, the fusion protein comprises an XTEN wherein the
sequence is AE864 of Table 4. In another embodiment, the fusion protein sequence has a sequence with
at least 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or
at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about
99%, or 100% sequence identity to the sequence set forth in .
In one embodiment, the fusion protein comprising a GLP-2 and XTEN binds to a GLP-2 receptor
with an EC50 of less than about 30 nM, or about 100 nM, or about 200 nM, or about 300 nM, or about
370 nM, or about 400 nM, or about 500 nM, or about 600 nM, or about 700 nM, or about 800 nM, or
about 1000 nM, or about 1200 nM, or about 1400 nM when assayed using an in vitro GLP2R cell assay.
In another embodiment, the fusion protein retains at least about 1%, or about 2%, or about 3%, or about
4%, or about 5%, or about 10%, or about 20%, or about 30% of the potency of the corresponding GLP-2
not linked to XTEN when assayed using an in vitro GLP2R cell assay. In the foregoing embodiments of
the paragraph, the GLP2R cell can be a human recombinant GLP-2 glucagon family receptor calcium-
optimized cell or another cell comprising GLP2R known in the art.
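One way to express retained potency from EC50 values is sketched below in Python. It assumes that potency is taken as inversely proportional to EC50, which is a common convention but is not stated in the text; the function name and the numeric inputs are hypothetical and are not assay results from the document.

```python
def retained_potency_percent(ec50_glp2_nm: float, ec50_fusion_nm: float) -> float:
    """Potency of the fusion protein as a percentage of unlinked GLP-2.

    Assumes potency is inversely proportional to EC50 measured in the same assay.
    """
    if ec50_glp2_nm <= 0 or ec50_fusion_nm <= 0:
        raise ValueError("EC50 values must be positive")
    return 100.0 * ec50_glp2_nm / ec50_fusion_nm


# Hypothetical inputs chosen only to illustrate the arithmetic.
example_percent = retained_potency_percent(ec50_glp2_nm=3.0, ec50_fusion_nm=300.0)
print(example_percent)
```

Under this convention a fusion protein with a 100-fold higher EC50 than the unlinked peptide retains about 1% of its potency.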
Non-limiting examples of fusion proteins with a single GLP-2 linked to one or two XTEN are
presented in Tables 13 and 32. In one embodiment, the invention provides a fusion protein composition
that has at least about 80% sequence identity compared to a sequence from Table 13 or Table 33, alternatively
at least about 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99%, or about 100% sequence identity as compared to a sequence from Table 13 or
Table 33. However, the invention also provides substitution of any of the GLP-2 sequences of Table 1 for
a GLP-2 in a sequence of Table 33, and substitution of any XTEN sequence of Table 4 for an XTEN in a
sequence of Table 33. In some embodiments, the GLP-2 and the XTEN further comprise a spacer
sequence of 1 to about 50 amino acid residues linking the GLP-2 and XTEN components, wherein the
spacer sequence optionally comprises a cleavage sequence that is cleavable by a protease, including
endogenous mammalian proteases. Examples of such proteases include, but are not limited to, FXIa,
FXIIa, kallikrein, FVIIa, FIXa, FXa, thrombin, elastase-2, granzyme B, MMP-12, MMP-13, MMP-17
or MMP-20, TEV, enterokinase, rhinovirus 3C protease, and sortase A, or a sequence selected from
Table 6. In one embodiment, a fusion protein composition with a cleavage sequence has a sequence
having at least about 80% sequence identity compared to a sequence from Table 34, alternatively at least
about 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99%, or about 100% sequence identity as compared to a sequence from Table 34. However, the
invention also provides substitution of any of the GLP-2 sequences of Table 1 for a GLP-2 in a sequence
of Table 34, and substitution of any XTEN sequence of Table 4 for an XTEN in a sequence of Table 34,
and substitution of any cleavage sequence of Table 6 for a cleavage sequence in a sequence of Table 34.
In embodiments having the subject cleavage sequences linked to the XTEN, cleavage of the cleavage
sequence by the protease releases the XTEN from the fusion protein. In some embodiments of the fusion
proteins comprising cleavage sequences that link XTEN to GLP-2, the GLP-2 component becomes
biologically active or has an increase in the capacity to bind to GLP-2 receptor upon its release from the
XTEN by cleavage of the cleavage sequence, wherein the resulting activity of the cleaved protein is at
least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about
70%, or at least about 80%, or at least about 90% compared to the corresponding GLP-2 not linked to
XTEN. In one embodiment of the foregoing, the cleavage sequence is cleavable by a protease of Table 6.
In another embodiment, the fusion protein comprises XTEN linked to the GLP-2 by two heterologous
cleavage sequences that are cleavable by different proteases, which can be sequences of Table 6. In one
embodiment of the foregoing, the cleaved GLP2-XTEN has increased capacity to bind the GLP-2
receptor.
The invention provides that the fusion protein compositions of the embodiments comprising
GLP-2 and XTEN, characterized as described above, can be in different N- to C-terminus configurations.
In one embodiment of the GLP2-XTEN composition, the invention provides a fusion protein of formula I:
(GLP-2)-(XTEN) I
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or analog as defined herein,
including sequences of Table 1, and XTEN is an extended recombinant polypeptide as defined herein,
including sequences exhibiting at least about 80%, or at least about 90%, or at least about 95%, or at least
about 99% sequence identity to a sequence of comparable length from any one of Table 4, Table 8,
Table 9, Table 10, Table 11, and Table 12, when optimally aligned. In one embodiment, the XTEN is
AE864.
In another embodiment of the GLP2-XTEN composition, the invention provides a fusion
protein of formula II:
(XTEN)-(GLP-2) II
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or analog as defined herein,
including sequences of Table 1, and XTEN is an extended recombinant polypeptide as defined herein,
including sequences exhibiting at least about 80%, or at least about 90%, or at least about 95%, or at least
about 99% sequence identity to a sequence of comparable length from any one of Table 4, Table 8,
Table 9, Table 10, Table 11, and Table 12, when optimally aligned. In one embodiment, the XTEN is
AE864.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula III:
(XTEN)-(GLP-2)-(XTEN) III
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or analog as defined herein (e.g.,
including sequences of Table 1), and XTEN is an extended recombinant polypeptide as defined herein,
including sequences exhibiting at least about 80%, or at least about 90%, or at least about 95%, or at least
about 99% sequence identity to a sequence of comparable length from any one of Table 4, Table 8,
Table 9, Table 10, Table 11, and Table 12, when optimally aligned. In one embodiment, the XTEN is
AE864.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula IV:
(GLP-2)-(XTEN)-(GLP-2) IV
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or analog as defined herein (e.g.,
including sequences of Table 1), and XTEN is an extended recombinant polypeptide as defined herein
(e.g., including sequences exhibiting at least about 80%, or at least about 90%, or at least about 95%, or at
least about 99% sequence identity to a sequence of comparable length from any one of Table 4, Table
8, Table 9, Table 10, Table 11, and Table 12, when optimally aligned). In one embodiment, the XTEN is
AE864.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula V:
(GLP-2)-(S)x-(XTEN)y V
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or analog as defined herein,
including sequences of Table 1; S is a spacer sequence having between 1 to about 50 amino acid residues
that can optionally include a cleavage sequence or amino acids compatible with restriction sites; x is
either 0 or 1; and XTEN is an extended recombinant polypeptide as defined herein, including sequences
exhibiting at least about 80%, or at least about 90%, or at least about 95%, or at least about 99%
sequence identity to a sequence of comparable length from any one of Table 4, Table 8, Table 9, Table
10, Table 11, and Table 12, when optimally aligned. In one embodiment, the XTEN is AE864. In the
embodiments of formula V, the spacer sequence comprising a cleavage sequence is a sequence that is
cleavable by a mammalian protease selected from the group consisting of factor XIa, factor XIIa,
kallikrein, factor VIIa, factor IXa, factor Xa, factor IIa (thrombin), elastase-2, MMP-12, MMP13, MMP-17
and MMP-20. In one embodiment of the fusion protein of formula V, the GLP-2 comprises human
GLP-2. In another embodiment of the fusion protein of formula V, the GLP-2 comprises a GLP-2 of a
species origin other than human, e.g., bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and
canine GLP-2. In another embodiment of the fusion protein of formula V, the GLP-2 has an amino acid
substitution in place of Ala2, and wherein the substitution is glycine. In another embodiment of the
fusion protein of formula V, the GLP-2 has the sequence
HGDGSFSDEMNTILDNLAARDFINWLIQTKITD. In another embodiment of the fusion protein of
formula V, the fusion protein comprises a spacer sequence wherein the spacer sequence is a glycine
residue.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula VI:
(XTEN)x-(S)x-(GLP-2)-(S)y-(XTEN)y VI
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or analog as defined herein (e.g.,
including sequences of Table 1); S is a spacer sequence having between 1 to about 50 amino acid
residues that can optionally include a cleavage sequence or amino acids compatible with restriction
sites; x is either 0 or 1 and y is either 0 or 1 wherein x+y ≥ 1; and XTEN is an extended recombinant
polypeptide as defined herein, e.g., including sequences exhibiting at least about 80%, or at least about 90%, or at
least about 95%, or at least about 99% sequence identity to a sequence of comparable length from any
one of Table 4, Table 8, Table 9, Table 10, Table 11, and Table 12, when optimally aligned. In one
embodiment, the XTEN is AE864. In the embodiments of formula VI, the spacer sequence comprising a
cleavage sequence is a sequence that is cleavable by a mammalian protease including but not limited to
factor XIa, factor XIIa, kallikrein, factor VIIa, factor IXa, factor Xa, factor IIa (thrombin), elastase-2,
MMP-12, MMP13, MMP-17 and MMP-20.
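The N- to C-terminus ordering of formula VI, together with the constraint that x and y are each 0 or 1 with x+y ≥ 1, can be sketched as plain string assembly. The Python below is a hedged illustration: the function name and the toy XTEN placeholder are hypothetical, "about 50" is treated as exactly 50, and the single-glycine spacer is taken from the formula V embodiment above.

```python
def assemble_formula_vi(glp2: str, xten: str, spacer: str, x: int, y: int) -> str:
    """Concatenate N- to C-terminus as (XTEN)x-(S)x-(GLP-2)-(S)y-(XTEN)y."""
    if x not in (0, 1) or y not in (0, 1):
        raise ValueError("x and y must each be 0 or 1")
    if x + y < 1:
        raise ValueError("formula VI requires x + y >= 1")
    if not 1 <= len(spacer) <= 50:  # "about 50" treated as exactly 50 here
        raise ValueError("spacer must have 1 to 50 residues")
    n_terminal = (xten + spacer) * x  # XTEN then spacer, present only when x == 1
    c_terminal = (spacer + xten) * y  # spacer then XTEN, present only when y == 1
    return n_terminal + glp2 + c_terminal


GLP2 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"  # sequence recited in the text
TOY_XTEN = "GSEPAT" * 6                      # hypothetical placeholder, not from Table 4

# x = 0, y = 1 gives GLP-2, a glycine spacer, then a C-terminal XTEN.
c_terminal_fusion = assemble_formula_vi(GLP2, TOY_XTEN, "G", x=0, y=1)
print(c_terminal_fusion)
```

Setting x = 1 and y = 1 places an XTEN at both termini, and x = 1 with y = 0 gives the N-terminal arrangement.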
In some embodiments, administration of a therapeutically effective dose of a fusion protein of
one of formulae I-VI to a subject in need thereof can result in a gain in time of at least two-fold, or at
least three-fold, or at least four-fold, or at least five-fold, or at least 10-fold or more spent within a
therapeutic window for the fusion protein compared to the corresponding GLP-2 not linked to the XTEN
and administered at a comparable dose to a subject. In other cases, administration of a therapeutically
effective dose of a fusion protein of an embodiment of formulae I-VI to a subject in need thereof can
result in a gain in time between consecutive doses necessary to maintain a therapeutically effective dose
regimen of at least 48 h, or at least 72 h, or at least about 96 h, or at least about 120 h, or at least about 7
days, or at least about 14 days, or at least about 21 days between consecutive doses compared to
administration of a corresponding GLP-2 not linked to XTEN at a comparable dose.
The fusion protein compositions of the embodiments described herein can be evaluated for
retention of activity (including after cleavage of any incorporated XTEN-releasing cleavage sites) using
any appropriate in vitro assay disclosed herein (e.g., the assays of Table 32 or the assays described in the
Examples), to determine the suitability of the configuration for use as a therapeutic agent in the treatment
of a GLP-2-related condition. In one embodiment, the fusion protein exhibits at least about 2%, or
at least about 5%, or at least about 10%, or at least about 20%, or at least about 30%, or at least about
40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least
about 90% of the activity compared to the corresponding GLP-2 not linked to XTEN. In another
embodiment, the GLP-2 component released from the fusion protein by enzymatic cleavage of the
incorporated cleavage sequence linking the GLP-2 and XTEN components exhibits at least about 50%, or
at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% of the biological
activity compared to the corresponding GLP-2 not linked to XTEN.
In some embodiments, the invention provides fusion proteins comprising GLP-2 and one or more XTEN, wherein the
fusion proteins exhibit enhanced pharmacokinetic properties when administered to a subject compared to
a GLP-2 not linked to the XTEN, wherein the enhanced properties include but are not limited to longer
terminal half-life, larger area under the curve, increased time in which the blood concentration remains
within the therapeutic window, increased time between consecutive doses resulting in blood
concentrations within the therapeutic window, increased time between Cmax and Cmin blood
concentrations when consecutive doses are administered, and decreased cumulative dose over time
required to be administered compared to a GLP-2 not linked to the XTEN, yet still result in a blood
concentration within the therapeutic window. A subject to which a GLP2-XTEN composition is
administered can include but is not limited to mouse, rat, monkey and human. In some embodiments, the
terminal half-life of the fusion protein administered to a subject is increased at least about three-fold, or
at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about eight-fold, or
at least about ten-fold, or at least about 20-fold, or at least about 40-fold, or at least about 60-fold, or at
least about 100-fold, or even longer as compared to the corresponding recombinant GLP-2 not linked to
the XTEN when the corresponding GLP-2 is administered to a subject at a comparable dose. In other
embodiments, the terminal half-life of the fusion protein administered to a subject is at least about 12 h,
or at least about 24 h, or at least about 48 h, or at least about 72 h, or at least about 96 h, or at least about
120 h, or at least about 144 h, or at least about 21 days or greater. In other embodiments, the enhanced
pharmacokinetic property is reflected by the fact that the blood concentrations remain within the
therapeutic window for the fusion protein for a period that is at least about two-fold, or at least about
three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about
eight-fold, or at least about ten-fold, or at least about 20-fold, or at least about 40-fold, or at least
about 60-fold, or at least about 100-fold longer compared to the corresponding GLP-2 not linked to the
XTEN when the corresponding GLP-2 is administered to a subject at a comparable dose. The increase
in half-life and time spent within the therapeutic window permits less frequent dosing and decreased
amounts of the fusion protein (in nmoles/kg equivalent) that are administered to a subject, compared to
the corresponding GLP-2 not linked to the XTEN. In one embodiment, administration of three or more
doses of a GLP2-XTEN fusion protein to a subject in need thereof using a therapeutically-effective dose
regimen results in a gain in time of at least two-fold, or at least three-fold, or at least four-fold, or at least
five-fold, or at least six-fold, or at least eight-fold, or at least 10-fold, or at least about 20-fold, or at least
about 40-fold, or at least about 60-fold, or at least about 100-fold or higher between at least two
consecutive Cmax peaks and/or Cmin troughs for blood levels of the fusion protein compared to the
corresponding GLP-2 not linked to the XTEN and administered using a comparable dose regimen to a
subject. In one embodiment, the GLP2-XTEN administered using a therapeutically effective amount to a
subject in need thereof results in blood concentrations of the GLP2-XTEN fusion protein that remain
above at least about 500 ng/ml, at least about 1000 ng/ml, or at least about 2000 ng/ml, or at least about
3000 ng/ml, or at least about 4000 ng/ml, or at least about 5000 ng/ml, or at least about 10000 ng/ml, or
at least about 15000 ng/ml, or at least about 20000 ng/ml, or at least about 30000 ng/ml, or at least about
40000 ng/ml for at least about 24 hours, or at least about 48 hours, or at least about 72 hours, or at least
about 96 hours, or at least about 120 hours, or at least about 144 hours. In another embodiment, the
GLP2-XTEN administered at an appropriate dose to a subject results in area under the curve
concentrations of the GLP2-XTEN fusion protein of at least 100000 hr*ng/mL, or at least about 200000
hr*ng/mL, or at least about 400000 hr*ng/mL, or at least about 600000 hr*ng/mL, or at least about
800000 hr*ng/mL, or at least about 1000000 hr*ng/mL, or at least about 2000000 hr*ng/mL after a
single dose. In one embodiment, the GLP2-XTEN fusion protein has a terminal half-life that results in a
gain in time between consecutive doses necessary to maintain a therapeutically effective dose regimen of
at least 48 h, or at least 72 h, or at least about 96 h, or at least about 120 h, or at least about 7 days, or at
least about 14 days, or at least about 21 days between consecutive doses compared to the regimen of a
GLP-2 not linked to XTEN and administered at a comparable dose.
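The pharmacokinetic quantities recited in this paragraph (area under the curve in hr*ng/mL, terminal half-life, and fold increase over the unlinked peptide) can be computed from a concentration-time profile as sketched below in Python. The methods used here, the linear trapezoidal rule and a log-linear least-squares fit, are standard non-compartmental choices assumed for illustration and are not stated in the text; the concentration data and the comparator half-life are hypothetical.

```python
import math


def auc_trapezoid(times_h, conc_ng_ml):
    """Area under the concentration-time curve (hr*ng/mL), linear trapezoidal rule."""
    points = list(zip(times_h, conc_ng_ml))
    return sum(
        (t1 - t0) * (c0 + c1) / 2.0
        for (t0, c0), (t1, c1) in zip(points, points[1:])
    )


def terminal_half_life(times_h, conc_ng_ml):
    """Half-life (h) from the least-squares slope of ln(concentration) against time."""
    logs = [math.log(c) for c in conc_ng_ml]
    mean_t = sum(times_h) / len(times_h)
    mean_y = sum(logs) / len(logs)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs)) / sum(
        (t - mean_t) ** 2 for t in times_h
    )
    return math.log(2) / -slope


# Hypothetical mono-exponential profile: 10000 ng/mL at time 0, halving every 48 h.
times_h = [0, 24, 48, 72, 96]
conc_ng_ml = [10000.0 * 0.5 ** (t / 48.0) for t in times_h]

auc = auc_trapezoid(times_h, conc_ng_ml)
half_life_h = terminal_half_life(times_h, conc_ng_ml)

# Fold increase relative to a hypothetical 3 h half-life for the unlinked peptide.
fold_increase = half_life_h / 3.0
print(round(auc), round(half_life_h, 1), round(fold_increase, 1))
```

With real data the fit would be restricted to the terminal phase of the curve, a selection step omitted here because the toy profile is mono-exponential throughout.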
In one embodiment, the GLP2-XTEN fusion protein is characterized in that when an equivalent
amount, in nmoles/kg, of the fusion protein and the corresponding GLP-2 that lacks the XTEN are each
administered to comparable subjects, the fusion protein achieves a terminal half-life in the subject that is
at least about 3-fold, or at least 4-fold, or at least 5-fold, or at least 10-fold, or at least 15-fold, or at least
20-fold longer compared to the corresponding GLP-2 that lacks the XTEN. In another embodiment, the
GLP2-XTEN fusion protein is characterized in that when a 2-fold, or 3-fold, or 4-fold, or 5-fold, or 6-
fold smaller amount, in nmoles/kg, of the fusion protein than the corresponding GLP-2 that lacks the
XTEN are each administered to comparable subjects with a gastrointestinal condition, the fusion protein
achieves a comparable therapeutic effect in the subject as the corresponding GLP-2 that lacks the XTEN.
In another embodiment, the GLP2-XTEN fusion protein is characterized in that when the fusion protein
is administered to a subject in consecutive doses using a dose interval that is at least about 2-fold,
or at least 3-fold, or at least 4-fold, or at least 5-fold, or at least 10-fold, or at least 15-fold, or at
least 20-fold longer as compared to a dose interval for the corresponding GLP-2 that lacks the XTEN and
is administered to a comparable subject using an otherwise equivalent nmoles/kg amount, the fusion
protein achieves a similar blood concentration in the subject as compared to the corresponding GLP-2
that lacks the XTEN. In another embodiment, the GLP2-XTEN fusion protein is characterized in that
when the fusion protein is administered to a subject in consecutive doses using a dose
interval that is at least about 3-fold, or at least 4-fold, or at least 5-fold, or at least 10-fold, or at least 15-fold,
or at least 20-fold longer as compared to a dose interval for the corresponding GLP-2 that lacks the
XTEN and is administered to a comparable subject using an otherwise equivalent nmoles/kg amount, the
fusion protein achieves a comparable therapeutic effect in the subject as the corresponding GLP-2 that
lacks the XTEN. In another embodiment, the GLP2-XTEN fusion protein exhibits any combination of,
or all of the foregoing characteristics of this paragraph. In the embodiments of this paragraph, the
subject to which the subject composition is administered can include but is not limited to mouse, rat,
monkey, and human. In one embodiment, the subject is rat. In another embodiment, the subject is
human.
In one embodiment, the administration of a GLP2-XTEN fusion protein to a subject results in a
greater therapeutic effect compared to the effect seen with the corresponding GLP-2 not linked to XTEN.
In another embodiment, the administration of an effective amount of the fusion protein results in a greater
therapeutic effect in a subject with enteritis compared to the corresponding GLP-2 not linked to XTEN
and administered to a comparable subject using a comparable nmoles/kg amount. In the foregoing, the
subject is selected from the group consisting of mouse, rat, monkey, and human. In one embodiment of
the foregoing, the subject is human and the enteritis is Crohn's disease. In another embodiment of the
foregoing, the subject is a rat and the enteritis is induced with indomethacin. In the foregoing
embodiments of this paragraph, the greater therapeutic effect is selected from the group consisting of
body weight gain, small intestine length, reduction in TNFα content of the small intestine,
reduced mucosal atrophy, reduced incidence of perforated ulcers, and height of villi. In one embodiment,
the administration of a GLP2-XTEN fusion protein to a subject results in an increase in small intestine
weight of at least about 10%, or at least about 20%, or at least about 30%, or at least about 40% greater
compared to that of the corresponding GLP-2 not linked to XTEN. In another embodiment of the
administration of a GLP2-XTEN fusion protein to a subject, the administration results in an increase in
small intestine length of at least about 5%, or at least about 6%, or at least about 7%, or at least about 8%,
or at least about 9%, or at least about 10%, or at least about 20%, or at least about 30%, or at least about
40% greater compared to that of the corresponding GLP-2 not linked to XTEN. In another embodiment
of the administration of a GLP2-XTEN fusion protein to a subject, the administration results in an
increase in body weight of at least about 5%, or at least about 6%, or at least about 7%, or at least about
8%, or at least about 9%, or at least about 10%, or at least about 20%, or at least about 30%, or at least
about 40% greater compared to that of the corresponding GLP-2 not linked to XTEN. In another
embodiment of the administration of a GLP2-XTEN fusion protein to a subject, the administration results
in a reduction in TNFα content of at least about 0.5 ng/g, or at least about 0.6 ng/g, or at least about 0.7
ng/g, or at least about 0.8 ng/g, or at least about 0.9 ng/g, or at least about 1.0 ng/g, or at least about 1.1
ng/g, or at least about 1.2 ng/g, or at least about 1.3 ng/g, or at least about 1.4 ng/g of small intestine
tissue or greater compared to that of the corresponding GLP-2 not linked to XTEN. In another
embodiment of the administration of a GLP2-XTEN fusion protein to a subject, the administration results
in an increase in villi height of at least about 5%, or at least about 6%, or at least about 7%, or at least
about 8%, or at least about 9%, or at least about 10%, or at least about 11%, or at least about 12% greater
compared to that of the corresponding GLP-2 not linked to XTEN. In the foregoing embodiments of this
paragraph, the fusion protein is administered as 1, or 2, or 3, or 4, or 5, or 6, or 10, or 12 or more
consecutive doses, wherein the dose amount is at least about 5, or at least about 10, or at least about 25, or
at least about 100, or at least about 200 nmoles/kg.
In one embodiment, the GLP2-XTEN recombinant fusion protein comprises a GLP-2 linked to
the XTEN via a cleavage sequence that is cleavable by a mammalian protease including but not limited to
factor XIa, factor XIIa, kallikrein, factor VIIa, factor IXa, factor Xa, factor IIa (thrombin), elastase-2,
MMP-12, MMP13, MMP-17 and MMP-20, wherein cleavage at the cleavage sequence by the
mammalian protease releases the GLP-2 sequence from the XTEN sequence, and wherein the released
GLP-2 sequence exhibits an increase in receptor binding activity of at least about 30% compared to the
uncleaved fusion protein.
The present invention provides methods of producing the GLP2-XTEN fusion proteins. In some
embodiments, the method of producing a fusion protein comprising GLP-2 fused to one or more
extended recombinant polypeptides (XTEN) comprises providing a host cell comprising a recombinant
nucleic acid encoding the fusion protein of any of the embodiments described herein; culturing the host
cell under conditions permitting the expression of the fusion protein; and recovering the fusion protein.
In one embodiment of the method, the host cell is a prokaryotic cell. In another embodiment of the
method, the host cell is E. coli. In another embodiment of the method, the fusion protein is recovered
from the host cell cytoplasm in substantially soluble form. In another embodiment of the method, the
recombinant nucleic acid molecule has a sequence with at least 90%, or at least about 91%, or at least about
92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least
about 97%, or at least about 98%, or at least about 99%, or about 100% sequence identity to a sequence
selected from the group consisting of the DNA sequences set forth in Table 13, when optimally aligned,
or the complement thereof.
The present invention provides isolated nucleic acids encoding the GLP2-XTEN fusion proteins,
vectors, and host cells comprising the vectors and nucleic acids. In one embodiment, the invention
provides an isolated nucleic acid comprising a nucleic acid sequence that has at least 70%, or at least
about 80%, or at least about 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at
least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about
98%, or at least about 99%, or 100% sequence identity to a DNA sequence selected from Table 13, or the
complement thereof. In another embodiment, the invention provides a nucleotide sequence encoding the
fusion protein of any of the fusion protein embodiments described herein, or the complement thereof. In
another embodiment, the invention provides an expression vector or isolated host cell comprising the
nucleic acid of the foregoing embodiments of this paragraph. In another embodiment, the invention
provides a host cell comprising the foregoing expression vector.
Additionally, the present invention provides pharmaceutical compositions comprising the fusion
protein of any of the foregoing embodiments described herein and a pharmaceutically acceptable carrier.
In addition, the present invention provides pharmaceutical compositions comprising the fusion protein of
any of the foregoing embodiments described herein for use in treating a gastrointestinal condition in a
subject. In one embodiment, administration of a therapeutically effective amount of the pharmaceutical
composition to a subject with a gastrointestinal condition results in maintaining blood concentrations of
the fusion protein within a therapeutic window for the fusion protein at least three-fold longer compared
to the corresponding GLP-2 not linked to the XTEN and administered at a comparable amount to the
subject. In another embodiment, administration of three or more doses of the pharmaceutical
composition to a subject with a gastrointestinal condition using a therapeutically-effective dose regimen
results in a gain in time of at least four-fold between at least two consecutive Cmax peaks and/or Cmin
troughs for blood levels of the fusion protein compared to the corresponding GLP-2 not linked to the
XTEN and administered using a comparable dose regimen to a subject. In another embodiment, the
intravenous, subcutaneous, or intramuscular administration of the pharmaceutical composition
comprising at least about 5, or at least about 10, or at least about 25, or at least about 100, or at least about 200
nmoles/kg of the fusion protein to a subject results in fusion protein blood levels maintained above 1000
ng/ml for at least 72 hours. In the foregoing embodiments of the paragraph, the gastrointestinal condition
is selected from the group consisting of gastritis, digestion disorders, malabsorption syndrome, short-gut
syndrome, short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease,
tropical sprue, hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis,
chemotherapy-induced enteritis, irritable bowel syndrome, small intestine damage, small intestinal
damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency,
acid-induced intestinal injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness,
febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia,
gastrointestinal barrier disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased
gastrointestinal motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel
ischemia, mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal
feeding intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral
nutrition damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis,
radiation-induced injury to the intestines, mucositis, pouchitis, and gastrointestinal ischemia. In the
foregoing embodiments of the paragraph, the subject is selected from mouse, rat, monkey and human.
In another embodiment, the present invention provides a GLP2-XTEN fusion protein according
to any of the embodiments described herein for use in the preparation of a medicament for the treatment
of a gastrointestinal condition described herein.
The present invention provides GLP2-XTEN fusion proteins according to any of the
embodiments described herein for use in a method of treating a gastrointestinal condition in a subject,
comprising administering to the subject a therapeutically effective amount of the fusion protein. In one
embodiment, the gastrointestinal condition is selected from the group consisting of gastritis, digestion
disorders, malabsorption syndrome, short-gut syndrome, short bowel syndrome, cul-de-sac syndrome,
inflammatory bowel disease, celiac disease, tropical sprue, hypogammaglobulinemic sprue, Crohn's
disease, ulcerative colitis, enteritis, chemotherapy-induced enteritis, irritable bowel syndrome, small
intestine damage, small intestinal damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal
diseases, intestinal insufficiency, acid-induced intestinal injury, arginine deficiency, idiopathic
hypospermia, obesity, catabolic illness, febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune
diseases, food allergies, hypoglycemia, gastrointestinal barrier disorders, sepsis, bacterial peritonitis,
burn-induced intestinal damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-
associated bacteremia, bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing
enterocolitis, necrotizing pancreatitis, neonatal feeding intolerance, NSAID-induced gastrointestinal
damage, nutritional insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal
nutritional insufficiency, radiation-induced enteritis, radiation-induced injury to the intestines, mucositis,
pouchitis, and gastrointestinal ischemia. In another embodiment of the fusion protein for use in a method
of treating a gastrointestinal condition in a subject, administration of two or more consecutive doses of
the fusion protein administered using a therapeutically effective dose regimen to a subject results in a
prolonged period between consecutive Cmax peaks and/or Cmin troughs for blood levels of the fusion
protein compared to the corresponding GLP-2 that lacks the XTEN and administered using a
therapeutically effective dose regimen established for the GLP-2. In another embodiment of the fusion
protein for use in a method of treating a gastrointestinal condition in a subject, administration of a smaller
amount in nmoles/kg of the fusion protein to a subject in comparison to the corresponding GLP-2 that
lacks the XTEN, when administered to a subject under an otherwise equivalent dose regimen, results in
the fusion protein achieving a comparable therapeutic effect as the corresponding GLP-2 that lacks the
XTEN. In the foregoing, the therapeutic effect is selected from the group consisting of blood
concentrations of GLP-2, increased mesenteric blood flow, decreased inflammation, increased weight
gain, decreased diarrhea, decreased fecal wet weight, intestinal wound healing, increase in plasma
citrulline concentrations, decreased CRP levels, decreased requirement for steroid therapy, enhancing or
stimulating mucosal integrity, decreased sodium loss, minimizing, mitigating, or preventing bacterial
translocation in the intestines, enhancing, stimulating or accelerating recovery of the intestines after
surgery, preventing relapses of inflammatory bowel disease, and maintaining energy homeostasis.
The present invention provides GLP2-XTEN fusion proteins according to any of the
embodiments described herein for use in a pharmaceutical regimen for treatment of a gastrointestinal
condition in a subject. In one embodiment, the pharmaceutical regimen comprises a pharmaceutical
composition comprising the GLP2-XTEN fusion protein. In another embodiment, the pharmaceutical
regimen further comprises the step of determining the amount of pharmaceutical composition needed to
achieve a therapeutic effect in the subject, wherein the therapeutic effect is selected from the group
consisting of increased mesenteric blood flow, decreased inflammation, increased weight gain, decreased
diarrhea, decreased fecal wet weight, intestinal wound healing, increase in plasma citrulline
concentrations, decreased CRP levels, decreased requirement for steroid therapy, enhanced mucosal
integrity, decreased sodium loss, preventing bacterial translocation in the intestines, accelerated recovery
of the intestines after surgery, prevention of relapses of inflammatory bowel disease, and maintaining
energy homeostasis. In another embodiment, the pharmaceutical regimen comprises administering the
pharmaceutical composition in two or more successive doses to the subject at an effective amount,
wherein the administration results in at least a 5%, or 10%, or 20%, or 30%, or 40%, or 50%, or 60%, or
70%, or 80%, or 90% greater improvement of at least one, two, or three parameters associated with the
gastrointestinal condition compared to the GLP-2 not linked to XTEN and administered using a
comparable nmol/kg amount. In one embodiment of the foregoing, the parameter improved is selected
from increased blood concentrations of GLP-2, increased mesenteric blood flow, decreased
inflammation, increased weight gain, decreased diarrhea, decreased fecal wet weight, intestinal wound
healing, increase in plasma citrulline concentrations, decreased CRP levels, decreased requirement for
steroid therapy, enhanced mucosal integrity, decreased sodium loss, preventing bacterial translocation in
the intestines, accelerated recovery of the intestines after surgery, prevention of relapses of inflammatory
bowel disease, and maintaining energy homeostasis. In another embodiment, the pharmaceutical regimen
comprises administering a therapeutically effective amount of the pharmaceutical composition once
every 7, or 10, or 14, or 21, or 28 or more days. In an embodiment of the foregoing, the effective amount
is at least about 5, or at least about 10, or at least about 25, or at least about 100, or at least about 200 nmoles/kg.
In the embodiments of the regimen, the administration is subcutaneous, intramuscular, or intravenous.
The present invention provides methods of treating a gastrointestinal condition in a subject. In
some embodiments, the method comprises administering to said subject a composition comprising an
effective amount of a pharmaceutical composition comprising a GLP2-XTEN fusion protein described
herein. In one embodiment of the method, the effective amount is at least about 5, or at least about 10, or
at least about 25, or at least about 100, or at least about 200 nmoles/kg. In another embodiment of the method,
administration of the pharmaceutical composition is subcutaneous, intramuscular, or intravenous. In
another embodiment of the method, administration of the effective amount results in the fusion protein
exhibiting a terminal half-life of greater than about 30 hours in the subject, wherein the subject is selected
from the group consisting of mouse, rat, monkey, and human. In the foregoing embodiments, the
gastrointestinal condition is selected from the group consisting of gastritis, digestion disorders,
malabsorption syndrome, short-gut syndrome, short bowel syndrome, cul-de-sac syndrome,
inflammatory bowel disease, celiac disease, tropical sprue, hypogammaglobulinemic sprue, Crohn's
disease, ulcerative colitis, enteritis, chemotherapy-induced enteritis, irritable bowel syndrome, small
intestine damage, small intestinal damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal
diseases, intestinal insufficiency, acid-induced intestinal injury, arginine deficiency, idiopathic
hypospermia, obesity, catabolic illness, febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune
diseases, food allergies, hypoglycemia, gastrointestinal barrier disorders, sepsis, bacterial peritonitis,
burn-induced intestinal damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-
associated bacteremia, bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing
enterocolitis, necrotizing pancreatitis, neonatal feeding intolerance, NSAID-induced gastrointestinal
damage, nutritional insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal
nutritional insufficiency, radiation-induced enteritis, radiation-induced injury to the intestines, mucositis,
pouchitis, and gastrointestinal ischemia. In another embodiment of the method, the method is used to
treat a subject with small intestinal damage due to chemotherapeutic agents such as, but not limited to,
5-FU, altretamine, bleomycin, busulfan, capecitabine, carboplatin, carmustine, chlorambucil, cisplatin,
cladribine, crisantaspase, cyclophosphamide, cytarabine, dacarbazine, dactinomycin, daunorubicin,
docetaxel, doxorubicin, epirubicin, etoposide, fludarabine, fluorouracil, gemcitabine, hydroxycarbamide,
idarubicin, ifosfamide, irinotecan, liposomal doxorubicin, leucovorin, lomustine, melphalan,
mercaptopurine, mesna, methotrexate, mitomycin, mitoxantrone, oxaliplatin, paclitaxel, pemetrexed,
pentostatin, procarbazine, raltitrexed, streptozocin, tegafur-uracil, temozolomide, thiotepa, tioguanine,
thioguanine, topotecan, treosulfan, vinblastine, vincristine, vindesine, and vinorelbine. In another
embodiment of the method, administration of the pharmaceutical composition results in an
intestinotrophic effect in said subject. In yet another embodiment of the method, administration of the
pharmaceutical composition results in an intestinotrophic effect in said subject, wherein the
intestinotrophic effect is at least about 30%, or at least about 40%, or at least about 50%, or at least about
60%, or at least about 70%, or at least about 80%, or at least about 90%, or at least about 100% or at least
about 120% or at least about 150% or at least about 200% of the intestinotrophic effect compared to the
corresponding GLP-2 not linked to XTEN and administered to a subject using a comparable dose. In one
embodiment of the foregoing, the intestinotrophic effect is determined after administration of 1 dose, or 3
doses, or 6 doses, or 10 doses, or 12 or more doses of the fusion protein. In another embodiment of the
foregoing, the intestinotrophic effect is selected from the group consisting of intestinal growth, increased
hyperplasia of the villus epithelium, increased crypt cell proliferation, increased height of the crypt and
villus axis, increased healing after intestinal anastomosis, increased small bowel weight, increased small
bowel length, decreased small bowel epithelium apoptosis, and enhancement of intestinal function.
In another embodiment, the present invention provides kits, comprising packaging material and
at least a first container comprising the pharmaceutical composition comprising a GLP2-XTEN fusion
protein described herein and a sheet of instructions for the reconstitution and/or administration of the
pharmaceutical compositions to a subject.
The following are non-limiting exemplary embodiments of the invention:
Item 1. A recombinant fusion protein comprising a glucagon-like protein-2 (GLP-2) and an extended
recombinant polypeptide (XTEN), wherein the XTEN is characterized in that:
(a) the XTEN comprises at least 36 amino acid residues;
(b) the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and
proline (P) residues constitutes more than about 80% of the total amino acid residues of the XTEN;
(c) the XTEN is substantially non-repetitive such that (i) the XTEN contains no three
contiguous amino acids that are identical unless the amino acids are serine; (ii) at least about 80% of the
XTEN sequence consists of non-overlapping sequence motifs, each of the sequence motifs comprising
about 9 to about 14 amino acid residues consisting of four to six amino acids selected from glycine (G),
alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), wherein any two contiguous amino
acid residues do not occur more than twice in each of the non-overlapping sequence motifs; or (iii) the
XTEN sequence has a subsequence score of less than 10;
(d) the XTEN has greater than 90% random coil formation as determined by GOR
algorithm;
(e) the XTEN has less than 2% alpha helices and 2% beta-sheets as determined by
Chou-Fasman algorithm; and
(f) the XTEN lacks a predicted T-cell epitope when analyzed by TEPITOPE algorithm,
wherein the TEPITOPE threshold score for said prediction by said algorithm has a threshold of -9,
wherein said fusion protein exhibits an apparent molecular weight factor of at least about 4 and exhibits
an intestinotrophic effect when administered to a subject using a therapeutically effective amount.
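The purely compositional criteria of Item 1, namely (a), (b) and (c)(i), can be illustrated with a minimal Python sketch. The function name, the returned keys and the 36-residue toy sequence are hypothetical and are not taken from any table of the specification; criteria (c)(ii), (c)(iii) and (d) through (f) depend on the subsequence score and on the GOR, Chou-Fasman and TEPITOPE algorithms and are not modelled here.

```python
from itertools import groupby

# The six residue types named in criterion (b) of Item 1.
XTEN_RESIDUES = set("GASTEP")


def xten_composition_checks(seq, min_length=36, min_gastep_fraction=0.80):
    """Check criteria (a), (b) and (c)(i) of Item 1 for a candidate sequence.

    Hypothetical helper for illustration only; it does not evaluate the
    algorithm-based criteria of Item 1.
    """
    seq = seq.upper()
    # (a) at least 36 amino acid residues
    length_ok = len(seq) >= min_length
    # (b) G, A, S, T, E and P together exceed about 80% of all residues
    fraction = sum(res in XTEN_RESIDUES for res in seq) / len(seq) if seq else 0.0
    # (c)(i) no three contiguous identical residues unless they are serine
    no_triplets = all(
        res == "S" or len(list(run)) < 3 for res, run in groupby(seq)
    )
    return {
        "length_ok": length_ok,
        "gastep_fraction": fraction,
        "gastep_ok": fraction > min_gastep_fraction,
        "no_identical_triplets": no_triplets,
    }


# Toy 36-residue sequence built only from G, A, S, T, E, P (an assumption,
# not an XTEN sequence from the specification).
TOY_XTEN = "GSEPATSGSETPGTSESATPESGPGSEPATSGSETP"
result = xten_composition_checks(TOY_XTEN)
```

A sequence containing a run such as AAA would fail the (c)(i) check in this sketch, whereas a run of SSS would pass, reflecting the serine exception stated in the item.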
Item 2. The recombinant fusion protein of Item 1, wherein the intestinotrophic effect is at least about
30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least
about 80%, or at least about 90%, or at least about 100% or at least about 120% or at least about 150% or
at least about 200% of the intestinotrophic effect compared to the corresponding GLP-2 not linked to
XTEN when the corresponding GLP-2 is administered to a subject using a comparable dose.
Item 3. The recombinant fusion protein of Item 1, wherein the subject is selected from the group
consisting of mouse, rat, monkey, and human.
Item 4. The recombinant fusion protein of any one of the preceding items, wherein said
administration is subcutaneous, intramuscular, or intravenous.
Item 5. The recombinant fusion protein of any one of the preceding items, wherein the
intestinotrophic effect is determined after administration of 1 dose, or 3 doses, or 6 doses, or 10 doses, or
12 or more doses of the fusion protein.
Item 6. The recombinant fusion protein of any one of the preceding items, wherein the
intestinotrophic effect is selected from the group consisting of intestinal growth, increased hyperplasia of
the villus epithelium, increased crypt cell proliferation, increased height of the crypt and villus axis,
increased healing after intestinal anastomosis, increased small bowel weight, increased small bowel
length, decreased small bowel epithelium apoptosis, and enhancement of intestinal function.
Item 7. The recombinant fusion protein of Item 6, wherein the administration results in an increase in
small intestine weight of at least about 10%, or at least about 20%, or at least about 30%.
Item 8. The recombinant fusion protein of Item 6, wherein the administration results in an increase in
small intestine length of at least about 5%, or at least about 6%, or at least about 7%, or at least about 8%,
or at least about 9%, or at least about 10%, or at least about 20%, or at least about 30%.
Item 9. The recombinant fusion protein of any one of the preceding items, wherein the GLP-2
sequence has at least 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least
about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or
at least about 99%, or 100% sequence identity to a sequence selected from the group consisting of the
sequences in Table 1, when optimally aligned.
Item 10. The recombinant fusion protein of any one of the preceding items, wherein the GLP-2
comprises human GLP-2.
Item 11. The recombinant fusion protein of any one of Item 9-Item 11, wherein the GLP-2 is selected
from the group consisting of bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2.
Item 12. The recombinant fusion protein of any one of the preceding items, wherein the GLP-2 has an
amino acid substitution in place of Ala2, and wherein the substitution is glycine.
Item 13. The recombinant fusion protein of any one of Item 1-Item 9, wherein the GLP-2 has the
sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
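As an illustration of how the sequence of Item 13 relates to the glycine substitution of Item 12, the following Python sketch lists the positions at which the Item 13 sequence differs from native human GLP-2. The native reference sequence is supplied here as an assumption, since Table 1 is not reproduced in this passage, and the helper name is hypothetical.

```python
# Assumption: native human GLP-2 (33 residues); not quoted from this passage.
NATIVE_HUMAN_GLP2 = "HADGSFSDEMNTILDNLAARDFINWLIQTKITD"
# Sequence recited in Item 13.
ITEM_13_SEQUENCE = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"


def substitutions(reference, variant):
    """Return (position, reference residue, variant residue) for each mismatch.

    Positions are 1-based. Only equal-length sequences are compared, so no
    alignment step is performed in this sketch.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (index + 1, ref, var)
        for index, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


differences = substitutions(NATIVE_HUMAN_GLP2, ITEM_13_SEQUENCE)
```

Under the stated assumption about the reference sequence, the only difference found is at position 2, where alanine is replaced by glycine.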
Item 14. The recombinant fusion protein of any one of the preceding items, wherein the XTEN is linked
to the C-terminus of the GLP-2.
Item 15. The recombinant fusion protein of Item 14, further comprising a spacer sequence of 1 to
about 50 amino acid residues linking the GLP-2 and XTEN components.
Item 16. The recombinant fusion protein of Item 15, wherein the spacer sequence is a glycine residue.
Item 17. The recombinant fusion protein of any one of the preceding items, wherein the XTEN is
characterized in that:
(a) the total XTEN amino acid residues is at least 36 to about 3000 amino acid residues;
(b) the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and
proline (P) residues constitutes at least about 90% of the total amino acid residues of the XTEN;
Item 18. The recombinant fusion protein of any one of the preceding items, wherein the XTEN is
characterized in that the sum of asparagine and glutamine residues is less than 10% of the total amino
acid sequence of the XTEN.
Item 19. The recombinant fusion protein of any one of the preceding items, wherein the XTEN is
characterized in that the sum of methionine and tryptophan residues is less than 2% of the total amino
acid sequence of the XTEN.
Item 20. The recombinant fusion protein of any one of the preceding items, wherein the XTEN has at
least 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at
least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about
99%, or about 100% sequence identity when compared to a sequence of comparable length selected from
any one of Table 4, Table 8, Table 9, Table 10, Table 11, and Table 12, when optimally aligned.
Item 21. The recombinant fusion protein of any one of the preceding items, wherein the XTEN has at
least 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at
least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about
99%, or about 100% sequence identity when compared to an AE864 sequence from Table 4, when
optimally aligned.
Item 22. The recombinant fusion protein of any one of Item 1-Item 9 or Item 13, wherein the fusion
protein sequence has a sequence with at least 90%, or at least about 91%, or at least about 92%, or at
least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about
97%, or at least about 98%, or at least about 99%, or 100% sequence identity to the sequence set forth in
.
Item 23. The recombinant fusion protein of any one of the preceding items, wherein the fusion protein
exhibits a terminal half-life that is at least about 30 hours when administered to a subject.
Item 24. The recombinant fusion protein of any one of the preceding items, wherein the fusion protein
binds to a GLP-2 receptor with an EC50 of less than about 30 nM, or about 100 nM, or about 200 nM, or
about 300 nM, or about 370 nM, or about 400 nM, or about 500 nM, or about 600 nM, or about 700 nM,
or about 800 nM, or about 1000 nM, or about 1200 nM, or about 1400 nM when assayed using an in vitro
GLP2R cell assay wherein the GLP2R cell is a human recombinant GLP-2 glucagon family receptor
calcium-optimized cell.
Item 25. The recombinant fusion protein of any one of the preceding items, wherein the fusion protein
retains at least about 1%, or about 2%, or about 3%, or about 4%, or about 5%, or about 10%, or about
%, or about 30% of the potency of the corresponding GLP-2 not linked to XTEN when assayed using
an in vitro GLP2R cell assay wherein the GLP2R cell is a human recombinant GLP-2 glucagon family
receptor calcium-optimized cell.
Item 26. The recombinant fusion protein of any one of the preceding items, characterized in that
(a) when an equivalent amount, in nmoles/kg, of the fusion protein and the
corresponding GLP-2 that lacks the XTEN are each administered to comparable subjects, the fusion
protein achieves a terminal half-life in the subject that is at least about 3-fold, or at least 4-fold, or at least
5-fold, or at least 10-fold, or at least 15-fold, or at least 20-fold longer compared to the corresponding
GLP-2 that lacks the XTEN;
(b) when a 2-fold, or 3-fold, or 4-fold, or 5-fold, or 6-fold smaller amount, in nmoles/kg,
of the fusion protein than the corresponding GLP-2 that lacks the XTEN are each administered to
comparable subjects with a gastrointestinal condition, the fusion protein achieves a comparable
therapeutic effect in the subject as the corresponding GLP-2 that lacks the XTEN;
(c) when the fusion protein is administered to a subject in consecutive doses to a subject
using a dose interval that is at least about 2-fold, or at least 3-fold, or at least 4-fold, or at least 5-fold, or
at least 10-fold, or at least 15-fold, or at least 20-fold longer as compared to a dose interval for the
corresponding GLP-2 that lacks the XTEN and is administered to a comparable subject using an
otherwise equivalent nmoles/kg amount, the fusion protein achieves a similar blood concentration in the
subject as compared to the corresponding GLP-2 that lacks the XTEN; or
(d) when the fusion protein is administered to a subject in consecutive doses to a subject
using a dose interval that is at least about 3-fold, or at least 4-fold, or at least 5-fold, or at least 10-fold, or
at least 15-fold, or at least 20-fold longer as compared to a dose interval for the corresponding GLP-2 that
lacks the XTEN and is administered to a comparable subject using an otherwise equivalent nmoles/kg
amount, the fusion protein achieves a comparable therapeutic effect in the subject as the corresponding
GLP-2 that lacks the XTEN.
Item 27. The recombinant fusion protein of Item 26, wherein the subject is selected from the group
consisting of mouse, rat, monkey, and human.
Item 28. The recombinant fusion protein of Item 27, wherein the subject is a rat.
Item 29. The recombinant fusion protein of any one of Item 26-Item 28, wherein the administration
results in a greater therapeutic effect compared to the effect seen with the corresponding GLP-2 not
linked to XTEN.
Item 30. The recombinant fusion protein of any one of Item 26-Item 29, wherein administration of an
effective amount of the fusion protein results in a greater therapeutic effect in a subject with enteritis
compared to the corresponding GLP-2 not linked to XTEN when the corresponding GLP-2 is
administered to a comparable subject using a comparable nmoles/kg amount.
Item 31. The recombinant fusion protein of any one of Item 26-Item 30, wherein the subject is
selected from the group consisting of mouse, rat, monkey, and human.
Item 32. The recombinant fusion protein of Item 31, wherein the subject is human and the enteritis is
Crohn’s disease.
Item 33. The recombinant fusion protein of Item 31, wherein the subject is a rat subject and the enteritis
is induced with indomethacin.
Item 34. The recombinant fusion protein of any one of Item 29-Item 33, wherein the greater
therapeutic effect is selected from the group consisting of body weight gain, small intestine length,
reduction in TNFα content of the small intestine tissue, reduced mucosal atrophy, reduced incidence of
perforated ulcers, and height of villi.
Item 35. The recombinant fusion protein of Item 34, wherein the administration results in an increase
in small intestine weight of at least about 10%, or at least about 20%, or at least about 30%, or at least
about 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
Item 36. The recombinant fusion protein of Item 34, wherein the administration results in an increase
in small intestine length of at least about 5%, or at least about 6%, or at least about 7%, or at least about
8%, or at least about 9%, or at least about 10%, or at least about 20%, or at least about 30%, or at least
about 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
Item 37. The recombinant fusion protein of Item 34, wherein the administration results in an increase
in body weight of at least about 5%, or at least about 6%, or at least about 7%, or at least about 8%, or at
least about 9%, or at least about 10%, or at least about 20%, or at least about 30%, or at least about 40%
greater compared to that of the corresponding GLP-2 not linked to XTEN.
Item 38. The recombinant fusion protein of Item 34, wherein the reduction in TNFα content is at least
about 0.5 ng/g, or at least about 0.6 ng/g, or at least about 0.7 ng/g, or at least about 0.8 ng/g, or at least
about 0.9 ng/g, or at least about 1.0 ng/g, or at least about 1.1 ng/g, or at least about 1.2 ng/g, or at least
about 1.3 ng/g, or at least about 1.4 ng/g of small intestine tissue or greater compared to that of the
corresponding GLP-2 not linked to XTEN.
Item 39. The recombinant fusion protein of Item 34, wherein the villi height is at least about 5%, or at
least about 6%, or at least about 7%, or at least about 8%, or at least about 9%, or at least about 10%, or
at least about 11%, or at least about 12% greater compared to that of the corresponding GLP-2 not linked
to XTEN.
Item 40. The recombinant fusion protein of any one of Item 29-Item 39, wherein the fusion protein is
administered as 1, or 2, or 3, or 4, or 5, or 6, or 10, or 12 or more consecutive doses.
Item 41. The recombinant fusion protein of any one of Item 30-Item 40, wherein the effective amount
is at least about 5, or at least about 10, or at least about 25, or at least about 100, or at least about 200 nmoles/kg.
Item 42. The recombinant fusion protein of any one of the preceding items, wherein the GLP-2 is
linked to the XTEN via a cleavage sequence that is cleavable by a mammalian protease selected from the
group consisting of factor XIa, factor XIIa, kallikrein, factor VIIa, factor IXa, factor Xa, factor IIa
(thrombin), Elastase-2, MMP-12, MMP-13, MMP-17 and MMP-20, wherein cleavage at the cleavage
sequence by the mammalian protease releases the GLP-2 sequence from the XTEN sequence, and
wherein the released GLP-2 sequence exhibits an increase in receptor binding activity of at least about
% compared to the uncleaved fusion protein.
Item 43. A method of producing a fusion protein comprising GLP-2 fused to one or more extended
recombinant polypeptides (XTEN), comprising:
(a) providing a host cell comprising a recombinant nucleic acid encoding the fusion
protein of any one of Item 1 to Item 41;
(b) culturing the host cell under conditions permitting the expression of the fusion
protein; and
(c) recovering the fusion protein.
Item 44. The method of Item 43, wherein:
(a) the host cell is a prokaryotic cell; or
(b) the fusion protein is recovered from the host cell cytoplasm in substantially soluble
form.
Item 45. The method of Item 43, wherein the recombinant nucleic acid molecule has a sequence with
at least 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or
at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about
99%, or about 100% sequence identity to a sequence selected from the group consisting of the DNA
sequences set forth in Table 13, when optimally aligned, or the complement thereof.
Item 46. An isolated nucleic acid comprising:
(a) a nucleic acid sequence that has at least 70%, or at least about 80%, or at least about
90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least
about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, or
about 100% sequence identity to a DNA sequence selected from Table 13, or the complement thereof; or
(b) a nucleotide sequence encoding the fusion protein of any one of Item 1-Item 41, or the
complement thereof.
Item 47. An expression vector or isolated host cell comprising the nucleic acid of any one of Item 43-
Item 46.
Item 48. A host cell comprising the expression vector of Item 47.
Item 49. A pharmaceutical composition comprising the fusion protein of any one of Item 1-Item 41, and a
pharmaceutically acceptable carrier.
Item 50. The recombinant fusion protein of Item 1 configured according to formula V:
(a) (GLP-2)-(S)x-(XTEN) (V)
wherein independently for each occurrence,
(b) GLP-2 is a sequence having at least 90%, or at least about 91%, or at least about
92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least
about 97%, or at least about 98%, or at least about 99%, or about 100% sequence identity to a sequence
selected from the group consisting of the sequences in Table 1, when optimally aligned;
(c) S is a spacer sequence having between 1 to about 50 amino acid residues that can
optionally include a cleavage sequence from Table 6 or amino acids compatible with restriction sites;
(d) x is either 0 or 1;
Item 51. The recombinant fusion protein of Item 50, wherein the GLP-2 comprises human GLP-2.
Item 52. The recombinant fusion protein of Item 50, wherein the GLP-2 is selected from the group
consisting of bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2.
Item 53. The recombinant fusion protein of Item 51 or Item 52, wherein the GLP-2 has an amino
acid substitution in place of Ala2, and wherein the substitution is glycine.
Item 54. The recombinant fusion protein of Item 50, wherein the GLP-2 has the sequence
HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
Item 55. The recombinant fusion protein of any one of Item 50-Item 54, comprising a spacer sequence
wherein the spacer sequence is a glycine residue.
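The N- to C-terminal arrangement of formula V, (GLP-2)-(S)x-(XTEN), can be sketched in Python as a simple concatenation in which the spacer is present only when x is 1. The function name is hypothetical, and the XTEN placeholder is a toy sequence rather than a sequence from Table 4; the GLP-2 sequence is the one recited in Item 54 and the spacer is the single glycine of Item 55.

```python
def assemble_formula_v(glp2, xten, spacer="", x=0):
    """Concatenate components as (GLP-2)-(S)x-(XTEN), N- to C-terminus.

    Hypothetical helper: x selects whether the spacer S is present, and a
    present spacer must be 1 to 50 residues long, following Item 50.
    """
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if x == 1 and not 1 <= len(spacer) <= 50:
        raise ValueError("spacer must have 1 to 50 residues when x is 1")
    return glp2 + (spacer if x == 1 else "") + xten


# GLP-2 sequence as recited in Item 54.
GLP2_ITEM_54 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"
# Toy placeholder standing in for an XTEN (an assumption, not from Table 4).
TOY_XTEN = "GSEPATSGSETPGTSESATPESGPGSEPATSGSETP"

with_spacer = assemble_formula_v(GLP2_ITEM_54, TOY_XTEN, spacer="G", x=1)
without_spacer = assemble_formula_v(GLP2_ITEM_54, TOY_XTEN, x=0)
```

In this sketch the XTEN follows the C-terminus of the GLP-2, consistent with Item 14, and the two results differ only by the single glycine spacer.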
Item 56. The recombinant fusion protein of any one of Item 50-Item 55, wherein the XTEN has at least
90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least
about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, or
100% sequence identity when compared to a sequence of comparable length selected from any one of
Table 4, Table 8, Table 9, Table 10, Table 11, and Table 12, when optimally aligned.
Item 57. The recombinant fusion protein of any one of Item 50-Item 55, wherein the XTEN has at least
90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about 94%, or at least
about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%, or
100% sequence identity when compared to an AE864 sequence from Table 4, when optimally aligned.
Item 58. The pharmaceutical composition of Item 49, wherein administration of a therapeutically
effective amount of the pharmaceutical composition to a subject with a gastrointestinal condition results
in maintaining blood concentrations of the fusion protein within a therapeutic window for the fusion
protein at least three-fold longer compared to the corresponding GLP-2 not linked to the XTEN and
administered at a comparable amount to the subject.
Item 59. The pharmaceutical composition of Item 49, wherein administration of three or more doses
of the pharmaceutical composition to a subject with a gastrointestinal condition using a therapeutically-
effective dose regimen results in a gain in time of at least four-fold between at least two consecutive Cmax
peaks and/or Cmin troughs for blood levels of the fusion protein compared to the corresponding GLP-2 not
linked to the XTEN and administered using a comparable dose regimen to a subject.
Item 60. The pharmaceutical composition of Item 59 or Item 60, wherein the gastrointestinal
condition is selected from the group consisting of gastritis, digestion disorders, malabsorption syndrome,
short-gut syndrome, short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac
disease, tropical sprue, hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis,
chemotherapy-induced enteritis, irritable bowel syndrome, small intestine damage, small intestinal
damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency,
acid-induced intestinal injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness,
febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia,
gastrointestinal barrier disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased
gastrointestinal motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel
ischemia, mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal
feeding intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral
nutrition damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis,
radiation-induced injury to the intestines, mucositis, pouchitis, and gastrointestinal ischemia.
Item 61. The pharmaceutical composition of Item 49, wherein after intravenous, subcutaneous, or
intramuscular administration of the pharmaceutical composition comprising at least about 5, or at least
about 10, or at least about 25, or at least about 100, or at least about 200 nmoles/kg of the fusion protein to a
subject, the fusion protein blood levels are maintained above 1000 ng/mL for at least 72 hours.
Item 62. The pharmaceutical composition of Item 61, wherein the subject is selected from mouse, rat,
monkey and human.
Item 63. A recombinant fusion protein according to any one of Item 1-Item 41 for use in the manufacture
of a medicament for the treatment of a gastrointestinal condition.
Item 64. The recombinant fusion protein of Item 63 wherein the gastrointestinal condition is selected
from the group consisting of gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome,
short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue,
hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced
enteritis, irritable bowel syndrome, small intestine damage, small intestinal damage due to cancer-
chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency, acid-induced intestinal
injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness, febrile neutropenia,
diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia, gastrointestinal barrier
disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased gastrointestinal
motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel ischemia,
mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal feeding
intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral nutrition
damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis, radiation-
induced injury to the intestines, mucositis, pouchitis, ischemia, and stroke.
Item 65. A recombinant fusion protein according to any one of Item 1-Item 41 for use in a method of
treating a gastrointestinal condition in a subject, comprising administering to the subject a therapeutically
effective amount of the fusion protein.
Item 66. The recombinant fusion protein for use according to Item 65, wherein the
gastrointestinal condition is selected from the group consisting of gastritis, digestion disorders,
malabsorption syndrome, short-gut syndrome, short bowel syndrome, cul-de-sac syndrome,
inflammatory bowel disease, celiac disease, tropical sprue, hypogammaglobulinemic sprue, Crohn's
disease, ulcerative colitis, enteritis, chemotherapy-induced enteritis, irritable bowel syndrome, small
intestine damage, small intestinal damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal
diseases, intestinal insufficiency, acid-induced intestinal injury, arginine deficiency, idiopathic
hypospermia, obesity, catabolic illness, febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune
diseases, food allergies, hypoglycemia, gastrointestinal barrier disorders, sepsis, bacterial peritonitis,
burn-induced intestinal damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-
associated bacteremia, bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing
enterocolitis, necrotizing pancreatitis, neonatal feeding intolerance, NSAID-induced gastrointestinal
damage, nutritional insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal
nutritional insufficiency, radiation-induced enteritis, radiation-induced injury to the intestines, mucositis,
pouchitis, ischemia, and stroke.
Item 67. The recombinant fusion protein for use according to Item 65, wherein administration of
two or more consecutive doses of the fusion protein administered using a therapeutically effective dose
regimen to a subject results in a prolonged period between consecutive Cmax peaks and/or Cmin troughs for
blood levels of the fusion protein compared to the corresponding GLP-2 that lacks the XTEN and
administered using a therapeutically effective dose regimen established for the GLP-2.
Item 68. The recombinant fusion protein for use according to Item 65, wherein a smaller amount
in nmoles/kg of the fusion protein is administered to a subject in comparison to the corresponding GLP-2
that lacks the XTEN administered to a subject under an otherwise equivalent dose regimen, and the
fusion protein achieves a comparable therapeutic effect as the corresponding GLP-2 that lacks the XTEN.
Item 69. The recombinant fusion protein for use according to Item 68, wherein the therapeutic
effect is selected from the group consisting of increased blood concentrations of GLP-2, increased mesenteric blood
flow, decreased inflammation, increased weight gain, decreased diarrhea, decreased fecal wet weight,
intestinal wound healing, increase in plasma citrulline concentrations, decreased CRP levels, decreased
requirement for steroid therapy, enhancing or stimulating mucosal integrity, decreased sodium loss,
minimizing, mitigating, or preventing bacterial translocation in the intestines, enhancing, stimulating or
accelerating recovery of the intestines after surgery, preventing relapses of inflammatory bowel disease,
and maintaining energy homeostasis.
Item 70. A recombinant fusion protein for use in a pharmaceutical regimen for treatment of a
gastrointestinal condition in a subject, said regimen comprising a pharmaceutical composition comprising
the fusion protein of any one of Item 1-Item 41.
Item 71. The recombinant fusion protein of Item 70, wherein the pharmaceutical regimen further
comprises the step of determining the amount of pharmaceutical composition needed to achieve a
therapeutic effect in the subject, wherein the therapeutic effect is selected from the group consisting of
increased mesenteric blood flow, decreased inflammation, increased weight gain, decreased diarrhea,
decreased fecal wet weight, intestinal wound healing, increase in plasma citrulline concentrations,
decreased CRP levels, decreased requirement for steroid therapy, enhanced mucosal integrity, decreased
sodium loss, preventing bacterial translocation in the intestines, accelerated recovery of the intestines
after surgery, prevention of relapses of inflammatory bowel disease, and maintaining energy homeostasis.
Item 72. The recombinant fusion protein of Item 70, wherein the gastrointestinal condition is selected
from the group consisting of gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome,
short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue,
hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced
enteritis, irritable bowel syndrome, small intestine damage, small intestinal damage due to cancer-
chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency, acid-induced intestinal
injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness, febrile neutropenia,
diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia, gastrointestinal barrier
disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased gastrointestinal
motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel ischemia,
mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal feeding
intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral nutrition
damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis, radiation-
induced injury to the intestines, mucositis, pouchitis, ischemia, and stroke.
Item 73. The recombinant fusion protein of Item 70, wherein the pharmaceutical regimen for treating
a subject with a gastrointestinal condition comprises administering the pharmaceutical composition in
two or more successive doses to the subject at an effective amount, wherein the administration results in
at least a 5%, or 10%, or 20%, or 30%, or 40%, or 50%, or 60%, or 70%, or 80%, or 90% greater
improvement of at least one, two, or three parameters associated with the gastrointestinal condition
compared to the GLP-2 not linked to XTEN and administered using a comparable nmol/kg amount.
Item 74. The recombinant fusion protein of Item 73, wherein the parameter improved is selected from
increased blood concentrations of GLP-2, increased mesenteric blood flow, decreased inflammation,
increased weight gain, decreased diarrhea, decreased fecal wet weight, intestinal wound healing, increase
in plasma citrulline concentrations, decreased CRP levels, decreased requirement for steroid therapy,
enhanced mucosal integrity, decreased sodium loss, preventing bacterial translocation in the intestines,
accelerated recovery of the intestines after surgery, prevention of relapses of inflammatory bowel disease,
and maintaining energy homeostasis.
Item 75. The recombinant fusion protein of Item 70, wherein the regimen comprises administering a
therapeutically effective amount of the pharmaceutical composition of Item 49 once every 7, or 10, or 14,
or 21, or 28 or more days.
Item 76. The recombinant fusion protein of Item 75, wherein the effective amount is at least about 5,
or at least about 10, or at least about 25, or at least about 100, or at least about 200 nmoles/kg.
Item 77. The recombinant fusion protein of any one of Item 73-Item 76, wherein said administration
is subcutaneous, intramuscular, or intravenous.
Item 78. A method of treating a gastrointestinal condition in a subject, comprising administering to
said subject a composition comprising an effective amount of the pharmaceutical composition of Item 49.
Item 79. The method of Item 78, wherein the effective amount is at least about 5, or at least about 10, or
at least about 25, or at least about 100, or at least about 200 nmoles/kg.
Item 80. The method of Item 79, wherein the fusion protein exhibits a terminal half-life of greater
than about 30 hours in said subject.
Item 81. The method of any one of Item 78-Item 80, wherein the gastrointestinal condition is selected
from the group consisting of gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome,
short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue,
hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced
enteritis, irritable bowel syndrome, small intestine damage, small intestinal damage due to cancer-
chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency, acid-induced intestinal
injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness, febrile neutropenia,
diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia, gastrointestinal barrier
disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased gastrointestinal
motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel ischemia,
mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal feeding
intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral nutrition
damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis, radiation-
induced injury to the intestines, mucositis, pouchitis, ischemia, and stroke.
Item 82. The method of Item 81, wherein the gastrointestinal condition is Crohn's disease.
Item 83. The method of any one of Item 78-Item 82, wherein the subject is selected from the group
consisting of mouse, rat, monkey, and human.
Item 84. The method of any one of Item 78-Item 83, wherein said administration is subcutaneous,
intramuscular, or intravenous.
Item 85. The method of any one of Item 78-Item 84, wherein said administration results in an
intestinotrophic effect in said subject.
Item 86. The method of Item 85, wherein the intestinotrophic effect is at least about 30%, or at least
about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or
at least about 90%, or at least about 100% or at least about 120% or at least about 150% or at least about
200% of the intestinotrophic effect compared to the corresponding GLP-2 not linked to XTEN and
administered to a subject using a comparable dose.
Item 87. The method of Item 85 or Item 86, wherein the intestinotrophic effect is determined after
administration of 1 dose, or 3 doses, or 6 doses, or 10 doses, or 12 or more doses of the fusion protein.
Item 88. The method of any one of Item 85-Item 87, wherein the intestinotrophic effect is selected
from the group consisting of intestinal growth, increased hyperplasia of the villus epithelium, increased
crypt cell proliferation, increased height of the crypt and villus axis, increased healing after intestinal
anastomosis, increased small bowel weight, increased small bowel length, decreased small bowel
epithelium apoptosis, and enhancement of intestinal function.
It is specifically contemplated that the recombinant GLP2-XTEN fusion proteins can exhibit
one or more or any combination of the properties disclosed herein.
INCORPORATION BY REFERENCE
All publications, patents, and patent applications mentioned in this specification are herein
incorporated by reference to the same extent as if each individual publication, patent, or patent
application was specifically and individually indicated to be incorporated by reference.
BRIEF DESCRIPTION OF THE DRAWINGS
The features and advantages of the invention may be further explained by reference to the
following detailed description and accompanying drawings that set forth illustrative embodiments.
is a schematic of the logic flow chart of the algorithm SegScore. In the figure the
following legend applies: i, j - counters used in the control loops that run through the entire sequence;
HitCount - this variable is a counter that keeps track of how many times a subsequence encounters an
identical subsequence in a block; SubSeqX - this variable holds the subsequence that is being checked for
redundancy; SubSeqY - this variable holds the subsequence that the SubSeqX is checked against;
BlockLen - this variable holds the user determined length of the block; SegLen - this variable holds the
length of a segment. The program is hardcoded to generate scores for subsequences of lengths 3, 4, 5, 6,
7, 8, 9, and 10; Block - this variable holds a string of length BlockLen. The string is composed of letters
from an input XTEN sequence and is determined by the position of the i counter; SubSeqList - this is a
list that holds all of the generated subsequence scores.
depicts the application of the algorithm SegScore to a hypothetical XTEN of 11 amino
acids in order to determine the repetitiveness. An XTEN sequence consisting of N amino acids is divided
into N-S+1 subsequences of length S (S=3 in this case). A pair-wise comparison of all subsequences is
performed and the average number of identical subsequences is calculated to result, in this case, in a
subsequence score of 1.89.
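As a rough illustration of the subsequence comparison described above, the following Python sketch splits a sequence of N residues into N-S+1 overlapping subsequences of length S and averages, over all subsequences, the number of identical subsequences found (each subsequence counting itself). The function name, the toy inputs, and this particular reading of "average number of identical subsequences" are assumptions made for illustration only; the sketch is not the SegScore program and is not claimed to reproduce the score of 1.89 given for the figure.

```python
from collections import Counter


def subsequence_score(sequence: str, seg_len: int = 3) -> float:
    """Hypothetical helper: mean count of identical subsequences.

    The sequence is cut into len(sequence) - seg_len + 1 overlapping
    windows; each window is scored by how many windows (itself included)
    carry the same residues, and the scores are averaged.
    """
    n = len(sequence)
    if seg_len < 1 or n < seg_len:
        raise ValueError("sequence must be at least seg_len residues long")
    subseqs = [sequence[i:i + seg_len] for i in range(n - seg_len + 1)]
    counts = Counter(subseqs)
    return sum(counts[s] for s in subseqs) / len(subseqs)


# Toy inputs (not taken from the specification):
repetitive = subsequence_score("GSGSGS", 3)   # windows: GSG, SGS, GSG, SGS
non_repetitive = subsequence_score("GSTEP", 3)  # every window is unique
```

Under this reading, a sequence with no repeated windows scores exactly 1, and the score rises as windows recur.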
illustrates the use of donor XTEN sequences to produce truncated XTEN sequences. provides the sequence of AG864, with the underlined sequence used to generate an AG576 sequence.
provides the sequence of AG864, with the underlined sequence used to generate an AG288
sequence. provides the sequence of AG864, with the underlined sequence used to generate an
AG144 sequence. provides the sequence of AE864, with the underlined sequence used to
generate an AE576 sequence. provides the sequence of AE864, with the underlined sequence
used to generate an AE288 sequence.
is a schematic flowchart of representative steps in the assembly, production and the
evaluation of an XTEN.
is a schematic flowchart of representative steps in the assembly of a GLP2-XTEN
polynucleotide construct encoding a fusion protein. Individual oligonucleotides 501 are annealed into
sequence motifs 502 such as a 12 amino acid motif (“12-mer”), which is ligated to additional sequence
motifs from a library to create a pool that encompasses the desired length of the XTEN 504, as well as
ligated to a smaller concentration of an oligo containing BbsI, and KpnI restriction sites 503. The
resulting pool of ligation products is gel-purified and the band with the desired length of XTEN is cut,
resulting in an isolated XTEN gene with a stopper sequence 505. The XTEN gene is cloned into a stuffer
vector. In this case, the vector encodes an optional CBD sequence 506 and a GFP gene 508. Digestion is
then performed with BbsI/HindIII to remove 507 and 508 and place the stop codon. The resulting
product is then cloned into a BsaI/HindIII digested vector containing a gene encoding the GLP-2,
resulting in gene 500 encoding a GLP2-XTEN fusion protein.
is a schematic flowchart of representative steps in the assembly of a gene encoding
fusion protein comprising a GLP-2 and XTEN, its expression and recovery as a fusion protein, and its
evaluation as a candidate GLP2-XTEN product.
shows schematic representations of exemplary GLP2-XTEN fusion proteins (FIGS. 7A-
H), all depicted in an N- to C-terminus orientation. shows two different configurations of
GLP2-XTEN fusion proteins (100), each comprising a single GLP-2 and an XTEN, the first of which has
an XTEN molecule (102) attached to the C-terminus of a GLP-2 (103), and the second of which has an
XTEN molecule attached to the N-terminus of a GLP-2 (103). shows two different
configurations of GLP2-XTEN fusion proteins (100), each comprising a single GLP-2, a spacer sequence
and an XTEN, the first of which has an XTEN molecule (102) attached to the C-terminus of a spacer
sequence (104) and the spacer sequence attached to the C-terminus of a GLP-2 (103) and the second of
which has an XTEN molecule attached to the N-terminus of a spacer sequence (104) and the spacer
sequence attached to the N-terminus of a GLP-2 (103). shows two different configurations of
GLP2-XTEN fusion proteins (101), each comprising two molecules of a single GLP-2 and one molecule
of an XTEN, the first of which has an XTEN linked to the C-terminus of a first GLP-2 and that GLP-2 is
linked to the C-terminus of a second GLP-2, and the second of which is in the opposite orientation in
which the XTEN is linked to the N-terminus of a first GLP-2 and that GLP-2 is linked to the N-terminus
of a second GLP-2. shows two different configurations of GLP2-XTEN fusion proteins (101),
each comprising two molecules of a single GLP-2, a spacer sequence and one molecule of an XTEN, the
first of which has an XTEN linked to the C-terminus of a spacer sequence and the spacer sequence linked
to the C-terminus of a first GLP-2 which is linked to the C-terminus of a second GLP-2, and the second
of which is in the opposite orientation in which the XTEN is linked to the N-terminus of a spacer
sequence and the spacer sequence is linked to the N-terminus of a first GLP-2 and that GLP-2 is linked to
the N-terminus of a second GLP-2. shows two different configurations of GLP2-XTEN fusion
proteins (101), each comprising two molecules of a single GLP-2, a spacer sequence and one molecule of
an XTEN, the first of which has an XTEN linked to the C-terminus of a first GLP-2 and the first GLP-2
linked to the C-terminus of a spacer sequence which is linked to the C-terminus of a second GLP-2
molecule, and the second of which is in the opposite configuration of XTEN linked to the N-terminus of
a first GLP-2 which is linked to the N-terminus of a spacer sequence which in turn is linked to the N-
terminus of a second molecule of GLP-2. shows a configuration of GLP2-XTEN fusion protein
(105), each comprising one molecule of GLP-2 and two molecules of an XTEN linked to the N-terminus
and the C-terminus of the GLP-2. shows a configuration (106) of a single GLP-2 linked to two
XTEN, with the second XTEN separated from the GLP-2 by a spacer sequence. shows a
configuration (106) of two GLP-2 linked to two XTEN, with the second XTEN linked to the C-
terminus of the first GLP-2 and the N-terminus of the second GLP-2, which is at the C-terminus of the
GLP2-XTEN.
is a schematic illustration of exemplary polynucleotide constructs (FIGS. 8A-H) of
GLP2-XTEN genes that encode the corresponding GLP2-XTEN polypeptides of all depicted in a
5’ to 3’ orientation. In these illustrative examples the genes encode GLP2-XTEN fusion proteins with
one GLP-2 and XTEN (200); or one GLP-2, one spacer sequence and one XTEN (200); two GLP-2 and
one XTEN (201); or two GLP-2, a spacer sequence and one XTEN (201); one GLP-2 and two XTEN
(205); or two GLP-2 and two XTEN (206). In these depictions, the polynucleotides encode the following
components: XTEN (202), GLP-2 (203), and spacer amino acids that can include a cleavage sequence
(204), with all sequences linked in frame.
is a schematic representation of the design of GLP2-XTEN expression vectors with
different processing strategies. shows an exemplary expression vector encoding XTEN fused to
the 3’ end of the sequence encoding GLP-2. Note that no additional leader sequences are required in this
vector. depicts an expression vector encoding XTEN fused to the 3’ end of the sequence
encoding GLP-2 with a CBD leader sequence and a TEV protease site. depicts an expression
vector where the CBD and TEV processing site have been replaced with an optimized N-terminal leader
sequence (NTS). depicts an expression vector encoding an NTS sequence, an XTEN, a
sequence encoding GLP-2, and then a second sequence encoding an XTEN.
illustrates the process of combinatorial gene assembly of genes encoding XTEN. In this
case, the genes are assembled from 6 base fragments and each fragment is available in 4 different codon
versions (A, B, C and D). This allows for a theoretical diversity of 4096 in the assembly of a 12 amino
acid motif.
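The arithmetic behind the stated diversity can be checked with a short Python sketch. It assumes that a 12 amino acid motif is encoded by 36 bases split into six fragments of 6 bases (two codons) each, with one of the four codon versions chosen independently per fragment; the variable names are hypothetical and the independence of the choices is an assumption, not something the text states.

```python
from itertools import product

MOTIF_LENGTH_AA = 12          # residues in one motif (from the text)
BASES_PER_CODON = 3
BASES_PER_FRAGMENT = 6        # assumed reading of "6 base fragments"
CODON_VERSIONS = ("A", "B", "C", "D")  # four versions per fragment

# Number of fragments needed to cover one 12-residue motif.
fragments_per_motif = MOTIF_LENGTH_AA * BASES_PER_CODON // BASES_PER_FRAGMENT

# One independent choice of version per fragment.
theoretical_diversity = len(CODON_VERSIONS) ** fragments_per_motif

# Cross-check by explicit enumeration of every version combination.
enumerated = sum(1 for _ in product(CODON_VERSIONS, repeat=fragments_per_motif))
```

With these assumptions the count is 4 raised to the 6th power, which matches the theoretical diversity of 4096 quoted above.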
shows characterization data of the fusion protein GLP2-2G_AE864. A is an SDS-PAGE
gel of GLP2-2G-XTEN_AE864 lot AP690, as described in Example 16. The gels show lanes of
molecular weight standards and 2 or 10 µg of reference standard, as indicated. B shows results of
a size exclusion chromatography analysis of GLP2-2G-XTEN_AE864 lot AP690, as described in
Example 16, compared to molecular weight standards of 667, 167, 44, 17, and 3.5 kDa.
shows the ESI-MS analysis of GLP2-2G-XTEN_AE864 lot AP690, as described in
Example 16, with a major peak at 83,142 Da, indicating full length intact GLP2-2G-XTEN, with an
additional minor peak of 83,003 Da detected, representing the des-His GLP2-2G-XTEN at <5% of total
protein.
shows results of the GLP-2 receptor binding assay, as described in Example 17.
shows the results of the pharmacokinetics of GLP2-2G-XTEN_AE864 in C57Bl/6 mice
following subcutaneous (SC) administration. The samples were analyzed for fusion protein
concentration, performed by both anti-XTEN/anti-XTEN sandwich ELISA and anti-GLP2/anti-XTEN
sandwich ELISA, as described in Example 18, with results for both assays plotted.
shows the results of the pharmacokinetics of GLP2-2G-XTEN_AE864 in Wistar rats
following SC administration of two different dosage levels, performed by both anti-XTEN/anti-XTEN
sandwich ELISA and anti-GLP2/anti-XTEN sandwich ELISA, as described in Example 19, with results
for both assays plotted.
shows the results of the pharmacokinetics of GLP2-2G-XTEN_AE864 in male
cynomolgus monkeys following either subcutaneous (squares) or intravenous (triangles) administration
of the fusion protein at a single dosage level (2 mg/kg). The samples were analyzed for fusion protein
concentration, performed by anti-GLP2/anti-XTEN ELISA, as described in Example 20.
shows the linear regression of the allometric scaling of GLP2-2G-XTEN half-life from
three species used to predict a projected half-life of 240 hours in humans, as described in Example 20.
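A minimal sketch of how such an allometric projection is commonly set up is given below, assuming the usual power-law form (half-life = a x body weight^b) fitted by ordinary least squares on log-transformed values. The function names and all numeric inputs are hypothetical placeholders, not data from Example 20, and the sketch is not claimed to reproduce the 240 hour projection.

```python
import math


def fit_allometric(body_weights_kg, half_lives_h):
    """Fit half_life = a * weight**b by least squares in log10 space.

    Returns (a, b). Hypothetical helper for illustration only.
    """
    if len(body_weights_kg) != len(half_lives_h) or len(body_weights_kg) < 2:
        raise ValueError("need at least two paired observations")
    xs = [math.log10(w) for w in body_weights_kg]
    ys = [math.log10(t) for t in half_lives_h]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("body weights must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    coefficient = 10 ** (mean_y - exponent * mean_x)
    return coefficient, exponent


def predict_half_life(coefficient, exponent, body_weight_kg):
    """Evaluate the fitted power law at a given body weight."""
    return coefficient * body_weight_kg ** exponent


# Placeholder inputs: three species with made-up weights and half-lives.
example_weights_kg = [0.02, 0.25, 3.0]
example_half_lives_h = [30.0, 50.0, 110.0]
a, b = fit_allometric(example_weights_kg, example_half_lives_h)
projected_h = predict_half_life(a, b, 70.0)  # 70 kg is an assumed human weight
```

Fitting in log space turns the power law into a straight line, which is why the figure is described as a linear regression.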
shows the results in rat small intestine weight and length from vehicle and treatment
groups, as described in Example 21.
shows the results of changes in body weight in a murine dextran sodium sulfate (DSS)
model, with groups treated with vehicle, GLP2-2G peptide (no XTEN) or GLP2-2G-XTEN, as described
in Example 21.
shows representative histopathology sections of the DSS model mice from vehicle ileum
(A) and jejunum (B) and GLP2-2G-XTEN ileum (C) and jejunum (D), as
described in Example 21.
shows results from Study 1 of a rat model of Crohn’s Disease of indomethacin-induced
intestinal inflammation, with groups treated with vehicle, GLP2-2G peptide (no XTEN) or GLP2-2G-
XTEN and assayed, as described in Example 21. A shows results of the body weight at the
termination of the experiment. B shows results of the length of the small intestines from each
group. C shows results of the weight of the small intestines from each group. D shows
results of the length of ulcerations and the percentage of ulceration in the small intestines from each
group. E shows results of the scores of adhesions and transulceration in the small intestines from
each group. F shows results of the length and percentage of inflammation of the small intestines
from each group. G shows results of the TNFα assay of the small intestines from each group.
shows results from Study 2 of a rat model of Crohn’s Disease of indomethacin-induced
intestinal inflammation, with groups treated with vehicle, GLP2-2G peptide (no XTEN) or GLP2-2G-
XTEN and assayed, as described in Example 21. A shows the Trans-Ulceration Score of the
small intestines from each group. B shows the Adhesion Score of the small intestines from each
group.
shows representative histopathology sections from Study 2 of the rat model of Crohn’s
Disease of indomethacin-induced intestinal inflammation from vehicle-no indomethacin (A),
vehicle-indomethacin (B) and GLP2-2G-XTEN treatment groups (FIGS. 22C, D), as described in
Example 21.
shows the results of small intestine length (A), villi height (B) and
histopathology scoring (C) of mucosal atrophy, ulceration, infiltration measurements from
ed, vehicle-treated, GLP2-2G peptide-treated, and GLP2-2G-XTEN-treated rats, as described in
Example 21. Asterisks indicate groups with statistically significant differences from vehicle (diseased)
control group.
shows results of a size exclusion chromatography analysis of glucagon-XTEN construct
samples measured against protein standards of known molecular weight (as indicated), with the graph
output as absorbance versus retention volume, as described in Example 25. The glucagon-XTEN
constructs are 1) glucagon-Y288; 2) glucagon-Y144; 3) glucagon-Y72; and 4) glucagon-Y36. The
results indicate an increase in apparent molecular weight with increasing length of XTEN moiety.
shows the pharmacokinetic profile (plasma concentrations) in cynomolgus monkeys
after single doses of different compositions of GFP linked to unstructured polypeptides of varying length,
administered either subcutaneously or intravenously, as described in Example 26. The compositions
were GFP-L288, GFP-L576, GFP-XTEN_AF576, GFP-Y576 and XTEN_AD836-GFP. Blood samples
were analyzed at various times after injection and the concentration of GFP in plasma was measured by
ELISA using a polyclonal antibody against GFP for capture and a biotinylated preparation of the same
polyclonal antibody for detection. Results are presented as the plasma concentration versus time (h) after
dosing and show, in particular, a considerable increase in half-life for the XTEN_AD836-GFP, the
composition with the longest sequence length of XTEN. The construct with the shortest sequence length,
the GFP-L288 had the shortest half-life.
shows an SDS-PAGE gel of samples from a stability study of the fusion protein of
XTEN_AE864 fused to the N-terminus of GFP (see Example 27). The GFP-XTEN was incubated in
cynomolgus plasma and rat kidney lysate for up to 7 days at 37°C. In addition, GFP-XTEN administered
to cynomolgus monkeys was also assessed. Samples were withdrawn at 0, 1 and 7 days and analyzed by
SDS-PAGE followed by detection using Western analysis with antibodies against GFP.
shows the amino acid sequence of GLP2-2G_AE864.
DETAILED DESCRIPTION OF THE INVENTION
Before the embodiments of the invention are described, it is to be understood that such
embodiments are provided by way of example only, and that various alternatives to the embodiments of
the invention described herein may be employed in practicing the invention. Numerous variations,
changes, and substitutions will now occur to those skilled in the art without departing from the invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art to which this invention belongs. Although
methods and materials similar or equivalent to those described herein can be used in the practice or
testing of the present invention, suitable methods and materials are described below. In case of conflict,
the patent specification, including definitions, will control. In addition, the materials, methods, and
examples are illustrative only and not intended to be limiting. Numerous variations, changes, and
substitutions will now occur to those skilled in the art without departing from the invention.
DEFINITIONS
In the context of the present application, the following terms have the meanings ascribed to them
unless specified otherwise:
As used in the specification and claims, the singular forms “a”, “an” and “the” include plural
references unless the context clearly dictates otherwise. For example, the term “a cell” includes a
plurality of cells, including mixtures thereof.
The terms “polypeptide”, “peptide”, and “protein” are used interchangeably herein to refer to
polymers of amino acids of any length. The polymer may be linear or branched, it may comprise
modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an
amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation,
lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling
component.
As used herein, the term “amino acid” refers to either natural and/or unnatural or synthetic amino
acids, including but not limited to both the D or L optical isomers, and amino acid analogs and
peptidomimetics. Standard single or three letter codes are used to designate amino acids.
The term “natural L-amino acid” means the L optical isomer forms of glycine (G), proline (P),
alanine (A), valine (V), leucine (L), isoleucine (I), methionine (M), cysteine (C), phenylalanine (F),
tyrosine (Y), tryptophan (W), histidine (H), lysine (K), arginine (R), glutamine (Q), asparagine (N),
glutamic acid (E), aspartic acid (D), serine (S), and threonine (T).
The term “non-naturally occurring,” as applied to sequences and as used herein, means
polypeptide or polynucleotide sequences that do not have a counterpart to, are not complementary to, or
do not have a high degree of homology with a wild-type or naturally-occurring sequence found in a
mammal. For example, a non-naturally occurring polypeptide or fragment may share no more than 99%,
98%, 95%, 90%, 80%, 70%, 60%, 50% or even less amino acid sequence identity as compared to a
natural sequence when suitably aligned.
The terms “hydrophilic” and “hydrophobic” refer to the degree of affinity that a substance has
with water. A hydrophilic substance has a strong affinity for water, tending to dissolve in, mix with, or
be wetted by water, while a hydrophobic substance substantially lacks affinity for water, tending to repel
and not absorb water and tending not to dissolve in or mix with or be wetted by water. Amino acids can
be characterized based on their hydrophobicity. A number of scales have been developed. An example
is a scale developed by Levitt, M, et al., J Mol Biol (1976) 104:59, which is listed in Hopp, TP, et al.,
Proc Natl Acad Sci U S A (1981) 78:3824. Examples of “hydrophilic amino acids” are arginine, lysine,
threonine, alanine, asparagine, and glutamine. Of particular interest are the hydrophilic amino acids
aspartate, glutamate, and serine, and glycine. Examples of “hydrophobic amino acids” are tryptophan,
tyrosine, phenylalanine, methionine, leucine, isoleucine, and valine.
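The groupings named in this paragraph can be expressed as a small lookup, sketched below in Python using the standard single letter codes. Only the residues listed in the paragraph are classified; cysteine, histidine and proline are not mentioned and are reported as unclassified. The function and constant names are hypothetical, and the sketch does not implement the Levitt scale itself.

```python
# Residues named in the paragraph as hydrophilic: arginine, lysine, threonine,
# alanine, asparagine, glutamine, aspartate, glutamate, serine, glycine.
HYDROPHILIC = frozenset("RKTANQDESG")

# Residues named as hydrophobic: tryptophan, tyrosine, phenylalanine,
# methionine, leucine, isoleucine, valine.
HYDROPHOBIC = frozenset("WYFMLIV")


def classify_residue(residue: str) -> str:
    """Return the paragraph's grouping for one single-letter residue code."""
    code = residue.upper()
    if code in HYDROPHILIC:
        return "hydrophilic"
    if code in HYDROPHOBIC:
        return "hydrophobic"
    return "unclassified"


def hydrophilic_fraction(sequence: str) -> float:
    """Fraction of residues in a sequence that fall in the hydrophilic set."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    hits = sum(1 for r in sequence.upper() if r in HYDROPHILIC)
    return hits / len(sequence)
```

A lookup of this kind is only as good as the lists behind it; a quantitative scale would assign each residue a numeric value instead of a category.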
A “fragment” when applied to a protein, is a truncated form of a native biologically active
protein that retains at least a portion of the therapeutic and/or biological activity. A “variant” when
applied to a protein, is a protein with sequence homology to the native biologically active protein that
retains at least a portion of the therapeutic and/or biological activity of the biologically active protein. For
example, a variant protein may share at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%
amino acid sequence identity compared with the reference biologically active protein. As used herein,
the term “biologically active protein moiety” includes proteins modified deliberately, as for example, by
site directed mutagenesis, synthesis of the encoding gene, insertions, or accidentally through mutations.
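The percentage identity thresholds mentioned above can be made concrete with a short Python sketch that scores two sequences that have already been aligned to equal length. The function name, the gap character, and the choice of denominator (alignment columns that are not gaps in both sequences) are assumptions for illustration; other conventions for the denominator exist, and the sketch performs no alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Hypothetical helper: inputs must be pre-aligned and of equal length.
    Columns that are gaps in both rows are ignored; a column with a gap in
    only one row counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy 10-residue strings differing at one position (illustrative only).
identity = percent_identity("HADGSFSDEM", "HGDGSFSDEM")
```

Under this convention a candidate variant would be compared against the reference sequence and the result checked against whichever threshold (for example 90%) applies.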
The term “sequence variant” means polypeptides that have been modified compared to their
native or original sequence by one or more amino acid insertions, deletions, or substitutions. Insertions
may be located at either or both termini of the protein, and/or may be positioned within internal regions
of the amino acid sequence. A non-limiting example would be insertion of an XTEN sequence within the
sequence of the biologically-active payload protein. In deletion variants, one or more amino acid
residues in a polypeptide as described herein are removed. Deletion variants, therefore, include all
fragments of a payload polypeptide sequence. In substitution variants, one or more amino acid residues
of a polypeptide are removed and replaced with alternative residues. In one aspect, the substitutions are
conservative in nature and conservative substitutions of this type are well known in the art.
As used herein, “internal XTEN” refers to XTEN sequences that have been inserted into the
sequence of the GLP-2. Internal XTENs can be constructed by insertion of an XTEN sequence into the
sequence of GLP-2 by insertion between two adjacent amino acids or wherein XTEN replaces a partial,
internal sequence of the GLP-2.
As used herein, “terminal XTEN” refers to XTEN sequences that have been fused to or in the N-
or C-terminus of the GLP-2 or to a proteolytic cleavage sequence at the N- or C-terminus of the GLP-2.
Terminal XTENs can be fused to the native termini of the GLP-2. Alternatively, terminal XTENs can
replace a terminal sequence of the GLP-2.
The term “XTEN release site” refers to a cleavage sequence in GLP2-XTEN fusion proteins that
can be recognized and cleaved by a mammalian protease, effecting release of an XTEN or a portion of an
XTEN from the GLP2-XTEN fusion protein. As used herein, “mammalian protease” means a protease
that normally exists in the body fluids, cells or tissues of a mammal. XTEN release sites can be
engineered to be cleaved by various mammalian proteases (a.k.a. “XTEN release proteases”) such as
FXIa, FXIIa, kallikrein, FVIIIa, FXa, FIIa (thrombin), elastase-2, MMP-12, MMP-13,
MMP-20, or any protease that is present in the subject in proximity to the fusion protein. Other
equivalent proteases (endogenous or exogenous) that are capable of recognizing a defined cleavage site
can be utilized. The cleavage sites can be adjusted and tailored to the protease utilized.
The term “within”, when referring to a first polypeptide being linked to a second polypeptide,
encompasses linking that connects the N-terminus of the first or second polypeptide to the C-terminus of
the second or first polypeptide, respectively, as well as insertion of the first polypeptide into the sequence
of the second polypeptide. For example, when an XTEN is linked “within” a GLP-2 polypeptide, the
XTEN may be linked to the N-terminus, the C-terminus, or may be inserted between any two amino
acids of the GLP-2 polypeptide.
“Activity” for the purposes herein refers to an action or effect of a component of a fusion protein
consistent with that of the corresponding native biologically active protein component of the fusion
protein, wherein “biological activity” refers to an in vitro or in vivo biological function or effect,
including but not limited to receptor binding, antagonist activity, agonist activity, a cellular or
physiologic response, or an effect generally known in the art for the payload GLP-2.
As used herein, the term "ELISA" refers to an enzyme-linked immunosorbent assay as described
herein or as otherwise known in the art.
A “host cell” includes an individual cell or cell culture which can be or has been a recipient for
the subject vectors. Host cells include progeny of a single host cell. The progeny may not necessarily be
completely identical (in morphology or in genomic or total DNA complement) to the original parent cell
due to natural, accidental, or deliberate mutation. A host cell includes cells transfected in vivo with a
vector of this invention.
“Isolated,” when used to describe the various polypeptides disclosed herein, means polypeptide
that has been identified and separated and/or recovered from a component of its natural environment.
Contaminant components of its natural environment are materials that would typically interfere with
diagnostic or therapeutic uses for the polypeptide, and may include enzymes, hormones, and other
proteinaceous or non-proteinaceous solutes. As is apparent to those of skill in the art, a non-naturally
occurring polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, does not require
“isolation” to distinguish it from its naturally occurring counterpart. In addition, a “concentrated”,
“separated” or “diluted” polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, is
distinguishable from its naturally occurring counterpart in that the concentration or number of molecules
per volume is generally greater than that of its naturally occurring counterpart. In general, a polypeptide
made by recombinant means and expressed in a host cell is considered to be “isolated.”
An “isolated” nucleic acid is a nucleic acid molecule that is identified and separated from at least
one contaminant nucleic acid molecule with which it is ordinarily associated in the natural source of the
nucleic acid. For example, an isolated polypeptide-encoding nucleic acid molecule is other than in the
form or setting in which it is found in nature. Isolated polypeptide-encoding nucleic acid molecules
therefore are distinguished from the specific polypeptide-encoding nucleic acid molecule as it exists in
natural cells. However, an isolated polypeptide-encoding nucleic acid molecule includes
polypeptide-encoding nucleic acid molecules contained in cells that ordinarily express the polypeptide where, for
example, the nucleic acid molecule is in a chromosomal or extra-chromosomal location different from
that of natural cells.
A “chimeric” protein contains at least one fusion polypeptide comprising at least one region in a
different position in the sequence than that which occurs in nature. The regions may normally exist in
separate proteins and are brought together in the fusion polypeptide, or they may normally exist in the
same protein but are placed in a new arrangement in the fusion polypeptide. A chimeric protein may be
created, for example, by chemical synthesis, or by creating and translating a polynucleotide in which the
peptide regions are encoded in the desired relationship.
“Conjugated”, “linked,” “fused,” and “fusion” are used interchangeably herein. These terms
refer to the joining together of two or more chemical elements, sequences or components, by whatever
means including chemical conjugation or recombinant means. For example, a promoter or enhancer is
operably linked to a coding sequence if it affects the transcription of the sequence. Generally, “operably
linked” means that the DNA sequences being linked are contiguous, and in reading phase or in-frame.
An “in-frame fusion” refers to the joining of two or more open reading frames (ORFs) to form a
continuous longer ORF, in a manner that maintains the correct reading frame of the original ORFs. Thus,
the resulting recombinant fusion protein is a single protein containing two or more segments that
correspond to polypeptides encoded by the original ORFs (which segments are not normally so joined in
nature).
In the context of polypeptides, a “linear sequence” or a “sequence” is an order of amino acids in
a polypeptide in an amino to carboxyl terminus direction in which residues that neighbor each other in
the sequence are contiguous in the primary structure of the polypeptide. A “partial sequence” is a linear
sequence of part of a polypeptide that is known to comprise additional residues in one or both directions.
“Heterologous” means derived from a genotypically distinct entity from the rest of the entity to
which it is being compared. For example, a glycine rich sequence removed from its native coding
sequence and operatively linked to a coding sequence other than the native sequence is a heterologous
glycine rich sequence. The term “heterologous” as applied to a polynucleotide or a polypeptide means that
the polynucleotide or polypeptide is derived from a genotypically distinct entity from that of the rest of
the entity to which it is being compared.
The terms “polynucleotides”, “nucleic acids”, “nucleotides” and “oligonucleotides” are used
interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides
or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and
may perform any function, known or unknown. The following are non-limiting examples of
polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from
linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes,
cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any
sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may
comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present,
modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The
sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be
further modified after polymerization, such as by conjugation with a labeling component.
The term “complement of a polynucleotide” denotes a polynucleotide molecule having a
complementary base sequence and reverse orientation as compared to a reference sequence, such that it
could hybridize with a reference sequence with complete fidelity.
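As an illustration of the definition above, a minimal Python sketch of forming the complement of a DNA polynucleotide (complementary bases, reverse orientation) follows. The helper name `reverse_complement` is hypothetical and is not part of this disclosure, and the sketch assumes an unambiguous DNA alphabet (A, C, G, T) only.

```python
# Hypothetical helper for illustration; assumes plain A/C/G/T DNA input.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary base sequence in reverse orientation."""
    # Swap each base for its Watson-Crick partner, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGCCG"))
```

Applying the function twice returns the reference sequence, which is one quick way to check the mapping.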
“Recombinant” as applied to a polynucleotide means that the polynucleotide is the product of
various combinations of recombination steps which may include cloning, restriction and/or ligation steps,
and other procedures that result in an expression of a recombinant protein in a host cell.
The terms “gene” and “gene fragment” are used interchangeably herein. They refer to a
polynucleotide containing at least one open reading frame that is capable of encoding a particular protein
after being transcribed and translated. A gene or gene fragment may be genomic or cDNA, as long as the
polynucleotide contains at least one open reading frame, which may cover the entire coding region or a
segment thereof. A “fusion gene” is a gene composed of at least two heterologous polynucleotides that
are linked together.
“Homology” or “homologous” or “sequence identity” refers to sequence similarity or
interchangeability between two or more polynucleotide sequences or between two or more polypeptide
sequences. When using a program such as BestFit to determine sequence identity, similarity or
homology between two different amino acid sequences, the default settings may be used, or an
appropriate scoring matrix, such as blosum45 or blosum80, may be selected to optimize identity,
similarity or homology scores. Preferably, polynucleotides that are homologous are those which
hybridize under stringent conditions as defined herein and have at least 70%, preferably at least 80%,
more preferably at least 90%, more preferably 95%, more preferably 97%, more preferably 98%, and
even more preferably 99% sequence identity compared to those sequences. Polypeptides that are
homologous preferably have sequence identities that are at least 70%, preferably at least 80%, even more
preferably at least 90%, even more preferably at least 95-99%, and most preferably 100% identical.
"Ligation" refers to the process of forming phosphodiester bonds between two nucleic acid
fragments or genes, linking them together. To ligate the DNA fragments or genes together, the ends of
the DNA must be compatible with each other. In some cases, the ends will be directly compatible after
endonuclease digestion. However, it may be necessary to first convert the staggered ends commonly
produced after endonuclease digestion to blunt ends to make them compatible for ligation.
The terms “stringent conditions” or “stringent hybridization conditions” include reference to
conditions under which a polynucleotide will hybridize to its target sequence, to a detectably greater
degree than other sequences (e.g., at least 2-fold over background). Generally, stringency of
hybridization is expressed, in part, with reference to the temperature and salt concentration under which
the wash step is carried out. Typically, stringent conditions will be those in which the salt concentration
is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH
7.0 to 8.3 and the temperature is at least about 30°C for short polynucleotides (e.g., 10 to 50 nucleotides)
and at least about 60°C for long polynucleotides (e.g., greater than 50 nucleotides); for example,
“stringent conditions” can include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and
three washes for 15 min each in 0.1×SSC/1% SDS at 60°C to 65°C. Alternatively, temperatures of about
65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may be varied from about 0.1 to 2×SSC,
with SDS being present at about 0.1%. Such wash temperatures are typically selected to be about 5°C to
20°C lower than the thermal melting point for the specific sequence at a defined ionic strength and pH.
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence
hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for nucleic acid
hybridization are well known and can be found in Sambrook, J. et al., “Molecular Cloning: A Laboratory
Manual,” 3rd edition, Cold Spring Harbor Laboratory Press, 2001. Typically, blocking reagents are used
to block non-specific hybridization. Such blocking reagents include, for instance, sheared and denatured
salmon sperm DNA at about 100-200 µg/ml. Organic solvent, such as formamide at a concentration of
about 35-50% v/v, may also be used under particular circumstances, such as for RNA:DNA
hybridizations. Useful variations on these wash conditions will be readily apparent to those of ordinary
skill in the art.
The terms “percent identity,” “percentage of sequence identity,” and “% identity,” as applied to
polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide
sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and
reproducible way, gaps in the sequences being compared in order to optimize alignment between two
sequences, and therefore achieve a more meaningful comparison of the two sequences. Percent identity
may be measured over the length of an entire defined polynucleotide sequence, or may be measured over
a shorter length, for example, over the length of a fragment taken from a larger, defined polynucleotide
sequence, for instance, a fragment of at least 45, at least 60, at least 90, at least 120, at least 150, at least
210 or at least 450 contiguous residues. Such lengths are exemplary only, and it is understood that any
fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may
be used to describe a length over which percentage identity may be measured. The percentage of
sequence identity is calculated by comparing two optimally aligned sequences over the window of
comparison, determining the number of matched positions (at which identical residues occur in both
polypeptide sequences), dividing the number of matched positions by the total number of positions in the
window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage
of sequence identity. When sequences of different length are to be compared, the shortest sequence
defines the length of the window of comparison. Conservative substitutions are not considered when
calculating sequence identity.
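The calculation described above can be sketched in Python as follows. This is a minimal illustration only: it assumes the two sequences have already been optimally aligned by an external tool and are supplied as equal-length strings with `-` marking gaps, and it takes the window of comparison to be the ungapped length of the shorter sequence. The function name `percent_identity` is hypothetical.

```python
GAP = "-"  # assumed gap symbol in the pre-aligned input strings


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by the window size, multiplied by 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count alignment columns where both sequences carry the same residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != GAP
    )
    # Assumption: the shorter (ungapped) sequence defines the window.
    len_a = sum(1 for x in aligned_a if x != GAP)
    len_b = sum(1 for y in aligned_b if y != GAP)
    window = min(len_a, len_b)
    if window == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / window


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Only exact residue matches are counted, so conservative substitutions contribute nothing to the score, consistent with the definition above.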
“Percent (%) sequence identity,” with respect to the polypeptide sequences identified herein, is
defined as the percentage of amino acid residues in a query sequence that are identical with the amino
acid residues of a second, reference polypeptide sequence or a portion thereof, after aligning the
sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not
considering any conservative substitutions as part of the sequence identity, thereby resulting in optimal
alignment. Alignment for purposes of determining percent amino acid sequence identity can be achieved
in various ways that are within the skill in the art, for instance, using publicly available computer
software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. Those skilled in the
art can determine appropriate parameters for measuring alignment, including any algorithms needed to
achieve optimal alignment over the full length of the sequences being compared. Percent identity may be
measured over the length of an entire defined polypeptide sequence, or may be measured over a shorter
length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150
contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to
describe a length over which percentage identity may be measured.
“Repetitiveness” used in the context of polynucleotide sequences refers to the degree of internal
homology in the sequence such as, for example, the frequency of identical nucleotide sequences of a
given length. Repetitiveness can, for example, be measured by analyzing the frequency of identical
sequences.
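One simple way to analyze the frequency of identical sequences of a given length is to count overlapping subsequences, as in the Python sketch below. The function name `repeated_kmers` and the choice of reporting only subsequences that occur more than once are illustrative assumptions, not a method prescribed by this disclosure.

```python
from collections import Counter


def repeated_kmers(seq: str, k: int) -> dict[str, int]:
    """Count overlapping length-k subsequences; keep those seen more than once."""
    if k <= 0 or k > len(seq):
        raise ValueError("k must be between 1 and the sequence length")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(repeated_kmers("ATGATGATG", 3))
```

A sequence with low repetitiveness at a given length returns an empty dictionary.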
A “vector” is a nucleic acid molecule, preferably self-replicating in an appropriate host, which
transfers an inserted nucleic acid molecule into and/or between host cells. The term includes vectors that
function primarily for insertion of DNA or RNA into a cell, replication of vectors that function primarily
for the replication of DNA or RNA, and expression vectors that function for transcription and/or
translation of the DNA or RNA. Also included are vectors that provide more than one of the above
functions. An “expression vector” is a polynucleotide which, when introduced into an appropriate host
cell, can be transcribed and translated into a polypeptide(s). An “expression system” usually connotes a
suitable host cell comprised of an expression vector that can function to yield a desired expression
product.
“Serum degradation resistance,” as applied to a polypeptide, refers to the ability of the
polypeptides to withstand degradation in blood or components thereof, which typically involves
proteases in the serum or plasma. The serum degradation resistance can be measured by combining the
protein with human (or mouse, rat, monkey, as appropriate) serum or plasma, typically for a range of
days (e.g. 0.25, 0.5, 1, 2, 4, 8, 16 days), typically at about 37°C. The samples for these time points can be
run on a Western blot assay and the protein is detected with an antibody. The antibody can be to a tag in
the protein. If the protein shows a single band on the western, where the protein’s size is identical to that
of the injected protein, then no degradation has occurred. In this exemplary method, the time point where
50% of the protein is degraded, as judged by Western blots or equivalent techniques, is the serum
degradation half-life or “serum half-life” of the protein.
The terms “t1/2”, “terminal half-life”, “elimination half-life” and “circulating half-life” are used
interchangeably herein and, as used herein, mean the terminal half-life calculated as ln(2)/Kel. Kel is the
terminal elimination rate constant calculated by linear regression of the terminal linear portion of the log
concentration vs. time curve. Half-life typically refers to the time required for half the quantity of an
administered substance deposited in a living organism to be metabolized or eliminated by normal
biological processes.
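The ln(2)/Kel calculation can be sketched in Python as follows. This is an illustration under stated assumptions: the caller supplies only the time points that lie on the terminal linear portion of the curve, natural logarithms are used so that the regression slope equals -Kel, and the function name `terminal_half_life` is hypothetical.

```python
import math


def terminal_half_life(times, concentrations):
    """Estimate terminal half-life as ln(2)/Kel from terminal-phase data."""
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matched time/concentration points")
    # Natural log of concentration, so the fitted slope is -Kel.
    ys = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of ln(concentration) versus time.
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    kel = -(sxy / sxx)
    return math.log(2) / kel


if __name__ == "__main__":
    hours = [4, 8, 12, 24]
    conc = [100 * math.exp(-0.1 * t) for t in hours]  # synthetic data
    print(terminal_half_life(hours, conc))
```

The returned half-life is in the same time units as the input time points.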
“Active clearance” means the mechanisms by which a protein is removed from the circulation
other than by filtration, and which includes removal from the circulation mediated by cells, receptors,
metabolism, or degradation of the protein.
“Apparent molecular weight factor” and “apparent molecular weight” are related terms referring
to a measure of the relative increase or decrease in apparent molecular weight exhibited by a particular
amino acid or polypeptide sequence. The apparent molecular weight is determined using size exclusion
chromatography (SEC) or similar methods by comparing to globular protein standards and is measured in
“apparent kDa” units. The apparent molecular weight factor is the ratio between the apparent molecular
weight and the actual molecular weight; the latter predicted by adding, based on amino acid composition,
the calculated molecular weight of each type of amino acid in the composition or by estimation from
comparison to molecular weight standards in an SDS electrophoresis gel. Determination of both the
apparent molecular weight and apparent molecular weight factor for representative proteins is described
in the Examples.
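The ratio defined above can be sketched in Python as follows. This is an illustration only: the average residue masses in the table are approximate textbook values supplied here as an assumption (they do not come from this disclosure), the function names are hypothetical, and the apparent molecular weight is taken as an input measured separately, for example by SEC.

```python
# Approximate average residue masses in Da (assumed textbook values).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain for the free termini


def molecular_weight_da(sequence: str) -> float:
    """Actual molecular weight predicted from amino acid composition."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def apparent_mw_factor(apparent_kda: float, actual_kda: float) -> float:
    """Ratio of apparent molecular weight to actual molecular weight."""
    return apparent_kda / actual_kda


if __name__ == "__main__":
    actual_kda = molecular_weight_da("GSPAGSPTSTEEGT") / 1000.0
    print(actual_kda)
    print(apparent_mw_factor(800.0, 100.0))
```

A factor well above 1 indicates that the molecule behaves on SEC as though it were larger than a globular protein of the same mass.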
The term “hydrodynamic radius” or “Stokes radius” is the effective radius (Rh in nm) of a
molecule in a solution measured by assuming that it is a body moving through the solution and resisted
by the solution’s viscosity. In the embodiments of the invention, the hydrodynamic radius measurements
of the XTEN fusion proteins correlate with the ‘apparent molecular weight factor’, which is a more
intuitive measure. The “hydrodynamic radius” of a protein affects its rate of diffusion in aqueous
solution as well as its ability to migrate in gels of macromolecules. The hydrodynamic radius of a
protein is determined by its molecular weight as well as by its structure, including shape and
compactness. Methods for determining the hydrodynamic radius are well known in the art, such as by
the use of size exclusion chromatography (SEC), as described in U.S. Patent Nos. 6,406,632 and
7,294,513. Most proteins have globular structure, which is the most compact three-dimensional structure
a protein can have with the smallest hydrodynamic radius. Some proteins adopt a random and open,
unstructured, or ‘linear’ conformation and as a result have a much larger hydrodynamic radius compared
to typical globular proteins of similar molecular weight.
“Physiological conditions” refers to a set of conditions in a living host as well as in vitro
conditions, including temperature, salt concentration, pH, that mimic those conditions of a living subject.
A host of physiologically relevant conditions for use in in vitro assays have been established. Generally,
a physiological buffer contains a physiological concentration of salt and is adjusted to a neutral pH
ranging from about 6.5 to about 7.8, and preferably from about 7.0 to about 7.5. A variety of
physiological buffers are listed in Sambrook et al. (2001). Physiologically relevant temperature ranges
from about 25°C to about 38°C, and preferably from about 35°C to about 37°C.
A “reactive group” is a chemical structure that can be coupled to a second reactive group.
Examples for reactive groups are amino groups, carboxyl groups, sulfhydryl groups, hydroxyl groups,
aldehyde groups, azide groups. Some reactive groups can be activated to facilitate coupling with a
second reactive group. Non-limiting examples for activation are the reaction of a carboxyl group with
carbodiimide, the conversion of a carboxyl group into an activated ester, or the conversion of a carboxyl
group into an azide function.
“Controlled release agent”, “slow release agent”, “depot formulation” and “sustained release
agent” are used interchangeably to refer to an agent capable of extending the duration of release of a
polypeptide of the invention relative to the duration of release when the polypeptide is administered in
the absence of agent. Different embodiments of the present invention may have different release rates,
resulting in different therapeutic amounts.
The terms “antigen”, “target antigen” and “immunogen” are used interchangeably herein to refer
to the structure or binding determinant that an antibody fragment or an antibody fragment-based
therapeutic binds to or has specificity against.
The term “payload” as used herein refers to a protein or peptide sequence that has biological or
therapeutic activity; the counterpart to the pharmacophore of small molecules. Examples of payloads
include, but are not limited to, cytokines, enzymes, hormones, blood coagulation factors, and growth
factors. Payloads can further comprise genetically fused or chemically conjugated moieties such as
chemotherapeutic agents, antiviral compounds, toxins, or contrast agents. These conjugated moieties can
be joined to the rest of the polypeptide via a linker that may be cleavable or non-cleavable.
The term “antagonist”, as used herein, includes any molecule that partially or fully blocks,
inhibits, or neutralizes a biological activity of a native polypeptide disclosed herein. Methods for
identifying antagonists of a polypeptide may comprise contacting a native polypeptide with a candidate
antagonist molecule and measuring a detectable change in one or more biological activities normally
associated with the native polypeptide. In the context of the present invention, antagonists may include
proteins, nucleic acids, carbohydrates, antibodies or any other molecules that decrease the effect of a
biologically active protein.
The term “agonist” is used in the broadest sense and includes any molecule that mimics a
biological activity of a native polypeptide disclosed herein. Suitable agonist molecules specifically
include agonist antibodies or antibody fragments, fragments or amino acid sequence variants of native
polypeptides, peptides, small organic molecules, etc. Methods for identifying agonists of a native
polypeptide may comprise contacting a native polypeptide with a candidate agonist molecule and
measuring a detectable change in one or more biological activities normally associated with the native
polypeptide.
“Inhibition constant”, or “Ki”, are used interchangeably and mean the dissociation constant of
the enzyme-inhibitor complex, or the reciprocal of the binding affinity of the inhibitor to the enzyme.
As used herein, “treat” or “treating,” or “palliating” or “ameliorating” are used interchangeably
and mean administering a drug or a biologic to achieve a therapeutic benefit, to cure or reduce the
severity of an existing condition, or to achieve a prophylactic benefit, prevent or reduce the likelihood of
onset or severity of the occurrence of a condition. By therapeutic benefit is meant eradication or
amelioration of the underlying condition being treated or one or more of the physiological symptoms
associated with the underlying condition such that an improvement is observed in the subject,
notwithstanding that the subject may still be afflicted with the underlying condition.
A “therapeutic effect” or “therapeutic benefit,” as used herein, refers to a physiologic effect,
including but not limited to the mitigation, amelioration, or prevention of disease in humans or other
animals, or to otherwise enhance physical or mental wellbeing of humans or animals, resulting from
administration of a fusion protein of the invention other than the ability to induce the production of an
antibody against an antigenic epitope possessed by the biologically active protein. For prophylactic
benefit, the compositions may be administered to a subject at risk of developing a particular condition, or
to a subject reporting one or more of the physiological symptoms of a condition, even though a diagnosis
(e.g., Crohn’s Disease) may not have been made.
The terms “therapeutically effective amount” and “therapeutically effective dose”, as used
herein, refer to an amount of a drug or a biologically active protein, either alone or as a part of a fusion
protein composition, that is capable of having any detectable, beneficial effect on any symptom, aspect,
measured parameter or characteristics of a disease state or condition when administered in one or
repeated doses to a subject. Such effect need not be absolute to be beneficial. Determination of a
therapeutically effective amount is well within the capability of those skilled in the art, especially in light
of the detailed disclosure provided herein.
The term “therapeutically effective dose regimen”, as used herein, refers to a schedule for
consecutively administered multiple doses (i.e., at least two or more) of a biologically active protein,
either alone or as a part of a fusion protein composition, wherein the doses are given in therapeutically
effective amounts to result in sustained beneficial effect on any symptom, aspect, measured parameter or
characteristics of a disease state or condition.
I). GENERAL TECHNIQUES
The practice of the present invention employs, unless otherwise indicated, conventional
techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology,
genomics and recombinant DNA, which are within the skill of the art. See Sambrook, J. et al.,
“Molecular Cloning: A Laboratory Manual,” 3rd edition, Cold Spring Harbor Laboratory Press, 2001;
“Current protocols in molecular biology”, F. M. Ausubel, et al. eds., 1987; the series “Methods in
Enzymology,” Academic Press, San Diego, CA.; “PCR 2: a practical approach”, M.J. MacPherson, B.D.
Hames and G.R. Taylor eds., Oxford University Press, 1995; “Antibodies, a laboratory manual” Harlow,
E. and Lane, D. eds., Cold Spring Harbor Laboratory, 1988; “Goodman & Gilman’s The Pharmacological
Basis of Therapeutics,” 11th Edition, McGraw-Hill, 2005; and Freshney, R.I., “Culture of Animal Cells:
A Manual of Basic Technique,” 4th edition, John Wiley & Sons, Somerset, NJ, 2000, the contents of
which are incorporated in their entirety herein by reference.
II). GLUCAGON-LIKE-2 PROTEIN
The present invention relates, in part, to fusion protein compositions comprising GLP-2 and
one or more extended recombinant polypeptides (XTEN), resulting in GLP2-XTEN fusion protein
compositions.
“Glucagon-like protein-2” or “GLP-2” means, collectively herein, human glucagon like
peptide-2, species homologs of human GLP-2, and non-natural sequence variants having at least a
portion of the biological activity of mature GLP-2 including variants such as, but not limited to, a variant
with glycine substituted for alanine at position 2 of the mature sequence (“2G”) as well as Val, Glu, Lys,
Arg, Leu or Ile substituted for alanine at position 2. GLP-2 or sequence variants have been isolated,
synthesized, characterized, or cloned, as described in U.S. Patent or Application Nos. 5,789,379;
428; 5,990,077; 5,994,500; 6,184,201; 7,186,683; 7,563,770; 20020025933; and 20030162703.
Human GLP-2 is a 33 amino acid peptide, co-secreted along with GLP-1 from intestinal
endocrine cells in the epithelium of the small and large intestine. The 180 amino-acid product of the
proglucagon gene is post-translationally processed in a tissue-specific manner in pancreatic A cells and
intestinal L cells into the 33 amino acid GLP-2 (Orskov et al., FEBS Lett. (1989) 247: 193-196;
Hartmann et al., Peptides (2000) 21: 73-80). In pancreatic A cells, the major bioactive hormone is
glucagon cleaved by PCSK2/PC2. In the intestinal L cells PCSK1/PC1 liberates GLP-1, GLP-2, glicentin
and oxyntomodulin. GLP-2 functions as a pleiotropic intestinotrophic hormone with wide-ranging
effects that include the promotion of mucosal growth and nutrient absorption, intestinal homeostasis,
regulation of gastric motility, gastric acid secretion and intestinal hexose transport, reduction of intestinal
permeability and increase in mesenteric blood flow (Estall JL, Drucker DJ (2006) Glucagon-like
peptide-2. Annual Rev Nutr 26:391-411), (Guan X, et al. (2006) GLP-2 receptor localizes to enteric
neurons and endocrine cells expressing vasoactive peptides and mediates increased blood flow.
Gastroenterology 130:150-164; Stephens J, et al. (2006) Glucagon-like peptide-2 acutely increases
proximal small intestinal blood flow in TPN-fed neonatal piglets. Am J Physiol Regul Integr Comp
Physiol 290:R283-R289; Nelson DW, et al. (2007) Localization and activation of GLP-2 receptors on
vagal afferents in the rat. Endocrinology 148:1954-1962). The effects mediated by GLP-2 are triggered
by the binding and activation of the GLP-2 receptor, a member of the glucagon/secretin G protein-coupled
receptor superfamily that is located on enteric (Bjerknes M, Cheng H (2001) Modulation of
specific intestinal epithelial progenitors by enteric neurons. Proc Natl Acad Sci USA 98:12497-12502)
and vagal (Nelson et al., 2007) nerves, subepithelial myofibroblasts (Orskov C, et al. (2005) GLP-2
stimulates colonic growth via KGF, released by subepithelial myofibroblasts with GLP-2 receptors.
Regul Pept 124:105-11), and a subset of intestinal epithelial cells (Thulesen J, et al. (2000) Potential
targets for glucagon-like peptide 2 (GLP-2) in the rat: distribution and binding of i.v. injected
(125)I-GLP-2. Peptides 21:1511-1517). In addition, GLP-2 has an important role in intestinal adaptation,
repair and protection during inflammatory events, including amelioration of the effects of
proinflammatory cytokines (Sigalet DL, et al. (2007) Enteric neural pathways mediate the anti-inflammatory
actions of glucagon-like peptide 2. Am J Physiol Gastrointest Liver Physiol 293:G211-G221).
GLP-2 also enhances nutrient absorption and gut adaptation in rodents or humans with short
bowel syndrome (SBS) (Jeppesen et al., (2001) Gastroenterology 120: 806-815).
In one aspect, the invention contemplates inclusion of GLP-2 sequences in the GLP2-XTEN
fusion protein compositions that are identical to human GLP-2, sequences that have homology to GLP-2
sequences, sequences that are natural, such as from humans, non-human primates, mammals (including
domestic animals) that retain at least a portion of the biologic activity or biological function of native
human GLP-2. In one embodiment, the GLP-2 is a non-natural GLP-2 sequence variant, fragment, or a
mimetic of a natural sequence that retains at least a portion of the biological activity of the corresponding
native GLP-2, such as but not limited to the substitution of the alanine at position 2 of the mature GLP-2
peptide sequence with glycine (“GLP2G”). In another embodiment, the GLP-2 of the fusion protein
has the sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD. Sequences with homology to GLP-2
may be found by standard homology searching techniques, such as NCBI BLAST, or in public databases
such as Chemical Abstracts Services Databases (e.g., the CAS Registry), GenBank, The Universal
Protein Resource (UniProt) and subscription provided databases such as GenSeq (e.g., Derwent).
Table 1 provides a non-limiting list of amino acid sequences of GLP-2 that are encompassed by
the GLP2-XTEN fusion proteins of the invention. Any of the GLP-2 sequences or homologous
derivatives to be incorporated into the fusion protein compositions can be constructed by shuffling
individual mutations into and between the amino acids of the sequences of Table 1 or by replacing the
amino acids of the sequences of Table 1. The resulting GLP-2 sequences can be evaluated for activity
and those that retain at least a portion of the biological activity of the native GLP-2 may be useful for
inclusion in the fusion protein compositions of this invention. In some embodiments, GLP-2 that can be
incorporated into a GLP2-XTEN include proteins that have at least about 80% sequence identity, or
alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, or 100% sequence identity compared to an amino acid sequence selected from Table 1.
Table 1: GLP-2 amino acid sequences
Name (source) Amino Acid Sequence
GLP-2 (human) HADGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 1 SEQ ID NO: 3 US Pat No. 7,186,683 HADGSFSDEMNTILDNLATRDFINWLIQTKITD
GLP-2 variant 2 SEQ ID NO: 5 US Pat No. 5,789,379 SDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 3 HVDGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 4 HEDGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 5 HKDGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 6 HRDGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 7 HLDGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 variant 8 SDEMNTILDNLAARDFINWLIQTKITD
GLP-2 (mouse) HADGSFSDEMSTILDNLATRDFINWLIQTKITD
GLP-2 (rat) HADGSFSDEMNTILDNLATRDFINWLIQTKITD
GLP-2 (bovine) HADGSFSDEMNTVLDSLATRDFINWLLQTKITD
GLP-2 (bovine variant) HGDGSFSDEMNTVLDSLATRDFINWLLQTKITD
GLP-2 (pig) HADGSFSDEMNTVLDNLATRDFINWLLHTKITDSL
GLP-2 (pig variant) SDEMNTVLDNLATRDFINWLLHTKITDSL
GLP-2 (sheep) HADGSFSDEMNTVLDSLATRDFINWLLQTKI
GLP-2 (sheep variant) HGDGSFSDEMNTVLDSLATRDFINWLLQTKI
GLP-2 (canine) HADGSFSDEMNTVLDTLATRDFINWLLQTKITD
GLP-2 (canine variant) HGDGSFSDEMNTVLDTLATRDFINWLLQTKITD
GLP-2 (chicken) HADGTFTSDINKILDDMAAKEFLKWLINTKVTQ
GLP-2 (chicken variant) HGDGTFTSDINKILDDMAAKEFLKWLINTKVTQ
GLP-2 (turkey) HADGTFTSDINKILDDMAAKEFLKWLINTKVTQ
GLP-2 (turkey variant) HGDGTFTSDINKILDDMAAKEFLKWLINTKVTQ
GLP-2 (Xenopus laevis) HADGSFTNDINKVLDIIAAQEFLDWVINTQETE
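The sequence-identity thresholds recited for Table 1 can be illustrated with a short sketch. It assumes an ungapped, position-by-position comparison from the N-terminus, with the longer sequence length as the denominator; this is a simplification (alignment tools such as BLAST allow gaps and may report a different value), and the function name percent_identity is hypothetical rather than anything defined in the specification. The two sequences are the human and rat GLP-2 entries of Table 1.

```python
# Illustrative sketch only: ungapped percent identity between two peptides.
# Assumption: residues are compared position by position from the N-terminus,
# and the denominator is the length of the longer sequence.

HUMAN_GLP2 = "HADGSFSDEMNTILDNLAARDFINWLIQTKITD"  # GLP-2 (human), Table 1
RAT_GLP2 = "HADGSFSDEMNTILDNLATRDFINWLIQTKITD"    # GLP-2 (rat), Table 1


def percent_identity(a: str, b: str) -> float:
    """Return the ungapped percent identity of sequences a and b."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / max(len(a), len(b))


identity = percent_identity(HUMAN_GLP2, RAT_GLP2)
meets_80 = identity >= 80.0  # lowest threshold recited in the text
print(f"{identity:.1f}% identical; meets 80% threshold: {meets_80}")
```

The rat sequence differs from the human sequence at a single position (threonine for alanine at position 19), so it falls well inside the recited identity range under this simplified measure.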
The GLP-2 of the subject compositions are not limited to native, full-length GLP-2
polypeptides, but also include recombinant versions as well as biologically and/or pharmacologically
active forms with sequence variants, or fragments thereof. For example, it will be appreciated that
various amino acid deletions, insertions and substitutions can be made in the GLP-2 to create variants
that exhibit one or more biological activity or pharmacologic properties of the wild-type GLP-2.
Examples of conservative substitutions for amino acids in polypeptide sequences are shown in Table 2.
In embodiments of the GLP2-XTEN in which the sequence identity of the GLP-2 is less than 100%
compared to a specific sequence disclosed herein, the invention contemplates substitution of any of the
other 19 natural L-amino acids for a given amino acid residue of a given GLP-2, which may be at any
position within the sequence of the GLP-2, including adjacent amino acid residues. In some
embodiments, the GLP-2 variant incorporated into the GLP2-XTEN has glycine (G), valine (V),
glutamate (E), lysine (K), arginine (R), leucine (L) or isoleucine (I) substituted for alanine (A) at position
2 of the mature peptide. Such substitution may confer resistance to dipeptidyl peptidase-4 (DPP-4). In
one embodiment, glycine is substituted for alanine at position 2 of the GLP-2 sequence. If any one
substitution results in an undesirable change in biological activity, then one of the alternative amino acids
can be employed and the construct protein evaluated by the methods described herein (e.g., the assays of
Table 32), or using any of the techniques and guidelines for conservative and non-conservative mutations
set forth, for instance, in U.S. Pat. No. 5,364,934 (the content of which is incorporated by reference in its
entirety), or using methods generally known in the art. In addition, variants can include, for instance,
polypeptides wherein one or more amino acid residues are added or deleted at the N- or C-terminus of the
full-length native amino acid sequence of a GLP-2 that retains some if not all of the biological activity of
the native peptide; e.g., the ability to bind GLP-2 receptor and/or the ability to activate GLP-2 receptor.
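The position-2 substitutions described above can be enumerated with a small sketch. The helper name substitute and the dictionary layout are hypothetical and serve only to illustrate the idea; the starting sequence is the mature human GLP-2 sequence, and the residue choices are the seven named in the text.

```python
# Illustrative sketch only: enumerate the position-2 substitutions named in the text.
# The helper below is hypothetical; positions are 1-based, as in the specification.

HUMAN_GLP2 = "HADGSFSDEMNTILDNLAARDFINWLIQTKITD"  # mature human GLP-2

# Residues named in the text as replacements for alanine at position 2.
POSITION_2_CHOICES = "GVEKRLI"


def substitute(seq: str, position: int, residue: str) -> str:
    """Replace the residue at a 1-based position with a one-letter residue code."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    return seq[: position - 1] + residue + seq[position:]


variants = {aa: substitute(HUMAN_GLP2, 2, aa) for aa in POSITION_2_CHOICES}

for aa, seq in variants.items():
    print(f"A2{aa}: {seq}")
```

The glycine entry reproduces the HGDGSFSDEMNTILDNLAARDFINWLIQTKITD sequence recited earlier; whether any given variant retains activity would still have to be determined by the assays the specification refers to.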
Table 2: Exemplary conservative amino acid substitutions
Original Residue Exemplary Substitutions
Ala (A)
Arg (R)
Asn (N)
Asp (D)
Cys (C) ser
Gln (Q)
Glu (E)
Gly (G) pro
His (H) asn; gln; lys; arg
Ile (I)
Leu (L)
Lys (K) arg; gln; asn
Met (M) leu; phe; ile
Phe (F)
Pro (P)
Thr (T) ser
Trp (W)
Tyr (Y)
Val (V) ile; leu; met; phe; ala; norleucine
Sequence variants of GLP-2, whether exhibiting substantially the same or better biological
activity than a corresponding wild-type GLP-2, or, alternatively, exhibiting substantially modified or
reduced biological activity relative to wild-type GLP-2, include, without limitation, polypeptides having
an amino acid sequence that differs from the sequence of wild-type GLP-2 by insertion, deletion, or
substitution of one or more amino acids. Such GLP-2 variants are known in the art, including those
described in US Patent No. 7,186,683 or US Pat. No. 5,789,379, 500, all of which are incorporated
herein by reference.
III). EXTENDED RECOMBINANT POLYPEPTIDES
In one aspect, the invention provides XTEN polypeptide compositions that are useful as fusion
protein partner(s) to link to and/or incorporate within a GLP-2 sequence, resulting in a GLP2-XTEN
fusion protein. XTEN are generally polypeptides with non-naturally occurring, substantially non-
repetitive sequences having a low degree of or no secondary or tertiary structure under physiologic
conditions. XTEN typically have from about 36 to about 3000 amino acids of which the majority or the
entirety are small hydrophilic amino acids. As used herein, “XTEN” specifically excludes whole
antibodies or antibody fragments (e.g. single-chain antibodies and Fc fragments). XTENs have utility as
fusion protein partners in that they serve in various roles, conferring certain desirable pharmacokinetic,
physicochemical and pharmaceutical properties when linked to a GLP-2 protein to create a GLP2-
XTEN fusion protein. Such GLP2-XTEN fusion protein compositions have enhanced properties
compared to the corresponding GLP-2 not linked to XTEN, making them useful in the treatment of
certain gastrointestinal conditions, as more fully described below.
The selection criteria for the XTEN to be fused to the biologically active proteins generally
relate to attributes of physicochemical properties and conformational structure of the XTEN that is, in
turn, used to confer the enhanced properties to the fusion protein compositions. The unstructured
characteristic and physical/chemical properties of the XTEN result, in part, from the overall amino acid
composition disproportionately limited to 4-6 hydrophilic amino acids, the linking of the amino acids in a
quantifiable non-repetitive design, and the length of the XTEN polypeptide. In an advantageous feature
common to XTEN but uncommon to polypeptides, the properties of XTEN disclosed herein are not tied
to absolute primary amino acid sequences, as evidenced by the diversity of the exemplary sequences of
Table 4 that, within varying ranges of length, possess similar properties, many of which are documented
in the Examples. The XTEN of the present invention exhibits one or more of the following advantageous
properties: conformational flexibility, reduced or lack of secondary structure, high degree of aqueous
solubility, high degree of protease resistance, low immunogenicity, low binding to mammalian receptors,
a defined degree of charge, and increased hydrodynamic (or Stokes) radii; properties that make them
particularly useful as fusion protein partners. In turn, non-limiting examples of the enhanced properties
of the fusion proteins comprising GLP-2 fused to the XTEN include increases in the overall solubility
and/or metabolic stability, reduced susceptibility to proteolysis, reduced immunogenicity, reduced rate of
absorption when administered subcutaneously or intramuscularly, reduced clearance by the kidney,
enhanced interactions with substrate, and enhanced pharmacokinetic properties. Enhanced
pharmacokinetic properties of the inventive GLP2-XTEN compositions include longer terminal half-life
(e.g., two-fold, three-fold, four-fold or more), increased area under the curve (AUC) (e.g., 25%, 50%,
100% or more), lower volume of distribution, slower absorption after subcutaneous or intramuscular
injection (compared to GLP-2 not linked to the XTEN and administered by a similar route) such that the
Cmax is lower, which, in turn, results in reductions in adverse effects of the GLP-2 that, collectively,
result in an increased period of time that a fusion protein of a GLP2-XTEN composition administered to
a subject provides therapeutic activity. In some embodiments, the GLP2-XTEN compositions comprise
cleavage sequences (described more fully, below) that permit sustained release of biologically active
GLP-2. A GLP2-XTEN having such cleavage sequence can act as a depot when subcutaneously or
intramuscularly administered. It is specifically contemplated that the subject GLP2-XTEN fusion
proteins of the disclosure can exhibit one or more or any combination of the improved properties
disclosed herein. In some embodiments, GLP2-XTEN compositions permit less frequent dosing
compared to GLP-2 not linked to the XTEN and administered in a comparable fashion. Such GLP2-
XTEN fusion protein compositions have utility to treat certain related diseases, disorders or
conditions, as described herein.
A variety of methods and assays are known in the art for determining the physicochemical
properties of proteins such as the compositions comprising the inventive XTEN. Such properties include
but are not limited to secondary or tertiary structure, solubility, protein aggregation, melting properties,
contamination and water content. Such methods include analytical centrifugation, EPR, HPLC-ion
exchange, HPLC-size exclusion chromatography (SEC), HPLC-reverse phase, light scattering, capillary
electrophoresis, circular dichroism, differential scanning calorimetry, fluorescence, HPLC-ion exchange,
IR, NMR, Raman spectroscopy, refractometry, and UV/Visible spectroscopy. Additional methods are
disclosed in Arnau, et al., Prot Expr and Purif (2006) 48, 1-13.
The XTEN component(s) of the GLP2-XTEN are designed to behave like denatured peptide
sequences under physiological conditions, despite the extended length of the polymer. “Denatured”
describes the state of a peptide in solution that is characterized by a large conformational freedom of the
peptide backbone. Most peptides and proteins adopt a denatured conformation in the presence of high
concentrations of denaturants or at elevated temperature. Peptides in denatured conformation have, for
example, characteristic circular dichroism (CD) spectra and are characterized by a lack of long-range
interactions as determined by NMR. “Denatured conformation” and “unstructured conformation” are
used synonymously herein. In some embodiments, the invention provides XTEN sequences that, under
physiologic conditions, resemble denatured sequences that are largely devoid of secondary structure. In
other cases, the XTEN sequences are substantially devoid of secondary structure under physiologic
conditions. “Largely devoid,” as used in this context, means that less than 50% of the XTEN amino acid
residues of the XTEN sequence contribute to secondary structure as measured or determined by the
means described herein. “Substantially devoid,” as used in this context, means that at least about 60%, or
about 70%, or about 80%, or about 90%, or about 95%, or at least about 99% of the XTEN amino acid
residues of the XTEN sequence do not contribute to secondary structure, as measured or determined by
the methods described herein.
A variety of methods have been established in the art to discern the presence or absence of
secondary and tertiary structures in a given polypeptide. In particular, secondary structure can be
measured spectrophotometrically, e.g., by circular dichroism spectroscopy in the “far-UV” spectral
region (190-250 nm). Secondary structure elements, such as alpha-helix and beta-sheet, each give rise to
a characteristic shape and magnitude of CD spectra. Secondary structure can also be predicted for a
polypeptide sequence via certain computer programs or algorithms, such as the well-known Chou-
Fasman algorithm (Chou, P. Y., et al. (1974) Biochemistry, 13: 222-45) and the Garnier-Osguthorpe-
Robson algorithm (“Gor algorithm”) (Garnier J, Gibrat JF, Robson B. (1996), GOR method for
predicting protein secondary structure from amino acid sequence. Methods Enzymol 266:540-553), as
described in US Patent Application Publication No. 20030228309A1. For a given sequence, the
algorithms can predict whether there exists some or no secondary structure at all, expressed as the total
and/or percentage of residues of the sequence that form, for example, alpha-helices or beta-sheets or the
percentage of residues of the sequence predicted to result in random coil formation (which lacks
secondary structure). Polypeptide sequences can be analyzed using the Chou-Fasman algorithm using
sites on the world wide web at, for example,
fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=miscl and the Gor algorithm at npsapbil.ibcp.fr
in/npsa_automat.pl?page=npsa_gor4.html (both accessed on September 5, 2012).
In one embodiment, the XTEN sequences used in the subject fusion protein compositions have
an alpha-helix percentage ranging from 0% to less than about 5% as determined by the Chou-Fasman
algorithm. In another embodiment, the XTEN sequences of the fusion protein compositions have a beta-
sheet percentage ranging from 0% to less than about 5% as determined by the Chou-Fasman algorithm.
In some embodiments, the XTEN sequences of the fusion protein compositions have an alpha-helix
percentage ranging from 0% to less than about 5% and a beta-sheet percentage ranging from 0% to less
than about 5% as determined by the Chou-Fasman algorithm. In one embodiment, the XTEN sequences
of the fusion protein compositions have an alpha-helix percentage less than about 2% and a beta-sheet
percentage less than about 2%. The XTEN sequences of the fusion protein compositions have a high
degree of random coil percentage, as determined by the GOR algorithm. In some embodiments, an
XTEN sequence has at least about 80%, more preferably at least about 90%, more preferably at least
about 91%, more preferably at least about 92%, more preferably at least about 93%, more preferably at
least about 94%, more preferably at least about 95%, more preferably at least about 96%, more preferably
at least about 97%, more preferably at least about 98%, and most preferably at least about 99% random
coil, as determined by the GOR algorithm. In one embodiment, the XTEN sequences of the fusion
protein compositions have an alpha-helix percentage ranging from 0% to less than about 5% and a beta-
sheet percentage ranging from 0% to less than about 5% as determined by the Chou-Fasman algorithm
and at least about 90% random coil, as determined by the GOR algorithm. In another embodiment, the
XTEN sequences of the fusion protein compositions have an alpha-helix percentage less than about 2%
and a beta-sheet percentage less than about 2% and at least about 90% random coil, as determined by the
GOR algorithm.
1. Non-repetitive Sequences
It is contemplated that the XTEN sequences of the GLP2-XTEN embodiments are substantially
non-repetitive. In general, repetitive amino acid sequences have a tendency to aggregate or form higher
order structures, as exemplified by natural repetitive sequences such as collagens and leucine zippers.
These repetitive amino acids may also tend to form contacts resulting in crystalline or pseudocrystalline
structures. In contrast, the low tendency of non-repetitive sequences to aggregate enables the design of
long-sequence XTENs with a relatively low frequency of charged amino acids that would otherwise be
likely to aggregate if the sequences were repetitive. The non-repetitiveness of a subject XTEN can be
observed by assessing one or more of the following features. In one embodiment, a “substantially non-
repetitive” XTEN sequence has no three contiguous amino acids in the sequence that are of identical
amino acid types unless the amino acid is serine, in which case no more than three contiguous amino
acids are serine residues. In another embodiment, as described more fully below, a “substantially non-
repetitive” XTEN sequence comprises motifs of 9 to 14 amino acid residues wherein the motifs consist of
3, 4, 5, or 6 types of amino acids selected from glycine (G), alanine (A), serine (S), threonine (T),
glutamate (E) and proline (P), and wherein the sequence of any two contiguous amino acid residues in
any one motif is not repeated more than twice in the sequence motif.
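The two criteria above can be sketched as simple checks. This is a minimal illustration under stated assumptions, not a procedure taken from the specification: the function names are hypothetical, and “not repeated more than twice” is read here as “each pair of contiguous residues occurs at most twice within a motif”.

```python
# Illustrative sketch only: two checks suggested by the "substantially
# non-repetitive" criteria. Function names and the reading of "not repeated
# more than twice" (at most two occurrences per motif) are assumptions.
from collections import Counter
from itertools import groupby


def runs_ok(seq: str) -> bool:
    """No run of three identical residues; serine may run up to three."""
    for residue, group in groupby(seq):
        run_length = len(list(group))
        limit = 3 if residue == "S" else 2
        if run_length > limit:
            return False
    return True


def dipeptides_ok(motif: str, max_occurrences: int = 2) -> bool:
    """Each pair of contiguous residues appears at most max_occurrences times."""
    pairs = Counter(motif[i:i + 2] for i in range(len(motif) - 1))
    return all(count <= max_occurrences for count in pairs.values())


motif = "GSEPATSGSETP"  # a 12-residue motif listed in Table 3
print(runs_ok(motif), dipeptides_ok(motif))
```

Under this reading the Table 3 motif shown passes both checks (its pairs GS and SE each occur exactly twice), whereas a strictly alternating sequence such as GSGSGSGSGSGS fails the pair check.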
The degree of repetitiveness of a polypeptide or a gene can be measured by computer programs
or algorithms or by other means known in the art. According to the current invention, algorithms to be
used in calculating the degree of repetitiveness of a particular polypeptide, such as an XTEN, are
disclosed herein, and examples of sequences analyzed by algorithms are provided (see Examples, below).
In one embodiment, the repetitiveness of a polypeptide of a predetermined length can be calculated
(hereinafter “subsequence score”) according to the formula given by Equation 1:
\text{Subsequence score} = \frac{\sum_{i=1}^{m} \mathrm{Count}_i}{m}
wherein: m = (amino acid length of polypeptide) - (amino acid length of subsequence) +
1; and
Count_i = cumulative number of occurrences of each unique subsequence within
sequence_i
An algorithm termed “SegScore” was developed to apply the foregoing equation to quantitate
repetitiveness of polypeptides, such as an XTEN, providing the subsequence score wherein sequences of
a predetermined amino acid length are analyzed for repetitiveness by determining the number of times (a
“count”) a unique subsequence of length “s” appears in the set length, divided by the absolute number of
subsequences within the predetermined length of the sequence. One figure depicts a logic flowchart of the
SegScore algorithm, while another portrays a schematic of how a subsequence score is derived for a
fictitious XTEN with 11 amino acids and a subsequence length of 3 amino acid residues. For example, a
predetermined polypeptide length of 200 amino acid residues has 192 overlapping 9-amino acid
subsequences and 198 3-mer subsequences, but the subsequence score of any given polypeptide will
depend on the absolute number of unique subsequences and how frequently each unique subsequence
(meaning a different amino acid sequence) appears in the predetermined length of the sequence.
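A minimal sketch of the calculation may help. It is not the SegScore program itself; it follows one reading of Equation 1 in which Count_i is the number of times the subsequence starting at frame i occurs anywhere in the sequence, and the sum over all m frames is divided by m. The function name and the default subsequence length of 3 are assumptions made for illustration.

```python
# Illustrative sketch only, not the SegScore implementation.
# Assumed reading of Equation 1: Count_i is the number of occurrences, within
# the whole sequence, of the subsequence that starts at frame i; the score is
# the sum of Count_i over all m overlapping frames, divided by m.
from collections import Counter


def subsequence_score(seq: str, k: int = 3) -> float:
    """Average occurrence count of the k-mer found at each overlapping frame."""
    m = len(seq) - k + 1  # number of overlapping frames of length k
    if m < 1:
        raise ValueError("sequence is shorter than the subsequence length")
    frames = [seq[i:i + k] for i in range(m)]
    counts = Counter(frames)
    return sum(counts[frame] for frame in frames) / m


# Frame counts quoted in the text for a 200-residue polypeptide.
assert 200 - 9 + 1 == 192 and 200 - 3 + 1 == 198

print(subsequence_score("GSEPATSGSETP"))
```

Under this reading a sequence in which every 3-mer is unique scores exactly 1, and the score rises as 3-mers recur, which is in keeping with the preference for low scores expressed in the embodiments.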
In the context of the present invention, “subsequence score” means the sum of occurrences of
each unique 3-mer frame across 200 consecutive amino acids of the cumulative XTEN polypeptide
divided by the absolute number of unique 3-mer subsequences within the 200 amino acid sequence.
Examples of such subsequence scores derived from 200 consecutive amino acids of repetitive and non-repetitive
polypeptides are presented in Example 30. In one embodiment, the invention provides a
GLP2-XTEN comprising one XTEN in which the XTEN has a subsequence score less than 12, more
preferably less than 10, more preferably less than 9, more preferably less than 8, more preferably less
than 7, more preferably less than 6, and most preferably less than 5. In another embodiment, the
invention provides GLP2-XTEN comprising two or more XTENs in which at least one XTEN has a
subsequence score of less than 10, or less than 9, or less than 8, or less than 7, or less than 6, or less than
5, or less. In yet another embodiment, the invention provides GLP2-XTEN comprising at least two
XTENs in which each individual XTEN of 36 or more amino acids has a subsequence score of less than
10, or less than 9, or less than 8, or less than 7, or less than 6, or less than 5, or less. In the embodiments
of this paragraph, the XTEN is characterized as substantially non-repetitive.
In one aspect, the non-repetitive characteristic of XTEN of the present invention together with
the particular types of amino acids that predominate in the XTEN, rather than the absolute primary
sequence, confers one or more of the enhanced physicochemical and biological properties of the GLP2-
XTEN fusion proteins. These enhanced properties include a higher degree of expression of the fusion
protein in the host cell, greater genetic stability of the gene encoding XTEN, a greater degree of
solubility, less tendency to aggregate, and enhanced pharmacokinetics of the resulting GLP2-XTEN
compared to fusion proteins comprising polypeptides having repetitive sequences. These enhanced
properties permit more efficient manufacturing, lower cost of goods, and/or facilitate the formulation of
XTEN-comprising pharmaceutical preparations containing extremely high protein concentrations, in
some cases exceeding 100 mg/ml. In some embodiments, the XTEN polypeptide sequences of the
embodiments are designed to have a low degree of internal repetitiveness in order to reduce or
substantially eliminate immunogenicity when administered to a mammal. Polypeptide sequences
composed of short, repeated motifs largely limited to only three amino acids, such as glycine, serine and
glutamate, may result in relatively high antibody titers when administered to a mammal despite the
absence of predicted T-cell epitopes in these sequences. This may be caused by the repetitive nature of
polypeptides, as it has been shown that immunogens with repeated epitopes, including protein
aggregates, cross-linked immunogens, and repetitive carbohydrates are highly immunogenic and can, for
example, result in the cross-linking of B-cell receptors causing B-cell activation (Johansson, J., et al.
(2007) Vaccine, 25:1676-82; Yankai, Z., et al. (2006) Biochem Biophys Res Commun, 345:1365-71;
Hsu, C. T., et al. (2000) Cancer Res, 60:3701-5; Bachmann MF, et al. Eur J Immunol. (1995)
(12):3445-3451).
2. Exemplary Sequence Motifs
The present invention encompasses XTEN used as fusion partners that comprise multiple units
of shorter sequences, or motifs, in which the amino acid sequences of the motifs are substantially non-
repetitive. The non-repetitive property can be met even using a “building block” approach using a library
of sequence motifs that are multimerized to create the XTEN sequences. While an XTEN sequence may
consist of multiple units of as few as four different types of sequence motifs, because the motifs
themselves generally consist of non-repetitive amino acid sequences, the overall XTEN sequence is
designed to render the sequence substantially non-repetitive.
In one embodiment, an XTEN has a substantially non-repetitive sequence of greater than about
36 to about 3000, or about 100 to about 2000, or about 144 to about 1000 amino acid residues, or even
longer wherein at least about 80%, or at least about 85%, or at least about 90%, or at least about 95%, or
at least about 97%, or about 100% of the XTEN sequence consists of non-overlapping sequence motifs,
and wherein each of the motifs has about 9 to 36 amino acid residues. As used herein, “non-overlapping”
means that the individual motifs do not share amino acid residues but, rather, are linked to other motifs or
amino acid residues in a linear fashion. In other embodiments, at least about 80%, or at least about 85%,
or at least about 90%, or at least about 95%, or at least about 97%, or about 100% of the XTEN sequence
consists of non-overlapping sequence motifs wherein each of the motifs has 9 to 14 amino acid residues.
In still other embodiments, at least about 80%, or at least about 85%, or at least about 90%, or at least
about 95%, or at least about 97%, or about 100% of the XTEN sequence consists of non-overlapping
sequence motifs wherein each of the motifs has 12 amino acid residues. In these embodiments, it is
preferred that the sequence motifs are composed of substantially (e.g., 90% or more) or exclusively small
hydrophilic amino acids, such that the overall sequence has an unstructured, flexible characteristic.
Examples of amino acids that are included in XTEN are, e.g., arginine, lysine, threonine, alanine,
asparagine, glutamine, aspartate, glutamate, serine, and glycine. In one embodiment, XTEN sequences
have predominately four to six types of amino acids selected from glycine (G), alanine (A), serine (S),
threonine (T), glutamate (E) or proline (P) that are arranged in a substantially non-repetitive sequence
that is greater than about 36 to about 3000, or about 100 to about 2000, or about 144 to about 1000 amino
acid residues in length. In some embodiments, an XTEN sequence is made of 4, 5 or 6 types of amino
acids selected from the group consisting of glycine (G), alanine (A), serine (S), threonine (T), glutamate
(E) or proline (P). In some embodiments, XTEN have sequences of greater than about 36 to about 1000,
or about 100 to about 2000, or about 400 to about 3000 amino acid residues wherein at least about 80%
of the sequence consists of non-overlapping sequence motifs wherein each of the motifs has 9 to 36
amino acid residues and wherein at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least
94%, or at least 95%, or at least 96%, or at least 97%, or 100% of each of the motifs consists of 4 to 6
types of amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and
proline (P), and wherein the content of any one amino acid type in the full-length XTEN does not exceed
%. In other embodiments, at least about 90% of the XTEN sequence consists of non-overlapping
sequence motifs wherein each of the motifs has 9 to 36 amino acid residues wherein the motifs consist of
4 to 6 types of amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate
(E) and proline (P), and wherein the content of any one amino acid type in the full-length XTEN does not
exceed 40%, or about 30%, or about 25%. In other embodiments, at least about 90% of the XTEN
sequence consists of non-overlapping sequence motifs wherein each of the motifs has 12 amino acid
residues consisting of 4 to 6 types of amino acids selected from glycine (G), alanine (A), serine (S),
threonine (T), glutamate (E) and proline (P), and wherein the content of any one amino acid type in the
full-length XTEN does not exceed 40%, or 30%, or about 25%. In yet other embodiments, at least about
90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about
97%, or about 98%, or about 99%, to about 100% of the XTEN sequence consists of non-overlapping
sequence motifs wherein each of the motifs has 12 amino acid residues consisting of glycine (G), alanine
(A), serine (S), threonine (T), glutamate (E) and proline (P).
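The compositional limits recited in this paragraph can be sketched as a simple check. The function name, the requirement that the whole sequence divide evenly into 12-residue motifs, and the 48-residue example (four Table 3 motifs joined end to end purely for illustration) are all assumptions of this sketch rather than features of any particular XTEN of the invention.

```python
# Illustrative sketch only: check a candidate sequence against the motif
# composition limits described above. Names and the example are hypothetical.
from collections import Counter

ALLOWED = set("GASTEP")  # glycine, alanine, serine, threonine, glutamate, proline


def composition_ok(seq: str, motif_len: int = 12, max_fraction: float = 0.40) -> bool:
    """True if every non-overlapping motif uses 4 to 6 allowed residue types
    and no single residue type exceeds max_fraction of the full sequence."""
    if not seq or len(seq) % motif_len != 0:
        raise ValueError("sequence length must be a positive multiple of motif_len")
    motifs = [seq[i:i + motif_len] for i in range(0, len(seq), motif_len)]
    motifs_ok = all(set(m) <= ALLOWED and 4 <= len(set(m)) <= 6 for m in motifs)
    top_fraction = max(Counter(seq).values()) / len(seq)
    return motifs_ok and top_fraction <= max_fraction


# Hypothetical 48-residue example built from four motifs that appear in Table 3.
EXAMPLE = "GSEPATSGSETP" + "GSTSESPSGTAP" + "GSTSSTAESPGP" + "GSSPSASTGTGP"

print(composition_ok(EXAMPLE))                     # 40% limit
print(composition_ok(EXAMPLE, max_fraction=0.30))  # stricter 30% limit
```

In this example serine makes up 15 of 48 residues, so the sequence satisfies the 40% limit but not the 30% limit, which shows how the alternative thresholds in the text narrow the set of acceptable sequences.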
In still other embodiments, XTENs comprise substantially non-repetitive sequences of greater
than about 36 to about 3000 amino acid residues wherein at least about 80%, or at least about 90%, or
about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about 97%, or
about 98%, or about 99% of the sequence consists of non-overlapping sequence motifs of 9 to 14 amino
acid residues wherein the motifs consist of 4 to 6 types of amino acids selected from glycine (G), alanine
(A), serine (S), threonine (T), glutamate (E) and proline (P), and wherein the sequence of any two
contiguous amino acid residues in any one motif is not repeated more than twice in the sequence motif.
In other embodiments, at least about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or
about 95%, or about 96%, or about 97%, or about 98%, or about 99% of an XTEN sequence consists of
non-overlapping sequence motifs of 12 amino acid residues wherein the motifs consist of four to six
types of amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and
proline (P), and wherein the sequence of any two contiguous amino acid residues in any one sequence
motif is not repeated more than twice in the sequence motif. In other embodiments, at least about 90%,
or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about 97%, or
about 98%, or about 99% of an XTEN sequence consists of non-overlapping sequence motifs of 12
amino acid residues wherein the motifs consist of glycine (G), alanine (A), serine (S), threonine (T),
glutamate (E) and proline (P), and wherein the sequence of any two contiguous amino acid residues in
any one sequence motif is not repeated more than twice in the sequence motif. In yet other embodiments,
XTENs consist of 12 amino acid sequence motifs wherein the amino acids are selected from glycine (G),
alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), and wherein the sequence of any two
contiguous amino acid residues in any one sequence motif is not repeated more than twice in the
sequence motif, and wherein the content of any one amino acid type in the full-length XTEN does not
exceed 30%. The foregoing embodiments are examples of substantially non-repetitive XTEN sequences.
Additional examples are detailed below.
In some embodiments, the invention provides GLP2-XTEN compositions comprising one, or
two, or three, or four, five, six or more non-repetitive XTEN sequence(s) of about 36 to about 1000
amino acid residues, or cumulatively about 100 to about 3000 amino acid residues wherein at least about
80%, or at least about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or
about 96%, or about 97%, or about 98%, or about 99% to about 100% of the sequence consists of
multiple units of four or more non-overlapping sequence motifs selected from the amino acid sequences
of Table 3, wherein the overall sequence remains substantially non-repetitive. In some embodiments, the
XTEN comprises non-overlapping sequence motifs in which about 80%, or at least about 85%, or at least
about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or
about 97%, or about 98%, or about 99% or about 100% of the sequence consists of multiple units of non-
overlapping sequences selected from a single motif family selected from Table 3, resulting in a family
sequence. Family as applied to motifs means that the XTEN has motifs selected from a motif category of
Table 3; i.e., AD, AE, AF, AG, AM, AQ, BC, or BD, and that any other amino acids in the XTEN not
from a motif family are selected to achieve a needed property, such as to permit incorporation of a
restriction site by the encoding nucleotides, incorporation of a cleavage sequence, or to achieve a better
linkage to a GLP-2 component of the GLP2-XTEN. In some embodiments of XTEN families, an XTEN
sequence comprises multiple units of non-overlapping sequence motifs of the AD motif family, or of the
AE motif family, or of the AF motif family, or of the AG motif family, or of the AM motif family, or of
the AQ motif family, or of the BC family, or of the BD family, with the resulting XTEN exhibiting the
range of homology described above. In other embodiments of XTEN families, each XTEN of a given
family has at least four different motifs of the same family from Table 3; e.g., four motifs of AD or AE or
AF or AG or AM, etc. In other embodiments, the XTEN comprises multiple units of motif sequences
from two or more of the motif families of Table 3, selected to achieve desired physicochemical
characteristics, including such properties as net charge, lack of secondary structure, or lack of
repetitiveness that may be conferred by the amino acid composition of the motifs, described more fully
below. In the embodiments hereinabove described in this paragraph, the motifs or portions of the motifs
incorporated into the XTEN can be selected and assembled using the methods described herein to achieve
an XTEN of about 36, about 42, about 72, about 144, about 288, about 576, about 864, about 1000, about
2000 to about 3000 amino acid residues, or any intermediate length. Non-limiting examples of XTEN
family sequences useful for incorporation into the subject GLP2-XTEN are presented in Table 4. It is
intended that a specified sequence mentioned relative to Table 4 has that sequence set forth in Table 4,
while a generalized reference to an AE144 sequence, for example, is intended to encompass any AE
sequence having 144 amino acid residues; e.g., AE144_1A, AE144_2A, etc., or a generalized reference
to an AG144 sequence, for example, is intended to encompass any AG sequence having 144 amino acid
residues, e.g., AG144_1, AG144_2, AG144_A, AG144_B, AG144_C, etc.
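The distinction drawn above between a specified sequence name (e.g., AE144_2A) and a generalized reference (e.g., AE144) can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the regular expression and the helper names `parse_xten_name` and `covered_by` are hypothetical and assume names of the form family, residue count, and an optional underscore-separated variant suffix.

```python
import re

# Assumed name layout: two-letter family, residue count, optional "_variant".
_NAME = re.compile(r"^([A-Z]{2})(\d+)(?:_(\w+))?$")


def parse_xten_name(name):
    """Split a name such as 'AE144_1A' into (family, length, variant)."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized XTEN name: {name!r}")
    family, length, variant = m.groups()
    return family, int(length), variant


def covered_by(generalized, specific):
    """True when `specific` shares the family and length of `generalized`."""
    g_family, g_length, _ = parse_xten_name(generalized)
    s_family, s_length, _ = parse_xten_name(specific)
    return (g_family, g_length) == (s_family, s_length)
```

Under these assumptions, `covered_by("AE144", "AE144_2A")` holds, while `covered_by("AE144", "AG144_1")` does not, because the motif family differs.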
Table 3: XTEN Sequence Motifs of 12 Amino Acids and Motif Families
SSGSES
AD GSSESGSSEGGP
AE, AM
AE, AM, AQ GSEPATSGSETP
AAAAAQ
AE, AM, AQ
AF, AM GSTSESPSGTAP
AF, AM
AF, AM GSTSSTAESPGP
AG, AM
AG, AM GSSPSASTGTGP
BD GSETATSGSETA
a Denotes individual motif sequences that, when used together in various
permutations, result in a “family sequence”
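The percentage of a sequence that consists of non-overlapping 12-residue motifs can be estimated with a simple left-to-right scan. The Python sketch below is illustrative only: `motif_coverage` is a hypothetical helper, and the motif set is limited to the six motif sequences of Table 3 that are legible in this text, so coverage values computed with it will understate coverage against the complete table. Because all motifs have the same length, taking the leftmost match at each step yields the maximum number of non-overlapping units.

```python
MOTIF_LEN = 12

# Only the Table 3 motif sequences legible in this text, keyed to their families.
LEGIBLE_MOTIFS = {
    "GSSESGSSEGGP": ("AD",),
    "GSEPATSGSETP": ("AE", "AM", "AQ"),
    "GSTSESPSGTAP": ("AF", "AM"),
    "GSTSSTAESPGP": ("AF", "AM"),
    "GSSPSASTGTGP": ("AG", "AM"),
    "GSETATSGSETA": ("BD",),
}


def motif_coverage(seq, motifs=LEGIBLE_MOTIFS):
    """Fraction of seq covered by non-overlapping motif units (leftmost-first scan)."""
    if not seq:
        return 0.0
    covered = 0
    i = 0
    while i + MOTIF_LEN <= len(seq):
        if seq[i:i + MOTIF_LEN] in motifs:
            covered += MOTIF_LEN  # count the whole unit, then skip past it
            i += MOTIF_LEN
        else:
            i += 1  # no motif starts here; slide forward one residue
    return covered / len(seq)
```

For example, a sequence built from two motif units followed by two extra residues, such as `"GSEPATSGSETP" + "GSSESGSSEGGP" + "AS"`, gives a coverage of 24/26, which would satisfy an "at least about 90%" threshold.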
Table 4: XTEN Polypeptides
XTEN Name    Amino Acid Sequence
AE42 GAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPASS
AE42_1 TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
AE42_2 PAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSG
AE42_3 SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSP
AG42_1 GAPSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGPSGP
AG42_2 GPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASP
AG42_3 SPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
AG42_4 SASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATG
AE48 MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGS
AM48 MAEPAGSPTSTEEGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGS
GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGSEPA
AE144 TSGSETPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSET
PGTSTEPSEGSAP
SPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPS
AE144_1A EGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGP
GTSTEPSEGSAPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPS
AE144_2A TSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP
GTSESATPESGPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPS
AE144_2B EGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP
GTSESATPESGPG
SPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPS
AE144_3A EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEE
GTSTEPSEGSAPG
SPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPS
AE144_3B EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEE
GTSTEPSEGSAPG
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS
AE144_4A EGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGP
GTSTEPSEGSAPG
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS
AE144_4B EGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGP
SEGSAPG
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS
AE144_5A EGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEE
GSPAGSPTSTEEG
TSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATS
AE144_6B GSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGP
GTSTEPSEGSAPG
GTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGSTSESPSGTAPGSTSST
AF144 AESPGPGTSPSGESSTAPGTSTPESGSASPGSTSSTAESPGPGTSPSGESSTAPGTSPSGESSTAP
GTSPSGESSTAP
SGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTG
AG144_1 TGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGS
GTGPGASP
PGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASP
AG144_2 GTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSST
GSPGTPGSGTASSS
GASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPS
AG144_A ASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTG
SPGASPGTSSTGSP
GTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPS
AG144_B ASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTG
SPGASPGTSSTGSP
GTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGS
AG144_C GTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATG
SPGASPGTSSTGSP
GSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPS
AG144_F ASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATG
GTSSTGSP
GTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPG
AG144_3 TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTG
SPGASPGTSSTGSP
AG144_4 GTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPG
PGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASS
SPGSSTPSGATGSP
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEP
SEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEE
AE288_1
GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPA
TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSET
PGTSESATPESGPGTSTEPSEGSAP
GSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEP
SEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEE
AE288_2 GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES
ATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTE
EGTSESATPESGPGTSTEPSEGSAP
PGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSST
PSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTG
AG288_1 TGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGA
SPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSAST
GTGPGTPGSGTASSSPGSSTPSGATGS
GSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTG
AG288_2 SPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSS
PSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSS
TGSPGASPGTSSTGSPGTPGSGTASSSP
SSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSXPS
ASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTG
SPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSS
TPSGATGSPGSXPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSS
AF504
TGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPG
ASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGT
SSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGS
PGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSP
AESPGPGSTSSTAESPGPGSTSESPSGTAPGSTSSTAESPGPGSTSSTAESPGPGTSTPE
SGSASPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAP
GSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGSTSES
PSGTAPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGP
AF540 GTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSES
PSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASP
GSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPE
SGSASPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASP
GSTSESPSGTAP
GSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSES
GSSEGGPGSSESGSSEGGPGESPGGSSGSESGSEGSSGPGESSGSSESGSSEGGPGSSESGSSEG
GPGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSS
ESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSGGEPSE
AD576 SGSSGSGGEPSESGSSGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSESGESPGGSSGSESG
ESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSGGEPSESGSSGSEGSS
GPGESSGSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSE
SGESPGGSSGSESGSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGG
EPSESGSSGESPGGSSGSESGSEGSSGPGESSGSSESGSSEGGPGSEGSSGPGESS
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEP
SEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSES
PGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGS
AE576 APGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEG
TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTS
TEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP
GSTSSTAESPGPGSTSSTAESPGPGSTSESPSGTAPGSTSSTAESPGPGSTSSTAESPGPGTSTPE
AF576
SGSASPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAP
GSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGSTSES
PSGTAPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGP
SGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSES
PSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASP
GSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPE
SGSASPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASP
GSTSESPSGTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASP
PGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSST
PSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSS
TGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSP
GASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPG
AG576 SGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSST
GSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPG
ASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSST
GSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGS
MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSPAGSPTSTEEGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATP
ESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEP
SEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG
AE624
PGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST
EPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEG
SAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS
EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSP
TSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP
GSSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESGSGGEPSESGSSGESPGGSSGSESGESP
GGSSGSESGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESGESPGGSS
GSESGESPGGSSGSESGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGP
SSEGGPGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGG
EPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGSEGSSGPGESSGSSESGSSE
GGPGSGGEPSESGSSGESPGGSSGSESGSGGEPSESGSSGSGGEPSESGSSGSSESGSSEGGPG
AD836 SGGEPSESGSSGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSEGSSGPGESSGSEGSS
GPGESSGSGGEPSESGSSGSSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESGSGGEPSESG
SSGSEGSSGPGESSGESPGGSSGSESGSEGSSGPGSSESGSSEGGPGSGGEPSESGSSGSEGSSG
PGESSGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSGGEPSESGSSGESPGGSSGSES
GESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSSESGSSEGGPGSSES
GSSEGGPGSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSESGSGGEPSES
GSSGSSESGSSEGGPGESPGGSSGSESGSGGEPSESGSSGESPGGSSGSESGSGGEPSESGSS
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTE
PSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPES
GPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTS
ESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSE
GSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPG
TSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP
SEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSET
AE864
PGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPA
EEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPE
SGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGS
PAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSP
TSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETP
SGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSES
ATPESGPGTSTEPSEGSAP
PSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPE
SGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGTSPSGESSTAP
GTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSTPE
AF864
SGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGP
GTSTPESGSASPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPE
SGSASPGSTSSTAESPGPGSTSSTAESPGPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAP
PSGTAPGSTSESPSGTAPGTSTPESGPXXXGASASGAPSTXXXXSESPSGTAPGSTSE
SPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSA
SGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTS
ESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGS
ASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTSESPSGTAPGSTSESPSGTAPGT
SPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGES
STAPGTSPSGESSTAPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSSPSASTGTGPG
SSTPSGATGSPGSSTPSGATGSP
GASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPS
ASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSST
GSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGS
TGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGT
GASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGS
PGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASP
GTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGAT
AG864_2
GSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPG
TPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGAT
TPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGS
STPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSG
ATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGS
PGSSTPSGATGSPGASPGTSSTGSP
GTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSTSSTAESPGPGTSTPESGSASPGSTSE
SPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSP
AGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSE
GSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEG
SSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTSTEPSEGSAPGTSTEPSEGSAPGSEPAT
SGSETPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGASASGAPSTGGTSESATPESGP
AM875
GSPAGSPTSTEEGSPAGSPTSTEEGSTSSTAESPGPGSTSESPSGTAPGTSPSGESSTAPGTPGS
GTASSSPGSSTPSGATGSPGSSPSASTGTGPGSEPATSGSETPGTSESATPESGPGSEPATSGSE
TPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSEPATSGSETPGSEPATSGSETPGTS
TEPSEGSAPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGTSTEPSEGSAPGTSTEPSE
GSAPGTSTEPSEGSAPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSEPATSGSETP
GTSESATPESGPGSPAGSPTSTEEGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTSES
ATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSPAGSPTSTEEGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATP
ESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEP
SEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG
PGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST
EPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEG
AE912 SAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS
EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSP
SPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSES
ATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS
APGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP
AGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSE
GSAP
MAEPAGSPTSTEEGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGTSTEPSEGSAPGSE
PATSGSETPGSPAGSPTSTEEGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGSTSESPSG
TAPGTSTPESGSASPGTSTPESGSASPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGT
STEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS
AM923
EGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
GTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSSTPSGATGSPGTPGS
GTASSSPGSSTPSGATGSPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTST
EEGSPAGSPTSTEEGTSTEPSEGSAPGASASGAPSTGGTSESATPESGPGSPAGSPTSTEEGSP
AGSPTSTEEGSTSSTAESPGPGSTSESPSGTAPGTSPSGESSTAPGTPGSGTASSSPGSSTPSGA
TGSPGSSPSASTGTGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGSTSSTAESPGPG
STSSTAESPGPGTSPSGESSTAPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGSTSST
AESPGPGTSTPESGSASPGSTSESPSGTAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSA
PGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSEPATSGSETPGTSESATPESGPGSPA
GSPTSTEEGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTSESATPESGPGTSTEPSEG
SAPGTSTEPSEGSAP
GTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSTSSTAESPGPGTSTPESGSASPGSTSE
SPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSP
TEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSE
GSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEG
SSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTSTEPSEGSAPGTSTEPSEGSAPGSEPAT
SGSETPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGPEPTGPAPSGGSEPATSGSETP
GTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSES
ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSTSSTAESPGPGSTSESPSGTAPGTSPSGESST
APGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGTSTEPSEGSAPGTSESATPESGPGTS
AM1318 ESATPESGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSESATP
ESGPGTSTEPSEGSAPGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAPGTSTEPSEGSAPG
SPAGSPTSTEEGTSTEPSEGSAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGSSTPS
GATGSPGSSTPSGATGSPGASPGTSSTGSPGASASGAPSTGGTSPSGESSTAPGSTSSTAESPG
PGTSPSGESSTAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSSPSASTGTGPGSST
PSGATGSPGASPGTSSTGSPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGTSESATPE
SGPGSEPATSGSETPGTSTEPSEGSAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGS
PAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATS
GSETPGSSTPSGATGSPGASPGTSSTGSPGSSTPSGATGSPGSTSESPSGTAPGTSPSGESSTAP
GSTSSTAESPGPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSPAGSPTSTEEGSPAG
SPTSTEEGTSTEPSEGSAP
GTSTEPSEPGSAGTSTEPSEPGSAGSEPATSGTEPSGSGASEPTSTEPGSEPATSGTEPSGSEPA
TSGTEPSGSEPATSGTEPSGSGASEPTSTEPGTSTEPSEPGSAGSEPATSGTEPSGTSTEPSEPG
SAGSEPATSGTEPSGSEPATSGTEPSGTSTEPSEPGSAGTSTEPSEPGSAGSEPATSGTEPSGSE
PATSGTEPSGTSEPSTSEPGAGSGASEPTSTEPGTSEPSTSEPGAGSEPATSGTEPSGSEPATSG
TEPSGTSTEPSEPGSAGTSTEPSEPGSAGSGASEPTSTEPGSEPATSGTEPSGSEPATSGTEPSG
SEPATSGTEPSGSEPATSGTEPSGTSTEPSEPGSAGSEPATSGTEPSGSGASEPTSTEPGTSTEP
SEPGSAGSEPATSGTEPSGSGASEPTSTEPGTSTEPSEPGSAGSGASEPTSTEPGSEPATSGTEP
BC864
SGSGASEPTSTEPGSEPATSGTEPSGSGASEPTSTEPGTSTEPSEPGSAGSEPATSGTEPSGSGA
SEPTSTEPGTSTEPSEPGSAGSEPATSGTEPSGTSTEPSEPGSAGSEPATSGTEPSGTSTEPSEPG
SAGTSTEPSEPGSAGTSTEPSEPGSAGTSTEPSEPGSAGTSTEPSEPGSAGTSTEPSEPGSAGTS
EPSTSEPGAGSGASEPTSTEPGTSTEPSEPGSAGTSTEPSEPGSAGTSTEPSEPGSAGSEPATSG
GASEPTSTEPGSEPATSGTEPSGSEPATSGTEPSGSEPATSGTEPSGSEPATSGTEPSG
TSEPSTSEPGAGSEPATSGTEPSGSGASEPTSTEPGTSTEPSEPGSAGSEPATSGTEPSGSGASE
PTSTEPGTSTEPSEPGSA
GSETATSGSETAGTSESATSESGAGSTAGSETSTEAGTSESATSESGAGSETATSGSETAGSE
TATSGSETAGTSTEASEGSASGTSTEASEGSASGTSESATSESGAGSETATSGSETAGTSTEA
SEGSASGSTAGSETSTEAGTSESATSESGAGTSESATSESGAGSETATSGSETAGTSESATSES
GAGTSTEASEGSASGSETATSGSETAGSETATSGSETAGTSTEASEGSASGSTAGSETSTEAG
TSESATSESGAGTSTEASEGSASGSETATSGSETAGSTAGSETSTEAGSTAGSETSTEAGSET
TAGTSESATSESGAGTSESATSESGAGSETATSGSETAGTSESATSESGAGTSESATS
BD864 ESGAGSETATSGSETAGSETATSGSETAGTSTEASEGSASGSTAGSETSTEAGSETATSGSET
ATSESGAGSTAGSETSTEAGSTAGSETSTEAGSTAGSETSTEAGTSTEASEGSASGS
TAGSETSTEAGSTAGSETSTEAGTSTEASEGSASGSTAGSETSTEAGSETATSGSETAGTSTE
ASEGSASGTSESATSESGAGSETATSGSETAGTSESATSESGAGTSESATSESGAGSETATSG
SESATSESGAGSETATSGSETAGTSTEASEGSASGTSTEASEGSASGSTAGSETSTE
AGSTAGSETSTEAGSETATSGSETAGTSESATSESGAGTSESATSESGAGSETATSGSETAGS
ETATSGSETAGSETATSGSETAGTSTEASEGSASGTSESATSESGAGSETATSGSETAGSETA
TSGSETAGTSESATSESGAGTSESATSESGAGSETATSGSETA
GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAG
AE948 EGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGSEPATSGSE
TPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTS
TEPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGSEPATSG
SETPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPG
SEPATSGSETPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGSPAGS
PTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSA
PGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPA
GSPTSTEEGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGS
EPATSGSETPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATS
GSETPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGP
GSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGSPAG
SPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGS
APGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP
GSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTE
PSEGSAPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPES
GPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGSP
AGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSG
SETPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPG
TSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSPAGS
PTSTEEGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSET
PGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGTST
AE1044
EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEG
SAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGS
EPATSGSETPGTSESATPESGPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPS
EGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEE
GTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSES
ATPESGPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTST
EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTS
TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGTSESATPESGPGTST
GSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGSEPA
PGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSESATPES
SATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTS
TEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSPAGSPT
STEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESA
TPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTE
EGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSE
AE1140
SATPESGPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPE
SGPGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGS
EPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESAT
PESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGP
GTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTE
PSEGSAPGSEPATSGSETPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPES
GPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSP
AGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGSPA
GSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSTE
PSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSE
TPGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGTS
TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSE
GSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSEPATSGSETPG
TSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESA
TPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTE
AE1236 EGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSPA
GSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPE
SGPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGT
SESATPESGPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTSTEEGTSESAT
TSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSESATPESGP
GSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSES
PGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTST
EEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEEGTS
ESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSEP
PTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGTSTE
PSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSPAGSPTST
EEGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGSE
PATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATP
ESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGTSESA
TPESGPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSA
PGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSEP
AE1332
TPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGT
SESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPS
EGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGP
GTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGTSES
ATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTST
EEGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSE
PATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTST
GSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTE
PSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTST
EEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
TEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPT
STEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
SPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSTEP
SEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSEPATSGSET
PGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSEP
AE1428
ATSGSETPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPE
SGPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGT
STEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSP
TSTEEGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAP
GTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGTSES
ATPESGPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTST
EEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGTS
TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSPA
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAG
SPTSTEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS
APGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGTS
TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPT
STEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEG
TSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEP
SEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSPAGSPTSTE
EGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSE
AE1524
SATPESGPGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGS
ETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGS
PAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSEPATS
GSETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
GSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTE
PSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGTSESATPES
GPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGTS
ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSPA
SGSETPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGTSES
ATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS
APGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSESATPESGPGTS
SAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSE
STEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPG
AE1620
SEPATSGSETPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESA
TPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTE
SPTSTEEGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGTSE
SATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTS
TEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGS
PAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATS
GSETPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETP
GSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSES
PGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSESATPES
GPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTSTEEGTS
ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTST
GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAG
SPTSTEEGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPES
EPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSG
SETPGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPG
TSTEEGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSEPAT
SGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESG
PGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSE
AE1716
SATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSESATPE
SGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGS
EPATSGSETPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGTSTEPS
EGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEE
GTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSES
ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
APGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTS
ESATPESGPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSE
GTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSTE
PSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPES
GPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSE
ETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSE
GSAPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEG
TSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSESA
TPESGPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSA
PGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSPA
AE1812
GSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPE
SGPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT
SESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSEPATS
GSETPGSPAGSPTSTEEGTSESATPESGPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE
GTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAG
SPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSP
AGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEP
GSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSPAG
EGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEGS
APGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS
ESATPESGPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSESATP
ESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPG
SEPATSGSETPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSESA
TPESGPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTE
EGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSEP
AE1908
ATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTS
TEEGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGT
SESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESAT
PESGPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE
GTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPA
TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPES
GPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSP
AGSPTSTEEGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEP
SEGSAPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAG
SPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGS
AE2004A APGSPAGSPTSTEEGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTS
ESATPESGPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSTEPSE
GSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSPAGSPTSTEEG
TSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEP
SEGSAPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESG
PGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSESATPESGPGTST
EPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG
SAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGT
SESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPS
EGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP
GSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGSPAG
SPTSTEEGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGSEPATSGSE
TPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTS
ESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSE
GSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGTPG
SGTASSSPGTPGSGTASSSPGSSPSASTGTGPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGAT
GSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGS
SPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSAS
TGTGPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGS
PGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSST
PSGATGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTA
AG948 SSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPG
ASPGTSSTGSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGSSTPS
GATGSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTG
SPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTP
SSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGSSPSAST
GTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSP
GTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTP
SGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSP
GTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGTPG
SGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTG
TGPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGTPGSGTASSSPG
ASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGTPGS
GTASSSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGAT
GSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGS
STPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGASPGT
SSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGS
AG1044
PGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSST
PSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSS
TGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSP
GSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPS
ASTGTGPGASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSST
GSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPG
TPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGTPGS
GTASSSPGSSPSASTGTGPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSST
GASPGTSSTGSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGSSTP
SGATGSPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGAT
GSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGS
SPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTPGSG
TASSSPGSSPSASTGTGPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGS
PGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGSSP
SASTGTGPGTPGSGTASSSPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTA
SPSASTGTGPGASPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPG
AG1140
TPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGAT
TPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGS
STPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSG
ATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTG
PGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGASP
SPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSPSASTGTGPGASPGTSST
GSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGSST
AG1236 GSSPSASTGTGPGTPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGASP
GTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTA
SSSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPG
TPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGTPGS
PGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGT
GPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGA
SPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGT
ASSSPGSSTPSGATGSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGP
GSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPS
ASTGTGPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGTPGSGTAS
SSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGS
STPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGTPGSGTASSSPGSSPSAS
TGTGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGS
PGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSST
PSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSS
TGSPGTPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASP
GSSTPSGATGSPGSSPSASTGTGPGTPGSGTASSSPGSSPSASTGTGPGASPGTSSTGSPGSSPS
ASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGAT
PGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGS
SPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGTPGSG
TASSSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSP
GPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGTPGSGTASSSPGSSPSAST
GTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSP
AG1332
GATGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGTPG
SGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGTPGSGTASSSPGTPGSGTA
SSSPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPG
ASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGTPGS
GTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGT
GPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGTPGSGTASSSPGSS
TPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGASPGTS
SSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPG
GTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGTPG
SGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGSSTPSGAT
GSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPG
TPGSGTASSSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGS
GTASSSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGSSPSASTGT
GPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGA
SPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGSSTPSGATGSPGSSPSAS
TGTGPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGS
AG1428
PGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTP
GSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSS
TGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSP
GASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASP
GTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTA
SSSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPG
SSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGTPGS
GTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGASP
GSSTPSGATGSPGTPGSGTASSSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTPG
SGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGTPGSGTASSSPGTPGSGTA
SSSPGSSPSASTGTGPGSSTPSGATGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPG
SSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSPSA
STGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATG
SPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSS
AG1524
PSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSPSASTGTGPGASPGTS
STGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSP
GTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGTPG
SGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTG
TGPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGTPGSGTASSSPGSSPSASTGTGPG
ASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGAT
GSPGSSTPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGS
SPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGT
SSTGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGTPG
GSSTPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGTPGSGTASSSPGASP
SPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSPSASTG
TGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPG
ASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGTPGS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGAT
GSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGS
SPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSPSAS
TGTGPGSSTPSGATGSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTG
AG1620
PGASPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASP
GTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGAT
GSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGS
STPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSPSAS
TGTGPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGSSPSASTGTGPGTPGSGTASSS
PGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSST
SPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGASPGTSS
TGSPGTPGSGTASSSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSST
GASPGTSSTGSPGSSPSASTGTGPGSSTPSGATGSPGSSPSASTGTGPGTPGSGTASSSPGSSTP
SGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTG
TGPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPG
ASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSTPSGATGSPGTPGS
GTASSSPGSSPSASTGTGPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGTPGSGTAS
SSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGT
PGSGTASSSPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSG
ATGSPGASPGTSSTGSPGTPGSGTASSSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGS
AG1716
PGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSP
SASTGTGPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGA
TGSPGASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSP
GSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTASSSPGTPG
SGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGAT
GSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGS
SPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGT
SSTGSPGASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPG
GSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSPS
ASTGTGPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTG
TGPGTPGSGTASSSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPG
SSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSA
STGTGPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTG
SPGSSTPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGAS
PGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSAST
GTGPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSP
AG1812
STGTGPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPG
SGTASSSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSST
PGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGS
SPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGT
SSTGSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGASPGTSSTGSPGSSTPSGATGS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSP
SASTGTGPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGASPGTSS
TGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPGASP
GSSPSASTGTGPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSPSASTGTGPGSSPS
ASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGTPGSGTASSSPGASPGTSST
GSPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGSSPSASTGTGPG
AG1908 ASPGTSSTGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGS
GTASSSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGAT
GSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPG
ASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGAT
GSPGTPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPG
ASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSA
GASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTG
SPGTPGSGTASSSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSS
PSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGASPGTS
STGSPGSSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGP
GTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGTPG
SGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSASTGTGPGSSP
GSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTP
SGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGASPGTSST
GSPGSSTPSGATGSPGTPGSGTASSSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGS
SPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSG
TASSSPGTPGSGTASSSPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGSSTPSGATGS
PGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSPSASTGTGPGSSP
SASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGTPGSGTA
SSSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGSSTPSGATGSPG
AG2004A
SSTPSGATGSPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGSSPSA
STGTGPGTPGSGTASSSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSPSASTGT
GPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSS
TPSGATGSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTS
SSPSASTGTGPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSP
GSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGTPGSGTASSSPGSSPSASTGTGPGASP
GTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTA
SSSPGSSTPSGATGSPGTPGSGTASSSPGSSPSASTGTGPGSSPSASTGTGPGASP
SPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPAT
AE72B
SGSETPG
TSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEP
AE72C
SEGSAPG
TEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGT
AE108A
SESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTS
GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPA
AE108B
TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAP
GSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESAT
AE144A PESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEE
GSPAGSPTSTEEGS
SEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGS
AE144B PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTE
EGTSTEPSEGSAPG
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEE
AE180A GTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPA
TSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATS
PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSES
AE216A
PGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
ATSGSETPGTSESAT
ESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPG
SEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESA
AE252A
TPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTE
EGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSE
TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSET
PGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSE
AE288A SATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPE
SGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGT
STEPSEGSAPGSEPATSGSETPGTSESA
PESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTE
AE324A
PSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTST
EEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSE
PATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATS
TSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEE
GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPA
TSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSE
AE360A
TPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSE
GSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAT
PESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEE
GSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA
PGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTST
AE396A
EEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTS
TEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSG
SETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPS
EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAG
SPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSE
AE432A TPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTS
ESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPG
SPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATS
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP
GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSES
ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGS
APGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS
AE468A
TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTS
TEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSE
PATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGS
ETPGTSESAT
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEE
GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES
ATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTE
EGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP
AE504A
TPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEG
TEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPS
TPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAG
SPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSET
PGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPA
AE540A GSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEP
ATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPES
GPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTS
TEPSEGSAPGTSTEP
TPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP
SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAG
AE576A SPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSET
PGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSES
ATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS
APGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP
AGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESA
GSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP
AE612A SEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEP
SEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGP
GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES
ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTS
ESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEG
SAPGSEPATSGSETPGTSESAT
PESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPG
TSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESAT
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESAT
PESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEG
AE648A
SPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATS
GSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEE
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEP
SEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETP
GSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAT
EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSES
ATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGS
APGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSP
AGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSG
AE684A SETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGS
PAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATP
EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS
EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATP
ESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEG
TSTEPSEGSAPGTSTEPSEGSAPGSEPATS
PGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSA
PGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSES
ATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPES
GPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS
SAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTS
AE720A TEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTS
TEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSE
ETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTE
TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSA
PGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSES
PGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPES
GPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS
TEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTS
TEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
AE756A
ESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTS
TEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEG
SAPGTSTEPSEGSAPGSEPATSGSETPGTSES
EGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSES
ATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGS
APGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEG
AE792A
SAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTS
TEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTS
TEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTS
WO 40093
TEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGS
ETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTS
TEPS
PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPS
EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSES
PGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGS
EPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSP
AE828A AGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSG
SETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGS
PAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATP
ESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS
EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATP
ESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEG
TSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAT
GPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTP
AG72A
GSGTASS
GSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGS
AG72B
GTASSSP
SPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSG
AG72C
ATGSPGA
SASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSST
AG108A
GSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASP
PGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSP
AG108B
SASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSS
PGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASP
AG144A GTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSST
GSPGTPGSGTASSS
PSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTG
AG144B TGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGA
SPGTSSTGSPGASP
TSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
AG180A GPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGAS
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGS
SSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSP
GTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGS
AG216A
GTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
GPGSSPSASTGTGPGSSTPSG
TSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
GPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGAS
AG252A
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
TGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPG
TSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
GPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGAS
AG288A PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
TGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPG
ASPGTSSTGSPGASPGTSSTGSPGTPGS
TSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTG
SGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSS
AG324A TPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTA
SSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPG
SSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTP
TSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTG
GTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTP
GSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSS
AG360A
TGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPG
ASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSG
TASSSPGSSTPSGATGSPGSSTPSGATGSPGASPG
GATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSS
TSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSST
PSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGAT
AG396A GSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGS
SPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSAS
TGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSP
GASPGT
GATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGS
PGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASP
GTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGAT
AG432A GSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGS
SPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSAS
TGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGP
GASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPS
TSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTG
SPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSS
TPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTA
SSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPG
AG468A
SSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPS
GATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGS
PGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSP
SASTGTGPGASPG
TSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTG
SPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSS
TPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTA
SSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPG
AG504A
SSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPS
GATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGS
PGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSP
SASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTP
TSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTG
SPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSS
TPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSS
TGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPG
AG540A TPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPS
GATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSS
SGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSST
PSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGAT
GSPGSSTPSGATGSPGASPG
TSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGT
GPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSS
TPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGA
TGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPG
AG576A ASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSG
TASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTG
ASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSP
GPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSST
GSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPG
TPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP
STGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPG
TSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTG
SPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSS
TPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGA
AG612A
TGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPG
ASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPS
GATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGS
PGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSP
SASTGTGPGSSPSASTGTGPGASPGTS
AG648A GTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATG
SPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGAS
PGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSS
TGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPG
ATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGT
SSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGS
PGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSST
PSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTAS
SSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGS
STPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTP
TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATG
SPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGAS
PGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSS
TGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPG
TPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGT
AG684A SSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGS
PGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPG
SGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSST
GSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGA
SPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGT
ASSSPGSSTPSGATGSPGSSTPSGATGSPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATG
SASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGAS
PGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSS
SPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPG
SSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPS
GATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSS
AG720A
PGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSST
SPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGAT
GSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGS
SPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSAS
TGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSP
GASPG
TSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
GPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGAS
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
TGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPG
ASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGT
SSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGS
AG756A
PGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSST
PSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTAS
SSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGS
STPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSG
ATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP
GSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPG
TSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
GPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGAS
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
TGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPG
ASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGT
GSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGS
AG792A PGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSST
PSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTAS
SSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGS
STPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSG
ATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP
GSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPS
ASTGTGPGASPG
TSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGT
AG828A
GPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGAS
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
SPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPG
ASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGT
SSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGS
PGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSST
SPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTAS
SSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGS
STPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSG
ATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP
STGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPS
ASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSSTP
TASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPG
PGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTG
AG288_D
SPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTP
GSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSS
TGSPGTPGSGTASSSPGSSTPSGATGSP
In other embodiments, the GLP2-XTEN composition comprises one or more non-repetitive
XTEN sequences of about 36 to about 3000 amino acid residues or about 144 to about 2000 amino acid
residues or about 288 or about 1000 amino acid residues, wherein at least about 80%, or at least about
90%, or about 91%, or about 92%, or about 93%, or about 94%, or about 95%, or about 96%, or about
97%, or about 98%, or about 99% to about 100% of the sequence consists of non-overlapping 36 amino
acid sequence motifs selected from one or more of the polypeptide sequences of Tables 8-11, either as a
family sequence, or where motifs are selected from two or more families of motifs.
In those embodiments wherein the XTEN component of the GLP2-XTEN fusion protein has
less than 100% of its amino acids consisting of four to six amino acids selected from glycine (G), alanine
(A), serine (S), threonine (T), glutamate (E) and proline (P), or less than 100% of the sequence consisting
of the sequence motifs from Table 3 or the sequences of Tables 4 and 8-12, or less than 100% sequence
identity compared with an XTEN from Table 4, the other amino acid residues are selected from any other
of the 14 natural L-amino acids, but are preferentially selected from hydrophilic amino acids such that the
XTEN sequence contains at least about 90%, or at least about 91%, or at least about 92%, or at least
about 93%, or at least about 94%, or at least about 95%, or at least about 96%, or at least about 97%, or
at least about 98%, or at least about 99% hydrophilic amino acids. The XTEN amino acids that are not
glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) are interspersed
throughout the XTEN sequence, are located within or between the sequence motifs, or are concentrated
in one or more short stretches of the XTEN sequence. In such cases where the XTEN component of the
GLP2-XTEN comprises amino acids other than glycine (G), alanine (A), serine (S), threonine (T),
glutamate (E) and proline (P), it is desirable that these amino acids not be hydrophobic residues and not
substantially confer secondary structure on the XTEN component. Hydrophobic residues that are less
favored in construction of XTEN include tryptophan, phenylalanine, tyrosine, leucine, isoleucine, valine,
and methionine. Additionally, one can design the XTEN sequences to contain less than 5% or less than
4% or less than 3% or less than 2% or less than 1% or none of the following amino acids: cysteine (to
avoid disulfide formation and oxidation), methionine (to avoid oxidation), asparagine and glutamine (to
avoid desamidation). Thus, in some embodiments, the XTEN component of the GLP2-XTEN fusion
protein comprising other amino acids in addition to glycine (G), alanine (A), serine (S), threonine (T),
glutamate (E) and proline (P) would have a sequence with less than 5% of the residues contributing to
alpha-helices and beta-sheets as measured by the Chou-Fasman algorithm and have at least 90%, or at
least about 95% or more random coil formation as measured by the GOR algorithm.
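The composition criteria above can be illustrated with a minimal sketch. It assumes one-letter amino acid codes; the function name, the class groupings, and the example sequence are hypothetical choices made for illustration and are not taken from the tables. It does not attempt the Chou-Fasman or GOR secondary-structure predictions.

```python
from collections import Counter

# Residue classes taken from the paragraph above (one-letter codes).
XTEN_CORE = set("GASTEP")      # glycine, alanine, serine, threonine, glutamate, proline
HYDROPHOBIC = set("WFYLIVM")   # residues listed as less favored in XTEN
RESTRICTED = set("CMNQ")       # cysteine, methionine, asparagine, glutamine


def composition_report(seq):
    """Return the fraction of residues in each class for a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)

    def frac(residues):
        return sum(counts[r] for r in residues) / n

    return {
        "core_fraction": frac(XTEN_CORE),
        "hydrophobic_fraction": frac(HYDROPHOBIC),
        "restricted_fraction": frac(RESTRICTED),
    }


# Hypothetical 20-residue sequence: 19 core residues plus one lysine.
example = "GSPGTSESATPESGPGTSEK"
report = composition_report(example)
```

Note that methionine appears in both the hydrophobic and the restricted class, as it does in the text, so the fractions are not mutually exclusive.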
3. Length of Sequence
In another aspect, the invention provides XTEN of varying lengths for incorporation into
GLP2-XTEN compositions wherein the length of the XTEN sequence(s) is chosen based on the
property or function to be achieved in the fusion protein. Depending on the intended property or
function, the GLP2-XTEN compositions comprise short or intermediate length XTEN and/or longer
XTEN sequences that can serve as carriers. While not intended to be limiting, the XTEN or fragments of
XTEN include short segments of about 6 to about 99 amino acid residues, intermediate lengths of about
100 to about 399 amino acid residues, and longer lengths of about 400 to about 3000 amino acid residues.
Thus, the subject GLP2-XTEN encompass XTEN or fragments of XTEN with lengths of about 6, or
about 12, or about 36, or about 40, or about 100, or about 144, or about 288, or about 401, or about 500,
or about 600, or about 700, or about 800, or about 900, or about 1000, or about 1500, or about 2000, or
about 2500, or up to about 3000 amino acid residues in length. In other cases, the XTEN sequences can
be about 6 to about 50, or about 100 to 150, about 150 to 250, about 250 to 400, about 400 to about 500,
about 500 to 900, about 900 to 1500, about 1500 to 2000, or about 2000 to about 3000 amino acid
residues in length. The precise length of an XTEN can vary without adversely affecting the biological
activity of a GLP2-XTEN composition. In one embodiment, one or more of the XTEN used in the
GLP2-XTEN disclosed herein has 36 amino acids, 42 amino acids, 144 amino acids, 288 amino acids, 576
amino acids, or 864 amino acids in length and may be selected from one of the XTEN family sequences;
i.e., AD, AE, AF, AG, AM, AQ, BC or BD. In another embodiment, one or more of the XTEN used
herein is selected from the group consisting of XTEN_AE864, XTEN_AE576, XTEN_AE288,
XTEN_AE144, XTEN_AE42, XTEN_AG864, XTEN_AG576, XTEN_AG288, XTEN_AG144, and
XTEN_AG42 or other XTEN sequences in Table 4. In the embodiments of the GLP2-XTEN, the one or
more XTEN or fragments of XTEN sequences individually exhibit at least about 80% sequence identity,
or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99%, or 100% sequence identity compared to a motif or an XTEN selected from Table
4, or a fragment thereof with comparable length. In some embodiments, the GLP2-XTEN fusion proteins
comprise a first and at least a second XTEN sequence, wherein the cumulative length of the residues in
the XTEN sequences is greater than about 100 to about 3000 or about 400 to about 1000 amino acid
residues and the XTEN can be identical or they can be different in sequence or in length. As used herein,
“cumulative length” is intended to encompass the total length, in amino acid residues, when more than
one XTEN is incorporated into the GLP2-XTEN fusion protein.
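As a small illustration of the length categories and of "cumulative length" as defined above, the following sketch classifies XTEN lengths and sums them. The function names are hypothetical, and the hard cut-offs are an assumption: the text gives the ranges as approximate ("about").

```python
def length_class(n_residues):
    """Classify an XTEN length using the approximate ranges given in the text."""
    if 6 <= n_residues <= 99:
        return "short"
    if 100 <= n_residues <= 399:
        return "intermediate"
    if 400 <= n_residues <= 3000:
        return "long"
    return "outside the stated ranges"


def cumulative_length(xten_lengths):
    """Total length, in residues, of all XTEN in one fusion protein."""
    return sum(xten_lengths)


# Hypothetical fusion protein carrying a 144-residue and a 288-residue XTEN.
lengths = [144, 288]
total = cumulative_length(lengths)
classes = [length_class(n) for n in lengths]
```

In this example each XTEN is individually of intermediate length, while the cumulative length of 432 residues falls in the longer range.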
As described more fully below, methods are disclosed in which the GLP2-XTEN is designed
by selecting the length of the XTEN to confer a target half-life or other physicochemical property on a
fusion protein administered to a subject. When XTEN are used as a carrier, the invention takes
advantage of the discovery that increasing the length of the non-repetitive, unstructured polypeptides
enhances the unstructured nature of the XTENs and correspondingly enhances the biological and
pharmacokinetic properties of fusion proteins comprising the XTEN carrier. In general, XTEN
cumulative lengths longer than about 400 residues incorporated into the fusion protein compositions
result in longer half-life compared to shorter cumulative lengths, e.g., shorter than about 280 residues.
As described more fully in the Examples, proportional increases in the length of the XTEN, even if
created by a repeated order of single family sequence motifs (e.g., the four AE motifs of Table 3), result
in a sequence with a higher percentage of random coil formation, as determined by GOR algorithm, or
reduced content of alpha-helices or beta-sheets, as determined by Chou-Fasman algorithm, compared to
shorter XTEN lengths. In addition, increasing the length of the unstructured polypeptide fusion partner,
as described in the Examples, results in a fusion protein with a disproportionate increase in terminal half-
life compared to fusion proteins with unstructured polypeptide partners with shorter sequence lengths.
In some embodiments, where the XTEN serve primarily as a carrier, the invention encompasses
GLP2-XTEN compositions comprising one or more XTEN wherein the cumulative XTEN sequence
length of the fusion protein(s) is greater than about 100, 200, 400, 500, 600, 800, 900, or 1000 to about
3000 amino acid residues, wherein the fusion protein exhibits enhanced pharmacokinetic properties when
administered to a subject compared to a GLP-2 not linked to the XTEN and administered at a comparable
dose. In one embodiment of the foregoing, the one or more XTEN sequences exhibit at least about 80%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, or 98% or more identity to a sequence selected from Table
4, and the remainder, if any, of the carrier sequence(s) contains at least 90% hydrophilic amino acids and
less than about 2% of the overall sequence consists of hydrophobic or aromatic amino acids or cysteine.
The enhanced pharmacokinetic properties of the GLP2-XTEN in comparison to GLP-2 not linked to
XTEN are described more fully below.
In another aspect, the invention provides methods to create XTEN of short or intermediate
lengths from longer “donor” XTEN sequences, wherein the longer donor sequence is
truncated at the N-terminus, or the C-terminus, or a fragment is created from the interior of a donor
sequence, thereby resulting in a short or intermediate length XTEN. In non-limiting examples, as
schematically depicted in -C, the AG864 sequence of 864 amino acid residues can be truncated
to yield an AG144 with 144 residues, an AG288 with 288 residues, an AG576 with 576 residues, or other
intermediate lengths, while the AE864 sequence (as depicted in , E) can be truncated to yield an
AE288 or AE576 or other intermediate lengths. It is specifically contemplated that such an approach can
be utilized with any of the XTEN embodiments described herein or with any of the sequences listed in
Tables 4 or 8-12 to result in XTEN of a desired length.
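The three ways of deriving a shorter XTEN from a donor sequence can be sketched as simple string operations. This is an illustration only: the function names are hypothetical, and the donor below is a placeholder built by repeating a 12-residue fragment so that it has 864 residues. It is not the AG864 sequence of Table 4.

```python
def _check(donor, length):
    if not 0 < length <= len(donor):
        raise ValueError("length must be between 1 and the donor length")


def truncate_c_terminus(donor, length):
    """Remove residues from the C-terminus, keeping the N-terminal part."""
    _check(donor, length)
    return donor[:length]


def truncate_n_terminus(donor, length):
    """Remove residues from the N-terminus, keeping the C-terminal part."""
    _check(donor, length)
    return donor[len(donor) - length:]


def interior_fragment(donor, start, length):
    """Take a fragment of the given length starting at a 0-based offset."""
    _check(donor, length)
    if start < 0 or start + length > len(donor):
        raise ValueError("fragment must lie inside the donor")
    return donor[start:start + length]


# Placeholder 864-residue donor (assumption for illustration, not AG864).
donor = "GASPGTSSTGSP" * 72
frag_144 = truncate_c_terminus(donor, 144)
frag_288 = truncate_n_terminus(donor, 288)
frag_576 = interior_fragment(donor, 100, 576)
```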
4. Net charge
In other embodiments, the XTEN polypeptides have an unstructured characteristic imparted by
incorporation of amino acid residues with a net charge and containing a low proportion or no
hydrophobic amino acids in the XTEN sequence. The overall net charge and net charge density is
controlled by modifying the content of charged amino acids in the XTEN sequences, either positive or
negative, with the net charge typically represented as the percentage of amino acids in the polypeptide
contributing to a charged state beyond those residues that are cancelled by a residue with an opposing
charge. In some embodiments, the net charge density of the XTEN of the compositions may be above
+0.1 or below -0.1 charges/residue. By “net charge density” of a protein or peptide herein is meant the
net charge divided by the total number of amino acids in the protein or propeptide. In other
embodiments, the net charge of an XTEN can be about 0%, about 1%, about 2%, about 3%, about 4%,
about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 11%, about 12%, about 13%,
about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, or about 20% or more. In some
embodiments, the XTEN sequence comprises charged residues separated by other residues such as serine
or glycine, which leads to better expression or purification behavior. Based on the net charge, some
XTENs have an isoelectric point (pI) of 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, or even 6.5. In
one embodiment, the XTEN will have an isoelectric point between 1.5 and 4.5 and carry a net negative
charge under physiologic conditions.
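The definition of net charge density (net charge divided by the total number of amino acids) can be worked through in a short sketch. The assumptions are labeled in the code: aspartate and glutamate are counted as -1, lysine and arginine as +1, histidine and the termini are ignored, and the 12-residue example sequence is illustrative rather than quoted from the tables.

```python
# Assumed charge assignments at physiologic pH (simplification):
NEGATIVE = set("DE")   # aspartate, glutamate: -1 each
POSITIVE = set("KR")   # lysine, arginine: +1 each


def net_charge(seq):
    """Positive minus negative residue count; opposing charges cancel."""
    seq = seq.upper()
    pos = sum(1 for r in seq if r in POSITIVE)
    neg = sum(1 for r in seq if r in NEGATIVE)
    return pos - neg


def net_charge_density(seq):
    """Net charge divided by the total number of residues."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return net_charge(seq) / len(seq)


def net_charge_percent(seq):
    """Net charge expressed as a percentage of residues (magnitude only)."""
    return 100.0 * abs(net_charge_density(seq))


# Illustrative 12-residue sequence with two glutamates (assumption).
example = "GSPAGSPTSTEE"
charge = net_charge(example)
density = net_charge_density(example)
percent = net_charge_percent(example)
```

For this example the net charge is -2 over 12 residues, so the density is below the -0.1 charges/residue figure mentioned in the text.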
Since most tissues and surfaces in a human or animal have a net negative charge, in some
embodiments the XTEN sequences are designed to have a net negative charge to minimize non-specific
interactions between the XTEN containing compositions and various surfaces such as blood vessels,
healthy tissues, or various receptors. Not to be bound by a particular theory, an XTEN can adopt open
conformations due to electrostatic repulsion between individual amino acids of the XTEN polypeptide
that individually carry a net negative charge and that are distributed across the sequence of the XTEN
polypeptide. In some embodiments, the XTEN sequence is designed with at least 90% or 95% of the
charged residues separated by other residues such as serine, alanine, threonine, proline or glycine, which
leads to a more uniform distribution of charge and better expression or purification behavior. Such a
distribution of net negative charge in the extended sequence lengths of XTEN can lead to an unstructured
conformation that, in turn, can result in an effective increase in hydrodynamic radius. In preferred
embodiments, the negative charge of the subject XTEN is conferred by incorporation of glutamic acid
residues. Generally, the glutamic residues are spaced uniformly across the XTEN sequence. In some
cases, the XTEN can contain about 10-80, or about 15-60, or about 20-50 glutamic residues per 20kDa of
XTEN that can result in an XTEN with charged residues that would have very similar pKa, which can
increase the charge homogeneity of the product, sharpen its isoelectric point, and enhance the
physicochemical properties of the resulting GLP2-XTEN fusion protein, thereby simplifying
purification procedures. For example, where an XTEN with a negative charge is desired, the XTEN can
be selected solely from an AE family sequence, which has approximately a 17% net charge due to
incorporated glutamic acid, or can include varying proportions of glutamic acid-containing motifs of
Table 3 to provide the desired degree of net charge. Non-limiting examples of AE XTEN include, but
are not limited to the AE36, AE42, AE48, AE144, AE288, AE576, AE624, AE864, and AE912
polypeptide sequences of Tables 4 or 9, or fragments thereof. In one embodiment, an XTEN sequence of
Tables 4 or 9 can be modified to include additional glutamic acid residues to achieve the desired net
negative charge. Accordingly, in one embodiment the invention provides XTEN in which the XTEN
sequences contain about 1%, 2%, 4%, 8%, 10%, 15%, 17%, 20%, 25%, or even about 30% glutamic
acid. In some cases, the XTEN can contain about 10-80, or about 15-60, or about 20-50 glutamic
residues per 20kDa of XTEN that can result in an XTEN with charged residues that would have very
similar pKa, which can increase the charge homogeneity of the product, sharpen its isoelectric point,
and enhance the physicochemical properties of the resulting GLP2-XTEN fusion protein, thereby
simplifying purification procedures. In one embodiment, the invention contemplates incorporation of
aspartic acid residues into XTEN in addition to glutamic acid in order to achieve a net negative charge.
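The "glutamic residues per 20kDa" figure can be estimated from a sequence as sketched below. This is a rough illustration under stated assumptions: the masses are approximate average residue masses in daltons for the six core residues only (any other residue raises a KeyError), and the 864-residue sequence is a placeholder built by repeating an illustrative 12-residue fragment, not a sequence from Table 4.

```python
# Approximate average residue masses in Da (assumption: rounded values).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08,
    "P": 97.12, "T": 101.10, "E": 129.12,
}
WATER = 18.02  # one water per polypeptide chain (free termini)


def molecular_weight_da(seq):
    """Approximate molecular weight of a G/A/S/P/T/E-only polypeptide."""
    return sum(RESIDUE_MASS[r] for r in seq) + WATER


def glu_per_20kda(seq):
    """Number of glutamate residues normalized to 20 kDa of polypeptide."""
    return seq.count("E") * 20000.0 / molecular_weight_da(seq)


# Placeholder 864-residue sequence (assumption for illustration only).
seq = "GSPAGSPTSTEE" * 72
value = glu_per_20kda(seq)
```

For this placeholder the estimate is a little over 36 glutamates per 20 kDa, which lies inside the 20-50 range mentioned in the text.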
Not to be bound by a particular theory, the XTEN of the GLP2-XTEN compositions with the
higher net negative charge are expected to have less non-specific interactions with various negatively
charged surfaces such as blood vessels, tissues, or various receptors, which would further contribute to
reduced active clearance. Conversely, it is believed that the XTEN of the GLP2-XTEN compositions
with a low (or no) net charge would have a higher degree of interaction with surfaces that can potentiate
the biological activity of the associated GLP-2, given the known contribution of phagocytic cells in the
inflammatory process in the intestines.
In other cases, where no net charge is desired, the XTEN can be selected from, for example,
AG family XTEN components, such as the AG motifs of Table 3, or those AM motifs of Table 3 that
have approximately no net charge. Non-limiting examples of AG XTEN include, but are not limited to
AG42, AG144, AG288, AG576, and AG864 polypeptide sequences of Tables 4 and 11, or fragments
thereof. In another embodiment, the XTEN can comprise varying proportions of AE and AG motifs in
order to have a net charge that is deemed optimal for a given use or to maintain a given physicochemical
property.
The XTEN of the compositions of the present invention generally have no or a low content of
positively charged amino acids. In some embodiments, the XTEN may have less than about 10% amino
acid residues with a positive charge, or less than about 7%, or less than about 5%, or less than about 2%,
or less than about 1% amino acid residues with a positive charge. However, the invention contemplates
constructs where a limited number of amino acids with a positive charge, such as lysine, are incorporated
into XTEN to permit conjugation between the epsilon amine of the lysine and a reactive group on a GLP-
2 peptide, a linker bridge, or a reactive group on a drug or small molecule to be conjugated to the XTEN
backbone. In one embodiment of the foregoing, the XTEN has between about 1 to about 100 lysine
residues, or about 1 to about 70 lysine residues, or about 1 to about 50 lysine residues, or about 1 to about
lysine residues, or about 1 to about 20 lysine residues, or about 1 to about 10 lysine residues, or about
1 to about 5 lysine residues, or alternatively only a single lysine residue. Using the foregoing lysine-
containing XTEN, fusion proteins are constructed that comprise XTEN, a GLP-2, plus a
chemotherapeutic agent useful in the treatment of GLP-2-related diseases or disorders, wherein the
maximum number of molecules of the agent incorporated into the XTEN component is determined by the
numbers of lysines or other amino acids with reactive side chains (e.g., cysteine) incorporated into the
XTEN. Accordingly, the invention also provides XTEN with 1 to about 10 cysteine residues, or about 1
to about 5 cysteine residues, or alternatively only a single cysteine residue, wherein fusion proteins are
constructed that comprise XTEN, a GLP-2, plus a chemotherapeutic agent useful in the treatment of
GLP-2-related diseases or disorders, wherein the maximum number of molecules of the agent
incorporated into the XTEN component is determined by the numbers of cysteines.
As hydrophobic amino acids impart structure to a polypeptide, the invention provides that the
content of hydrophobic amino acids in the XTEN will typically be less than 5%, or less than 2%, or less
than 1% hydrophobic amino acid content. In one embodiment, the amino acid content of methionine and
tryptophan in the XTEN component of a GLP2-XTEN fusion protein is typically less than 5%, or less
than 2%, and most preferably less than 1%. In another embodiment, the XTEN will have a sequence that
has less than 10% amino acid residues with a positive charge, or less than about 7%, or less than about
%, or less than about 2% amino acid residues with a positive charge, the sum of methionine and
tryptophan residues will be less than 2%, and the sum of asparagine and glutamine residues will be less
than 5% of the total XTEN sequence.
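The limits in the last embodiment above (positively charged residues below 10%, methionine plus tryptophan below 2%, asparagine plus glutamine below 5%) can be checked with a short sketch. The function names are hypothetical, counting only lysine and arginine as positively charged is an assumption, and both test sequences are made up for illustration.

```python
def percent(seq, residues):
    """Percentage of positions in seq occupied by any of the given residues."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(seq.count(r) for r in residues) / len(seq)


def check_limits(seq):
    """Evaluate the three composition limits from the embodiment above."""
    seq = seq.upper()
    return {
        "positive_below_10": percent(seq, "KR") < 10.0,   # assumes K and R only
        "met_trp_below_2": percent(seq, "MW") < 2.0,
        "asn_gln_below_5": percent(seq, "NQ") < 5.0,
    }


# Hypothetical sequences: one with a single lysine, one rich in N and Q.
ok = check_limits("GSPAGSPTSTEE" * 8 + "K")
bad = check_limits("GSPAGSPTSTEENQ")
```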
5. Low immunogenicity
In another aspect, the invention provides compositions in which the XTEN sequences have a
low degree of immunogenicity or are substantially non-immunogenic. Several factors can contribute to
the low immunogenicity of XTEN, e.g., the non-repetitive sequence, the unstructured conformation, the
high degree of solubility, the low degree or lack of self-aggregation, the low degree or lack of proteolytic
sites within the sequence, and the low degree or lack of epitopes in the XTEN sequence.
Conformational epitopes are formed by regions of the protein surface that are composed of
multiple discontinuous amino acid sequences of the protein antigen. The precise folding of the protein
brings these sequences into well-defined, stable spatial configurations, or epitopes, that can be
recognized as “foreign” by the host humoral immune system, resulting in the production of antibodies to
the protein or the activation of a cell-mediated immune response. In the latter case, the immune response
to a protein in an individual is heavily influenced by T-cell epitope recognition that is a function of the
peptide binding specificity of that individual’s HLA-DR allotype. Engagement of a MHC Class II
peptide complex by a cognate T-cell receptor on the surface of the T-cell, together with the cross-binding
of certain other co-receptors such as the CD4 molecule, can induce an activated state within the T-cell.
Activation leads to the release of cytokines further activating other lymphocytes such as B cells to
produce antibodies or activating T killer cells as a full cellular immune response.
The ability of a peptide to bind a given MHC Class II molecule for presentation on the surface
of an APC (antigen presenting cell) is dependent on a number of factors; most notably its primary
sequence. In one embodiment, a lower degree of immunogenicity is achieved by designing XTEN
sequences that resist antigen processing in antigen presenting cells, and/or choosing sequences that do
not bind MHC receptors well. The invention provides GLP2-XTEN fusion proteins with substantially
non-repetitive XTEN polypeptides designed to reduce binding with MHC II receptors, as well as
avoiding formation of epitopes for T-cell receptor or antibody binding, resulting in a low degree of
immunogenicity. Avoidance of immunogenicity can be attributed, at least in part, to the
conformational flexibility of XTEN sequences; i.e., the lack of secondary structure due to the selection
and order of amino acid residues. For example, of particular interest are sequences having a low
tendency to adopt compactly folded conformations in aqueous solution or under physiologic conditions
that could result in conformational epitopes. The administration of fusion proteins comprising XTEN,
using conventional therapeutic practices and dosing, would generally not result in the formation of
neutralizing antibodies to the XTEN sequence, and would also reduce the immunogenicity of the GLP-2
fusion partner in the GLP2-XTEN compositions.
In one embodiment, the XTEN sequences utilized in the subject fusion proteins can be
substantially free of epitopes recognized by human T cells. The elimination of such epitopes for the
purpose of generating less immunogenic proteins has been disclosed previously; see for example WO
98/52976, WO 02/079232, and WO 00/3317 which are incorporated by reference herein. Assays for
human T cell epitopes have been described (Stickler, M., et al. (2003) J Immunol Methods, 281: 95-108).
Of particular interest are peptide sequences that can be oligomerized without generating T cell epitopes
or non-human sequences. This is achieved by testing direct repeats of these sequences for the presence
of T-cell epitopes and for the occurrence of 6 to 15-mer and, in particular, 9-mer sequences that are not
human, and then altering the design of the XTEN sequence to eliminate or disrupt the epitope sequence.
In some embodiments, the XTEN sequences are substantially non-immunogenic by the restriction of the
numbers of epitopes of the XTEN predicted to bind MHC receptors. With a reduction in the numbers of
epitopes capable of binding to MHC receptors, there is a concomitant reduction in the potential for T cell
activation as well as T cell helper function, reduced B cell activation or upregulation and reduced
antibody production. The low degree of predicted T-cell epitopes can be determined by epitope
prediction algorithms such as, e.g., TEPITOPE (Sturniolo, T., et al. (1999) Nat Biotechnol, 17: 555-61),
as shown in Example 31. The TEPITOPE score of a given peptide frame within a protein is the log of
the Kd (dissociation constant, affinity, off-rate) of the binding of that peptide frame to multiple of the
most common human MHC alleles, as disclosed in Sturniolo, T. et al. (1999) Nature Biotechnology
17:555. The score ranges over at least 20 logs, from about 10 to about -10 (corresponding to binding
constraints of 10e10 Kd to 10e-10 Kd), and can be reduced by avoiding hydrophobic amino acids that serve
as anchor residues during peptide display on MHC, such as M, I, L, V, F. In some embodiments, an
XTEN component incorporated into a GLP2-XTEN does not have a predicted T-cell epitope at a
TEPITOPE threshold score of about -5, or -6, or -7, or -8, or -9, or at a TEPITOPE score of -10. As used
herein, a score of “-9” would be a more stringent TEPITOPE threshold than a score of -5.
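The idea of examining every 9-mer peptide frame can be illustrated with a sliding window. The sketch below is explicitly not the TEPITOPE algorithm and computes no binding score: it only enumerates frames and flags those containing one of the anchor residues named above (M, I, L, V, F), as a crude pre-screen. The function names and the 24-residue example sequence are hypothetical.

```python
ANCHOR_RESIDUES = set("MILVF")  # hydrophobic anchor residues named in the text


def frames(seq, width=9):
    """All overlapping peptide frames of the given width, in order."""
    return [seq[i:i + width] for i in range(len(seq) - width + 1)]


def frames_with_anchor(seq, width=9):
    """(offset, frame) pairs for frames containing any anchor residue."""
    return [
        (i, frame)
        for i, frame in enumerate(frames(seq, width))
        if ANCHOR_RESIDUES & set(frame)
    ]


# Hypothetical 24-residue sequence with a single leucine at offset 13.
example = "GSPAGSPTSTEEGLSESATPESGP"
all_frames = frames(example)
flagged = frames_with_anchor(example)
```

A single hydrophobic residue touches up to nine overlapping 9-mer frames, which is one way to see why avoiding such residues lowers the number of candidate frames that an epitope predictor would need to score.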
In another embodiment, the inventive XTEN sequences, including those incorporated into the
subject GLP2-XTEN fusion proteins, are rendered substantially non-immunogenic by the restriction of
known proteolytic sites from the sequence of the XTEN, reducing the processing of XTEN into small
peptides that can bind to MHC II receptors. In another embodiment, the XTEN sequence is rendered
substantially non-immunogenic by the use of a sequence that is substantially devoid of secondary structure,
conferring resistance to many proteases due to the high entropy of the structure. Accordingly, the
reduced TEPITOPE score and elimination of known proteolytic sites from the XTEN render the XTEN
compositions, including the XTEN of the GLP2-XTEN fusion protein compositions, substantially unable
to be bound by mammalian receptors, including those of the immune system. In one embodiment, an
XTEN of a GLP2-XTEN fusion protein can have >100 nM Kd binding to a mammalian receptor, or
greater than 500 nM Kd, or greater than 1 µM Kd towards a mammalian cell surface or circulating
polypeptide receptor.
Additionally, the non-repetitive sequence and corresponding lack of epitopes of XTEN limit the
ability of B cells to bind to or be activated by XTEN. A repetitive sequence is recognized and can form
multivalent contacts with even a few B cells and, as a consequence of the cross-linking of multiple T-cell
independent receptors, can stimulate B cell proliferation and antibody production. In contrast, while an
XTEN can make contacts with many different B cells over its extended sequence, each individual B cell
may only make one or a small number of contacts with an individual XTEN due to the lack of
repetitiveness of the sequence. Not wishing to be bound by any theory, XTENs typically have a much
lower tendency to stimulate proliferation of B cells and thus an immune response. In one embodiment,
the GLP2-XTEN have reduced immunogenicity as compared to the corresponding GLP-2 that is not
fused to an XTEN. In one embodiment, the administration of up to three parenteral doses of a GLP2-
XTEN to a mammal results in detectable anti-GLP2-XTEN IgG at a serum dilution of 1:100 but not at a
dilution of 1:1000. In another embodiment, the administration of up to three parenteral doses of a GLP2-
XTEN to a mammal results in detectable anti-GLP-2 IgG at a serum dilution of 1:1000 but not at a
dilution of 1:10,000. In another embodiment, the administration of up to three parenteral doses of a
GLP2-XTEN to a mammal results in detectable anti-XTEN IgG at a serum dilution of 1:10,000 but not at
a dilution of 1:100,000. In the foregoing embodiments, the mammal can be a mouse, a rat, a rabbit, or
a cynomolgus monkey.
An additional feature of XTENs with non-repetitive sequences relative to sequences with a high
degree of repetitiveness is that non-repetitive XTENs form weaker contacts with antibodies. Antibodies are
multivalent molecules. For instance, IgGs have two identical binding sites and IgMs contain 10 identical
binding sites. Thus antibodies against repetitive sequences can form multivalent contacts with such
repetitive sequences with high avidity, which can affect the potency and/or elimination of such repetitive
sequences. In contrast, antibodies against non-repetitive XTENs may yield monovalent interactions,
resulting in less likelihood of immune clearance such that the GLP2-XTEN compositions can remain in
circulation for an increased period of time.
6. Increased hydrodynamic radius
In another aspect, the present invention provides XTEN in which the XTEN polypeptides have
a high hydrodynamic radius that confers a corresponding increased apparent molecular weight to the
GLP2-XTEN fusion protein incorporating the XTEN. As detailed in Example 25, the linking of XTEN
to therapeutic protein sequences results in GLP2-XTEN compositions that can have increased
hydrodynamic radii, increased apparent molecular weight, and increased apparent molecular weight
factor compared to a therapeutic protein not linked to an XTEN. For example, in therapeutic
applications in which prolonged half-life is desired, compositions in which an XTEN with a high
hydrodynamic radius is incorporated into a fusion protein comprising a therapeutic protein can
effectively enlarge the hydrodynamic radius of the composition beyond the glomerular pore size of
approximately 3-5 nm (corresponding to an apparent molecular weight of about 70 kDa) (Caliceti. 2003.
Pharmacokinetic and biodistribution properties of poly(ethylene glycol)-protein conjugates. Adv Drug
Deliv Rev 55:1261-1277), resulting in reduced renal clearance of circulating proteins with a
corresponding increase in terminal half-life and other enhanced pharmacokinetic properties. The
hydrodynamic radius of a protein is determined by its molecular weight as well as by its structure,
including shape or compactness. Not to be bound by a particular theory, the XTEN can adopt open
conformations due to electrostatic repulsion between individual charges of the peptide or the inherent
flexibility imparted by the particular amino acids in the sequence that lack potential to confer secondary
structure. The open, extended and unstructured conformation of the XTEN polypeptide can have a
greater proportional hydrodynamic radius compared to polypeptides of a comparable sequence length
and/or molecular weight that have secondary and/or tertiary structure, such as typical globular proteins.
Methods for determining the hydrodynamic radius are well known in the art, such as by the use of size
exclusion chromatography (SEC), as described in U.S. Patent Nos. 6,406,632 and 7,294,513. As the
results of Example 25 demonstrate, the addition of increasing lengths of XTEN results in proportional
increases in the parameters of hydrodynamic radius, apparent molecular weight, and apparent molecular
weight factor, permitting the tailoring of GLP2-XTEN to desired characteristic cut-off apparent
molecular weights or hydrodynamic radii. Accordingly, in certain embodiments, the GLP2-XTEN fusion
protein can be configured with an XTEN such that the fusion protein can have a hydrodynamic radius of
at least about 5 nm, or at least about 8 nm, or at least about 10 nm, or 12 nm, or at least about 15 nm. In
the foregoing embodiments, the large hydrodynamic radius conferred by the XTEN in a GLP2-XTEN
fusion protein can lead to reduced renal clearance of the resulting fusion protein, leading to a
corresponding increase in terminal half-life, an increase in mean residence time, and/or a decrease in
renal clearance rate.
When the molecular weights of the GLP2-XTEN fusion proteins are derived from size
exclusion chromatography analyses, the open conformation of the XTEN due to the low degree of
secondary structure results in an increase in the apparent molecular weight of the fusion proteins. In
some embodiments the GLP2-XTEN comprising a GLP-2 and at least a first or multiple XTEN exhibits
an apparent molecular weight of at least about 200 kDa, or at least about 400 kDa, or at least about 500
kDa, or at least about 700 kDa, or at least about 1000 kDa, or at least about 1400 kDa. Accordingly, the
GLP2-XTEN fusion proteins comprising one or more XTEN exhibit an apparent molecular weight that
is about 2-fold greater, or about 3-fold greater or about 4-fold greater, or about 8-fold greater, or about
10-fold greater, or about 12-fold greater, or about 15-fold greater, or about 20-fold greater than the actual
molecular weight of the fusion protein. In one embodiment, the isolated GLP2-XTEN fusion protein of
any of the embodiments disclosed herein exhibits an apparent molecular weight factor under physiologic
conditions that is greater than about 2, or about 3, or about 4, or about 5, or about 6, or about 7, or about
8, or about 10, or about 15, or greater than about 20. In another embodiment, the GLP2-XTEN fusion
protein has, under physiologic conditions, an apparent molecular weight factor that is about 3 to about
20, or is about 5 to about 15, or is about 8 to about 14, or is about 10 to about 12 relative to the actual
molecular weight of the fusion protein.
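The apparent molecular weight factor discussed above can be read as the ratio of the SEC-derived apparent molecular weight to the actual molecular weight. The following is a minimal sketch under that reading; the function name and the numeric inputs are hypothetical and are not taken from the Examples.

```python
def apparent_mw_factor(apparent_kda: float, actual_kda: float) -> float:
    """Ratio of apparent (e.g., SEC-derived) to actual molecular weight."""
    if actual_kda <= 0:
        raise ValueError("actual molecular weight must be positive")
    return apparent_kda / actual_kda


# Hypothetical illustration values, not measured data.
factor = apparent_mw_factor(apparent_kda=700.0, actual_kda=83.0)
print(f"apparent molecular weight factor: {factor:.1f}")
```

A factor computed this way can be compared directly against the ranges recited above (for example, about 8 to about 14).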
IV). GLP2-XTEN COMPOSITIONS
The present invention relates in part to fusion protein compositions comprising GLP-2 linked to
one or more XTEN, wherein the fusion protein would act to replace or augment existing GLP-2 when
administered to a subject. The invention addresses a long-felt need in increasing the terminal half-life of
exogenously administered GLP-2 to a subject in need thereof. One way to increase the circulation half-
life of a therapeutic protein is to ensure that renal clearance of the protein is reduced. Another way to
increase the circulation half-life is to reduce the active clearance of the therapeutic protein, whether
mediated by receptors, active metabolism of the protein, or other endogenous mechanisms. Both may be
achieved by conjugating the protein to a polymer, which, in some cases, is capable of conferring an
increased molecular size (or hydrodynamic radius) to the protein and, hence, reduced renal clearance,
and, in other cases, interferes with binding of the protein to clearance receptors or other proteins that
contribute to metabolism or clearance. Thus, certain objects of the present invention include, but are not
limited to, providing improved GLP-2 molecules with a longer circulation or terminal half-life,
decreasing the number or frequency of necessary administrations of GLP-2 compositions, retaining at
least a portion of the biological activity of the native GLP-2, and enhancing the ability to treat GLP-2-
related diseases or gastrointestinal conditions with resulting improvement in clinical symptoms and
overall well-being more efficiently, more effectively, more economically, and with greater safety
compared to presently available GLP-2 preparations.
To meet these needs, in a first aspect, the invention provides isolated fusion protein
compositions comprising a biologically active GLP-2 covalently linked to one or more XTEN, resulting
in a GLP2-XTEN fusion protein composition. The subject GLP2-XTEN can mediate one or more
biological or therapeutic activities of a wild-type GLP-2. GLP2-XTEN can be produced recombinantly
or by chemical conjugation of a GLP-2 to an XTEN. In one embodiment, the GLP-2 is native GLP-2.
In another embodiment, the GLP-2 is a sequence variant of a natural sequence that retains at least a
portion of the biological activity of the native GLP-2. In one embodiment, the GLP-2 is a sequence
having at least 90%, or at least about 91%, or at least about 92%, or at least about 93%, or at least about
94%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least
about 99%, or 100% sequence identity to a sequence selected from the group consisting of the sequences
in Table 1, when optimally aligned. In another embodiment, the GLP-2 is a sequence variant with glycine
substituted for alanine at residue number 2 of the mature GLP-2 peptide. In one embodiment, the GLP2-
XTEN comprises a GLP-2 having the sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD. In
one embodiment, the invention provides GLP2-XTEN fusion proteins comprising GLP-2 N- and/or C-
terminally modified forms comprising one or more XTEN.
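The sequence-identity thresholds and the residue-2 substitution described above can be illustrated with a short sketch. It assumes the two sequences are already optimally aligned and of equal length (no gaps); the helper names are hypothetical. The alanine-2 form is derived here simply by reversing the glycine-for-alanine substitution stated in the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def substitute(seq: str, position: int, residue: str) -> str:
    """Return seq with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    return seq[: position - 1] + residue + seq[position:]


# Glycine-2 variant sequence as recited in the text.
GLP2_GLY2 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"

# Reverse the described substitution to obtain the alanine-2 form.
glp2_ala2 = substitute(GLP2_GLY2, 2, "A")

identity = percent_identity(glp2_ala2, GLP2_GLY2)
print(f"{identity:.2f}% identity over {len(GLP2_GLY2)} residues")
```

With a single substitution in 33 residues, the variant satisfies the "at least about 96%" threshold but not the "at least about 97%" threshold.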
The GLP-2 of the subject compositions, particularly those disclosed in Table 1, together with
their corresponding nucleic acid and amino acid sequences, are well known in the art and descriptions
and sequences are available in public databases such as Chemical Abstracts Services Databases (e.g., the
CAS Registry), GenBank, The Universal Protein Resource (UniProt) and subscription provided databases
such as GenSeq (e.g., Derwent). Polynucleotide sequences may be a wild type polynucleotide sequence
encoding a given GLP-2 (e.g., either full length or mature), or in some instances the sequence may be a
variant of the wild type polynucleotide sequence (e.g., a polynucleotide which encodes the wild type
biologically active protein, wherein the DNA sequence of the polynucleotide has been optimized, for
example, for expression in a particular species; or a polynucleotide encoding a variant of the wild type
protein, such as a site directed mutant or an allelic variant). It is well within the ability of the skilled
artisan to use a wild-type or consensus cDNA sequence or a codon-optimized sequence variant of a GLP-
2 to create GLP2-XTEN constructs contemplated by the invention using methods known in the art and/or
in conjunction with the guidance and methods provided herein and described more fully in the Examples.
In some embodiments, the GLP2-XTEN fusion proteins retain at least a portion of the
biological activity of native GLP-2. A GLP2-XTEN fusion protein of the invention is capable of binding
and activating a GLP-2 receptor. In one embodiment, the GLP2-XTEN fusion protein of the present
invention has an EC50 value, when assessed using an in vitro GLP-2 receptor binding assay such as
described herein or others known in the art, of less than about 30 nM, or about 100 nM, or about 200 nM,
or about 300 nM, or about 400 nM, or about 500 nM, or about 600 nM, or about 700 nM, or about 800
nM, or about 1000 nM, or about 1200 nM, or about 1400 nM. In another embodiment, the GLP2-XTEN
fusion protein of the present invention retains at least about 1%, or about 2%, or about 3%, or about 4%,
or about 5%, or about 10%, or about 20%, or about 30% of the potency of the corresponding GLP-2 not
linked to XTEN when assayed using an in vitro GLP2R cell assay such as described in the Examples or
others known in the art.
In some embodiments, GLP2-XTEN fusion proteins of the disclosure have intestinotrophic,
wound healing and anti-inflammatory activity. In some embodiments, the GLP2-XTEN fusion protein
compositions exhibit an improvement in one, two, three or more gastrointestinal-related parameters
disclosed herein that are at least about 20%, or 30%, or 40%, or 50%, or 60%, or 70%, or 80%, or 90%,
or 100%, or 120%, or 140%, or at least about 150% greater compared to the parameter(s) achieved by the
corresponding GLP-2 component not linked to the XTEN when administered to a subject. The parameter
can be a measured parameter selected from blood concentrations of GLP-2, increased mesenteric blood
flow, decreased inflammation, increased weight gain, decreased diarrhea, decreased fecal wet weight,
intestinal wound healing, increase in plasma citrulline concentrations, decreased CRP levels, decreased
requirement for steroid therapy, enhancing or stimulating mucosal integrity, decreased sodium loss,
decreased parenteral nutrition required to maintain body weight, minimizing, mitigating, or preventing
bacterial translocation in the intestines, enhancing, stimulating or accelerating recovery of the intestines
after surgery, preventing relapses of inflammatory bowel disease, or achieving or maintaining energy
homeostasis, among others. In one embodiment, administration of the GLP2-XTEN fusion protein to a
subject results in a greater ability to increase small intestine weight and/or length when administered to a
subject with a surgically-resected intestine (e.g., short-bowel syndrome) or Crohn’s Disease, compared to
the corresponding GLP-2 not linked to XTEN and administered at a comparable dose in nmol/kg and
dose regimen. In another embodiment, a GLP2-XTEN fusion protein exhibits at least about 10%, or
20%, or 30%, or 40%, or 50%, or 60%, or 70%, or 80%, or at least about 90% greater ability to reduce
ulceration when administered to a subject with Crohn’s Disease (either naturally acquired or
experimentally induced) compared to the corresponding GLP-2 component not linked to the XTEN and
administered at a comparable nmol/kg dose and dose regimen. In another embodiment, the fusion protein
exhibits the ability to reduce inflammatory cytokines when administered to a subject with Crohn’s
Disease (either naturally acquired or experimentally induced) by at least about 20%, or 30%, or 40%, or
50%, or 60%, or 70%, or 80%, or at least about 90% compared to the corresponding GLP-2 component not
linked to the XTEN and administered at a comparable nmol/kg dose and dose regimen. In another
embodiment, a GLP2-XTEN fusion protein exhibits at least about 10%, or 20%, or 30%, or 40%, or
50%, or 60%, or 70%, or 80%, or at least about 90% greater ability to reduce mucosal atrophy when
administered to a subject with Crohn’s Disease (either naturally acquired or experimentally induced; e.g.,
administration of indomethacin) compared to the corresponding GLP-2 component not linked to the
XTEN and administered at a comparable nmol/kg dose and dose regimen. In another embodiment, a
GLP2-XTEN fusion protein exhibits at least about 5%, or at least about 6%, or 7%, or 8%, or 9%, or
10%, or 11%, or 12%, or 15%, or at least about 20% greater ability to increase height of intestinal villi
when administered to a subject with Crohn’s Disease (either naturally acquired or experimentally
induced; e.g., administration of indomethacin) compared to the corresponding GLP-2 component not
linked to the XTEN and administered at a comparable nmol/kg dose and dose regimen. In another
embodiment, a GLP2-XTEN fusion protein exhibits at least about 10%, or 20%, or 30%, or 40%, or
50%, or 60%, or 70%, or 80%, or at least about 90% greater ability to increase body weight when
administered to a subject with Crohn’s Disease (either naturally acquired or experimentally induced; e.g.,
administration of indomethacin) compared to the corresponding GLP-2 component not linked to the
XTEN and administered at a comparable nmol/kg dose and dose regimen. In the foregoing embodiments
of the paragraph, the subject is selected from the group consisting of mouse, rat, monkey and human.
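The comparisons above are made at comparable molar (nmol/kg) doses. Because a GLP2-XTEN fusion protein is much heavier than GLP-2 alone, equal molar doses correspond to very different mass doses. The following sketch shows the unit conversion; the molecular weights used are hypothetical round numbers chosen only for illustration.

```python
def mg_per_kg_to_nmol_per_kg(dose_mg_per_kg: float, mw_da: float) -> float:
    """Convert a mass dose to a molar dose: nmol = mg * 1e6 / (g/mol)."""
    return dose_mg_per_kg * 1e6 / mw_da


def nmol_per_kg_to_mg_per_kg(dose_nmol_per_kg: float, mw_da: float) -> float:
    """Convert a molar dose back to a mass dose."""
    return dose_nmol_per_kg * mw_da / 1e6


# Hypothetical molecular weights (Da), for illustration only.
PEPTIDE_MW = 4_000.0
FUSION_MW = 80_000.0

molar_dose = mg_per_kg_to_nmol_per_kg(0.1, PEPTIDE_MW)          # peptide dosed at 0.1 mg/kg
fusion_mass_dose = nmol_per_kg_to_mg_per_kg(molar_dose, FUSION_MW)
print(f"{molar_dose:.1f} nmol/kg corresponds to {fusion_mass_dose:.2f} mg/kg of fusion protein")
```

Under these assumed weights, matching the molar dose requires a 20-fold larger mass dose of the fusion protein.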
The compositions of the invention include fusion proteins that are useful, when administered to
a subject, for mediating or preventing or ameliorating a gastrointestinal condition associated with GLP-2
such as, but not limited to, ulcers, gastritis, digestion disorders, malabsorption syndrome, short-gut
syndrome, short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease,
tropical sprue, hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis,
chemotherapy-induced enteritis, irritable bowel syndrome, small intestine damage, small intestinal
damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency,
acid-induced intestinal injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness,
febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia,
gastrointestinal barrier disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased
intestinal motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel
ischemia, mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal
feeding intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral
nutrition damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis,
radiation-induced injury to the intestines, mucositis associated with cancer chemotherapy and irritable
bowel disease, pouchitis, ischemia, and stroke.
Of particular interest are GLP2-XTEN fusion protein compositions for which an increase in a
pharmacokinetic parameter, increased solubility, increased stability, or some other enhanced
pharmaceutical property compared to native GLP-2 is obtained, providing compositions with enhanced
efficacy, safety, or that result in reduced dosing frequency and/or improve patient management. The
GLP2-XTEN fusion proteins of the embodiments disclosed herein exhibit one or more or any
combination of the improved properties and/or the embodiments as detailed herein. Thus, the subject
GLP2-XTEN fusion protein compositions are designed and prepared with various objectives in mind,
including improving the therapeutic efficacy of the bioactive GLP-2 by, for example, increasing the in
vivo exposure or the length that the GLP2-XTEN remains within the therapeutic window when
administered to a subject, compared to a GLP-2 not linked to XTEN.
In one embodiment, a GLP2-XTEN fusion protein comprises a single GLP-2 molecule linked
to a single XTEN (e.g., an XTEN as described above). In another embodiment, the GLP2-XTEN
comprises a single GLP-2 linked to two XTEN, wherein the XTEN may be identical or they may be
different. In another embodiment, the GLP2-XTEN fusion protein comprises a single GLP-2 molecule
linked to a first and a second XTEN, in which the GLP-2 is a sequence that has at least about 80%
sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, or at least about 99%, or 100% sequence identity compared to a
protein sequence selected from Table 1, and the first and the second XTEN are each sequences that have
at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least about 99%, or 100% sequence identity
compared to one or more sequences selected from Table 4, or fragments thereof. In another embodiment,
the GLP2-XTEN fusion protein comprises a sequence with at least about 80% sequence identity, or
alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, or at least about 99%, or 100% sequence identity to a sequence from Table 33 and 34.
1. GLP2-XTEN Fusion Protein Configurations
The invention provides GLP2-XTEN fusion protein compositions with the GLP-2 and XTEN
components linked in specific N- to C-terminus configurations.
In one embodiment of the GLP2-XTEN composition, the invention provides a fusion protein of
formula I:
(GLP-2)-(XTEN) I
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or variant as defined herein,
including sequences having at least about 80%, or at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity
with sequences from Table 1, and XTEN is an extended recombinant polypeptide as described herein,
including, but not limited to sequences having at least about 80%, or at least about 90%, or at least about
95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100%
sequence identity to sequences set forth in Table 4.
In another embodiment of the GLP2-XTEN composition, the invention provides a fusion
protein of formula II:
(XTEN)-(GLP-2) II
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or variant as defined
herein, including sequences having at least about 80%, or at least about 90%, or at least about 95%, or at
least about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence
identity with sequences from Table 1, and XTEN is an extended recombinant polypeptide as described
herein, including, but not limited to sequences having at least about 80%, or at least about 90%, or at
least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99%
or 100% sequence identity to sequences set forth in Table 4.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula III:
(XTEN)-(GLP-2)-(XTEN) III
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or variant as defined herein,
including sequences having at least about 80%, or at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity
with sequences from Table 1, and XTEN is an extended recombinant polypeptide as described herein,
including, but not limited to sequences having at least about 80%, or at least about 90%, or at least about
95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100%
sequence identity to sequences set forth in Table 4.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula IV:
(GLP-2)-(XTEN)-(GLP-2) IV
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or variant as defined herein,
including sequences having at least about 80%, or at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity
with sequences from Table 1, and XTEN is an extended recombinant polypeptide as described herein,
including, but not limited to sequences having at least about 80%, or at least about 90%, or at least about
95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100%
sequence identity to sequences set forth in Table 4.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula V:
(GLP-2)-(S)x-(XTEN) V
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or variant as defined herein,
including sequences having at least about 80%, or at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity
with sequences from Table 1; S is a spacer sequence having between 1 to about 50 amino acid residues
that can optionally include a cleavage sequence or amino acids compatible with restriction sites; x is
either 0 or 1; and XTEN is an extended recombinant polypeptide as described herein, including, but not
limited to sequences having at least about 80%, or at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity
to sequences set forth in Table 4.
In another embodiment of the GLP2-XTEN composition, the invention provides an isolated
fusion protein, wherein the fusion protein is of formula VI:
(XTEN)x-(S)x-(GLP-2)-(S)y-(XTEN)y VI
wherein independently for each occurrence, GLP-2 is a GLP-2 protein or variant as defined herein,
including sequences having at least about 80%, or at least about 90%, or at least about 95%, or at least
about 96%, or at least about 97%, or at least about 98%, or at least about 99% or 100% sequence identity
with sequences from Table 1; S is a spacer sequence having between 1 to about 50 amino acid residues
that can optionally include a cleavage sequence or amino acids compatible with restriction sites; x is
either 0 or 1 and y is either 0 or 1 wherein x+y ≥ 1; and XTEN is an extended recombinant polypeptide as
described herein, including, but not limited to sequences having at least about 80%, or at least about 90%,
or at least about 95%, or at least about 96%, or at least about 97%, or at least about 98%, or at least about
99% or 100% sequence identity to sequences set forth in Table 4.
The embodiments of formulae I-VI encompass GLP2-XTEN configurations wherein one or
more XTEN of lengths ranging from about 36 amino acids to 3000 amino acids (e.g., sequences selected
from Table 4 or fragments thereof, or sequences exhibiting at least about 90-95% or more sequence
identity thereto) are linked to the N- or C-terminus of the GLP-2. The embodiments of formula V further
provide configurations wherein the XTEN are linked to GLP-2 via spacer sequences that can optionally
comprise amino acids compatible with restriction sites or can include cleavage sequences (e.g., the
sequences of Tables 5 and 6, described more fully below) such that the XTEN encoding sequence can, in
the case of a restriction site, be integrated into a GLP2-XTEN construct and, in the case of a cleavage
sequence, the XTEN can be released from the fusion protein by the action of a protease appropriate for
the cleavage sequence. In one embodiment of formula V, the fusion protein comprises a spacer sequence
that is a single glycine residue.
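The N- to C-terminus arrangement of formula VI, with x and y each 0 or 1 and at least one XTEN present, can be sketched as simple string assembly. This is an illustrative sketch only: the function name is hypothetical, and XTEN_PLACEHOLDER is an invented 36-residue stand-in that is not a sequence from Table 4.

```python
# GLP-2 variant sequence as recited in the text.
GLP2 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"

# Hypothetical stand-in for an XTEN; NOT a sequence from Table 4.
XTEN_PLACEHOLDER = "GSEPATSGSETP" * 3


def formula_vi(glp2: str, xten: str, spacer: str, x: int, y: int) -> str:
    """Assemble (XTEN)x-(S)x-(GLP-2)-(S)y-(XTEN)y, written N- to C-terminus."""
    if x not in (0, 1) or y not in (0, 1):
        raise ValueError("x and y must each be 0 or 1")
    if x + y < 1:
        raise ValueError("formula VI requires x + y >= 1")
    n_side = (xten + spacer) * x   # N-terminal XTEN followed by its spacer
    c_side = (spacer + xten) * y   # C-terminal spacer followed by its XTEN
    return n_side + glp2 + c_side


# C-terminal XTEN joined through a single glycine spacer (x = 0, y = 1).
c_terminal = formula_vi(GLP2, XTEN_PLACEHOLDER, "G", x=0, y=1)
# XTEN on both termini (x = 1, y = 1).
both = formula_vi(GLP2, XTEN_PLACEHOLDER, "G", x=1, y=1)
print(len(c_terminal), len(both))
```

With x = 0 and y = 1 the result has the same arrangement as formula V with a spacer present.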
2. GLP2-XTEN Fusion Protein Configurations with Spacer and Cleavage Sequences
In another aspect, the invention provides GLP2-XTEN configured with one or more spacer
sequences incorporated into or adjacent to the XTEN that are designed to incorporate or enhance a
functionality or property to the composition, or as an aid in the assembly or manufacture of the fusion
protein compositions. Such properties include, but are not limited to, inclusion of cleavage sequence(s),
such as TEV or other cleavage sequences of Table 6, to permit release of components, inclusion of amino
acids compatible with nucleotide restriction sites to permit linkage of XTEN-encoding nucleotides to
GLP-2-encoding nucleotides or that facilitate construction of expression vectors, and linkers designed to
reduce steric hindrance in regions of GLP2-XTEN fusion proteins.
In an embodiment, a spacer sequence can be introduced between an XTEN sequence and a
GLP-2 component to decrease steric hindrance such that the GLP-2 component may assume its desired
tertiary structure and/or interact appropriately with its target receptor. For spacers and methods of
identifying desirable spacers, see, for example, George, et al. (2003) Protein Engineering 15:871-879,
specifically incorporated by reference herein. In one embodiment, the spacer comprises one or more
peptide sequences that are between 1-50 amino acid residues in length, or about 1-25 residues, or about
1-10 residues in length. Spacer sequences, exclusive of cleavage sites, can comprise any of the 20
natural L amino acids, and will preferably have XTEN-like properties in that 1) they will comprise
hydrophilic amino acids that are sterically unhindered such as, but not limited to, glycine (G), alanine
(A), serine (S), threonine (T), glutamate (E), proline (P) and aspartate (D); and 2) will be substantially
non-repetitive. In addition, spacer sequences are designed to avoid the introduction of T-cell epitopes;
determination of which are described above and in the Examples. In some cases, the spacer can be
polyglycines or polyalanines, or is predominately a mixture of combinations of glycine, serine and
alanine residues. In one embodiment, a spacer sequence, exclusive of cleavage site amino acids, has
about 1 to 10 amino acids that consist of amino acids selected from glycine (G), alanine (A), serine (S),
threonine (T), glutamate (E), and proline (P) and are substantially devoid of secondary structure; e.g., less
than about 10%, or less than about 5% as determined by the Chou-Fasman and/or GOR algorithms. In
one embodiment, the spacer sequence is GPEGPS. In another embodiment, the spacer sequence is a
single glycine residue. In another embodiment, the spacer sequence is GPEGPS linked to a cleavage
sequence of Table 6.
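The composition and length criteria recited for one spacer embodiment (about 1 to 10 residues drawn from G, A, S, T, E and P) can be checked mechanically. The sketch below covers only those two criteria; it does not assess secondary structure (Chou-Fasman or GOR), repetitiveness, or T-cell epitopes, and the function name is hypothetical.

```python
# Residues permitted in the spacer embodiment described in the text.
ALLOWED_SPACER_RESIDUES = frozenset("GASTEP")


def meets_spacer_composition(spacer: str, max_len: int = 10) -> bool:
    """True if the spacer is 1..max_len residues long and uses only G, A, S, T, E, P."""
    return 1 <= len(spacer) <= max_len and set(spacer) <= ALLOWED_SPACER_RESIDUES


for candidate in ("GPEGPS", "G", "GKG"):
    print(candidate, meets_spacer_composition(candidate))
```

Both spacers named in the text, GPEGPS and a single glycine, pass this check; a lysine-containing candidate does not.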
In a particular embodiment, the GLP2-XTEN fusion protein comprises one or more spacer
sequences linked at the junction(s) between the payload GLP-2 sequence and the one or more XTEN
incorporated into the fusion protein, wherein the spacer sequences comprise amino acids that are
compatible with nucleotides encoding restriction sites. In another embodiment, the GLP2-XTEN fusion
protein comprises one or more spacer sequences linked at the junction(s) between the payload GLP-2
sequence and a signal sequence incorporated into the fusion protein, wherein the spacer sequences
comprise a cleavage sequence (e.g., TEV) to release the GLP2-XTEN after expression. In another
embodiment, the GLP2-XTEN fusion protein comprises one or more spacer sequences linked at the
junction(s) between the payload GLP-2 sequence and the one or more XTEN incorporated into the fusion
protein wherein the spacer sequences comprise amino acids that are compatible with nucleotides
encoding restriction sites and the amino acids of the one or more spacer sequences are chosen
from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), and proline (P). In another
embodiment, the GLP2-XTEN fusion protein comprises one or more spacer sequences linked at the
junction(s) between the payload GLP-2 sequence and the one or more XTEN incorporated into the fusion
protein wherein the spacer sequences comprise amino acids that are compatible with nucleotides
encoding restriction sites and the one or more spacer sequences are chosen from the sequences of Table 5.
The exact sequence of each spacer sequence is chosen to be compatible with cloning sites in expression
vectors that are used for a particular GLP2-XTEN construct. For embodiments in which a single XTEN
is attached to the N- or C-terminus, only a single spacer sequence at the junction of the two components
would be required. As would be apparent to one of ordinary skill in the art, the spacer sequences
comprising amino acids compatible with restriction sites could be omitted from the construct when an
entire GLP2-XTEN gene is synthetically generated, rather than ligated using GLP-2 and XTEN encoding
genes.
Table 5: Spacer Sequences Compatible with Restriction Sites
Spacer Sequence Restriction Enzyme
GSPG BsaI
ETET BsaI
PGSSS BbsI
GAP AscI
GPA FseI
GPSGP SfiI
AAA SacII
TG AgeI
GT KpnI
GAGSPGAETA SfiI
ASS XhoI
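Table 5 pairs each spacer sequence with a restriction enzyme, and two enzymes (BsaI and SfiI) appear with more than one spacer. As a minimal sketch, the table can be held as a lookup keyed by spacer sequence, with a hypothetical helper that lists the spacers for a given enzyme.

```python
# Table 5 as a mapping from spacer sequence to restriction enzyme.
SPACER_TO_ENZYME = {
    "GSPG": "BsaI",
    "ETET": "BsaI",
    "PGSSS": "BbsI",
    "GAP": "AscI",
    "GPA": "FseI",
    "GPSGP": "SfiI",
    "AAA": "SacII",
    "TG": "AgeI",
    "GT": "KpnI",
    "GAGSPGAETA": "SfiI",
    "ASS": "XhoI",
}


def spacers_for_enzyme(enzyme: str) -> list[str]:
    """Return the Table 5 spacer sequences listed for an enzyme, sorted alphabetically."""
    return sorted(s for s, e in SPACER_TO_ENZYME.items() if e == enzyme)


print(spacers_for_enzyme("SfiI"))
print(spacers_for_enzyme("BsaI"))
```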
In another aspect, the present invention provides GLP2-XTEN configurations with cleavage
sequences incorporated into the spacer sequences. In some embodiments, a spacer sequence in a GLP2-
XTEN fusion protein composition comprises one or more cleavage sequences, which are identical or
different, wherein the cleavage sequence may be acted on by a protease to release the XTEN sequence(s)
from the fusion protein. In one embodiment, the incorporation of the cleavage sequence into the GLP2-
XTEN is designed to permit release of a GLP-2 that becomes active or more active upon its release from
the XTEN component. The cleavage sequences are located sufficiently close to the GLP-2 sequences,
generally within 18, or within 12, or within 6, or within 2 amino acids of the GLP-2 sequence, such that
any remaining residues attached to the GLP-2 after cleavage do not appreciably interfere with the
activity (e.g., such as binding to a GLP-2 receptor) of the GLP-2, yet provide sufficient access to the
protease to be able to effect cleavage of the cleavage sequence. In some cases, the GLP2-XTEN
comprising the cleavage sequences will also have one or more spacer sequence amino acids between the
GLP-2 and the cleavage sequence or the XTEN and the cleavage sequence to facilitate access of the
protease to the cleavage sequence; the spacer amino acids comprising any natural amino acid, including
glycine, serine and alanine as preferred amino acids. In one embodiment, the cleavage site is a sequence
that can be cleaved by a protease endogenous to the mammalian subject such that the GLP2-XTEN can
be cleaved after administration to a subject. In such case, the GLP2-XTEN can serve as a prodrug or a
circulating depot for the GLP-2. In a particular construct of the foregoing, the GLP2-XTEN would have
one or two XTEN linked to the N- and/or the C-terminus such that the XTEN could be released, leaving
the active form of GLP-2 free. In one embodiment of the foregoing construct, the GLP-2 that is released
from the fusion protein by cleavage of the cleavage sequence exhibits at least about a two-fold, or at least
about a three-fold, or at least about a four-fold, or at least about a five-fold, or at least about a six-fold, or
at least about an eight-fold, or at least about a 10-fold, or at least about a 20-fold increase in biological
activity compared to the intact GLP2-XTEN fusion protein.
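The release of GLP-2 by cleavage at a site placed close to the payload can be illustrated with the TEV recognition sequence ENLYFQ↓G, which is cut between Q and G. The sketch below assumes a hypothetical construct with the cleavage sequence directly C-terminal to GLP-2; XTEN_PLACEHOLDER is an invented stand-in rather than a Table 4 sequence, and the function name is illustrative.

```python
# GLP-2 variant sequence as recited in the text.
GLP2 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"
# Hypothetical stand-in for an XTEN; NOT a sequence from Table 4.
XTEN_PLACEHOLDER = "GSEPATSGSETP" * 3

TEV_SITE = "ENLYFQG"   # TEV cleaves between Q and G
TEV_CUT_OFFSET = 6     # residues of the site that lie N-terminal to the cut


def cleave_once(fusion: str, site: str, cut_offset: int) -> list[str]:
    """Split the fusion at the first occurrence of the site; return it whole if absent."""
    idx = fusion.find(site)
    if idx < 0:
        return [fusion]
    cut = idx + cut_offset
    return [fusion[:cut], fusion[cut:]]


fusion = GLP2 + TEV_SITE + XTEN_PLACEHOLDER
fragments = cleave_once(fusion, TEV_SITE, TEV_CUT_OFFSET)

# Residues of the cleavage sequence left on the released GLP-2 fragment.
residual = len(fragments[0]) - len(GLP2)
print(fragments[0], residual)
```

In this arrangement the released payload carries six residual residues, which falls within the "within 6" amino acid proximity recited above.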
Examples of cleavage sites contemplated by the invention include, but are not limited to, a
polypeptide sequence cleavable by a mammalian endogenous protease selected from FXIa, FXIIa,
kallikrein, FVIIIa, FVIIIa, FXa, FIIa (thrombin), Elastase-2, granzyme B, MMP-12, MMP-13, MMP-17
or MMP-20, or by non-mammalian proteases such as TEV, enterokinase, PreScission™ protease
(rhinovirus 3C protease), and sortase A. Sequences known to be cleaved by the foregoing proteases and
others are known in the art. Exemplary cleavage sequences contemplated by the invention and the
respective cut sites within the sequences are presented in Table 6, as well as sequence variants thereof.
Thus, cleavage sequences, particularly those of Table 6 that are susceptible to the endogenous proteases
present during inflammation would provide for release of GLP-2 that, in certain embodiments of the
GLP2-XTEN, provide a higher degree of activity for the GLP-2 component released from the intact form
of the GLP2-XTEN, as well as additional safety margin for high doses of GLP2-XTEN administered to a
subject. For example, it has been demonstrated that many of the metalloproteinases are elevated in
Crohn’s Disease and inflamed intestines (D Schuppan and T Freitag. Fistulising Crohn’s disease: MMPs
gone awry. Gut (2004) 53(5): 622-624). In one embodiment, the invention provides GLP2-XTEN
comprising one or more cleavage sequences operably positioned to release the GLP-2 from the fusion
protein upon cleavage, wherein the one or more cleavage sequences has at least about 86%, or at least
about 92% or greater sequence identity to a sequence selected from Table 6. In another embodiment, the
GLP2-XTEN comprising a cleavage sequence would have at least about 80%, or at least about 85%, or at
least about 90%, or at least about 95%, or at least about 96%, or at least about 97%, or at least about
98%, or at least about 99% sequence identity compared to a sequence selected from Table 34.
In some embodiments, only the two or three amino acids flanking both sides of the cut site
(four to six amino acids total) are incorporated into the cleavage sequence that, in turn, is incorporated
into the GLP2-XTEN of the embodiments. In other embodiments, the incorporated cleavage sequence of
Table 6 can have one or more deletions or insertions or one or two or three amino acid substitutions for
any one or two or three amino acids in the known sequence, wherein the deletions, insertions or
substitutions result in reduced or enhanced susceptibility but not an absence of susceptibility to the
protease, resulting in an ability to tailor the rate of release of the GLP-2 from the XTEN. Exemplary
substitutions are shown in Table 6.
Table 6: Protease Cleavage Sequences
Protease Acting Upon Sequence    Exemplary Cleavage Sequence    Minimal Cut Site*
KMUTAWAVE/GT/Gv
G KD/FL/T/RWA/VE/GT/GV
TMTAMVGG NA
Kallikrein SPFR↓STGG -/-/FL/RY↓SR/RT/-/-
FIXa R¢-/—/—/—
FXa IA/E/GFP/RJSTI/VFSHG
RtSAG/-/-/-
AAA-www-
TEV ENLYFQ↓G ENLYFQ↓G/S
Enterokinase DDDK↓IVGG DDDK↓IVGG
PreScission™ (rhinovirus 3C protease) LEVLFQ↓GP LEVLFQ↓GP
Sortase A LPKT↓GSES L/P/KEAD/T↓G/-/EKS/S
↓ indicates cleavage site; NA: not applicable
* the listing of multiple amino acids before, between, or after a slash indicates alternative amino
acids that can be substituted at the position; "-" indicates that any amino acid may be
substituted for the corresponding amino acid indicated in the middle column
3. Exemplary GLP2-XTEN Fusion Protein Sequences
Non-limiting examples of sequences of fusion proteins containing a single GLP-2 linked to one
or two XTEN, either joined at the N- or C-termini are presented in Tables 13 and 32. In one
embodiment, a GLP2-XTEN composition would comprise a fusion protein having at least about 80%
sequence identity compared to a GLP2-XTEN selected from Table 13 or Table 33, alternatively at least
about 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99%, or about 100% sequence identity as compared to a GLP2-XTEN from Table 13 or Table 33.
However, the invention also contemplates substitution of any of the GLP-2 sequences of Table 1 for a
GLP-2 component of the GLP2-XTEN of Table 13 or Table 33, and/or substitution of any sequence of
Table 4 for an XTEN component of the GLP2-XTEN of Table 13 or Table 33. In preferred
embodiments, the resulting GLP2-XTEN of the foregoing examples retain at least a portion of the
biological activity of the corresponding GLP-2 not linked to the XTEN; e.g., the ability to bind and
activate a GLP-2 receptor and/or result in an intestinotrophic, proliferative, or wound-healing effect. In
the foregoing fusion proteins hereinabove described in this paragraph, the GLP2-XTEN fusion protein
can further comprise one or more cleavage sequences; e.g., a sequence from Table 6, the cleavage
sequence being located between the GLP-2 and the XTEN. In some embodiments comprising cleavage
sequence(s), the intact GLP2-XTEN composition has less biological activity but a longer half-life in its
intact form compared to a corresponding GLP-2 not linked to the XTEN, but is designed such that upon
administration to a subject, the GLP-2 component is gradually released from the fusion protein by
cleavage at the cleavage sequence(s) by endogenous proteases, whereupon the GLP-2 component
exhibits activity, i.e., the ability to effectively bind to the GLP-2 receptor. In non-limiting examples, the
GLP2-XTEN with a cleavage sequence has about 80% sequence identity compared to a sequence from
Table 34, or about 85%, or about 90%, or about 95%, or about 97%, or about 98%, or about 99%
sequence identity compared to a sequence from Table 34. However, the invention also contemplates
substitution of any of the GLP-2 sequences of Table 1 for a GLP-2 component of the GLP2-XTEN of
Table 34, substitution of any sequence of Table 4 for an XTEN component of the GLP2-XTEN of Table
34, and substitution of any cleavage sequence of Table 6 for a cleavage component of the GLP2-XTEN
of Table 34. In some cases, the GLP2-XTEN of the foregoing embodiments in this paragraph serve as
prodrugs or a circulating depot, resulting in a longer terminal half-life compared to GLP-2 not linked to
the XTEN. In such cases, a higher concentration of GLP2-XTEN can be administered to a subject to
maintain therapeutic blood levels for an extended period of time compared to the corresponding GLP-2
not linked to XTEN because a smaller proportion of the circulating composition is active.
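The percent sequence identity thresholds recited in this paragraph presuppose an alignment between two sequences. The following is a minimal sketch, under the assumption that the two sequences have already been aligned (gaps written as "-") and that identity is counted over alignment columns; other denominators are in use, so the convention here is illustrative only. The two example sequences are the publicly known native human GLP-2 and its Gly2 analog, entered here for illustration and not copied from Table 1.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored. A column
    counts as identical only when both residues are the same non-gap
    character. The denominator is the number of remaining columns,
    which is one of several conventions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)

# Illustrative inputs (assumed, not taken from the tables of this document).
NATIVE_GLP2 = "HADGSFSDEMNTILDNLAARDFINWLIQTKITD"
GLY2_ANALOG = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"
identity = percent_identity(NATIVE_GLP2, GLY2_ANALOG)  # 32 of 33 positions match
```

Under this convention a single substitution in a 33-residue GLP-2 gives roughly 97% identity, which shows how much sequence variation a threshold such as "at least about 80%" admits.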
The GLP2-XTEN compositions of the embodiments can be evaluated for biological activity
using assays or in vivo parameters as described herein (e.g., assays of the Examples or assays of Table
32), or a pharmacodynamic effect in a preclinical model of GLP-2 deficiency or in clinical trials in
humans, using methods as described in the Examples or other methods known in the art for assessing
GLP-2 biological activity to determine the suitability of the configuration or the GLP-2 sequence
variant, and those GLP2-XTEN compositions (including after cleavage of any incorporated XTEN-
releasing cleavage sites) that retain at least about 40%, or about 50%, or about 55%, or about 60%, or
about 70%, or about 80%, or about 90%, or about 95% or more biological activity compared to native
GLP-2 sequence are considered suitable for use in the treatment of GLP-2-related conditions.
V). PROPERTIES OF THE GLP2-XTEN COMPOSITIONS OF THE INVENTION
(a) Pharmacokinetic Properties of GLP2-XTEN
It is an object of the present invention to provide GLP2-XTEN fusion proteins with enhanced
pharmacokinetics compared to GLP-2 not linked to the XTEN. The pharmacokinetic properties of a
GLP-2 that can be enhanced by linking a given XTEN to the GLP-2 include, but are not limited to,
terminal half-life, area under the curve (AUC), Cmax, volume of distribution, maintaining the biologically
active GLP2-XTEN within the therapeutic window above the minimum effective dose or blood unit
concentration for a longer period of time compared to the GLP-2 not linked to XTEN, and
bioavailability; properties that permit less frequent dosing or an enhanced pharmacologic effect,
resulting in enhanced utility in the treatment of gastrointestinal conditions.
Native GLP-2 has been reported to have a terminal half-life in humans of approximately seven
minutes (Jeppesen PB, et al., Teduglutide (ALX-0600), a dipeptidyl peptidase IV resistant glucagon-like
peptide 2 analogue, improves intestinal function in short bowel syndrome patients. Gut. (2005)
54(9):1224-1231; Hartmann B, et al. (2000) Dipeptidyl peptidase IV inhibition enhances the
intestinotrophic effect of glucagon-like peptide-2 in rats and mice. Endocrinology 141:4013-4020), while
the analog teduglutide exhibited a terminal half-life of approximately 0.9-2.3 hr in humans (Marier JF,
Population pharmacokinetics of teduglutide following repeated subcutaneous administrations in healthy
participants and in patients with short bowel syndrome and Crohn's disease. J Clin Pharmacol. (2010)
50(1):36-49). It will be understood by the skilled artisan that the pharmacokinetic properties of the
GLP2-XTEN embodiments are to be compared to comparable forms of GLP-2 not linked to the XTEN,
i.e., recombinant, native sequence or a teduglutide-like analog.
As a result of the enhanced properties conferred by XTEN, administration of a GLP2-XTEN
fusion protein composition, when used at the dose and dose regimen determined to be appropriate for
the composition by the methods described herein, can achieve a circulating
concentration resulting in a desired pharmacologic or clinical effect for an extended period of time
compared to a comparable dose of the corresponding GLP-2 not linked to the XTEN. As used herein, a
“comparable dose” means a dose with an equivalent moles/kg for the active GLP-2 pharmacophore (e.g.,
GLP-2) that is administered to a subject in a comparable fashion. It will be understood in the art that a
"comparable dosage" of GLP2-XTEN fusion protein would represent a greater weight of agent but would
have essentially the same mole-equivalents of GLP-2 in the dose of the fusion protein administered.
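The relationship between an equimolar ("comparable") dose and the greater weight of fusion protein can be illustrated with a short calculation. The following is a sketch under stated assumptions: the molecular weights are hypothetical round figures (about 3,766 Da for GLP-2 and about 80,000 Da for a fusion protein) chosen for illustration, and the function name is not from the disclosure.

```python
# Hypothetical molecular weights in daltons (g/mol); illustrative only.
ASSUMED_GLP2_MW_DA = 3_766
ASSUMED_FUSION_MW_DA = 80_000

def comparable_fusion_dose(glp2_dose_mg_per_kg,
                           glp2_mw_da=ASSUMED_GLP2_MW_DA,
                           fusion_mw_da=ASSUMED_FUSION_MW_DA):
    """Return (nmol/kg, fusion mg/kg) for a dose equimolar to the GLP-2 dose.

    mg divided by g/mol gives mmol; multiplying by 1e6 converts to nmol.
    """
    nmol_per_kg = glp2_dose_mg_per_kg / glp2_mw_da * 1e6
    fusion_mg_per_kg = nmol_per_kg * fusion_mw_da / 1e6
    return nmol_per_kg, fusion_mg_per_kg

# Example: an assumed 0.05 mg/kg dose of unlinked GLP-2.
nmol, fusion_mg = comparable_fusion_dose(0.05)
```

With these assumed weights the fusion protein dose is about 21-fold heavier than the GLP-2 dose while delivering the same number of moles of the GLP-2 pharmacophore.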
In one embodiment, the invention provides GLP2-XTEN that enhance the pharmacokinetics of
the fusion protein by linking one or more XTEN to the GLP-2 component of the fusion protein, wherein
the fusion protein has an increase in apparent molecular weight factor of at least about two-fold, or at
least about three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at
least about seven-fold, or at least about eight-fold, or at least about ten-fold, or at least about twelve-fold,
or at least about fifteen-fold, and wherein the terminal half-life of the GLP2-XTEN when administered to
a subject is increased at least about 2-fold, or at least about 3-fold, or at least about 4-fold, or at least
about 5-fold, or at least about 6-fold, or at least about 7-fold, or at least about 8-fold, or at least about 10-
fold or more compared to the corresponding GLP-2 not linked to the XTEN. In the foregoing
embodiment, wherein the fusion protein comprises at least two XTEN molecules incorporated into the
GLP2-XTEN, the XTEN can be identical or they can be of a different sequence composition (and net
charge) or length. The XTEN can have at least about 80% sequence identity, or at least about 90%, or at
least about 95%, or at least about 98%, or at least about 99% sequence identity compared to a sequence
selected from Table 4. Not to be bound by a particular theory, the XTEN of the GLP2-XTEN
compositions with the higher net charge are expected, as described above, to have less non-specific
interactions with various negatively-charged surfaces such as blood vessels, tissues, or various receptors,
which would further contribute to reduced active clearance. Conversely, the XTEN of the GLP2-XTEN
compositions with a low (or no) net charge are expected to have a higher degree of interaction with
surfaces that potentiate the biological activity of the associated GLP-2, given the known association of
inflammatory cells in the intestines during an inflammatory response. Thus, the invention provides
GLP2-XTEN in which the degree of potency, bioavailability, and half-life of the fusion protein can be
tailored by the selection and placement of the type and length of the XTEN in the GLP2-XTEN
compositions. Accordingly, the invention contemplates compositions in which a GLP-2 from Table 1
and XTEN from Table 4 are combined and are produced, for example, in a configuration selected from
any one of formulae I-VI such that the construct has enhanced pharmacokinetic properties and reduced
systemic clearance. The invention further takes advantage of the fact that certain ligands with reduced
binding to a clearance receptor, either as a result of a decreased on-rate or an increased off-rate, may be
effected by the obstruction of either the N- or C-terminus and using that terminus as the linkage to
another polypeptide of the composition, whether another molecule of a GLP-2, an XTEN, or a spacer
sequence results in the reduced binding. The choice of the particular configuration of the GLP2-XTEN
fusion protein can be tested by methods disclosed herein to confirm those configurations that reduce the
degree of binding to a clearance receptor such that a reduced rate of active clearance is achieved.
In one embodiment, the invention provides GLP2-XTEN with enhanced pharmacokinetic
properties wherein the GLP2-XTEN is a sequence that has at least about 80% sequence identity, or
alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, or 100% sequence identity compared to a sequence selected from any one of Tables 13,
32 or 33. In other embodiments, the GLP2-XTEN with enhanced pharmacokinetic properties comprises
a GLP-2 sequence that has at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%,
85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or about 99% sequence
identity compared to a sequence from Table 1 linked to one or more XTEN that has at least about 80%
sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%,
93%, 94%, 95%, 96%, 97%, 98%, or about 99% sequence identity compared to a sequence from Table 4.
For the subject compositions, GLP2-XTEN with a longer terminal half-life is generally preferred, so as to
improve patient convenience, to increase the interval between doses and to reduce the amount of drug
required to achieve a sustained effect. In the embodiments hereinabove described in this paragraph the
administration of the fusion protein results in an improvement in at least one, two, three or more of the
parameters disclosed herein as being useful for assessing the subject conditions; e.g., maintaining a blood
concentration, maintaining bowel function, preventing onset of a symptom associated with a
gastrointestinal condition such as colitis, short bowel syndrome or Crohn’s Disease, using a lower dose
of fusion protein compared to the corresponding GLP-2 component not linked to the fusion protein and
administered at a comparable dose or dose regimen to a subject. Alternatively, in the embodiments
hereinabove described in this paragraph the administration of the fusion protein results in an
improvement in at least one of the parameters disclosed herein as being useful for assessing the subject
conditions using a comparable dose of fusion protein but administered using a dose regimen that has a
2-fold, or 3-fold, or 4-fold, or 5-fold, or 6-fold, or 7-fold, or 8-fold, or 10-fold, or 20-fold greater interval
between dose administrations compared to the corresponding GLP-2 component not linked to the fusion
protein and administered to the subject. In the foregoing embodiments, the total dose in millimoles/kg
administered to achieve the improvement in the parameter(s) is at least about three-fold lower, or at least
about four-fold, or at least about five-fold, or at least about six-fold, or at least about eight-fold, or at least
about 10-fold lower compared to the corresponding GLP-2 component not linked to the XTEN.
As described more fully in the Examples pertaining to pharmacokinetic characteristics of fusion
proteins comprising XTEN, it was observed that increasing the length of the XTEN sequence confers a
disproportionate increase in the terminal half-life of a fusion protein comprising the XTEN.
Accordingly, the invention provides GLP2-XTEN fusion proteins comprising XTEN wherein the XTEN
is selected to provide a targeted half-life for the GLP2-XTEN composition administered to a subject. In
some embodiments, the invention provides monomeric GLP2-XTEN fusion proteins comprising XTEN
wherein the XTEN is selected to confer an increase in the terminal half-life for the GLP2-XTEN
administered to a subject, compared to the corresponding GLP-2 not linked to the XTEN and
administered at a comparable dose, wherein the increase is at least about two-fold longer, or at least about
three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold, or at least about
seven-fold, or at least about eight-fold, or at least about nine-fold, or at least about ten-fold, or at least
about 15-fold, or at least about 20-fold, or at least about 40-fold or greater increase in terminal half-life
compared to the GLP-2 not linked to the XTEN. In another embodiment, the administration of a
therapeutically effective amount of GLP2-XTEN to a subject in need thereof results in a terminal half-life
that is at least 12 h greater, or at least about 24 h greater, or at least about 48 h greater, or at least about
72 h greater, or at least about 96 h greater, or at least about 144 h greater, or at least about 7 days greater,
or at least about 14 days greater, or at least about 21 days greater compared to a comparable dose of the
corresponding GLP-2 not linked to the XTEN. In another embodiment, administration of a
therapeutically effective dose of a GLP2-XTEN fusion protein to a subject in need thereof can result in a
gain in time between consecutive doses necessary to maintain a therapeutically effective blood level of
the fusion protein of at least 48 h, or at least 72 h, or at least about 96 h, or at least about 120 h, or at least
about 7 days, or at least about 14 days, or at least about 21 days between consecutive doses compared to
the corresponding GLP-2 not linked to the XTEN and administered at a comparable dose. It will be
understood in the art that the time between consecutive doses to maintain a “therapeutically effective
blood level” will vary greatly depending on the physiologic state of the subject, and it will be appreciated
that a patient with Crohn’s Disease may require more frequent and longer dosing of a GLP-2 preparation
compared to a patient receiving the same preparation for short bowel syndrome. The foregoing
notwithstanding, it is believed that the GLP2-XTEN of the present invention permit less frequent dosing,
as described above, compared to a GLP-2 not linked to the XTEN. In one embodiment, the GLP2-XTEN
administered using a therapeutically-effective amount to a subject results in blood concentrations of the
GLP2-XTEN fusion protein that remain above at least 500 ng/ml, or at least about 1000 ng/ml, or at
least about 2000 ng/ml, or at least about 3000 ng/ml, or at least about 4000 ng/ml, or at least about 5000
ng/ml, or at least about 10000 ng/ml, or at least about 15000 ng/ml, or at least about 20000 ng/ml, or at
least about 30000 ng/ml, or at least about 40000 ng/ml for at least about 24 hours, or at least about 48
hours, or at least about 72 hours, or at least about 96 hours, or at least about 120 hours, or at least about
144 hours.
In one embodiment, the present invention provides GLP2-XTEN fusion proteins that exhibit
an increase in AUC of at least about 50%, or at least about 60%, or at least about 70%, or at least about
80%, or at least about 90%, or at least about a 100%, or at least about 150%, or at least about 200%, or at
least about 300%, or at least about 500%, or at least about 1000%, or at least about a 2000% compared to
the corresponding GLP-2 not linked to the XTEN and administered to a subject at a comparable dose. In
another embodiment, the GLP2-XTEN administered at an appropriate dose to a subject results in area
under the curve concentrations of the GLP2-XTEN fusion protein of at least 100000 hr*ng/mL, or at
least about 200000 hr*ng/mL, or at least about 400000 hr*ng/mL, or at least about 600000 hr*ng/mL,
or at least about 800000 hr*ng/mL, or at least about 1000000 hr*ng/mL, or at least about 2000000
hr*ng/mL after a single dose. The pharmacokinetic parameters of a GLP2-XTEN can be determined by
standard methods involving dosing, the taking of blood samples at timed intervals, and the assaying of the
protein using ELISA, HPLC, radioassay, or other methods known in the art or as described herein,
followed by standard calculations of the data to derive the half-life and other PK parameters.
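The "standard calculations" referred to above commonly include a trapezoidal estimate of AUC and a log-linear fit of the terminal phase to obtain half-life. The following is a minimal non-compartmental sketch, assuming sparse concentration-time samples and using made-up data; the function names and the choice of the last three points for the terminal fit are illustrative assumptions rather than the procedure of the Examples.

```python
import math

def auc_trapezoid(times_h, conc_ng_ml):
    """AUC from first to last sample by the linear trapezoidal rule (hr*ng/mL)."""
    auc = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        auc += dt * (conc_ng_ml[i] + conc_ng_ml[i - 1]) / 2.0
    return auc

def terminal_half_life(times_h, conc_ng_ml, n_points=3):
    """Terminal half-life (h) from a least-squares line through ln(conc)
    of the last n_points samples: t1/2 = ln(2) / (-slope)."""
    t = times_h[-n_points:]
    y = [math.log(c) for c in conc_ng_ml[-n_points:]]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    slope = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
             / sum((ti - t_mean) ** 2 for ti in t))
    return math.log(2) / -slope

# Made-up single-dose data: concentration halves every 24 h.
times = [0, 24, 48, 72]
conc = [1000.0, 500.0, 250.0, 125.0]
auc = auc_trapezoid(times, conc)
t_half = terminal_half_life(times, conc)
```

For real data the terminal phase would be chosen by inspection of the log-concentration plot, and AUC is usually extrapolated to infinity; both refinements are omitted here for brevity.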
The enhanced PK parameters allow for reduced dosing of the GLP2-XTEN compositions,
compared to GLP-2 not linked to the XTEN, particularly for those subjects receiving doses for routine
prophylaxis or chronic treatment of a gastrointestinal condition. In one embodiment, a smaller
moles-equivalent amount of about two-fold less, or about three-fold less, or about four-fold less, or about
five-fold less, or about six-fold less, or about eight-fold less, or about 10-fold less or greater of the fusion
protein is administered in comparison to the corresponding GLP-2 not linked to the XTEN under a dose
regimen needed to maintain a comparable area under the curve as the corresponding amount of the GLP-
2 not linked to the XTEN. In another embodiment, a smaller amount of moles of about two-fold less, or
about three-fold less, or about four-fold less, or about five-fold less, or about six-fold less, or about eight-
fold less, or about 10-fold less or greater of the fusion protein is administered in comparison to the
corresponding GLP-2 not linked to the XTEN under a dose regimen needed to maintain a blood
concentration above at least about 500 ng/ml, at least about 1000 ng/ml, or at least about 2000 ng/ml, or
at least about 3000 ng/ml, or at least about 4000 ng/ml, or at least about 5000 ng/ml, or at least about
10000 ng/ml, or at least about 15000 ng/ml, or at least about 20000 ng/ml, or at least about 30000 ng/ml,
or at least about 40000 ng/ml for at least about 24 hours, or at least about 48 h, or at least 72 h, or at least
96 h, or at least 120 h compared to the corresponding amount of the GLP-2 not linked to the XTEN. In
another embodiment, the GLP2-XTEN fusion protein requires less frequent administration for treatment
of a subject with an intestinal condition, wherein the dose is administered about every four days, about
every seven days, about every 10 days, about every 14 days, about every 21 days, or about monthly of the
fusion protein administered to a subject, and the fusion protein achieves a comparable area under the
curve as the corresponding GLP-2 not linked to the XTEN. In yet other embodiments, an accumulatively
smaller amount of moles of about 5%, or about 10%, or about 20%, or about 40%, or about 50%, or
about 60%, or about 70%, or about 80%, or about 90% less of the fusion protein is administered to a
subject in comparison to the corresponding amount of the GLP-2 not linked to the XTEN under a dose
regimen needed to achieve the therapeutic outcome or clinical parameter, yet the fusion protein achieves
at least a comparable area under the curve as the corresponding GLP-2 not linked to the XTEN. The
accumulative smaller amount is measured for a period of at least about one week, or about 14 days, or
about 21 days, or about one month.
(b) Pharmacology and Pharmaceutical Properties of GLP2-XTEN
The present invention provides GLP2-XTEN compositions comprising GLP-2 covalently
linked to the XTEN that can have enhanced properties compared to GLP-2 not linked to XTEN, as well
as methods to enhance the therapeutic and/or biologic activity or effect of the respective two GLP-2
components of the compositions. In addition, GLP2-XTEN fusion proteins provide significant
advantages over chemical conjugates, such as pegylated constructs of GLP-2, notably the fact that
recombinant GLP2-XTEN fusion proteins can be made in host cell expression systems, which can reduce
time and cost at both the research and development and manufacturing stages of a product, as well as
result in a more homogeneous, defined product with less toxicity for both the product and metabolites of
the GLP2-XTEN compared to pegylated conjugates.
As therapeutic agents, the GLP2-XTEN possesses a number of advantages over therapeutics
not comprising XTEN, including one or more of the following non-limiting improved properties:
increased solubility, increased thermal stability, reduced immunogenicity, increased apparent molecular
weight, reduced renal clearance, reduced proteolysis, reduced metabolism, enhanced therapeutic
efficiency, a lower effective therapeutic dose, increased bioavailability, increased time between doses
capable of maintaining a subject without increased symptoms of colitis, enteritis, or Crohn’s Disease, the
ability to administer the GLP2-XTEN composition intravenously, subcutaneously, or intramuscularly, a
“tailored” rate of absorption when administered intravenously, subcutaneously, or intramuscularly,
enhanced lyophilization stability, enhanced serum/plasma stability, increased terminal half-life, increased
solubility in blood stream, decreased binding by neutralizing antibodies, decreased active clearance,
reduced side effects, reduced immunogenicity, retention of substrate binding affinity, stability to
degradation, stability to freeze-thaw, stability to proteases, stability to ubiquitination, ease of
administration, compatibility with other pharmaceutical excipients or carriers, persistence in the subject,
increased stability in storage (e.g., increased shelf-life), reduced toxicity in an organism or environment
and the like. The GLP2-XTEN fusion proteins of the embodiments disclosed herein exhibit one or more
or any combination of the improved properties and/or the embodiments as detailed herein. The net effect
of the enhanced properties is that the use of a GLP2-XTEN composition can result in enhanced
therapeutic and/or biologic effect compared to a GLP-2 not linked to the XTEN, result in economic
benefits associated with less frequent dosing, or result in improved patient compliance when
administered to a subject with a GLP-2-related condition.
In one embodiment, XTEN as a fusion partner increases the solubility of the GLP-2 payload.
Accordingly, where enhancement of the pharmaceutical or physicochemical properties of the GLP-2 is
desirable, such as the degree of aqueous solubility or stability, the length and/or the motif family
composition of the XTEN sequences incorporated into the fusion protein may each be selected to confer
a different degree of solubility and/or stability on the respective fusion proteins such that the overall
pharmaceutical properties of the GLP2-XTEN composition are enhanced. The GLP2-XTEN fusion
proteins can be constructed and assayed, using methods described herein, to confirm the physicochemical
properties and the XTEN adjusted, as needed, to result in the desired properties. In one embodiment, the
GLP2-XTEN has an aqueous solubility that is at least about 25% greater compared to a GLP-2 not linked
to the fusion protein, or at least about 30%, or at least about 40%, or at least about 50%, or at least about
75%, or at least about 100%, or at least about 200%, or at least about 300%, or at least about 400%, or at
least about 500%, or at least about 1000% greater than the corresponding GLP-2 not linked to the fusion
protein.
The invention provides methods to produce and recover expressed GLP2-XTEN from a host
cell with enhanced solubility and ease of recovery compared to GLP-2 not linked to the XTEN. In one
embodiment, the method includes the steps of transforming a host cell with a polynucleotide encoding a
GLP2-XTEN with one or more XTEN components of cumulative sequence length greater than about
100, or greater than about 200, or greater than about 400, or greater than about 800 amino acid residues,
expressing the GLP2-XTEN fusion protein in the host cell, and recovering the expressed fusion protein in
soluble form. In the foregoing embodiment, the XTEN of the GLP2-XTEN fusion proteins can have at
least about 80% sequence identity, or about 90%, or about 91%, or about 92%, or about 93%, or about
94%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99%, to about 100% sequence
identity compared to one or more XTEN selected from Table 4, and the GLP-2 can have at least about
80% sequence identity, or about 90%, or about 91%, or about 92%, or about 93%, or about 94%, or about
95%, or about 96%, or about 97%, or about 98%, or about 99%, or 100% sequence identity compared to
a GLP-2 selected from Table 1, and the GLP2-XTEN components can be in an N- to C-terminus
configuration selected from any one of formulae I-VI.
The invention provides methods to produce the GLP2-XTEN compositions that can maintain
the GLP-2 component at therapeutic levels when administered to a subject in need thereof for at least a
two-fold, or at least a three-fold, or at least a four-fold, or at least a five-fold longer period of time
compared to comparable dosages of the corresponding GLP-2 not linked to the XTEN. It will be
understood in the art that a "comparable dosage" of GLP2-XTEN fusion protein would represent a
greater weight of agent but would have the same approximate moles of GLP-2 in the dose of the fusion
protein and/or would have the same approximate g concentration relative to the dose of GLP-2 not
linked to the XTEN. The method to produce the compositions that can maintain the GLP-2 component
at therapeutic levels includes the steps of selecting the XTEN appropriate for conjugation to a GLP-2 to
provide the desired pharmacokinetic properties in view of a given dose and dose regimen, creating an
expression construct that encodes the GLP2-XTEN using a configuration described herein, transforming
an appropriate host cell with an expression vector comprising the encoding gene, expressing and
recovering the GLP2-XTEN, administration of the GLP2-XTEN to a subject followed by assays to verify
the pharmacokinetic properties, the activity of the GLP2-XTEN fusion protein (e.g., the ability to bind
receptor), and the safety of the administered composition. The subject can be selected from mouse, rat,
monkey and human. By the methods, GLP2-XTEN provided herein can result in increased efficacy of
the administered composition by maintaining the circulating concentrations of the GLP-2 at therapeutic
levels for an enhanced period of time.
In another aspect, the GLP2-XTEN compositions of the invention are capable of resulting in an
intestinotrophic effect. As used herein, “intestinotrophic effect” means that a subject, e.g., mouse, rat,
monkey or human, exhibits at least one of the following after administration of a GLP-2 containing
composition: intestinal growth, increased hyperplasia of the villus epithelium, increased crypt cell
proliferation, increased height of the crypt and villus axis, increased healing after intestinal
anastomosis, increased small bowel weight, increased small bowel length, decreased small bowel
epithelium apoptosis, or enhancement of intestinal function. The GLP2-XTEN compositions may act in
an endocrine fashion to link intestinal growth and metabolism with nutrient intake. GLP-2 and related
analogs may be treatments for short bowel syndrome, Crohn's disease, osteoporosis and as adjuvant
therapy during cancer chemotherapy, amongst other gastrointestinal conditions described herein. In one
embodiment, a GLP2-XTEN is capable of resulting in at least one, or two, or three or more
intestinotrophic effects when administered to a subject using an effective amount.
The characteristics of GLP2-XTEN compositions of the invention, including functional
characteristics or biologic and pharmacologic activity and parameters that result, can be determined by
any suitable screening assay known in the art for measuring the desired characteristic. The invention
provides methods to assay the GLP2-XTEN fusion proteins of differing composition or configuration in
order to provide GLP2-XTEN with the desired degree of biologic and/or therapeutic activity, as well as
safety profile. Specific in vitro, in vivo and ex vivo biological assays are used to assess the activity of
each configured GLP2-XTEN and/or GLP-2 component to be incorporated into GLP2-XTEN, including
but not limited to the assays of the Examples, assays of Table 32, determination of inflammatory
cytokine levels, GLP-2 blood concentrations, ELISA assays, or bowel function tests, as well as clinical
endpoints such as bleeding, inflammation, colitis, diarrhea, fecal wet weight, weight loss, sodium loss,
intestinal ulcers, intestinal obstruction, fistulae, and abscesses, survival, among others known in the art.
The foregoing assays or endpoints can also be used in preclinical assays to assess GLP-2 sequence
variants (assayed as single components or as GLP2-XTEN fusion proteins) and can be compared to the
native human GLP-2 to determine whether they have the same degree of biologic activity as the native
GLP-2, or some fraction thereof such that they are suitable for inclusion in GLP2-XTEN. In one
embodiment, the invention provides GLP2-XTEN fusion proteins that exhibit at least about 30%, or at
least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about
80%, or at least about 90%, or at least about 100% or at least about 120% or at least about 150% or at
least about 200% of the intestinotrophic effect compared to the corresponding GLP-2 not linked to
XTEN and administered to a subject using a comparable dose.
Dose optimization is important for all drugs. A therapeutically effective dose or amount of the
GLP2-XTEN varies according to factors such as the disease state, age, sex, and weight of the individual,
and the ability of the administered fusion protein to elicit a desired response in the individual. For
example, a standardized single dose of GLP-2 for all patients presenting with diverse pulmonary
conditions or abnormal clinical parameters (e.g., neutralizing antibodies) may not always be effective. A
consideration of these factors is well within the purview of the ordinarily skilled clinician for the purpose
of determining the therapeutically or pharmacologically effective amount of the GLP2-XTEN and the
appropriate dosing schedule, versus that amount that would result in insufficient potency such that
clinical improvement is not achieved.
The methods of the invention include administration of consecutive doses of a therapeutically
effective amount of the GLP2-XTEN for a period of time sufficient to achieve and/or maintain the
desired parameter or clinical effect, and such consecutive doses of a therapeutically effective amount
establish the therapeutically effective dose regimen for the GLP2-XTEN, i.e., the schedule for
consecutively administered doses of the fusion protein composition, wherein the doses are given in
amounts to result in a sustained beneficial effect on any clinical sign or symptom, aspect, measured
parameter or characteristic of a GLP-2-related disease state or condition, including, but not limited to,
those described herein. A prophylactically effective amount refers to an amount of GLP2-XTEN
required for the period of time necessary to prevent a physiologic or clinical result or event; e.g., reduced
mesenteric blood flow, bleeding, inflammation, colitis, diarrhea, fecal wet weight, weight loss, sodium
loss, intestinal ulcers, intestinal obstruction, fistulae, and abscesses, changed frequency in bowel
movements, s, as well as growth failure in children, or maintaining blood concentrations of GLP-2
above a threshold level, e.g., 100 ng/ml of GLP-2 equivalent (or approximately 2200 ng/ml of
GLP-2-2G_XTEN_AE864) or 30 pmol/L. In the methods of treatment, the dosage amount of the GLP2-XTEN
that is administered to a subject ranges from about 0.2 to 500 mg/kg/dose (2.5 nmol/kg to 6250 nmol/kg),
or from about 2 to 300 mg/kg/dose (25 nmol/kg to 3750 nmol/kg), or from about 6 to about 100
mg/kg/dose (75 nmol/kg/dose to 1250 nmol/kg/dose), or from about 10 to about 60 mg/kg/dose (125
nmol/kg/dose to 750 nmol/kg/dose) for a subject. A suitable dosage may also depend on other factors that
may influence the response to the drug; e.g., subjects with surgically resected bowel generally requiring
higher doses compared to irritable bowel syndrome. In some embodiments, the method comprises
administering a therapeutically-effective amount of a pharmaceutical composition comprising a
GLP2-XTEN fusion protein composition comprising GLP-2 linked to one or more XTEN sequences and at least
one pharmaceutically acceptable carrier to a subject in need thereof that results in a greater improvement
in at least one of the disclosed parameters or physiologic conditions, or results in a more favorable
clinical outcome compared to the effect on the parameter, condition or clinical outcome mediated by
administration of a pharmaceutical composition comprising a GLP-2 not linked to XTEN and
stered at a comparable dose. In one embodiment of the foregoing, the improvement is ed by
administration of the TEN pharmaceutical composition at a therapeutically effective dose. In
another embodiment of the foregoing, the improvement is achieved by administration of multiple
consecutive doses of the GLP2-XTEN pharmaceutical composition using a therapeutically effective dose
regimen (as defined herein) for the length of the dosing period.
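The paired mg/kg and nmol/kg figures in the dosage ranges above are mutually consistent with a fusion protein molecular weight of roughly 80 kDa (for example, 0.2 mg/kg alongside 2.5 nmol/kg). The following minimal Python sketch illustrates that unit conversion. The 80,000 g/mol value is inferred here from those ratios and is an assumption, not a figure stated in the text; the function names are hypothetical.

```python
# Assumed molecular weight (g/mol), inferred from the 0.2 mg/kg <-> 2.5 nmol/kg
# pairing in the text; it is not a value stated in the specification.
ASSUMED_MW_G_PER_MOL = 80_000.0


def mg_per_kg_to_nmol_per_kg(dose_mg_per_kg, mw_g_per_mol=ASSUMED_MW_G_PER_MOL):
    """Convert a mass-based dose (mg/kg) to a molar dose (nmol/kg)."""
    if mw_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    # mg -> g contributes 1e-3 and mol -> nmol contributes 1e9: net factor 1e6.
    return dose_mg_per_kg * 1e6 / mw_g_per_mol


def nmol_per_kg_to_mg_per_kg(dose_nmol_per_kg, mw_g_per_mol=ASSUMED_MW_G_PER_MOL):
    """Convert a molar dose (nmol/kg) back to a mass-based dose (mg/kg)."""
    if mw_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return dose_nmol_per_kg * mw_g_per_mol / 1e6


if __name__ == "__main__":
    # Range endpoints quoted in the paragraph above.
    for mg in (0.2, 2, 6, 10, 60, 100, 300, 500):
        print(f"{mg} mg/kg -> {mg_per_kg_to_nmol_per_kg(mg):g} nmol/kg")
```

A different construct (for example, one with a shorter XTEN) would have a different molecular weight, so the second argument should be set accordingly rather than relying on the assumed default.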
In many cases, the therapeutic levels for GLP-2 in subjects of different ages or degree of
disease have been established and are available in published literature or are stated on the drug label for
approved products containing the GLP-2. In other cases, the therapeutic levels can be established for
new compositions, including those GLP2-XTEN fusion proteins of the disclosure. The methods for
establishing the therapeutic levels and dosing schedules for a given composition are known to those of
skill in the art (see, e.g., Goodman & Gilman's The Pharmacological Basis of Therapeutics, 11th Edition,
McGraw-Hill (2005)). For example, by using dose-escalation studies in subjects with the target disease
or condition to determine efficacy or a desirable pharmacologic effect, appearance of adverse events, and
determination of circulating blood levels, the therapeutic blood levels for a given subject or population of
subjects can be determined for a given drug or biologic. The dose escalation studies can evaluate the
activity of a GLP2-XTEN through metabolic studies in a subject or group of subjects that monitor
physiological or biochemical parameters, as known in the art or as described herein for one or more
parameters associated with the GLP-2-related condition, or clinical parameters associated with a
beneficial outcome for the particular indication, together with observations and/or measured parameters
to determine the no-effect dose, adverse events, minimum effective dose and the like, together with
measurement of pharmacokinetic parameters that establish the determined or derived circulating blood
levels. The results can then be correlated with the dose administered and the blood concentrations of the
therapeutic that are coincident with the foregoing determined parameters or effect levels. By these
methods, a range of doses and blood concentrations can be correlated to the minimum effective dose as
well as the maximum dose and blood concentration at which a desired effect occurs and the period for
which it can be maintained, thereby establishing the therapeutic blood levels and dosing schedule for the
composition. Thus, by the foregoing methods, a Cmin blood level is established, below which the
GLP2-XTEN fusion protein would not have the desired pharmacologic effect, and a Cmax blood level, above
which side effects may occur.
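The Cmin and Cmax bounds described above define a window against which measured blood levels can be checked. The following Python sketch shows one simple way to express that check. The function names and all numeric values used with it are hypothetical illustrations, and concentration units are arbitrary as long as they are used consistently.

```python
def classify_level(conc, c_min, c_max):
    """Classify one measured concentration against the therapeutic window.

    c_min: level below which the desired effect is not expected.
    c_max: level above which side effects may occur.
    """
    if not (0 <= c_min < c_max):
        raise ValueError("require 0 <= c_min < c_max")
    if conc < c_min:
        return "below window"
    if conc > c_max:
        return "above window"
    return "within window"


def fraction_within_window(samples, c_min, c_max):
    """Fraction of sampled concentrations that fall inside the window."""
    if not samples:
        raise ValueError("no samples provided")
    inside = sum(
        1 for c in samples if classify_level(c, c_min, c_max) == "within window"
    )
    return inside / len(samples)


if __name__ == "__main__":
    # Hypothetical measurements and window bounds, for illustration only.
    measurements = [50, 150, 400, 1200]
    print([classify_level(c, 100, 1000) for c in measurements])
    print(fraction_within_window(measurements, 100, 1000))
```

Counting samples is only a rough proxy for time spent in the window; evenly spaced sampling is assumed if the fraction is to be read that way.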
One of skill in the art can, by the means disclosed herein or by other methods known in the art,
confirm that the administered GLP2-XTEN remains at therapeutic blood levels yet retains adequate
safety (thereby establishing the “therapeutic window”) to maintain biological activity for the desired
interval or requires adjustment in dose or length or sequence of XTEN. Further, the determination of the
appropriate dose and dose frequency to keep the GLP2-XTEN within the therapeutic window establishes
the therapeutically effective dose regimen; the schedule for administration of multiple consecutive doses
using a therapeutically effective dose of the fusion protein to a subject in need thereof resulting in
consecutive Cmax peaks and/or Cmin troughs that remain above therapeutically-effective concentrations
and result in an improvement in at least one measured parameter relevant for the target condition. In one
embodiment, the GLP2-XTEN administered at an appropriate dose to a subject results in blood
concentrations of the GLP2-XTEN fusion protein that remain above the minimum effective
concentration to maintain a given activity or effect (as determined by the assays of the Examples or Table
32) for a period at least about two-fold longer compared to the corresponding GLP-2 not linked to XTEN
and administered at a comparable dose; alternatively at least about three-fold longer; alternatively at least
about four-fold longer; alternatively at least about five-fold longer; alternatively at least about six-fold
longer; alternatively at least about seven-fold longer; alternatively at least about eight-fold longer;
alternatively at least about nine-fold longer, alternatively at least about ten-fold longer, or at least about
twenty-fold longer or greater compared to the corresponding GLP-2 not linked to XTEN and
administered at a comparable dose. As used herein, an “appropriate dose” means a dose of a drug or
biologic that, when administered to a subject, would result in a desirable therapeutic or pharmacologic
effect and/or a blood concentration within the therapeutic window. For example, serum or plasma levels
of GLP-2 or XTEN-containing fusion proteins comprising GLP-2 can be measured by nephelometry,
ELISA, HPLC, radioimmunoassay or by immunoelectrophoresis (Jeppesen PB. Impaired meal
stimulated glucagon-like peptide 2 response in ileal resected short bowel patients with intestinal failure.
Gut. (1999) 45(4):559-563; assays of Examples 18-21). Phenotypic identification of GLP-2 or GLP-2
variants can be accomplished by a number of methods including isoelectric focusing (IEF) (Jeppsson et
al., Proc. Natl. Acad. Sci. USA, 81:5690-93, 1994), or by DNA analysis (Kidd et al., Nature, 0-34,
1983; Braun et al., Eur. J. Clin. Chem. Clin. Biochem, 34:761-64, 1996).
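The "fold longer" comparisons above concern the time that blood concentrations stay above a minimum effective concentration. As a rough illustration only, the Python sketch below computes that time under a deliberately simplified model: a single dose, one compartment, and first-order elimination, so that C(t) = C0 * 2^(-t / half-life). This model, the function names, and the numbers in the example are assumptions introduced for illustration; the text does not prescribe any particular pharmacokinetic model.

```python
import math


def time_above_threshold(c0, threshold, half_life):
    """Time for C(t) = c0 * 2**(-t / half_life) to decay to the threshold.

    Returns 0.0 if the starting concentration is already at or below it.
    The result is in the same time units as half_life.
    """
    if c0 <= 0 or threshold <= 0 or half_life <= 0:
        raise ValueError("c0, threshold and half_life must be positive")
    if c0 <= threshold:
        return 0.0
    return half_life * math.log2(c0 / threshold)


def fold_longer(c0, threshold, half_life_fusion, half_life_unfused):
    """Ratio of time above threshold: fusion protein vs. unfused peptide."""
    t_unfused = time_above_threshold(c0, threshold, half_life_unfused)
    if t_unfused == 0.0:
        raise ValueError("unfused peptide never exceeds the threshold")
    return time_above_threshold(c0, threshold, half_life_fusion) / t_unfused


if __name__ == "__main__":
    # Hypothetical values: same starting level and threshold, and a half-life
    # ten times longer for the fusion protein than for the unfused peptide.
    print(time_above_threshold(800.0, 100.0, 2.0))
    print(fold_longer(800.0, 100.0, 20.0, 2.0))
```

Under these assumptions, with identical starting concentration and threshold, the fold difference reduces to the ratio of the two half-lives; real profiles with absorption phases or multi-compartment kinetics would not simplify this way.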
In one embodiment, administration of at least two doses, or at least three doses, or at least four
or more doses of a GLP2-XTEN using a therapeutically effective dose regimen results in a gain in time
of at least about three-fold longer; alternatively at least about four-fold longer; alternatively at least about
five-fold longer; alternatively at least about six-fold longer; alternatively at least about seven-fold longer;
alternatively at least about eight-fold longer; alternatively at least about nine-fold longer or at least about
ten-fold longer between at least two consecutive Cmax peaks and/or Cmin troughs for blood levels of the
fusion protein compared to the corresponding biologically active protein of the fusion protein not linked
to the XTEN and administered at a comparable dose regimen to a subject. In another embodiment, the
GLP2-XTEN administered at a therapeutically effective dose regimen results in a comparable
improvement in one, or two, or three or more measured parameters using less frequent dosing or a lower
total dosage in moles of the fusion protein of the pharmaceutical composition compared to the
corresponding biologically active protein component(s) not linked to the XTEN and administered to a
subject using a therapeutically effective dose regimen for the GLP-2. The measured parameters include
any of the clinical, biochemical, or physiological parameters disclosed herein, or others known in the art
for assessing subjects with a GLP-2-related condition. Non-limiting examples of parameters or
physiologic effects that can be assayed to assess the activity of the GLP2-XTEN fusion proteins include
assays of the Examples, Table 32 or tests or assays to detect reduced mesenteric blood flow, bleeding,
inflammation, colitis, diarrhea, fecal wet weight, sodium loss, weight loss, intestinal ulcers, intestinal
obstruction, fistulae, and abscesses, changed frequency in bowel movements, uveitis, growth failure in
children, or maintaining blood concentrations of GLP-2 above a threshold level, e.g., 100 ng/ml of
GLP-2 equivalent (or approximately 2200 ng/ml of GLP2G_XTEN_AE864), as well as parameters
obtained from experimental animal models of enteritis such as body weight gain, small intestine ,
reduction in TNFα content of the small intestine, reduced mucosal atrophy, reduced incidence of
perforated ulcers, and height of villi.
In some embodiments, the biological activity of the GLP-2 component is manifested by the
intact GLP2-XTEN fusion protein, while in other cases the biological activity of the GLP-2 component is
primarily manifested upon cleavage and release of the GLP-2 from the fusion protein by action of a
protease that acts on a cleavage sequence incorporated into the GLP2-XTEN fusion protein using
configurations and sequences described herein. In the foregoing, the GLP2-XTEN is designed to reduce
the binding affinity of the GLP-2 component for the GLP-2 receptor when linked to the XTEN but have
restored or increased affinity when released from XTEN through the cleavage of cleavage sequence(s)
incorporated into the GLP2-XTEN sequence. In one embodiment of the foregoing, the invention
provides an isolated fusion protein comprising a GLP-2 linked to at least a first XTEN by a cleavage
sequence, wherein the fusion protein has less than 10% of the biological activity (e.g., receptor binding)
prior to cleavage and wherein the GLP-2 released from the fusion protein by proteolytic cleavage at the
cleavage sequence has biological activity that is at least about 40%, at least about 50%, at least about
60%, or at least about 70%, or at least about 80%, or at least about 90%, or at least about 95% as active
compared to native GLP-2 not linked to the XTEN.
In one aspect, the invention provides GLP2-XTEN compositions designed to reduce active
clearance of the fusion protein, thereby increasing the terminal half-life of GLP2-XTEN administered to
a subject, while still retaining biological activity. Without being bound by any particular theory, it is
believed that the GLP2-XTEN of the present invention have comparatively higher and/or sustained
activity achieved by reduced active clearance of the molecule by the addition of unstructured XTEN to
the GLP-2. Uptake, elimination, and inactivation of GLP-2 can occur in the circulatory system as well as
in the extravascular space.
VI). USES OF THE GLP2-XTEN COMPOSITIONS
In another aspect, the invention provides GLP2-XTEN fusion proteins for use in methods of
treatment, including treatment for achieving a beneficial effect in a gastrointestinal condition mediated or
ameliorated by GLP-2. As used herein, “gastrointestinal condition” is intended to include, but is not
limited to, gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome, short bowel
syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue,
hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced
enteritis, irritable bowel syndrome, small intestine damage, small intestinal damage due to cancer
chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency, acid-induced intestinal
injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness, febrile neutropenia,
obesity, steatorrhea, autoimmune diseases, gastrointestinal barrier disorders, sepsis, bacterial peritonitis,
burn-induced intestinal damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-
associated bacteremia, bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing
enterocolitis, necrotizing pancreatitis, neonatal feeding intolerance, NSAID-induced intestinal
damage, nutritional insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal
nutritional insufficiency, radiation-induced enteritis, radiation-induced injury to the intestines, mucositis,
pouchitis, and gastrointestinal-induced ischemia.
The present invention provides GLP2-XTEN fusion proteins for use in methods for treating a
subject, such as a human, with a GLP-2-related disease, disorder or gastrointestinal condition in order to
achieve a beneficial effect, addressing disadvantages and/or limitations of other methods of treatment
using GLP-2 preparations that have a relatively short terminal half-life, require repeated administrations,
or have unfavorable pharmacoeconomics. The fact that GLP-2 native, recombinant or synthetic proteins
have a short half-life necessitates frequent dosing in order to achieve clinical benefit, which results in
difficulties in the management of such patients.
In one embodiment, the method of treatment comprises administering a therapeutically-effective
amount of a GLP2-XTEN composition to a subject with a gastrointestinal condition. In another
embodiment of the method of treatment, the administration of the GLP2-XTEN composition results in
the improvement of one, two, three or more biochemical, physiological or clinical parameters associated
with the gastrointestinal condition. In the foregoing method, the administered GLP2-XTEN comprises a
GLP-2 with at least about 80%, or at least about 90%, or at least about 95%, or at least about 97%, or at
least about 99% sequence identity to a GLP-2 of Table 1 linked to at least a first XTEN with at least
about 80%, or at least about 90%, or at least about 95%, or at least about 97%, or at least about 99%
sequence identity to a XTEN selected from any one of Tables 4, and 8-12. In another embodiment of the
foregoing method, the administered GLP2-XTEN has a sequence with at least about 80%, or at least
about 90%, or at least about 95%, or at least about 97%, or at least about 99% sequence identity to a
sequence from Tables 13, 32, or 33. In one embodiment, the method of treatment comprises
administering a therapeutically-effective amount of a GLP2-XTEN composition in one or more doses to
a subject with a gastrointestinal condition wherein the administration results in the improvement of one,
two, three or more biochemical, physiological or clinical parameters or a therapeutic effect associated
with the condition for a period at least two-fold longer, or at least four-fold longer, or at least five-fold
longer, or at least six-fold longer compared to a GLP-2 not linked to the XTEN and administered using a
comparable amount. In another embodiment, the method of treatment comprises administering a
therapeutically-effective amount of a GLP2-XTEN composition to a subject suffering from GLP-2
deficiency wherein the administration results in preventing onset of a clinically relevant parameter or
symptom or dropping below a clinically-relevant blood concentration for a duration at least two-fold, or
at least three-fold, or at least four-fold longer compared to a GLP-2 not linked to the XTEN. In another
embodiment, the method of treatment comprises administering a therapeutically-effective amount of a
GLP2-XTEN to a subject with a gastrointestinal condition, wherein the administration results in at least a
5%, or 10%, or 20%, or 30%, or 40%, or 50%, or 60%, or 70%, or 80%, or 90% greater improvement of
at least one, two, or three parameters associated with the gastrointestinal condition compared to the
GLP-2 not linked to XTEN and administered using a comparable nmol/kg amount. In the foregoing
embodiments of the method of treatment, the administration is subcutaneous, intramuscular, or
intravenous. In the foregoing embodiments of the method of treatment, the subject is selected from the
group consisting of mouse, rat, monkey, and human. In the foregoing embodiments of the method of
treatment, the therapeutic effect or parameter includes, but is not limited to, blood concentrations of
GLP-2, increased mesenteric blood flow, decreased inflammation, increased weight gain, decreased
diarrhea, decreased fecal wet weight, intestinal wound healing, increase in plasma citrulline
concentrations, decreased CRP levels, decreased requirement for steroid therapy, enhancing or
stimulating mucosal integrity, decreased sodium loss, minimizing, mitigating, or preventing bacterial
translocation in the intestines, enhancing, stimulating or accelerating recovery of the intestines after
surgery; preventing relapses of inflammatory bowel disease; or achieving or maintaining energy
homeostasis, among others.
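The sequence identity thresholds recited in the embodiments above (80%, 90%, 95%, 97%, 99%) can be illustrated with a short calculation. The Python sketch below computes percent identity for two sequences that are already aligned and of equal length, which is a simplifying assumption; it does not perform an alignment or handle gaps. The example strings are the 33-residue human GLP-2 sequence as commonly published and a variant with glycine at position 2; neither string is copied from Table 1, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two pre-aligned,
    equal-length sequences (no gap handling; a simplified illustration)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a, seq_b, threshold_percent):
    """True if the pre-aligned sequences reach the given percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold_percent


# Human GLP-2 as commonly published (33 residues), and a Gly2 variant.
# Whether either matches an entry of Table 1 is not checked here.
NATIVE_GLP2 = "HADGSFSDEMNTILDNLAARDFINWLIQTKITD"
GLY2_GLP2 = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"

if __name__ == "__main__":
    print(round(percent_identity(NATIVE_GLP2, GLY2_GLP2), 2))
    for threshold in (80, 90, 95, 97, 99):
        print(threshold, meets_identity_threshold(NATIVE_GLP2, GLY2_GLP2, threshold))
```

A single substitution in a 33-residue peptide gives 32 of 33 identical positions, so such a variant clears the 95% threshold but not the 97% one; for sequences of unequal length a proper alignment tool would be needed first.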
In one embodiment, the method of treatment is used to treat a subject with small intestinal
damage due to chemotherapeutic agents such as, but not limited to, 5-FU, altretamine, bleomycin,
busulfan, capecitabine, carboplatin, carmustine, chlorambucil, cisplatin, cladribine, crisantaspase,
cyclophosphamide, cytarabine, dacarbazine, dactinomycin, daunorubicin, docetaxel, doxorubicin,
epirubicin, etoposide, fludarabine, fluorouracil, gemcitabine, hydroxycarbamide, idarubicin, ifosfamide,
irinotecan, liposomal doxorubicin, leucovorin, lomustine, melphalan, mercaptopurine, mesna,
methotrexate, mitomycin, mitoxantrone, oxaliplatin, paclitaxel, pemetrexed, pentostatin, procarbazine,
raltitrexed, streptozocin, tegafur-uracil, temozolomide, thiotepa, tioguanine, thioguanine, topotecan,
treosulfan, vinblastine, vincristine, vindesine, and vinorelbine.
Prior to administering treatment by the described methods, a diagnosis of a gastrointestinal
condition may be obtained. A gastrointestinal condition can be diagnosed by standard of care means
known in the art. Ulcers, for example, may be diagnosed by barium X-ray of the esophagus, stomach, and
intestine, by endoscopy, or by blood, breath, and stomach tissue biopsy (e.g., to detect the presence of
Helicobacter pylori). Malabsorption syndromes can be diagnosed by blood tests or stool tests that
monitor nutrient levels in the blood or levels of fat in stool that are diagnostic of a malabsorption
syndrome. Celiac sprue can be diagnosed by antibody tests which may include testing for antiendomysial
antibody (IgA), antitransglutaminase (IgA), antigliadin (IgA and IgG), and total serum IgA. Endoscopy
or small bowel biopsy can be used to detect abnormal intestinal lining, where symptoms such as flattening
of the villi are diagnostic of celiac sprue. Tropical sprue can be diagnosed by detecting
malabsorption or infection using small bowel biopsy or response to chemotherapy. Inflammatory bowel
disease can be detected by colonoscopy or by an x-ray following a barium enema in combination with
clinical symptoms, where inflammation, bleeding, or ulcers on the colon wall are diagnostic of
inflammatory bowel diseases such as ulcerative colitis or Crohn's disease.
In some embodiments of the method of treatment, administration of the GLP2-XTEN to a
subject results in an improvement in one or more of the biochemical, physiologic, or clinical parameters
that is of greater magnitude than that of the corresponding GLP-2 component not linked to the XTEN,
determined using the same assay or based on a measured clinical parameter. In one embodiment of the
foregoing, the administration of a therapeutically effective amount of a GLP2-XTEN composition to a
subject in need thereof results in a greater reduction of parenteral nutrition (PN) dependence in patients
with adult short bowel syndrome (SBS) of about 10%, or about 20%, or about 30%, or about 40%, or
about 50%, or about 60%, or about 70%, or more in the subject at 2-7 days after administration compared
to a comparable amount of the corresponding GLP-2 not linked to the XTEN. In another embodiment,
the administration of a GLP2-XTEN to a subject in need thereof using a therapeutically effective dose
regimen results in an increase of body weight of about 10%, or about 20%, or about 30%, or about 40%, or
about 50% or more in the subject at 7, 10, 14, 21 or 30 days after initiation of administration compared to
a comparable therapeutically effective dose regimen of the corresponding GLP-2 not linked to the XTEN.
In another embodiment, the administration of a therapeutically effective amount of a GLP2-XTEN
composition to a subject in need thereof results in a greater reduction in fecal wet weight in patients with
adult short bowel syndrome (SBS) of about 10%, or about 20%, or about 30%, or about 40%, or about
50%, or about 60%, or about 70%, or more in the subject at 2-7 days after administration compared to a
comparable amount of the corresponding GLP-2 not linked to the XTEN. In another embodiment, the
administration of a therapeutically effective amount of a GLP2-XTEN composition to a subject in need
thereof results in a greater reduction in sodium loss in patients with adult short bowel syndrome (SBS) of
about 10%, or about 20%, or about 30%, or about 40%, or about 50%, or about 60%, or about 70%, or
more in the subject at 2-7 days after administration compared to a comparable amount of the
corresponding GLP-2 not linked to the XTEN.
In some embodiments of the method of treatment, (i) a smaller amount of moles of about two-fold
less, or about three-fold less, or about four-fold less, or about five-fold less, or about six-fold less, or
about eight-fold less, or about 10-fold less of the GLP2-XTEN fusion protein is administered to a subject
in need thereof in comparison to the corresponding GLP-2 not linked to the XTEN under an otherwise
same dose regimen, and the fusion protein achieves a comparable area under the curve and/or a
comparable therapeutic effect as the corresponding GLP-2 not linked to the XTEN; (ii) the GLP2-XTEN
fusion protein is administered less frequently (e.g., every three days, about every seven days, about every
10 days, about every 14 days, about every 21 days, or about monthly) in comparison to the
corresponding GLP-2 not linked to the XTEN under an otherwise same dose amount, and the fusion
protein achieves a comparable area under the curve and/or a comparable therapeutic effect as the
corresponding GLP-2 not linked to the XTEN; or (iii) an accumulative smaller amount of moles of at
least about 20%, or about 30%, or about 40%, or about 50%, or about 60%, or about 70%, or about 80%,
or about 90% less of the fusion protein is administered in comparison to the corresponding GLP-2 not
linked to the XTEN under an otherwise same dose regimen and the GLP2-XTEN fusion protein achieves
a comparable area under the curve and/or a comparable therapeutic effect as the corresponding GLP-2
not linked to the XTEN. The accumulative smaller amount is measured for a period of at least about one
week, or about 14 days, or about 21 days, or about one month. In the foregoing embodiments of the
method of treatment, the therapeutic effect can be determined by any of the measured parameters
described herein, including but not limited to blood concentrations of GLP-2, assays of Table 32, or
assays to detect reduced mesenteric blood flow, bleeding, inflammation, colitis, diarrhea, fecal wet
weight, weight loss, sodium loss, intestinal ulcers, intestinal obstruction, fistulae, and abscesses, changed
frequency in bowel movements, uveitis, growth failure in children, or maintaining blood concentrations
of GLP-2 above a threshold level, e.g., 100 ng/ml of GLP-2 equivalent (or approximately 2200 ng/ml of
GLP2G_XTEN_AE864), among others known in the art for GLP-2-related conditions.
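The comparison of accumulative molar amounts over a fixed period, as in item (iii) above, comes down to simple arithmetic. The Python sketch below illustrates it, assuming the first dose is given on day 0 and further doses follow at a fixed whole-day interval within the measurement period. The function names, doses, and intervals are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cumulative_dose(dose_nmol_per_kg, interval_days, period_days):
    """Total molar dose (nmol/kg) given over a period.

    Assumes a dose on day 0 and then every interval_days, counting only
    doses given strictly before period_days has elapsed.
    """
    if dose_nmol_per_kg < 0 or interval_days <= 0 or period_days <= 0:
        raise ValueError("dose must be non-negative; interval and period positive")
    # Ceiling division for positive integers gives the number of doses.
    n_doses = -(-period_days // interval_days)
    return dose_nmol_per_kg * n_doses


def percent_less(reference_total, test_total):
    """How much smaller test_total is than reference_total, in percent."""
    if reference_total <= 0:
        raise ValueError("reference total must be positive")
    return 100.0 * (reference_total - test_total) / reference_total


if __name__ == "__main__":
    # Hypothetical regimens compared over 28 days.
    unfused_total = cumulative_dose(10, 1, 28)   # daily dosing
    fusion_total = cumulative_dose(35, 7, 28)    # weekly dosing
    print(unfused_total, fusion_total, percent_less(unfused_total, fusion_total))
```

In this made-up example, the weekly regimen delivers half the accumulative molar amount of the daily regimen over the same four weeks; whether the therapeutic effect is comparable is an experimental question the arithmetic does not address.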
The invention provides GLP2-XTEN fusion proteins for use in a pharmaceutical regimen for
treating a subject with a gastrointestinal condition. In one embodiment, the regimen comprises a
pharmaceutical composition comprising a GLP2-XTEN fusion protein described herein. In another
embodiment, the pharmaceutical regimen further comprises the step of determining the amount of
pharmaceutical composition needed to achieve a therapeutic effect in the subject. In another
embodiment, the pharmaceutical regimen for treating a subject with a gastrointestinal condition
comprises administering the pharmaceutical composition in two or more successive doses to the subject
at an effective amount, wherein the administration results in at least a 5%, or 10%, or 20%, or 30%, or
40%, or 50%, or 60%, or 70%, or 80%, or 90% greater improvement of at least one, two, or three
parameters associated with the gastrointestinal condition compared to the GLP-2 not linked to XTEN and
administered using a comparable nmol/kg amount. In another embodiment of the pharmaceutical
regimen, the effective amount is at least about 5, or at least about 10, or at least about 25, or at least about 100,
or at least about 200 nmoles/kg, or any amount intermediate to the foregoing. In another embodiment, the
pharmaceutical regimen for treating a subject with a gastrointestinal condition comprises administering a
therapeutically effective amount of the pharmaceutical composition once about every 3, 6, 7, 10, 14, 21,
28 or more days. In another embodiment, the pharmaceutical regimen for treating a subject with a
gastrointestinal condition comprises administering the GLP2-XTEN pharmaceutical composition wherein
said administration is subcutaneous, intramuscular, or intravenous. In another embodiment, the
pharmaceutical regimen for treating a subject with a gastrointestinal condition comprises administering a
therapeutically effective amount of the pharmaceutical composition, wherein the therapeutically effective
amount results in maintaining blood concentrations of the fusion protein within a therapeutic window for
the fusion protein at least three-fold longer compared to the corresponding GLP-2 not linked to the
XTEN administered at a comparable amount to the subject.
The invention further contemplates that the GLP2-XTEN used in accordance with the methods
provided herein can be administered in conjunction with other treatment methods and compositions (e.g.,
anti-inflammatory agents such as steroids or NSAIDs) useful for treating GLP-2-related conditions, or
conditions for which GLP-2 is or could be adjunctive therapy.
In another aspect, the invention provides GLP2-XTEN fusion proteins for use in a method of
preparing a medicament for treatment of a GLP-2-related condition. In one embodiment, the method of
preparing a medicament comprises linking a GLP-2 sequence with at least about 80%, or at least about
90%, or at least about 95%, or at least about 97%, or at least about 99% sequence identity to a GLP-2 of
Table 1 to at least a first XTEN with at least about 80%, or at least about 90%, or at least about 95%, or
at least about 97%, or at least about 99% sequence identity to a XTEN selected from any one of Tables 4,
and 8-12, wherein the GLP2-XTEN retains at least a portion of the biological activity of the native GLP-
2, and further combining the GLP2-XTEN with at least one pharmaceutically acceptable carrier. In
another embodiment, the GLP2-XTEN has a sequence with at least about 80%, or at least about 90%, or
at least about 95%, or at least about 97%, or at least about 99% sequence identity compared to a sequence
selected from any one of Tables 13, 32 or 33.
In another aspect, the invention provides a method of designing the GLP2-XTEN compositions
to achieve desired pharmacokinetic, pharmacologic or pharmaceutical properties. In general, the steps in
the design and production of the fusion proteins and the inventive compositions, as illustrated in FIGS. 4-
6, include: (1) selecting a GLP-2 (e.g., native proteins, sequences of Table 1, analogs or derivatives with
activity) to treat the particular condition; (2) selecting the XTEN that will confer the desired PK and
physicochemical characteristics on the resulting GLP2-XTEN (e.g., the administration of the
GLP2-XTEN composition to a subject results in the fusion protein being maintained within the therapeutic
window for a longer period compared to GLP-2 not linked to the XTEN); (3) establishing a desired N- to
C-terminus configuration of the GLP2-XTEN to achieve the desired efficacy or PK parameters; (4)
establishing the design of the expression vector encoding the configured GLP2-XTEN; (5) transforming
a suitable host with the expression vector; and (6) expressing and recovering the resultant fusion
protein. For those GLP2-XTEN for which an increase in half-life or an increased period of time spent
above the minimum effective concentration is desired, the XTEN chosen for incorporation generally has
at least about 288, or about 432, or about 576, or about 864, or about 875, or about 912, or about 923
amino acid residues where a single XTEN is to be incorporated into the GLP2-XTEN. In another
embodiment, the GLP2-XTEN comprises a first XTEN of the foregoing lengths, and at least a second
XTEN of about 36, or about 72, or about 144, or about 288, or about 576, or about 864, or about 875, or
about 912, or about 923, or about 1000 or more amino acid residues.
In another aspect, the invention provides methods of making GLP2-XTEN compositions to
improve ease of manufacture, result in increased stability, increased water solubility, and/or ease of
formulation, as compared to the native GLP-2. In one embodiment, the invention includes a method of
increasing the water solubility of a GLP-2 comprising the step of linking the GLP-2 to one or more
XTEN such that a higher concentration in soluble form of the resulting GLP2-XTEN can be achieved,
under physiologic conditions, compared to the GLP-2 in an un-fused state. In some embodiments, the
method results in a GLP2-XTEN fusion protein wherein the water solubility is at least about 20% greater, or at
least about 30% greater, or at least about 50% greater, or at least about 75% greater, or at least about 90%
greater, or at least about 100% greater, or at least about 150% greater, or at least about 200% greater, or
at least about 400% greater, or at least about 600% greater, or at least about 800% greater, or at least
about 1000% greater, or at least about 2000% greater under physiologic conditions, compared to the
un-fused GLP-2. Factors that contribute to the property of XTEN to confer increased water solubility of
GLP-2 when incorporated into a fusion protein include the high solubility of the XTEN fusion partner
and the low degree of self-aggregation between molecules of XTEN in solution. In one embodiment of
the foregoing, the GLP2-XTEN comprises a GLP-2 linked to an XTEN having at least about 36, or about
48, or about 96, or about 144, or about 288, or about 576, or about 864 amino acid residues in which the
solubility of the fusion protein under physiologic conditions is at least three-fold greater than the
corresponding GLP-2 not linked to the XTEN, or alternatively, at least four-fold, or five-fold, or six-fold,
or seven-fold, or eight-fold, or nine-fold, or at least 10-fold, or at least 20-fold, or at least 30-fold, or at
least 50-fold, or at least 60-fold or greater than GLP-2 not linked to the XTEN. In one embodiment of
the foregoing, the GLP-2 has at least about 80%, or at least about 90%, or at least about 95%, or at least
about 97%, or at least about 99% sequence identity to a GLP-2 of Table 1 linked to at least an XTEN
with at least about 80%, or at least about 90%, or at least about 95%, or at least about 97%, or at least
about 99% sequence identity to a XTEN selected from any one of Tables 4, and 8-12.
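The solubility comparisons above are stated both as "percent greater" and as "fold greater". The Python sketch below shows how the two can be related, under the assumption (made here, not stated in the text) that "N-fold greater" means the fusion protein solubility is N times that of the un-fused GLP-2. Function names and the example concentrations are hypothetical.

```python
def percent_greater_to_fold(percent_greater):
    """Convert 'X% greater' into a ratio, e.g. 200% greater -> 3.0 times."""
    return 1.0 + percent_greater / 100.0


def fold_to_percent_greater(fold):
    """Convert a ratio into 'X% greater', e.g. 3.0 times -> 200% greater."""
    return (fold - 1.0) * 100.0


def solubility_fold(fusion_solubility, unfused_solubility):
    """Ratio of two solubility values measured in the same units."""
    if unfused_solubility <= 0:
        raise ValueError("un-fused solubility must be positive")
    return fusion_solubility / unfused_solubility


if __name__ == "__main__":
    # Hypothetical solubilities in mg/ml under the same conditions.
    fold = solubility_fold(30.0, 10.0)
    print(fold, fold_to_percent_greater(fold))
    print(percent_greater_to_fold(2000))
```

Under this reading, "at least about 200% greater" and "at least three-fold greater" describe the same ratio, and "about 2000% greater" corresponds to a 21-fold ratio.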
In another embodiment, the invention includes a method of increasing the shelf-life of a GLP-2
comprising the step of linking the GLP-2 with one or more XTEN selected such that the shelf-life of the
resulting GLP2-XTEN is extended compared to the GLP-2 in an un-fused state. As used herein,
shelf-life refers to the period of time over which the functional activity of a GLP-2 or GLP2-XTEN that is in
solution or in some other storage formulation remains stable without undue loss of activity. As used
herein, “functional activity” refers to a pharmacologic effect or biological activity, such as the ability to
bind a receptor or ligand, or substrate, or trigger an up-regulated activity, or to display one or more
known functional activities associated with a GLP-2, as known in the art. A GLP-2 that degrades or
aggregates generally has reduced functional activity or reduced bioavailability compared to one that
remains in solution. Factors that contribute to the ability of the method to extend the shelf life of GLP-2s
when incorporated into a fusion protein include increased water solubility, reduced self-aggregation in
solution, and increased heat stability of the XTEN fusion partner. In particular, the low tendency of
XTEN to aggregate facilitates methods of formulating pharmaceutical preparations containing higher
drug concentrations of GLP-2s, and the heat-stability of XTEN contributes to the property of
GLP2-XTEN fusion proteins to remain soluble and functionally active for extended periods. In one
embodiment, the method results in GLP2-XTEN fusion proteins with “prolonged” or “extended”
shelf-life that exhibit greater activity relative to a standard that has been subjected to the same storage and
handling conditions. The standard may be the un-fused full-length GLP-2. In one embodiment, the
method includes the step of formulating the isolated GLP2-XTEN with one or more pharmaceutically
acceptable excipients that enhance the ability of the XTEN to retain its unstructured conformation and for
the GLP2-XTEN to remain soluble in the formulation for a time that is greater than that of the
corresponding un-fused GLP-2. In one embodiment, the method comprises linking a GLP-2 to one or
more XTEN selected from Table 4 to create a GLP2-XTEN fusion protein that results in a solution that
retains greater than about 100% of the functional activity, or greater than about 105%, 110%, 120%,
130%, 150% or 200% of the functional activity of a standard when compared at a given time point and
when subjected to the same storage and handling conditions as the standard, thereby increasing its
shelf-life.
Shelf-life may also be assessed in terms of functional activity remaining after storage,
normalized to functional activity when storage began. GLP2-XTEN fusion proteins of the invention with
prolonged or extended shelf-life as exhibited by prolonged or extended functional activity retain about
50% more functional activity, or about 60%, 70%, 80%, or 90% more of the functional activity of the
equivalent GLP-2 not linked to the XTEN when subjected to the same conditions for the same period of
time. For example, a GLP2-XTEN fusion protein of the invention comprising GLP-2 fused to one or
more XTEN sequences selected from Table 4 retains about 80% or more of its original activity in
solution for periods of up to 2 weeks, or 4 weeks, or 6 weeks, or 12 weeks or longer under various
elevated temperature conditions. In some embodiments, the GLP2-XTEN retains at least about 50%, or
about 60%, or at least about 70%, or at least about 80%, and most preferably at least about 90% or more
of its original activity in solution when heated at 80°C for 10 min. In other embodiments, the GLP2-XTEN
retains at least about 50%, preferably at least about 60%, or at least about 70%, or at least about
80%, or alternatively at least about 90% or more of its original activity in solution when heated or
maintained at 37°C for about 7 days. In another embodiment, the GLP2-XTEN fusion protein retains at least
about 80% or more of its functional activity after exposure to a temperature of about 30°C to about 70°C
over a period of time of about one hour to about 18 hours. In the foregoing embodiments hereinabove
described in this paragraph, the retained activity of the GLP2-XTEN is at least about two-fold, or at least
about three-fold, or at least about four-fold, or at least about five-fold, or at least about six-fold greater at
a given time point than that of the corresponding GLP-2 not linked to the XTEN.
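The two measures used in this paragraph, namely activity remaining after storage normalized to activity when storage began, and the fold difference relative to the un-fused GLP-2, can be sketched as follows. This is a minimal illustration only: the function names and the numeric readings are hypothetical and are not data from the specification.

```python
def retained_fraction(activity_after: float, activity_initial: float) -> float:
    """Activity remaining after storage, normalized to activity when storage began."""
    if activity_initial <= 0:
        raise ValueError("initial activity must be positive")
    return activity_after / activity_initial


def fold_over_unfused(fusion_retained: float, unfused_retained: float) -> float:
    """Ratio of the fusion protein's retained activity to that of un-fused GLP-2."""
    if unfused_retained <= 0:
        raise ValueError("un-fused retained activity must be positive")
    return fusion_retained / unfused_retained


# Hypothetical assay readings in arbitrary units, for illustration only.
fusion = retained_fraction(activity_after=80.0, activity_initial=100.0)
unfused = retained_fraction(activity_after=20.0, activity_initial=100.0)
fold = fold_over_unfused(fusion, unfused)
print(f"fusion retains {fusion:.0%}, un-fused retains {unfused:.0%}, fold = {fold:.1f}")
```

Both samples would have to be measured at the same time point and under the same storage and handling conditions for the comparison to be meaningful.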
VII). THE NUCLEIC ACIDS SEQUENCES OF THE INVENTION
The present invention provides isolated polynucleic acids encoding GLP2-XTEN chimeric
fusion proteins and sequences complementary to polynucleic acid molecules encoding GLP2-XTEN
chimeric fusion proteins, including homologous variants thereof. In another aspect, the invention
encompasses methods to produce polynucleic acids encoding GLP2-XTEN chimeric fusion proteins and
sequences complementary to polynucleic acid molecules encoding GLP2-XTEN chimeric fusion proteins,
including homologous variants thereof. In general, and as illustrated in FIGS. 4-6, the methods of
producing a polynucleotide sequence coding for a GLP2-XTEN fusion protein and expressing the
resulting gene product include assembling nucleotides encoding GLP-2 and XTEN, ligating the
components in frame, incorporating the encoding gene into an expression vector appropriate for a host
cell, transforming the appropriate host cell with the expression vector, and culturing the host cell under
conditions causing or permitting the fusion protein to be expressed in the transformed host cell, thereby
producing the biologically-active GLP2-XTEN polypeptide, which is recovered as an isolated fusion
protein by standard protein purification methods known in the art. Standard recombinant techniques in
molecular biology are used to make the polynucleotides and expression vectors of the present invention.
In accordance with the invention, nucleic acid sequences that encode GLP2-XTEN (or its
complement) are used to generate recombinant DNA molecules that direct the expression of GLP2-XTEN
fusion proteins in appropriate host cells. Several cloning strategies are suitable for performing the
present invention, many of which are used to generate a construct that comprises a gene coding for a
fusion protein of the GLP2-XTEN composition of the present invention, or its complement. In some
embodiments, the cloning strategy is used to create a gene that encodes a monomeric GLP2-XTEN that
comprises at least a first GLP-2 and at least a first XTEN polypeptide, or their complement. In one
embodiment of the foregoing, the gene comprises a sequence encoding a GLP-2 or sequence variant. In
other embodiments, the cloning strategy is used to create a gene that encodes a monomeric GLP2-XTEN
that comprises nucleotides encoding at least a first molecule of GLP-2 or its complement and a first and
at least a second XTEN or their complement that is used to transform a host cell for expression of the
fusion protein of the GLP2-XTEN composition. In the foregoing embodiments hereinabove described in
this paragraph, the genes can further comprise nucleotides encoding spacer sequences that also encode
cleavage sequence(s).
In designing the desired XTEN sequences, it was discovered that the non-repetitive nature of the
XTEN of the inventive compositions is achieved despite use of a "building block" molecular approach in
the creation of the XTEN-encoding sequences. This was achieved by the use of a library of
polynucleotides encoding peptide sequence motifs, described above, that are then ligated and/or
multimerized to create the genes encoding the XTEN sequences (see FIGS. 4, 5, 8, 9 and Examples).
Thus, while the XTEN(s) of the expressed fusion protein may consist of multiple units of as few as four
different sequence motifs, because the motifs themselves consist of non-repetitive amino acid sequences,
the overall XTEN sequence is rendered non-repetitive. Accordingly, in one embodiment, the XTEN-encoding
polynucleotides comprise multiple polynucleotides that encode non-repetitive sequences, or
motifs, operably linked in frame and in which the resulting expressed XTEN amino acid sequences are
non-repetitive.
In one approach, a construct is first prepared containing the DNA sequence corresponding to
GLP2-XTEN fusion protein. In those embodiments in which a mammalian native GLP-2 sequence is to
be employed in the fusion protein, DNA encoding the GLP-2 of the compositions is obtained from a
cDNA library prepared using standard methods from tissue or isolated cells believed to possess GLP-2
mRNA and to express it at a detectable level. Libraries are screened with probes containing, for
example, about 20 to 100 bases designed to identify the GLP-2 gene of interest by hybridization using
conventional molecular biology techniques. The best candidates for probes are those that represent
sequences that are highly homologous for GLP-2, and should be of sufficient length and sufficiently
unambiguous that false positives are minimized, but may be degenerate at one or more positions. If
necessary, the coding sequence can be obtained using conventional primer extension procedures as
described in Sambrook, et al., supra, to detect precursors and processing intermediates of mRNA that
may not have been reverse-transcribed into cDNA. One can then use polymerase chain reaction (PCR)
methodology to amplify the target DNA or RNA coding sequence to obtain sufficient material for the
preparation of the GLP2-XTEN constructs containing the GLP-2 gene. Assays can then be conducted to
confirm that the hybridizing full-length genes are the desired GLP-2 gene(s). By these conventional
methods, DNA can be conveniently obtained from a cDNA library prepared from such sources. In those
embodiments in which a GLP-2 analog (with one or more amino acid substitutions, such as sequences of
Table 1) is used for the preparation of the GLP2-XTEN constructs, the GLP-2 encoding gene(s) is created by
standard synthetic procedures known in the art (e.g., automated nucleic acid synthesis using, for example,
one of the methods described in Engels et al. (Angew. Chem. Int. Ed. Engl., 28:716-734 1989)), using
DNA sequences obtained from publicly available databases, patents, or literature references. Such
procedures are well known in the art and well described in the scientific and patent literature. For
example, sequences can be obtained from Chemical Abstracts Services (CAS) Registry Numbers
(published by the American Chemical Society) and/or GenBank Accession Numbers (e.g., Locus ID,
NP_XXXXX, and XP_XXXXX) Model Protein identifiers available through the National Center for
Biotechnology Information (NCBI) webpage, available on the world wide web at ncbi.nlm.nih.gov that
correspond to entries in the CAS Registry or GenBank database that contain an amino acid sequence of
the protein of interest or of a fragment or variant of the protein. For such sequence identifiers provided
herein, the summary pages associated with each of these CAS and GenBank and GenSeq Accession
Numbers as well as the cited journal publications (e.g., PubMed ID number (PMID)) are each
incorporated by reference in their entireties, particularly with respect to the amino acid sequences
described therein. In one embodiment, the GLP-2 encoding gene encodes a protein from any one of
Table 1, or a fragment or variant thereof.
A gene or polynucleotide encoding the GLP-2 portion of the subject GLP2-XTEN protein, in
the case of an expressed fusion protein that comprises a single GLP-2, is then cloned into a construct,
which is a plasmid or other vector under the control of appropriate transcription and translation sequences
for high level protein expression in a biological system. In a later step, a second gene or polynucleotide
coding for the XTEN is genetically fused to the nucleotides encoding the N- and/or C-terminus of the
GLP-2 gene by cloning it into the construct adjacent and in frame with the gene(s) coding for the GLP-2.
This second step occurs through a ligation or multimerization step. In the foregoing embodiments
hereinabove described in this paragraph, it is to be understood that the gene constructs that are created
can alternatively be the complement of the respective genes that encode the respective fusion proteins.
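The requirement that the XTEN-encoding gene be placed adjacent to and in frame with the GLP-2 coding sequence can be checked computationally before cloning. The sketch below is illustrative only and is not the cloning method of the specification: the function names are hypothetical, the 12-nucleotide GLP-2 stand-in is a placeholder rather than a sequence from Table 1, and the XTEN fragment is the first 36 nucleotides of the AE144 entry of Table 7.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna: str) -> list:
    """Split a coding sequence into consecutive three-base codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def fuse_in_frame(upstream: str, downstream: str) -> str:
    """Join two coding segments, rejecting joins that would shift or end the frame."""
    upstream, downstream = upstream.upper(), downstream.upper()
    for name, seq in (("upstream", upstream), ("downstream", downstream)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} segment length is not a multiple of 3")
    if any(c in STOP_CODONS for c in codons(upstream)):
        raise ValueError("upstream segment contains an in-frame stop codon")
    return upstream + downstream


# Hypothetical placeholder standing in for a GLP-2 coding sequence (assumption).
GLP2_PLACEHOLDER = "CATGCTGATGGT"
# First 36 nucleotides of the AE144 sequence listed in Table 7.
XTEN_FRAGMENT = "GGTAGCGAACCGGCAACTTCCGGCTCTGAAACCCCA"

fusion_gene = fuse_in_frame(GLP2_PLACEHOLDER, XTEN_FRAGMENT)
print(len(fusion_gene) // 3, "codons in the fused open reading frame")
```

A real construct would also need to account for any spacer or cleavage sequences and the restriction sites used for ligation, which this sketch ignores.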
The gene encoding for the XTEN can be made in one or more steps, either fully synthetically or
by synthesis combined with enzymatic processes, such as restriction enzyme-mediated cloning, PCR and
overlap extension, including methods more fully described in the Examples. The methods disclosed
herein can be used, for example, to ligate short sequences of polynucleotides encoding XTEN into longer
XTEN genes of a desired length and sequence. In one embodiment, the method ligates two or more
optimized oligonucleotides encoding XTEN motif or segment sequences of about 9 to 14 amino
acids, or about 12 to 20 amino acids, or about 18 to 36 amino acids, or about 48 to about 144 amino
acids, or about 144 to about 288 or longer, or any combination of the foregoing ranges of motif or
segment lengths.
Alternatively, the disclosed method is used to multimerize XTEN-encoding sequences into
longer sequences of a desired length; e.g., a gene encoding 36 amino acids of XTEN can be dimerized
into a gene encoding 72 amino acids, then 144, then 288, etc. Even with multimerization, XTEN
polypeptides can be constructed such that the XTEN-encoding gene has low or virtually no repetitiveness
through design of the codons selected for the motifs of the shortest unit being used, which can reduce
recombination and increase stability of the encoding gene in the transformed host.
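The doubling progression described above (36, then 72, then 144, then 288 amino acids) can be tabulated with a short sketch. The function name is hypothetical, and the code models only the encoded lengths, not the ligation chemistry.

```python
from typing import List


def dimerization_series(start_len: int, target_len: int) -> List[int]:
    """Encoded lengths (in amino acids) reached by repeatedly dimerizing a gene."""
    if start_len <= 0:
        raise ValueError("start_len must be positive")
    lengths = [start_len]
    # Each round of dimerization doubles the encoded length.
    while lengths[-1] * 2 <= target_len:
        lengths.append(lengths[-1] * 2)
    return lengths


series = dimerization_series(36, 288)
rounds = len(series) - 1
print(series, "after", rounds, "rounds of dimerization")
```

Lengths that are not a power-of-two multiple of the starting segment, such as 864, would require ligating segments of different lengths rather than pure dimerization.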
Genes encoding XTEN with non-repetitive sequences are assembled from oligonucleotides
using standard techniques of gene synthesis. The gene design can be performed using algorithms that
optimize codon usage and amino acid composition. In one method of the invention, a library of relatively
short XTEN-encoding polynucleotide constructs is created and then assembled, as described above. The
resulting genes are then assembled with genes encoding GLP-2 or regions of GLP-2, as illustrated in
FIGS. 5 and 8, and the resulting genes used to transform a host cell and produce and recover the GLP2-XTEN
for evaluation of its properties, as described herein.
In some embodiments, the GLP2-XTEN sequence is designed for optimized expression by
inclusion of an N-terminal sequence (NTS) XTEN, rather than using a leader sequence known in the art.
In one embodiment, the NTS is created by inclusion of encoding nucleotides in the XTEN gene
determined to result in optimized expression when joined to the gene encoding the fusion protein. In one
embodiment, the N-terminal XTEN sequence of the expressed GLP2-XTEN is optimized for expression
in a eukaryotic cell, such as but not limited to CHO, HEK, yeast, and other cell types known in the art.
Polynucleotide libraries
In another aspect, the invention provides libraries of polynucleotides that encode XTEN
sequences that are used to assemble genes that encode XTEN of a desired length and sequence.
In certain embodiments, the XTEN-encoding library constructs comprise polynucleotides that
encode polypeptide segments of a fixed length. As an initial step, a library of oligonucleotides that
encode motifs of 9-14 amino acid residues can be assembled. In a preferred embodiment, libraries of
oligonucleotides that encode motifs of 12 amino acids are assembled.
The XTEN-encoding sequence segments can be dimerized or multimerized into longer
encoding sequences. Dimerization or multimerization can be performed by ligation, overlap extension,
PCR assembly or similar cloning techniques known in the art. This process can be repeated multiple
times until the resulting XTEN-encoding sequences have reached the organization of sequence and
desired length, providing the XTEN-encoding genes. As will be appreciated, a library of polynucleotides
that encodes, e.g., 12 amino acid motifs can be dimerized and/or ligated into a library of polynucleotides
that encode 36 amino acids. Libraries encoding motifs of different lengths; e.g., 9-14 amino acid motifs
leading to libraries encoding 27 to 42 amino acids are contemplated by the invention. In turn, the library
of polynucleotides that encode 27 to 42 amino acids, and preferably 36 amino acids (as described in the
Examples) can be serially dimerized into a library containing successively longer lengths of
polynucleotides that encode XTEN sequences of a desired length for incorporation into the gene
encoding the GLP2-XTEN fusion protein, as disclosed herein.
A more efficient way to optimize the DNA sequence encoding XTEN is based on
combinatorial libraries. The gene encoding XTEN can be designed and synthesized in segments such that
multiple codon versions are obtained for each segment. These segments can be randomly assembled into
a library of genes such that each library member encodes the same amino acid sequences but library
members comprise a large number of codon versions. Such libraries can be screened for genes that result
in high-level expression and/or a low abundance of truncation products. The process of combinatorial
gene assembly is illustrated in . The genes in are assembled from 6 base fragments and
each fragment is available in 4 different codon versions. This allows for a theoretical diversity of 4096.
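The stated diversity follows from multiplying the number of codon versions available at each fragment position: six fragments with four versions each give 4^6 = 4096 possible genes. A minimal sketch of that arithmetic, reading the passage as six fragments with four codon versions apiece (the function name is hypothetical):

```python
from itertools import product
from math import prod


def library_diversity(versions_per_fragment):
    """Number of distinct genes obtainable by picking one version per fragment."""
    return prod(versions_per_fragment)


versions = [4] * 6  # six fragments, four codon versions each
diversity = library_diversity(versions)

# Enumerating every combination of version indices gives the same count.
members = list(product(range(4), repeat=6))
print(diversity, len(members))
```

Every member encodes the same amino acid sequence; only the codon choices differ, which is what makes the library useful for screening expression level and truncation products.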
In some embodiments, libraries are assembled of polynucleotides that encode amino acids that
are limited to specific sequence XTEN families; e.g., AD, AE, AF, AG, AM, or AQ sequences of Table
3. In other embodiments, libraries comprise sequences that encode two or more of the motif family
sequences from Table 3. The names and sequences of representative, non-limiting polynucleotide
sequences of libraries that encode 36mers are presented in Tables 8-11, and the methods used to create
them are described more fully in the respective Examples. In other embodiments, libraries that encode
XTEN are constructed from segments of polynucleotide codons linked in a randomized sequence that
encode amino acids wherein at least about 80%, or at least about 90%, or at least about 91%, or at least
about 92%, or at least about 93%, or at least about 94%, or at least about 95%, or at least about 97%, or
at least about 98%, or at least about 99% of the codons are selected from the group consisting of codons
for glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) amino acids. The
libraries can be used, in turn, for serial dimerization or ligation to achieve polynucleotide sequence
libraries that encode XTEN sequences, for example, of 48, 72, 144, 288, 576, 864, 875, 912, 923, 1318
amino acids, or up to a total length of about 3000 amino acids, as well as intermediate lengths, in which
the encoded XTEN can have one or more of the properties disclosed herein, when expressed as a
component of a GLP2-XTEN fusion protein. In some cases, the polynucleotide library sequences may
also include additional bases used as "sequencing islands," described more fully below.
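The codon-composition criterion above (the percentage of codons that encode G, A, S, T, E or P) can be evaluated for any candidate XTEN-encoding segment. The sketch below is an illustration under stated assumptions: it uses the standard genetic code, the function names are hypothetical, and the test segment is the first 36 nucleotides of the AE144 entry of Table 7.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
XTEN_RESIDUES = set("GASTEP")


def translate(dna: str) -> str:
    """Translate an in-frame DNA segment into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) == 0 or len(dna) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def xten_codon_fraction(dna: str) -> float:
    """Fraction of codons that encode G, A, S, T, E or P."""
    protein = translate(dna)
    return sum(aa in XTEN_RESIDUES for aa in protein) / len(protein)


segment = "GGTAGCGAACCGGCAACTTCCGGCTCTGAAACCCCA"
peptide = translate(segment)
fraction = xten_codon_fraction(segment)
print(peptide, f"{fraction:.0%}")
```

A segment would satisfy, for example, the "at least about 90%" embodiment when the returned fraction is 0.90 or higher.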
is a schematic flowchart of representative, non-limiting steps in the assembly of an
XTEN polynucleotide construct and a GLP2-XTEN polynucleotide construct in the embodiments of the
invention. Individual oligonucleotides 501 are annealed into sequence motifs 502 such as a 12 amino
acid motif (“12-mer”), which is ligated to additional sequence motifs from a library to create a pool that
encompasses the desired length of the XTEN 504, as well as ligated to a smaller concentration of an oligo
containing BbsI and KpnI restriction sites 503. The resulting pool of ligation products is gel-purified
and the band with the desired length of XTEN is cut, resulting in an isolated XTEN gene with a stopper
sequence 505. The XTEN gene is cloned into a stuffer vector. In this case, the vector encodes an
optional CBD sequence 506 and a GFP gene 508. Digestion is then performed with BbsI/HindIII to
remove 507 and 508 and place the stop codon. The resulting product is then cloned into a BsaI/HindIII
digested vector containing a gene encoding the GLP-2, resulting in the gene 500 encoding a GLP2-XTEN
fusion protein. A non-exhaustive list of the polynucleotides encoding XTEN and precursor
sequences is provided in Tables 7-12.
Table 7: DNA sequences of XTEN and precursor sequences
XTEN Name    DNA Nucleotide Sequence
AE48 GAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTACCCCGGGTAGCGGTACTG
CTTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTGCTTCTCCG
GGCACCAGCTCTACCGGTTCT
AM48 ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTGCATCCCCGGGCACCAGCT
CTACCGGTTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGTAGCTCTACC
CCGTCTGGTGCTACTGGCTCT
AE144 GGTAGCGAACCGGCAACTTCCGGCTCTGAAACCCCAGGTACTTCTGAAAGCGCTACTCCTG
AGTCTGGCCCAGGTAGCGAACCTGCTACCTCTGGCTCTGAAACCCCAGGTAGCCCGGCAG
GCTCTCCGACTTCCACCGAGGAAGGTACCTCTACTGAACCTTCTGAGGGTAGCGCTCCAGG
TAGCGAACCGGCAACCTCTGGCTCTGAAACCCCAGGTAGCGAACCTGCTACCTCCGGCTCT
GAAACTCCAGGTAGCGAACCGGCTACTTCCGGTTCTGAAACTCCAGGTACCTCTACCGAAC
AAGGCAGCGCACCAGGTACTTCTGAAAGCGCAACCCCTGAATCCGGTCCAGGTA
GCGAACCGGCTACTTCTGGCTCTGAGACTCCAGGTACTTCTACCGAACCGTCCGAAGGTAG
CGCACCA
AF144 TCTACTCCGGAAAGCGGTTCCGCATCTCCAGGTACTTCTCCTAGCGGTGAATCTT
CTACTGCTCCAGGTACCTCTCCTAGCGGCGAATCTTCTACTGCTCCAGGTTCTACCAGCTCT
ACCGCTGAATCTCCTGGCCCAGGTTCTACCAGCGAATCCCCGTCTGGCACCGCACCAGGTT
CTACTAGCTCTACCGCAGAATCTCCGGGTCCAGGTACTTCCCCTAGCGGTGAATCTTCTAC
TGCTCCAGGTACCTCTACTCCGGAAAGCGGCTCCGCATCTCCAGGTTCTACTAGCTCTACT
GCTGAATCTCCTGGTCCAGGTACCTCCCCTAGCGGCGAATCTTCTACTGCTCCAGGTACCT
CTCCTAGCGGCGAATCTTCTACCGCTCCAGGTACCTCCCCTAGCGGTGAATCTTCTACCGC
ACCA
AE288 GGTACCTCTGAAAGCGCAACTCCTGAGTCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCT
CTGAGACTCCAGGTACCTCTGAAAGCGCAACCCCGGAATCTGGTCCAGGTAGCGAACCTG
CAACCTCTGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGG
TACTTCTACTGAACCGTCCGAGGGCAGCGCACCAGGTAGCCCTGCTGGCTCTCCAACCTCC
ACCGAAGAAGGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCAGGTAGCGAACCGGCA
ACCTCCGGTTCTGAAACCCCAGGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTA
GCCCGGCTGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTAC
TGAAGAAGGTACTTCTACCGAACCTTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGC
TACCCCTGAGTCCGGCCCAGGTACTTCTGAAAGCGCTACTCCTGAATCCGGTCCAGGTACT
TCTGAAAGCGCTACCCCGGAATCTGGCCCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAA
CCCCAGGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCCAGGTAGCCCAGCAGGCTCTCC
GACTTCCACTGAGGAAGGTACTTCTACTGAACCTTCCGAAGGCAGCGCACCAGGTACCTCT
ACTGAACCTTCTGAGGGCAGCGCTCCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCC
CAGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGA
GGGCAGCGCACCA
AE576 GGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAAGGTACTTCTGAAAGCGCTACTCCTG
AGTCTGGTCCAGGTACCTCTACTGAACCGTCCGAAGGTAGCGCTCCAGGTAGCCCAGCAG
GCTCTCCGACTTCCACTGAGGAAGGTACTTCTACTGAACCTTCCGAAGGCAGCGCACCAGG
TACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGAA
CCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAACCCCAGGTAGCGAACCGGCTA
CCTCCGGTTCTGAAACTCCAGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGTAC
TTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACCGTCTGAGGGCAG
CGCACCAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTC
TCCTACCTCCACCGAGGAAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAC
CTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGAGTCC
GGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTGAAAGCGCA
ACCCCTGAATCCGGTCCAGGTAGCGAACCGGCTACTTCTGGCTCTGAGACTCCAGGTACTT
CTACCGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTACTGAACCGTCTGAAGGTAGCG
CACCAGGTACTTCTGAAAGCGCAACCCCGGAATCCGGCCCAGGTACCTCTGAAAGCGCAA
CCCCGGAGTCCGGCCCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGAAGGTACCTC
TGAAAGCGCAACCCCTGAATCCGGCCCAGGTAGCGAACCGGCAACCTCCGGTTCTGAAAC
CCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCCCAGGTACCTCTACTGAACCGTCT
GAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTA
CCGAACCGTCCGAAGGCAGCGCTCCAGGTACCTCTACTGAACCTTCCGAGGGCAGCGCTC
CAGGTACCTCTACCGAACCTTCTGAAGGTAGCGCACCAGGTACTTCTACCGAACCGTCCGA
GGGTAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGAAGGTACTTCTACC
GAACCGTCCGAGGGTAGCGCACCAGGTACCTCTGAAAGCGCAACTCCTGAGTCTGGCCCA
GGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCAGGTACCTCTGAAAGCGCAACCCCG
GAATCTGGTCCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCAGGTACCTCTGAAA
CTCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGAGGGCAGCGCACCAGG
TACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGGCTGGCTCTCCGACTTCC
ACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGAAGGTAGCCCGGCAGGC
ACCTCTACTGAGGAAGGTACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGT
ACCTCTACCGAACCGTCTGAGGGCAGCGCACCA
AF576 GGTTCTACTAGCTCTACCGCTGAATCTCCTGGCCCAGGTTCCACTAGCTCTACCGCAGAAT
CTCCGGGCCCAGGTTCTACTAGCGAATCCCCTTCTGGTACCGCTCCAGGTTCTACTAGCTCT
ACCGCTGAATCTCCGGGTCCAGGTTCTACCAGCTCTACTGCAGAATCTCCTGGCCCAGGTA
CTTCTACTCCGGAAAGCGGTTCCGCTTCTCCAGGTTCTACCAGCGAATCTCCTTCTGGCACC
GGTACCTCTCCTAGCGGCGAATCTTCTACCGCTCCAGGTTCTACTAGCGAATCTC
CTTCTGGCACTGCACCAGGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAGGTACCTCT
CCTAGCGGCGAATCTTCTACCGCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCAC
CAGGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAGGTACCTCTCCTAGCGGCGAATC
TTCTACCGCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTTCTACTAGCG
AATCTCCTTCTGGCACTGCACCAGGTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAGG
TACCTCTACCCCTGAAAGCGGTTCCGCTTCTCCAGGTTCTACTAGCGAATCTCCTTCTGGTA
CCGCTCCAGGTACTTCTACCCCTGAAAGCGGCTCCGCTTCTCCAGGTTCCACTAGCTCTACC
GCTGAATCTCCGGGTCCAGGTTCTACTAGCTCTACTGCAGAATCTCCTGGCCCAGGTACCT
CTACTCCGGAAAGCGGCTCTGCATCTCCAGGTACTTCTACCCCTGAAAGCGGTTCTGCATC
TCCAGGTTCTACTAGCGAATCCCCGTCTGGTACCGCACCAGGTACTTCTACCCCGGAAAGC
GGCTCTGCTTCTCCAGGTACTTCTACCCCGGAAAGCGGCTCCGCATCTCCAGGTTCTACTA
GCGAATCTCCTTCTGGTACCGCTCCAGGTTCTACCAGCGAATCCCCGTCTGGTACTGCTCC
AGGTTCTACCAGCGAATCTCCTTCTGGTACTGCACCAGGTTCTACTAGCTCTACTGCAGAA
TCTCCTGGCCCAGGTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAGGTACTTCTACCC
CTGAAAGCGGTTCTGCATCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAGG
TTCTACCAGCGAATCTCCGTCTGGCACTGCACCAGGTACCTCTACCCCTGAAAGCGGTTCC
GCTTCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTTCTACCAGCGAAT
CTCCGTCTGGCACTGCACCAGGTACCTCTACCCCTGAAAGCGGTTCCGCTTCTCCAGGTAC
TTCTCCGAGCGGTGAATCTTCTACCGCACCAGGTTCTACTAGCTCTACCGCTGAATCTCCG
GGCCCAGGTACTTCTCCGAGCGGTGAATCTTCTACTGCTCCAGGTTCCACTAGCTCTACTG
CTGAATCTCCTGGCCCAGGTACTTCTACTCCGGAAAGCGGTTCCGCTTCTCCAGGTTCTACT
AGCGAATCTCCGTCTGGCACCGCACCAGGTTCTACTAGCTCTACTGCAGAATCTCCTGGCC
CAGGTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAGGTACTTCTACCCCTGAAAGCGG
TTCTGCATCTCCA
AE624 ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTACCCCGGGTAGCGGTACTG
CTTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTGCTTCTCCG
GGCACCAGCTCTACCGGTTCTCCAGGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAAG
GTACTTCTGAAAGCGCTACTCCTGAGTCTGGTCCAGGTACCTCTACTGAACCGTCCGAAGG
TAGCGCTCCAGGTAGCCCAGCAGGCTCTCCGACTTCCACTGAGGAAGGTACTTCTACTGAA
CCTTCCGAAGGCAGCGCACCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTA
CTTCTGAAAGCGCTACCCCGGAATCTGGCCCAGGTAGCGAACCGGCTACTTCTGGTTCTGA
AACCCCAGGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCCAGGTAGCCCGGCAGGCTC
TCCGACCTCTACTGAGGAAGGTACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTAC
CTCTACCGAACCGTCTGAGGGCAGCGCACCAGGTACTTCTACCGAACCGTCCGAGGGTAG
AGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGAAGGTACTTCTACCGAACCG
TCCGAGGGTAGCGCACCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTT
CTGAAAGCGCTACCCCGGAGTCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCG
GTACTTCTGAAAGCGCAACCCCTGAATCCGGTCCAGGTAGCGAACCGGCTACTTC
TGGCTCTGAGACTCCAGGTACTTCTACCGAACCGTCCGAAGGTAGCGCACCAGGTACTTCT
ACTGAACCGTCTGAAGGTAGCGCACCAGGTACTTCTGAAAGCGCAACCCCGGAATCCGGC
CCAGGTACCTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTAGCCCTGCTGGCTCTCCA
ACCGAAGAAGGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCAGGTAGCGAA
CCGGCAACCTCCGGTTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCC
CCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTCCGA
AGGTAGCGCACCAGGTACTTCTACCGAACCGTCCGAAGGCAGCGCTCCAGGTACCTCTACT
GAACCTTCCGAGGGCAGCGCTCCAGGTACCTCTACCGAACCTTCTGAAGGTAGCGCACCA
GGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCT
CCACCGAGGAAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTACCTCTGAAA
GCGCAACTCCTGAGTCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCAGG
TGAAAGCGCAACCCCGGAATCTGGTCCAGGTAGCGAACCTGCAACCTCTGGCTCT
GAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAAC
CGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTA
GCCCGGCTGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTAC
TGAAGAAGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGTACTTCTGAAAGCGC
AACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACCGTCTGAGGGCAGCGCACCA
AM875 GGTACTTCTACTGAACCGTCTGAAGGCAGCGCACCAGGTAGCGAACCGGCTACTTCCGGTT
CTGAAACCCCAGGTAGCCCAGCAGGTTCTCCAACTTCTACTGAAGAAGGTTCTACCAGCTC
TACCGCAGAATCTCCTGGTCCAGGTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAGGT
TCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTTCTACTAGCGAATCCCCGTCTGGTA
CTGCTCCAGGTACTTCTACTCCTGAAAGCGGTTCCGCTTCTCCAGGTACCTCTACTCCGGAA
AGCGGTTCTGCATCTCCAGGTAGCGAACCGGCAACCTCCGGCTCTGAAACCCCAGGTACCT
CTGAAAGCGCTACTCCTGAATCCGGCCCAGGTAGCCCGGCAGGTTCTCCGACTTCCACTGA
GGAAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACC
CCGGAGTCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTA
CCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGG
AAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTACTTCTACCGAACCTTCCGA
GGGCAGCGCACCAGGTACTTCTGAAAGCGCTACCCCTGAGTCCGGCCCAGGTACTTCTGA
AAGCGCTACTCCTGAATCCGGTCCAGGTACCTCTACTGAACCTTCCGAAGGCAGCGCTCCA
TCTACCGAACCGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCAACCCCT
GAATCCGGTCCAGGTACTTCTACTGAACCTTCCGAAGGTAGCGCTCCAGGTAGCGAACCTG
CTACTTCTGGTTCTGAAACCCCAGGTAGCCCGGCTGGCTCTCCGACCTCCACCGAGGAAGG
TACCCCGTCTGGTGCTACTGGTTCTCCAGGTACTCCGGGCAGCGGTACTGCTTCTT
CCTCTCCAGGTAGCTCTACCCCTTCTGGTGCTACTGGCTCTCCAGGTACCTCTACCGAACCG
TCCGAGGGTAGCGCACCAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTAGC
GAACCGGCAACCTCCGGTTCTGAAACTCCAGGTAGCCCTGCTGGCTCTCCGACTTCTACTG
AGGAAGGTAGCCCGGCTGGTTCTCCGACTTCTACTGAGGAAGGTACTTCTACCGAACCTTC
CGAAGGTAGCGCTCCAGGTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAGGTACTTCTGA
AAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGGCTGGCTCTCCGACTTCCACCGAGGAA
GGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGAAGGTTCTACCAGCTCTACCGCTGAAT
CTCCTGGCCCAGGTTCTACTAGCGAATCTCCGTCTGGCACCGCACCAGGTACTTCCCCTAG
CGGTGAATCTTCTACTGCACCAGGTACCCCTGGCAGCGGTACCGCTTCTTCCTCTCCAGGT
AGCTCTACCCCGTCTGGTGCTACTGGCTCTCCAGGTTCTAGCCCGTCTGCATCTACCGGTAC
CGGCCCAGGTAGCGAACCGGCAACCTCCGGCTCTGAAACTCCAGGTACTTCTGAAAGCGC
TACTCCGGAATCCGGCCCAGGTAGCGAACCGGCTACTTCCGGCTCTGAAACCCCAGGTTCC
ACCAGCTCTACTGCAGAATCTCCGGGCCCAGGTTCTACTAGCTCTACTGCAGAATCTCCGG
GTCCAGGTACTTCTCCTAGCGGCGAATCTTCTACCGCTCCAGGTAGCGAACCGGCAACCTC
TGGCTCTGAAACTCCAGGTAGCGAACCTGCAACCTCCGGCTCTGAAACCCCAGGTACTTCT
CCTTCTGAGGGCAGCGCACCAGGTTCTACCAGCTCTACCGCAGAATCTCCTGGTC
CAGGTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAGGTTCTACTAGCGAATCTCCTTC
TGGCACTGCACCAGGTACTTCTACCGAACCGTCCGAAGGCAGCGCTCCAGGTACCTCTACT
GAACCTTCCGAGGGCAGCGCTCCAGGTACCTCTACCGAACCTTCTGAAGGTAGCGCACCA
GGTAGCTCTACTCCGTCTGGTGCAACCGGCTCCCCAGGTTCTAGCCCGTCTGCTTCCACTG
GTACTGGCCCAGGTGCTTCCCCGGGCACCAGCTCTACTGGTTCTCCAGGTAGCGAACCTGC
TACCTCCGGTTCTGAAACCCCAGGTACCTCTGAAAGCGCAACTCCGGAGTCTGGTCCAGGT
AGCCCTGCAGGTTCTCCTACCTCCACTGAGGAAGGTAGCTCTACTCCGTCTGGTGCAACCG
GCTCCCCAGGTTCTAGCCCGTCTGCTTCCACTGGTACTGGCCCAGGTGCTTCCCCGGGCAC
CAGCTCTACTGGTTCTCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCCCAGGTACC
TCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCG
CACCA
AE864 GGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAAGGTACTTCTGAAAGCGCTACTCCTG
AGTCTGGTCCAGGTACCTCTACTGAACCGTCCGAAGGTAGCGCTCCAGGTAGCCCAGCAG
GCTCTCCGACTTCCACTGAGGAAGGTACTTCTACTGAACCTTCCGAAGGCAGCGCACCAGG
TACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGAA
TCTGGCCCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAACCCCAGGTAGCGAACCGGCTA
CCTCCGGTTCTGAAACTCCAGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGTAC
TTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACCGTCTGAGGGCAG
CGCACCAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTC
TCCTACCTCCACCGAGGAAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAC
CTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGAGTCC
GGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTGAAAGCGCA
ACCCCTGAATCCGGTCCAGGTAGCGAACCGGCTACTTCTGGCTCTGAGACTCCAGGTACTT
CTACCGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTACTGAACCGTCTGAAGGTAGCG
CACCAGGTACTTCTGAAAGCGCAACCCCGGAATCCGGCCCAGGTACCTCTGAAAGCGCAA
CCCCGGAGTCCGGCCCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGAAGGTACCTC
TGAAAGCGCAACCCCTGAATCCGGCCCAGGTAGCGAACCGGCAACCTCCGGTTCTGAAAC
CCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCCCAGGTACCTCTACTGAACCGTCT
AGCGCTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTA
CCGAACCGTCCGAAGGCAGCGCTCCAGGTACCTCTACTGAACCTTCCGAGGGCAGCGCTC
CCTCTACCGAACCTTCTGAAGGTAGCGCACCAGGTACTTCTACCGAACCGTCCGA
CGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGAAGGTACTTCTACC
GAACCGTCCGAGGGTAGCGCACCAGGTACCTCTGAAAGCGCAACTCCTGAGTCTGGCCCA
GGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCAGGTACCTCTGAAAGCGCAACCCCG
GAATCTGGTCCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCAGGTACCTCTGAAA
GCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGAGGGCAGCGCACCAGG
TACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGGCTGGCTCTCCGACTTCC
ACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGAAGGTAGCCCGGCAGGC
TCTCCGACCTCTACTGAGGAAGGTACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGT
ACCTCTACCGAACCGTCTGAGGGCAGCGCACCAGGTACCTCTGAAAGCGCAACTCCTGAG
TCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCAGGTACCTCTGAAAGCG
CAACCCCGGAATCTGGTCCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCAGGTA
CCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGAGGGCAG
CGCACCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGAAGGTACCTCTGAAAGCGC
AACCCCTGAATCCGGCCCAGGTAGCGAACCGGCAACCTCCGGTTCTGAAACCCCAGGTAC
TTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGGCTGGCTCTCCGACTTCCACC
GAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGAAGGTACTTCTACCGAACCTT
GCAGCGCACCAGGTACTTCTGAAAGCGCTACCCCTGAGTCCGGCCCAGGTACTT
CTGAAAGCGCTACTCCTGAATCCGGTCCAGGTACTTCTGAAAGCGCTACCCCGGAATCTGG
CCCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAACCCCAGGTAGCGAACCGGCTACCTCC
GGTTCTGAAACTCCAGGTAGCCCAGCAGGCTCTCCGACTTCCACTGAGGAAGGTACTTCTA
CTGAACCTTCCGAAGGCAGCGCACCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCC
AGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCT
GGCCCAGGTACTTCTACTGAACCGTCCGAGGGCAGCGCACCA
AF864 GGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAGGTACCTCTCCTAGCGGCGAATCTT
CTACCGCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTTCTACTAGCGA
ATCCCCGTCTGGTACTGCTCCAGGTACTTCTACTCCTGAAAGCGGTTCCGCTTCTCCAGGTA
CTCCGGAAAGCGGTTCTGCATCTCCAGGTTCTACCAGCGAATCTCCTTCTGGCAC
CGCTCCAGGTTCTACTAGCGAATCCCCGTCTGGTACCGCACCAGGTACTTCTCCTAGCGGC
GAATCTTCTACCGCACCAGGTTCTACTAGCGAATCTCCGTCTGGCACTGCTCCAGGTACTT
CTCCTAGCGGTGAATCTTCTACCGCTCCAGGTACTTCCCCTAGCGGCGAATCTTCTACCGCT
CCAGGTTCTACTAGCTCTACTGCAGAATCTCCGGGCCCAGGTACCTCTCCTAGCGGTGAAT
CTTCTACCGCTCCAGGTACTTCTCCGAGCGGTGAATCTTCTACCGCTCCAGGTTCTACTAGC
TCTACTGCAGAATCTCCTGGCCCAGGTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAG
GTACTTCTACCCCTGAAAGCGGTTCTGCATCTCCAGGTTCTACTAGCGAATCTCCTTCTGGC
ACTGCACCAGGTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAGGTACCTCTACCCCTG
AAAGCGGTTCCGCTTCTCCAGGTTCTACCAGCTCTACCGCAGAATCTCCTGGTCCAGGTAC
CTCTACTCCGGAAAGCGGCTCTGCATCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACT
GCACCAGGTACTTCTCCGAGCGGTGAATCTTCTACCGCACCAGGTTCTACTAGCTCTACCG
CTGAATCTCCGGGCCCAGGTACTTCTCCGAGCGGTGAATCTTCTACTGCTCCAGGTACCTC
XTEN
DNA Nucleotide Sequence
Name
TACTCCTGAAAGCGGTTCTGCATCTCCAGGTTCCACTAGCTCTACCGCAGAATCTCCGGGC
CCAGGTTCTACTAGCTCTACTGCTGAATCTCCTGGCCCAGGTTCTACTAGCTCTACTGCTGA
ATCTCCGGGTCCAGGTTCTACCAGCTCTACTGCTGAATCTCCTGGTCCAGGTACCTCCCCG
AGCGGTGAATCTTCTACTGCACCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAG
GTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAGGTACCTCTACCCCTGAAAGCGGTCC
XXXXXXXXXXXXTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAXXXXXXXXTAGCGAAT
CTCCTTCTGGTACCGCTCCAGGTTCTACCAGCGAATCCCCGTCTGGTACTGCTCCAGGTTCT
ACCAGCGAATCTCCTTCTGGTACTGCACCAGGTTCTACTAGCGAATCTCCTTCTGGTACCG
CTCCAGGTTCTACCAGCGAATCCCCGTCTGGTACTGCTCCAGGTTCTACCAGCGAATCTCC
TTCTGGTACTGCACCAGGTACTTCTACTCCGGAAAGCGGTTCCGCATCTCCAGGTACTTCTC
CTAGCGGTGAATCTTCTACTGCTCCAGGTACCTCTCCTAGCGGCGAATCTTCTACTGCTCCA
GGTTCTACCAGCTCTACTGCTGAATCTCCGGGTCCAGGTACTTCCCCGAGCGGTGAATCTT
CTACTGCACCAGGTACTTCTACTCCGGAAAGCGGTTCCGCTTCTCCAGGTTCTACCAGCGA
ATCTCCTTCTGGCACCGCTCCAGGTTCTACTAGCGAATCCCCGTCTGGTACCGCACCAGGT
ACTTCTCCTAGCGGCGAATCTTCTACCGCACCAGGTTCTACTAGCGAATCCCCGTCTGGTA
CCGCACCAGGTACTTCTACCCCGGAAAGCGGCTCTGCTTCTCCAGGTACTTCTACCCCGGA
CTCCGCATCTCCAGGTTCTACTAGCGAATCTCCTTCTGGTACCGCTCCAGGTACT
TCTACCCCTGAAAGCGGCTCCGCTTCTCCAGGTTCCACTAGCTCTACCGCTGAATCTCCGG
GTCCAGGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAGGTTCTACTAGCGAATCCCC
GTCTGGTACCGCACCAGGTACTTCTCCTAGCGGCGAATCTTCTACCGCACCAGGTTCTACC
AGCTCTACTGCTGAATCTCCGGGTCCAGGTACTTCCCCGAGCGGTGAATCTTCTACTGCAC
CTTCTACTCCGGAAAGCGGTTCCGCTTCTCCAGGTACCTCCCCTAGCGGCGAATC
TGCTCCAGGTACCTCTCCTAGCGGCGAATCTTCTACCGCTCCAGGTACCTCCCCTA
GCGGTGAATCTTCTACCGCACCAGGTTCTACTAGCTCTACTGCTGAATCTCCGGGTCCAGG
TTCTACCAGCTCTACTGCTGAATCTCCTGGTCCAGGTACCTCCCCGAGCGGTGAATCTTCTA
CTGCACCAGGTTCTAGCCCTTCTGCTTCCACCGGTACCGGCCCAGGTAGCTCTACTCCGTCT
GGTGCAACTGGCTCTCCAGGTAGCTCTACTCCGTCTGGTGCAACCGGCTCCCCA
XXXX was inserted in two areas where no sequence information is available.
AG864 GGTGCTTCCCCGGGCACCAGCTCTACTGGTTCTCCAGGTTCTAGCCCGTCTGCTTCTACTGG
TACTGGTCCAGGTTCTAGCCCTTCTGCTTCCACTGGTACTGGTCCAGGTACCCCGGGTAGC
GGTACCGCTTCTTCTTCTCCAGGTAGCTCTACTCCGTCTGGTGCTACCGGCTCTCCAGGTTC
TAACCCTTCTGCATCCACCGGTACCGGCCCAGGTGCTTCTCCGGGCACCAGCTCTACTGGT
TCTCCAGGTACCCCGGGCAGCGGTACCGCATCTTCTTCTCCAGGTAGCTCTACTCCTTCTGG
TGCAACTGGTTCTCCAGGTACTCCTGGCAGCGGTACCGCTTCTTCTTCTCCAGGTGCTTCTC
CTGGTACTAGCTCTACTGGTTCTCCAGGTGCTTCTCCGGGCACTAGCTCTACTGGTTCTCCA
GGTACCCCGGGTAGCGGTACTGCTTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAAC
TCCAGGTGCTTCTCCGGGCACCAGCTCTACCGGTTCTCCAGGTACCCCGGGTAGC
GGTACCGCTTCTTCTTCTCCAGGTAGCTCTACTCCGTCTGGTGCTACCGGCTCTCCAGGTTC
TAACCCTTCTGCATCCACCGGTACCGGCCCAGGTTCTAGCCCTTCTGCTTCCACCGGTACTG
GCCCAGGTAGCTCTACCCCTTCTGGTGCTACCGGCTCCCCAGGTAGCTCTACTCCTTCTGGT
GGCTCTCCAGGTGCATCTCCGGGCACTAGCTCTACTGGTTCTCCAGGTGCATCCC
CTGGCACTAGCTCTACTGGTTCTCCAGGTGCTTCTCCTGGTACCAGCTCTACTGGTTCTCCA
GGTACTCCTGGCAGCGGTACCGCTTCTTCTTCTCCAGGTGCTTCTCCTGGTACTAGCTCTAC
TGGTTCTCCAGGTGCTTCTCCGGGCACTAGCTCTACTGGTTCTCCAGGTGCTTCCCCGGGCA
CTAGCTCTACCGGTTCTCCAGGTTCTAGCCCTTCTGCATCTACTGGTACTGGCCCAGGTACT
CCGGGCAGCGGTACTGCTTCTTCCTCTCCAGGTGCATCTCCGGGCACTAGCTCTACTGGTTC
TCCAGGTGCATCCCCTGGCACTAGCTCTACTGGTTCTCCAGGTGCTTCTCCTGGTACCAGCT
CTACTGGTTCTCCAGGTAGCTCTACTCCGTCTGGTGCAACCGGTTCCCCAGGTAGCTCTACT
CCTTCTGGTGCTACTGGCTCCCCAGGTGCATCCCCTGGCACCAGCTCTACCGGTTCTCCAG
GTACCCCGGGCAGCGGTACCGCATCTTCCTCTCCAGGTAGCTCTACCCCGTCTGGTGCTAC
CGGTTCCCCAGGTAGCTCTACCCCGTCTGGTGCAACCGGCTCCCCAGGTAGCTCTACTCCG
TCTGGTGCAACCGGCTCCCCAGGTTCTAGCCCGTCTGCTTCCACTGGTACTGGCCCAGGTG
CTTCCCCGGGCACCAGCTCTACTGGTTCTCCAGGTGCATCCCCGGGTACCAGCTCTACCGG
TTCTCCAGGTACTCCTGGCAGCGGTACTGCATCTTCCTCTCCAGGTGCTTCTCCGGGCACCA
GCTCTACTGGTTCTCCAGGTGCATCTCCGGGCACTAGCTCTACTGGTTCTCCAGGTGCATCC
CCTGGCACTAGCTCTACTGGTTCTCCAGGTGCTTCTCCTGGTACCAGCTCTACTGGTTCTCC
AGGTACCCCTGGTAGCGGTACTGCTTCTTCCTCTCCAGGTAGCTCTACTCCGTCTGGTGCTA
CCGGTTCTCCAGGTACCCCGGGTAGCGGTACCGCATCTTCTTCTCCAGGTAGCTCTACCCC
TGCTACTGGTTCTCCAGGTACTCCGGGCAGCGGTACTGCTTCTTCCTCTCCAGGTA
CCCCTTCTGGTGCTACTGGCTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACTGGC
TCCCCAGGTTCTAGCCCTTCTGCATCCACCGGTACCGGTCCAGGTTCTAGCCCGTCTGCATC
TACTGGTACTGGTCCAGGTGCATCCCCGGGCACTAGCTCTACCGGTTCTCCAGGTACTCCT
GGTAGCGGTACTGCTTCTTCTTCTCCAGGTAGCTCTACTCCTTCTGGTGCTACTGGTTCTCC
AGGTTCTAGCCCTTCTGCATCCACCGGTACCGGCCCAGGTTCTAGCCCGTCTGCTTCTACCG
GTACTGGTCCAGGTGCTTCTCCGGGTACTAGCTCTACTGGTTCTCCAGGTGCATCTCCTGGT
ACTAGCTCTACTGGTTCTCCAGGTAGCTCTACTCCGTCTGGTGCAACCGGCTCTCCAGGTTC
TAGCCCTTCTGCATCTACCGGTACTGGTCCAGGTGCATCCCCTGGTACCAGCTCTACCGGTT
CTCCAGGTTCTAGCCCTTCTGCTTCTACCGGTACCGGTCCAGGTACCCCTGGCAGCGGTAC
CGCATCTTCCTCTCCAGGTAGCTCTACTCCGTCTGGTGCAACCGGTTCCCCAGGTAGCTCTA
CTCCTTCTGGTGCTACTGGCTCCCCAGGTGCATCCCCTGGCACCAGCTCTACCGGTTCTCCA
1434923 ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTGCATCCCCGGGCACCAGCT
CTACCGGTTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGTAGCTCTACC
CCGTCTGGTGCTACTGGCTCTCCAGGTACTTCTACTGAACCGTCTGAAGGCAGCGCACCAG
GTAGCGAACCGGCTACTTCCGGTTCTGAAACCCCAGGTAGCCCAGCAGGTTCTCCAACTTC
TACTGAAGAAGGTTCTACCAGCTCTACCGCAGAATCTCCTGGTCCAGGTACCTCTACTCCG
GAAAGCGGCTCTGCATCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTT
CTACTAGCGAATCCCCGTCTGGTACTGCTCCAGGTACTTCTACTCCTGAAAGCGGTTCCGC
TTCTCCAGGTACCTCTACTCCGGAAAGCGGTTCTGCATCTCCAGGTAGCGAACCGGCAACC
TCCGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAATCCGGCCCAGGTAGCC
CGGCAGGTTCTCCGACTTCCACTGAGGAAGGTACCTCTACTGAACCTTCTGAGGGCAGCGC
TCCAGGTACTTCTGAAAGCGCTACCCCGGAGTCCGGTCCAGGTACTTCTACTGAACCGTCC
GAAGGTAGCGCACCAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCA
GCAGGTTCTCCTACCTCCACCGAGGAAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCAC
CAGGTACTTCTACCGAACCTTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACCCC
TGAGTCCGGCCCAGGTACTTCTGAAAGCGCTACTCCTGAATCCGGTCCAGGTACCTCTACT
GAACCTTCCGAAGGCAGCGCTCCAGGTACCTCTACCGAACCGTCCGAGGGCAGCGCACCA
GGTACTTCTGAAAGCGCAACCCCTGAATCCGGTCCAGGTACTTCTACTGAACCTTCCGAAG
GTAGCGCTCCAGGTAGCGAACCTGCTACTTCTGGTTCTGAAACCCCAGGTAGCCCGGCTGG
CTCTCCGACCTCCACCGAGGAAGGTAGCTCTACCCCGTCTGGTGCTACTGGTTCTCCAGGT
ACTCCGGGCAGCGGTACTGCTTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCTACTGG
CTCTCCAGGTACCTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTACCTCTACTGAACCG
TCTGAGGGTAGCGCTCCAGGTAGCGAACCGGCAACCTCCGGTTCTGAAACTCCAGGTAGC
CCTGCTGGCTCTCCGACTTCTACTGAGGAAGGTAGCCCGGCTGGTTCTCCGACTTCTACTG
AGGAAGGTACTTCTACCGAACCTTCCGAAGGTAGCGCTCCAGGTGCAAGCGCAAGCGGCG
CGCCAAGCACGGGAGGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGG
CTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGA
AGGTTCTACCAGCTCTACCGCTGAATCTCCTGGCCCAGGTTCTACTAGCGAATCTCCGTCT
GGCACCGCACCAGGTACTTCCCCTAGCGGTGAATCTTCTACTGCACCAGGTACCCCTGGCA
GCGGTACCGCTTCTTCCTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACTGGCTCTCCAGGT
TCTAGCCCGTCTGCATCTACCGGTACCGGCCCAGGTAGCGAACCGGCAACCTCCGGCTCTG
AAACTCCAGGTACTTCTGAAAGCGCTACTCCGGAATCCGGCCCAGGTAGCGAACCGGCTA
CTTCCGGCTCTGAAACCCCAGGTTCCACCAGCTCTACTGCAGAATCTCCGGGCCCAGGTTC
CTCTACTGCAGAATCTCCGGGTCCAGGTACTTCTCCTAGCGGCGAATCTTCTACC
GCTCCAGGTAGCGAACCGGCAACCTCTGGCTCTGAAACTCCAGGTAGCGAACCTGCAACC
TCCGGCTCTGAAACCCCAGGTACTTCTACTGAACCTTCTGAGGGCAGCGCACCAGGTTCTA
CCAGCTCTACCGCAGAATCTCCTGGTCCAGGTACCTCTACTCCGGAAAGCGGCTCTGCATC
TCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTACTTCTACCGAACCGTCC
GAAGGCAGCGCTCCAGGTACCTCTACTGAACCTTCCGAGGGCAGCGCTCCAGGTACCTCTA
CCGAACCTTCTGAAGGTAGCGCACCAGGTAGCTCTACTCCGTCTGGTGCAACCGGCTCCCC
AGGTTCTAGCCCGTCTGCTTCCACTGGTACTGGCCCAGGTGCTTCCCCGGGCACCAGCTCT
ACTGGTTCTCCAGGTAGCGAACCTGCTACCTCCGGTTCTGAAACCCCAGGTACCTCTGAAA
GCGCAACTCCGGAGTCTGGTCCAGGTAGCCCTGCAGGTTCTCCTACCTCCACTGAGGAAGG
TACTCCGTCTGGTGCAACCGGCTCCCCAGGTTCTAGCCCGTCTGCTTCCACTGGTA
CTGGCCCAGGTGCTTCCCCGGGCACCAGCTCTACTGGTTCTCCAGGTACCTCTGAAAGCGC
GGAGTCTGGCCCAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTACT
TCTACTGAACCGTCCGAAGGTAGCGCACCA
AE912 ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTACCCCGGGTAGCGGTACTG
CTTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTGCTTCTCCG
GGCACCAGCTCTACCGGTTCTCCAGGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAAG
GTACTTCTGAAAGCGCTACTCCTGAGTCTGGTCCAGGTACCTCTACTGAACCGTCCGAAGG
TAGCGCTCCAGGTAGCCCAGCAGGCTCTCCGACTTCCACTGAGGAAGGTACTTCTACTGAA
CCTTCCGAAGGCAGCGCACCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTA
CTTCTGAAAGCGCTACCCCGGAATCTGGCCCAGGTAGCGAACCGGCTACTTCTGGTTCTGA
AACCCCAGGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCCAGGTAGCCCGGCAGGCTC
TCCGACCTCTACTGAGGAAGGTACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTAC
CTCTACCGAACCGTCTGAGGGCAGCGCACCAGGTACTTCTACCGAACCGTCCGAGGGTAG
CGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGAAGGTACTTCTACCGAACCG
TCCGAGGGTAGCGCACCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTT
CTGAAAGCGCTACCCCGGAGTCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCG
CACCAGGTACTTCTGAAAGCGCAACCCCTGAATCCGGTCCAGGTAGCGAACCGGCTACTTC
TGGCTCTGAGACTCCAGGTACTTCTACCGAACCGTCCGAAGGTAGCGCACCAGGTACTTCT
ACTGAACCGTCTGAAGGTAGCGCACCAGGTACTTCTGAAAGCGCAACCCCGGAATCCGGC
CCAGGTACCTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTAGCCCTGCTGGCTCTCCA
ACCGAAGAAGGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCAGGTAGCGAA
CCGGCAACCTCCGGTTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCC
CAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTCCGA
AGGTAGCGCACCAGGTACTTCTACCGAACCGTCCGAAGGCAGCGCTCCAGGTACCTCTACT
GAACCTTCCGAGGGCAGCGCTCCAGGTACCTCTACCGAACCTTCTGAAGGTAGCGCACCA
GGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCT
CCACCGAGGAAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTACCTCTGAAA
GCGCAACTCCTGAGTCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCAGG
TACCTCTGAAAGCGCAACCCCGGAATCTGGTCCAGGTAGCGAACCTGCAACCTCTGGCTCT
GAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAAC
CGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTA
GCCCGGCTGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTAC
TGAAGAAGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGTACTTCTGAAAGCGC
AACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACCGTCTGAGGGCAGCGCACCAGGTAC
CTCTGAAAGCGCAACTCCTGAGTCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCTCTGAG
ACTCCAGGTACCTCTGAAAGCGCAACCCCGGAATCTGGTCCAGGTAGCGAACCTGCAACC
TCTGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTT
AACCGTCCGAGGGCAGCGCACCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCG
AAGAAGGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCAGGTAGCGAACCGGCAACCT
CCGGTTCTGAAACCCCAGGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCC
GGCTGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAA
ACTTCTACCGAACCTTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACC
CCTGAGTCCGGCCCAGGTACTTCTGAAAGCGCTACTCCTGAATCCGGTCCAGGTACTTCTG
AAAGCGCTACCCCGGAATCTGGCCCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAACCCC
AGGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCCAGGTAGCCCAGCAGGCTCTCCGACT
TCCACTGAGGAAGGTACTTCTACTGAACCTTCCGAAGGCAGCGCACCAGGTACCTCTACTG
AACCTTCTGAGGGCAGCGCTCCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCAG
GTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGAGGG
ACCA
AM1318 GGTACTTCTACTGAACCGTCTGAAGGCAGCGCACCAGGTAGCGAACCGGCTACTTCCGGTT
CTGAAACCCCAGGTAGCCCAGCAGGTTCTCCAACTTCTACTGAAGAAGGTTCTACCAGCTC
TACCGCAGAATCTCCTGGTCCAGGTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAGGT
TCTACTAGCGAATCTCCTTCTGGCACTGCACCAGGTTCTACTAGCGAATCCCCGTCTGGTA
CTGCTCCAGGTACTTCTACTCCTGAAAGCGGTTCCGCTTCTCCAGGTACCTCTACTCCGGAA
AGCGGTTCTGCATCTCCAGGTAGCGAACCGGCAACCTCCGGCTCTGAAACCCCAGGTACCT
CTGAAAGCGCTACTCCTGAATCCGGCCCAGGTAGCCCGGCAGGTTCTCCGACTTCCACTGA
GGAAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACC
CCGGAGTCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTA
CCGAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGG
AAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTACTTCTACCGAACCTTCCGA
GGGCAGCGCACCAGGTACTTCTGAAAGCGCTACCCCTGAGTCCGGCCCAGGTACTTCTGA
AAGCGCTACTCCTGAATCCGGTCCAGGTACCTCTACTGAACCTTCCGAAGGCAGCGCTCCA
GGTACCTCTACCGAACCGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCAACCCCT
GAATCCGGTCCAGGTACTTCTACTGAACCTTCCGAAGGTAGCGCTCCAGGTAGCGAACCTG
CTACTTCTGGTTCTGAAACCCCAGGTAGCCCGGCTGGCTCTCCGACCTCCACCGAGGAAGG
TACCCCGTCTGGTGCTACTGGTTCTCCAGGTACTCCGGGCAGCGGTACTGCTTCTT
CCTCTCCAGGTAGCTCTACCCCTTCTGGTGCTACTGGCTCTCCAGGTACCTCTACCGAACCG
TCCGAGGGTAGCGCACCAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTAGC
GAACCGGCAACCTCCGGTTCTGAAACTCCAGGTAGCCCTGCTGGCTCTCCGACTTCTACTG
AGGAAGGTAGCCCGGCTGGTTCTCCGACTTCTACTGAGGAAGGTACTTCTACCGAACCTTC
CGAAGGTAGCGCTCCAGGTCCAGAACCAACGGGGCCGGCCCCAAGCGGAGGTAGCGAAC
CGGCAACCTCCGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAATCCGGCCC
AGGTAGCCCGGCAGGTTCTCCGACTTCCACTGAGGAAGGTACTTCTGAAAGCGCTACTCCT
GAGTCCGGCCCAGGTAGCCCGGCTGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCT
GGCTCTCCAACTTCTACTGAAGAAGGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCAG
GTAGCCCGGCTGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTCCAACTTC
TACTGAAGAAGGTTCTACCAGCTCTACCGCTGAATCTCCTGGCCCAGGTTCTACTAGCGAA
TCTCCGTCTGGCACCGCACCAGGTACTTCCCCTAGCGGTGAATCTTCTACTGCACCAGGTT
CTACCAGCGAATCTCCTTCTGGCACCGCTCCAGGTTCTACTAGCGAATCCCCGTCTGGTAC
AGGTACTTCTCCTAGCGGCGAATCTTCTACCGCACCAGGTACTTCTACCGAACCT
TCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACCCCTGAGTCCGGCCCAGGTACTT
CTGAAAGCGCTACTCCTGAATCCGGTCCAGGTAGCGAACCGGCAACCTCTGGCTCTGAAA
CCCCAGGTACCTCTGAAAGCGCTACTCCGGAATCTGGTCCAGGTACTTCTGAAAGCGCTAC
TCCGGAATCCGGTCCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCT
GAAAGCGCTACCCCGGAGTCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCGCA
CCAGGTACCTCCCCTAGCGGCGAATCTTCTACTGCTCCAGGTACCTCTCCTAGCGGCGAAT
CTTCTACCGCTCCAGGTACCTCCCCTAGCGGTGAATCTTCTACCGCACCAGGTACTTCTACC
GAACCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGAA
GGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGTTCTAGCCCTTCTGCTTCCACCG
GTACCGGCCCAGGTAGCTCTACTCCGTCTGGTGCAACTGGCTCTCCAGGTAGCTCTACTCC
GTCTGGTGCAACCGGCTCCCCAGGTAGCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGT
AGCTCTACCCCGTCTGGTGCAACCGGCTCCCCAGGTGCATCCCCGGGTACTAGCTCTACCG
GTTCTCCAGGTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAGGTACTTCTCCGAGCGGTG
AATCTTCTACCGCACCAGGTTCTACTAGCTCTACCGCTGAATCTCCGGGCCCAGGTACTTCT
CCGAGCGGTGAATCTTCTACTGCTCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCC
CAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTCCGA
AGGTAGCGCACCAGGTTCTAGCCCTTCTGCATCTACTGGTACTGGCCCAGGTAGCTCTACT
CCTTCTGGTGCTACCGGCTCTCCAGGTGCTTCTCCGGGTACTAGCTCTACCGGTTCTCCAGG
TACTTCTACTCCGGAAAGCGGTTCCGCATCTCCAGGTACTTCTCCTAGCGGTGAATCTTCTA
CTGCTCCAGGTACCTCTCCTAGCGGCGAATCTTCTACTGCTCCAGGTACTTCTGAAAGCGC
AACCCCTGAATCCGGTCCAGGTAGCGAACCGGCTACTTCTGGCTCTGAGACTCCAGGTACT
TCTACCGAACCGTCCGAAGGTAGCGCACCAGGTTCTACCAGCGAATCCCCTTCTGGTACTG
CTCCAGGTTCTACCAGCGAATCCCCTTCTGGCACCGCACCAGGTACTTCTACCCCTGAAAG
CGGCTCCGCTTCTCCAGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGTACTTCT
GCAACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACCGTCTGAGGGCAGCGCA
CCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGAAGGTACCTCTGAAAGCGCAACC
CCTGAATCCGGCCCAGGTAGCGAACCGGCAACCTCCGGTTCTGAAACCCCAGGTAGCTCT
ACCCCGTCTGGTGCTACCGGTTCCCCAGGTGCTTCTCCTGGTACTAGCTCTACCGGTTCTCC
AGGTAGCTCTACCCCGTCTGGTGCTACTGGCTCTCCAGGTTCTACTAGCGAATCCCCGTCT
GGTACTGCTCCAGGTACTTCCCCTAGCGGTGAATCTTCTACTGCTCCAGGTTCTACCAGCTC
TACCGCAGAATCTCCGGGTCCAGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGT
GCATCCCCGGGTACCAGCTCTACCGGTTCTCCAGGTACTCCGGGTAGCGGTACCGCTTCTT
CCTCTCCAGGTAGCCCTGCTGGCTCTCCGACTTCTACTGAGGAAGGTAGCCCGGCTGGTTC
TCCGACTTCTACTGAGGAAGGTACTTCTACCGAACCTTCCGAAGGTAGCGCTCCA
BC864 TCCACCGAACCATCCGAACCAGGTAGCGCAGGTACTTCCACCGAACCATCCGAA
AGCGCAGGTAGCGAACCGGCAACCTCTGGTACTGAACCATCAGGTAGCGGCGCA
TCCGAGCCTACCTCTACTGAACCAGGTAGCGAACCGGCTACCTCCGGTACTGAGCCATCAG
GTAGCGAACCGGCAACTTCCGGTACTGAACCATCAGGTAGCGAACCGGCAACTTCCGGCA
CTGAACCATCAGGTAGCGGTGCATCTGAGCCGACCTCTACTGAACCAGGTACTTCTACTGA
ACCATCTGAGCCGGGCAGCGCAGGTAGCGAACCAGCTACTTCTGGCACTGAACCATCAGG
TACTTCTACTGAACCATCCGAACCAGGTAGCGCAGGTAGCGAACCTGCTACCTCTGGTACT
GAGCCATCAGGTAGCGAACCGGCTACCTCTGGTACTGAACCATCAGGTACTTCTACCGAAC
CATCCGAGCCTGGTAGCGCAGGTACTTCTACCGAACCATCCGAGCCAGGCAGCGCAGGTA
GCGAACCGGCAACCTCTGGCACTGAGCCATCAGGTAGCGAACCAGCAACTTCTGGTACTG
AACCATCAGGTACTAGCGAGCCATCTACTTCCGAACCAGGTGCAGGTAGCGGCGCATCCG
AACCTACTTCCACTGAACCAGGTACTAGCGAGCCATCCACCTCTGAACCAGGTGCAGGTA
GCGAACCGGCAACTTCCGGCACTGAACCATCAGGTAGCGAACCGGCTACCTCTGGTACTG
AACCATCAGGTACTTCTACCGAACCATCCGAGCCTGGTAGCGCAGGTACTTCTACCGAACC
ATCCGAGCCAGGCAGCGCAGGTAGCGGTGCATCCGAGCCGACCTCTACTGAACCAGGTAG
CGAACCAGCAACTTCTGGCACTGAGCCATCAGGTAGCGAACCAGCTACCTCTGGTACTGA
ACCATCAGGTAGCGAACCGGCTACTTCCGGCACTGAACCATCAGGTAGCGAACCAGCAAC
CTCCGGTACTGAACCATCAGGTACTTCCACTGAACCATCCGAACCGGGTAGCGCAGGTAG
CGAACCGGCAACTTCCGGCACTGAACCATCAGGTAGCGGTGCATCTGAGCCGACCTCTACT
GAACCAGGTACTTCTACTGAACCATCTGAGCCGGGCAGCGCAGGTAGCGAACCTGCAACC
TCCGGCACTGAGCCATCAGGTAGCGGCGCATCTGAACCAACCTCTACTGAACCAGGTACTT
AACCATCTGAGCCAGGCAGCGCAGGTAGCGGCGCATCTGAACCAACCTCTACTG
AACCAGGTAGCGAACCAGCAACTTCTGGTACTGAACCATCAGGTAGCGGCGCATCTGAGC
CTACTTCCACTGAACCAGGTAGCGAACCGGCAACTTCCGGCACTGAACCATCAGGTAGCG
GTGCATCTGAGCCGACCTCTACTGAACCAGGTACTTCTACTGAACCATCTGAGCCGGGCAG
CGCAGGTAGCGAACCGGCAACTTCCGGCACTGAACCATCAGGTAGCGGTGCATCTGAGCC
TACTGAACCAGGTACTTCTACTGAACCATCTGAGCCGGGCAGCGCAGGTAGCGA
ACCAGCTACTTCTGGCACTGAACCATCAGGTACTTCTACTGAACCATCCGAACCAGGTAGC
GCAGGTAGCGAACCTGCTACCTCTGGTACTGAGCCATCAGGTACTTCTACTGAACCATCCG
AGCCGGGTAGCGCAGGTACTTCCACTGAACCATCTGAACCTGGTAGCGCAGGTACTTCCAC
TGAACCATCCGAACCAGGTAGCGCAGGTACTTCTACTGAACCATCCGAGCCGGGTAGCGC
AGGTACTTCCACTGAACCATCTGAACCTGGTAGCGCAGGTACTTCCACTGAACCATCCGAA
CCAGGTAGCGCAGGTACTAGCGAACCATCCACCTCCGAACCAGGCGCAGGTAGCGGTGCA
TCTGAACCGACTTCTACTGAACCAGGTACTTCCACTGAACCATCTGAGCCAGGTAGCGCAG
GTACTTCCACCGAACCATCCGAACCAGGTAGCGCAGGTACTTCCACCGAACCATCCGAAC
CTGGCAGCGCAGGTAGCGAACCGGCAACCTCTGGTACTGAACCATCAGGTAGCGGTGCAT
CCGAGCCGACCTCTACTGAACCAGGTAGCGAACCAGCAACTTCTGGCACTGAGCCATCAG
GTAGCGAACCAGCTACCTCTGGTACTGAACCATCAGGTAGCGAACCGGCAACCTCTGGCA
CTGAGCCATCAGGTAGCGAACCAGCAACTTCTGGTACTGAACCATCAGGTACTAGCGAGC
CATCTACTTCCGAACCAGGTGCAGGTAGCGAACCTGCAACCTCCGGCACTGAGCCATCAG
GTAGCGGCGCATCTGAACCAACCTCTACTGAACCAGGTACTTCCACCGAACCATCTGAGCC
AGGCAGCGCAGGTAGCGAACCTGCAACCTCCGGCACTGAGCCATCAGGTAGCGGCGCATC
AACCTCTACTGAACCAGGTACTTCCACCGAACCATCTGAGCCAGGCAGCGCA
IHR64 GGTAGCGAAACTGCTACTTCCGGCTCTGAGACTGCAGGTACTAGTGAATCCGCAACTAGC
GAATCTGGCGCAGGTAGCACTGCAGGCTCTGAGACTTCCACTGAAGCAGGTACTAGCGAG
TCCGCAACCAGCGAATCCGGCGCAGGTAGCGAAACTGCTACCTCTGGCTCCGAGACTGCA
GGTAGCGAAACTGCAACCTCTGGCTCTGAAACTGCAGGTACTTCCACTGAAGCAAGTGAA
GGCTCCGCATCAGGTACTTCCACCGAAGCAAGCGAAGGCTCCGCATCAGGTACTAGTGAG
TCCGCAACTAGCGAATCCGGTGCAGGTAGCGAAACCGCTACCTCTGGTTCCGAAACTGCA
GGTACTTCTACCGAGGCTAGCGAAGGTTCTGCATCAGGTAGCACTGCTGGTTCCGAGACTT
CTACTGAAGCAGGTACTAGCGAATCTGCTACTAGCGAATCCGGCGCAGGTACTAGCGAAT
CCAGCGAATCCGGCGCAGGTAGCGAAACTGCAACCTCTGGTTCCGAGACTGCAG
GTACTAGCGAGTCCGCTACTAGCGAATCTGGCGCAGGTACTTCCACTGAAGCTAGTGAAG
GTTCTGCATCAGGTAGCGAAACTGCTACTTCTGGTTCCGAAACTGCAGGTAGCGAAACCGC
TACCTCTGGTTCCGAAACTGCAGGTACTTCTACCGAGGCTAGCGAAGGTTCTGCATCAGGT
AGCACTGCTGGTTCCGAGACTTCTACTGAAGCAGGTACTAGCGAGTCCGCTACTAGCGAAT
CTGGCGCAGGTACTTCCACTGAAGCTAGTGAAGGTTCTGCATCAGGTAGCGAAACTGCTAC
TTCTGGTTCCGAAACTGCAGGTAGCACTGCTGGCTCCGAGACTTCTACCGAAGCAGGTAGC
ACTGCAGGTTCCGAAACTTCCACTGAAGCAGGTAGCGAAACTGCTACCTCTGGCTCTGAGA
CTGCAGGTACTAGCGAATCTGCTACTAGCGAATCCGGCGCAGGTACTAGCGAATCCGCTA
CCAGCGAATCCGGCGCAGGTAGCGAAACTGCAACCTCTGGTTCCGAGACTGCAGGTACTA
GCGAATCTGCTACTAGCGAATCCGGCGCAGGTACTAGCGAATCCGCTACCAGCGAATCCG
GTAGCGAAACTGCAACCTCTGGTTCCGAGACTGCAGGTAGCGAAACCGCTACCT
CTGGTTCCGAAACTGCAGGTACTTCTACCGAGGCTAGCGAAGGTTCTGCATCAGGTAGCAC
TTCCGAGACTTCTACTGAAGCAGGTAGCGAAACTGCTACTTCCGGCTCTGAGACT
GCAGGTACTAGTGAATCCGCAACTAGCGAATCTGGCGCAGGTAGCACTGCAGGCTCTGAG
ACTTCCACTGAAGCAGGTAGCACTGCTGGTTCCGAAACCTCTACCGAAGCAGGTAGCACT
GCAGGTTCTGAAACCTCCACTGAAGCAGGTACTTCCACTGAGGCTAGTGAAGGCTCTGCAT
GCACTGCTGGTTCCGAAACCTCTACCGAAGCAGGTAGCACTGCAGGTTCTGAAA
CCTCCACTGAAGCAGGTACTTCCACTGAGGCTAGTGAAGGCTCTGCATCAGGTAGCACTGC
AGGTTCTGAGACTTCCACCGAAGCAGGTAGCGAAACTGCTACTTCTGGTTCCGAAACTGCA
GGTACTTCCACTGAAGCTAGTGAAGGTTCCGCATCAGGTACTAGTGAGTCCGCAACCAGC
GAATCCGGCGCAGGTAGCGAAACCGCAACCTCCGGTTCTGAAACTGCAGGTACTAGCGAA
TCCGCAACCAGCGAATCTGGCGCAGGTACTAGTGAGTCCGCAACCAGCGAATCCGGCGCA
GGTAGCGAAACCGCAACCTCCGGTTCTGAAACTGCAGGTACTAGCGAATCCGCAACCAGC
GAATCTGGCGCAGGTAGCGAAACTGCTACTTCCGGCTCTGAGACTGCAGGTACTTCCACCG
AAGCAAGCGAAGGTTCCGCATCAGGTACTTCCACCGAGGCTAGTGAAGGCTCTGCATCAG
GTAGCACTGCTGGCTCCGAGACTTCTACCGAAGCAGGTAGCACTGCAGGTTCCGAAACTTC
CACTGAAGCAGGTAGCGAAACTGCTACCTCTGGCTCTGAGACTGCAGGTACTAGCGAATC
TGCTACTAGCGAATCCGGCGCAGGTACTAGCGAATCCGCTACCAGCGAATCCGGCGCAGG
TAGCGAAACTGCAACCTCTGGTTCCGAGACTGCAGGTAGCGAAACTGCTACTTCCGGCTCC
GAGACTGCAGGTAGCGAAACTGCTACTTCTGGCTCCGAAACTGCAGGTACTTCTACTGAGG
AAGGTTCCGCATCAGGTACTAGCGAGTCCGCAACCAGCGAATCCGGCGCAGGTA
GCGAAACTGCTACCTCTGGCTCCGAGACTGCAGGTAGCGAAACTGCAACCTCTGGCTCTGA
AACTGCAGGTACTAGCGAATCTGCTACTAGCGAATCCGGCGCAGGTACTAGCGAATCCGC
TACCAGCGAATCCGGCGCAGGTAGCGAAACTGCAACCTCTGGTTCCGAGACTGCA
One may clone the library of XTEN-encoding genes into one or more expression vectors
known in the art. To facilitate the identification of well-expressing library members, one can construct
the library as a fusion to a reporter protein. Non-limiting examples of suitable reporter genes are green
fluorescent protein, luciferase, alkaline phosphatase, and beta-galactosidase. By screening, one can
identify short XTEN sequences that can be expressed in high concentration in the host organism of
choice. Subsequently, one can generate a library of random XTEN dimers and repeat the screen for a high
level of expression. Subsequently, one can screen the resulting constructs for a number of properties
such as level of expression, protease stability, or binding to antiserum.
One aspect of the invention is to provide polynucleotide sequences encoding the components of
the fusion protein wherein the creation of the sequence has undergone codon optimization. Of particular
interest is codon optimization with the goal of improving expression of the polypeptide compositions and
improving the genetic stability of the encoding gene in the production hosts. For example, codon
optimization is of particular importance for XTEN sequences that are rich in glycine or that have very
repetitive amino acid sequences. Codon optimization is performed using computer programs
(Gustafsson, C., et al. (2004) Trends Biotechnol, 22: 346-53), some of which minimize ribosomal
pausing (Coda Genomics Inc.). In one embodiment, one can perform codon optimization by constructing
codon libraries where all members of the library encode the same amino acid sequence but where codon
usage is varied. Such libraries can be screened for highly expressing and genetically stable members that
are particularly suitable for the large-scale production of XTEN-containing products. When designing
XTEN sequences one can consider a number of properties. One can minimize the repetitiveness in the
encoding DNA sequences. In addition, one can avoid or minimize the use of codons that are rarely used
by the production host (e.g., the AGG and AGA arginine codons and one leucine codon in E. coli). In the
case of E. coli, two glycine codons, GGA and GGG, are rarely used in highly expressed proteins. Thus
codon optimization of the gene encoding XTEN sequences can be very desirable. DNA sequences that
have a high level of glycine tend to have a high GC content that can lead to instability or low expression
levels. Thus, when possible, it is preferred to choose codons such that the GC-content of the XTEN-
encoding sequence is suitable for the production organism that will be used to manufacture the XTEN.
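The two design checks named above, GC content and avoidance of codons rarely used in E. coli, can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the disclosure. The rare-codon set holds only the four codons the text names (AGG, AGA, GGA, GGG); the unnamed leucine codon is left out rather than guessed.

```python
# Codons the text singles out as rarely used in E. coli. The unnamed leucine
# codon is deliberately omitted rather than guessed.
RARE_E_COLI_CODONS = {"AGG", "AGA", "GGA", "GGG"}


def gc_content(dna: str) -> float:
    """Return the fraction of G and C bases in a DNA string."""
    dna = dna.upper()
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna) / len(dna)


def rare_codon_positions(dna: str, rare=RARE_E_COLI_CODONS) -> list:
    """Return 0-based codon indices (reading frame 0) that hold a rare codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return [i for i, codon in enumerate(codons) if codon in rare]


if __name__ == "__main__":
    # First 36 nucleotides of the AE912 entry in the sequence listing above.
    segment = "ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAG"
    print(f"GC content: {gc_content(segment):.2f}")
    print("rare codon indices:", rare_codon_positions(segment))
```

What counts as a suitable GC content depends on the production organism, so the sketch reports the value and does not apply a threshold.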
Optionally, the full-length XTEN-encoding gene comprises one or more sequencing islands. In
this context, sequencing islands are short-stretch sequences that are distinct from the XTEN library
construct sequences and that include a restriction site not present or expected to be present in the full-
length XTEN-encoding gene. In one embodiment, a sequencing island is the sequence
5'-AGGTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAGGT-3'. In another embodiment, a
sequencing island is the sequence
5'-AGGTCCAGAACCAACGGGGCCGGCCCCAAGCGGAGGT-3'.
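The defining property of a sequencing island, a restriction site that occurs nowhere else in the gene, can be checked with a short Python sketch. The text does not name the enzymes. As an assumption, the sketch uses the 8-bp AscI (GGCGCGCC) and FseI (GGCCGGCC) recognition sequences, which occur in the first and second island respectively. The helper names are hypothetical.

```python
ISLAND_1 = "AGGTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAGGT"
ISLAND_2 = "AGGTCCAGAACCAACGGGGCCGGCCCCAAGCGGAGGT"

# Assumption: the unique sites are the AscI and FseI recognition sequences;
# the text itself does not name the enzymes.
SITES = {"AscI": "GGCGCGCC", "FseI": "GGCCGGCC"}


def count_overlapping(dna: str, site: str) -> int:
    """Count occurrences of `site` in `dna`, including overlapping ones."""
    count = 0
    start = dna.find(site)
    while start != -1:
        count += 1
        start = dna.find(site, start + 1)
    return count


def site_is_unique_to_island(gene: str, island: str, site: str) -> bool:
    """True if `site` occurs exactly once in `gene`, inside the copy of `island`."""
    island_start = gene.find(island)
    if island_start == -1 or count_overlapping(gene, site) != 1:
        return False
    site_start = gene.find(site)
    island_end = island_start + len(island)
    return island_start <= site_start and site_start + len(site) <= island_end


if __name__ == "__main__":
    # A 36-nt XTEN-encoding segment taken from the listing above, used as flanks.
    flank = "GGTAGCGAACCGGCAACCTCCGGTTCTGAAACTCCA"
    gene = flank + ISLAND_1 + flank
    print(site_is_unique_to_island(gene, ISLAND_1, SITES["AscI"]))
```

Only the top strand is scanned. Both assumed sites are palindromic, so a top-strand search also covers the complementary strand.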
In one embodiment, polynucleotide libraries are constructed using the disclosed methods
wherein all members of the library encode the same amino acid sequence but the codon usage for the
respective amino acids in the sequence is varied. Such libraries can be screened for highly expressing
and genetically stable members that are particularly suitable for the large-scale production of XTEN-
containing products.
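A codon library of this kind, in which every member encodes the same amino acid sequence with different codons, can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed construction method. Random choice among synonymous codons stands in for the physical library, the function names are hypothetical, and the codon table covers only the six residues (G, S, P, A, T, E) used in the example peptide.

```python
import random

# Standard-genetic-code synonymous codons for the six residues used below.
SYNONYMOUS_CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
    "E": ["GAA", "GAG"],
}
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons
}


def translate(dna: str) -> str:
    """Translate DNA (frame 0) using the six-residue table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_library(peptide: str, size: int, seed: int = 0) -> list:
    """Return `size` DNA sequences that all encode `peptide` with varied codons."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(SYNONYMOUS_CODONS[aa]) for aa in peptide)
        for _ in range(size)
    ]


if __name__ == "__main__":
    peptide = "GSPAGSPTSTEE"  # 12-residue motif translated from the AE912 DNA above
    for member in codon_library(peptide, size=3):
        assert translate(member) == peptide
        print(member)
```

In practice the random draw would be followed by the screen described in the text, for expression level and genetic stability.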
Optionally, one can sequence clones in the library to eliminate isolates that contain undesirable
sequences. The initial library of short XTEN sequences allows some variation in amino acid sequence.
For instance, one can randomize some codons such that a number of hydrophilic amino acids can occur in
a particular position. During the process of iterative multimerization one can screen the resulting library
members for other characteristics like solubility or protease resistance in addition to a screen for high-
level expression.
Once the gene that encodes the XTEN of desired length and properties is selected, it is
genetically fused at the desired location to the nucleotides encoding the GLP-2 gene(s) by cloning it into
the construct adjacent and in frame with the gene coding for GLP-2, or alternatively in frame with
nucleotides encoding a spacer/cleavage sequence linked to a terminal XTEN. The invention provides
various permutations of the foregoing, depending on the GLP2-XTEN to be encoded. For example, for a
gene encoding a GLP2-XTEN fusion protein comprising a GLP-2 and two XTEN, such as embodied by
formula III, as depicted above, the gene would have polynucleotides encoding GLP-2 and
polynucleotides encoding two XTEN, which can be identical or different in composition and sequence
length. In one non-limiting embodiment of the foregoing, the GLP-2 polynucleotides would encode
native GLP-2, the polynucleotides encoding the C-terminal XTEN would encode AE864, and the
polynucleotides encoding the N-terminal XTEN would encode AE912. The step of cloning the GLP-2 genes into the
XTEN construct can occur through a ligation or multimerization step, as shown in a schematic
flowchart of representative steps in the assembly of a GLP2-XTEN polynucleotide construct. Individual
oligonucleotides 501 are annealed into sequence motifs 502 such as a 12 amino acid motif (“12-mer”),
which is ligated to additional sequence motifs from a library that can multimerize to create a pool that
encompasses the desired length of the XTEN 504, as well as ligated to a smaller concentration of an oligo
containing BbsI and KpnI restriction sites 503. The motif libraries can be limited to specific sequence
XTEN families; e.g., AD, AE, AF, AG, AM, or AQ sequences of Table 3. As illustrated, the
XTEN polynucleotides encode a length, in this case, of 36 amino acid residues, but longer lengths can be
achieved by this process. For example, multimerization can be performed by ligation, overlap extension,
PCR assembly or similar cloning techniques known in the art. The resulting pool of ligation products is
gel-purified and the band with the desired length of XTEN is cut, resulting in an isolated XTEN gene
with a stopper sequence 505. The XTEN gene can be cloned into a stuffer vector. In this case, the
vector encodes an optional CBD sequence 506 and a GFP gene 508. Digestion is then performed with
BbsI/HindIII to remove 507 and 508 and place the stop codon. The resulting product is then cloned into a
BsaI/HindIII digested vector containing a gene encoding the GLP-2, resulting in the gene 500 encoding a
GLP2-XTEN fusion protein. As would be apparent to one of ordinary skill in the art, the methods can be
applied to create constructs in alternative configurations and with varying XTEN lengths.
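The multimerization of 12-mer motifs into a 36-residue XTEN segment can be mimicked at the amino acid level with a brief Python sketch. It rests on two assumptions: the four motifs are translations of AE-family DNA from the sequence listing above, not entries copied from Table 3, and a random draw from the motif library stands in for ligation. The function name is hypothetical.

```python
import random

# 12-residue motifs obtained by translating AE-family DNA from the listing above.
AE_MOTIFS = ["GSPAGSPTSTEE", "GSEPATSGSETP", "GTSESATPESGP", "GTSTEPSEGSAP"]


def multimerize(motifs: list, target_length: int, seed: int = 0) -> str:
    """Concatenate randomly chosen motifs until `target_length` residues are reached."""
    motif_length = len(motifs[0])
    if any(len(motif) != motif_length for motif in motifs):
        raise ValueError("all motifs must have the same length")
    if target_length % motif_length != 0:
        raise ValueError("target length must be a multiple of the motif length")
    rng = random.Random(seed)
    copies = target_length // motif_length
    return "".join(rng.choice(motifs) for _ in range(copies))


if __name__ == "__main__":
    segment = multimerize(AE_MOTIFS, target_length=36)
    print(segment, len(segment))
```

Repeating the call with a larger `target_length` corresponds to the longer lengths the process can reach. The gel-purification step that selects the desired length has no counterpart in the sketch.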
The constructs encoding GLP2-XTEN fusion proteins can be designed in different
configurations of the components XTEN, GLP-2, and spacer sequences, such as shown in the figures. In one
embodiment, the construct comprises polynucleotide sequences complementary to, or those that encode a
monomeric polypeptide of components in the following order (5’ to 3’) GLP-2 and XTEN. In another
embodiment, the construct comprises polynucleotide sequences complementary to, or those that encode a
monomeric polypeptide of components in the following order (5’ to 3’) XTEN and GLP-2. In another
embodiment, the construct comprises polynucleotide sequences complementary to, or those that encode a
monomeric polypeptide of components in the following order (5’ to 3’) XTEN, GLP-2, and a second
XTEN. In another embodiment, the construct comprises polynucleotide sequences complementary to, or
those that encode a monomeric polypeptide of components in the following order (5’ to 3’) GLP-2,
spacer sequence, and XTEN. In another embodiment, the construct comprises polynucleotide sequences
complementary to, or those that encode a monomeric polypeptide of components in the following order
(5’ to 3’) XTEN, spacer sequence, and GLP-2. The spacer polynucleotides can optionally comprise
sequences encoding cleavage sequences. As will be apparent to those of skill in the art, other
permutations or multimers of the foregoing are possible.
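The five component orders listed above can be written down as data and joined in frame, as in the following Python sketch. All names are hypothetical. The GLP-2 and spacer strings are short placeholders, not the actual coding sequences, and the two XTEN strings are 36-nt segments taken from the AE912 entry in the listing above.

```python
# The five 5'-to-3' component orders listed in the text.
CONFIGURATIONS = {
    "GLP2-XTEN": ["GLP2", "XTEN"],
    "XTEN-GLP2": ["XTEN", "GLP2"],
    "XTEN-GLP2-XTEN": ["XTEN", "GLP2", "XTEN2"],
    "GLP2-spacer-XTEN": ["GLP2", "SPACER", "XTEN"],
    "XTEN-spacer-GLP2": ["XTEN", "SPACER", "GLP2"],
}

EXAMPLE_PARTS = {
    "GLP2": "CATGCTGATGGT",   # hypothetical placeholder, not the GLP-2 gene
    "SPACER": "GGTGGTTCT",    # hypothetical placeholder spacer
    "XTEN": "GGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAA",   # segment of AE912 above
    "XTEN2": "GGTACTTCTGAAAGCGCTACTCCTGAGTCTGGTCCA",  # segment of AE912 above
}


def assemble(configuration: str, parts: dict) -> str:
    """Join the parts of one configuration 5' to 3', rejecting frame-shifting parts."""
    pieces = []
    for name in CONFIGURATIONS[configuration]:
        dna = parts[name]
        if len(dna) % 3 != 0:
            raise ValueError(f"{name} would shift the reading frame")
        pieces.append(dna)
    return "".join(pieces)


if __name__ == "__main__":
    for configuration in CONFIGURATIONS:
        construct = assemble(configuration, EXAMPLE_PARTS)
        print(configuration, len(construct) // 3, "codons")
```

Requiring each part to be a whole number of codons is the simplest way to keep every downstream part in frame. Start and stop codons are left out of the sketch.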
The invention also encompasses polynucleotides comprising XTEN-encoding polynucleotide
variants that have a high percentage of sequence identity compared to (a) a polynucleotide sequence from
Table 7, or (b) sequences that are complementary to the polynucleotides of (a). A polynucleotide with a
high percentage of sequence identity is one that has at least about an 80% nucleic acid sequence identity,
alternatively at least about 81%, alternatively at least about 82%, alternatively at least about 83%,
alternatively at least about 84%, alternatively at least about 85%, alternatively at least about 86%,
alternatively at least about 87%, alternatively at least about 88%, alternatively at least about 89%,
alternatively at least about 90%, alternatively at least about 91%, alternatively at least about 92%,
alternatively at least about 93%, alternatively at least about 94%, alternatively at least about 95%,
alternatively at least about 96%, alternatively at least about 97%, alternatively at least about 98%, and
alternatively at least about 99% nucleic acid sequence identity compared to (a) or (b) of the foregoing, or
that can hybridize with the target polynucleotide or its complement under stringent conditions.
Homology, sequence similarity or sequence identity of nucleotide or amino acid sequences may
also be determined conventionally by using known software or computer programs such as the BestFit or
Gap pairwise comparison programs (GCG Wisconsin Package, Genetics Computer Group, 575 Science
Drive, Madison, Wis. 53711). BestFit uses the local homology algorithm of Smith and Waterman
(Advances in Applied Mathematics. 1981. 2: 482-489), to find the best segment of identity or similarity
between two sequences. Gap performs global alignments: all of one sequence with all of another similar
sequence using the method of Needleman and Wunsch, (Journal of Molecular Biology. 1970. 48:443-
453). When using a sequence alignment program such as BestFit to determine the degree of sequence
homology, similarity or identity, the default setting may be used, or an appropriate scoring matrix may be
selected to optimize identity, similarity or homology scores.
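The global alignment approach attributed above to Needleman and Wunsch, and a percent identity derived from it, can be illustrated with a compact Python sketch. It is not the Gap program. The scoring values (match +1, mismatch -1, linear gap -1) and the identity convention (matching columns divided by alignment length) are assumptions chosen for the illustration, and other conventions give different percentages.

```python
def global_alignment(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal move.
    aligned_a, aligned_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base."""
    aligned_a, aligned_b = global_alignment(a, b)
    if not aligned_a:
        raise ValueError("both sequences are empty")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(global_alignment("ACGT", "AGT"))
    print(percent_identity("GGTAGC", "GGTACC"))
```

The sketch uses quadratic time and memory, which is adequate for short examples. Sequences of the length listed in the table would normally go through an established alignment package.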
Nucleic acid sequences that are “complementary” are those that are capable of base-pairing
according to the standard Watson-Crick complementarity rules. As used herein, the term
“complementary sequences” means nucleic acid sequences that are substantially complementary, as may
be assessed by the same nucleotide comparison set forth above, or as defined as being capable of
hybridizing to the polynucleotides that encode the GLP2-XTEN sequences under stringent conditions,
such as those described herein.
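Watson-Crick complementarity can be shown with a very small Python sketch. The function names are hypothetical. The check covers perfect complementarity only, so the "substantially complementary" case and hybridization under stringent conditions are outside its scope.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def are_complementary(a: str, b: str) -> bool:
    """True if `b` is the exact reverse complement of `a`."""
    return reverse_complement(a) == b.upper()


if __name__ == "__main__":
    print(reverse_complement("AGGTGC"))
    print(are_complementary("AGGTGC", "GCACCT"))
```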
The resulting polynucleotides encoding the GLP2-XTEN chimeric fusion proteins can then be
individually cloned into an expression vector. The nucleic acid sequence is inserted into the vector by a
variety of procedures. In general, DNA is inserted into an appropriate restriction endonuclease site(s)
using techniques known in the art. Vector components generally include, but are not limited to, one or
more of a signal sequence, an origin of replication, one or more marker genes, an enhancer element, a
promoter, and a transcription termination sequence. Construction of suitable vectors containing
one or more of these components employs standard ligation techniques which are known to the skilled
artisan. Such techniques are well known in the art and well described in the scientific and patent
literature.
Various vectors are publicly available. The vector may, for example, be in the form of a
plasmid, cosmid, viral particle, or phage that may conveniently be subjected to recombinant DNA
procedures, and the choice of vector will often depend on the host cell into which it is to be introduced.
Thus, the vector may be an autonomously replicating vector, i.e., a vector which exists as an
extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a
plasmid. Alternatively, the vector may be one which, when introduced into a host cell, is integrated into
the host cell genome and replicated together with the chromosome(s) into which it has been integrated.
The invention provides for the use of plasmid vectors containing replication and control
sequences that are compatible with and recognized by the host cell, and are operably linked to the
GLP2-XTEN gene for controlled expression of the GLP2-XTEN fusion proteins. The vector ordinarily carries a
replication site, as well as sequences that encode proteins that are capable of providing phenotypic
selection in transformed cells. Such vector sequences are well known for a variety of bacteria, yeast, and
viruses. Useful expression vectors that can be used include, for example, segments of chromosomal,
non-chromosomal and synthetic DNA sequences. “Expression vector” refers to a DNA construct
containing a DNA sequence that is operably linked to a suitable control sequence capable of effecting the
expression of the DNA encoding the fusion protein in a suitable host. The requirements are that the
vectors are replicable and viable in the host cell of choice. Low- or high-copy number vectors may be
used as desired.
Suitable vectors include, but are not limited to, derivatives of SV40 and pcDNA and known
bacterial plasmids such as col E1, pCR1, pBR322, pMal-C2, pET, pGEX as described by Smith, et al.,
Gene 57:31-40 (1988), pMB9 and derivatives thereof, plasmids such as RP4, phage DNAs such as the
numerous derivatives of phage λ such as NM989, as well as other phage DNA such as M13 and
filamentous single stranded phage DNA; yeast plasmids such as the 2 micron plasmid or derivatives of
the 2μ plasmid, as well as centromeric and integrative yeast shuttle vectors; vectors useful in eukaryotic
cells such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids
and phage DNAs, such as plasmids that have been modified to employ phage DNA or the expression
control sequences; and the like. Yeast expression systems that can also be used in the present invention
include, but are not limited to, the non-fusion pYES2 vector (Invitrogen), the fusion pYESHisA, B, C
(Invitrogen), pRS vectors and the like.
The control sequences of the vector include a promoter to effect transcription, an optional
operator sequence to control such transcription, a sequence encoding suitable mRNA ribosome binding
sites, and sequences that control termination of transcription and translation. The promoter may be any
DNA sequence, which shows transcriptional activity in the host cell of choice and may be derived from
genes encoding proteins either homologous or heterologous to the host cell.
Examples of suitable promoters for directing the transcription of the DNA encoding the GLP2-
XTEN in mammalian cells are the SV40 promoter (Subramani et al., Mol. Cell. Biol. 1 (1981), 854-864),
the MT-1 (metallothionein gene) promoter (Palmiter et al., Science 222 (1983), 809-814), the CMV
promoter (Boshart et al., Cell 41:521-530, 1985) or the adenovirus 2 major late promoter (Kaufman and
Sharp, Mol. Cell. Biol. 2:1304-1319, 1982). The vector may also carry sequences such as UCOE
(ubiquitous chromatin opening elements).
Examples of suitable promoters for use in filamentous fungus host cells are, for instance, the
ADH3 promoter or the tpiA promoter. Examples of other useful promoters are those derived from the
gene encoding A. oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, A. niger neutral
α-amylase, A. niger acid stable α-amylase, A. niger or A. awamori glucoamylase (gluA), Rhizomucor miehei
lipase, A. oryzae alkaline protease, A. oryzae triose phosphate isomerase or A. nidulans acetamidase.
Preferred are the TAKA-amylase and gluA promoters. Yeast expression systems that can also be used in
the present invention include, but are not limited to, the non-fusion pYES2 vector (Invitrogen), the fusion
pYESHisA, B, C (Invitrogen), pRS vectors and the like.
Promoters suitable for use in expression vectors with prokaryotic hosts include the β-lactamase
and lactose promoter systems [Chang et al., Nature, 275:615 (1978); Goeddel et al., Nature, 281:544
(1979)], alkaline phosphatase, a tryptophan (trp) promoter system [Goeddel, Nucleic Acids Res., 8:4057
(1980); EP 36,776], and hybrid promoters such as the tac promoter [deBoer et al., Proc. Natl. Acad. Sci.
USA, 80:21-25 (1983)], all of which are operably linked to the DNA encoding GLP2-XTEN polypeptides.
Promoters for use in bacterial systems can also contain a Shine-Dalgarno (S.D.) sequence, operably
linked to the DNA encoding GLP2-XTEN polypeptides.
The invention contemplates use of other expression systems including, for example, a
baculovirus expression system with both non-fusion transfer vectors, such as, but not limited to, pVL941
(Summers, et al., Virology 84:390-402 (1978)), pVL1393 (Invitrogen), pVL1392 (Summers, et al.,
Virology 84:390-402 (1978) and Invitrogen) and pBlueBacIII (Invitrogen), and fusion transfer vectors
such as, but not limited to, pAc700 (Summers, et al., Virology 84:390-402 (1978)), pAc701 and pAc702
(same as pAc700, with different reading frames), pAc360 (Invitrogen) and pBlueBacHisA, B, C
(Invitrogen).
The DNA sequences encoding the GLP2-XTEN may also, if necessary, be operably connected
to a suitable terminator, such as the hGH terminator (Palmiter et al., Science 222, 1983, pp. 809-814) or
the TPI1 terminators (Alber and Kawasaki, J. Mol. Appl. Gen. 1, 1982, pp. 419-434) or ADH3
(McKnight et al., The EMBO J. 4, 1985, pp. 2093-2099). Expression vectors may also contain a set of
RNA splice sites located downstream from the promoter and upstream from the insertion site for the
GLP2-XTEN sequence itself, including splice sites obtained from adenovirus. Also contained in the
expression vectors is a polyadenylation signal located downstream of the insertion site. Particularly
preferred polyadenylation signals include the early or late polyadenylation signal from SV40 (Kaufman
and Sharp, ibid.), the polyadenylation signal from the adenovirus 5 E1b region, and the hGH terminator
(DeNoto et al., Nucl. Acids Res. 9:3719-3730, 1981). The expression vectors may also include a
noncoding viral leader sequence, such as the adenovirus 2 tripartite leader, located between the promoter
and the RNA splice sites; and enhancer sequences, such as the SV40 enhancer.
In one embodiment, the polynucleotide encoding a GLP2-XTEN fusion protein composition
is fused C-terminally to an N-terminal signal sequence appropriate for the expression host system. Signal
sequences are typically proteolytically removed from the protein during the translocation and secretion
process, generating a defined N-terminus. A wide variety of signal sequences have been described for
most expression systems, including bacterial, yeast, insect, and mammalian systems. A non-limiting list
of preferred examples for each expression system follows. Preferred signal sequences are OmpA,
PhoA, and DsbA for E. coli expression. Signal peptides preferred for yeast expression are ppL-alpha,
DEX4, invertase signal peptide, acid phosphatase signal peptide, CPY, or INU1. For insect cell
expression the preferred signal sequences are sexta adipokinetic hormone precursor, CP1, CP2, CP3,
CP4, TPA, PAP, or gp67. For mammalian expression the preferred signal sequences are IL2L, SV40,
IgG kappa and IgG lambda.
In another embodiment, a leader sequence, potentially comprising a well-expressed,
independent protein domain, can be fused to the N-terminus of the GLP2-XTEN sequence, separated by a
protease cleavage site. While any leader peptide sequence which does not inhibit cleavage at the
designed proteolytic site can be used, sequences in preferred embodiments will comprise stable,
well-expressed sequences such that expression and folding of the overall composition is not significantly
adversely affected, and preferably expression, solubility, and/or folding efficiency are significantly
improved. A wide variety of suitable leader sequences have been described in the literature. A non-limiting
list of suitable sequences includes maltose binding protein, cellulose binding domain, glutathione
S-transferase, 6xHis tag, FLAG tag, hemagglutinin tag, and green fluorescent protein. The leader
sequence can also be further improved by codon optimization, especially in the second codon position
following the ATG start codon, by methods well described in the literature and hereinabove.
The procedures used to ligate the DNA sequences coding for the GLP2-XTEN, the promoter
and optionally the terminator and/or secretory signal sequence, respectively, and to insert them into
suitable vectors containing the information necessary for replication, are well known to persons skilled in
the art (cf., for instance, Sambrook, J. et al., “Molecular Cloning: A Laboratory Manual,” 3rd edition,
Cold Spring Harbor Laboratory Press, 2001).
In other embodiments, the invention provides constructs and methods of making constructs
comprising a polynucleotide sequence optimized for expression that encodes at least about 20 to about
60 amino acids with XTEN characteristics that can be included at the N-terminus of an XTEN carrier
encoding sequence (in other words, the polynucleotides encoding the 20-60 encoded optimized amino
acids are linked in frame to polynucleotides encoding an XTEN component that is N-terminal to GLP-2)
to promote the initiation of translation to allow for expression of XTEN fusions at the N-terminus of
proteins without the presence of a helper domain. In an advantage of the foregoing, the sequence does
not require subsequent cleavage, thereby reducing the number of steps to manufacture XTEN-containing
compositions. As described in more detail in the Examples, the optimized N-terminal sequence has
attributes of an unstructured protein, but may include nucleotide bases encoding amino acids selected for
their ability to promote initiation of translation and enhanced expression. In one embodiment of the
foregoing, the optimized polynucleotide encodes an XTEN sequence with at least about 90% sequence
identity compared to AE912. In another embodiment of the foregoing, the optimized polynucleotide
encodes an XTEN sequence with at least about 90% sequence identity compared to AM923. In another
embodiment of the foregoing, the optimized polynucleotide encodes an XTEN sequence with at least
about 90% sequence identity compared to AE48. In another embodiment of the foregoing, the optimized
polynucleotide encodes an XTEN sequence with at least about 90% sequence identity compared to
AM48. In one embodiment, the optimized polynucleotide NTS comprises a sequence that exhibits at
least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least
about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about
98%, or at least about 99%, sequence identity compared to a sequence or its complement selected from
AE 48: 5’-
ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTACCCCGGGTAGCGGTACTGC
TTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTGCTTCTCCGGG
CACCAGCTCTACCGGTTCTCCA-3’
AM 48: 5’-
ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTGCATCCCCGGGCACCAGCTC
TACCGGTTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGTAGCTCTACCCC
GTCTGGTGCTACTGGCTCTCCA-3’.
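To make the percent-identity thresholds above concrete, the following Python sketch translates the two N-terminal sequences listed above with the standard genetic code and computes a simple position-by-position identity between them. It is a minimal illustration under stated assumptions, not the method used in the specification: the function names are hypothetical, and identity is computed without gaps over equal-length sequences, whereas identity is ordinarily determined with a sequence alignment program.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

# The two sequences as listed in the text, 5' to 3'.
AE48_DNA = (
    "ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTACCCCGGGTAGCGGTACTGC"
    "TTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTGCTTCTCCGGG"
    "CACCAGCTCTACCGGTTCTCCA"
)
AM48_DNA = (
    "ATGGCTGAACCTGCTGGCTCTCCAACCTCCACTGAGGAAGGTGCATCCCCGGGCACCAGCTC"
    "TACCGGTTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGTAGCTCTACCCC"
    "GTCTGGTGCTACTGGCTCTCCA"
)


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1; any trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped identity over two equal-length sequences (a simplification)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped identity requires sequences of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    ae48 = translate(AE48_DNA)
    am48 = translate(AM48_DNA)
    print("AE48 protein:", ae48)
    print("AM48 protein:", am48)
    print("DNA identity (%):", round(percent_identity(AE48_DNA, AM48_DNA), 1))
    print("Protein identity (%):", round(percent_identity(ae48, am48), 1))
```

Comparing a candidate sequence against either reference with `percent_identity` and checking the result against a threshold such as 90 gives a rough first screen; a gapped alignment would be needed for sequences of different lengths.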
In this manner, a chimeric DNA molecule coding for a monomeric GLP2-XTEN fusion
protein is generated. Optionally, this chimeric DNA molecule may be transferred or cloned into another
construct that is a more appropriate expression vector. At this point, a host cell capable of expressing the
chimeric DNA molecule can be transformed with the chimeric DNA molecule. The vectors containing
the DNA segments of interest can be transferred into the host cell by well-known methods, depending on
the type of cellular host. For example, calcium chloride transfection is commonly utilized for prokaryotic
cells, whereas calcium phosphate treatment, lipofection, or electroporation may be used for other cellular
hosts. Other methods used to transform mammalian cells include the use of polybrene, protoplast fusion,
liposomes, electroporation, and microinjection. See, generally, Sambrook, et al., supra.
The transformation may occur with or without the utilization of a carrier, such as an
expression vector. Then, the transformed host cell is cultured under conditions suitable for the
expression of the chimeric DNA molecule encoding GLP2-XTEN.
The present invention also provides a host cell for expressing the monomeric fusion protein
compositions disclosed herein. Examples of mammalian cell lines for use in the present invention are the
COS-1 (ATCC CRL 1650), COS-7 (ATCC CRL 1651), BHK-21 (ATCC CCL 10) and BHK-293
(ATCC CRL 1573; Graham et al., J. Gen. Virol. 36:59-72, 1977), BHK-570 cells (ATCC CRL 10314),
CHO-K1 (ATCC CCL 61), CHO-S (Invitrogen 11619-012), and 293-F (Invitrogen R790-7). A tk- ts13
BHK cell line is also available from the ATCC under accession number CRL 1632. In addition, a number
of other cell lines may be used within the present invention, including Rat Hep I (Rat hepatoma; ATCC
CRL 1600), Rat Hep II (Rat hepatoma; ATCC CRL 1548), TCMK (ATCC CCL 139), Human lung
(ATCC HB 8065), NCTC 1469 (ATCC CCL 9.1), CHO (ATCC CCL 61) and DUKX cells (Urlaub and
Chasin, Proc. Natl. Acad. Sci. USA 77:4216-4220, 1980).
Examples of suitable yeast host cells include cells of Saccharomyces spp. or
Schizosaccharomyces spp., in particular strains of Saccharomyces cerevisiae or Saccharomyces kluyveri.
Other yeasts include Schizosaccharomyces pombe (Beach and Nurse, Nature, 290:140 [1981]; EP
139,383 published 2 May 1985); Kluyveromyces hosts (U.S. Pat. No. 4,943,529; Fleer et al.,
Bio/Technology, 9:968-975 (1991)) such as, e.g., K. lactis (MW98-8C, CBS683, CBS4574; Louvencourt
et al., J. Bacteriol., 154(2):737-742 [1983]), K. fragilis (ATCC 12,424), K. bulgaricus (ATCC 16,045),
K. wickeramii (ATCC 24,178), K. waltii (ATCC 56,500), K. drosophilarum (ATCC 36,906; Van den Berg et al.,
Bio/Technology, 8:135 (1990)), K. thermotolerans and K. marxianus; Yarrowia (EP 402,226); Pichia
pastoris (EP 183,070; Sreekrishna et al., J. Basic Microbiol., 28:265-278 [1988]); Candida; Trichoderma
reesia (EP 244,234); Neurospora crassa (Case et al., Proc. Natl. Acad. Sci. USA, 76:5259-5263 [1979]);
Schwanniomyces such as Schwanniomyces occidentalis (EP 394,538 published 31 Oct. 1990).
Methylotrophic yeasts are suitable herein and include, but are not limited to, yeast capable of growth on
methanol selected from the genera consisting of Hansenula, Candida, Kloeckera, Pichia,
Saccharomyces, Torulopsis, and Rhodotorula. Further examples of suitable yeast cells are strains of
Kluyveromyces, Hansenula, e.g., H. polymorpha, or Pichia, e.g., P. pastoris (cf. Gleeson et al.,
J. Gen. Microbiol. 132, 1986, pp. 3459-3465; U.S. Pat. No. 4,882,279). A list of specific species that are
exemplary of this class of yeasts may be found in C. Anthony, The Biochemistry of Methylotrophs, 269
(1982). Methods for transforming yeast cells with heterologous DNA and producing heterologous
polypeptides therefrom are described, e.g., in U.S. Pat. No. 4,599,311, U.S. Pat. No. 4,931,373, U.S. Pat.
No. 4,870,008, 5,037,743, and U.S. Pat. No. 4,845,075, all of which are hereby incorporated by
reference.
Examples of other fungal cells are cells of filamentous fungi, e.g., Aspergillus spp., Neurospora
spp., Fusarium spp. or Trichoderma spp., in particular strains of A. oryzae, A. nidulans or A. niger. The
use of Aspergillus spp. for the expression of proteins is described in, e.g., EP 272 277, EP 238 023, EP
184 438. The transformation of F. oxysporum may, for instance, be carried out as described by Malardier
et al., 1989, Gene 78:147-156. The transformation of Trichoderma spp. may be performed for instance
as described in EP 244 234.
Other suitable cells that can be used in the present invention include, but are not limited to,
prokaryotic host cell strains such as Escherichia coli (e.g., strain DH5-α), Bacillus subtilis, Salmonella
typhimurium, or strains of the genera of Pseudomonas, Streptomyces and Staphylococcus. Non-limiting
examples of suitable prokaryotes include those from the genera: Actinoplanes; Archaeoglobus;
Bdellovibrio; Borrelia; Chloroflexus; Enterococcus; Escherichia; Lactobacillus; Listeria;
Oceanobacillus; Paracoccus; Pseudomonas; Staphylococcus; Streptococcus; Streptomyces;
Thermoplasma; and Vibrio.
Transformed cells are selected by a phenotype determined by a selectable marker, commonly
drug resistance or the ability to grow in the absence of a particular nutrient, e.g. leucine. A preferred
vector for use in yeast is the POT1 vector disclosed in U.S. Pat. No. 4,931,373. The DNA sequences
encoding the GLP2-XTEN may be preceded by a signal sequence and optionally a leader sequence, e.g.
as described above. Methods of transfecting mammalian cells and expressing DNA sequences introduced
in the cells are described in e.g., Kaufman and Sharp, J. Mol. Biol. 159 (1982), 601-621; Southern and
Berg, J. Mol. Appl. Genet. 1 (1982), 327-341; Loyter et al., Proc. Natl. Acad. Sci. USA 79 (1982),
422-426; Wigler et al., Cell 14 (1978), 725; Corsaro and Pearson, Somatic Cell Genetics 7 (1981), 603,
Graham and van der Eb, Virology 52 (1973), 456; and Neumann et al., EMBO J. 1 (1982), 841-845.
Cloned DNA sequences are introduced into cultured mammalian cells by, for example, calcium
phosphate-mediated transfection (Wigler et al., Cell 14:725-732, 1978; Corsaro and Pearson, Somatic
Cell Genetics 7:603-616, 1981; Graham and Van der Eb, Virology 52:456-467, 1973), transfection with
many commercially available reagents such as FuGENE 6 (Roche Diagnostics, Mannheim, Germany) or
lipofectamine (Invitrogen) or by electroporation (Neumann et al., EMBO J. 1:841-845, 1982). To
identify and select cells that express the exogenous DNA, a gene that confers a selectable phenotype (a
selectable marker) is generally introduced into cells along with the gene or cDNA of interest. Preferred
selectable markers include genes that confer resistance to drugs such as neomycin, hygromycin,
puromycin, zeocin, and methotrexate. The selectable marker may be an amplifiable selectable marker. A
preferred amplifiable selectable marker is a dihydrofolate reductase (DHFR) sequence. Further examples
of selectable markers are well known to one of skill in the art and include reporters such as enhanced
green fluorescent protein (EGFP), beta-galactosidase (β-gal) or chloramphenicol acetyltransferase
(CAT). Selectable markers are reviewed by Thilly (Mammalian Cell Technology, Butterworth
Publishers, Stoneham, Mass., incorporated herein by reference). A person skilled in the art will easily be
able to choose suitable selectable markers. Any known selectable marker may be employed so long as it
is capable of being expressed simultaneously with the nucleic acid encoding a gene product.
Selectable markers may be introduced into the cell on a separate plasmid at the same time as
the gene of interest, or they may be introduced on the same plasmid. On the same plasmid, the selectable
marker and the gene of interest may be under the control of different promoters or the same promoter; the
latter arrangement produces a dicistronic message. Constructs of this type are known in the art (for
example, Levinson and Simonsen, U.S. Pat. No. 4,713,339). It may also be advantageous to add
additional DNA, known as “carrier DNA,” to the mixture that is introduced into the cells.
After the cells have taken up the DNA, they are grown in an appropriate growth medium,
typically 1-2 days, to begin expressing the gene of interest. As used herein the term “appropriate growth
medium” means a medium containing nutrients and other components required for the growth of cells
and the expression of the GLP2-XTEN of interest. Media generally include a carbon source, a nitrogen
source, essential amino acids, essential sugars, vitamins, salts, phospholipids, protein and growth factors.
For production of carboxylated proteins, the medium will contain vitamin K, preferably at a
concentration of about 0.1 μg/ml to about 5 μg/ml. Drug selection is then applied to select for the growth
of cells that are expressing the selectable marker in a stable fashion. For cells that have been transfected
with an amplifiable selectable marker the drug concentration may be increased to select for an increased
copy number of the cloned sequences, thereby increasing expression levels. Clones of stably transfected
cells are then screened for expression of the GLP-2 polypeptide variant of interest.
The transformed or transfected host cell is then cultured in a suitable nutrient medium under
conditions permitting expression of the GLP2-XTEN fusion protein after which the resulting peptide may
be recovered from the culture. The medium used to culture the cells may be any conventional medium
suitable for growing the host cells, such as minimal or complex media containing appropriate
supplements. Suitable media are available from commercial suppliers or may be prepared according to
published recipes (e.g. in catalogues of the American Type Culture Collection). The culture conditions,
such as temperature, pH and the like, are those previously used with the host cell selected for expression,
and will be apparent to the ordinarily skilled artisan.
Gene expression may be measured in a sample directly, for example, by conventional Northern
blotting to quantitate the transcription of mRNA [Thomas, Proc. Natl. Acad. Sci. USA, 77:5201-5205
(1980)], dot blotting (DNA analysis), or in situ hybridization, using an appropriately labeled probe, based
on the sequences provided herein. Alternatively, antibodies may be employed that can recognize specific
duplexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein
duplexes. The antibodies in turn may be labeled and the assay may be carried out where the duplex is
bound to a surface, so that upon the formation of duplex on the surface, the presence of antibody bound
to the duplex can be detected.
Gene expression, alternatively, may be measured by immunological or fluorescent methods,
such as immunohistochemical staining of cells or tissue sections and assay of cell culture or body fluids
or the detection of selectable markers, to quantitate directly the expression of gene product. Antibodies
useful for immunohistochemical staining and/or assay of sample fluids may be either monoclonal or
polyclonal, and may be prepared in any mammal. Conveniently, the antibodies may be prepared against a
native sequence GLP-2 polypeptide or against a synthetic peptide based on the DNA sequences provided
herein or against exogenous sequence fused to GLP-2 and encoding a specific antibody epitope.
Examples of selectable markers are well known to one of skill in the art and include reporters such as
enhanced green fluorescent protein (EGFP), beta-galactosidase (β-gal) or chloramphenicol
acetyltransferase (CAT).
Expressed GLP2-XTEN polypeptide product(s) may be purified via methods known in the art
or by methods disclosed herein. Procedures such as gel filtration, affinity purification (e.g., using an
anti-GLP-2 antibody column), salt fractionation, ion exchange chromatography, size exclusion
chromatography, hydroxyapatite adsorption chromatography, hydrophobic interaction chromatography
and gel electrophoresis may be used, each tailored to recover and purify the fusion protein produced by
the respective host cells. Additional purification may be achieved by conventional chemical purification
means, such as high performance liquid chromatography. Some expressed GLP2-XTEN may require
refolding during isolation and purification. Methods of purification are described in Robert K. Scopes,
Protein Purification: Principles and Practice, Charles R. Cantor (ed.), Springer-Verlag 1994, and
Sambrook, et al., supra. Multi-step purification separations are also described in Baron, et al., Crit. Rev.
Biotechnol. 10:179-90 (1990) and Below, et al., J. Chromatogr. A. 679:67-83 (1994). For therapeutic
purposes it is preferred that the GLP2-XTEN fusion proteins of the invention are substantially pure.
Thus, in a preferred embodiment of the invention the GLP2-XTEN of the invention is purified to at least
about 90 to 95% homogeneity, preferably to at least about 98% homogeneity. Purity may be assessed by,
e.g., gel electrophoresis, HPLC, and amino-terminal amino acid sequencing.
VIII). PHARMACEUTICAL COMPOSITIONS
The present invention provides pharmaceutical compositions comprising GLP2-XTEN. In one
embodiment, the pharmaceutical composition comprises a GLP2-XTEN fusion protein disclosed herein
and at least one pharmaceutically acceptable carrier. GLP2-XTEN polypeptides of the present invention
can be formulated according to known methods to prepare pharmaceutically useful compositions,
whereby the polypeptide is combined in admixture with a pharmaceutically acceptable carrier vehicle,
such as aqueous solutions, buffers, solvents and/or pharmaceutically acceptable suspensions, emulsions,
stabilizers or excipients. Examples of non-aqueous solvents include propylethylene glycol, polyethylene
glycol and vegetable oils. Formulations of the pharmaceutical compositions are prepared for storage by
mixing the active GLP2-XTEN ingredient having the desired degree of purity with optional
physiologically acceptable carriers, excipients (e.g., sodium chloride, a calcium salt, sucrose, or
polysorbate) or stabilizers (e.g., sucrose, trehalose, raffinose, arginine, a calcium salt, glycine or
histidine), as described in Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980), in the
form of lyophilized formulations or aqueous solutions.
In one embodiment, the pharmaceutical composition may be supplied as a lyophilized powder
to be reconstituted prior to administration. In another embodiment, the pharmaceutical composition may
be supplied in a liquid form, which can be administered directly to a patient. In another embodiment, the
composition is supplied as a liquid in a pre-filled syringe for administration of the composition. In
another embodiment, the composition is supplied as a liquid in a pre-filled vial that can be incorporated
into a pump.
The pharmaceutical compositions can be administered by any suitable means or route,
including subcutaneously, subcutaneously by infusion pump, intramuscularly, intravenously, or via the
pulmonary route. It will be appreciated that the preferred route will vary with the disease and age of the
recipient, and the severity of the condition being treated.
In one embodiment, the GLP2-XTEN pharmaceutical composition in liquid form or after
reconstitution (when supplied as a lyophilized powder) comprises GLP-2 linked to XTEN, which
composition is capable of increasing GLP-2-related activity to at least 10% of the normal GLP-2 plasma
level in the blood for at least about 72 hours, or at least about 96 hours, or at least about 120 hours, or at
least about 7 days, or at least about 10 days, or at least about 14 days, or at least about 21 days after
administration of the GLP-2 pharmaceutical composition to a subject in need. In another embodiment,
the GLP2-XTEN pharmaceutical composition in liquid form or after reconstitution (when supplied as a
lyophilized powder) and administration to a subject is capable of increasing GLP2-XTEN concentrations
to at least 500 ng/ml, or at least 1000 ng/ml, or at least about 2000 ng/ml, or at least about 3000 ng/ml, or
at least about 4000 ng/ml, or at least about 5000 ng/ml, or at least about 10000 ng/ml, or at least about
15000 ng/ml, or at least about 20000 ng/ml, or at least about 30000 ng/ml, or at least about 40000 ng/ml
for at least about 24 hours, or at least about 48 hours, or at least about 72 hours, or at least about 96
hours, or at least about 120 hours, or at least about 144 hours after administration of the GLP-2
pharmaceutical composition to a subject in need. It is specifically contemplated that the pharmaceutical
compositions of the foregoing embodiments in this paragraph can be formulated to include one or more
excipients, buffers or other ingredients known in the art to be compatible with administration by the
intravenous route or the subcutaneous route or the intramuscular route. Thus, in the embodiments
hereinabove described in this paragraph, the pharmaceutical composition is administered subcutaneously,
intramuscularly, or intravenously.
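The concentration-and-duration criteria in the preceding paragraph (for example, at least 500 ng/ml for at least about 24 hours) can be checked against sampled plasma data with a short calculation. The Python sketch below is illustrative only: the function name and the sample values are hypothetical and do not come from the specification, and the calculation counts only time between consecutive samples that both meet the threshold, without interpolating between samples.

```python
from typing import Iterable, Tuple


def longest_span_at_or_above(
    samples: Iterable[Tuple[float, float]], threshold_ng_ml: float
) -> float:
    """Longest continuous time span (hours) over which every sample meets the threshold.

    `samples` holds (time_h, concentration_ng_ml) pairs sorted by time.
    """
    best = 0.0
    run_start = None  # time at which the current qualifying run began
    for time_h, conc in samples:
        if conc >= threshold_ng_ml:
            if run_start is None:
                run_start = time_h
            best = max(best, time_h - run_start)
        else:
            run_start = None
    return best


if __name__ == "__main__":
    # Hypothetical values chosen purely for illustration; not measured data.
    hypothetical_samples = [
        (0.0, 100.0),
        (24.0, 2500.0),
        (48.0, 1800.0),
        (72.0, 900.0),
        (96.0, 400.0),
    ]
    span = longest_span_at_or_above(hypothetical_samples, threshold_ng_ml=500.0)
    print(f"Time at or above 500 ng/ml: {span:.0f} h")
    print("Meets a 24-hour criterion:", span >= 24.0)
```

Because no interpolation is performed, the result is a conservative estimate; denser sampling or a fitted pharmacokinetic model would give a finer figure.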
The compositions of the invention may be formulated using a variety of excipients. Suitable
excipients include microcrystalline cellulose (e.g. Avicel PH102, Avicel PH101), polymethacrylate,
poly(ethyl acrylate, methyl methacrylate, trimethylammonioethyl methacrylate chloride) (such as
Eudragit RS-30D), hydroxypropyl methylcellulose (Methocel K100M, Premium CR Methocel K100M,
Methocel E5, Opadry®), magnesium stearate, talc, triethyl citrate, aqueous ethylcellulose dispersion
(Surelease®), and protamine sulfate. The slow release agent may also comprise a carrier, which can
comprise, for example, solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic
and absorption delaying agents. Pharmaceutically acceptable salts can also be used in these slow release
agents, for example, mineral salts such as hydrochlorides, hydrobromides, phosphates, or sulfates, as well
as the salts of organic acids such as acetates, propionates, malonates, or benzoates. The composition may
also contain liquids, such as water, saline, glycerol, and ethanol, as well as substances such as wetting
agents, emulsifying agents, or pH buffering agents. Liposomes may also be used as a carrier.
In another embodiment, the compositions of the present invention are encapsulated in
liposomes, which have demonstrated utility in delivering beneficial active agents in a controlled manner
over prolonged periods of time. Liposomes are closed bilayer membranes containing an entrapped
aqueous volume. Liposomes may also be unilamellar vesicles possessing a single membrane bilayer or
multilamellar vesicles with multiple membrane bilayers, each separated from the next by an aqueous
layer. The structure of the resulting membrane bilayer is such that the hydrophobic (non-polar) tails of
the lipid are oriented toward the center of the bilayer while the hydrophilic (polar) heads orient towards
the aqueous phase. In one embodiment, the liposome may be coated with a flexible water soluble
polymer that avoids uptake by the organs of the mononuclear phagocyte system, primarily the liver and
spleen. Suitable hydrophilic polymers for surrounding the liposomes include, without limitation, PEG,
polyvinylpyrrolidone, polyvinylmethylether, polymethyloxazoline, polyethyloxazoline,
polyhydroxypropyloxazoline, polyhydroxypropylmethacrylamide, polymethacrylamide,
polydimethylacrylamide, polyhydroxypropylmethacrylate, polyhydroxyethylacrylate,
hydroxymethylcellulose, hydroxyethylcellulose, polyethyleneglycol, polyaspartamide and hydrophilic
peptide sequences as described in U.S. Pat. Nos. 6,316,024; 6,126,966; 6,056,973; 6,043,094, the
contents of which are incorporated by reference in their entirety.
Liposomes may be comprised of any lipid or lipid combination known in the art. For example,
the vesicle-forming lipids may be naturally-occurring or synthetic lipids, including phospholipids, such
as phosphatidylcholine, phosphatidylethanolamine, phosphatidic acid, phosphatidylserine,
phosphatidylglycerol, phosphatidylinositol, and sphingomyelin as disclosed in U.S. Pat. Nos. 6,056,973
and 5,874,104. The vesicle-forming lipids may also be glycolipids, cerebrosides, or cationic lipids, such
as 1,2-dioleyloxy-3-(trimethylamino) propane (DOTAP); N-[1-(2,3,-ditetradecyloxy)propyl]-N,N-
dimethyl-N-hydroxyethylammonium bromide (DMRIE); N-[1-(2,3,-dioleyloxy)propyl]-N,N-dimethyl-
N-hydroxy ethylammonium bromide (DORIE); N-[1-(2,3-dioleyloxy)propyl]-N,N,N-
trimethylammonium chloride (DOTMA); 3β-[N-(N',N'-dimethylaminoethane) carbamoyl] cholesterol
(DC-Chol); or dimethyldioctadecylammonium (DDAB) also as disclosed in U.S. Pat. No. 6,056,973.
Cholesterol may also be present in the proper range to impart stability to the vesicle as disclosed in U.S.
Pat. Nos. 5,916,588 and 5,874,104.
Additional liposomal technologies are described in U.S. Pat. Nos. 6,759,057; 6,406,713;
6,352,716; 6,316,024; 6,294,191; 6,126,966; 6,056,973; 6,043,094; 5,965,156; 5,916,588; 5,874,104;
680; and 4,684,479, the contents of which are incorporated herein by reference. These describe
liposomes and lipid-coated microbubbles, and methods for their manufacture. Thus, one skilled in the art,
considering both the disclosure of this invention and the disclosures of these other patents could produce
a liposome for the extended release of the polypeptides of the present invention.
For liquid formulations, a desired property is that the formulation be supplied in a form that can
pass through a 25, 28, 30, 31, or 32 gauge needle for intravenous, intramuscular, intraarticular, or
subcutaneous administration. In another embodiment, a desired property is that the formulation be
supplied in a form that can be nebulized into an aerosol of suitable particle size for inhalation therapy.
Osmotic pumps may be used as slow release agents in the form of tablets, pills, capsules or
implantable devices. Osmotic pumps are well known in the art and readily available to one of ordinary
skill in the art from companies experienced in providing osmotic pumps for extended release drug
delivery. Examples are ALZA's DUROS™; ALZA's OROS™; Osmotica Pharmaceutical's Osmodex™
system; Shire Laboratories' EnSoTrol™ system; and Alzet™. Patents that describe osmotic pump
technology are U.S. Pat. Nos. 6,890,918; 6,838,093; 6,814,979; 6,713,086; 090; 6,514,532;
6,361,796; 6,352,721; 6,294,201; 6,284,276; 6,110,498; 5,573,776; 4,200,0984; and 4,088,864, the
contents of which are incorporated herein by reference. One skilled in the art, considering both the
disclosure of this invention and the disclosures of these other patents could produce an osmotic pump for
the extended release of the polypeptides of the present invention.
Syringe pumps may also be used as slow release agents. Such devices are described in U.S.
Pat. Nos. 4,976,696; 4,933,185; 5,017,378; 6,309,370; 6,254,573; 4,435,173; 4,398,908; 6,572,585;
,298,022; 5,176,502; 534; 5,318,540; and 4,988,337, the contents of which are incorporated
herein by reference. One skilled in the art, considering both the disclosure of this invention and the
disclosures of these other patents could produce a syringe pump for the extended release of the
compositions of the present invention.
IX). PHARMACEUTICAL KITS
In another aspect, the invention provides a kit to facilitate the use of the GLP2-XTEN
polypeptides. The kit comprises the pharmaceutical composition provided herein, a label identifying the
pharmaceutical composition, and an instruction for storage, reconstitution and/or administration of the
pharmaceutical compositions to a subject. In some embodiments, the kit comprises, preferably: (a) an
amount of a GLP2-XTEN fusion protein composition sufficient to treat a gastrointestinal condition upon
administration to a subject in need thereof; (b) an amount of a pharmaceutically acceptable carrier; and
(c) together in a formulation ready for injection or for reconstitution with sterile water, buffer, or
dextrose; together with a label identifying the GLP2-XTEN drug and storage and handling conditions,
and a sheet of the approved indications for the drug, instructions for the reconstitution and/or
administration of the GLP2-XTEN drug for the use for the prevention and/or treatment of an approved
indication, appropriate dosage and safety information, and information identifying the lot and expiration
of the drug. In another embodiment of the foregoing, the kit can comprise a second container that can
carry a suitable diluent for the GLP2-XTEN composition, the use of which will provide the user with the
appropriate concentration of GLP2-XTEN to be delivered to the subject.
EXAMPLES
Example 1: Construction of XTEN_AD36 motif segments
The following example describes the construction of a collection of optimized genes
encoding motif sequences of 36 amino acids. As a first step, a stuffer vector pCW0359 was constructed
based on a pET vector that includes a T7 promoter. pCW0359 encodes a cellulose binding domain
(CBD) and a TEV protease recognition site followed by a stuffer sequence that is flanked by BsaI, BbsI,
and KpnI sites. The BsaI and BbsI sites were inserted such that they generate compatible overhangs after
digestion. The stuffer sequence is followed by a truncated version of the GFP gene and a His tag. The
stuffer sequence contains stop codons and thus E. coli cells carrying the stuffer plasmid pCW0359 form
non-fluorescent colonies. The stuffer vector pCW0359 was digested with BsaI and KpnI to remove the
stuffer segment and the resulting vector fragment was isolated by agarose gel purification. The
sequences were designated XTEN_AD36, reflecting the AD family of motifs. Its segments have the
amino acid sequence [X]3 where X is a 12mer peptide with the sequences: GESPGGSSGSES,
GSEGSSGPGESS, GSSESGSSEGGP, or GSGGEPSESGSS. The insert was obtained by annealing the
following pairs of phosphorylated synthetic oligonucleotides:
AD1for: AGGTGAATCTCCDGGTGGYTCYAGCGGTTCYGARTC
AD1rev: ACCTGAYTCRGAACCGCTRGARCCACCHGGAGATTC
AD2for: AGGTAGCGAAGGTTCTTCYGGTCCDGGYGARTCYTC
AD2rev: ACCTGARGAYTCRCCHGGACCRGAAGAACCTTCGCT
AD3for: AGGTTCYTCYGAAAGCGGTTCTTCYGARGGYGGTCC
AD3rev: ACCTGGACCRCCYTCRGAAGAACCGCTTTCRGARGA
AD4for: AGGTTCYGGTGGYGAACCDTCYGARTCTGGTAGCTC
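The relationship between a forward and a reverse oligonucleotide in this list can be checked with a short script. The following is a minimal sketch, not part of the original disclosure. It assumes the standard IUPAC degenerate-base codes and the standard genetic code, and it assumes that the reading frame starts at the second base of the forward oligonucleotide (GGT...), which is consistent with the nucleotide sequences in the tables of these Examples starting with GGT. The helper names `reverse_complement`, `expand` and `translate` are illustrative only.

```python
from itertools import product

# IUPAC degenerate nucleotide codes (standard definitions).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "W": "AT", "S": "CG",
    "D": "AGT", "H": "ACT", "B": "CGT", "V": "ACG", "N": "ACGT",
}
# Complement of every code, including degenerate ones (e.g. D <-> H).
COMPLEMENT = str.maketrans("ACGTRYKMWSDHBVN", "TGCAYRMKWSHDVBN")

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def expand(seq):
    """List every concrete sequence encoded by a degenerate sequence."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in seq))]


def translate(dna):
    """Translate complete codons of a non-degenerate DNA sequence."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


AD1_FOR = "AGGTGAATCTCCDGGTGGYTCYAGCGGTTCYGARTC"
AD1_REV = "ACCTGAYTCRGAACCGCTRGARCCACCHGGAGATTC"

# After the 4-nt 5' overhangs (AGGT / ACCT), the two strands should pair.
duplex_ok = reverse_complement(AD1_FOR[4:]) == AD1_REV[4:]

# Assumed frame: bases 2..34 of the forward oligo form 11 complete codons.
variants = expand(AD1_FOR[1:34])
peptides = {translate(v) for v in variants}
print(duplex_ok, len(variants), peptides)
```

Under these assumptions, every codon variant of AD1for encodes the first eleven residues of the GESPGGSSGSES motif; the twelfth codon is completed by the first base of the next ligated oligonucleotide.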
We also annealed the phosphorylated oligonucleotide 3KpnIstopperFor:
AGGTTCGTCTTCACTCGAGGGTAC and the non-phosphorylated oligonucleotide
pr_3KpnIstopperRev: CCTCGAGTGAAGACGA. The annealed oligonucleotide pairs were ligated,
which resulted in a mixture of products of varying length, reflecting the varying number of 12mer
repeats ligated to one BbsI/KpnI segment. The products corresponding to the length of 36 amino acids
were isolated from the mixture by preparative agarose gel electrophoresis and ligated into the BsaI/KpnI
digested stuffer vector pCW0359. Most of the clones in the resulting library, designated LCW0401,
showed green fluorescence after induction, which shows that the sequence of XTEN_AD36 had been
ligated in frame with the GFP gene and that most sequences of XTEN_AD36 had good expression levels.
We screened 96 isolates from library LCW0401 for high level of fluorescence by stamping
them onto agar plates containing IPTG. The same isolates were evaluated by PCR and 48 isolates were
identified that contained segments with 36 amino acids as well as strong fluorescence. These isolates
were sequenced and 39 clones were identified that contained correct XTEN_AD36 segments. The file
names of the nucleotide and amino acid constructs for these segments are listed in Table 8.
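The [X]3 structure described in this Example implies a small, enumerable set of amino acid sequences. The sketch below is an illustration only and not part of the original disclosure. It lists every 36-residue segment that can be assembled from the four AD 12mer motifs and splits an observed segment back into its motifs. The function names are hypothetical, and the test sequence is the amino acid sequence given for LCW0401_001 in Table 8.

```python
from itertools import product

# The four 12mer motifs of the AD family, as listed in Example 1.
AD_MOTIFS = ("GESPGGSSGSES", "GSEGSSGPGESS", "GSSESGSSEGGP", "GSGGEPSESGSS")


def all_segments(motifs, repeats=3):
    """Every sequence of the form [X]n with X drawn from the motifs."""
    return {"".join(combo) for combo in product(motifs, repeat=repeats)}


def split_into_motifs(segment, motifs, size=12):
    """Return the list of motifs making up a segment, or None if the
    segment is not an exact concatenation of known motifs."""
    if len(segment) % size != 0:
        return None
    parts = [segment[i:i + size] for i in range(0, len(segment), size)]
    return parts if all(p in motifs for p in parts) else None


segments = all_segments(AD_MOTIFS)

# Amino acid sequence listed for LCW0401_001 in Table 8.
lcw0401_001 = "GSGGEPSESGSSGESPGGSSGSESGESPGGSSGSES"
print(len(segments), split_into_motifs(lcw0401_001, AD_MOTIFS))
```

A check of this kind is one way a sequenced clone could be classified as containing a "correct" segment; the screening criteria actually applied are those stated in the text.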
Table 8: DNA and Amino Acid Sequences for 36-mer motifs
File name Amino acid sequence Nucleotide sequence
LCW0401_001_ GSGGEPSESGSSGESPGG GGTTCTGGTGGCGAACCGTCCGAGTCTGGTAGC
GFP-N_A01.abl SSGSESGESPGGSSGSES TCAGGTGAATCTCCGGGTGGCTCTAGCGGTTCC
GAGTCAGGTGAATCTCCTGGTGGTTCCAGCGGT
TCCGAGTCA
LCW0401_002_ GSEGSSGPGESSGESPGG GGTAGCGAAGGTTCTTCTGGTCCTGGCGAGTCT
B01.abl SSGSESGSSESGSSEGGP TCAGGTGAATCTCCTGGTGGTTCCAGCGGTTCT
GAATCAGGTTCCTCCGAAAGCGGTTCTTCCGAG
GGCGGTCCA
LCW0401_003_ GSSESGSSEGGPGSSESG GGTTCCTCTGAAAGCGGTTCTTCCGAAGGTGGT
GFP-N_C01.ab1 SSEGGPGESPGGSSGSES CCAGGTTCCTCTGAAAGCGGTTCTTCTGAGGGT
GGTGAATCTCCGGGTGGCTCCAGCGGT
TCCGAGTCA
LCW0401_004_ SESGSSGSSESG GGTTCCGGTGGCGAACCGTCTGAATCTGGTAGC
GFP-N_D01.abl SSEGGPGSGGEPSESGSS TCAGGTTCTTCTGAAAGCGGTTCTTCCGAGGGT
GGTCCAGGTTCTGGTGGTGAACCTTCCGAGTCT
GGTAGCTCA
LCW0401_007_ SSEGGPGSEGSS GGTTCTTCCGAAAGCGGTTCTTCTGAGGGTGGT
GFP-N_F01.abl GPGESSGSEGSSGPGESS CCAGGTAGCGAAGGTTCTTCCGGTCCAGGTGAG
GGTAGCGAAGGTTCTTCTGGTCCTGGT
GAATCTTCA
LCW0401_008_ GSSESGSSEGGPGESPGG GGTTCCTCTGAAAGCGGTTCTTCCGAGGGTGGT
GFP-N_G01.abl SSGSESGSEGSSGPGESS CCAGGTGAATCTCCAGGTGGTTCCAGCGGTTCT
GAGTCAGGTAGCGAAGGTTCTTCTGGTCCAGGT
GAATCCTCA
LCW0401_012_ GSGGEPSESGSSGSGGEP GGTTCTGGTGGTGAACCGTCTGAGTCTGGTAGC
GFP—N_H01.abl SESGSSGSEGSSGPGESS TCAGGTTCCGGTGGCGAACCATCCGAATCTGGT
GGTAGCGAAGGTTCTTCCGGTCCAGGT
GAGTCTTCA
LCW0401_015_ GSSESGSSEGGPGSEGSS GGTTCTTCCGAAAGCGGTTCTTCCGAAGGCGGT
GFP-N_A02.ab1 GPGESSGESPGGSSGSES CCAGGTAGCGAAGGTTCTTCTGGTCCAGGCGAA
TCTTCAGGTGAATCTCCTGGTGGCTCCAGCGGT
TCTGAGTCA
LCW0401_016_ GSSESGSSEGGPGSSESG GGTTCCTCCGAAAGCGGTTCTTCTGAGGGCGGT
GFP—N_B02.ab1 SSEGGPGSSESGSSEGGP CCAGGTTCCTCCGAAAGCGGTTCTTCCGAGGGC
GGTCCAGGTTCTTCTGAAAGCGGTTCTTCCGAG
GGCGGTCCA
LCW0401_020_ GSGGEPSESGSSGSEGSS GGTTCCGGTGGCGAACCGTCCGAATCTGGTAGC
GFP-N_E02.ab1 GPGESSGSSESGSSEGGP TCAGGTAGCGAAGGTTCTTCTGGTCCAGGCGAA
TCTTCAGGTTCCTCTGAAAGCGGTTCTTCTGAG
GGCGGTCCA
LCW0401_022_ GSGGEPSESGSSGSSESG GGTTCTGGTGGTGAACCGTCCGAATCTGGTAGC
GFP-N_F02.ab1 SSEGGPGSGGEPSESGSS TCAGGTTCTTCCGAAAGCGGTTCTTCTGAAGGT
GGTCCAGGTTCCGGTGGCGAACCTTCTGAATCT
GGTAGCTCA
LCW0401_024_ GSGGEPSESGSSGSSESG GGTTCTGGTGGCGAACCGTCCGAATCTGGTAGC
GFP-N_G02.ab1 SSEGGPGESPGGSSGSES TCCTCCGAAAGCGGTTCTTCTGAAGGT
GGTCCAGGTGAATCTCCAGGTGGTTCTAGCGGT
TCTGAATCA
LCW0401_026_ GSGGEPSESGSSGESPGG GGTTCTGGTGGCGAACCGTCTGAGTCTGGTAGC
GFP-N_H02.ab1 SSGSESGSEGSSGPGESS TCAGGTGAATCTCCTGGTGGCTCCAGCGGTTCT
GAATCAGGTAGCGAAGGTTCTTCTGGTCCTGGT
LCW0401_027_ GSGGEPSESGSSGESPGG GGTTCCGGTGGCGAACCTTCCGAATCTGGTAGC
GFP-N_A03.ab1 SSGSESGSGGEPSESGSS TCAGGTGAATCTCCGGGTGGTTCTAGCGGTTCT
GAGTCAGGTTCTGGTGGTGAACCTTCCGAGTCT
LCW0401_028_ GSSESGSSEGGPGSSESG GGTTCCTCTGAAAGCGGTTCTTCTGAGGGCGGT
GFP-N_BO3.ab1 SSEGGPGSSESGSSEGGP CCAGGTTCTTCCGAAAGCGGTTCTTCCGAGGGC
GGTCCAGGTTCTTCCGAAAGCGGTTCTTCTGAA
GGCGGTCCA
LCW0401_030_ GESPGGSSGSESGSEGSS GGTGAATCTCCGGGTGGCTCCAGCGGTTCTGAG
CO3.ab1 GPGESSGSEGSSGPGESS TCAGGTAGCGAAGGTTCTTCCGGTCCGGGTGAG
TCCTCAGGTAGCGAAGGTTCTTCCGGTCCTGGT
GAGTCTTCA
LCW0401_031_ GSGGEPSESGSSGSGGEP GGTTCTGGTGGCGAACCTTCCGAATCTGGTAGC
GFP-N_D03.ab1 SESGSSGSSESGSSEGGP TCAGGTTCCGGTGGTGAACCTTCTGAATCTGGT
AGCTCAGGTTCTTCTGAAAGCGGTTCTTCCGAG
GGCGGTCCA
LCW0401_033_ GSGGEPSESGSSGSGGEP GGTTCCGGTGGTGAACCTTCTGAATCTGGTAGC
GFP-N_E03.ab1 GSGGEPSESGSS TCAGGTTCCGGTGGCGAACCATCCGAGTCTGGT
AGCTCAGGTTCCGGTGGTGAACCATCCGAGTCT
GGTAGCTCA
LCW0401_037_ GSGGEPSESGSSGSSESG GGTTCCGGTGGCGAACCTTCTGAATCTGGTAGC
GFP-N_F03.ab1 SSEGGPGSEGSSGPGESS TCCTCCGAAAGCGGTTCTTCTGAGGGC
GGTAGCGAAGGTTCTTCTGGTCCGGGC
GAGTCTTCA
LCW0401_038_ GSGGEPSESGSSGSEGSS GGTTCCGGTGGTGAACCGTCCGAGTCTGGTAGC
GFP-N_G03.ab1 GPGESSGSGGEPSESGSS TCAGGTAGCGAAGGTTCTTCTGGTCCGGGTGAG
TCTTCAGGTTCTGGTGGCGAACCGTCCGAATCT
GGTAGCTCA
LCW0401_039_ SESGSSGESPGG GGTGGCGAACCGTCCGAATCTGGTAGC
GFP-N_H03.ab1 SSGSESGSGGEPSESGSS TCAGGTGAATCTCCTGGTGGTTCCAGCGGTTCC
GAGTCAGGTTCTGGTGGCGAACCTTCCGAATCT
GGTAGCTCA
LCW0401_040_ GSSESGSSEGGPGSGGEP TCCGAAAGCGGTTCTTCCGAGGGCGGT
GFP-N_A04.ab1 SESGSSGSSESGSSEGGP CCAGGTTCCGGTGGTGAACCATCTGAATCTGGT
AGCTCAGGTTCTTCTGAAAGCGGTTCTTCTGAA
GGTGGTCCA
LCW0401_042_ GSEGSSGPGESSGESPGG GGTAGCGAAGGTTCTTCCGGTCCTGGTGAGTCT
GFP-N_C04.ab1 SSGSESGSEGSSGPGESS TCAGGTGAATCTCCAGGTGGCTCTAGCGGTTCC
GAGTCAGGTAGCGAAGGTTCTTCTGGTCCTGGC
GAGTCCTCA
LCW0401_046_ GSSESGSSEGGPGSSESG GGTTCCTCTGAAAGCGGTTCTTCCGAAGGCGGT
GFP-N_D04.ab1 GSSESGSSEGGP CCAGGTTCTTCCGAAAGCGGTTCTTCTGAGGGC
GGTCCAGGTTCCTCCGAAAGCGGTTCTTCTGAG
GGTGGTCCA
LCW0401_047_ GSGGEPSESGSSGESPGG GGTTCTGGTGGCGAACCTTCCGAGTCTGGTAGC
GFP-N_EO4.ab1 SSGSESGESPGGSSGSES TCAGGTGAATCTCCGGGTGGTTCTAGCGGTTCC
GAGTCAGGTGAATCTCCGGGTGGTTCCAGCGGT
TCTGAGTCA
LCW0401_051_ GSGGEPSESGSSGSEGSS GGTGGCGAACCATCTGAGTCTGGTAGC
GFP-N_F04.ab1 GPGESSGESPGGSSGSES TCAGGTAGCGAAGGTTCTTCCGGTCCAGGCGAG
TCTTCAGGTGAATCTCCTGGTGGCTCCAGCGGT
TCTGAGTCA
LCW0401_053_ GESPGGSSGSESGESPGG GGTGAATCTCCTGGTGGTTCCAGCGGTTCCGAG
GFP-N_H04.ab1 SSGSESGESPGGSSGSES TCAGGTGAATCTCCAGGTGGCTCTAGCGGTTCC
GAGTCAGGTGAATCTCCTGGTGGTTCTAGCGGT
TCTGAATCA
LCW0401_054_ GSEGSSGPGESSGSEGSS GGTAGCGAAGGTTCTTCCGGTCCAGGTGAATCT
GFP-N_A05.ab1 GPGESSGSGGEPSESGSS TCAGGTAGCGAAGGTTCTTCTGGTCCTGGTGAA
TCCTCAGGTTCCGGTGGCGAACCATCTGAATCT
GGTAGCTCA
LCW0401_059_ GSGGEPSESGSSGSEGSS GGTTCTGGTGGCGAACCATCCGAATCTGGTAGC
GFP-N_D05.ab1 GPGESSGESPGGSSGSES TCAGGTAGCGAAGGTTCTTCTGGTCCTGGCGAA
TCTTCAGGTGAATCTCCAGGTGGCTCTAGCGGT
TCCGAATCA
LCW0401_060_ GSGGEPSESGSSGSSESG GGTTCCGGTGGTGAACCGTCCGAATCTGGTAGC
GFP-N_E05.ab1 SSEGGPGSGGEPSESGSS TCAGGTTCCTCTGAAAGCGGTTCTTCCGAGGGT
GGTCCAGGTTCCGGTGGTGAACCTTCTGAGTCT
GGTAGCTCA
LCW0401_061_ GSSESGSSEGGPGSGGEP GGTTCCTCTGAAAGCGGTTCTTCTGAGGGCGGT
GFP-N_F05.ab1 SESGSSGSEGSSGPGESS TCTGGTGGCGAACCATCTGAATCTGGT
AGCTCAGGTAGCGAAGGTTCTTCCGGTCCGGGT
1_063_ GSGGEPSESGSSGSEGSS GGTTCTGGTGGTGAACCGTCCGAATCTGGTAGC
GFP-N_H05.ab1 GPGESSGSEGSSGPGESS TCAGGTAGCGAAGGTTCTTCTGGTCCTGGCGAG
TCTTCAGGTAGCGAAGGTTCTTCTGGTCCTGGT
GAATCTTCA
1_066_ GSGGEPSESGSSGSSESG GGTTCTGGTGGCGAACCATCCGAGTCTGGTAGC
GFP-N_B06.ab1 SSEGGPGSGGEPSESGSS TCAGGTTCTTCCGAAAGCGGTTCTTCCGAAGGC
GGTCCAGGTTCTGGTGGTGAACCGTCCGAATCT
GGTAGCTCA
LCW0401_067_ GSGGEPSESGSSGESPGG GGTTCCGGTGGCGAACCTTCCGAATCTGGTAGC
GFP-N_C06.ab1 SSGSESGESPGGSSGSES TCAGGTGAATCTCCGGGTGGTTCTAGCGGTTCC
GAATCAGGTGAATCTCCAGGTGGTTCTAGCGGT
TCCGAATCA
LCW0401_069_ GSGGEPSESGSSGSGGEP GGTGGTGAACCATCTGAGTCTGGTAGC
GFP-N_D06.ab1 SESGSSGESPGGSSGSES TCAGGTTCCGGTGGCGAACCGTCCGAGTCTGGT
AGCTCAGGTGAATCTCCGGGTGGTTCCAGCGGT
TCCGAATCA
LCW0401_070_ GSEGSSGPGESSGSSESG GGTAGCGAAGGTTCTTCTGGTCCGGGCGAATCC
GFP-N_EO6.ab1 SSEGGPGSEGSSGPGESS TCCTCCGAAAGCGGTTCTTCCGAAGGT
GGTAGCGAAGGTTCTTCCGGTCCTGGT
GAATCTTCA
LCW0401_078_ GSSESGSSEGGPGESPGG GGTTCCTCTGAAAGCGGTTCTTCTGAAGGCGGT
GFP-N_F06.ab1 SSGSESGESPGGSSGSES GAATCTCCGGGTGGCTCCAGCGGTTCT
GAATCAGGTGAATCTCCTGGTGGCTCCAGCGGT
TCCGAGTCA
LCW0401_079_ GSEGSSGPGESSGSEGSS GGTAGCGAAGGTTCTTCTGGTCCAGGCGAGTCT
GFP-N_G06.abl GPGESSGSGGEPSESGSS TCAGGTAGCGAAGGTTCTTCCGGTCCTGGCGAG
TCTTCAGGTTCCGGTGGCGAACCGTCCGAATCT
GGTAGCTCA
Example 2: Construction of XTEN_AE36 segments
A codon library encoding XTEN sequences of 36 amino acid length was constructed. The
XTEN sequence was designated XTEN_AE36. Its segments have the amino acid sequence [X]3 where X
is a 12mer peptide with the sequence: GSPAGSPTSTEE, GSEPATSGSETP, GTSESATPESGP, or
GTSTEPSEGSAP. The insert was obtained by annealing the following pairs of phosphorylated synthetic
oligonucleotides:
AE1for: AGGTAGCCCDGCWGGYTCTCCDACYTCYACYGARGA
AE1rev: ACCTTCYTCRGTRGARGTHGGAGARCCWGCHGGGCT
AE2for: AGGTAGCGAACCKGCWACYTCYGGYTCTGARACYCC
AE2rev: ACCTGGRGTYTCAGARCCRGARGTWGCMGGTTCGCT
AE3for: AGGTACYTCTGAAAGCGCWACYCCKGARTCYGGYCC
AE3rev: ACCTGGRCCRGAYTCMGGRGTWGCGCTTTCAGARGT
AE4for: AGGTACYTCTACYGAACCKTCYGARGGYAGCGCWCC
AE4rev: ACCTGGWGCGCTRCCYTCRGAMGGTTCRGTAGARGT
We also annealed the phosphorylated oligonucleotide 3KpnIstopperFor:
AGGTTCGTCTTCACTCGAGGGTAC and the non-phosphorylated oligonucleotide
pr_3KpnIstopperRev: CCTCGAGTGAAGACGA. The annealed oligonucleotide pairs were ligated,
which resulted in a mixture of products of varying length, reflecting the varying number of 12mer
repeats ligated to one BbsI/KpnI segment. The products corresponding to the length of 36 amino acids
were isolated from the mixture by preparative agarose gel electrophoresis and ligated into the BsaI/KpnI
digested stuffer vector pCW0359. Most of the clones in the resulting library, designated LCW0402,
showed green fluorescence after induction, which shows that the sequence of XTEN_AE36 had been
ligated in frame with the GFP gene and that most sequences of XTEN_AE36 show good expression.
We screened 96 isolates from library LCW0402 for high level of fluorescence by stamping
them onto agar plates containing IPTG. The same isolates were evaluated by PCR and 48 isolates were
identified that contained segments with 36 amino acids as well as strong fluorescence. These isolates
were sequenced and 37 clones were identified that contained correct XTEN_AE36 segments. The file
names of the nucleotide and amino acid constructs for these segments are listed in Table 9.
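A sequenced clone can be checked against the design by translating its nucleotide sequence and confirming that the result is built from the four AE motifs. The following is a minimal sketch under the assumption of the standard genetic code; it is not the procedure used by the inventors. The nucleotide and amino acid strings are those listed for LCW0402_002 in Table 9, and the function names are illustrative.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# The four 12mer motifs of the AE family, as listed in Example 2.
AE_MOTIFS = ("GSPAGSPTSTEE", "GSEPATSGSETP", "GTSESATPESGP", "GTSTEPSEGSAP")


def translate(dna):
    """Translate a DNA sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_correct_segment(protein, motifs, size=12, repeats=3):
    """True if the protein is exactly `repeats` motifs joined end to end."""
    if len(protein) != size * repeats:
        return False
    return all(protein[i:i + size] in motifs for i in range(0, len(protein), size))


# Entry LCW0402_002 from Table 9 (nucleotide column, joined across lines).
dna = (
    "GGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAA"
    "GGTACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCA"
    "GGTACCTCTACCGAACCGTCTGAGGGCAGCGCACCA"
)
listed_protein = "GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP"

protein = translate(dna)
print(protein == listed_protein, is_correct_segment(protein, AE_MOTIFS))
```

Because `translate` rejects sequences whose length is not a multiple of three, a table row with a truncated nucleotide column raises an error instead of being translated silently out of frame.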
Table 9: DNA and Amino Acid Sequences for 36-mer motifs
File name Amino acid sequence Nucleotide sequence
LCW0402_002_ GSPAGSPTSTEEGTSE GGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAA
GFP-N_A07 . ab 1 SATPESGPGTSTEPSE GGTACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCA
GSAP GGTACCTCTACCGAACCGTCTGAGGGCAGCGCACCA
LCW0402_003_ GTSTEPSEGSAPGTST GGTACTTCTACCGAACCGTCCGAAGGCAGCGCTCCA
GFP-N_BO7 . ab 1 EPSEGSAPGTSTEPSE TCTACTGAACCTTCCGAGGGCAGCGCTCCA
GSAP GGTACCTCTACCGAACCTTCTGAAGGTAGCGCACCA
Lcw0402_004_ SEGSAPGTSE GGTACCTCTACCGAACCGTCTGAAGGTAGCGCACCA
GFP-\_C07.ab1 SATPESGPGTSESATP GGTACCTCTGAAAGCGCAACTCCTGAGTCCGGTCCA
ESGP GGTACTTCTGAAAGCGCAACCCCGGAGTCTGGCCCA
LCW0402_005_ GTSTEPSEGSAPGTSE GGTACTTCTACTGAACCGTCTGAAGGTAGCGCACCA
GFP—\_D07.ab1 SATPESGPGTSESATP GGTACTTCTGAAAGCGCAACCCCGGAATCCGGCCCA
ESGP GGTACCTCTGAAAGCGCAACCCCGGAGTCCGGCCCA
LCW0402_006_ SGSETPGTSE GGTAGCGAACCGGCAACCTCCGGCTCTGAAACCCCA
EO7.ab1 SATPESGPGSPAGSPT GGTACCTCTGAAAGCGCTACTCCTGAATCCGGCCCA
STEE GGTAGCCCGGCAGGTTCTCCGACTTCCACTGAGGAA
LCW0402_008_ GTSESATPESGPGSEP TCTGAAAGCGCAACCCCTGAATCCGGTCCA
GFP-\_F07.ab1 TPGTSTEPSE GGTAGCGAACCGGCTACTTCTGGCTCTGAGACTCCA
GSAP GGTACTTCTACCGAACCGTCCGAAGGTAGCGCACCA
LCW0402_009_ GSPAGSPTSTEEGSPA GGTAGCCCGGCTGGCTCTCCAACCTCCACTGAGGAA
GFP—\_G07.ab1 GSPTSTEEGSEPATSG GGTAGCCCGGCTGGCTCTCCAACCTCCACTGAAGAA
SETP GGTAGCGAACCGGCTACCTCCGGCTCTGAAACTCCA
LCW0402_011_ GSPAGSPTSTEEGTSE GGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAA
GFP-\_A08.ab1 GPGTSTEPSE GGTACTTCTGAAAGCGCTACTCCTGAGTCTGGTCCA
GSAP GGTACCTCTACTGAACCGTCCGAAGGTAGCGCTCCA
Lcw0402_012_ PTSTEEGSPA GGTAGCCCTGCTGGCTCTCCGACTTCTACTGAGGAA
B08eab1 GSPTSTEEGTSTEPSE GGTAGCCCGGCTGGTTCTCCGACTTCTACTGAGGAA
GSAP GGTACTTCTACCGAACCTTCCGAAGGTAGCGCTCCA
LCW0402_013_ GTSESATPESGPGTST GGTACTTCTGAAAGCGCTACTCCGGAGTCCGGTCCA
GFP-\_C08.ab1 EPSEGSAPGTSTEPSE GGTACCTCTACCGAACCGTCCGAAGGCAGCGCTCCA
GSAP GGTACTTCTACTGAACCTTCTGAGGGTAGCGCTCCA
LCW0402_014_ GTSTEPSEGSAPGSPA GGTACCTCTACCGAACCTTCCGAAGGTAGCGCTCCA
GFP-\_D08.ab1 GSPTSTEEGTSTEPSE GGTAGCCCGGCAGGTTCTCCTACTTCCACTGAGGAA
GSAP GGTACTTCTACCGAACCTTCTGAGGGTAGCGCACCA
LCW0402_015_ GSEPATSGSETPGSPA GGTAGCGAACCGGCTACTTCCGGCTCTGAGACTCCA
GFP-\_E08.ab1 GSPTSTEEGTSESATP GGTAGCCCTGCTGGCTCTCCGACCTCTACCGAAGAA
ESGP GGTACCTCTGAAAGCGCTACCCCTGAGTCTGGCCCA
2_016_ GTSTEPSEGSAPGTSE GGTACTTCTACCGAACCTTCCGAGGGCAGCGCACCA
GFP-\_F08.ab1 SATPESGPGTSESATP GGTACTTCTGAAAGCGCTACCCCTGAGTCCGGCCCA
ESGP GGTACTTCTGAAAGCGCTACTCCTGAATCCGGTCCA
2_020_ GTSTEPSEGSAPGSEP GGTACTTCTACTGAACCGTCTGAAGGCAGCGCACCA
GFP-\7G08.ab1 ATSGSETPGSPAGSPT GGTAGCGAACCGGCTACTTCCGGTTCTGAAACCCCA
STEE GGTAGCCCAGCAGGTTCTCCAACTTCTACTGAAGAA
Lcw0402_023_ GSPAGSPTSTEEGTSE GGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGAA
GFP-\_A09.ab1 SATPESGPGSEPATSG GGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCA
SETP GGTAGCGAACCGGCAACCTCCGGTTCTGAAACCCCA
LCW0402_024_ GTSESATPESGPGSPA GGTACTTCTGAAAGCGCTACTCCTGAGTCCGGCCCA
GFP-\_B09.ab1 GSPTSTEEGSPAGSPT GGTAGCCCGGCTGGCTCTCCGACTTCCACCGAGGAA
STEE GGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGAA
2_025_ GTSTEPSEGSAPGTSE GGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCA
GFP-\7CO9.ab1 SATPESGPGTSTEPSE GGTACTTCTGAAAGCGCTACCCCGGAGTCCGGTCCA
GSAP GGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCA
LCW0402_026_ GSPAGSPTSTEEGTST GGTAGCCCGGCAGGCTCTCCGACTTCCACCGAGGAA
GFP-\_D09.ab1 EPSEGSAPGSEPATSG GGTACCTCTACTGAACCTTCTGAGGGTAGCGCTCCA
SETP GGTAGCGAACCGGCAACCTCTGGCTCTGAAACCCCA
LCW0402_027_ GSPAGSPTSTEEGTST GGTAGCCCAGCAGGCTCTCCGACTTCCACTGAGGAA
GFP-\_E09.ab1 EPSEGSAPGTSTEPSE GGTACTTCTACTGAACCTTCCGAAGGCAGCGCACCA
GSAP GGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCA
LCW0402_032_ GSEPATSGSETPGTSE GAACCTGCTACCTCCGGTTCTGAAACCCCA
GFP—\_H09.ab1 SATPESGPGSPAGSPT GGTACCTCTGAAAGCGCAACTCCGGAGTCTGGTCCA
STEE GGTAGCCCTGCAGGTTCTCCTACCTCCACTGAGGAA
Lcw0402_034_ GTSESATPESGPGTST GGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCCCA
GFP-\_A10.ab1 EPSEGSAPGTSTEPSE GGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCA
GSAP GGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCA
2_036_ GSPAGSPTSTEEGTST GGTAGCCCGGCTGGTTCTCCGACTTCCACCGAGGAA
GFP-\_C10.ab1 EPSEGSAPGTSTEPSE GGTACCTCTACTGAACCTTCTGAGGGTAGCGCTCCA
GSAP GGTACCTCTACTGAACCTTCCGAAGGCAGCGCTCCA
LCW0402_039_ GTSTEPSEGSAPGTST GGTACTTCTACCGAACCGTCCGAGGGCAGCGCTCCA
GFP-\_E10.abl EPSEGSAPGTSTEPSE GGTACTTCTACTGAACCTTCTGAAGGCAGCGCTCCA
GSAP GGTACTTCTACTGAACCTTCCGAAGGTAGCGCACCA
LCW0402_040_ GSEPATSGSETPGTSE GGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCA
GFP-\_F10.ab1 SATPESGPGTSTEPSE GGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCA
GSAP TCTACTGAACCGTCCGAGGGCAGCGCACCA
LCW0402_041_ GTSTEPSEGSAPGSPA GGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCA
GFP—\_G10.abl GSPTSTEEGTSTEPSE GGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGAA
GSAP GGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCA
LCW0402_050_ GSEPATSGSETPGTSE GGTAGCGAACCGGCAACCTCCGGCTCTGAAACTCCA
GFP-\_Al l.abl SATPESGPGSEPATSG GGTACTTCTGAAAGCGCTACTCCGGAATCCGGCCCA
SETP GGTAGCGAACCGGCTACTTCCGGCTCTGAAACCCCA
LCW0402_051_ GSEPATSGSETPGTSE GGTAGCGAACCGGCAACTTCCGGCTCTGAAACCCCA
GFP-\_B11.ab1 SATPESGPGSEPATSG GGTACTTCTGAAAGCGCTACTCCTGAGTCTGGCCCA
SETP GGTAGCGAACCTGCTACCTCTGGCTCTGAAACCCCA
LCW0402_059_ GSEPATSGSETPGSEP GGTAGCGAACCGGCAACCTCTGGCTCTGAAACTCCA
El l.ab1 ATSGSETPGTSTEPSE GAACCTGCAACCTCCGGCTCTGAAACCCCA
GSAP GGTACTTCTACTGAACCTTCTGAGGGCAGCGCACCA
LCW0402_060_ GTSESATPESGPGSEP GGTACTTCTGAAAGCGCTACCCCGGAATCTGGCCCA
GFP-\_Fl l.ab1 ATSGSETPGSEPATSG GGTAGCGAACCGGCTACTTCTGGTTCTGAAACCCCA
SETP GGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCCA
LCW0402_061_ GTSTEPSEGSAPGTST GGTACCTCTACTGAACCTTCCGAAGGCAGCGCTCCA
GFP-\_G1 labl EPSEGSAPGTSESATP GGTACCTCTACCGAACCGTCCGAGGGCAGCGCACCA
ESGP GGTACTTCTGAAAGCGCAACCCCTGAATCCGGTCCA
2_065_ GSEPATSGSETPGTSE GGTAGCGAACCGGCAACCTCTGGCTCTGAAACCCCA
GFP-\_A12.abl SATPESGPGTSESATP GGTACCTCTGAAAGCGCTACTCCGGAATCTGGTCCA
ESGP GGTACTTCTGAAAGCGCTACTCCGGAATCCGGTCCA
LCW0402_066_ GSEPATSGSETPGSEP GGTAGCGAACCTGCTACCTCCGGCTCTGAAACTCCA
GFP-\_B12.abl TPGTSTEPSE GGTAGCGAACCGGCTACTTCCGGTTCTGAAACTCCA
GSAP TCTACCGAACCTTCCGAAGGCAGCGCACCA
LCW0402_067_ GSEPATSGSETPGTST GGTAGCGAACCTGCTACTTCTGGTTCTGAAACTCCA
C12‘abl EPSEGSAPGSEPATSG GGTACTTCTACCGAACCGTCCGAGGGTAGCGCTCCA
SETP GGTAGCGAACCTGCTACTTCTGGTTCTGAAACTCCA
LCW0402_069_ GTSTEPSEGSAPGTST GGTACCTCTACCGAACCGTCCGAGGGTAGCGCACCA
GFP-\_D12.abl EPSEGSAPGSEPATSG GGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCA
SETP GGTAGCGAACCGGCAACCTCCGGTTCTGAAACTCCA
LCW0402_073_ GTSTEPSEGSAPGSEP GGTACTTCTACTGAACCTTCCGAAGGTAGCGCTCCA
GFP-\_F12.abl ATSGSETPGSPAGSPT GGTAGCGAACCTGCTACTTCTGGTTCTGAAACCCCA
STEE GGTAGCCCGGCTGGCTCTCCGACCTCCACCGAGGAA
LCW0402_074_ GSEPATSGSETPGSPA GGTAGCGAACCGGCTACTTCCGGCTCTGAGACTCCA
GFP-\_G12.abl GSPTSTEEGTSESATP GGTAGCCCAGCTGGTTCTCCAACCTCTACTGAGGAA
ESGP GGTACTTCTGAAAGCGCTACCCCTGAATCTGGTCCA
LCW0402_075_ GTSESATPESGPGSEP GGTACCTCTGAAAGCGCAACTCCTGAGTCTGGCCCA
GFP-\_H12.abl ATSGSETPGTSESATP GGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCA
ESGP GGTACCTCTGAAAGCGCAACCCCGGAATCTGGTCCA
Example 3: Construction of XTEN_AF36 segments
A codon library encoding sequences of 36 amino acid length was constructed. The sequences
were designated XTEN_AF36. Its segments have the amino acid sequence [X]3 where X is a 12mer
peptide with the sequence: GSTSESPSGTAP, GTSTPESGSASP, GTSPSGESSTAP, or
GSTSSTAESPGP. The insert was obtained by annealing the following pairs of phosphorylated synthetic
oligonucleotides:
AF1for: AGGTTCTACYAGCGAATCYCCKTCTGGYACYGCWCC
AF1rev: WGCRGTRCCAGAMGGRGATTCGCTRGTAGA
AF2for: AGGTACYTCTACYCCKGAAAGCGGYTCYGCWTCTCC
AF2rev: ACCTGGAGAWGCRGARCCGCTTTCMGGRGTAGARGT
AF3for: AGGTACYTCYCCKAGCGGYGAATCTTCTACYGCWCC
AF3rev: ACCTGGWGCRGTAGAAGATTCRCCGCTMGGRGARGT
AF4for: AGGTTCYACYAGCTCTACYGCWGAATCTCCKGGYCC
AF4rev: ACCTGGRCCMGGAGATTCWGCRGTAGAGCTRGTRGA
We also annealed the phosphorylated oligonucleotide 3KpnIstopperFor:
AGGTTCGTCTTCACTCGAGGGTAC and the non-phosphorylated oligonucleotide
pr_3KpnIstopperRev: CCTCGAGTGAAGACGA. The annealed oligonucleotide pairs were ligated,
which resulted in a mixture of products of varying length, reflecting the varying number of 12mer
repeats ligated to one BbsI/KpnI segment. The products corresponding to the length of 36 amino acids
were isolated from the mixture by preparative agarose gel electrophoresis and ligated into the BsaI/KpnI
digested stuffer vector pCW0359. Most of the clones in the resulting library, designated LCW0403,
showed green fluorescence after induction, which shows that the sequence of XTEN_AF36 had been
ligated in frame with the GFP gene and that most sequences of XTEN_AF36 show good expression.
We screened 96 isolates from library LCW0403 for high level of fluorescence by stamping
them onto agar plates containing IPTG. The same isolates were evaluated by PCR and 48 isolates were
identified that contained segments with 36 amino acids as well as strong fluorescence. These isolates
were sequenced and 44 clones were identified that contained correct XTEN_AF36 segments. The file
names of the nucleotide and amino acid constructs for these segments are listed in Table 10.
Table 10: DNA and Amino Acid Sequences for 36-mer motifs
File name Amino acid sequence Nucleotide sequence
LCW0403_004_ GTSTPESGSASPGTSP GGTACTTCTACTCCGGAAAGCGGTTCCGCATCTCCA
GFP-\_A0 l i ab 1 SGESSTAPGTSPSGES GGTACTTCTCCTAGCGGTGAATCTTCTACTGCTCCAG
STAP GTACCTCTCCTAGCGGCGAATCTTCTACTGCTCCA
LCW0403_005_ GTSPSGESSTAPGSTS GGTACTTCTCCGAGCGGTGAATCTTCTACCGCACCA
B0 l . ab 1 STABSPGPGTSPSGES GGTTCTACTAGCTCTACCGCTGAATCTCCGGGCCCAG
STAP CTCCGAGCGGTGAATCTTCTACTGCTCCA
LCW0403_006_ AESPGPGTSP GGTTCCACCAGCTCTACTGCTGAATCTCCTGGTCCAG
GFP-\iCO l .abl SGESSTAPGTSTPESG GTACCTCTCCTAGCGGTGAATCTTCTACTGCTCCAGG
SASP TACTTCTACTCCTGAAAGCGGCTCTGCTTCTCCA
LCW0403_007_ GSTSSTAESPGPGSTS ACCAGCTCTACTGCAGAATCTCCTGGCCCAG
GFP-\_D01.abl STAESPGPGTSPSGES GTTCCACCAGCTCTACCGCAGAATCTCCGGGTCCAG
STAP GTACTTCCCCTAGCGGTGAATCTTCTACCGCACCA
LCW0403_008_ GSTSSTAESPGPGTSP GGTTCTACTAGCTCTACTGCTGAATCTCCTGGCCCAG
GFP-\_E01.abl SGESSTAPGTSTPESG GTACTTCTCCTAGCGGTGAATCTTCTACCGCTCCAGG
SASP TACCTCTACTCCGGAAAGCGGTTCTGCATCTCCA
3_010_ GSTSSTAESPGPGTST GGTTCTACCAGCTCTACCGCAGAATCTCCTGGTCCAG
GFP-\_F01.abl PESGSASPGSTSESPS GTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAG
GTAP GTTCTACTAGCGAATCTCCTTCTGGCACTGCACCA
3_011_ GSTSSTAESPGPGTST GGTTCTACTAGCTCTACTGCAGAATCTCCTGGCCCAG
GFP-\_G01.abl PESGSASPGTSTPESG GTACCTCTACTCCGGAAAGCGGCTCTGCATCTCCAG
SASP GTACTTCTACCCCTGAAAGCGGTTCTGCATCTCCA
LCW0403_012_ GSTSESPSGTAPGTSP GGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAG
GFP-\_H01.abl SGESSTAPGSTSESPS GTACCTCTCCTAGCGGCGAATCTTCTACCGCTCCAGG
GTAP TTCTACTAGCGAATCTCCTTCTGGCACTGCACCA
LCW0403_013_ GSTSSTAESPGPGSTS GGTTCCACCAGCTCTACTGCAGAATCTCCGGGCCCA
GFP-\_A02.ab1 STAESPGPGTSPSGES GGTTCTACTAGCTCTACTGCAGAATCTCCGGGTCCAG
STAP GTACTTCTCCTAGCGGCGAATCTTCTACCGCTCCA
LCW0403_014_ GSTSSTAESPGPGTST GGTTCCACTAGCTCTACTGCAGAATCTCCTGGCCCAG
GFP-\_B02.ab1 PESGSASPGSTSESPS GTACCTCTACCCCTGAAAGCGGCTCTGCATCTCCAG
GTAP GTTCTACCAGCGAATCCCCGTCTGGCACCGCACCA
LCW0403_015_ AESPGPGSTS ACTAGCTCTACTGCTGAATCTCCGGGTCCAG
GFP—\_C02.ab1 STAESPGPGTSPSGES GTTCTACCAGCTCTACTGCTGAATCTCCTGGTCCAGG
STAP TACCTCCCCGAGCGGTGAATCTTCTACTGCACCA
LCW0403_017_ GSTSSTAESPGPGSTS GGTTCTACCAGCTCTACCGCTGAATCTCCTGGCCCAG
D02.ab1 ESPSGTAPGSTSSTAE GTTCTACCAGCGAATCCCCGTCTGGCACCGCACCAG
SPGP GTTCTACTAGCTCTACCGCTGAATCTCCGGGTCCA
LCW0403_018_ GSTSSTAESPGPGSTS GGTTCTACCAGCTCTACCGCAGAATCTCCTGGCCCA
GFP-\_E02.ab1 STAESPGPGSTSSTAE GGTTCCACTAGCTCTACCGCTGAATCTCCTGGTCCAG
SPGP GTTCTACTAGCTCTACCGCTGAATCTCCTGGTCCA
3_019_ GSTSESPSGTAPGSTS GGTTCTACTAGCGAATCCCCTTCTGGTACTGCTCCAG
GFP—\_F02.ab1 STAESPGPGSTSSTAE GTTCCACTAGCTCTACCGCTGAATCTCCTGGCCCAGG
SPGP TAGCTCTACTGCAGAATCTCCTGGTCCA
LCW0403_023_ GSTSESPSGTAPGSTS GGTTCTACTAGCGAATCTCCTTCTGGTACCGCTCCAG
GFP-\_H02.ab1 ESPSGTAPGSTSESPS GTTCTACCAGCGAATCCCCGTCTGGTACTGCTCCAGG
GTAP TTCTACCAGCGAATCTCCTTCTGGTACTGCACCA
3_024_ GSTSSTAESPGPGSTS GGTTCCACCAGCTCTACTGCTGAATCTCCTGGCCCAG
GFP-\_A03.ab1 STAESPGPGSTSSTAE GTTCTACCAGCTCTACTGCTGAATCTCCGGGCCCAGG
SPGP TTCCACCAGCTCTACCGCTGAATCTCCGGGTCCA
LCW0403_025_ GSTSSTAESPGPGSTS GGTTCCACTAGCTCTACCGCAGAATCTCCTGGTCCAG
GFP-\_B03.ab1 STAESPGPGTSPSGES CTAGCTCTACTGCTGAATCTCCGGGTCCAGG
STAP TACCTCCCCTAGCGGCGAATCTTCTACCGCTCCA
LCW0403_028_ GSSPSASTGTGPGSST GGTTCTAGCCCTTCTGCTTCCACCGGTACCGGCCCAG
D03.ab1 PSGATGSPGSSTPSGA CTACTCCGTCTGGTGCAACTGGCTCTCCAGG
TGSP TAGCTCTACTCCGTCTGGTGCAACCGGCTCCCCA
LCW0403_029_ GTSPSGESSTAPGTST GGTACTTCCCCTAGCGGTGAATCTTCTACTGCTCCAG
GFP-\_E03.ab1 PESGSASPGSTSSTAE GTACCTCTACTCCGGAAAGCGGCTCCGCATCTCCAG
SPGP GTTCTACTAGCTCTACTGCTGAATCTCCTGGTCCA
LCW0403_030_ GSTSSTAESPGPGSTS GGTTCTACTAGCTCTACCGCTGAATCTCCGGGTCCAG
GFP-\_F03.ab1 STAESPGPGTSTPESG GTTCTACCAGCTCTACTGCAGAATCTCCTGGCCCAGG
SASP TACTTCTACTCCGGAAAGCGGTTCCGCTTCTCCA
3_031_ GTSPSGESSTAPGSTS GGTACTTCTCCTAGCGGTGAATCTTCTACCGCTCCAG
GFP-\_G03.ab1 STAESPGPGTSTPESG GTTCTACCAGCTCTACTGCTGAATCTCCTGGCCCAGG
SASP TACTTCTACCCCGGAAAGCGGCTCCGCTTCTCCA
3_033_ GSTSESPSGTAPGSTS GGTTCTACTAGCGAATCCCCTTCTGGTACTGCACCAG
GFP-\_H03.ab1 GPGSTSSTAE GTTCTACCAGCTCTACTGCTGAATCTCCGGGCCCAGG
SPGP TTCCACCAGCTCTACCGCAGAATCTCCTGGTCCA
LCW0403_035_ GSTSSTAESPGPGSTS ACCAGCTCTACCGCTGAATCTCCGGGCCCA
GFP-\_A04.ab1 ESPSGTAPGSTSSTAE GGTTCTACCAGCGAATCCCCTTCTGGCACTGCACCA
SPGP GGTTCTACTAGCTCTACCGCAGAATCTCCGGGCCCA
LCW0403_036_ GSTSSTAESPGPGTSP GGTTCTACCAGCTCTACTGCTGAATCTCCGGGTCCAG
GFP-\7BO4.ab1 SGESSTAPGTSTPESG GTACTTCCCCGAGCGGTGAATCTTCTACTGCACCAG
SASP GTACTTCTACTCCGGAAAGCGGTTCCGCTTCTCCA
LCW0403_039_ GSTSESPSGTAPGSTS GGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAG
GFP-\_C04.ab1 APGTSPSGES GTTCTACTAGCGAATCCCCGTCTGGTACCGCACCAG
STAP GTACTTCTCCTAGCGGCGAATCTTCTACCGCACCA
LCW0403_041_ GSTSESPSGTAPGSTS GGTTCTACCAGCGAATCCCCTTCTGGTACTGCTCCAG
GFP-\_D04.ab1 ESPSGTAPGTSTPESG GTTCTACCAGCGAATCCCCTTCTGGCACCGCACCAG
SASP GTACTTCTACCCCTGAAAGCGGCTCCGCTTCTCCA
LCW0403_044_ GTSTPESGSASPGSTS GGTACCTCTACTCCTGAAAGCGGTTCTGCATCTCCAG
GFP-\_EO4.ab1 STAESPGPGSTSSTAE GTTCCACTAGCTCTACCGCAGAATCTCCGGGCCCAG
SPGP GTTCTACTAGCTCTACTGCTGAATCTCCTGGCCCA
LCW0403_046_ GSTSESPSGTAPGSTS GGTTCTACCAGCGAATCCCCTTCTGGCACTGCACCA
GFP-\_F04.ab1 ESPSGTAPGTSPSGES GGTTCTACTAGCGAATCCCCTTCTGGTACCGCACCAG
STAP GTACTTCTCCGAGCGGCGAATCTTCTACTGCTCCA
LCW0403_047_ GSTSSTAESPGPGSTS GGTTCTACTAGCTCTACCGCTGAATCTCCTGGCCCAG
GFP-\_G04.ab1 STAESPGPGSTSESPS CTAGCTCTACCGCAGAATCTCCGGGCCCAG
GTAP GTTCTACTAGCGAATCCCCTTCTGGTACCGCTCCA
LCW0403_049_ GSTSSTAESPGPGSTS GGTTCCACCAGCTCTACTGCAGAATCTCCTGGCCCA
GFP-\_H04.ab1 STAESPGPGTSTPESG GGTTCTACTAGCTCTACCGCAGAATCTCCTGGTCCAG
SASP GTACCTCTACTCCTGAAAGCGGTTCCGCATCTCCA
LCW0403_051_ GSTSSTAESPGPGSTS GGTTCTACTAGCTCTACTGCTGAATCTCCGGGCCCAG
GFP-\_A05.ab1 STAESPGPGSTSESPS GTTCTACTAGCTCTACCGCTGAATCTCCGGGTCCAGG
GTAP TTCTACTAGCGAATCTCCTTCTGGTACCGCTCCA
LCW0403_053_ ESSTAPGSTS GGTACCTCCCCGAGCGGTGAATCTTCTACTGCACCA
GFP-\_B05‘ab1 ESPSGTAPGSTSSTAE GGTTCTACTAGCGAATCCCCTTCTGGTACTGCTCCAG
SPGP GTTCCACCAGCTCTACTGCAGAATCTCCGGGTCCA
LCW0403_054_ PSGTAPGTSP GGTTCTACTAGCGAATCCCCGTCTGGTACTGCTCCAG
GFP-\_C05.ab1 SGESSTAPGSTSSTAE GTACTTCCCCTAGCGGTGAATCTTCTACTGCTCCAGG
SPGP TTCTACCAGCTCTACCGCAGAATCTCCGGGTCCA
3_057_ GSTSSTAESPGPGSTS GGTTCTACCAGCTCTACCGCTGAATCTCCTGGCCCAG
GFP-\_D05.ab1 ESPSGTAPGTSPSGES GTTCTACTAGCGAATCTCCGTCTGGCACCGCACCAG
STAP GTACTTCCCCTAGCGGTGAATCTTCTACTGCACCA
LCW0403_058_ GSTSESPSGTAPGSTS ACTAGCGAATCTCCTTCTGGCACTGCACCAG
GFP-\_E05.ab1 ESPSGTAPGTSTPESG GTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAG
SASP GTACCTCTACCCCTGAAAGCGGTTCCGCTTCTCCA
LCW0403_060_ GTSTPESGSASPGSTS GGTACCTCTACTCCGGAAAGCGGTTCCGCATCTCCA
GFP-\_F05.ab1 ESPSGTAPGSTSSTAE GGTTCTACCAGCGAATCCCCGTCTGGCACCGCACCA
SPGP GGTTCTACTAGCTCTACTGCTGAATCTCCGGGCCCA
LCW0403_063_ GSTSSTAESPGPGTSP GGTTCTACTAGCTCTACTGCAGAATCTCCGGGCCCA
GFP-\7G05.ab1 APGTSPSGES GGTACCTCTCCTAGCGGTGAATCTTCTACCGCTCCAG
STAP CTCCGAGCGGTGAATCTTCTACCGCTCCA
LCW0403_064_ ESSTAPGTSP GGTACCTCCCCTAGCGGCGAATCTTCTACTGCTCCAG
GFP-\_H05.ab1 SGESSTAPGTSPSGES CTCCTAGCGGCGAATCTTCTACCGCTCCAGG
STAP CCCTAGCGGTGAATCTTCTACCGCACCA
LCW0403_065_ GSTSSTAESPGPGTST ACTAGCTCTACTGCTGAATCTCCTGGCCCAG
GFP-\_A06.ab1 PESGSASPGSTSESPS GTACTTCTACTCCGGAAAGCGGTTCCGCTTCTCCAGG
GTAP TTCTACTAGCGAATCTCCGTCTGGCACCGCACCA
LCW0403_066_ GSTSESPSGTAPGTSP GGTTCTACTAGCGAATCTCCGTCTGGCACTGCTCCAG
GFP-\_BO6.ab1 SGESSTAPGTSPSGES GTACTTCTCCTAGCGGTGAATCTTCTACCGCTCCAGG
STAP TACTTCCCCTAGCGGCGAATCTTCTACCGCTCCA
LCW0403_067_ GSTSESPSGTAPGTST GGTTCTACTAGCGAATCTCCTTCTGGTACCGCTCCAG
GFP-\_CO6.ab1 PESGSASPGSTSSTAE GTACTTCTACCCCTGAAAGCGGCTCCGCTTCTCCAGG
SPGP TTCCACTAGCTCTACCGCTGAATCTCCGGGTCCA
LCW0403_068_ GSTSSTAESPGPGSTS ACTAGCTCTACTGCTGAATCTCCTGGCCCAG
GFP-\_DO6.ab1 STAESPGPGSTSESPS GTTCTACCAGCTCTACCGCTGAATCTCCTGGCCCAGG
GTAP TTCTACCAGCGAATCTCCGTCTGGCACCGCACCA
LCW0403_069_ GSTSESPSGTAPGTST GGTTCTACTAGCGAATCCCCGTCTGGTACCGCACCA
E06.ab1 PESGSASPGTSTPESG GGTACTTCTACCCCGGAAAGCGGCTCTGCTTCTCCAG
SASP GTACTTCTACCCCGGAAAGCGGCTCCGCATCTCCA
LCW0403_070_ GSTSESPSGTAPGTST GGTTCTACTAGCGAATCCCCGTCTGGTACTGCTCCAG
GFP-\_F06.ab1 PESGSASPGTSTPESG GTACTTCTACTCCTGAAAGCGGTTCCGCTTCTCCAGG
SASP TACCTCTACTCCGGAAAGCGGTTCTGCATCTCCA
Example 4: Construction of XTEN_AG36 segments
A codon library encoding sequences of 36 amino acid length was constructed. The sequences
were designated XTEN_AG36. Its segments have the amino acid sequence [X]3 where X is a 12mer
peptide with the sequence: GTPGSGTASSSP, GSSTPSGATGSP, GSSPSASTGTGP, or
GASPGTSSTGSP. The insert was obtained by annealing the following pairs of phosphorylated synthetic
oligonucleotides:
AG1for: AGGTACYCCKGGYAGCGGTACYGCWTCTTCYTCTCC
AG1rev: ACCTGGAGARGAAGAWGCRGTACCGCTRCCMGGRGT
AG2for: AGGTAGCTCTACYCCKTCTGGTGCWACYGGYTCYCC
AG2rev: ACCTGGRGARCCRGTWGCACCAGAMGGRGTAGAGCT
AG3for: AGGTTCTAGCCCKTCTGCWTCYACYGGTACYGGYCC
AG3rev: ACCTGGRCCRGTACCRGTRGAWGCAGAMGGGCTAGA
AG4for: AGGTGCWTCYCCKGGYACYAGCTCTACYGGTTCTCC
AG4rev: AGAACCRGTAGAGCTRGTRCCMGGRGAWGC
We also annealed the phosphorylated oligonucleotide 3KpnIstopperFor:
AGGTTCGTCTTCACTCGAGGGTAC and the non-phosphorylated oligonucleotide
pr_3KpnIstopperRev: CCTCGAGTGAAGACGA. The annealed oligonucleotide pairs were ligated,
which resulted in a mixture of products with varying length that represents the varying number of 12mer
repeats ligated to one BbsI/KpnI segment. The products corresponding to the length of 36 amino acids
were isolated from the mixture by preparative agarose gel electrophoresis and ligated into the BsaI/KpnI
digested stuffer vector pCW0359. Most of the clones in the resulting library, designated LCW0404,
showed green fluorescence after induction, which shows that the sequence of XTEN_AG36 had been
ligated in frame with the GFP gene and that most sequences of XTEN_AG36 show good expression.
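The annealing geometry of these oligonucleotides can be checked in silico. The sketch below assumes that each degenerate forward/reverse pair anneals over all but the first four bases of each strand, leaving mutually complementary 4-nt 5' overhangs (AGGT and ACCT) that permit head-to-tail ligation of repeats, and that the stopper pair leaves AGGT at one end and GTAC at the other. This overhang layout is inferred from the sequences and is not stated in the text; the helper names are hypothetical.

```python
# IUPAC-aware complement: Y<->R, K<->M, W is self-complementary.
COMPLEMENT = str.maketrans("ACGTYRKMW", "TGCARYMKW")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def pairs_with(forward_core, reverse_core):
    """True if the two cores are exact reverse complements of each other."""
    return reverse_complement(forward_core) == reverse_core


ag1_for = "AGGTACYCCKGGYAGCGGTACYGCWTCTTCYTCTCC"
ag1_rev = "ACCTGGAGARGAAGAWGCRGTACCGCTRCCMGGRGT"
stopper_for = "AGGTTCGTCTTCACTCGAGGGTAC"
stopper_rev = "CCTCGAGTGAAGACGA"

# Degenerate pair: both strands carry a 4-nt 5' overhang.
ag1_core_ok = pairs_with(ag1_for[4:], ag1_rev[4:])
# The two overhangs are complementary, so duplexes can chain head to tail.
overhangs_ok = reverse_complement(ag1_for[:4]) == ag1_rev[:4]
# Stopper pair: the short strand covers the forward strand minus 4 nt at each end.
stopper_ok = pairs_with(stopper_for[4:-4], stopper_rev)

print(ag1_core_ok, overhangs_ok, stopper_ok)
```

If the assumptions hold, the unpaired GTAC end of the stopper duplex would be the end that is compatible with a KpnI-cut vector, which fits the description of repeats ligated to one BbsI/KpnI segment.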
We screened 96 isolates from library LCW0404 for high level of fluorescence by stamping
them onto agar plates containing IPTG. The same isolates were evaluated by PCR and 48 isolates were
identified that contained segments with 36 amino acids as well as strong fluorescence. These isolates
were sequenced and 44 clones were identified that contained correct XTEN_AG36 segments. The file
names of the nucleotide and amino acid constructs for these segments are listed in Table 11.
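One way to express what a "correct" XTEN_AG36 segment means is sketched below: translate the sequenced insert and require that it is 36 residues long and built only from the four 12mer motifs of Example 4. This is an illustrative check under that assumed definition, not the authors' screening procedure; the nucleotide sequence used is the entry for LCW0404_003 in Table 11.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

AG_MOTIFS = {"GTPGSGTASSSP", "GSSTPSGATGSP", "GSSPSASTGTGP", "GASPGTSSTGSP"}


def translate(dna):
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def split_12mers(protein):
    """Cut a protein sequence into consecutive 12-residue blocks."""
    return [protein[i:i + 12] for i in range(0, len(protein), 12)]


def is_correct_ag36(dna):
    """Assumed criterion: 36 residues, each 12mer block is one of the AG motifs."""
    protein = translate(dna)
    return len(protein) == 36 and all(b in AG_MOTIFS for b in split_12mers(protein))


# Nucleotide sequence of LCW0404_003 as listed in Table 11.
lcw0404_003 = (
    "GGTAGCTCTACCCCTTCTGGTGCTACCGGCTCTCCAGGTT"
    "CTAGCCCGTCTGCTTCTACCGGTACCGGTCCAGGTAGCTC"
    "TACCCCTTCTGGTGCTACTGGTTCTCCA"
)

print(translate(lcw0404_003))
print(split_12mers(translate(lcw0404_003)))
print(is_correct_ag36(lcw0404_003))
```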
Table 11: DNA and Amino Acid Sequences for 36-mer motifs
File name Amino acid sequence Nucleotide sequence
LCW0404_001_ GASPGTSSTGSPGTPG GGTGCATCCCCGGGCACTAGCTCTACCGGTTCTCCAGGTA
TGSP ACTCCTTCTGGTGCTACTGGTTCTCCA
LCW0404_003_ GSSTPSGATGSPGSSP GGTAGCTCTACCCCTTCTGGTGCTACCGGCTCTCCAGGTT
GFP-\_B07.ab1 SASTGTGPGSSTPSGA CTAGCCCGTCTGCTTCTACCGGTACCGGTCCAGGTAGCTC
TGSP TACCCCTTCTGGTGCTACTGGTTCTCCA
LCW0404_006_ GASPGTSSTGSPGSSP GGTGCATCTCCGGGTACTAGCTCTACCGGTTCTCCAGGTT
GFP-\_C07.ab1 SASTGTGPGSSTPSGA CTAGCCCTTCTGCTTCCACTGGTACCGGCCCAGGTAGCTC
TGSP TACCCCGTCTGGTGCTACTGGTTCCCCA
LCW0404_007_ TASSSPGSST GGTACTCCGGGCAGCGGTACTGCTTCTTCCTCTCCAGGTA
GFP-\_D07.ab1 PSGATGSPGASPGTSS GCTCTACCCCTTCTGGTGCAACTGGTTCCCCAGGTGCATC
CCCTGGTACTAGCTCTACCGGTTCTCCA
LCW0404_009_ GTPGSGTASSSPGASP GGTACCCCTGGCAGCGGTACTGCTTCTTCTTCTCCAGGTG
GFP-\_E07.ab1 GTSSTGSPGSRPSAST CTTCCCCTGGTACCAGCTCTACCGGTTCTCCAGGTTCTAG
ACCTTCTGCATCCACCGGTACTGGTCCA
LCW0404_011_ GASPGTSSTGSPGSST GGTGCATCTCCTGGTACCAGCTCTACCGGTTCTCCAGGTA
GFP-\_F07.ab1 PSGATGSPGASPGTSS GCTCTACTCCTTCTGGTGCTACTGGCTCTCCAGGTGCTTCC
TGSP CCGGGTACCAGCTCTACCGGTTCTCCA
LCW0404_012_ GTPGSGTASSSPGSST GGTACCCCGGGCAGCGGTACCGCATCTTCCTCTCCAGGTA
GFP-\_G07.ab1
TGSP TACCCCGTCTGGTGCAACCGGCTCCCCA
LCW0404_014_ GASPGTSSTGSPGASP TCTCCGGGCACTAGCTCTACTGGTTCTCCAGGTG
GFP-\_H07.ab1 GTSSTGSPGASPGTSS CATCCCCTGGCACTAGCTCTACTGGTTCTCCAGGTGCTTC
TGSP TCCTGGTACCAGCTCTACTGGTTCTCCA
LCW0404_015_ GSSTPSGATGSPGSSP GGTAGCTCTACTCCGTCTGGTGCAACCGGCTCCCCAGGTT
GFP-\_A08.ab1 SASTGTGPGASPGTSS CTAGCCCGTCTGCTTCCACTGGTACTGGCCCAGGTGCTTC
TGSP CCCGGGCACCAGCTCTACTGGTTCTCCA
LCW0404_016_ GSSTPSGATGSPGSST GGTAGCTCTACTCCTTCTGGTGCTACCGGTTCCCCAGGTA
GFP-\_B08.ab1 PSGATGSPGTPGSGT GCTCTACTCCTTCTGGTGCTACTGGTTCCCCAGGTACTCC
ASSSP GGGCAGCGGTACTGCTTCTTCCTCTCCA
LCW0404_017_ GSSTPSGATGSPGSST GGTAGCTCTACTCCGTCTGGTGCAACCGGTTCCCCAGGTA
GFP-\_C08.ab1 PSGATGSPGASPGTSS GCTCTACTCCTTCTGGTGCTACTGGCTCCCCAGGTGCATC
TGSP CCCTGGCACCAGCTCTACCGGTTCTCCA
LCW0404_018_
GFP-\_D08.ab1 SASTGTGPGSSTPSGA CTAGCCCTTCTGCATCTACCGGTACCGGTCCAGGTAGCTC
TGSP TACTCCTTCTGGTGCTACTGGCTCTCCA
LCW0404_023_ GASPGTSSTGSPGSSP GGTGCTTCCCCGGGCACTAGCTCTACCGGTTCTCCAGGTT
GFP-\_F08.ab1 SASTGTGPGTPGSGT CTAGCCCTTCTGCATCTACTGGTACTGGCCCAGGTACTCC
ASSSP GGGCAGCGGTACTGCTTCTTCCTCTCCA
LCW0404_025_ GATGSPGSST GGTAGCTCTACTCCGTCTGGTGCTACCGGCTCTCCAGGTA
GFP-\_G08.ab1 PSGATGSPGASPGTSS GCTCTACCCCTTCTGGTGCAACCGGCTCCCCAGGTGCTTC
TGSP TCCGGGTACCAGCTCTACTGGTTCTCCA
LCW0404_029_ GTPGSGTASSSPGSST GGTACCCCTGGCAGCGGTACCGCTTCTTCCTCTCCAGGTA
GFP-\_A09.ab1 PSGATGSPGSSPSAST CCCCGTCTGGTGCTACTGGCTCTCCAGGTTCTAG
GTGP CCCGTCTGCATCTACCGGTACCGGCCCA
LCW0404_030_ GSSTPSGATGSPGTPG GGTAGCTCTACTCCTTCTGGTGCAACCGGCTCCCCAGGTA
GFP-\_B09.ab1 SGTASSSPGTPGSGTA GCAGCGGTACCGCATCTTCCTCTCCAGGTACTCC
GGGTAGCGGTACTGCTTCTTCTTCTCCA
LCW0404_031_ GTPGSGTASSSPGSST GGTACCCCGGGTAGCGGTACTGCTTCTTCCTCTCCAGGTA
GFP-\_C09.ab1 PSGATGSPGASPGTSS GCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTGCTTC
TCCGGGCACCAGCTCTACCGGTTCTCCA
LCW0404_034_ GSSTPSGATGSPGSST GGTAGCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGTA
GFP-\_D09.ab1 PSGATGSPGASPGTSS GCTCTACCCCGTCTGGTGCAACCGGCTCCCCAGGTGCATC
TGSP CCCGGGTACTAGCTCTACCGGTTCTCCA
LCW0404_035_
GFP-\_E09.ab1 SGTASSSPGSSTPSGA CCCCGGGCAGCGGTACCGCATCTTCTTCTCCAGGTAGCTC
TGSP TACTCCTTCTGGTGCAACTGGTTCTCCA
LCW0404_036_ GSSPSASTGTGPGSST GGTTCTAGCCCGTCTGCTTCCACCGGTACTGGCCCAGGTA
GFP-\_F09.ab1 PSGATGSPGTPGSGT GCTCTACCCCGTCTGGTGCAACTGGTTCCCCAGGTACCCC
ASSSP TGGTAGCGGTACCGCTTCTTCTTCTCCA
LCW0404_037_ GASPGTSSTGSPGSSP GGTGCTTCTCCGGGCACCAGCTCTACTGGTTCTCCAGGTT
GFP-\_G09.ab1 SASTGTGPGSSTPSGA CTAGCCCTTCTGCATCCACCGGTACCGGTCCAGGTAGCTC
TGSP TACCCCTTCTGGTGCAACCGGCTCTCCA
LCW0404_040_ GASPGTSSTGSPGSST GGTGCATCCCCGGGCACCAGCTCTACCGGTTCTCCAGGTA
GFP-\_H09.ab1 PSGATGSPGSSTPSGA GCTCTACCCCGTCTGGTGCTACCGGCTCTCCAGGTAGCTC
TGSP TACCCCGTCTGGTGCTACTGGCTCTCCA
LCW0404_041_
GFP-\_A10.ab1 PSGATGSPGTPGSGT GCTCTACTCCGTCTGGTGCTACCGGTTCTCCAGGTACCCC
ASSSP GGGTAGCGGTACCGCATCTTCTTCTCCA
LCW0404_043_ GSSPSASTGTGPGSST GGTTCTAGCCCTTCTGCTTCCACCGGTACTGGCCCAGGTA
GFP-\_C10.ab1 PSGATGSPGSSTPSGA GCTCTACCCCTTCTGGTGCTACCGGCTCCCCAGGTAGCTC
TGSP TACTCCTTCTGGTGCAACTGGCTCTCCA
LCW0404_045_ GASPGTSSTGSPGSSP GGTGCTTCTCCTGGCACCAGCTCTACTGGTTCTCCAGGTT
GFP-\_D10.ab1 SASTGTGPGSSPSAST CTAGCCCTTCTGCTTCTACCGGTACTGGTCCAGGTTCTAG
GTGP CCCTTCTGCATCCACTGGTACTGGTCCA
LCW0404_047_
GFP-\_F10.ab1 GTSSTGSPGASPGTSS CTTCTCCTGGTACTAGCTCTACTGGTTCTCCAGGTGCTTCT
CCGGGCACTAGCTCTACTGGTTCTCCA
GFP-\_G10.ab1 GTSSTGSPGSSTPSGA CTTCTCCTGGTACTAGCTCTACCGGTTCTCCAGGTAGCTC
TGSP TACCCCGTCTGGTGCTACTGGCTCTCCA
LCW0404_049_ GSSTPSGATGSPGTPG GGTAGCTCTACCCCGTCTGGTGCTACTGGTTCTCCAGGTA
TGSP TACCCCTTCTGGTGCTACTGGCTCTCCA
LCW0404_050_ GASPGTSSTGSPGSSP GGTGCATCTCCTGGTACCAGCTCTACTGGTTCTCCAGGTT
GFP-\_A11.ab1 SASTGTGPGSSTPSGA CTAGCCCTTCTGCTTCTACCGGTACCGGTCCAGGTAGCTC
TGSP TACTCCTTCTGGTGCTACCGGTTCTCCA
LCW0404_051_ GSSTPSGATGSPGSST GGTAGCTCTACCCCGTCTGGTGCTACTGGCTCTCCAGGTA
GFP-\_B11.ab1 PSGATGSPGSSTPSGA GCTCTACTCCTTCTGGTGCTACTGGTTCCCCAGGTAGCTC
TGSP TACCCCGTCTGGTGCAACTGGCTCTCCA
LCW0404_052_ GASPGTSSTGSPGTPG GGTGCATCCCCGGGTACCAGCTCTACCGGTTCTCCAGGTA
TGSP TCCGGGCACCAGCTCTACTGGTTCTCCA
LCW0404_053_ GSSTPSGATGSPGSSP GGTAGCTCTACTCCTTCTGGTGCAACTGGTTCTCCAGGTT
GFP-\_D11.ab1 SASTGTGPGASPGTSS CTAGCCCGTCTGCATCCACTGGTACCGGTCCAGGTGCTTC
TGSP CCCTGGCACCAGCTCTACCGGTTCTCCA
GFP-\_E11.ab1 PSGATGSPGSSPSAST GCTCTACTCCGTCTGGTGCAACCGGCTCTCCAGGTTCTAG
GTGP CCCTTCTGCATCTACCGGTACTGGTCCA
LCW0404_060_ GTPGSGTASSSPGSST GGTACTCCTGGCAGCGGTACCGCATCTTCCTCTCCAGGTA
TGSP TCCGGGTACCAGCTCTACCGGTTCTCCA
LCW0404_062_ GSSTPSGATGSPGTPG GGTAGCTCTACCCCGTCTGGTGCAACCGGCTCCCCAGGTA
GFP-\_G11.ab1 SGTASSSPGSSTPSGA CTCCTGGTAGCGGTACCGCTTCTTCTTCTCCAGGTAGCTC
TGSP TACTCCGTCTGGTGCTACCGGCTCCCCA
LCW0404_066_ STGTGPGSSP GGTTCTAGCCCTTCTGCATCCACCGGTACCGGCCCAGGTT
GFP-\_H11.ab1 GPGASPGTSS CTAGCCCGTCTGCTTCTACCGGTACTGGTCCAGGTGCTTC
TGSP TCCGGGTACTAGCTCTACTGGTTCTCCA
LCW0404_067_ TASSSPGSST GGTACCCCGGGTAGCGGTACCGCTTCTTCTTCTCCAGGTA
GFP-\_A12.ab1 PSGATGSPGSNPSAST GCTCTACTCCGTCTGGTGCTACCGGCTCTCCAGGTTCTAA
CCCTTCTGCATCCACCGGTACCGGCCCA
LCW0404_068_ STGTGPGSST GGTTCTAGCCCTTCTGCATCTACTGGTACTGGCCCAGGTA
GFP-\_B12.ab1 PSGATGSPGASPGTSS GCTCTACTCCTTCTGGTGCTACCGGCTCTCCAGGTGCTTCT
CCGGGTACTAGCTCTACCGGTTCTCCA
LCW0404_069_ GATGSPGASP GGTAGCTCTACCCCTTCTGGTGCAACCGGCTCTCCAGGTG
GFP-\_C12.ab1 GTSSTGSPGTPGSGTA CATCCCCGGGTACCAGCTCTACCGGTTCTCCAGGTACTCC
SSSP CGGTACCGCTTCTTCCTCTCCA
LCW0404_070_ GSSTPSGATGSPGSST GGTAGCTCTACTCCGTCTGGTGCAACCGGTTCCCCAGGTA
GFP-\_D12.ab1 SPGSSTPSGA GCTCTACCCCTTCTGGTGCAACCGGCTCCCCAGGTAGCTC
TGSP TACCCCTTCTGGTGCAACTGGCTCTCCA
LCW0404_073_ GASPGTSSTGSPGTPG GGTGCTTCTCCTGGCACTAGCTCTACCGGTTCTCCAGGTA
GFP-\_E12.ab1 SGTASSSPGSSTPSGA CCCCTGGTAGCGGTACCGCATCTTCCTCTCCAGGTAGCTC
TGSP TACTCCTTCTGGTGCTACTGGTTCCCCA
LCW0404_075_ GSSTPSGATGSPGSSP GGTAGCTCTACCCCGTCTGGTGCTACTGGCTCCCCAGGTT
GFP-\_F12.ab1 SASTGTGPGSSPSAST CTAGCCCTTCTGCATCCACCGGTACCGGTCCAGGTTCTAG
GTGP CCCGTCTGCATCTACTGGTACTGGTCCA
LCW0404_080_ GASPGTSSTGSPGSSP GGTGCTTCCCCGGGCACCAGCTCTACTGGTTCTCCAGGTT
GFP-\_G12.ab1 SASTGTGPGSSPSAST CTAGCCCGTCTGCTTCTACTGGTACTGGTCCAGGTTCTAG
GTGP CCCTTCTGCTTCCACTGGTACTGGTCCA
LCW0404_081_ GASPGTSSTGSPGSSP GGTGCTTCCCCGGGTACCAGCTCTACCGGTTCTCCAGGTT
GFP-\_H12.ab1 SASTGTGPGTPGSGT CTAGCCCTTCTGCTTCTACCGGTACCGGTCCAGGTACCCC
ASSSP TGGCAGCGGTACCGCATCTTCCTCTCCA
Example 5: Construction of XTEN_AE864
XTEN_AE864 was constructed from serial dimerization of XTEN_AE36 to AE72, 144, 288,
576 and 864. A collection of XTEN_AE72 segments was constructed from 37 different segments of
XTEN_AE36. Cultures of E. coli harboring all 37 different 36-amino acid segments were mixed and
plasmids were isolated. This plasmid pool was digested with BsaI/NcoI to generate the small fragment as
the insert. The same plasmid pool was digested with BbsI/NcoI to generate the large fragment as the
vector. The insert and vector fragments were ligated, resulting in a doubling of the length, and the ligation
mixture was transformed into BL21Gold(DE3) cells to obtain colonies of XTEN_AE72.
This library of XTEN_AE72 segments was designated LCW0406. All clones from LCW0406
were combined and dimerized again using the same process as described above, yielding library
LCW0410 of XTEN_AE144. All clones from LCW0410 were combined and dimerized again using the
same process as described above, yielding library LCW0414 of XTEN_AE288. Two isolates,
LCW0414.001 and LCW0414.002, were randomly picked from the library and sequenced to verify the
identities. All clones from LCW0414 were combined and dimerized again using the same process as
described above, yielding library LCW0418 of XTEN_AE576. We screened 96 isolates from library
LCW0418 for high level of GFP fluorescence. Eight isolates with inserts of the right size by PCR and strong
fluorescence were sequenced, and 2 isolates (LCW0418.018 and LCW0418.052) were chosen for future
use based on sequencing and expression data.
The specific clone pCW0432 of XTEN_AE864 was constructed by combining LCW0418.018
of XTEN_AE576 and LCW0414.002 of XTEN_AE288 using the same dimerization process as described
above.
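The length bookkeeping of this example can be written out as a short Python sketch. It assumes that each dimerization round exactly doubles the segment length, and it treats the number of ordered pairs of the 37 starting segments as a theoretical upper bound on the diversity of the XTEN_AE72 library; the actual library composition is not given in the text.

```python
def serial_dimerization(start_length, rounds):
    """Return the segment lengths obtained by repeatedly doubling start_length."""
    lengths = [start_length]
    for _ in range(rounds):
        lengths.append(2 * lengths[-1])
    return lengths


# XTEN_AE36 -> AE72 -> AE144 -> AE288 -> AE576
ae_lengths = serial_dimerization(36, 4)

# XTEN_AE864 combines one AE576 clone with one AE288 clone.
ae864 = ae_lengths[-1] + ae_lengths[-2]

# Upper bound on distinct AE72 segments if any of the 37 AE36 segments
# can occupy either position (an assumption, not a measured value).
max_ae72_variants = 37 ** 2

print(ae_lengths, ae864, max_ae72_variants)
```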
Example 6: Construction of XTEN_AM144
A collection of XTEN_AM144 segments was constructed starting from 37 different segments
of XTEN_AE36, 44 segments of XTEN_AF36, and 44 segments of XTEN_AG36.
Cultures of E. coli harboring all 125 different 36-amino acid segments were mixed and
plasmids were isolated. This plasmid pool was digested with BsaI/NcoI to generate the small fragment as
the insert. The same plasmid pool was digested with BbsI/NcoI to generate the large fragment as the
vector. The insert and vector fragments were ligated, resulting in a doubling of the length, and the ligation
mixture was transformed into BL21Gold(DE3) cells to obtain colonies of XTEN_AM72.
This library of XTEN_AM72 segments was designated LCW0461. All clones from LCW0461
were combined and dimerized again using the same process as described above, yielding library
LCW0462. 1512 isolates from library LCW0462 were screened for protein expression. Individual
colonies were transferred into 96 well plates and cultured overnight as starter cultures. These starter
cultures were diluted into fresh autoinduction medium and cultured for 20-30 h. Expression was measured
using a fluorescence plate reader with excitation at 395 nm and emission at 510 nm. 192 isolates showed
high level expression and were submitted to DNA sequencing. Most clones in library LCW0462 showed
good expression and similar physicochemical properties, suggesting that most combinations of
XTEN_AM36 segments yield useful XTEN sequences. 30 isolates from LCW0462 were chosen as a
preferred collection of XTEN_AM144 segments for the construction of multifunctional proteins that
contain multiple XTEN segments. The file names of the nucleotide and amino acid constructs for these
segments are listed in Table 12.
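Because the AM segments mix building blocks from the AE, AF and AG families, it can be helpful to label each 12mer of a clone by family. The Python sketch below does this for the protein sequence listed for clone LCW462_r9 in Table 12. The AG motifs are those given in Example 4; the AE and AF motif sets are assumptions read off the 12mer blocks that recur in the listed sequences, and the function name is hypothetical.

```python
from collections import Counter

# 12mer motifs per family. AG is from Example 4; AE and AF are assumed
# from the blocks that recur in the sequence tables.
FAMILIES = {
    "AE": {"GSPAGSPTSTEE", "GSEPATSGSETP", "GTSESATPESGP", "GTSTEPSEGSAP"},
    "AF": {"GSTSESPSGTAP", "GTSTPESGSASP", "GTSPSGESSTAP", "GSTSSTAESPGP"},
    "AG": {"GTPGSGTASSSP", "GSSTPSGATGSP", "GSSPSASTGTGP", "GASPGTSSTGSP"},
}


def classify_blocks(protein):
    """Label each consecutive 12-residue block with its motif family."""
    labels = []
    for i in range(0, len(protein), 12):
        block = protein[i:i + 12]
        label = next(
            (family for family, motifs in FAMILIES.items() if block in motifs),
            "unknown",
        )
        labels.append(label)
    return labels


# Protein sequence of LCW462_r9, joined from the rows of Table 12.
lcw462_r9 = (
    "GTSTEPSEGSAPGTSES" "ATPESGPGTSESATPES" "GPGTSTEPSEGSAPGTS"
    "ESATPESGPGTSTEPSEG" "SAPGTSTEPSEGSAPGS" "EPATSGSETPGSPAGSP"
    "TSTEEGASPGTSSTGSP" "GSSPSASTGTGPGSSPS" "ASTGTGP"
)

labels = classify_blocks(lcw462_r9)
print(len(lcw462_r9), labels, Counter(labels))
```

For this clone the blocks resolve into AE motifs followed by AG motifs, which is consistent with a 144-residue segment assembled from 36mer segments of different families.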
Table 12: DNA and amino acid sequences for AM144 segments
Clone DNA Sequence Protein Sequence
LCW462_r1 GGTACCCCGGGCAGCGGTACCGCATCTTCCTCTCCAGGTA GTPGSGTASSSPGSSTPS
GCTCTACCCCGTCTGGTGCTACCGGTTCCCCAGGTAGCTC GATGSPGSSTPSGATGS
TACCCCGTCTGGTGCAACCGGCTCCCCAGGTAGCCCGGCT PGSPAGSPTSTEEGTSES
GGCTCTCCTACCTCTACTGAGGAAGGTACTTCTGAAAGCG ATPESGPGTSTEPSEGS
CTACTCCTGAGTCTGGTCCAGGTACCTCTACTGAACCGTC APGSSPSASTGTGPGSS
CGAAGGTAGCGCTCCAGGTTCTAGCCCTTCTGCATCCACC TGPGASPGTSS
GGTACCGGCCCAGGTTCTAGCCCGTCTGCTTCTACCGGTA STEPSEGSAPG
CTGGTCCAGGTGCTTCTCCGGGTACTAGCTCTACTGGTTC TSTEPSEGSAPGSEPATS
TCCAGGTACCTCTACCGAACCGTCCGAGGGTAGCGCACC GSETP
CTCTACTGAACCGTCTGAGGGTAGCGCTCCAGG
TAGCGAACCGGCAACCTCCGGTTCTGAAACTCCA
LCW462_r5 GGTTCTACCAGCGAATCCCCTTCTGGCACTGCACCAGGTT GSTSESPSGTAPGSTSES
GCGAATCCCCTTCTGGTACCGCACCAGGTACTTC PSGTAPGTSPSGESSTAP
TCCGAGCGGCGAATCTTCTACTGCTCCAGGTACCTCTACT SEGSAPGTSTEP
GAACCTTCCGAAGGCAGCGCTCCAGGTACCTCTACCGAA SEGSAPGTSESATPESG
CCGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCA PGASPGTSSTGSPGSSTP
ACCCCTGAATCCGGTCCAGGTGCATCTCCTGGTACCAGCT SGATGSPGASPGTSSTG
CTACCGGTTCTCCAGGTAGCTCTACTCCTTCTGGTGCTAC SPGSTSESPSGTAPGSTS
TGGCTCTCCAGGTGCTTCCCCGGGTACCAGCTCTACCGGT ESPSGTAPGTSTPESGS
TCTCCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCAC ASP
CAGGTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAGG
TACCTCTACCCCTGAAAGCGGTTCCGCTTCTCCA
LCW462_r9 TCTACCGAACCTTCCGAGGGCAGCGCACCAGGT GTSTEPSEGSAPGTSES
ACTTCTGAAAGCGCTACCCCTGAGTCCGGCCCAGGTACTT ATPESGPGTSESATPES
CTGAAAGCGCTACTCCTGAATCCGGTCCAGGTACCTCTAC GPGTSTEPSEGSAPGTS
TGAACCTTCTGAGGGCAGCGCTCCAGGTACTTCTGAAAG ESATPESGPGTSTEPSEG
CGCTACCCCGGAGTCCGGTCCAGGTACTTCTACTGAACCG SAPGTSTEPSEGSAPGS
TCCGAAGGTAGCGCACCAGGTACTTCTACTGAACCTTCCG EPATSGSETPGSPAGSP
AAGGTAGCGCTCCAGGTAGCGAACCTGCTACTTCTGGTTC TSTEEGASPGTSSTGSP
TGAAACCCCAGGTAGCCCGGCTGGCTCTCCGACCTCCACC GSSPSASTGTGPGSSPS
GAGGAAGGTGCTTCTCCTGGCACCAGCTCTACTGGTTCTC ASTGTGP
CAGGTTCTAGCCCTTCTGCTTCTACCGGTACTGGTCCAGG
TTCTAGCCCTTCTGCATCCACTGGTACTGGTCCA
LCW462_r10 GGTAGCGAACCGGCAACCTCTGGCTCTGAAACCCCAGGT GSEPATSGSETPGTSES
ACCTCTGAAAGCGCTACTCCGGAATCTGGTCCAGGTACTT ATPESGPGTSESATPES
CTGAAAGCGCTACTCCGGAATCCGGTCCAGGTTCTACCA GPGSTSESPSGTAPGSTS
GCGAATCTCCTTCTGGCACCGCTCCAGGTTCTACTAGCGA ESPSGTAPGTSPSGESST
ATCCCCGTCTGGTACCGCACCAGGTACTTCTCCTAGCGGC APGASPGTSSTGSPGSS
GAATCTTCTACCGCACCAGGTGCATCTCCGGGTACTAGCT PSASTGTGPGSSTPSGA
CTACCGGTTCTCCAGGTTCTAGCCCTTCTGCTTCCACTGGT TGSPGSSTPSGATGSPG
CCAGGTAGCTCTACCCCGTCTGGTGCTACTGGTT SSTPSGATGSPGASPGT
GTAGCTCTACTCCGTCTGGTGCAACCGGTTCCCC SSTGSP
AGGTAGCTCTACTCCTTCTGGTGCTACTGGCTCCCCAGGT
GCATCCCCTGGCACCAGCTCTACCGGTTCTCCA
LCW462_r15 GGTGCTTCTCCGGGCACCAGCTCTACTGGTTCTCCAGGTT GASPGTSSTGSPGSSPS
CTAGCCCTTCTGCATCCACCGGTACCGGTCCAGGTAGCTC ASTGTGPGSSTPSGATG
TACCCCTTCTGGTGCAACCGGCTCTCCAGGTACTTCTGAA SPGTSESATPESGPGSEP
ACCCCGGAATCTGGCCCAGGTAGCGAACCGGCT ATSGSETPGSEPATSGS
ACTTCTGGTTCTGAAACCCCAGGTAGCGAACCGGCTACCT ESATPESGPGTS
CCGGTTCTGAAACTCCAGGTACTTCTGAAAGCGCTACTCC TEPSEGSAPGTSTEPSEG
GGAGTCCGGTCCAGGTACCTCTACCGAACCGTCCGAAGG SAPGTSTEPSEGSAPGT
CAGCGCTCCAGGTACTTCTACTGAACCTTCTGAGGGTAGC STEPSEGSAPGSEPATS
GCTCCAGGTACCTCTACCGAACCGTCCGAGGGTAGCGCA GSETP
CCAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCCA
GAACCGGCAACCTCCGGTTCTGAAACTCCA
LCW462_r16 GGTACCTCTACCGAACCTTCCGAAGGTAGCGCTCCAGGT GTSTEPSEGSAPGSPAG
GCAGGTTCTCCTACTTCCACTGAGGAAGGTACTT SPTSTEEGTSTEPSEGSA
CTACCGAACCTTCTGAGGGTAGCGCACCAGGTACCTCTG PGTSESATPESGPGSEP
AAAGCGCAACTCCTGAGTCTGGCCCAGGTAGCGAACCTG ATSGSETPGTSESATPES
CTACCTCCGGCTCTGAGACTCCAGGTACCTCTGAAAGCGC GPGSPAGSPTSTEEGTS
AACCCCGGAATCTGGTCCAGGTAGCCCGGCTGGCTCTCCT ESATPESGPGTSTEPSEG
ACCTCTACTGAGGAAGGTACTTCTGAAAGCGCTACTCCTG SAPGSEPATSGSETPGT
AGTCTGGTCCAGGTACCTCTACTGAACCGTCCGAAGGTA GSAPGSEPATS
GCGCTCCAGGTAGCGAACCTGCTACTTCTGGTTCTGAAAC GSETP
TACTTCTACCGAACCGTCCGAGGGTAGCGCTCCA
GGTAGCGAACCTGCTACTTCTGGTTCTGAAACTCCA
LCW462_r20 GGTACTTCTACCGAACCGTCCGAAGGCAGCGCTCCAGGT GTSTEPSEGSAPGTSTEP
ACCTCTACTGAACCTTCCGAGGGCAGCGCTCCAGGTACCT SEGSAPGTSTEPSEGSA
CTACCGAACCTTCTGAAGGTAGCGCACCAGGTACTTCTAC PGTSTEPSEGSAPGTSTE
CGAACCGTCCGAAGGCAGCGCTCCAGGTACCTCTACTGA PSEGSAPGTSTEPSEGS
ACCTTCCGAGGGCAGCGCTCCAGGTACCTCTACCGAACCT APGTSTEPSEGSAPGTS
TCTGAAGGTAGCGCACCAGGTACTTCTACCGAACCTTCCG ESATPESGPGTSESATPE
AGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACCCCTG SGPGTSTEPSEGSAPGS
AGTCCGGCCCAGGTACTTCTGAAAGCGCTACTCCTGAATC EPATSGSETPGSPAGSP
CGGTCCAGGTACTTCTACTGAACCTTCCGAAGGTAGCGCT TSTEE
CCAGGTAGCGAACCTGCTACTTCTGGTTCTGAAACCCCAG
GTAGCCCGGCTGGCTCTCCGACCTCCACCGAGGAA
LCW462_r23 GGTACTTCTACCGAACCGTCCGAGGGCAGCGCTCCAGGT GTSTEPSEGSAPGTSTEP
ACTGAACCTTCTGAAGGCAGCGCTCCAGGTACTT SEGSAPGTSTEPSEGSA
CTACTGAACCTTCCGAAGGTAGCGCACCAGGTTCTACCA PGSTSESPSGTAPGSTSE
GCGAATCCCCTTCTGGTACTGCTCCAGGTTCTACCAGCGA SPSGTAPGTSTPESGSAS
ATCCCCTTCTGGCACCGCACCAGGTACTTCTACCCCTGAA PGSEPATSGSETPGTSES
AGCGGCTCCGCTTCTCCAGGTAGCGAACCTGCAACCTCTG ATPESGPGTSTEPSEGS
GCTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGA EPSEGSAPGTS
ATCTGGCCCAGGTACTTCTACTGAACCGTCCGAGGGCAG ESATPESGPGTSESATPE
CGCACCAGGTACTTCTACTGAACCGTCTGAAGGTAGCGC SGP
ACCAGGTACTTCTGAAAGCGCAACCCCGGAATCCGGCCC
AGGTACCTCTGAAAGCGCAACCCCGGAGTCCGGCCCA
LCW462_r24 GGTAGCTCTACCCCTTCTGGTGCTACCGGCTCTCCAGGTT GSSTPSGATGSPGSSPS
CTAGCCCGTCTGCTTCTACCGGTACCGGTCCAGGTAGCTC PGSSTPSGATG
TACCCCTTCTGGTGCTACTGGTTCTCCAGGTAGCCCTGCT SPGSPAGSPTSTEEGSPA
GGCTCTCCGACTTCTACTGAGGAAGGTAGCCCGGCTGGTT GSPTSTEEGTSTEPSEGS
CTCCGACTTCTACTGAGGAAGGTACTTCTACCGAACCTTC APGASPGTSSTGSPGSS
CGAAGGTAGCGCTCCAGGTGCTTCCCCGGGCACTAGCTCT PSASTGTGPGTPGSGTA
ACCGGTTCTCCAGGTTCTAGCCCTTCTGCATCTACTGGTA SSSPGSTSSTAESPGPGT
CTGGCCCAGGTACTCCGGGCAGCGGTACTGCTTCTTCCTC SPSGESSTAPGTSTPESG
TCCAGGTTCTACTAGCTCTACTGCTGAATCTCCTGGCCCA SASP
GGTACTTCTCCTAGCGGTGAATCTTCTACCGCTCCAGGTA
CCTCTACTCCGGAAAGCGGTTCTGCATCTCCA
LCW462_r27 GGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGGTA GTSTEPSEGSAPGTSES
CTTCTGAAAGCGCTACCCCGGAGTCCGGTCCAGGTACTTC ATPESGPGTSTEPSEGS
TACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTACT APGTSTEPSEGSAPGTS
GAACCGTCTGAAGGTAGCGCACCAGGTACTTCTGAAAGC ESATPESGPGTSESATPE
CCGGAATCCGGCCCAGGTACCTCTGAAAGCGCA SGPGTPGSGTASSSPGA
ACCCCGGAGTCCGGCCCAGGTACTCCTGGCAGCGGTACC SPGTSSTGSPGASPGTSS
GCTTCTTCTTCTCCAGGTGCTTCTCCTGGTACTAGCTCTAC TGSPGSPAGSPTSTEEG
TGGTTCTCCAGGTGCTTCTCCGGGCACTAGCTCTACTGGT SPAGSPTSTEEGTSTEPS
TCTCCAGGTAGCCCTGCTGGCTCTCCGACTTCTACTGAGG EGSAP
AAGGTAGCCCGGCTGGTTCTCCGACTTCTACTGAGGAAG
GTACTTCTACCGAACCTTCCGAAGGTAGCGCTCCA
LCW462_r28 GGTAGCCCAGCAGGCTCTCCGACTTCCACTGAGGAAGGT GSPAGSPTSTEEGTSTEP
ACTGAACCTTCCGAAGGCAGCGCACCAGGTACCT SEGSAPGTSTEPSEGSA
CTACTGAACCTTCTGAGGGCAGCGCTCCAGGTACCTCTAC PGTSTEPSEGSAPGTSES
CGAACCGTCTGAAGGTAGCGCACCAGGTACCTCTGAAAG ATPESGPGTSESATPES
CGCAACTCCTGAGTCCGGTCCAGGTACTTCTGAAAGCGC GPGTPGSGTASSSPGSS
AACCCCGGAGTCTGGCCCAGGTACCCCGGGTAGCGGTAC TPSGATGSPGASPGTSS
TGCTTCTTCCTCTCCAGGTAGCTCTACCCCTTCTGGTGCAA STEPSEGSAPG
CCGGCTCTCCAGGTGCTTCTCCGGGCACCAGCTCTACCGG TSESATPESGPGTSTEPS
TTCTCCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCT EGSAP
CCAGGTACTTCTGAAAGCGCTACCCCGGAGTCCGGTCCA
GGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCA
LCW462_r38 GGTAGCGAACCGGCAACCTCCGGCTCTGAAACTCCAGGT GSEPATSGSETPGTSES
ACTTCTGAAAGCGCTACTCCGGAATCCGGCCCAGGTAGC ATPESGPGSEPATSGSE
GAACCGGCTACTTCCGGCTCTGAAACCCCAGGTAGCTCTA TPGSSTPSGATGSPGTP
CTGGTGCAACCGGCTCCCCAGGTACTCCTGGTAG GSGTASSSPGSSTPSGA
CGGTACCGCTTCTTCTTCTCCAGGTAGCTCTACTCCGTCTG TGSPGASPGTSSTGSPG
GTGCTACCGGCTCCCCAGGTGCATCTCCTGGTACCAGCTC SSTPSGATGSPGASPGT
TACCGGTTCTCCAGGTAGCTCTACTCCTTCTGGTGCTACT SSTGSPGSEPATSGSETP
GGCTCTCCAGGTGCTTCCCCGGGTACCAGCTCTACCGGTT GTSTEPSEGSAPGSEPA
CTCCAGGTAGCGAACCTGCTACTTCTGGTTCTGAAACTCC TSGSETP
TTCTACCGAACCGTCCGAGGGTAGCGCTCCAGG
TAGCGAACCTGCTACTTCTGGTTCTGAAACTCCA
LCW462_r39 GGTACCTCTACTGAACCTTCCGAAGGCAGCGCTCCAGGT SEGSAPGTSTEP
ACCTCTACCGAACCGTCCGAGGGCAGCGCACCAGGTACT SEGSAPGTSESATPESG
TCTGAAAGCGCAACCCCTGAATCCGGTCCAGGTAGCCCT PGSPAGSPTSTEEGSPA
GCTGGCTCTCCGACTTCTACTGAGGAAGGTAGCCCGGCTG GSPTSTEEGTSTEPSEGS
GTTCTCCGACTTCTACTGAGGAAGGTACTTCTACCGAACC APGSPAGSPTSTEEGTS
TTCCGAAGGTAGCGCTCCAGGTAGCCCGGCTGGTTCTCCG TEPSEGSAPGTSTEPSEG
ACCGAGGAAGGTACCTCTACTGAACCTTCTGAGG SAPGASPGTSSTGSPGS
CTCCAGGTACCTCTACTGAACCTTCCGAAGGCA SPSASTGTGPGSSPSAST
GCGCTCCAGGTGCTTCCCCGGGCACCAGCTCTACTGGTTC GTGP
TCCAGGTTCTAGCCCGTCTGCTTCTACTGGTACTGGTCCA
GGTTCTAGCCCTTCTGCTTCCACTGGTACTGGTCCA
LCW462_r41 GGTAGCTCTACCCCGTCTGGTGCTACCGGTTCCCCAGGTG GSSTPSGATGSPGASPG
CTTCTCCTGGTACTAGCTCTACCGGTTCTCCAGGTAGCTC TSSTGSPGSSTPSGATGS
TACCCCGTCTGGTGCTACTGGCTCTCCAGGTAGCCCTGCT PGSPAGSPTSTEEGTSES
CCAACCTCCACCGAAGAAGGTACCTCTGAAAGC ATPESGPGSEPATSGSE
GCAACCCCTGAATCCGGCCCAGGTAGCGAACCGGCAACC TPGASPGTSSTGSPGSST
TCCGGTTCTGAAACCCCAGGTGCATCTCCTGGTACTAGCT PSGATGSPGSSPSASTG
CTACTGGTTCTCCAGGTAGCTCTACTCCGTCTGGTGCAAC TGPGSTSESPSGTAPGS
CGGCTCTCCAGGTTCTAGCCCTTCTGCATCTACCGGTACT TSESPSGTAPGTSTPESG
GGTCCAGGTTCTACCAGCGAATCCCCTTCTGGTACTGCTC SASP
CAGGTTCTACCAGCGAATCCCCTTCTGGCACCGCACCAGG
TACTTCTACCCCTGAAAGCGGCTCCGCTTCTCCA
LCW462_r42 GGTTCTACCAGCGAATCTCCTTCTGGCACCGCTCCAGGTT GSTSESPSGTAPGSTSES
CTACTAGCGAATCCCCGTCTGGTACCGCACCAGGTACTTC PSGTAPGTSPSGESSTAP
TCCTAGCGGCGAATCTTCTACCGCACCAGGTACCTCTGAA GTSESATPESGPGTSTEP
AGCGCTACTCCGGAGTCTGGCCCAGGTACCTCTACTGAAC SEGSAPGTSTEPSEGSA
CGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTC PGTSTEPSEGSAPGTSES
CGAAGGTAGCGCACCAGGTACCTCTACTGAACCTTCTGA ATPESGPGTSTEPSEGS
GGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGA APGSSTPSGATGSPGAS
GTCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGC PGTSSTGSPGSSTPSGAT
GCACCAGGTAGCTCTACCCCGTCTGGTGCTACCGGTTCCC GSP
CAGGTGCTTCTCCTGGTACTAGCTCTACCGGTTCTCCAGG
TAGCTCTACCCCGTCTGGTGCTACTGGCTCTCCA
LCW462_r43 GGTTCTACTAGCTCTACTGCAGAATCTCCGGGCCCAGGTA GSTSSTAESPGPGTSPSG
CTAGCGGTGAATCTTCTACCGCTCCAGGTACTTC ESSTAPGTSPSGESSTAP
CGGTGAATCTTCTACCGCTCCAGGTTCTACTAGC GSTSSTAESPGPGSTSST
TCTACCGCTGAATCTCCGGGTCCAGGTTCTACCAGCTCTA AESPGPGTSTPESGSASP
CTGCAGAATCTCCTGGCCCAGGTACTTCTACTCCGGAAAG GTSPSGESSTAPGSTSST
CGGTTCCGCTTCTCCAGGTACTTCTCCTAGCGGTGAATCT AESPGPGTSTPESGSASP
TCTACCGCTCCAGGTTCTACCAGCTCTACTGCTGAATCTC GSTSSTAESPGPGSTSES
CTGGCCCAGGTACTTCTACCCCGGAAAGCGGCTCCGCTTC PSGTAPGTSPSGESSTAP
TCCAGGTTCTACCAGCTCTACCGCTGAATCTCCTGGCCCA
ACTAGCGAATCTCCGTCTGGCACCGCACCAGGTA
CTTCCCCTAGCGGTGAATCTTCTACTGCACCA
LCW462_r45 GGTACCTCTACTCCGGAAAGCGGTTCCGCATCTCCAGGTT GTSTPESGSASPGSTSES
CTACCAGCGAATCCCCGTCTGGCACCGCACCAGGTTCTAC PSGTAPGSTSSTAESPGP
TAGCTCTACTGCTGAATCTCCGGGCCCAGGTACCTCTACT SEGSAPGTSTEP
GAACCTTCCGAAGGCAGCGCTCCAGGTACCTCTACCGAA SEGSAPGTSESATPESG
CCGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCA PGTSESATPESGPGTSTE
ACCCCTGAATCCGGTCCAGGTACCTCTGAAAGCGCTACTC PGTSTEPSEGS
CGGAGTCTGGCCCAGGTACCTCTACTGAACCGTCTGAGG APGTSESATPESGPGTS
GTAGCGCTCCAGGTACTTCTACTGAACCGTCCGAAGGTA TEPSEGSAPGTSTEPSEG
GCGCACCAGGTACTTCTGAAAGCGCTACTCCGGAGTCCG SAP
GTCCAGGTACCTCTACCGAACCGTCCGAAGGCAGCGCTC
CAGGTACTTCTACTGAACCTTCTGAGGGTAGCGCTCCC
LCW462_r47 GGTACCTCTACCGAACCGTCCGAGGGTAGCGCACCAGGT GTSTEPSEGSAPGTSTEP
ACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTAGC SEGSAPGSEPATSGSET
GAACCGGCAACCTCCGGTTCTGAAACTCCAGGTACTTCTA PGTSTEPSEGSAPGTSES
CTGAACCGTCTGAAGGTAGCGCACCAGGTACTTCTGAAA ATPESGPGTSESATPES
GCGCAACCCCGGAATCCGGCCCAGGTACCTCTGAAAGCG GPGASPGTSSTGSPGSS
CAACCCCGGAGTCCGGCCCAGGTGCATCTCCGGGTACTA PSASTGTGPGSSTPSGA
GCTCTACCGGTTCTCCAGGTTCTAGCCCTTCTGCTTCCACT TGSPGSSTPSGATGSPG
GGTACCGGCCCAGGTAGCTCTACCCCGTCTGGTGCTACTG SSTPSGATGSPGASPGT
GTTCCCCAGGTAGCTCTACTCCGTCTGGTGCAACCGGTTC SSTGSP
CCCAGGTAGCTCTACTCCTTCTGGTGCTACTGGCTCCCCA
GGTGCATCCCCTGGCACCAGCTCTACCGGTTCTCCA
LCW462_r54 GGTAGCGAACCGGCAACCTCTGGCTCTGAAACTCCAGGT GSEPATSGSETPGSEPA
AGCGAACCTGCAACCTCCGGCTCTGAAACCCCAGGTACT TSGSETPGTSTEPSEGSA
TCTACTGAACCTTCTGAGGGCAGCGCACCAGGTAGCGAA PGSEPATSGSETPGTSES
ACCTCTGGCTCTGAAACCCCAGGTACCTCTGAAA ATPESGPGTSTEPSEGS
GCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAACC APGSSTPSGATGSPGSS
GTCCGAGGGCAGCGCACCAGGTAGCTCTACTCCGTCTGG TPSGATGSPGASPGTSS
TGCTACCGGCTCTCCAGGTAGCTCTACCCCTTCTGGTGCA TGSPGSSTPSGATGSPG
ACCGGCTCCCCAGGTGCTTCTCCGGGTACCAGCTCTACTG ASPGTSSTGSPGSSTPSG
GTTCTCCAGGTAGCTCTACCCCGTCTGGTGCTACCGGTTC ATGSP
TGCTTCTCCTGGTACTAGCTCTACCGGTTCTCCA
GGTAGCTCTACCCCGTCTGGTGCTACTGGCTCTCCA
LCW462_r55 GGTACTTCTACCGAACCGTCCGAGGGCAGCGCTCCAGGT SEGSAPGTSTEP
ACTTCTACTGAACCTTCTGAAGGCAGCGCTCCAGGTACTT GTSTEPSEGSA
CTACTGAACCTTCCGAAGGTAGCGCACCAGGTACTTCTGA PGTSESATPESGPGTSTE
AAGCGCTACTCCGGAGTCCGGTCCAGGTACCTCTACCGA PSEGSAPGTSTEPSEGS
ACCGTCCGAAGGCAGCGCTCCAGGTACTTCTACTGAACCT APGSTSESPSGTAPGTSP
TCTGAGGGTAGCGCTCCAGGTTCTACTAGCGAATCTCCGT SGESSTAPGTSPSGESST
CTGGCACTGCTCCAGGTACTTCTCCTAGCGGTGAATCTTC APGSPAGSPTSTEEGTS
TACCGCTCCAGGTACTTCCCCTAGCGGCGAATCTTCTACC ESATPESGPGTSTEPSEG
GCTCCAGGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGG SAP
AAGGTACTTCTGAAAGCGCTACTCCTGAGTCTGGTCCAGG
TACCTCTACTGAACCGTCCGAAGGTAGCGCTCCA
LCW462_r57 GGTACTTCTACTGAACCTTCCGAAGGTAGCGCTCCAGGTA GTSTEPSEGSAPGSEPA
GCGAACCTGCTACTTCTGGTTCTGAAACCCCAGGTAGCCC TSGSETPGSPAGSPTSTE
GGCTGGCTCTCCGACCTCCACCGAGGAAGGTAGCCCGGC EGSPAGSPTSTEEGTSES
AGGCTCTCCGACCTCTACTGAGGAAGGTACTTCTGAAAG ATPESGPGTSTEPSEGS
CGCAACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACC APGTSTEPSEGSAPGTS
GGGCAGCGCACCAGGTACCTCTACTGAACCTTCC TEPSEGSAPGTSESATPE
GAAGGCAGCGCTCCAGGTACCTCTACCGAACCGTCCGAG SGPGSSTPSGATGSPGS
GGCAGCGCACCAGGTACTTCTGAAAGCGCAACCCCTGAA SPSASTGTGPGASPGTS
TCCGGTCCAGGTAGCTCTACTCCGTCTGGTGCAACCGGCT STGSP
CCCCAGGTTCTAGCCCGTCTGCTTCCACTGGTACTGGCCC
AGGTGCTTCCCCGGGCACCAGCTCTACTGGTTCTCCA
LCW462_r61 GAACCGGCTACTTCCGGCTCTGAGACTCCAGGT GSEPATSGSETPGSPAG
AGCCCTGCTGGCTCTCCGACCTCTACCGAAGAAGGTACCT SPTSTEEGTSESATPESG
CTGAAAGCGCTACCCCTGAGTCTGGCCCAGGTACCTCTAC PGTSTEPSEGSAPGTSTE
TGAACCTTCCGAAGGCAGCGCTCCAGGTACCTCTACCGA PSEGSAPGTSESATPES
ACCGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGC GPGTSTPESGSASPGSTS
AACCCCTGAATCCGGTCCAGGTACCTCTACTCCGGAAAG ESPSGTAPGSTSSTAESP
CGGTTCCGCATCTCCAGGTTCTACCAGCGAATCCCCGTCT GPGTSESATPESGPGTS
GGCACCGCACCAGGTTCTACTAGCTCTACTGCTGAATCTC SAPGTSTEPSEG
CGGGCCCAGGTACTTCTGAAAGCGCTACTCCGGAGTCCG SAP
GTCCAGGTACCTCTACCGAACCGTCCGAAGGCAGCGCTC
CTTCTACTGAACCTTCTGAGGGTAGCGCTCCA
LCW462_r64 TCTACCGAACCGTCCGAGGGCAGCGCTCCAGGT GTSTEPSEGSAPGTSTEP
ACTTCTACTGAACCTTCTGAAGGCAGCGCTCCAGGTACTT SEGSAPGTSTEPSEGSA
CTACTGAACCTTCCGAAGGTAGCGCACCAGGTACCTCTAC PGTSTEPSEGSAPGTSES
CGAACCGTCTGAAGGTAGCGCACCAGGTACCTCTGAAAG ATPESGPGTSESATPES
CGCAACTCCTGAGTCCGGTCCAGGTACTTCTGAAAGCGC GPGTPGSGTASSSPGSS
AACCCCGGAGTCTGGCCCAGGTACTCCTGGCAGCGGTAC TPSGATGSPGASPGTSS
CGCATCTTCCTCTCCAGGTAGCTCTACTCCGTCTGGTGCA TGSPGSTSSTAESPGPG
ACTGGTTCCCCAGGTGCTTCTCCGGGTACCAGCTCTACCG TSPSGESSTAPGTSTPES
CAGGTTCCACCAGCTCTACTGCTGAATCTCCTGG GSASP
TCCAGGTACCTCTCCTAGCGGTGAATCTTCTACTGCTCCA
GGTACTTCTACTCCTGAAAGCGGCTCTGCTTCTCCA
LCW462_r67 GGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGT GSPAGSPTSTEEGTSES
ACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTACC ATPESGPGTSTEPSEGS
TCTACCGAACCGTCTGAGGGCAGCGCACCAGGTACTTCT APGTSESATPESGPGSE
GAAAGCGCAACCCCTGAATCCGGTCCAGGTAGCGAACCG PATSGSETPGTSTEPSEG
GCTACTTCTGGCTCTGAGACTCCAGGTACTTCTACCGAAC SAPGSPAGSPTSTEEGT
CGTCCGAAGGTAGCGCACCAGGTAGCCCGGCTGGTTCTC STEPSEGSAPGTSTEPSE
CGACTTCCACCGAGGAAGGTACCTCTACTGAACCTTCTGA GSAPGTSTEPSEGSAPG
GGGTAGCGCTCCAGGTACCTCTACTGAACCTTCCGAAGG TSTEPSEGSAPGTSTEPS
CAGCGCTCCAGGTACTTCTACCGAACCGTCCGAGGGCAG EGSAP
CGCTCCAGGTACTTCTACTGAACCTTCTGAAGGCAGCGCT
CCAGGTACTTCTACTGAACCTTCCGAAGGTAGCGCACCA
LCW462_r69 GGTACTTCTCCGAGCGGTGAATCTTCTACCGCACCAGGTT GTSPSGESSTAPGSTSST
CTACTAGCTCTACCGCTGAATCTCCGGGCCCAGGTACTTC AESPGPGTSPSGESSTAP
TCCGAGCGGTGAATCTTCTACTGCTCCAGGTACCTCTGAA GTSESATPESGPGTSTEP
AGCGCTACTCCGGAGTCTGGCCCAGGTACCTCTACTGAAC GTSTEPSEGSA
CGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAACCGTC PGSSPSASTGTGPGSSTP
CGAAGGTAGCGCACCAGGTTCTAGCCCTTCTGCATCTACT SGATGSPGASPGTSSTG
GGCCCAGGTAGCTCTACTCCTTCTGGTGCTACCG PESGSASPGTSP
GCTCTCCAGGTGCTTCTCCGGGTACTAGCTCTACCGGTTC SGESSTAPGTSPSGESST
TCCAGGTACTTCTACTCCGGAAAGCGGTTCCGCATCTCCA AP
TCTCCTAGCGGTGAATCTTCTACTGCTCCAGGTA
CCTCTCCTAGCGGCGAATCTTCTACTGCTCCA
LCW462_r70 GGTACCTCTGAAAGCGCTACTCCGGAGTCTGGCCCAGGT GTSESATPESGPGTSTEP
ACCTCTACTGAACCGTCTGAGGGTAGCGCTCCAGGTACTT SEGSAPGTSTEPSEGSA
CTACTGAACCGTCCGAAGGTAGCGCACCAGGTAGCCCTG PGSPAGSPTSTEEGSPA
CTGGCTCTCCGACTTCTACTGAGGAAGGTAGCCCGGCTGG GSPTSTEEGTSTEPSEGS
TTCTCCGACTTCTACTGAGGAAGGTACTTCTACCGAACCT APGSSPSASTGTGPGSS
TCCGAAGGTAGCGCTCCAGGTTCTAGCCCTTCTGCTTCCA TPSGATGSPGSSTPSGA
CCGGTACTGGCCCAGGTAGCTCTACCCCTTCTGGTGCTAC TGSPGSEPATSGSETPG
CGGCTCCCCAGGTAGCTCTACTCCTTCTGGTGCAACTGGC TSESATPESGPGSEPATS
TCTCCAGGTAGCGAACCGGCAACTTCCGGCTCTGAAACC GSETP
CCAGGTACTTCTGAAAGCGCTACTCCTGAGTCTGGCCCAG
GTAGCGAACCTGCTACCTCTGGCTCTGAAACCCCA
LCW462_r72 GGTACTTCTACCGAACCGTCCGAAGGCAGCGCTCCAGGT GTSTEPSEGSAPGTSTEP
ACCTCTACTGAACCTTCCGAGGGCAGCGCTCCAGGTACCT SEGSAPGTSTEPSEGSA
CTACCGAACCTTCTGAAGGTAGCGCACCAGGTAGCTCTA PGSSTPSGATGSPGASP
CTGGTGCTACCGGTTCCCCAGGTGCTTCTCCTGG GTSSTGSPGSSTPSGAT
TACTAGCTCTACCGGTTCTCCAGGTAGCTCTACCCCGTCT GSPGTSESATPESGPGS
ACTGGCTCTCCAGGTACTTCTGAAAGCGCAACCC EPATSGSETPGTSTEPSE
CTGAATCCGGTCCAGGTAGCGAACCGGCTACTTCTGGCTC GSAPGSTSESPSGTAPG
TGAGACTCCAGGTACTTCTACCGAACCGTCCGAAGGTAG STSESPSGTAPGTSTPES
CGCACCAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCA GSASP
CCAGGTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAG
GTACCTCTACCCCTGAAAGCGGTTCCGCTTCTCCA
LCW462_r73 GGTACCTCTACTCCTGAAAGCGGTTCTGCATCTCCAGGTT GTSTPESGSASPGSTSST
CCACTAGCTCTACCGCAGAATCTCCGGGCCCAGGTTCTAC AESPGPGSTSSTAESPGP
TAGCTCTACTGCTGAATCTCCTGGCCCAGGTTCTAGCCCT GSSPSASTGTGPGSSTPS
TCTGCATCTACTGGTACTGGCCCAGGTAGCTCTACTCCTT GASPGTSSTGS
CTGGTGCTACCGGCTCTCCAGGTGCTTCTCCGGGTACTAG TSGSETPGTSES
CTCTACCGGTTCTCCAGGTAGCGAACCGGCAACCTCCGGC ATPESGPGSPAGSPTST
TCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAAT EEGSTSESPSGTAPGSTS
CCGGCCCAGGTAGCCCGGCAGGTTCTCCGACTTCCACTGA ESPSGTAPGTSTPESGS
GGAAGGTTCTACTAGCGAATCTCCTTCTGGCACTGCACCA ASP
GGTTCTACCAGCGAATCTCCGTCTGGCACTGCACCAGGTA
CCTCTACCCCTGAAAGCGGTTCCGCTTCTCCC
LCW462_r78 GGTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAAGGTA GSPAGSPTSTEEGTSES
CTTCTGAAAGCGCTACTCCTGAGTCTGGTCCAGGTACCTC ATPESGPGTSTEPSEGS
TACTGAACCGTCCGAAGGTAGCGCTCCAGGTTCTACCAG APGSTSESPSGTAPGSTS
CGAATCTCCTTCTGGCACCGCTCCAGGTTCTACTAGCGAA ESPSGTAPGTSPSGESST
TCCCCGTCTGGTACCGCACCAGGTACTTCTCCTAGCGGCG APGTSTEPSEGSAPGSP
AATCTTCTACCGCACCAGGTACCTCTACCGAACCTTCCGA AGSPTSTEEGTSTEPSE
AGGTAGCGCTCCAGGTAGCCCGGCAGGTTCTCCTACTTCC GSAPGSEPATSGSETPG
ACTGAGGAAGGTACTTCTACCGAACCTTCTGAGGGTAGC TSESATPESGPGTSTEPS
GCACCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACC EGSAP
CCAGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAG
GTACTTCTACTGAACCGTCCGAGGGCAGCGCACCA
LCW462_r79 GGTACCTCTACCGAACCTTCCGAAGGTAGCGCTCCAGGT GTSTEPSEGSAPGSPAG
AGCCCGGCAGGTTCTCCTACTTCCACTGAGGAAGGTACTT EGTSTEPSEGSA
CTACCGAACCTTCTGAGGGTAGCGCACCAGGTACCTCCCC PGTSPSGESSTAPGTSPS
TAGCGGCGAATCTTCTACTGCTCCAGGTACCTCTCCTAGC GESSTAPGTSPSGESST
GGCGAATCTTCTACCGCTCCAGGTACCTCCCCTAGCGGTG APGSTSESPSGTAPGSTS
AATCTTCTACCGCACCAGGTTCTACCAGCGAATCCCCTTC ESPSGTAPGTSTPESGS
TGGTACTGCTCCAGGTTCTACCAGCGAATCCCCTTCTGGC ASPGSEPATSGSETPGT
ACCGCACCAGGTACTTCTACCCCTGAAAGCGGCTCCGCTT SESATPESGPGTSTEPSE
CTCCAGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCC GSAP
AGGTACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGT
ACTTCTACTGAACCGTCCGAGGGCAGCGCACCA
LCW462_r87 GGTAGCGAACCGGCAACCTCTGGCTCTGAAACCCCAGGT GSEPATSGSETPGTSES
ACCTCTGAAAGCGCTACTCCGGAATCTGGTCCAGGTACTT ATPESGPGTSESATPES
CTGAAAGCGCTACTCCGGAATCCGGTCCAGGTACTTCTCC GPGTSPSGESSTAPGSTS
GAGCGGTGAATCTTCTACCGCACCAGGTTCTACTAGCTCT STAESPGPGTSPSGESST
ACCGCTGAATCTCCGGGCCCAGGTACTTCTCCGAGCGGTG APGSTSESPSGTAPGTSP
AATCTTCTACTGCTCCAGGTTCTACTAGCGAATCCCCGTC SGESSTAPGSTSSTAESP
TGGTACTGCTCCAGGTACTTCCCCTAGCGGTGAATCTTCT GPGSSTPSGATGSPGSS
ACTGCTCCAGGTTCTACCAGCTCTACCGCAGAATCTCCGG GSPGSSTPSGA
GTAGCTCTACTCCGTCTGGTGCAACCGGTTCCCC NWLS
AGGTAGCTCTACCCCTTCTGGTGCAACCGGCTCCCCAGGT
AGCTCTACCCCTTCTGGTGCAAACTGGCTCTCC
LCW462_r88 GGTAGCCCTGCTGGCTCTCCGACTTCTACTGAGGAAGGTA GSPAGSPTSTEEGSPAG
CTGGTTCTCCGACTTCTACTGAGGAAGGTACTTC SPTSTEEGTSTEPSEGSA
TACCGAACCTTCCGAAGGTAGCGCTCCAGGTACCTCTACT PGTSTEPSEGSAPGTSTE
GAACCTTCCGAAGGCAGCGCTCCAGGTACCTCTACCGAA PSEGSAPGTSESATPES
CCGTCCGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCA GTSSTGSPGSS
ACCCCTGAATCCGGTCCAGGTGCATCTCCTGGTACCAGCT TPSGATGSPGASPGTSS
CTACCGGTTCTCCAGGTAGCTCTACTCCTTCTGGTGCTAC TGSPGSSTPSGATGSPG
TGGCTCTCCAGGTGCTTCCCCGGGTACCAGCTCTACCGGT TPGSGTASSSPGSSTPSG
TCTCCAGGTAGCTCTACCCCGTCTGGTGCTACTGGTTCTC ATGSP
CAGGTACTCCGGGCAGCGGTACTGCTTCTTCCTCTCCAGG
TACCCCTTCTGGTGCTACTGGCTCTCCA
LCW462_r89 GGTAGCTCTACCCCGTCTGGTGCTACTGGTTCTCCAGGTA GSSTPSGATGSPGTPGS
CTCCGGGCAGCGGTACTGCTTCTTCCTCTCCAGGTAGCTC GTASSSPGSSTPSGATG
TTCTGGTGCTACTGGCTCTCCAGGTAGCCCGGCT SPGSPAGSPTSTEEGTSE
GGCTCTCCTACCTCTACTGAGGAAGGTACTTCTGAAAGCG SATPESGPGTSTEPSEGS
CTACTCCTGAGTCTGGTCCAGGTACCTCTACTGAACCGTC APGTSESATPESGPGSE
TAGCGCTCCAGGTACCTCTGAAAGCGCAACTCC ETPGTSESATPE
TGAGTCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCTCT SGPGTSTEPSEGSAPGT
GAGACTCCAGGTACCTCTGAAAGCGCAACCCCGGAATCT ESGPGTSESATP
GGTCCAGGTACTTCTACTGAACCGTCTGAAGGTAGCGCA ESGP
CCAGGTACTTCTGAAAGCGCAACCCCGGAATCCGGCCCA
GGTACCTCTGAAAGCGCAACCCCGGAGTCCGGCCCA
Example 7: Construction of XTEN_AM288
The entire library LCW0462 was dimerized as described in Example 6, resulting in a library of
XTEN_AM288 clones designated LCW0463. 1512 isolates from library LCW0463 were screened using
the protocol described in Example 6. 176 highly expressing clones were sequenced and 40 preferred
XTEN_AM288 segments were chosen for the construction of multifunctional proteins that contain
multiple XTEN segments with 288 amino acid residues.
Example 8: Construction of XTEN_AM432
We generated a library of XTEN_AM432 segments by recombining segments from library
LCW0462 of XTEN_AM144 segments and segments from library LCW0463 of XTEN_AM288
segments. This new library of XTEN_AM432 segments was designated LCW0464. Plasmid was isolated
from cultures of E. coli harboring LCW0462 and LCW0463, respectively. 1512 isolates from library
LCW0464 were screened using the protocol described in Example 6. 176 highly expressing clones were
sequenced and 39 preferred XTEN_AM432 segments were chosen for the construction of longer XTENs
and for the construction of multifunctional proteins that contain multiple XTEN segments with 432
amino acid residues.
In parallel we constructed library LMS0100 of XTEN_AM432 segments using preferred
segments of XTEN_AM144 and XTEN_AM288. Screening of this library yielded 4 isolates that were
selected for further construction.
Example 9: Construction of XTEN_AM875
The stuffer vector pCW0359 was digested with BsaI and KpnI to remove the stuffer segment
and the resulting vector fragment was isolated by agarose gel purification.
We annealed the phosphorylated oligonucleotide BsaI-AscI-KpnIforP:
AGGTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAGGTTCGTCTTCACTCGAGGGTAC and
the non-phosphorylated oligonucleotide BsaI-AscI-KpnIrev:
CCTCGAGTGAAGACGAACCTCCCGTGCTTGGCGCGCCGCTTGCGCTTGC for introducing the
sequencing island A (SI-A), which encodes amino acids GASASGAPSTG and has the restriction enzyme
AscI recognition nucleotide sequence GGCGCGCC inside. The annealed oligonucleotide pairs were
ligated with the BsaI and KpnI digested stuffer vector pCW0359 prepared above to yield pCW0466
containing SI-A. We then generated a library of XTEN_AM443 segments by recombining 43 preferred
XTEN_AM432 segments from Example 8 and SI-A segments from pCW0466 at the C-terminus using the
same dimerization process described in Example 5. This new library of XTEN_AM443 segments was
designated LCW0479.
We generated a library of XTEN_AM875 segments by recombining segments from library
LCW0479 of XTEN_AM443 segments and 43 preferred XTEN_AM432 segments from Example 8 using
the same dimerization process described in Example 5. This new library of XTEN_AM875 segments was
designated LCW0481.
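The SI-A oligonucleotide pair quoted in this Example can be checked for internal consistency with a short script. The following is a minimal sketch, not part of the described procedure: it tests that the two oligonucleotides anneal to leave 4-nucleotide overhangs at each end, that the AscI site is present, and that the forward strand encodes GASASGAPSTG. The reading-frame offset of one nucleotide is an assumption inferred from the stated peptide, and the helper names are illustrative only.

```python
# Standard genetic code, built from the conventional TCAG-ordered string.
CODON_BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(CODON_BASES)
    for j, b in enumerate(CODON_BASES)
    for k, c in enumerate(CODON_BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the start of seq; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Oligonucleotide sequences as quoted in Example 9.
FORWARD = "AGGTGCAAGCGCAAGCGGCGCGCCAAGCACGGGAGGTTCGTCTTCACTCGAGGGTAC"
REVERSE = "CCTCGAGTGAAGACGAACCTCCCGTGCTTGGCGCGCCGCTTGCGCTTGC"

# Double-stranded core of the annealed pair, read on the forward strand.
duplex_core = reverse_complement(REVERSE)
five_prime_overhang = FORWARD[:4]    # single-stranded AGGT end
three_prime_overhang = FORWARD[-4:]  # single-stranded GTAC end

# Assumed frame: translation starts at the second nucleotide of FORWARD.
peptide = translate(FORWARD[1:])
has_asci_site = "GGCGCGCC" in FORWARD
```

Under these assumptions the forward oligonucleotide equals the 5' overhang, the duplex core, and the 3' overhang joined in that order, which is the arrangement expected for ligation into a vector cut with two different enzymes.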
Example 10: Construction of XTEN_AM1318
We annealed the phosphorylated oligonucleotide BsaI-FseI-KpnIforP:
AGGTCCAGAACCAACGGGGCCGGCCCCAAGCGGAGGTTCGTCTTCACTCGAGGGTAC and
the non-phosphorylated oligonucleotide BsaI-FseI-KpnIrev:
CCTCGAGTGAAGACGAACCTCCGCTTGGGGCCGGCCCCGTTGGTTCTGG for introducing the
sequencing island B (SI-B), which encodes amino acids GPEPTGPAPSG and has the restriction enzyme
FseI recognition nucleotide sequence GGCCGGCC inside. The annealed oligonucleotide pairs were
ligated with the BsaI and KpnI digested stuffer vector pCW0359 as used in Example 9 to yield pCW0467
containing SI-B. We then generated a library of XTEN_AM443 segments by recombining 43 preferred
XTEN_AM432 segments from Example 8 and SI-B segments from pCW0467 at the C-terminus using the
same dimerization process described in Example 5. This new library of XTEN_AM443 segments was
designated LCW0480.
We generated a library of XTEN_AM1318 segments by recombining segments from library
LCW0480 of XTEN_AM443 segments and segments from library LCW0481 of XTEN_AM875
segments using the same dimerization process as in Example 5. This new library of XTEN_AM1318
segments was designated LCW0487.
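The segment names in Examples 7 through 10 are consistent with simple addition of residue counts at each recombination step. The sketch below tracks that arithmetic. It assumes that the number in each XTEN_AM name is the residue count and that recombination joins the two parents with no added or lost residues; neither assumption is stated explicitly in the text, and the variable names are illustrative.

```python
# Residue counts implied by the segment names in Examples 6-10 (assumed).
SEQUENCING_ISLAND = "GASASGAPSTG"   # SI-A; SI-B (GPEPTGPAPSG) has the same length
SI_LENGTH = len(SEQUENCING_ISLAND)  # 11 residues

am144 = 144
am288 = am144 + am144       # Example 7: dimerization of the AM144 library
am432 = am144 + am288       # Example 8: AM144 recombined with AM288
am443 = am432 + SI_LENGTH   # Examples 9 and 10: AM432 plus a sequencing island
am875 = am443 + am432       # Example 9: AM443 recombined with AM432
am1318 = am443 + am875      # Example 10: AM443 recombined with AM875

for name, length in [("AM288", am288), ("AM432", am432), ("AM443", am443),
                     ("AM875", am875), ("AM1318", am1318)]:
    print(f"XTEN_{name}: {length} residues")
```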
Example 11: Construction of XTEN_AD864
Using several consecutive rounds of dimerization, we assembled a collection of
XTEN_AD864 sequences starting from segments of XTEN_AD36 listed in Example 1. These sequences
were assembled as described in Example 5. Several isolates from XTEN_AD864 were evaluated and
found to show good expression and excellent solubility under physiological conditions. One intermediate
construct of XTEN_AD576 was sequenced. This clone was evaluated in a PK experiment in cynomolgus
monkeys and a half-life of about 20 h was measured.
Example 12: Construction of XTEN_AF864
Using several consecutive rounds of dimerization, we assembled a collection of
XTEN_AF864 sequences starting from segments of XTEN_AF36 listed in Example 3. These sequences
were assembled as described in Example 5. Several isolates from XTEN_AF864 were evaluated and
found to show good expression and excellent solubility under physiological conditions. One intermediate
construct of XTEN_AF540 was sequenced. This clone was evaluated in a PK experiment in cynomolgus
monkeys and a half-life of about 20h was measured. A full length clone of XTEN_AF864 had excellent
solubility and showed half-life exceeding 60h in cynomolgus monkeys. A second set of XTEN_AF
sequences was assembled including a sequencing island as described in Example 9.
Example 13: Construction of XTEN_AG864
Using several consecutive rounds of dimerization, we assembled a collection of
XTEN_AG864 sequences starting from segments of XTEN_AG36 listed in Example 4. These sequences
were assembled as described in Example 5. Several isolates from XTEN_AG864 were evaluated and
found to show good expression and excellent solubility under physiological conditions. A full-length
clone of XTEN_AG864 had excellent solubility and showed a half-life exceeding 60 h in cynomolgus
monkeys.
Example 14: Methods of producing and evaluating GLP2-XTEN containing GLP-2 and
A general schema for producing and evaluating GLP2-XTEN compositions is presented in the figures
and forms the basis for the general description of this Example. The GLP-2 peptides and sequence
variants may be prepared recombinantly. Exemplary recombinant methods used to prepare GLP-2
peptides include the following, among others, as will be apparent to one skilled in the art. Typically, a
GLP-2 peptide or sequence variant as defined and/or described herein is prepared by constructing the
nucleic acid encoding the desired peptide, cloning the nucleic acid into an expression vector in frame
with nucleic acid encoding one or more XTEN, transforming a host cell (e.g., bacteria such as
Escherichia coli, yeast such as Saccharomyces cerevisiae, or mammalian cell such as Chinese hamster
ovary cell or baby hamster kidney cell), and expressing the nucleic acid to produce the desired
GLP2-XTEN. Methods for producing and expressing recombinant polypeptides in vitro and in prokaryotic and
eukaryotic host cells are known to those of ordinary skill in the art. See, for example, U.S. Pat. No.
122, and Sambrook et al., Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring
Harbor Laboratory Press (2001).
Using the disclosed methods and those known to one of ordinary skill in the art, together with
guidance provided in the illustrative examples, a skilled artisan can create and evaluate GLP2-XTEN
fusion proteins comprising XTENs, GLP-2 and variants of GLP-2 disclosed herein or otherwise known
in the art. The Example is, therefore, to be construed as merely illustrative, and not limitative of the
methods in any way whatsoever; numerous variations will be apparent to the ordinarily skilled artisan. In
this Example, a GLP2-XTEN comprising a GLP-2 linked to an XTEN of the AE family of motifs is
created.
The general scheme for producing polynucleotides encoding XTEN is presented in the figures,
including FIG. 4. The schematic flowchart shows representative steps in the assembly of an XTEN polynucleotide
construct in one of the embodiments of the invention. Individual oligonucleotides 501 are annealed into
sequence motifs 502 such as a 12 amino acid motif ("12-mer"), which is ligated to additional sequence
motifs from a library that can multimerize to create a pool that encompasses the desired length of the
XTEN 504, as well as ligated to a smaller concentration of an oligo containing BbsI and KpnI restriction
sites 503. The motif libraries can be limited to specific sequence XTEN families; e.g., AD, AE, AF, AG,
AM, or AQ sequences of Table 3. As illustrated, the XTEN length in this case is 864 amino
acid residues, but shorter or longer lengths can be achieved by this process. For example,
multimerization can be performed by ligation, overlap extension, PCR assembly or similar cloning
techniques known in the art. The resulting pool of ligation products is gel-purified and the band with the
desired length of XTEN is cut, resulting in an isolated XTEN gene with a stopper sequence 505. The
XTEN gene can be cloned into a stuffer vector. In this case, the vector encodes an optional CBD
sequence 506 and a GFP gene 508. Digestion is then performed with BbsI/HindIII to remove 507 and
508 and place the stop codon. The resulting product is then cloned into a BsaI/HindIII digested vector
containing a gene encoding the GLP-2, resulting in the gene 500 encoding a GLP2-XTEN fusion protein.
As would be apparent to one of ordinary skill in the art, the methods can be applied to create constructs in
alternative configurations and with varying XTEN lengths.
DNA sequences encoding GLP-2 can be conveniently obtained by standard procedures known
in the art from a cDNA library prepared from an appropriate cellular source, from a genomic library, or
may be created synthetically (e.g., automated nucleic acid synthesis), particularly where sequence
variants (e.g., 2G) are to be incorporated, using DNA sequences obtained from publicly available
databases, patents, or literature references. In the present example, the GLP2G sequence is utilized. A
gene or polynucleotide encoding the GLP-2 portion of the protein or its complement can then be
cloned into a construct, such as those described herein, which can be a plasmid or other vector under
control of appropriate transcription and translation sequences for high level protein expression in a
biological system. A second gene or polynucleotide coding for the XTEN portion or its complement can
be genetically fused to the nucleotides encoding the terminus of the GLP-2 gene by cloning it into the
construct adjacent and in frame with the gene coding for the GLP-2, through a ligation or multimerization
step. In this manner, a chimeric DNA molecule coding for (or complementary to) the GLP2-XTEN
fusion protein is generated within the construct. Optionally, a gene encoding a second XTEN is
inserted and ligated in-frame internally to the nucleotides encoding the GLP-2-encoding region.
Optionally, this chimeric DNA molecule is transferred or cloned into another construct that is a more
appropriate expression vector; e.g., a vector appropriate for a prokaryotic host cell such as E. coli, a
eukaryotic host cell such as yeast, or a mammalian host cell such as CHO, BHK and the like. At this
point, a host cell capable of expressing the chimeric DNA molecule is transformed with the chimeric
DNA molecule. The vectors containing the DNA segments of interest can be transferred into an
appropriate host cell by well-known methods, depending on the type of cellular host, as described supra.
Host cells containing the GLP2-XTEN expression vector are cultured in conventional nutrient
media modified as appropriate for activating the promoter. The culture conditions, such as temperature,
pH and the like, are those previously used with the host cell selected for expression, and will be apparent
to the ordinarily skilled artisan. After expression of the fusion protein, culture broth is harvested and
separated from the cell mass and the resulting crude extract retained for purification of the fusion protein.
Gene expression is measured in a sample directly, for example, by conventional Southern
blotting, Northern blotting to quantitate the transcription of mRNA [Thomas, Proc. Natl. Acad. Sci. USA,
77:5201-5205 (1980)], dot blotting (DNA analysis), or in situ hybridization, using an appropriately
labeled probe, based on the sequences provided herein. Alternatively, gene expression is measured by
immunological or fluorescent methods, such as immunohistochemical staining of cells to quantitate
directly the expression of gene product. Antibodies useful for immunohistochemical staining and/or
assay of sample fluids may be either monoclonal or polyclonal, and may be prepared in any mammal.
Conveniently, the antibodies may be prepared against the GLP-2 sequence polypeptide using a synthetic
peptide based on the sequences provided herein or against exogenous sequence fused to GLP-2 and
encoding a specific antibody epitope. Examples of selectable markers are well known to one of skill in
the art and include reporters such as enhanced green fluorescent protein (EGFP), beta-galactosidase
(β-gal) or chloramphenicol acetyltransferase (CAT).
The GLP2-XTEN polypeptide product is purified via methods known in the art. Procedures
such as gel filtration, affinity purification, salt fractionation, ion exchange chromatography, size
exclusion chromatography, hydroxyapatite adsorption chromatography, hydrophobic interaction
chromatography or gel electrophoresis are all techniques that may be used in the purification. Specific
methods of purification are described in Robert K. Scopes, Protein Purification: Principles and Practice,
Charles R. Cantor, ed., Springer-Verlag 1994, and Sambrook, et al., supra. Multi-step purification
separations are also described in Baron, et al., Crit. Rev. Biotechnol. -90 (1990) and Below, et al.,
J. Chromatogr. A. 679:67-83 (1994).
As illustrated in the general schema, the isolated GLP2-XTEN fusion proteins would then be characterized
for their chemical and activity properties. Isolated fusion protein is characterized, e.g., for sequence,
purity, apparent molecular weight, solubility and stability using standard methods known in the art. The
fusion protein meeting expected standards would then be evaluated for activity, which can be measured
in vitro or in vivo by measuring one of the GLP-2-associated parameters described herein, using one or
more assays disclosed herein, or using the assays of the Examples or the assays of Table 32.
In addition, the GLP2-XTEN fusion protein is administered to one or more animal species to
determine standard pharmacokinetic parameters and pharmacodynamic properties, as described in
Examples 18-21.
By the iterative process of producing, expressing, and recovering GLP2-XTEN constructs,
followed by their characterization using methods disclosed herein or others known in the art, the
GLP2-XTEN compositions comprising GLP-2 and an XTEN can be produced and evaluated by one of ordinary
skill in the art to confirm the expected properties such as enhanced solubility, enhanced stability,
improved pharmacokinetics and reduced immunogenicity, leading to an overall enhanced therapeutic
activity compared to the corresponding unfused GLP-2. For those fusion proteins not possessing the
desired properties, a different sequence can be constructed, expressed, isolated and evaluated by these
methods in order to obtain a composition with such properties.
Example 15: Construction of GLP2-XTEN genes and vectors
Oligonucleotides were designed and constructed such that the entire GLP-2 gene could be
assembled through the tiling together of these oligonucleotides via designed complementary overhang
regions under conditions of a 48°C annealing temperature. The complementary regions were held
constant, but the other regions of the oligonucleotides were varied such that a codon library was created
with ~50% of the codons in the gene varied instead of the single native gene sequence. A PCR was
performed to create a combined gene library which, as is typical, contained a variety of combinations of
the oligonucleotides and presented as a smear on an agarose gel. A polishing PCR was performed to
amplify those assemblies that had the correct termini using a set of amplification primers complementary
to the 5' and 3' ends of the gene. The product of this PCR was then gel purified, taking only bands at the
~100 bp length of the expected GLP-2 final gene product. This gel-purified product was digested with
BsaI and NdeI and ligated into a similarly digested construct containing DNA encoding a CBD leader
sequence and the AE864 XTEN, to produce a GLP2-XTEN_AE864 gene, and transformed into BL21 gold
competent cells. Colonies from this transformation were picked into 500 µl cultures of SB in 96 deep
well plates and grown to saturation overnight. These cultures were stored at 4°C after 20 µl of these
cultures was used to inoculate 500 µl of auto-induction media, and these cultures were grown at 26°C for
>24 hours. Following the growth, the GFP fluorescence of 100 µl of these auto-induction media cultures
was measured using a fluorescence plate reader. The GFP fluorescence is proportional to the number of
molecules of GLP2-XTEN_AE864 made and is therefore a read-out of total expression. The highest
expressing clones were identified, and a new 1 ml overnight was started in SB from the original saturated
overnight culture of that clone. Mini-preps were performed with these new cultures and the derived
plasmids were sequenced to determine the exact nucleotide composition. An E. coli isolate was
designated strain AC453 and was identified as a strain that expressed the desired GLP-2_2G-
XTEN_AE864 fusion protein. The DNA and amino acid sequences of the pre-cleavage expressed
product (with a CBD leader and TEV cleavage sequence) and the amino acid sequence of the final
product GLP-2-2G-XTEN_AE864 (after TEV cleavage) are provided in Table 13.
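As a consistency check on the ~100 bp gene length quoted above, the GLP-2 portion of the final product can be translated directly from the nucleotides listed in Table 13. The sketch below is illustrative only: the DNA string is copied from the first lines of the GLP-2-2G-AE864 row, and the choice of the first 99 nucleotides as the GLP-2 coding region is an assumption made by matching the start of the listed amino acid sequence.

```python
# Standard genetic code, built from the conventional TCAG-ordered string.
CODON_BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(CODON_BASES)
    for j, b in enumerate(CODON_BASES)
    for k, c in enumerate(CODON_BASES)
}


def translate(seq: str) -> str:
    """Translate complete codons from the start of seq; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Leading nucleotides of the GLP-2-2G-AE864 entry in Table 13.
# Assumption: the GLP-2-2G coding region is the first 99 nt, before the XTEN.
GLP2_2G_DNA = (
    "CATGGTGACGGCTCTTTTAGCGATGAAATGAATACTATAC"
    "TGGACAACCTTGCGGCACGCGACTTCATTAACTGGCTGAT"
    "CCAGACAAAAATCACCGAT"
)

glp2_2g_peptide = translate(GLP2_2G_DNA)
print(len(GLP2_2G_DNA), "nt ->", len(glp2_2g_peptide), "residues:", glp2_2g_peptide)
```

The second residue of the translated peptide is glycine, which is consistent with the "2G" designation used for this variant.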
Table 13: GLP2-XTEN DNA and amino acid sequences
Clone Name DNA Sequence Amino Acid Sequence
CBD-TEV- ATGGCAAATACACCGGTATCAGGCAATTTGAAGGTTGAAT MANTPVSGNLKVEF
2G— TCTACAACAGCAATCCTTCAGATACTACTAACTCAATCAA YNSNPSDTTNSNPQ
AE864 GTTCAAGGTTACTAATACCGGAAGCAGTGCAATT FKVTNTGSSAIDLSK
(pCW812/ GATTTGTCCAAACTCACATTGAGATATTATTATACAGTAGA
LTLRYYYTVDGQKD
AC453) CGGACAGAAAGATCAGACCTTCTGGGCTGACCATGCTGCA
QTFWADHAAHGSN
ATAATCGGCAGTAACGGCAGCTACAACGGAATTACTTCAA
GSYNGITSNVKGTF
ATGTAAAAGGAACATTTGTAAAAATGAGTTCCTCAACAAA
VKMSSSTNNADTYL
TAACGCAGACACCTACCTTGAAATCAGCTTTACAGGCGGA
ACTCTTGAACCGGGTGCACATGTTCAGATACAAGGTAGAT EISFTGGTLEPGAHV
TTGCAAAGAATGACTGGAGTAACTATACACAGTCAAATGA QIQGRFAKNDWSNY
CTACTCATTCAAGTCTGCTTCACAGTTTGTTGAATGGGATC TQSNDYSFKSASQF
AGGTAACAGCATACTTGAACGGTGTTCTTGTATGGGGTAA TAYLNGV
CGGTGGCAGTGTAGTAGGTTCAGGTTCAGGATCC LVWGKEPGGSVVGS
GAAAATCTGTATTTTCAACATGGTGACGGCTCTTTTAGCGA GSGSENLYFQHGDG
TGAAATGAATACTATACTGGACAACCTTGCGGCACGCGAC
TTCATTAACTGGCTGATCCAGACAAAAATCACCGATGGAG SFSDEMNTILDNLA
GTAGCCCGGCTGGCTCTCCTACCTCTACTGAGGAAGGTAC ARDFINWLIQTKITD
TTCTGAAAGCGCTACTCCTGAGTCTGGTCCAGGTACCTCTA GGSPAGSPTSTEEGT
CTGAACCGTCCGAAGGTAGCGCTCCAGGTAGCCCAGCAGG
SESATPESGPGTSTE
CTCTCCGACTTCCACTGAGGAAGGTACTTCTACTGAACCTT
PSEGSAPGSPAGSPT
CCGAAGGCAGCGCACCAGGTACCTCTACTGAACCTTCTGA
STEEGTSTEPSEGSA
GGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGAA
PGTSTEPSEGSAPGT
TCTGGCCCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAA
CCCCAGGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCC SESATPESGPGSEPA
AGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGT TSGSETPGSEPATSG
GAAAGCGCAACCCCGGAGTCCGGCCCAGGTACCT SETPGSPAGSPTSTE
CTACCGAACCGTCTGAGGGCAGCGCACCAGGTACTTCTAC EGTSESATPESGPGT
GTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGG STEPSEGSAPGTSTE
TTCTCCTACCTCCACCGAGGAAGGTACTTCTACCGAACCGT PSEGSAPGSPAGSPT
CCGAGGGTAGCGCACCAGGTACCTCTACTGAACCTTCTGA
STEEGTSTEPSEGSA
GGGCAGCGCTCCAGGTACTTCTGAAAGCGCTACCCCGGAG
PGTSTEPSEGSAPGT
TCCGGTCCAGGTACTTCTACTGAACCGTCCGAAGGTAGCG
SESATPESGPGTSTE
CACCAGGTACTTCTGAAAGCGCAACCCCTGAATCCGGTCC
PSEGSAPGTSESATP
AGGTAGCGAACCGGCTACTTCTGGCTCTGAGACTCCAGGT
ACTTCTACCGAACCGTCCGAAGGTAGCGCACCAGGTACTT EPATSGSET
CTACTGAACCGTCTGAAGGTAGCGCACCAGGTACTTCTGA PGTSTEPSEGSAPGT
AAGCGCAACCCCGGAATCCGGCCCAGGTACCTCTGAAAGC STEPSEGSAPGTSES
GCAACCCCGGAGTCCGGCCCAGGTAGCCCTGCTGGCTCTC ATPESGPGTSESATP
CAACCTCCACCGAAGAAGGTACCTCTGAAAGCGCAACCCC ESGPGSPAGSPTSTE
TGAATCCGGCCCAGGTAGCGAACCGGCAACCTCCGGTTCT ATPESGPGS
GAAACCCCAGGTACCTCTGAAAGCGCTACTCCGGAGTCTG
EPATSGSETPGTSES
GCCCAGGTACCTCTACTGAACCGTCTGAGGGTAGCGCTCC
ATPESGPGTSTEPSE
AGGTACTTCTACTGAACCGTCCGAAGGTAGCGCACCAGGT
GSAPGTSTEPSEGSA
ACTTCTACCGAACCGTCCGAAGGCAGCGCTCCAGGTACCT
PGTSTEPSEGSAPGT
CTACTGAACCTTCCGAGGGCAGCGCTCCAGGTACCTCTAC
CGAACCTTCTGAAGGTAGCGCACCAGGTACTTCTACCGAA STEPSEGSAPGTSTE
CCGTCCGAGGGTAGCGCACCAGGTAGCCCAGCAGGTTCTC PSEGSAPGTSTEPSE
CCACCGAGGAAGGTACTTCTACCGAACCGTCCGA GSAPGSPAGSPTSTE
GGGTAGCGCACCAGGTACCTCTGAAAGCGCAACTCCTGAG EGTSTEPSEGSAPGT
TCTGGCCCAGGTAGCGAACCTGCTACCTCCGGCTCTGAGA SESATPESGPGSEPA
CTCCAGGTACCTCTGAAAGCGCAACCCCGGAATCTGGTCC TSGSETPGTSESATP
CGAACCTGCAACCTCTGGCTCTGAAACCCCAGGT
ESGPGSEPATSGSET
ACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTT
PGTSESATPESGPGT
CTACTGAACCGTCCGAGGGCAGCGCACCAGGTACTTCTGA
STEPSEGSAPGTSES
AAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGGCTGGC
ATPESGPGSPAGSPT
TCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGCTCTC
CAACTTCTACTGAAGAAGGTAGCCCGGCAGGCTCTCCGAC STEEGSPAGSPTSTE
CTCTACTGAGGAAGGTACTTCTGAAAGCGCAACCCCGGAG EGSPAGSPTSTEEGT
TCCGGCCCAGGTACCTCTACCGAACCGTCTGAGGGCAGCG SESATPESGPGTSTE
CACCAGGTACCTCTGAAAGCGCAACTCCTGAGTCTGGCCC PSEGSAPGTSESATP
AGGTAGCGAACCTGCTACCTCCGGCTCTGAGACTCCAGGT ESGPGSEPATSGSET
ACCTCTGAAAGCGCAACCCCGGAATCTGGTCCAGGTAGCG PGTSESATPESGPGS
AACCTGCAACCTCTGGCTCTGAAACCCCAGGTACCTCTGA
EPATSGSETPGTSES
AAGCGCTACTCCTGAATCTGGCCCAGGTACTTCTACTGAA
PGTSTEPSE
CCGTCCGAGGGCAGCGCACCAGGTAGCCCTGCTGGCTCTC
GSAPGSPAGSPTSTE
CAACCTCCACCGAAGAAGGTACCTCTGAAAGCGCAACCCC
EGTSESATPESGPGS
TGAATCCGGCCCAGGTAGCGAACCGGCAACCTCCGGTTCT
GAAACCCCAGGTACTTCTGAAAGCGCTACTCCTGAGTCCG EPATSGSETPGTSES
GCCCAGGTAGCCCGGCTGGCTCTCCGACTTCCACCGAGGA ATPESGPGSPAGSPT
AGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGAAGGT STEEGSPAGSPTSTE
ACTTCTACCGAACCTTCCGAGGGCAGCGCACCAGGTACTT EGTSTEPSEGSAPGT
CTGAAAGCGCTACCCCTGAGTCCGGCCCAGGTACTTCTGA SESATPESGPGTSES
AAGCGCTACTCCTGAATCCGGTCCAGGTACTTCTGAAAGC PGTSESATP
CCGGAATCTGGCCCAGGTAGCGAACCGGCTACTT
ESGPGSEPATSGSET
CTGGTTCTGAAACCCCAGGTAGCGAACCGGCTACCTCCGG PGSEPATSGSETPGS
TTCTGAAACTCCAGGTAGCCCAGCAGGCTCTCCGACTTCC PAGSPTSTEEGTSTE
ACTGAGGAAGGTACTTCTACTGAACCTTCCGAAGGCAGCG PSEGSAPGTSTEPSE
CACCAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCC
GSAPGSEPATSGSET
AGGTAGCGAACCTGCAACCTCTGGCTCTGAAACCCCAGGT
ACCTCTGAAAGCGCTACTCCTGAATCTGGCCCAGGTACTT PGTSESATPESGPGT
CTACTGAACCGTCCGAGGGCAGCGCACCAGGT
GLP-2—2G— CATGGTGACGGCTCTTTTAGCGATGAAATGAATACTATAC HGDGSFSDEMNTIL
AE864 TGGACAACCTTGCGGCACGCGACTTCATTAACTGGCTGAT DNLAARDFINWLIQ
CCAGACAAAAATCACCGATGGAGGTAGCCCGGCTGGCTCT TKITDGGSPAGSPTS
CCTACCTCTACTGAGGAAGGTACTTCTGAAAGCGCTACTC
TEEGTSESATPESGP
CTGAGTCTGGTCCAGGTACCTCTACTGAACCGTCCGAAGG
GTSTEPSEGSAPGSP
TAGCGCTCCAGGTAGCCCAGCAGGCTCTCCGACTTCCACT
AGSPTSTEEGTSTEP
GAGGAAGGTACTTCTACTGAACCTTCCGAAGGCAGCGCAC
SEGSAPGTSTEPSEG
CAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGG
TACTTCTGAAAGCGCTACCCCGGAATCTGGCCCAGGTAGC SAPGTSESATPESGP
GAACCGGCTACTTCTGGTTCTGAAACCCCAGGTAGCGAAC GSEPATSGSETPGSE
CGGCTACCTCCGGTTCTGAAACTCCAGGTAGCCCGGCAGG ETPGSPAGS
CTCTCCGACCTCTACTGAGGAAGGTACTTCTGAAAGCGCA PTSTEEGTSESATPE
ACCCCGGAGTCCGGCCCAGGTACCTCTACCGAACCGTCTG SGPGTSTEPSEGSAP
AGGGCAGCGCACCAGGTACTTCTACCGAACCGTCCGAGGG SEGSAPGSP
TAGCGCACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACC
AGSPTSTEEGTSTEP
GAGGAAGGTACTTCTACCGAACCGTCCGAGGGTAGCGCAC
SEGSAPGTSTEPSEG
CAGGTACCTCTACTGAACCTTCTGAGGGCAGCGCTCCAGG
SAPGTSESATPESGP
TACTTCTGAAAGCGCTACCCCGGAGTCCGGTCCAGGTACT
GTSTEPSEGSAPGTS
TCTACTGAACCGTCCGAAGGTAGCGCACCAGGTACTTCTG
AAAGCGCAACCCCTGAATCCGGTCCAGGTAGCGAACCGGC ESATPESGPGSEPAT
TACTTCTGGCTCTGAGACTCCAGGTACTTCTACCGAACCGT SGSETPGTSTEPSEG
CCGAAGGTAGCGCACCAGGTACTTCTACTGAACCGTCTGA SAPGTSTEPSEGSAP
CGCACCAGGTACTTCTGAAAGCGCAACCCCGGAA GTSESATPESGPGTS
TCCGGCCCAGGTACCTCTGAAAGCGCAACCCCGGAGTCCG ESATPESGPGSPAGS
GCCCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGA PTSTEEGTSESATPE
AGGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCAGGT
SGPGSEPATSGSETP
AGCGAACCGGCAACCTCCGGTTCTGAAACCCCAGGTACCT
GTSESATPESGPGTS
GCGCTACTCCGGAGTCTGGCCCAGGTACCTCTAC
SAPGTSTEP
TGAACCGTCTGAGGGTAGCGCTCCAGGTACTTCTACTGAA
SEGSAPGTSTEPSEG
CCGTCCGAAGGTAGCGCACCAGGTACTTCTACCGAACCGT
CCGAAGGCAGCGCTCCAGGTACCTCTACTGAACCTTCCGA SAPGTSTEPSEGSAP
GGGCAGCGCTCCAGGTACCTCTACCGAACCTTCTGAAGGT GTSTEPSEGSAPGTS
AGCGCACCAGGTACTTCTACCGAACCGTCCGAGGGTAGCG TEPSEGSAPGSPAGS
CACCAGGTAGCCCAGCAGGTTCTCCTACCTCCACCGAGGA GTSTEPSEG
AGGTACTTCTACCGAACCGTCCGAGGGTAGCGCACCAGGT SAPGTSESATPESGP
ACCTCTGAAAGCGCAACTCCTGAGTCTGGCCCAGGTAGCG GSEPATSGSETPGTS
AACCTGCTACCTCCGGCTCTGAGACTCCAGGTACCTCTGA
ESATPESGPGSEPAT
AAGCGCAACCCCGGAATCTGGTCCAGGTAGCGAACCTGCA
SGSETPGTSESATPE
ACCTCTGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTA
SGPGTSTEPSEGSAP
CTCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGA
GTSESATPESGPGSP
CGCACCAGGTACTTCTGAAAGCGCTACTCCTGAG
TCCGGCCCAGGTAGCCCGGCTGGCTCTCCGACTTCCACCG AGSPTSTEEGSPAGS
AGGAAGGTAGCCCGGCTGGCTCTCCAACTTCTACTGAAGA PTSTEEGSPAGSPTS
AGGTAGCCCGGCAGGCTCTCCGACCTCTACTGAGGAAGGT TEEGTSESATPESGP
ACTTCTGAAAGCGCAACCCCGGAGTCCGGCCCAGGTACCT SEGSAPGTS
CTACCGAACCGTCTGAGGGCAGCGCACCAGGTACCTCTGA SGPGSEPAT
AAGCGCAACTCCTGAGTCTGGCCCAGGTAGCGAACCTGCT SGSETPGTSESATPE
ACCTCCGGCTCTGAGACTCCAGGTACCTCTGAAAGCGCAA
SGPGSEPATSGSETP
CCCCGGAATCTGGTCCAGGTAGCGAACCTGCAACCTCTGG
GTSESATPESGPGTS
CTCTGAAACCCCAGGTACCTCTGAAAGCGCTACTCCTGAA
TEPSEGSAPGSPAGS
CCAGGTACTTCTACTGAACCGTCCGAGGGCAGCG
PTSTEEGTSESATPE
CACCAGGTAGCCCTGCTGGCTCTCCAACCTCCACCGAAGA
AGGTACCTCTGAAAGCGCAACCCCTGAATCCGGCCCAGGT SGPGSEPATSGSETP
AGCGAACCGGCAACCTCCGGTTCTGAAACCCCAGGTACTT GTSESATPESGPGSP
CTGAAAGCGCTACTCCTGAGTCCGGCCCAGGTAGCCCGGC AGSPTSTEEGSPAGS
TGGCTCTCCGACTTCCACCGAGGAAGGTAGCCCGGCTGGC
PTSTEEGTSTEPSEG
TCTCCAACTTCTACTGAAGAAGGTACTTCTACCGAACCTTC
SAPGTSESATPESGP
CGAGGGCAGCGCACCAGGTACTTCTGAAAGCGCTACCCCT
GTSESATPESGPGTS
GGCCCAGGTACTTCTGAAAGCGCTACTCCTGAAT
ESATPESGPGSEPAT
CAGGTACTTCTGAAAGCGCTACCCCGGAATCTGG
CCCAGGTAGCGAACCGGCTACTTCTGGTTCTGAAACCCCA SGSETPGSEPATSGS
GGTAGCGAACCGGCTACCTCCGGTTCTGAAACTCCAGGTA ETPGSPAGSPTSTEE
GCCCAGCAGGCTCTCCGACTTCCACTGAGGAAGGTACTTC GTSTEPSEGSAPGTS
TACTGAACCTTCCGAAGGCAGCGCACCAGGTACCTCTACT TEPSEGSAPGSEPAT
TCTGAGGGCAGCGCTCCAGGTAGCGAACCTGCAA SGSETPGTSESATPE
CCTCTGGCTCTGAAACCCCAGGTACCTCTGAAAGCGCTAC SGPGTSTEPSEGSAP
TCCTGAATCTGGCCCAGGTACTTCTACTGAACCGTCCGAG
GGCAGCGCACCAGGT
Example 16: Expression and purification of fusion proteins comprising GLP2G fused to
XTEN_AE864.
The host strain for expression, AmE025, was derived from E. coli W3110, a strain with a K-12
background, having a deletion of the fhuA gene and with the lambda DE3 prophage integrated onto the
chromosome. The host cell contained the plasmid pCW1010 (AC616), encoding an amino acid sequence
that is identical to that encoded by pCW812 (AC453). The final construct comprised the gene encoding
the cellulosome anchoring protein cohesion region cellulose binding domain (CBD) from Clostridium
thermocellum (accession #ABN54273), a tobacco etch virus (TEV) protease recognition site (ENLYFQ),
the GLP2-2G sequence, and an AE864 amino acid XTEN sequence under control of a T7 promoter. The
protein was expressed in a 5L glass jacketed fermentation vessel with a B. Braun Biostat B controller.
Briefly, a starter culture of host strain AmE025 was used to inoculate 2L of fermentation batch media.
After 6 hours of culture at 37°C, a 50% glucose feed was initiated. After 20 hours of culture, the
temperature was reduced to 26°C and 1M IPTG was added to induce expression. After a total
fermentation run time of 45 hours, the culture was harvested by centrifugation, yielding cell pellets of ~1 kg
in wet weight. The pellets were stored frozen at -80°C until purification was initiated.
Lysis, heat flocculation and clarification
The resulting cell paste was resuspended at ambient temperature in 20 mM Tris-HCl pH 7.5, 50
mM NaCl, at a ratio of ~4 ml per 1 g of cell paste. The cells were lysed by 2 passes through an APV
2000 homogenizer at an operating pressure of 800-900 bar. After lysis, the homogenate was heated to
~85°C in a heat exchanger and held for 20 minutes to coagulate host cell protein, then rapidly cooled to
~10°C. The cooled homogenate was clarified by centrifugation at 4,000 rpm for 60 min using a Sorvall
H6000A rotor in a Sorvall RC-3C centrifuge. The supernatant was decanted, passed through a 60SP03A
Zeta Plus EXT depth filter (3M), followed by passage through a 0.2 µm LifeASSURE PDA sterile
capsule and stored at 4°C overnight.
Initial Anion Exchange capture with Toyopearl SuperQ-650M resin
GLP2-2G-XTEN was purified out of the clarified lysate using 3 column steps at ambient
temperature. GLP2-2G-XTEN was captured using Toyopearl SuperQ-650M (Tosoh) anion exchange
resin, which selects for the negatively charged XTEN polypeptide tail and removes the bulk of host cell
protein. An appropriately scaled SuperQ-650M column was equilibrated with 5 column volumes of 20
mM Tris-HCl pH 7.5, 50 mM NaCl and the lysate was loaded onto the column at a linear flow rate of
120 cm/hr. The column was then washed with 3 column volumes of 20 mM Tris-HCl pH 7.5, 50 mM
NaCl and 3 column volumes of 20 mM Tris-HCl pH 7.5, 150 mM NaCl, until the UV absorbance
returned to baseline. GLP2-2G-XTEN protein was eluted with a 7 column volume linear gradient from
150 mM NaCl to 300 mM NaCl in 20 mM Tris-HCl, pH 7.5. Fractions were collected throughout
and analyzed by SDS-PAGE for pooling and storage at 2-8°C. Product purity was determined to be ~80%
after the SuperQ capture step.
Intermediate Anion Exchange capture with GE MacroCap Q resin
The resulting SuperQ pool was diluted ~4-fold with 20 mM Tris-HCl pH 7.5 to reduce the
conductivity to < 10 mS/cm. An appropriately scaled MacroCap Q anion exchange column (GE Life
Sciences) was used; it selects for the full-length intact XTEN polypeptide tail and removes the bulk of
endotoxin and any residual host cell protein and DNA. The column was equilibrated with 5 column volumes of 20 mM
Tris-HCl pH 7.5, 50 mM NaCl. The diluted SuperQ pool was loaded at a linear flow rate of 120 cm/hr.
The column was then washed with 3 column volumes of 20 mM Tris-HCl pH 7.5, 50 mM NaCl, and then
3 column volumes of 20 mM Tris-HCl pH 7.5, 150 mM NaCl, until the UV absorbance returned to
baseline. GLP2-2G-XTEN protein was eluted with a 12 column volume linear gradient from 150 mM
NaCl to 300 mM NaCl in 20 mM Tris-HCl pH 7.5. Fractions were collected throughout and analyzed by
SDS-PAGE for pooling and storage at 2-8°C. Product purity was determined to be >95% after the
MacroCap Q intermediate step.
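The two anion exchange steps use the same NaCl range (150 mM to 300 mM) but different gradient lengths, 7 column volumes for SuperQ and 12 for MacroCap Q, so the second gradient is shallower. A minimal sketch of that arithmetic follows; the function name and its signature are illustrative and not taken from any chromatography software.

```python
def gradient_nacl_mM(cv_elapsed: float, start_mM: float, end_mM: float,
                     gradient_cv: float) -> float:
    """NaCl concentration at a given point of a linear gradient.

    cv_elapsed is the number of column volumes delivered since the gradient
    began; it must lie between 0 and gradient_cv inclusive.
    """
    if not 0 <= cv_elapsed <= gradient_cv:
        raise ValueError("cv_elapsed must be within the gradient length")
    return start_mM + (end_mM - start_mM) * cv_elapsed / gradient_cv


# Gradient slopes in mM NaCl per column volume for the two steps described.
superq_slope = (300 - 150) / 7     # capture step, 7 CV gradient
macrocap_slope = (300 - 150) / 12  # intermediate step, 12 CV gradient

# NaCl concentration halfway through the MacroCap Q gradient.
midpoint_mM = gradient_nacl_mM(6, 150, 300, 12)
print(f"SuperQ: {superq_slope:.1f} mM/CV, MacroCap Q: {macrocap_slope:.1f} mM/CV")
```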
Hydrophobic Interaction Chromatography [HIC] using Toyopearl Phenyl-650M resin
An appropriate amount of solid NaCl salt was dissolved in the MacroCap Q pool to adjust the load
to 4 M NaCl, and the pool was then sterile filtered through a 0.2 µm filter. An appropriately scaled Toyopearl
Phenyl-650M (Tosoh) column was used; it selects for the hydrophobic residues of the GLP2 payload and removes
residual XTEN fragments and endotoxin. The column was equilibrated with 5 column volumes of 20 mM
Tris-HCl pH 7.5, 4 M NaCl. The MacroCap Q pool was loaded at a linear flow rate of 60 cm/hr. The
column was then washed with 3 column volumes of 20 mM Tris-HCl pH 7.5, 4 M NaCl. GLP2-2G-
XTEN protein was eluted with a step-down gradient to 1.2 M NaCl in 20 mM Tris-HCl pH 7.5. The
elution peak was fractionated and analyzed by SDS-PAGE to confirm successful capture and elution of
GLP2-2G-XTEN. Product purity was determined to be >95% after the final polishing step. The resulting
pool was concentrated to ~11 mg/ml and buffer exchanged into 20 mM Tris-HCl pH 7.5, 135 mM NaCl
formulation buffer using a 30 kDa MWCO Pellicon XL 50 Ultrafiltration Cassette (Millipore). The
purified lot of GLP2-2G-XTEN was designated AP690 and stored at -80°C until further use.
SDS-PAGE Analysis
SDS-PAGE analysis was conducted with 2 µg, 5 µg and 10 µg of AP690 loaded onto a
NuPAGE 4-12% Bis Tris Gel (Invitrogen) and then run for 35 minutes at a constant 200V. The results
(panel A) showed that the AP690 protein was free from host cell impurities and that it migrated near
the 160 kDa marker, the expected result for a payload-XTEN fusion protein of this molecular weight and
composition.
Endotoxin Content
Endotoxin levels of lot AP690 were assessed using an Endosafe PTS test cartridge (Charles River)
and determined to be 3.5 EU/mg of protein, making the AP690 lot appropriate for injection into test
animals for pharmacokinetic or pharmacodynamic studies.
Analytical size exclusion HPLC
Gel filtration analysis was performed using a Phenomenex BioSep-SEC-s4000 (7.8 mm x
600 mm) column. 20 µg of AP690 GLP2-2G-XTEN fusion protein was analyzed at a flow rate of 0.5
ml/min with 50 mM Phosphate pH 6.5, 300 mM NaCl mobile phase. Elution was monitored using
OD215nm. Column calibration was performed using a size exclusion check standard from Phenomenex,
with the following markers: thyroglobulin (670 kDa), IgG (156 kDa), BSA (66 kDa) and ovalbumin (17
kDa). The result (panel B) indicated an apparent molecular weight of 1002 kDa for the fusion protein of
83.1 kDa actual weight, for an apparent molecular weight factor of 12.5.
Intact mass determination by ESI-MS
200 µg of AP690 GLP2-2G-XTEN protein was desalted by solid phase extraction using an
Extract-Clean C18 column (Discovery Sciences). The desalted protein solution in 0.1% formic acid, 50%
acetonitrile was infused at 4 µl/min into a QSTAR XL mass spectrometer (AB Sciex). Multicharge TOF
spectrum was acquired in 800-1400 amu range. A zero-charge spectrum was obtained by Bayesian
reconstruction in 10-100 kDa range (). The experimental mass of the full length intact GLP2-2G-XTEN
was determined to be 83,142 Da, with an additional minor peak of 83,003 Da detected,
representing the des-His GLP2-2G-XTEN at <5% of total protein.
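The des-His assignment above can be sanity-checked with simple arithmetic. The sketch below compares the difference between the two reported masses with the average mass of a histidine residue (about 137.14 Da, a standard value that is not stated in the source). The 2.5 Da tolerance and the function name are assumptions chosen for illustration, not instrument specifications.

```python
# Average mass of a histidine residue (C6H7N3O) in Da.
# Standard chemistry value; it does not come from the source text.
HIS_RESIDUE_AVG_MASS_DA = 137.14


def mass_shift_matches_residue(full_mass_da, minor_mass_da,
                               residue_mass_da, tolerance_da=2.5):
    """Return (delta, ok): the observed mass loss and whether it lies
    within tolerance_da of the expected residue mass."""
    delta = full_mass_da - minor_mass_da
    return delta, abs(delta - residue_mass_da) <= tolerance_da


# Masses reported for the main and minor peaks of lot AP690.
delta, ok = mass_shift_matches_residue(83142.0, 83003.0,
                                       HIS_RESIDUE_AVG_MASS_DA)
print(f"observed loss = {delta:.0f} Da, consistent with loss of His: {ok}")
```

With the reported masses the observed loss is 139 Da, which is within about 2 Da of a single histidine residue.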
Example 17: Characterization of GLP2-XTEN in vitro receptor binding by calcium flux
potency assay
A receptor binding assay was performed using a GPCRProfiler assay (Millipore) to assess
GLP2-2G-XTEN preparations (including AP690). The assay employed a transfected GLP2R cell line
(Millipore, Cat# HTS164C) consisting of a Chem-11 human cell stably transfected with the GLP2
G-protein coupled receptor and a G alpha protein that stimulates calcium flux upon agonism of the GLP2
receptor. Assays were performed by addition of serial dilutions of GLP2-2G-XTEN, synthetic GLP2-2G
peptide (without XTEN) and synthetic native GLP2 peptide, and the calcium flux was monitored in
real-time by a FLIPR TETRA instrument (Molecular Devices) using the no wash calcium assay kit
(Molecular Devices). The results, presented in , were used to derive EC50 values of 370 nM for
GLP2-2G-XTEN and 7 nM for GLP2-2G peptide. The results indicate that the GLP2-2G-XTEN was
able to bind and activate the GLP-2 receptor, with about 2% of the potency compared to GLP2-2G.
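The "about 2%" figure follows directly from the ratio of the two EC50 values. A minimal sketch of that calculation, assuming relative potency is defined as the reference EC50 divided by the test EC50 (the function name is illustrative):

```python
def relative_potency_percent(ec50_reference_nm, ec50_test_nm):
    """Potency of the test article relative to the reference, in percent.

    A lower EC50 means higher potency, so the ratio is reference / test.
    """
    if ec50_reference_nm <= 0 or ec50_test_nm <= 0:
        raise ValueError("EC50 values must be positive")
    return 100.0 * ec50_reference_nm / ec50_test_nm


# EC50 values from the calcium flux assay: GLP2-2G peptide (reference)
# and GLP2-2G-XTEN (test).
potency = relative_potency_percent(ec50_reference_nm=7.0, ec50_test_nm=370.0)
print(f"{potency:.1f}%")  # 1.9%
```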
Example 18: Pharmacokinetic evaluation of GLP2-XTEN in mice
The fusion protein GLP2-2G-XTEN_AE864 was evaluated for its pharmacokinetic properties
in C57Bl/6 mice following subcutaneous (SC) administration. Female C57Bl/6 mice were injected SC
with 2 mg/kg (25 nmol/kg) of the GLP2-2G-XTEN (lot AP498A) at 0.25 mg/mL (8 mL/kg). Three mice
were sacrificed at each of the following time points: predose, 0.08, 4, 8, 24, 48, 72, 96 and 120 hours
post-dose. Blood samples were collected from the mice at each interval, placed into prechilled
heparinized tubes, and separated by centrifugation to recover the plasma. The samples were analyzed for
fusion protein concentration by both anti-XTEN/anti-XTEN sandwich ELISA (AS1405) and
anti-GLP2/anti-XTEN sandwich ELISA (AS1717), and the results were analyzed using WinNonLin to
obtain the PK parameters. Terminal half-life was fit from 24 to 120 hours. The results are presented in
Table 14 and , with both assays showing essentially equivalent results and a terminal half-life of
31.6-33.9 h determined.
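The terminal half-life fit described above can be illustrated with a log-linear regression over the terminal window. This is a minimal sketch, not the WinNonLin algorithm: it uses an unweighted least-squares fit of ln(concentration) against time, and the concentration values are hypothetical, noise-free numbers generated for the example rather than study data.

```python
import math


def terminal_half_life(times_h, concentrations, fit_start_h, fit_end_h):
    """Estimate terminal half-life (h) from an unweighted log-linear fit
    of the points whose time lies in [fit_start_h, fit_end_h]."""
    points = [(t, math.log(c)) for t, c in zip(times_h, concentrations)
              if fit_start_h <= t <= fit_end_h and c > 0]
    if len(points) < 2:
        raise ValueError("need at least two positive concentrations in the fit window")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx  # slope of ln(C) vs t, equal to -lambda_z
    if slope >= 0:
        raise ValueError("concentrations do not decline over the fit window")
    return math.log(2) / -slope


# Hypothetical mono-exponential profile built with a 33.9 h half-life,
# sampled at the post-dose time points used in the mouse study.
times = [0.08, 4, 8, 24, 48, 72, 96, 120]
lambda_z = math.log(2) / 33.9
conc = [11200 * math.exp(-lambda_z * t) for t in times]
print(round(terminal_half_life(times, conc, 24, 120), 1))  # 33.9
```

Restricting the fit to 24-120 hours mirrors the window stated in the text; with real data the early absorption phase would otherwise bias the slope.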
Table 14: GLP2-2G-XTEN-864 Pharmacokinetics
(m1)
GLP2-XTEN 11,200 720,000 33.9 3.4
ELISA
Example 19: Pharmacokinetic evaluation of GLP2-XTEN in rats
The fusion protein GLP2-2G-XTEN_AE864 was evaluated for its pharmacokinetic properties
in Wistar rats following SC administration of two different dosage levels. Prior to the experiment,
catheters were surgically implanted into the jugular vein of female Wistar rats. The catheterized animals
were randomized into two groups containing three rats each. The fusion protein GLP2-2G-XTEN (lot
AP510) was administered to each rat via SC injection as follows: 1) Low Dose 2 mg/kg (25 nmol/kg); or
2) High Dose 16 mg/kg (200 nmol/kg). Blood samples (~0.2 mL) were collected through the jugular vein
catheter from each rat into prechilled heparinized tubes at pre-dose, 0.08, 4, 8, 24, 48, 72, 96, 120 and
168 hours after test compound administration (10 time points). Blood was processed into plasma by
centrifugation and split into two aliquots for analysis by ELISA. The samples were analyzed for fusion
protein concentration by both anti-XTEN/anti-XTEN sandwich ELISA (AS1602) and
anti-GLP2/anti-XTEN sandwich ELISA (AS1705), and the results were analyzed using WinNonLin to obtain
the PK parameters. Terminal half-life was fit from 48 to 168 hours. The results are presented in Table 15
and , with both assays showing essentially equivalent results and with a terminal half-life of
37.5-49.7 h determined, greatly exceeding the reported terminal half-life for GLP-2 and for GLP2-2G. In
addition, the pharmacokinetic profile of GLP2-2G-XTEN after single subcutaneous administration to rats
at 25 nmol/kg and 200 nmol/kg was dose proportional, with the Cmax and AUC increasing in an
approximately linear manner.
Table 15: GLP2-2G-XTEN-864 Pharmacokinetics
                        T 1/2   Cmax      AUCInf       Vz     Cl
                        (hr)    (ng/mL)   (hr*ng/mL)   (mL)   (mL/hr)
ANTI-XTEN ELISA
High Dose (16 mg/kg)    42.0    37900     0            65.0   1.07
Low Dose (2 mg/kg)      42.6    6270      530000       43.4   0.71
ANTI-GLP2-XTEN ELISA
High Dose (16 mg/kg)    49.7    40300     3660000      70.2   0.972
Low Dose (2 mg/kg)      37.5    6900      530000       43.4   0.797
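Two relationships tie the Table 15 columns together, and both are easy to check. The sketch below, using the anti-GLP2/anti-XTEN ELISA rows, recomputes the half-life from Vz and Cl as ln(2) x Vz/Cl and expresses dose proportionality as the fold-change in exposure divided by the fold-change in dose (1.0 would be exactly proportional). The function names are illustrative, and treating the tabulated values this way is an assumption about how they were derived.

```python
import math


def half_life_from_vz_cl(vz_ml, cl_ml_per_hr):
    """Terminal half-life (h) implied by volume of distribution and clearance."""
    return math.log(2) * vz_ml / cl_ml_per_hr


def dose_normalized_ratio(value_high, value_low, dose_high, dose_low):
    """Fold-change in exposure divided by fold-change in dose.
    A value of 1.0 means exactly dose proportional."""
    return (value_high / value_low) / (dose_high / dose_low)


# Anti-GLP2/anti-XTEN ELISA rows of Table 15 (doses in nmol/kg).
high = {"dose": 200.0, "t_half": 49.7, "cmax": 40300.0,
        "auc": 3660000.0, "vz": 70.2, "cl": 0.972}
low = {"dose": 25.0, "t_half": 37.5, "cmax": 6900.0,
       "auc": 530000.0, "vz": 43.4, "cl": 0.797}

for name, row in (("high", high), ("low", low)):
    implied = half_life_from_vz_cl(row["vz"], row["cl"])
    print(f"{name} dose: implied t1/2 = {implied:.1f} h, reported = {row['t_half']} h")

for key in ("cmax", "auc"):
    ratio = dose_normalized_ratio(high[key], low[key], high["dose"], low["dose"])
    print(f"{key}: dose-normalized ratio = {ratio:.2f}")
```

The implied half-lives agree with the tabulated ones to within about 1%, which is the size of difference expected from rounding of Vz and Cl.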
Example 20: Pharmacokinetic evaluation of GLP2-XTEN in cynomolgus monkeys
The fusion protein GLP2-2G-XTEN_AE864 was evaluated for its pharmacokinetic properties in male
cynomolgus monkeys following either subcutaneous or intravenous administration of the fusion protein
at a single dosage level. Three male cynomolgus monkeys were injected IV and 3 male cynomolgus
monkeys were injected SC with 2 mg/kg (25 nmol/kg) GLP2-2G-XTEN at time 0. Blood samples were
collected from each monkey into prechilled heparinized tubes at pre-dose and at approximately 0.083 h
(5 min), 1, 2, 4, 8, 24, 48, 72, 96, 120, 168, 216, 264, and 336 hours after administration of the fusion
protein for the first phase of the study. Animals were allowed to “wash-out” for a 6 week period (4
weeks post-last collection time point of Phase 1), the groups were crossed over (SC to IV and IV to SC),
and dosed again with the same dose of GLP2-2G-XTEN fusion protein. Blood samples were collected at
pre-dose and at approximately 0.083 h (5 min), 1, 2, 4, 8, 24, 48, 72, 96, 120, 168, 216, 264, 336, 384,
432, and 504 hours post-dose in the second phase of the study. All blood samples were processed into
plasma by centrifugation and split into two aliquots for analysis by ELISA. The samples were analyzed
for fusion protein concentration by anti-GLP2/anti-XTEN ELISA (AS1705) and the results
were analyzed using WinNonLin to obtain the PK parameters. The results are presented in Table 16 and
, with a terminal half-life for the GLP2-2G-XTEN_AE864 fusion protein of 110 h for IV and 120
h for SC administration determined. The bioavailability was 96%, demonstrating that GLP2-2G-XTEN is
rapidly and near completely absorbed after subcutaneous administration.
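Absolute bioavailability in a crossover design of this kind is conventionally the dose-normalized ratio of SC to IV exposure. The sketch below shows that calculation. The AUC values are hypothetical placeholders chosen to give 96% (the Table 16 values are not available in this text), and the source does not state the exact formula that was used.

```python
def absolute_bioavailability_percent(auc_sc, dose_sc, auc_iv, dose_iv):
    """F (%) = (AUC_sc / Dose_sc) / (AUC_iv / Dose_iv) x 100."""
    if min(auc_sc, dose_sc, auc_iv, dose_iv) <= 0:
        raise ValueError("AUC and dose values must be positive")
    return 100.0 * (auc_sc / dose_sc) / (auc_iv / dose_iv)


# Hypothetical AUCInf values (hr*ng/mL); both routes used the same
# 2 mg/kg dose, as in the monkey study.
f_percent = absolute_bioavailability_percent(
    auc_sc=960000.0, dose_sc=2.0, auc_iv=1000000.0, dose_iv=2.0)
print(f"{f_percent:.0f}%")  # 96%
```

Because each animal received both routes after wash-out, the ratio can also be computed per animal and then averaged; that choice is left open here.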
Table 16: GLP2-2G-XTEN-864 Pharmacokinetics
GROUP T1/2 Cmax AUCInf Vd Cl
(hr) (ng/ml) (hr*ng/mL) (mL/kg) (mL/hr)
The cumulative results of the PK analyses were used to perform allometric scaling of GLP2-2G_AE864
terminal half-life, clearance and volume of distribution using data from three species (mouse,
rat and monkey). Pharmacokinetic values for a 70 kg human were predicted by extrapolating the log
linear relationship between body weight and each pharmacokinetic parameter, as shown in . The
data for terminal half-life, volume of distribution and clearance are presented in Table 17. The predicted
terminal half-life in humans of 240 h greatly exceeds the reported 3.2 h terminal half-life of teduglutide
in humans (Marier, J-F, et al. Pharmacokinetics, Safety, and Tolerability of Teduglutide, a
Glucagon-Like Peptide-2 (GLP-2) Analog, Following Multiple Ascending Subcutaneous Administrations in
Healthy Subjects. J Clin Pharmacol (2008) 48: 1289-1299). The terminal half-life in humans can also be
estimated using the predicted values for clearance (Cl) and volume of distribution (Vd) as 0.693 x Vd/Cl.
Using this formula yields a predicted terminal half-life of 230 h in humans, which agrees well with
the extrapolation from the animal T1/2 data, and which greatly exceeds the reported terminal half-life for
native GLP-2 and for GLP2-2G.
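The allometric extrapolation described above amounts to fitting a straight line to log(parameter) versus log(body weight) and evaluating it at 70 kg. The following sketch shows one way to do that, under stated assumptions: the body weights (0.02 kg mouse, 0.25 kg rat, 3 kg monkey) are illustrative guesses that do not appear in the source, the half-lives are single representative values taken from Examples 18-20, and the fit is an unweighted least-squares line. It is not a reconstruction of Table 17.

```python
import math


def fit_power_law(body_weights_kg, values):
    """Least-squares fit of log10(value) = log10(a) + b * log10(BW).
    Returns the coefficient a and exponent b of value = a * BW**b."""
    xs = [math.log10(w) for w in body_weights_kg]
    ys = [math.log10(v) for v in values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = 10 ** (mean_y - b * mean_x)
    return a, b


def predict(a, b, body_weight_kg):
    """Evaluate the fitted power law at a given body weight."""
    return a * body_weight_kg ** b


def half_life_from_vd_cl(vd, cl):
    """Half-life as 0.693 x Vd/Cl; Vd and Cl must use consistent units."""
    return 0.693 * vd / cl


# Assumed body weights (kg) for mouse, rat and monkey: hypothetical values.
weights = [0.02, 0.25, 3.0]
# Representative terminal half-lives (h) from the mouse, rat and monkey studies.
half_lives = [33.9, 42.6, 120.0]

a, b = fit_power_law(weights, half_lives)
print(f"exponent b = {b:.2f}")
print(f"extrapolated 70 kg half-life = {predict(a, b, 70.0):.0f} h")
```

The same `fit_power_law` call can be repeated for clearance and volume of distribution, after which `half_life_from_vd_cl` gives the second, independent half-life estimate mentioned in the text.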
Table 17: Allometric scaling of GLP2-2G-XTEN-864 pharmacokinetics
*predicted value
Example 21: Pharmacodynamic evaluation of GLP2-XTEN in animal models
The in vivo pharmacologic activity of the GLP2-2G-XTEN_AE864 fusion protein was assessed
using preclinical models of intestinotrophic growth in normal rats and efficacy in mouse DSS-colitis and
rat Crohn’s Disease.
In vivo evaluation of GLP2-2G-XTEN-AE864 in normal rats
To determine the intestinotrophic properties of GLP2-XTEN, small intestine growth in rats
was measured as a primary pharmacodynamic endpoint. GLP2-2G-XTEN-AE864 fusion protein,
GLP2-2G peptide, or vehicle was administered via subcutaneous injection into male Sprague-Dawley rats
weighing 200-220 grams (10-12 rats per group). GLP2-2G peptide was dosed using the previously
published regimen of 12.5 nmol/kg (0.05 mg/kg) twice daily for 12 days. GLP2-2G-XTEN was dosed at
25 nmol/kg once daily for 12 days. After sacrifice, a midline incision was made, the small intestines
were removed and stretched to their maximum length, and the length was recorded. The fecal material was
flushed from the lumen and the small intestinal wet weight recorded. The small intestine length and
weight data were analyzed with an ANOVA model with a Tukey/Kramer post-hoc test for pairwise
comparisons, with significance at p = 0.05.
Results: Treatment with GLP2-2G peptide for 12 days (12.5 nmol/kg/dose using the standard
twice daily dosing regimen) resulted in a significant increase in small intestine weight of 24% (FIG.
???A). There were no significant effects on small intestine length. Administration of equal moles of
GLP2-2G-XTEN over the 12 day study (25 nmol/kg/dose, once daily) resulted in a similar significant increase
in small intestine weight of 31%. In contrast to the results seen with GLP2-2G peptide, the small
intestine of GLP2-2G-XTEN treated rats showed a significant increase in length of 9% (10 cm), and was
visibly thicker than the tissues from vehicle-treated control animals ().
Conclusions: The results of the study show that GLP2-2G-XTEN induced small intestine
growth that was as good as or better than that induced by GLP2-2G peptide, using equal nmol/kg dosing.
In vivo evaluation of GLP2-2G-XTEN-AE864 in murine acute DSS-induced colitis model
To determine the efficacy of GLP2-XTEN, the GLP2-2G-XTEN-AE864 fusion protein was
evaluated in a mouse model of intestinal inflammatory colitis. Intestinal colitis was induced in female
C57Bl/6 mice (9-10 weeks of age) by feeding mice with 4.5% dextran sodium sulfate (DSS) dissolved in
drinking water for 10 days, until ~20% body weight loss was observed. A naive, non-treated control group
(group 1) was given normal drinking water for the duration of the experiment. The DSS treated groups
(groups 2-7) were treated SC with vehicle (group 2), GLP2-2G peptide (no XTEN) (group 3) or
GLP2-2G-XTEN (lot AP510) (groups 4-7). The treatment doses and regimens are outlined in Table 18, below;
the GLP2-2G peptide was administered BID on days 1-10, while the fusion protein was administered QD in
the morning with vehicle control administered in the evening on days 1-10. Measured parameters included
body weights (recorded daily) and the following terminal endpoints, determined at day 10 of the
experiment: colon weight and length, small intestine weight and length, and stomach weight. Tissues
were fixed in formalin and then transferred to ethanol for staining and histopathology. The anatomical
data were analyzed with an ANOVA model with a Tukey/Kramer post-hoc test for pairwise comparisons,
with significance at p = 0.05.
Table 18: Treatment groups
GROUP  N   Treatment                Dose                         Route  Regimen
1          Normal water + Vehicle   NA                                  BID (10-12h)
2          DSS + Vehicle                                                BID (10-12h)
3      10  DSS + GLP2-2G peptide    0.05 mg/kg (12.5 nmol/kg)    SC     BID (10-12h)
4          DSS + GLP2-2G-XTEN       6 mg/kg (75 nmol/kg)                Fusion protein AM, Vehicle PM
5      10  DSS + GLP2-2G-XTEN       2 mg/kg (25 nmol/kg)                Fusion protein AM, Vehicle PM
6          DSS + GLP2-2G-XTEN       0.2 mg/kg (2.5 nmol/kg)             Fusion protein AM, Vehicle PM
7          DSS + GLP2-2G-XTEN       0.02 mg/kg (0.25 nmol/kg)           Fusion protein AM, Vehicle PM
Results: Treatment effects on body weight, colon length and weight, small intestine weight and
length, and stomach weight were assessed on the day of sacrifice. Although DSS-treated mice showed the
expected significant decrease in body weight as compared to the control mice (see ), neither the
mice treated with GLP2-2G peptide nor any of the groups of mice treated with any dose of
GLP2-2G-XTEN showed reduced body weight loss over the course of the experiment. With respect
to treatment effects on colon, small intestine and stomach, the parameter with a statistically significant
change was an increase in small intestine weight in the GLP2-2G-XTEN high dose group (6 mg/kg),
compared to the control groups 1 and 2, and in the GLP2-2G-XTEN medium dose group (2 mg/kg),
compared to group 1 (data not shown). The GLP2-2G peptide did not induce significant tissue growth
in the current study. Histopathology examination was performed on group 2
(DSS/vehicle treated) and group 4 (DSS/GLP2-2G-XTEN 6 mg/kg qd treated). Results of the
examination indicated that small intestine samples from the vehicle treated mice showed mild-moderate and
marked mucosal atrophy (see A, B). The mucosa were sparsely lined by stunted villi
(diminished height) and showed decreased mucosal thickness. In contrast, small intestine samples from mice
treated with GLP2-2G-XTEN at 6 mg/kg qd showed normal mucosal architecture with elongated villi
densely populated with columnar epithelial and goblet cells (see C, D). The results support the
conclusion that, under the conditions of the experiment, treatment with the GLP2-2G-XTEN fusion
protein protected the intestines from the inflammatory effects of DSS, with maintenance of normal villi
and mucosal architecture.
Efficacy of GLP2-2G-XTEN vs. GLP2-2G peptide in rat Crohn’s Disease indomethacin
induced inflammation model
To determine the efficacy of GLP2-XTEN using single dose or qd dosing, the GLP2-2G-XTEN-AE864
fusion protein was evaluated in a rat model of Crohn’s Disease of indomethacin-induced
intestinal inflammation in three separate studies.
Study 1: Intestinal inflammation was induced in eighty male Wistar rats (Harlan Sprague
Dawley) using indomethacin administered on Days 0 and 1 of the experiment. The rats were divided into
seven treatment groups for treatment according to Table 19.
Table 19: Treatment groups
Group  Treatment        Dose                        Route  Regimen  Indomethacin
1      Vehicle          10 ml/kg                    SC     BID      No
2      Vehicle          10 ml/kg                    SC     BID      Yes
3      GLP2-2G peptide  0.05 mg/kg (12.5 nmol/kg)   SC     BID      Yes
4      GLP2-2G peptide  0.5 mg/kg (125 nmol/kg)     SC     BID      Yes
5      GLP2-2G-XTEN     2 mg/kg (25 nmol/kg)        SC     QD       Yes
6      GLP2-2G-XTEN     6 mg/kg (75 nmol/kg)        SC     QD       Yes
7      Prednisolone     10 mg/kg                    PO     QD       Yes
All treatments were administered per the schedule starting on Day -3 of the experiment. Body
weights were determined daily. Groups 3 and 5 were dosed equimolar/day. On Day 2 (24 hours post-2nd
indomethacin dose), the animals were prepped for sacrifice and analysis. Thirty minutes prior to sacrifice,
the rats were injected intravenously with 1 ml 1% Evans Blue dye, in order to visualize ulcers and extent
of inflammation by histopathology analysis. The rats were anesthetized (SOP 1810) and blood samples
were removed to determine the concentration of GLP2-2G-XTEN using the anti-XTEN/anti-GLP2
ELISA method. The rats were euthanized, then necropsied and scored by gross examination of the
intestines for the presence of adhesions; i.e., none = 0, mild = 1, moderate = 2, or severe = 3. The small
intestines were removed and the length of each was recorded. In each small intestine, a longitudinal
incision was made and the interior was examined. The degree and length of the ulcerated area was
recorded as a score; i.e., none = 0, few = 1, multiple = 2, or continuous = 3. For TNFα determination,
intestinal samples were thawed and homogenized in a total of 20 ml with DPBS. The supernatants were
equilibrated to room temperature and assayed for TNFα by ELISA (R&D Systems, Cat. RTA00, lot
281687, exp. 07SEP11). The samples for Group 1 were assayed undiluted. The samples for Groups 2-7
were diluted 1:4. For histopathology, the small intestines were gently washed with saline to remove the
fecal material and were blotted to remove excess fluid. Each small intestine was weighed then processed
for histopathology examination to quantitate the degree of inflammation; i.e., 0% = 0, 1-33% = 1,
34-66% = 2, 67-100% = 3.
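The three ordinal scales described above can be written down as lookup tables and a binning function. This is a minimal sketch for illustration: the names are invented here, and the handling of non-integer percentages (for example 33.5%) is an assumption, since the source defines the bins only at whole-number boundaries.

```python
# Gross pathology scales as stated in the text.
ADHESION_SCORES = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}
ULCERATION_SCORES = {"none": 0, "few": 1, "multiple": 2, "continuous": 3}


def inflammation_score(percent_inflamed):
    """Map percent inflammation to the 0-3 histopathology score:
    0% = 0, 1-33% = 1, 34-66% = 2, 67-100% = 3.
    Values falling between whole-number bins are assigned upward
    (an assumption; the source gives integer bins only)."""
    if not 0 <= percent_inflamed <= 100:
        raise ValueError("percent_inflamed must be between 0 and 100")
    if percent_inflamed == 0:
        return 0
    if percent_inflamed <= 33:
        return 1
    if percent_inflamed <= 66:
        return 2
    return 3


# Example: one hypothetical animal with moderate adhesions,
# multiple ulcers and 40% inflammation.
scores = (ADHESION_SCORES["moderate"],
          ULCERATION_SCORES["multiple"],
          inflammation_score(40))
print(scores)  # (2, 2, 2)
```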
Results: The values and scores for the body weight and various small intestine parameters are
presented graphically in . The changes in parameters and scores for Group 2 control animals
versus Group 1 healthy controls indicate that the model is representative of the disease process. Results
of body weights (A) indicate that the GLP2-2G peptide did not produce a significant increase in body weight
compared to disease control (Group 2), while the GLP2-2G-XTEN groups demonstrated a significant
increase. Results from the small intestine length (B) showed a significant increase for both the
GLP2-2G peptide and GLP2-2G-XTEN fusion protein treatments, with the latter resulting in length
equivalent to the non-diseased control (Group 1). Results from the small intestine weight (C)
showed a significant increase for the 0.5 mg/kg GLP2-2G peptide and both GLP2-2G-XTEN fusion
protein groups, compared to diseased control Group 2. Based on gross pathology scoring of the small
intestine, both the GLP2-2G peptide and GLP2-2G-XTEN fusion protein treatments resulted in
significant decreases in ulceration (D), with the 6 mg/kg fusion protein resulting in a score that
was not significantly different from the non-diseased control (Group 1). Based on scoring of adhesions
and transulceration (E), both the GLP2-2G peptide and GLP2-2G-XTEN fusion protein
treatments showed significant decreases compared to diseased control (Group 2), with the 2 and 6 mg/kg
fusion protein resulting in scores that were not significantly different from the non-diseased control
(Group 1). Based on scoring of small intestine inflammation (F), neither the GLP2-2G peptide
nor the GLP2-2G-XTEN fusion protein treatments showed a significant effect on inflammation. Based
on TNFα assays (G), both the GLP2-2G peptide and GLP2-2G-XTEN fusion protein treatments
showed significantly decreased cytokine levels compared to the diseased control Group 2.
Conclusions: The results of the study show that GLP2-2G-XTEN provided efficacy that was as
good or better than GLP2-2G peptide, using equal nmol/kg dosing, in improving indomethacin-induced
small intestine damage.
Study 2: Intestinal inflammation was induced in eighty male Wistar rats (Harlan Sprague
Dawley) using indomethacin administered on Days 0 and 1 of the experiment. The rats were divided into
eight treatment groups for treatment according to Table 20.
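Table 20: Treatment groups
Group  Treatment        Dose                        Route  Regimen          Indomethacin
1      Vehicle          10 ml/kg                    SC     BID              No
2      Vehicle          10 ml/kg                    SC     BID              Yes
3      GLP2-2G peptide  0.05 mg/kg (12.5 nmol/kg)   SC     BID              Yes
4      GLP2-2G-XTEN     2 mg/kg (25 nmol/kg)        SC     Once daily (QD)  Yes
[Rows for groups 5-8 are not legible in the source.]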
All treatments were administered per the schedule starting on Day -3 of the experiment. Body
weights were determined daily. On Day 2 (24 hours post-2nd indomethacin dose), the animals were
prepped for sacrifice and analysis. Thirty minutes prior to sacrifice, the rats were injected intravenously
with 1 ml 1% Evans Blue dye, in order to visualize ulcers and extent of inflammation by histopathology
analysis. The rats were anesthetized and blood samples were removed to determine the concentration of
GLP2-2G-XTEN using the anti-XTEN/anti-GLP2 ELISA method. The rats were euthanized, then
necropsied and scored by gross examination of the intestines for the presence of adhesions; i.e., none = 0,
mild = 1, moderate = 2, or severe = 3. The small intestines were removed and the length of each was
recorded. In each small intestine, a longitudinal incision was made and the interior was examined. The
degree and length of the ulcerated area was recorded as a score; i.e., none = 0, few = 1, multiple = 2, or
continuous = 3. The fecal material was washed away with saline, the tissue was blotted to remove excess
fluid, and each small intestine was weighed then processed for histopathology examination to quantitate
the degree of inflammation; i.e., 0% = 0, 1-33% = 1, 34-66% = 2, 67-100% = 3.
Results: The scores for the various parameters are presented graphically in . In the
vehicle negative control group, the gross pathologic changes due to indomethacin treatment were most
severe in the ileum and jejunum, with a total disease score of 8.5-9 by assessment of this group. Of the
various GLP2-2G peptide and GLP2-2G-XTEN treatment groups, the GLP2-2G peptide delivered bid,
the GLP2-2G-XTEN delivered qd, and the single doses of GLP2-2G-XTEN at 6 or 2 mg/kg resulted in
significantly improved scores compared to the indomethacin-treated vehicle control group. In the
trans-ulceration scores, the same treatment groups as per the total disease score reached statistical
significance (A, with star indicating statistically significant difference compared to vehicle control). In the
adhesions score analysis, the indomethacin-treated vehicle control group approached the maximum score
of 3 (B). Once-daily treatment with the GLP2-2G-XTEN provided nearly complete protection
from adhesions, and the single high-dose 6 mg/kg GLP2-2G-XTEN group reached statistically
significant difference compared to vehicle control (star in figure indicating statistically significant
difference), as did the daily bid dosed GLP2-2G peptide group. In the small intestine length analysis
(with the non-indomethacin treated group normalized to 100%), the once-daily GLP2-2G-XTEN
group and the daily bid dosed GLP2-2G peptide group reached statistically significant
difference compared to the indomethacin-treated vehicle control group. The histopathology assessment
findings were essentially similar to the gross pathology findings. The histopathologic changes in the
vehicle control group due to indomethacin treatment were most severe in the ileum and jejunum. The
vehicle control group showed severe mucosal atrophy, ulceration and infiltration (A). The
protective effects of the daily bid GLP2-2G peptide and once-daily GLP2-2G-XTEN treatments were
most pronounced in the ileum, but were also seen in the jejunum. Group 3 had one rat with essentially
normal tissue (B) while two rats each showed ulceration and infiltration but no atrophy and two
rats had histopathologic changes similar to the vehicle control disease group 2. Group 4 (D)
showed protective effects with two rats with essentially normal tissue, one rat showing no atrophy or
ulceration but with slight infiltration, one rat with no atrophy but slight ulceration and infiltration, and
one rat with histopathologic changes similar to the vehicle control disease group 2. Group 7 showed
protective effects with one rat with essentially normal tissue, two rats with no ulceration or infiltration
but showing muscular atrophy, and two rats with histopathologic changes similar to the vehicle control
disease group 2. Group 8 (C) showed protective effects with one rat with no ulceration or
infiltration, one rat with reduced ulceration and infiltration, and three rats with histopathologic changes
similar to the vehicle control disease group 2. The ELISA results indicate that the GLP2-2G-XTEN
fusion protein was detectable at Day 2 in all animals of Group 4 and Group 8, and three rats in Group 7.
The results support the conclusion that, under the conditions of the experiment, treatment with
the GLP2-2G-XTEN fusion protein provided significant protection to the intestines from the
inflammatory effects of indomethacin, with daily dosing at 2 mg/kg showing the greatest efficacy and
single doses of 6 mg/kg or 2 mg/kg showing significant efficacy in some parameters.
Study 3: A third indomethacin-induced inflammation study was performed to verify previous
results and test additional dose regimens. Intestinal inflammation was induced in male Wistar rats
(Harlan Sprague Dawley) using indomethacin administered on Days 0 and 1 of the experiment according
to Table 21.
Table 21: Treatment groups
Group  Treatment        Dose                        Route  Regimen            Total dose
1                       10 ml/kg                           QD                 ND
2      GLP2-2G peptide  0.05 mg/kg (12.5 nmol/kg)   SC     BID                125 nmol/kg
3      GLP2-2G-XTEN     2 mg/kg (25 nmol/kg)        SC     Once daily (QD)    125 nmol/kg
4      GLP2-2G-XTEN     2 mg/kg (25 nmol/kg)        SC     Days -3, -1, 1     75 nmol/kg
       GLP2-2G-XTEN     6 mg/kg (75 nmol/kg)        SC     Once, day -3 only  75 nmol/kg
All treatments were administered per the schedule starting on Day -3 of the experiment. Body
weights were determined daily. On Day 2 (24 hours post-2nd indomethacin dose), the animals were
prepped for sacrifice and analysis. The small intestines were removed and the length of each was
recorded. Quantitative histopathology was performed on a subset of samples. Rat small intestine samples
consisted of a 3 cm section of proximal jejunum and a 3 cm section of mid-jejunum collected 15 cm and
cm from the pylorus, respectively. Samples were fixed in 10% neutral buffered formalin. Samples
were trimmed into multiple sections without bias toward lesion presence or absence. These sections were
placed in cassettes, embedded in paraffin, microtomed at approximately 4 microns thickness, and stained
with hematoxylin and eosin (H&E). The slides were evaluated microscopically by a board certified
veterinary pathologist and scored for villous height as well as infiltration/inflammation, mucosal atrophy,
villi/crypt appearance, and abscesses/ulceration. A 1 to 4 severity grading scale was used, where 1 =
minimal, 2 = mild, 3 = moderate, 4 = marked/severe, reflecting the combination of the cellular reactions
seen histopathologically. Small intestine length was analyzed with an ANOVA model with a
Tukey/Kramer post-hoc test for pairwise comparisons, with significance at p = 0.05. Non-parametric
histology score variables were compared with the vehicle control using a Mann Whitney U test with a
Bonferroni correction for the p-value to create an overall alpha of 0.05.
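The non-parametric comparison described above combines two pieces: a Mann Whitney U statistic computed from pooled ranks, and a Bonferroni-adjusted per-comparison threshold. The sketch below shows both in plain Python. The histology scores are hypothetical, the choice of three comparisons is an assumption for the example, and p-value computation is deliberately left out because the source does not say whether exact or approximate p-values were used.

```python
def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) computed from the pooled rank sums."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


def bonferroni_alpha(overall_alpha, n_comparisons):
    """Per-comparison threshold that keeps the overall alpha at overall_alpha."""
    return overall_alpha / n_comparisons


# Hypothetical mucosal atrophy scores on the 1-4 scale.
vehicle = [3, 4, 4, 3, 4]
treated = [1, 2, 2, 1, 3]
print(mann_whitney_u(vehicle, treated))  # (24.0, 1.0)
print(bonferroni_alpha(0.05, 3))
```

A U value close to 0 for one group (here 1.0 out of a possible 25) indicates nearly complete separation of the two score distributions.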
Results: As seen in the initial studies, there was an increase in small intestine length in the
GLP2-2G-XTEN-treated diseased rats as compared to vehicle-treated diseased rats (A). This
increase correlated with a significant increase in villi height (B). Both high (total dose of 125
nmol/kg) and low (total dose of 75 nmol/kg) dose GLP2-2G-XTEN-treated groups showed a significant
increase in villi height; the increase in villi height seen in peptide treated rats was not significant. There
was also a significant decrease in mucosal atrophy, as both high and low dose GLP2-2G-XTEN-treated
rats showed a significantly lower mucosal atrophy score than vehicle-treated diseased rats (C).
Although there was a trend showing a reduction in mucosal ulceration and mixed cell infiltrate following
GLP2-2G-XTEN and GLP2-2G peptide treatment, these results were not significant for any of the three
treatment groups.
Conclusions: Histopathological results support the conclusion that GLP2-2G-XTEN provided
efficacy that was as good as or better than GLP2-2G peptide in improving indomethacin-induced small
intestine damage. Furthermore, GLP2-2G-XTEN dosed once at 75 nmol/kg or three times at 25 nmol/kg
is as effective as GLP2-2G peptide dosed ten times at 12.5 nmol/kg.
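The comparison in the conclusion rests on cumulative molar dose. A short sketch of that arithmetic, using the per-dose amounts and dose counts stated in the text (the dictionary labels are illustrative):

```python
def total_dose_nmol_per_kg(dose_nmol_per_kg, n_doses):
    """Cumulative dose over a regimen of n_doses equal administrations."""
    return dose_nmol_per_kg * n_doses


# (dose per administration in nmol/kg, number of administrations)
regimens = {
    "GLP2-2G peptide, 12.5 nmol/kg x 10": (12.5, 10),
    "GLP2-2G-XTEN, 25 nmol/kg x 3": (25.0, 3),
    "GLP2-2G-XTEN, 75 nmol/kg x 1": (75.0, 1),
}

for label, (dose, n_doses) in regimens.items():
    print(f"{label}: {total_dose_nmol_per_kg(dose, n_doses):g} nmol/kg total")
```

Both GLP2-2G-XTEN regimens deliver 75 nmol/kg in total, compared with 125 nmol/kg for the peptide, in fewer administrations.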
Example 22: Human Clinical Trial Designs for Evaluating GLP2-XTEN comprising GLP-
As demonstrated in Examples 18-20, fusion of XTEN to the C-terminus of GLP2-2-glycine
results in improved half-life compared to that known for the native form of the GLP-2 or the GLP2-2G
peptide, which, it is believed, would enable a reduced dosing frequency yet still result in clinical efficacy
when using such GLP2-XTEN-containing fusion protein compositions. Clinical trials in humans
comparing a GLP2-XTEN fusion protein to GLP-2 (or GLP2-2G peptide) formulations are performed to
establish the efficacy and advantages, compared to current or experimental modalities, of the
GLP2-XTEN binding fusion protein compositions. Such studies comprise three phases. First, a Phase I safety
and pharmacokinetics study in adult patients is conducted to determine the maximum tolerated dose and
pharmacokinetics and pharmacodynamics in humans (e.g., normal healthy volunteer subjects), as well as
to define potential toxicities and adverse events to be tracked in future studies. A Phase I study is
conducted in which single rising doses of a GLP2-XTEN composition, such as are disclosed herein, are
administered by the desired route (e.g., by subcutaneous, intramuscular, or intravenous routes) and
biochemical, PK, and clinical parameters are measured at defined intervals, as well as adverse events. A
Phase Ib study with multiple doses would follow, also measuring the biochemical, PK, and clinical
parameters at defined intervals. This would permit the determination of the minimum effective dose and
the maximum tolerated dose and establish the threshold and maximum concentrations in dosage and
circulating drug that constitute the therapeutic window for the active component. From this information,
the dose and dose schedule that permits less frequent administration of the GLP2-XTEN compositions
(compared to GLP-2 not linked to XTEN), yet retains the pharmacologic response, is obtained.
Thereafter, Phase II and III clinical trials are conducted in patients with the GLP-2 associated condition,
evaluating the effectiveness and safety of the GLP2-XTEN compositions under the dose conditions.
Clinical trials could be conducted in patients suffering from any disease in which native GLP-2 or the
standard of care for the given condition may be expected to provide clinical benefit. For example, such
indications include gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome, short
bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue,
hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced
enteritis, irritable bowel syndrome, small intestine damage, mucosal damage of the small intestine, small
intestinal damage due to cancer-chemotherapy, intestinal injury, diarrheal diseases, intestinal
insufficiency, acid-induced intestinal injury, arginine deficiency, idiopathic hypospermia, obesity,
catabolic illness, febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune diseases, food allergies,
hypoglycemia, gastrointestinal barrier disorders, sepsis, bacterial peritonitis, burn-induced intestinal
damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-associated bacteremia,
bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing
pancreatitis, neonatal feeding intolerance, NSAID-induced gastrointestinal damage, nutritional
insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal nutritional insufficiency,
radiation-induced enteritis, radiation-induced injury to the intestines, mucositis, pouchitis, ischemia, and
stroke. Trials monitor patients before, during and after treatment for changes in physiologic and clinical
parameters associated with the respective indications; e.g., weight gain, inflammation, cytokine levels,
pain, bowel function, appetite, febrile episodes, wound healing, glucose levels, and enhancing or accelerating
hunger satiety; these parameters are tracked relative to the placebo or positive control groups. Efficacy
outcomes are determined using standard statistical methods. Toxicity and adverse event markers are also
followed in the study to verify that the compound is safe when used in the manner described.
Example 23: GLP2-XTEN with cleavage sequences
C-terminal XTEN releasable by FXIa
A GLP2-XTEN fusion protein consisting of an XTEN protein fused to the C-terminus of
GLP-2 can be created with an XTEN release site cleavage sequence placed in between the GLP-2 and
XTEN components, as depicted in . Exemplary sequences are provided in Table 34. In this case,
the release site cleavage sequence incorporated into the GLP2-XTEN contains an amino acid
sequence that is recognized and cleaved by the FXIa protease (EC 3.4.21.27, Uniprot P03951).
Specifically, the amino acid sequence KLTRAET is cut after the arginine of the sequence by FXIa
protease. FXI is the pro-coagulant protease located immediately before FVIII in the intrinsic or contact
activated coagulation pathway. Active FXIa is produced from FXI by proteolytic cleavage of the
zymogen by FXIIa. Production of FXIa is tightly controlled and only occurs when coagulation is
necessary for proper hemostasis. Therefore, by incorporation of the KLTRAET cleavage sequence, the
XTEN domain is removed from GLP-2 concurrent with activation of the intrinsic coagulation pathway in
proximity to the GLP2-XTEN.
C-terminal XTEN releasable by Elastase-2
A GLP2-XTEN fusion protein consisting of an XTEN protein fused to the C-terminus of
GLP-2 can be created with an XTEN release site cleavage sequence placed in between the GLP-2 and
XTEN components, as depicted in . Exemplary sequences are provided in Table 34. In this case,
the release site contains an amino acid sequence that is recognized and cleaved by the elastase-2 protease
(EC 3.4.21.37, Uniprot P08246). Specifically, the sequence LGPVSGVP [Rawlings N.D., et al. (2008)
Nucleic Acids Res, 36: D320] is cut after position 4 in the sequence. Elastase is constitutively expressed
by neutrophils and is present at all times in the circulation, but particularly during acute inflammation.
Therefore as the long lived GLP2-XTEN circulates, a fraction of it is cleaved, particularly locally during
inflammatory responses (e.g., inflammation of the bowel), creating a pool of shorter-lived GLP-2 at the
site of inflammation, e.g., in Crohn’s disease, where the GLP-2 is most needed.
C-terminal XTEN releasable by MMP-12
A GLP2-XTEN fusion protein consisting of an XTEN protein fused to the C-terminus of
GLP-2 can be created with an XTEN release site cleavage sequence placed in between the GLP-2 and
XTEN components, as depicted in the drawings. Exemplary sequences are provided in Table 34. In this case,
the release site contains an amino acid sequence that is recognized and cleaved by the MMP-12 protease
(EC 3.4.24.65, Uniprot P39900). Specifically, the sequence GPAGLGGA [Rawlings N.D., et al. (2008)
Nucleic Acids Res, 36: D320] is cut after position 4 of the sequence. MMP-12 is constitutively
expressed in whole blood. Therefore, as the GLP2-XTEN circulates, a fraction of it is cleaved, creating a
pool of shorter-lived GLP-2 to be used. In a desirable feature of the inventive composition, this creates a
circulating pro-drug depot that constantly releases a prophylactic amount of GLP-2, with higher amounts
released during an inflammatory response, e.g., in Crohn’s Disease, where the GLP-2 is most needed.
C-terminal XTEN releasable by MMP-13
A GLP2-XTEN fusion protein consisting of an XTEN protein fused to the C-terminus of
GLP-2 can be created with an XTEN release site cleavage sequence placed in between the GLP-2 and
XTEN components, as depicted in the drawings. Exemplary sequences are provided in Table 34. In this case,
the release site contains an amino acid sequence that is recognized and cleaved by the MMP-13 protease
(EC 3.4.24.-, Uniprot P45452). Specifically, the sequence GPAGLRGA [Rawlings N.D., et al. (2008)
Nucleic Acids Res, 36: D320] is cut after position 4. MMP-13 is constitutively expressed in whole
blood. Therefore, as the long-lived GLP2-XTEN circulates, a fraction of it is cleaved, creating a pool of
shorter-lived GLP-2 to be used. In a desirable feature of the inventive composition, this creates a
circulating pro-drug depot that constantly releases a prophylactic amount of GLP-2, with higher amounts
released during an inflammatory response, e.g., in Crohn’s Disease, where the GLP-2 is most needed.
C-terminal XTEN releasable by MMP-17
A GLP2-XTEN fusion protein consisting of an XTEN protein fused to the C-terminus of
GLP-2 can be created with an XTEN release site cleavage sequence placed in between the GLP-2 and
XTEN components, as depicted in the drawings. Exemplary sequences are provided in Table 34. In this case,
the release site contains an amino acid sequence that is recognized and cleaved by the MMP-17 protease
(EC 3.4.24.-, Uniprot Q9ULZ9). Specifically, the sequence LR [Rawlings N.D., et al. (2008)
Nucleic Acids Res, 36: D320] is cut after position 4 in the sequence. MMP-17 is constitutively
expressed in whole blood. Therefore, as the GLP2-XTEN circulates, a fraction of it is cleaved, creating a
pool of shorter-lived GLP-2 to be used. In a desirable feature of the inventive composition, this creates a
circulating pro-drug depot that constantly releases a prophylactic amount of GLP-2, with higher amounts
released during an inflammatory response, e.g., in Crohn’s Disease, where the GLP-2 is most needed.
C-terminal XTEN releasable by MMP-20
A GLP2-XTEN fusion protein consisting of an XTEN protein fused to the C-terminus of
GLP-2 can be created with an XTEN release site cleavage sequence placed in between the GLP-2 and
XTEN components, as depicted in the drawings. Exemplary sequences are provided in Table 34. In this case,
the release site contains an amino acid sequence that is recognized and cleaved by the MMP-20 protease
(EC 3.4.24.-, Uniprot O60882). Specifically, the sequence AQ [Rawlings N.D., et al. (2008)
Nucleic Acids Res, 36: D320] is cut after position 4 (depicted by the arrow). MMP-20 is constitutively
expressed in whole blood. Therefore, as the GLP2-XTEN circulates, a fraction of it is cleaved, creating a
pool of shorter-lived GLP-2 to be used. In a desirable feature of the inventive composition, this creates a
circulating pro-drug depot that constantly releases a prophylactic amount of GLP-2, with higher amounts
released during an inflammatory response, e.g., in Crohn’s Disease, where the GLP-2 is most needed.
Optimization of the release rate of C-terminal XTEN
Variants of the foregoing constructs of the Examples can be created in which the release rate of
C-terminal XTEN is altered. As the rate of XTEN release by an XTEN release protease is dependent on
the sequence of the XTEN release site, by varying the amino acid sequence in the XTEN release site one
can control the rate of XTEN release. The sequence specificity of many proteases is well known in the
art, and is documented in several databases. In this case, the amino acid specificity of proteases is
mapped using combinatorial libraries of substrates [Harris, JL, et al. (2000) Proc Natl Acad Sci U S A,
97: 7754] or by following the cleavage of substrate mixtures as illustrated in [Schellenberger, V, et al.
(1993) Biochemistry, 32: 4344]. An alternative is the identification of optimal protease cleavage
sequences by phage display [Matthews, D., et al. (1993) Science, 260: 1113]. Constructs are made with
variant sequences and assayed for XTEN release using standard assays for detection of the XTEN.
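The release-site logic described in this Example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the recognition sequences and cut positions are those given above for FXIa, elastase-2, MMP-12 and MMP-13, while the payload and XTEN strings, the `RELEASE_SITES` table and the `cleave` helper are hypothetical names introduced here and are not sequences from Table 34.

```python
# Recognition sequences and cut positions as stated in the text of this Example.
# The cut position is the number of release-site residues that remain on the
# N-terminal (payload) side after cleavage.
RELEASE_SITES = {
    "FXIa": ("KLTRAET", 4),        # cut after the arginine (4th residue)
    "Elastase-2": ("LGPVSGVP", 4),
    "MMP-12": ("GPAGLGGA", 4),
    "MMP-13": ("GPAGLRGA", 4),
}


def cleave(fusion: str, protease: str) -> tuple[str, str]:
    """Split a fusion sequence at the first matching release site."""
    site, cut = RELEASE_SITES[protease]
    start = fusion.find(site)
    if start < 0:
        raise ValueError(f"no {protease} release site found in sequence")
    position = start + cut
    return fusion[:position], fusion[position:]


# Hypothetical placeholder strings, chosen only to make the example runnable.
payload = "ACDEFGHIMN"      # arbitrary stand-in for the payload C-terminal region
xten = "GSPAGSPTSTEE"       # one 12-residue motif used as a stand-in for XTEN
fusion = payload + RELEASE_SITES["FXIa"][0] + xten

released, remainder = cleave(fusion, "FXIa")
print(released)
print(remainder)
```

In this sketch the released fragment keeps the first four release-site residues and the remainder carries the rest of the site followed by the XTEN; a real design would also have to account for protease specificity beyond an exact string match.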
Example 24: Biodistribution of large XTEN molecules
To verify that constructs with long XTEN fusions can penetrate into tissue, the biodistribution of
three fluorescently tagged constructs was tested in mice, aHer2-XTEN-864-Alexa 680,
aHer2-XTEN-576-Alexa 680, and aHer2-XTEN-288-Alexa 680, using fluorescence imaging. The aHer2 payload is a
scFv fragment specific for binding the Her2 antigen, which is not found on normal tissues (and hence
should not affect biodistribution in normal animals). This study also included fluorescently tagged
Herceptin-Alexa 680 as a control antibody. The mice were given a single intravenous injection of each
agent. After 72 hours, all groups were euthanized and liver, lung, heart, spleen and kidneys were imaged
ex vivo using fluorescence imaging. The data are shown in Table 22.
Conclusions: All constructs showed significant penetration into all tissues assayed. The lower
overall fluorescence signals of the XTEN_576 and XTEN_288 groups are attributed to the increased
clearance of the shorter XTEN constructs over the 72 hour distribution period. Similar proportions for
lung fluorescence relative to total signal were observed for all groups, including the antibody control,
supporting the conclusion that XTEN fusion protein constructs are bioavailable in tissue under these conditions.
Table 22: Fluorescence Signals by Organ
Test Material
scFv-XTEN-864-Alexa 680      6.7 28 130 16 120
scFv-XTEN-576-Alexa 680
scFv-XTEN-288-Alexa 680
mAb-Alexa 680                3.3 32 150 25 370 110
Example 25: Analytical size exclusion chromatography of XTEN fusion proteins with
diverse payloads
Size exclusion chromatography analyses were performed on fusion proteins containing various
therapeutic proteins and unstructured recombinant proteins of increasing length. An exemplary assay
used a G4000 SWXL (7.8 mm x 30 cm) column in which 40 ug of purified glucagon fusion
protein at a concentration of 1 mg/ml was separated at a flow rate of 0.6 ml/min in 20 mM phosphate pH
6.8, 114 mM NaCl. Chromatogram profiles were monitored using OD214nm and OD280nm. Column
calibration for all assays was performed using a size exclusion calibration standard; the
markers include thyroglobulin (670 kDa), bovine gamma-globulin (158 kDa), chicken ovalbumin (44
kDa), equine myoglobin (17 kDa) and vitamin B12 (1.35 kDa). Representative chromatographic
profiles of Glucagon-Y288, Glucagon-Y144, Glucagon-Y72, Glucagon-Y36 are shown as an overlay in
the drawings. The data show that the apparent molecular weight of each compound is proportional to the
length of the attached XTEN sequence. However, the data also show that the apparent molecular weight
of each construct is significantly larger than that expected for a globular protein (as shown by comparison
to the standard proteins run in the same assay). Based on the SEC analyses for all constructs evaluated,
the apparent molecular weights, the apparent molecular weight factor (expressed as the ratio of apparent
molecular weight to the calculated molecular weight) and the hydrodynamic radius (RH in nm) are shown
in Table 23. The results indicate that incorporation of different XTENs of 576 amino acids or greater
confers an apparent molecular weight for the fusion protein of approximately 339 kDa to 760 kDa, and that
XTEN of 864 amino acids or greater confers an apparent molecular weight greater than at least
approximately 800 kDa. The results of proportional increases in apparent molecular weight to actual
molecular weight were consistent for fusion proteins created with XTEN from several different motif
families; i.e., AD, AE, AF, AG, and AM, with increases of at least four-fold and ratios as high as about
17-fold. Additionally, the incorporation of XTEN fusion partners with 576 amino acids or more into
fusion proteins with the various payloads (and 288 residues in the case of glucagon fused to Y288)
resulted in a hydrodynamic radius of 7 nm or greater, well beyond the glomerular pore size of
approximately 3-5 nm. Accordingly, it is expected that fusion proteins comprising growth and XTEN
have reduced renal clearance, contributing to increased terminal half-life and improving the therapeutic
or biologic effect relative to a corresponding un-fused biologic payload protein.
Table 23: SEC analysis of various polypeptides
Construct  XTEN or fusion  Therapeutic  Actual MW  Apparent MW  Apparent Molecular  RH
Name       partner         Protein      (kDa)      (kDa)        Weight Factor       (nm)
AC14       Y288            Glucagon     28.7       370          12.9                7.0
AC33       Y36             Glucagon     6.8        29.4         4.3                 2.6
AC89       AF120           Glucagon     14.1       76.4         5.4                 4.3
AC88       AF108           Glucagon     13.1       61.2         4.7                 3.9
AC73       AF144           Glucagon     16.3       95.2         5.8                 4.7
AC53       AG576           GFP          74.9       339          4.5                 7.0
AC39       AD576           GFP          76.4       546          7.1                 7.7
AC41       AE576           GFP          80.4       760          9.5                 8.3
AC52       AF576           GFP          78.3       526          6.7                 7.6
AC398      AE288           FVII         76.3       650          8.5                 8.2
AC404      AE864           FVII         129        1900         14.7                10.1
AC85       AE864           Exendin-4    83.6       938          11.2                8.9
AC114      AM875           Exendin-4    82.4       1344         16.3                9.4
AC143      AM875           hGH          100.6      846          8.4                 8.7
AC227      AM875           IL-1ra       95.4       1103         11.6                9.2
AC228      AM1318          IL-1ra       134.8      2286         17.0                10.5
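The apparent molecular weight factor defined in this Example is the ratio of apparent to calculated molecular weight. The following minimal Python sketch recomputes that ratio for four rows of Table 23; the function and variable names are illustrative only.

```python
# (construct, actual MW in kDa, apparent MW in kDa), values taken from Table 23.
ROWS = [
    ("AC14", 28.7, 370.0),
    ("AC53", 74.9, 339.0),
    ("AC404", 129.0, 1900.0),
    ("AC228", 134.8, 2286.0),
]


def apparent_mw_factor(actual_kda: float, apparent_kda: float) -> float:
    """Ratio of apparent (SEC) molecular weight to calculated molecular weight."""
    return apparent_kda / actual_kda


factors = {
    name: round(apparent_mw_factor(actual, apparent), 1)
    for name, actual, apparent in ROWS
}

for name, factor in factors.items():
    print(f"{name}: {factor}")
```

The recomputed ratios agree with the factor column of the table to one decimal place for these rows.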
Example 26: Pharmacokinetics of extended polypeptides fused to GFP in cynomolgus
monkeys
The pharmacokinetics of GFP-L288, GFP-L576, GFP-XTEN_AF576, GFP-XTEN_Y576 and
XTEN_AD836-GFP were tested in cynomolgus monkeys to determine the effect of composition and
length of the unstructured polypeptides on PK parameters. Blood samples were analyzed at various times
after injection and the concentration of GFP in plasma was measured by ELISA using a polyclonal
antibody against GFP for capture and a biotinylated preparation of the same polyclonal antibody for
detection. Results are summarized in the drawings. They show a surprising increase of half-life with
increasing length of the XTEN sequence. For example, a half-life of 10 h was determined for
GFP-XTEN_L288 (with 288 amino acid residues in the XTEN). Doubling the length of the unstructured
polypeptide fusion partner to 576 amino acids increased the half-life to 20-22 h for multiple fusion
protein constructs; i.e., GFP-XTEN_L576, GFP-XTEN_AF576, GFP-XTEN_Y576. A further increase
of the unstructured polypeptide fusion partner length to 836 residues resulted in a half-life of 72-75 h for
XTEN_AD836-GFP. Thus, increasing the polymer length by 288 residues from 288 to 576 residues
increased in vivo half-life by about 10 h. However, increasing the polypeptide length by 260 residues
from 576 residues to 836 residues increased half-life by more than 50 h. These results show that there is
a surprising threshold of unstructured polypeptide length that results in a greater than proportional gain in
in vivo half-life. Thus, fusion proteins comprising extended, unstructured polypeptides are expected to
have the property of enhanced pharmacokinetics compared to polypeptides of shorter lengths.
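The "greater than proportional" gain can be made explicit by expressing the reported half-lives as hours gained per added residue. The sketch below uses only the lengths and half-life ranges quoted in this Example; the helper name and the choice to carry the ranges through as worst-case bounds are assumptions made for illustration.

```python
# Reported half-lives as (XTEN length in residues, low estimate h, high estimate h).
POINTS = [
    (288, 10.0, 10.0),
    (576, 20.0, 22.0),
    (836, 72.0, 75.0),
]


def gain_per_residue(shorter, longer):
    """Return (min, max) half-life gain in hours per added residue."""
    added = longer[0] - shorter[0]
    low = (longer[1] - shorter[2]) / added    # smallest gain consistent with the ranges
    high = (longer[2] - shorter[1]) / added   # largest gain consistent with the ranges
    return low, high


first_step = gain_per_residue(POINTS[0], POINTS[1])    # 288 -> 576 residues
second_step = gain_per_residue(POINTS[1], POINTS[2])   # 576 -> 836 residues

print(f"288->576: {first_step[0]:.3f} to {first_step[1]:.3f} h per residue")
print(f"576->836: {second_step[0]:.3f} to {second_step[1]:.3f} h per residue")
```

Even taking the least favorable ends of the reported ranges, the second step yields more than four times the per-residue gain of the first step, which is the threshold effect the text describes.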
Example 27: Serum stability of XTEN
A fusion protein containing XTEN_AE864 fused to the N-terminus of GFP was incubated in
monkey plasma and rat kidney lysate for up to 7 days at 37°C. Samples were withdrawn at time 0, Day 1
and Day 7 and analyzed by SDS PAGE followed by Western analysis with
antibodies against GFP, as shown in the drawings. The sequence of XTEN_AE864 showed negligible signs of
degradation over 7 days in plasma. However, XTEN_AE864 was rapidly degraded in rat kidney lysate
over 3 days. The in vivo stability of the fusion protein was tested in plasma samples wherein the
GFP_AE864 was immunoprecipitated and analyzed by SDS PAGE as described above. Samples that
were withdrawn up to 7 days after injection showed very few signs of degradation. The results
demonstrate the resistance of GLP2-XTEN to degradation due to serum proteases; a factor in the
enhancement of pharmacokinetic properties of the GLP2-XTEN fusion proteins.
Example 28: Increasing solubility and stability of a peptide payload by linking to XTEN
In order to evaluate the ability of XTEN to enhance the physicochemical properties of
solubility and stability, fusion proteins of glucagon plus shorter-length XTEN were prepared and
evaluated. The test articles were prepared in Tris-buffered saline at neutral pH and characterization of
the Gcg-XTEN solution was by reverse-phase HPLC and size exclusion chromatography to affirm that
the protein was homogeneous and non-aggregated in solution. The data are presented in Table 24. For
comparative purposes, the solubility limit of unmodified glucagon in the same buffer was measured at 60
µM (0.2 mg/mL), and the results demonstrate that for all lengths of XTEN added, a substantial increase in
solubility was attained. Importantly, in most cases the glucagon-XTEN fusion proteins were prepared to
achieve target concentrations and were not evaluated to determine the maximum solubility limits for the
given construct. However, in the case of glucagon linked to the AF-144 XTEN, the limit of solubility
was determined, with the result that a 60-fold increase in solubility was achieved, compared to glucagon
not linked to XTEN. In addition, the glucagon-AF144 GLP2-XTEN was evaluated for stability, and was
found to be stable in liquid formulation for at least 6 months under refrigerated conditions and for
approximately one month at 37°C (data not shown).
The data support the conclusion that the linking of short-length XTEN polypeptides to a
biologically active protein such as glucagon can markedly enhance the solubility of the
resulting fusion protein, as well as confer stability at the higher protein concentrations.
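As a quick consistency check on the stated solubility limit of unmodified glucagon (60 micromolar, given as 0.2 mg/mL), the molar-to-mass conversion can be sketched as follows. The molecular weight used for glucagon (about 3483 Da) is an assumption taken from general knowledge rather than from this document, and reading the 60-fold increase as a molar ratio is likewise an assumption.

```python
GLUCAGON_MW_DA = 3483.0  # assumption: approximate molecular weight, not from the text


def micromolar_to_mg_per_ml(conc_um: float, mw_da: float) -> float:
    """Convert a micromolar concentration to mg/mL."""
    # umol/L * g/mol = ug/L, and 1e6 ug/L = 1 g/L = 1 mg/mL
    return conc_um * mw_da / 1e6


baseline_um = 60.0
baseline_mg_ml = micromolar_to_mg_per_ml(baseline_um, GLUCAGON_MW_DA)

# A 60-fold increase, if interpreted on a molar basis (assumption).
implied_fusion_limit_um = 60 * baseline_um

print(f"{baseline_mg_ml:.2f} mg/mL")
print(f"{implied_fusion_limit_um / 1000:.1f} mM")
```

Under these assumptions the conversion reproduces the 0.2 mg/mL figure to one decimal place.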
Table 24: Solubility of Glucagon-XTEN constructs
Example 29: Analysis of sequences for secondary structure by prediction algorithms
Amino acid sequences can be assessed for secondary structure via certain computer programs
or algorithms, such as the well-known Chou-Fasman algorithm (Chou, P. Y., et al. (1974) Biochemistry,
13: 222-45) and the Garnier-Osguthorpe-Robson, or “GOR” method (Garnier J, Gibrat JF, Robson B.
(1996). GOR method for predicting protein secondary structure from amino acid sequence. Methods
Enzymol 266:540-553). For a given sequence, the algorithms can predict whether there exists some or
no secondary structure at all, expressed as total and/or percentage of residues of the sequence that form,
for example, alpha-helices or beta-sheets or the percentage of residues of the sequence predicted to result
in random coil formation.
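The way such predictions are reported (residue totals and percentages per structural state) can be sketched independently of any particular prediction tool. The snippet below does not implement Chou-Fasman or GOR; it only summarizes a per-residue state string of the kind such tools produce. The one-letter state codes (H, E, C), the `summarize` helper and the example string are assumptions made for illustration.

```python
from collections import Counter


def summarize(states: str) -> dict:
    """Summarize a per-residue secondary-structure string.

    `states` holds one letter per residue: H (alpha-helix), E (beta-sheet)
    or C (random coil).
    """
    if not states:
        raise ValueError("empty prediction string")
    counts = Counter(states)
    n = len(states)
    return {
        "residues": n,
        "H_total": counts["H"],
        "E_total": counts["E"],
        "H_percent": round(100 * counts["H"] / n, 1),
        "E_percent": round(100 * counts["E"] / n, 1),
        "coil_percent": round(100 * counts["C"] / n, 2),
    }


# Hypothetical 36-residue prediction: 34 coil positions and 2 sheet positions.
example = "C" * 20 + "EE" + "C" * 14
summary = summarize(example)
print(summary)
```

With this hypothetical input, 34 of 36 residues in the coil state corresponds to 94.44% random coil.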
Several representative sequences from XTEN “families” have been assessed using two
algorithm tools for the Chou-Fasman and GOR methods to assess the degree of secondary structure in
these sequences. The Chou-Fasman tool was provided by William R. Pearson and the University of
Virginia, at the “Biosupport” internet site, URL located on the World Wide Web at
.fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=misc1 as it existed on June 19, 2009. The
GOR tool was provided by Pole Informatique Lyonnais at the Network Protein Sequence Analysis
internet site, URL located on the World Wide Web at .npsa-pbil.ibcp.fr/cgi-bin/secpred_gor4.pl as it
existed on June 19, 2008.
As a first step in the analyses, a single XTEN sequence was analyzed by the two algorithms.
The AE864 composition is an XTEN with 864 amino acid residues created from multiple copies of four 12
amino acid sequence motifs consisting of the amino acids G, S, T, E, P, and A. The sequence motifs are
characterized by the fact that there is limited repetitiveness within the motifs and within the overall
sequence in that the sequence of any two consecutive amino acids is not repeated more than twice in any
one 12 amino acid motif, and that no three contiguous amino acids of the full-length XTEN are identical.
Successively longer portions of the AE864 sequence from the N-terminus were analyzed by the
Chou-Fasman and GOR algorithms (the latter requires a minimum length of 17 amino acids). The sequences
were analyzed by entering the FASTA format sequences into the prediction tools and running the
analysis. The results from the analyses are presented in Table 25.
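The motif constraints described above (residues limited to G, S, T, E, P and A, and no pair of consecutive amino acids occurring more than twice within one 12-residue motif) lend themselves to a simple check. The sketch below applies it to the 36-residue AE36 sequence listed in Table 25; the helper names are hypothetical, and counting overlapping two-residue windows is one assumed reading of the repetitiveness rule.

```python
from collections import Counter

ALLOWED = set("GSTEPA")

# 36-residue AE36 sequence as listed in Table 25 (three 12-residue motifs).
AE36 = "GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP"


def split_motifs(seq: str, size: int = 12) -> list[str]:
    """Split a sequence into consecutive motifs of the given size."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def max_dipeptide_count(motif: str) -> int:
    """Highest number of times any two-residue window occurs in the motif."""
    pairs = Counter(motif[i:i + 2] for i in range(len(motif) - 1))
    return max(pairs.values())


motifs = split_motifs(AE36)
counts = [max_dipeptide_count(m) for m in motifs]

for motif, count in zip(motifs, counts):
    print(motif, count, "ok" if count <= 2 else "too repetitive")
print("allowed residues only:", set(AE36) <= ALLOWED)
```

For this sequence every motif stays within the limit of two occurrences per dipeptide.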
The results indicate that, by the Chou-Fasman calculations, short XTEN of the AE and AG
families, up to at least 288 amino acid residues, have no alpha-helices or beta sheets, but amounts of
predicted percentage of random coil by the GOR algorithm vary from 78-99%. With increasing XTEN
lengths of 504 residues to greater than 1300, the XTEN analyzed by the Chou-Fasman algorithm had
predicted percentages of alpha-helices or beta sheets of 0 to about 2%, while the calculated percentages
of random coil increased to 94-99%. Those XTEN with alpha-helices or beta sheets were those
sequences with one or more instances of three contiguous serine residues, which resulted in predicted
beta-sheet formation. However, even these sequences still had approximately 99% random coil
formation.
The data provided herein suggest that 1) XTEN created from multiple sequence motifs of G, S,
T, E, P, and A that have limited repetitiveness as to contiguous amino acids are predicted to have very
low amounts of alpha-helices and beta-sheets; 2) that increasing the length of the XTEN does not
appreciably increase the probability of alpha-helix or beta-sheet formation; and 3) that progressively
increasing the length of the XTEN sequence by addition of non-repetitive 12-mers consisting of the
amino acids G, S, T, E, P, and A results in increased percentage of random coil formation. Results
further indicate that XTEN sequences defined herein (including e.g., XTEN created from sequence
motifs of G, S, T, E, P, and A) that have limited repetitiveness (including those with no more than two
identical contiguous amino acids in any one motif) are expected to have very limited secondary structure.
Any order or combination of sequence motifs from Table 3 can be used to create an XTEN polypeptide
that will result in an XTEN sequence that is substantially devoid of secondary structure, though three
contiguous serines are not preferred. The unfavorable property of three contiguous serines, however, can
be ameliorated by increasing the length of the XTEN. Such sequences are expected to have the
characteristics described in the GLP2-XTEN embodiments of the invention disclosed herein.
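Because runs of three contiguous serines are singled out above as the feature associated with predicted beta-sheet formation, a candidate sequence can be screened for them directly. The following is a minimal sketch using two 36-residue sequences from Table 25; the function name is hypothetical, and positions are reported as 0-based string indices.

```python
import re

# 36-residue sequences as listed in Table 25.
AE36 = "GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP"
AG36 = "GASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP"


def serine_triplets(seq: str) -> list[int]:
    """Return the 0-based start index of every run of three contiguous serines."""
    # The lookahead lets overlapping runs (e.g. within SSSS) be reported separately.
    return [m.start() for m in re.finditer(r"(?=SSS)", seq)]


print("AE36:", serine_triplets(AE36))
print("AG36:", serine_triplets(AG36))
```

Such a screen only flags the sequence feature; whether a flagged run actually alters predicted structure still depends on the prediction tool and on the overall length of the XTEN.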
Table 25: CHOU-FASMAN and GOR prediction calculations of polypeptide sequences
SEQ No. Chou-Fasman GOR
Sequence
NAME Residues Calculation Calculation
AE36: GSPAGSPTSTEEGTSESATPESGPGTST 36 Residue totals: H: 0 E: 0 94.44%
LCW0402 EPSEGSAP
percent: H: 0.0 E: 0.0
AE36: GTSTEPSEGSAPGTSTEPSEGSAPGTST 36 Residue totals: H: 0 E: 0 94.44%
LCW0402 EPSEGSAP
percent: H: 0.0 E: 0.0
AG36: GASPGTSSTGSPGTPGSGTASSSPGSST 36 Residue totals: H: 0 E: 0 77.78%
LCW0404 PSGATGSP
percent: H: 0.0 E: 0.0
AG36: GSSTPSGATGSPGSSPSASTGTGPGSST 36 Residue totals: H: 0 E: 0 83.33%
LCW0404 SP
percent: H: 0.0 E: 0.0
7003
AE42_1 TEPSEGSAPGSPAGSPTSTEEGTSESAT 42 Residue totals: H: 0 E: 0 90.48%
PESGPGSEPATSGS
percent: H: 0.0 E: 0.0
AE42_1 TEPSEGSAPGSPAGSPTSTEEGTSESAT 42 Residue totals: H: 0 E: 0 90.48%
PESGPGSEPATSGS
percent: H: 0.0 E: 0.0
AG42_1 GAPSPSASTGTGPGTPGSGTASSSPGS 42 Residue totals: H: 0 E: 0 88.10%
STPSGATGSPGPSGP
percent: H: 0.0 E: 0.0
AG42_2 GPGTPGSGTASSSPGSSTPSGATGSPG 42 Residue totals: H: 0 E: 0 88.10%
SSPSASTGTGPGASP
percent: H: 0.0 E: 0.0
AE144 SGSETPGTSESATPESGPGSEP 144 Residue totals: H: 0 E: 0 98.61%
ATSGSETPGSPAGSPTSTEEGTSTEPSE
percent: H: 0.0 E: 0.0
EPATSGSETPGSEPATSGSETP
GSEPATSGSETPGTSTEPSEGSAPGTSE
SATPESGPGSEPATSGSETPGTSTEPSE
GSAP
AG144_1 PGSSPSASTGTGPGSSPSASTGTGPGTP 144 Residue totals: H: 0 E: 0 91.67%
GSGTASSSPGSSTPSGATGSPGSSPSAS
percent: H: 0.0 E: 0.0
TGTGPGASPGTSSTGSPGTPGSGTASS
SPGSSTPSGATGSPGTPGSGTASSSPG
ASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSS
AE288 GTSESATPESGPGSEPATSGSETPGTSE 288 Residue totals: H: 0 E: 0 99.31%
SATPESGPGSEPATSGSETPGTSESATP
percent: H: 0.0 E: 0.0
ESGPGTSTEPSEGSAPGSPAGSPTSTEE
GTSESATPESGPGSEPATSGSETPGTSE
SATPESGPGSPAGSPTSTEEGSPAGSPT
STEEGTSTEPSEGSAPGTSESATPESGP
GTSESATPESGPGTSESATPESGPGSEP
ATSGSETPGSEPATSGSETPGSPAGSPT
STEEGTSTEPSEGSAPGTSTEPSEGSAP
GSEPATSGSETPGTSESATPESGPGTST
EPSEGSAP
AG288_2 GSSPSASTGTGPGSSPSASTGTGPGTP 288 Residue totals: H: 0 E: 0 92.71%
GSGTASSSPGSSTPSGATGSPGSSPSAS
percent: H: 0.0 E: 0.0
TGTGPGASPGTSSTGSPGTPGSGTASS
SPGSSTPSGATGSPGTPGSGTASSSPG
ASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGSSTPSGATGSPGASPGTSST
GSPGTPGSGTASSSPGSSTPSGATGSP
GSSPSASTGTGPGSSPSASTGTGPGSST
PSGATGSPGSSTPSGATGSPGASPGTS
STGSPGASPGTSSTGSPGASPGTSSTGS
PGTPGSGTASSSP
AF504 GASPGTSSTGSPGSSPSASTGTGPGSSP 504 Residue totals: H: 0 E: 0 94.44%
SASTGTGPGTPGSGTASSSPGSSTPSG
percent: H: 0.0 E: 0.0
ATGSPGSNPSASTGTGPGASPGTSSTG
SGTASSSPGSSTPSGATGSPGT
PGSGTASSSPGASPGTSSTGSPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGAT
GSPGASPGTSSTGSPGTPGSGTASSSP
GSSTPSGATGSPGSNPSASTGTGPGSS
PSASTGTGPGSSTPSGATGSPGSSTPSG
ATGSPGASPGTSSTGSPGASPGTSSTG
SPGASPGTSSTGSPGTPGSGTASSSPG
ASPGTSSTGSPGASPGTSSTGSPGASP
SPGSSPSASTGTGPGTPGSGT
ASPGTSSTGSPGASPGTSSTGS
PGASPGTSSTGSPGSSTPSGATGSPGSS
TPSGATGSPGASPGTSSTGSPGTPGSG
TASSSPGSSTPSGATGSPGSSTPSGATG
SPGSSTPSGATGSPGSSPSASTGTGPG
ASPGTSSTGSP
AD 576 GSSESGSSEGGPGSGGEPSESGSSGSSE 576 Residue totals: H: 7 E: 0 99.65%
GPGSSESGSSEGGPGSSESGSS percent: H: 1.2 E: 0.0
EGGPGSSESGSSEGGPGSSESGSSEGG
PGESPGGSSGSESGSEGSSGPGESSGSS
ESGSSEGGPGSSESGSSEGGPGSSESGS
SEGGPGSGGEPSESGSSGESPGGSSGS
ESGESPGGSSGSESGSGGEPSESGSSGS
SESGSSEGGPGSGGEPSESGSSGSGGE
SGSEGSSGPGESSGESPGGSSG
SESGSGGEPSESGSSGSGGEPSESGSSG
SGGEPSESGSSGSSESGSSEGGPGESPG
GSSGSESGESPGGSSGSESGESPGGSS
SPGGSSGSESGESPGGSSGSES
GSSESGSSEGGPGSGGEPSESGSSGSE
GSSGPGESSGSSESGSSEGGPGSGGEP
SESGSSGSSESGSSEGGPGSGGEPSESG
SSGESPGGSSGSESGESPGGSSGSESGS
SESGSSEGGPGSGGEPSESGSSGSSESG
SSEGGPGSGGEPSESGSSGSGGEPSES
GSSGESPGGSSGSESGSEGSSGPGESS
GSSESGSSEGGPGSEGSSGPGESS
AE576 GSPAGSPTSTEEGTSESATPESGPGTST 576 Residue totals: H: 2 E: 0 99.65%
EPSEGSAPGSPAGSPTSTEEGTSTEPSE
percent: H: 0.4 E: 0.0
GSAPGTSTEPSEGSAPGTSESATPESGP
GSEPATSGSETPGSEPATSGSETPGSPA
GSPTSTEEGTSESATPESGPGTSTEPSE
GSAPGTSTEPSEGSAPGSPAGSPTSTEE
GTSTEPSEGSAPGTSTEPSEGSAPGTSE
SATPESGPGTSTEPSEGSAPGTSESATP
ESGPGSEPATSGSETPGTSTEPSEGSAP
GTSTEPSEGSAPGTSESATPESGPGTSE
SATPESGPGSPAGSPTSTEEGTSESATP
ESGPGSEPATSGSETPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGTST
EPSEGSAPGTSTEPSEGSAPGTSTEPSE
GSAPGTSTEPSEGSAPGSPAGSPTSTEE
GTSTEPSEGSAPGTSESATPESGPGSEP
ATSGSETPGTSESATPESGPGSEPATS
GSETPGTSESATPESGPGTSTEPSEGSA
PGTSESATPESGPGSPAGSPTSTEEGSP
AGSPTSTEEGSPAGSPTSTEEGTSESAT
PESGPGTSTEPSEGSAP
AG576 PGTPGSGTASSSPGSSTPSGATGSPGSS 576 Residue totals: H: 0 E: 3 99.31%
PSASTGTGPGSSPSASTGTGPGSSTPSG
percent: H: 0.4 E: 0.5
ATGSPGSSTPSGATGSPGASPGTSSTG
GTSSTGSPGASPGTSSTGSPGT
PGSGTASSSPGASPGTSSTGSPGASPG
TSSTGSPGASPGTSSTGSPGSSPSASTG
TGPGTPGSGTASSSPGASPGTSSTGSP
GASPGTSSTGSPGASPGTSSTGSPGSST
PSGATGSPGSSTPSGATGSPGASPGTS
STGSPGTPGSGTASSSPGSSTPSGATGS
PGSSTPSGATGSPGSSTPSGATGSPGSS
PSASTGTGPGASPGTSSTGSPGASPGT
SSTGSPGTPGSGTASSSPGASPGTSSTG
SPGASPGTSSTGSPGASPGTSSTGSPG
ASPGTSSTGSPGTPGSGTASSSPGSSTP
SGATGSPGTPGSGTASSSPGSSTPSGA
TGSPGTPGSGTASSSPGSSTPSGATGSP
GSSTPSGATGSPGSSPSASTGTGPGSSP
SASTGTGPGASPGTSSTGSPGTPGSGT
ASSSPGSSTPSGATGSPGSSPSASTGTG
ASTGTGPGASPGTSSTGS
AF540 GSTSSTAESPGPGSTSSTAESPGPGSTS 540 Residue totals: H: 2 E: 0 99.65%
ESPSGTAPGSTSSTAESPGPGSTSSTAE percent: H: 0.4 E: 0.0
SPGPGTSTPESGSASPGSTSESPSGTAP
GTSPSGESSTAPGSTSESPSGTAPGSTS
ESPSGTAPGTSPSGESSTAPGSTSESPS
GTAPGSTSESPSGTAPGTSPSGESSTAP
GSTSESPSGTAPGSTSESPSGTAPGSTS
APGTSTPESGSASPGSTSESPS
GTAPGTSTPESGSASPGSTSSTAESPGP
GSTSSTAESPGPGTSTPESGSASPGTST
PESGSASPGSTSESPSGTAPGTSTPESG
SASPGTSTPESGSASPGSTSESPSGTAP
GSTSESPSGTAPGSTSESPSGTAPGSTS
STAESPGPGTSTPESGSASPGTSTPESG
SASPGSTSESPSGTAPGSTSESPSGTAP
SGSASPGSTSESPSGTAPGSTS
ESPSGTAPGTSTPESGSASPGTSPSGES
STAPGSTSSTAESPGPGTSPSGESSTAP
AESPGPGTSTPESGSASPGSTS
AD836 GSSESGSSEGGPGSSESGSSEGGPGESP 836 Residue totals: H: 0 E: 0 98.44%
GGSSGSESGSGGEPSESGSSGESPGGS
percent: H: 0.0 E: 0.0
SGSESGESPGGSSGSESGSSESGSSEGG
PGSSESGSSEGGPGSSESGSSEGGPGES
PGGSSGSESGESPGGSSGSESGESPGG
SSGSESGSSESGSSEGGPGSSESGSSEG
GPGSSESGSSEGGPGSSESGSSEGGPG
SSESGSSEGGPGSSESGSSEGGPGSGG
EPSESGSSGESPGGSSGSESGESPGGSS
GSESGSGGEPSESGSSGSEGSSGPGESS
GSSESGSSEGGPGSGGEPSESGSSGSE
GSSGPGESSGSSESGSSEGGPGSGGEP
SESGSSGESPGGSSGSESGSGGEPSESG
SSGSGGEPSESGSSGSSESGSSEGGPGS
GGEPSESGSSGSGGEPSESGSSGSEGSS
GPGESSGESPGGSSGSESGSEGSSGPG
ESSGSEGSSGPGESSGSGGEPSESGSSG
SSESGSSEGGPGSSESGSSEGGPGESPG
GSSGSESGSGGEPSESGSSGSEGSSGP
GESSGESPGGSSGSESGSEGSSGPGSSE
SGSSEGGPGSGGEPSESGSSGSEGSSG
PGESSGSEGSSGPGESSGSEGSSGPGES
SGSGGEPSESGSSGSGGEPSESGSSGES
PGGSSGSESGESPGGSSGSESGSGGEP
SESGSSGSEGSSGPGESSGESPGGSSGS
ESGSSESGSSEGGPGSSESGSSEGGPGS
SESGSSEGGPGSGGEPSESGSSGSSESG
SSEGGPGESPGGSSGSESGSGGEPSES
GSSGSSESGSSEGGPGESPGGSSGSES
GSGGEPSESGSSGESPGGSSGSESGSG
GEPSESGSS
AE864 GSPAGSPTSTEEGTSESATPESGPGTST 864 Residue totals: H: 2 E: 3 99.77%
EPSEGSAPGSPAGSPTSTEEGTSTEPSE
percent: H: 0.2 E: 0.4
GSAPGTSTEPSEGSAPGTSESATPESGP
GSEPATSGSETPGSEPATSGSETPGSPA
GSPTSTEEGTSESATPESGPGTSTEPSE
GSAPGTSTEPSEGSAPGSPAGSPTSTEE
GTSTEPSEGSAPGTSTEPSEGSAPGTSE
SATPESGPGTSTEPSEGSAPGTSESATP
ESGPGSEPATSGSETPGTSTEPSEGSAP
GTSTEPSEGSAPGTSESATPESGPGTSE
SATPESGPGSPAGSPTSTEEGTSESATP
ESGPGSEPATSGSETPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGTST
EPSEGSAPGTSTEPSEGSAPGTSTEPSE
STEPSEGSAPGSPAGSPTSTEE
SEGSAPGTSESATPESGPGSEP
ATSGSETPGTSESATPESGPGSEPATS
GSETPGTSESATPESGPGTSTEPSEGSA
PGTSESATPESGPGSPAGSPTSTEEGSP
TEEGSPAGSPTSTEEGTSESAT
PESGPGTSTEPSEGSAPGTSESATPESG
PGSEPATSGSETPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGTSTEPS
SPAGSPTSTEEGTSESATPES
GPGSEPATSGSETPGTSESATPESGPGS
PAGSPTSTEEGSPAGSPTSTEEGTSTEP
SEGSAPGTSESATPESGPGTSESATPES
GPGTSESATPESGPGSEPATSGSETPGS
EPATSGSETPGSPAGSPTSTEEGTSTEP
SEGSAPGTSTEPSEGSAPGSEPATSGSE
TPGTSESATPESGPGTSTEPSEGSAP
AF864 PSGTAPGTSPSGESSTAPGSTS 875 Residue totals: H: 2 E: 0 95.20%
ESPSGTAPGSTSESPSGTAPGTSTPESG percent: H: 0.2 E: 0.0
SASPGTSTPESGSASPGSTSESPSGTAP
GSTSESPSGTAPGTSPSGESSTAPGSTS
ESPSGTAPGTSPSGESSTAPGTSPSGES
STAPGSTSSTAESPGPGTSPSGESSTAP
ESSTAPGSTSSTAESPGPGTST
PESGSASPGTSTPESGSASPGSTSESPS
GTAPGSTSESPSGTAPGTSTPESGSASP
GSTSSTAESPGPGTSTPESGSASPGSTS
ESPSGTAPGTSPSGESSTAPGSTSSTAE
SPGPGTSPSGESSTAPGTSTPESGSASP
AESPGPGSTSSTAESPGPGSTS
STAESPGPGSTSSTAESPGPGTSPSGES
STAPGSTSESPSGTAPGSTSESPSGTAP
GTSTPESGPXXXGASASGAPSTXXXX
SESPSGTAPGSTSESPSGTAPGSTSESP
SGTAPGSTSESPSGTAPGSTSESPSGTA
PGSTSESPSGTAPGTSTPESGSASPGTS
PSGESSTAPGTSPSGESSTAPGSTSSTA
ESPGPGTSPSGESSTAPGTSTPESGSAS
PGSTSESPSGTAPGSTSESPSGTAPGTS
TAPGSTSESPSGTAPGTSTPES
GSASPGTSTPESGSASPGSTSESPSGTA
PGTSTPESGSASPGSTSSTAESPGPGST
SESPSGTAPGSTSESPSGTAPGTSPSGE
SSTAPGSTSSTAESPGPGTSPSGESSTA
PGTSTPESGSASPGTSPSGESSTAPGTS
PSGESSTAPGTSPSGESSTAPGSTSSTA
STSSTAESPGPGTSPSGESSTA
PGSSPSASTGTGPGSSTPSGATGSPGSS
TPSGATGSP
AG864 GASPGTSSTGSPGSSPSASTGTGPGSSP 864 Residue totals: H: 0 E: 0 94.91%
GPGTPGSGTASSSPGSSTPSG
percent: H: 0.0 E: 0.0
SSPSASTGTGPGASPGTSSTG
SPGTPGSGTASSSPGSSTPSGATGSPGT
PGSGTASSSPGASPGTSSTGSPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGAT
GSPGASPGTSSTGSPGTPGSGTASSSP
GSSTPSGATGSPGSSPSASTGTGPGSSP
SASTGTGPGSSTPSGATGSPGSSTPSG
ATGSPGASPGTSSTGSPGASPGTSSTG
SPGASPGTSSTGSPGTPGSGTASSSPG
ASPGTSSTGSPGASPGTSSTGSPGASP
GTSSTGSPGSSPSASTGTGPGTPGSGT
ASSSPGASPGTSSTGSPGASPGTSSTGS
PGASPGTSSTGSPGSSTPSGATGSPGSS
TPSGATGSPGASPGTSSTGSPGTPGSG
TASSSPGSSTPSGATGSPGSSTPSGATG
SPGSSTPSGATGSPGSSPSASTGTGPG
ASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGASPGTSSTGSPGASPGTSST
GSPGASPGTSSTGSPGASPGTSSTGSP
GTPGSGTASSSPGSSTPSGATGSPGTP
GSGTASSSPGSSTPSGATGSPGTPGSG
GSSTPSGATGSPGSSTPSGATG
SPGSSPSASTGTGPGSSPSASTGTGPG
ASPGTSSTGSPGTPGSGTASSSPGSSTP
SGATGSPGSSPSASTGTGPGSSPSAST
GTGPGASPGTSSTGSPGASPGTSSTGS
PGSSTPSGATGSPGSSPSASTGTGPGA
SPGTSSTGSPGSSPSASTGTGPGTPGSG
TASSSPGSSTPSGATGSPGSSTPSGATG
SPGASPGTSSTGSP
AM875 GTSTEPSEGSAPGSEPATSGSETPGSPA 875 Residue totals: H: 7 E: 3 98.63%
GSPTSTEEGSTSSTAESPGPGTSTPESG
percent: H: 0.8 E: 0.3
SASPGSTSESPSGTAPGSTSESPSGTAP
GTSTPESGSASPGTSTPESGSASPGSEP
ATSGSETPGTSESATPESGPGSPAGSPT
STEEGTSTEPSEGSAPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGSPA
GSPTSTEEGTSTEPSEGSAPGTSTEPSE
GSAPGTSESATPESGPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGTSE
SATPESGPGTSTEPSEGSAPGSEPATSG
SETPGSPAGSPTSTEEGSSTPSGATGSP
GTPGSGTASSSPGSSTPSGATGSPGTS
TEPSEGSAPGTSTEPSEGSAPGSEPATS
GSETPGSPAGSPTSTEEGSPAGSPTSTE
EGTSTEPSEGSAPGASASGAPSTGGTS
ESATPESGPGSPAGSPTSTEEGSPAGSP
TSTEEGSTSSTAESPGPGSTSESPSGTA
PGTSPSGESSTAPGTPGSGTASSSPGSS
TPSGATGSPGSSPSASTGTGPGSEPAT
SGSETPGTSESATPESGPGSEPATSGSE
TPGSTSSTAESPGPGSTSSTAESPGPGT
SPSGESSTAPGSEPATSGSETPGSEPAT
SGSETPGTSTEPSEGSAPGSTSSTAESP
GPGTSTPESGSASPGSTSESPSGTAPGT
STEPSEGSAPGTSTEPSEGSAPGTSTEP
SEGSAPGSSTPSGATGSPGSSPSASTGT
GPGASPGTSSTGSPGSEPATSGSETPG
TSESATPESGPGSPAGSPTSTEEGSSTP
SGATGSPGSSPSASTGTGPGASPGTSS
TGSPGTSESATPESGPGTSTEPSEGSAP
SEGSAP
AM1318 GTSTEPSEGSAPGSEPATSGSETPGSPA 1318 Residue totals: H: 7 E: 0 99.17%
GSPTSTEEGSTSSTAESPGPGTSTPESG
percent: H: 0.7 E: 0.0
TSESPSGTAPGSTSESPSGTAP
SGSASPGTSTPESGSASPGSEP
ATSGSETPGTSESATPESGPGSPAGSPT
STEEGTSTEPSEGSAPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGSPA
GSPTSTEEGTSTEPSEGSAPGTSTEPSE
GSAPGTSESATPESGPGTSESATPESGP
GTSTEPSEGSAPGTSTEPSEGSAPGTSE
SATPESGPGTSTEPSEGSAPGSEPATSG
SETPGSPAGSPTSTEEGSSTPSGATGSP
GTPGSGTASSSPGSSTPSGATGSPGTS
TEPSEGSAPGTSTEPSEGSAPGSEPATS
GSETPGSPAGSPTSTEEGSPAGSPTSTE
PSEGSAPGPEPTGPAPSGGSEP
ATSGSETPGTSESATPESGPGSPAGSPT
STEEGTSESATPESGPGSPAGSPTSTEE
GSPAGSPTSTEEGTSESATPESGPGSPA
GSPTSTEEGSPAGSPTSTEEGSTSSTAE
SPGPGSTSESPSGTAPGTSPSGESSTAP
GSTSESPSGTAPGSTSESPSGTAPGTSP
SGESSTAPGTSTEPSEGSAPGTSESATP
ESGPGTSESATPESGPGSEPATSGSETP
GTSESATPESGPGTSESATPESGPGTST
EPSEGSAPGTSESATPESGPGTSTEPSE
GSAPGTSPSGESSTAPGTSPSGESSTAP
GTSPSGESSTAPGTSTEPSEGSAPGSPA
EEGTSTEPSEGSAPGSSPSAST
GTGPGSSTPSGATGSPGSSTPSGATGS
PGSSTPSGATGSPGSSTPSGATGSPGA
SPGTSSTGSPGASASGAPSTGGTSPSG
ESSTAPGSTSSTAESPGPGTSPSGESST
APGTSESATPESGPGTSTEPSEGSAPG
TSTEPSEGSAPGSSPSASTGTGPGSSTP
SGATGSPGASPGTSSTGSPGTSTPESG
SASPGTSPSGESSTAPGTSPSGESSTAP
GTSESATPESGPGSEPATSGSETPGTST
EPSEGSAPGSTSESPSGTAPGSTSESPS
GTAPGTSTPESGSASPGSPAGSPTSTEE
GTSESATPESGPGTSTEPSEGSAPGSPA
GSPTSTEEGTSESATPESGPGSEPATSG
STPSGATGSPGASPGTSSTGSP
GSSTPSGATGSPGSTSESPSGTAPGTSP
SGESSTAPGSTSSTAESPGPGSSTPSGA
TGSPGASPGTSSTGSPGTPGSGTASSSP
GSPAGSPTSTEEGSPAGSPTSTEEGTST
EPSEGSAP
AM923 MAEPAGSPTSTEEGASPGTSSTGSPGS 924 Residue totals: H: 4 E: 3 98.70%
STPSGATGSPGSSTPSGATGSPGTSTEP percent: H: 0.4 E: 0.3
SEGSAPGSEPATSGSETPGSPAGSPTST
EEGSTSSTAESPGPGTSTPESGSASPGS
TSESPSGTAPGSTSESPSGTAPGTSTPE
SGSASPGTSTPESGSASPGSEPATSGSE
TPGTSESATPESGPGSPAGSPTSTEEGT
GSAPGTSESATPESGPGTSTEP
SEGSAPGTSTEPSEGSAPGSPAGSPTST
EEGTSTEPSEGSAPGTSTEPSEGSAPGT
SESATPESGPGTSESATPESGPGTSTEP
SEGSAPGTSTEPSEGSAPGTSESATPES
GPGTSTEPSEGSAPGSEPATSGSETPGS
PAGSPTSTEEGSSTPSGATGSPGTPGS
PGSSTPSGATGSPGTSTEPSEG
SAPGTSTEPSEGSAPGSEPATSGSETPG
SPAGSPTSTEEGSPAGSPTSTEEGTSTE
PSEGSAPGASASGAPSTGGTSESATPE
SGPGSPAGSPTSTEEGSPAGSPTSTEEG
STSSTAESPGPGSTSESPSGTAPGTSPS
GESSTAPGTPGSGTASSSPGSSTPSGA
TGSPGSSPSASTGTGPGSEPATSGSETP
GTSESATPESGPGSEPATSGSETPGSTS
STAESPGPGSTSSTAESPGPGTSPSGES
STAPGSEPATSGSETPGSEPATSGSETP
GTSTEPSEGSAPGSTSSTAESPGPGTST
PESGSASPGSTSESPSGTAPGTSTEPSE
GSAPGTSTEPSEGSAPGTSTEPSEGSAP
GSSTPSGATGSPGSSPSASTGTGPGAS
PGTSSTGSPGSEPATSGSETPGTSESAT
PESGPGSPAGSPTSTEEGSSTPSGATGS
PGSSPSASTGTGPGASPGTSSTGSPGTS
ESATPESGPGTSTEPSEGSAPGTSTEPS
EGSAP
AE912 MAEPAGSPTSTEEGTPGSGTASSSPGS 913 Residue totals: H: 8 E: 3 99.45%
STPSGATGSPGASPGTSSTGSPGSPAG percent: H: 0.9 E: 0.3
SPTSTEEGTSESATPESGPGTSTEPSEG
SAPGSPAGSPTSTEEGTSTEPSEGSAPG
TSTEPSEGSAPGTSESATPESGPGSEPA
TSGSETPGSEPATSGSETPGSPAGSPTS
TEEGTSESATPESGPGTSTEPSEGSAPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTE
PSEGSAPGTSTEPSEGSAPGTSESATPE
SGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTE
PSEGSAPGTSESATPESGPGTSESATPE
SGPGSPAGSPTSTEEGTSESATPESGPG
SEPATSGSETPGTSESATPESGPGTSTE
PSEGSAPGTSTEPSEGSAPGTSTEPSEG
SAPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGSPAGSPTSTEEGTSTE
PSEGSAPGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSEPATSGSETPG
PESGPGTSTEPSEGSAPGTSES
ATPESGPGSPAGSPTSTEEGSPAGSPTS
TEEGSPAGSPTSTEEGTSESATPESGPG
EGSAPGTSESATPESGPGSEPA
TSGSETPGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGTSTEPSEGSAPG
SPAGSPTSTEEGTSESATPESGPGSEPA
TSGSETPGTSESATPESGPGSPAGSPTS
TEEGSPAGSPTSTEEGTSTEPSEGSAPG
TSESATPESGPGTSESATPESGPGTSES
ATPESGPGSEPATSGSETPGSEPATSGS
ETPGSPAGSPTSTEEGTSTEPSEGSAPG
TSTEPSEGSAPGSEPATSGSETPGTSES
ATPESGPGTSTEPSEGSAP
BC 864 GTSTEPSEPGSAGTSTEPSEPGSAGSEP Residue totals: H: 0 E: 0 99.77%
PSGSGASEPTSTEPGSEPATS percent: H: 0 E: 0
GTEPSGSEPATSGTEPSGSEPATSGTEP
SGSGASEPTSTEPGTSTEPSEPGSAGSE
PATSGTEPSGTSTEPSEPGSAGSEPATS
GTEPSGSEPATSGTEPSGTSTEPSEPGS
AGTSTEPSEPGSAGSEPATSGTEPSGS
EPATSGTEPSGTSEPSTSEPGAGSGAS
EPTSTEPGTSEPSTSEPGAGSEPATSGT
EPSGSEPATSGTEPSGTSTEPSEPGSAG
TSTEPSEPGSAGSGASEPTSTEPGSEPA
TSGTEPSGSEPATSGTEPSGSEPATSGT
EPSGSEPATSGTEPSGTSTEPSEPGSAG
GTEPSGSGASEPTSTEPGTSTE
PSEPGSAGSEPATSGTEPSGSGASEPTS
TEPGTSTEPSEPGSAGSGASEPTSTEPG
SEPATSGTEPSGSGASEPTSTEPGSEPA
TSGTEPSGSGASEPTSTEPGTSTEPSEP
GSAGSEPATSGTEPSGSGASEPTSTEP
GTSTEPSEPGSAGSEPATSGTEPSGTST
EPSEPGSAGSEPATSGTEPSGTSTEPSE
PGSAGTSTEPSEPGSAGTSTEPSEPGSA
GTSTEPSEPGSAGTSTEPSEPGSAGTST
EPSEPGSAGTSEPSTSEPGAGSGASEPT
STEPGTSTEPSEPGSAGTSTEPSEPGSA
GTSTEPSEPGSAGSEPATSGTEPSGSG
TEPGSEPATSGTEPSGSEPATS
GTEPSGSEPATSGTEPSGSEPATSGTEP
SGTSEPSTSEPGAGSEPATSGTEPSGSG
ASEPTSTEPGTSTEPSEPGSAGSEPATS
GTEPSGSGASEPTSTEPGTSTEPSEPGS
* H: alpha-helix E: beta-sheet
Example 30: Analysis of polypeptide sequences for repetitiveness
In this Example, different polypeptides, including several XTEN sequences, were assessed for
repetitiveness in the amino acid sequence. Polypeptide amino acid sequences can be assessed for
repetitiveness by quantifying the number of times a shorter subsequence appears within the overall
polypeptide. For example, a polypeptide of 200 amino acid residues in length has a total of 165
overlapping 36-amino acid “blocks” (or “36-mers”) and 198 3-mer “subsequences”, but the number of
unique 3-mer subsequences will depend on the amount of repetitiveness within the sequence. For the
analyses, different polypeptide sequences were assessed for repetitiveness by determining the
subsequence score obtained by application of the following equation:
$$\text{Subsequence score} = \frac{\sum_{i=1}^{m} \text{Count}_i}{m}$$
wherein: $m$ = (amino acid length of polypeptide) - (amino acid length of subsequence) +
1; and $\text{Count}_i$ = cumulative number of occurrences of each unique subsequence within
$\text{sequence}_i$.
In the analyses of the present Example, the subsequence scores for the polypeptides of Table 26 were
determined using the foregoing equation in a computer program using the algorithm depicted in the
accompanying figure, wherein the subsequence length was set at 3 amino acids. The resulting subsequence
score is a reflection of the degree of repetitiveness within the polypeptide.
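The calculation can be illustrated with a minimal Python sketch. It follows one literal reading of the equation, in which Count_i is taken to be the number of times the 3-mer starting at position i occurs anywhere in the polypeptide; this reading, the function name `subsequence_score`, and the 200-residue test sequence are assumptions for illustration and are not the program used in the Example.

```python
from collections import Counter


def subsequence_score(polypeptide: str, k: int = 3) -> float:
    """Average, over all m window positions, of the occurrence count of the
    k-mer found at that position (one reading of the equation; an assumption)."""
    m = len(polypeptide) - k + 1  # m = polypeptide length - subsequence length + 1
    if m < 1:
        raise ValueError("polypeptide is shorter than the subsequence length")
    kmers = [polypeptide[i:i + k] for i in range(m)]
    counts = Counter(kmers)  # occurrences of each unique k-mer
    return sum(counts[kmer] for kmer in kmers) / m


# Hypothetical 200-residue stretch built from the GSGGEG repeat seen in J288.
repeat_200 = ("GSGGEG" * 34)[:200]

print(len(repeat_200) - 36 + 1)       # 165 overlapping 36-mers
print(len(repeat_200) - 3 + 1)        # 198 overlapping 3-mers
print(subsequence_score(repeat_200))  # 33.0
```

Under this reading, a 200-residue stretch of the GSGGEG repeat scores 33.0, close to the 33.3 listed for J288 in Table 26; the small difference presumably reflects details of the original program that are not reproduced here. A sequence in which every 3-mer is unique scores 1.0.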
The results, shown in Table 26, indicate that the unstructured polypeptides consisting of 2 or 3
amino acid types have high subsequence scores, while those consisting of the 12 amino acid motifs of
the six amino acids G, S, T, E, P, and A, with a low degree of internal repetitiveness, have subsequence
scores of less than 10, and in some cases, less than 5. For example, the L288 sequence has two amino
acid types and has short, highly repetitive sequences, resulting in a subsequence score of 50.0. The
polypeptide J288 has three amino acid types but also has short, repetitive sequences, resulting in a
subsequence score of 33.3. Y576 also has three amino acid types, but is not made of internal repeats,
reflected in the subsequence score of 15.7 over the first 200 amino acids. W576 consists of four types of
amino acids, but has a higher degree of internal repetitiveness, e.g., “GGSG”, resulting in a subsequence
score of 23.4. The AD576 consists of four types of 12 amino acid motifs, each consisting of four types
of amino acids. Because of the low degree of internal repetitiveness of the individual motifs, the overall
subsequence score over the first 200 amino acids is 13.6. In contrast, XTENs consisting of four motifs
containing six types of amino acids, each with a low degree of internal repetitiveness, have lower
subsequence scores; i.e., AE864 (6.1), AF864 (7.5), and AM875 (4.5), while XTEN consisting of four
motifs containing five types of amino acids was intermediate; i.e., AG864, with a score of 7.2.
Conclusions: The results indicate that the combination of 12 amino acid subsequence motifs,
each consisting of four to six amino acid types that are non-repetitive, into a longer XTEN polypeptide
results in an overall sequence that is substantially non-repetitive, as indicated by overall average
subsequence scores less than 10 and, in many cases, less than 5. This is despite the fact that each
subsequence motif may be used multiple times across the sequence. In contrast, polymers created from
smaller numbers of amino acid types resulted in higher average subsequence scores, with polypeptides
consisting of two amino acid types having higher scores than those consisting of three amino acid types.
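The relationship between the number of amino acid types and the number of distinct 3-mers can be checked with a short Python sketch. The helper name `composition_profile` is hypothetical, and the test strings are short illustrative stretches built from repeat units and 12-residue motifs visible in the Table 26 sequences, not the full SEQ ID entries.

```python
def composition_profile(sequence: str, k: int = 3) -> tuple[int, int]:
    """Return (number of distinct residue types, number of distinct k-mers)."""
    residue_types = len(set(sequence))
    unique_kmers = len({sequence[i:i + k] for i in range(len(sequence) - k + 1)})
    return residue_types, unique_kmers


# Illustrative stretches (assumptions): repeat units as seen in J288 and L288,
# and four 12-residue motifs as seen in the AE-family sequences.
j_like = "GSGGEG" * 8
l_like = "SSSSESSSESSE" * 4
ae_like = "GSPAGSPTSTEE" + "GTSESATPESGP" + "GTSTEPSEGSAP" + "GSEPATSGSETP"

for name, seq in [("J-like", j_like), ("L-like", l_like), ("AE-like", ae_like)]:
    types, kmers = composition_profile(seq)
    print(f"{name}: {len(seq)} residues, {types} residue types, {kmers} distinct 3-mers")
```

Sequences of equal length with fewer distinct 3-mers reuse each 3-mer more often, which is what drives the higher subsequence scores of the two- and three-residue-type polymers.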
Table 26: Average subsequence score calculations of polypeptide sequences
Seq Name    SEQ ID NO:    Amino Acid Sequence    Score
J288 783 GSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEG 33.3
GSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEG
GSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEG
GSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEG
GSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEG
GSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEGGSGGEG
K288 784 GEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGE 46.9
GEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGE
GGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGG
GEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGE
GGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGG
EGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEGEGGGEGGEG
EGGGEG
L288 785 SSSSESSSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSS 50.0
SESSSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSE
SSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSSSESSSE
SSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSSSES
SSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSSSESSSESSESSS
Y288 786 GEGSGEGSEGEGSEGSGEGEGSEGSGEGEGGSEGSEGEGGSEGSEGEGGS 26.8
EGSEGEGSGEGSEGEGGSEGSEGEGSGEGSEGEGSEGGSEGEGGSEGSEG
EGSGEGSEGEGGEGGSEGEGSEGSGEGEGSGEGSEGEGSEGSGEGEGSGE
GSEGEGSEGSGEGEGSEGSGEGEGGSEGSEGEGSEGSGEGEGGEGSGEGE
GSGEGSEGEGGGEGSEGEGSGEGGEGEGSEGGSEGEGGSEGGEGEGSEG
SGEGEGSEGGSEGEGSEGGSEGEGSEGSGEGEGSEGSGE
Q576 787 GGKPGEGGKPEGGGGKPGGKPEGEGEGKPGGKPEGGGKPGGGEGGKPE 18.5
GGKPEGEGKPGGGEGKPGGKPEGGGGKPEGEGKPGGGGGKPGGKPEGE
GKPGGGEGGKPEGKPGEGGEGKPGGKPEGGGEGKPGGGKPGEGGKPGE
GKPGGGEGGKPEGGKPEGEGKPGGGEGKPGGKPGEGGKPEGGGEGKPG
GKPGEGGEGKPGGGKPEGEGKPGGGKPGGGEGGKPEGEGKPGGKPEGG
GEGKPGGKPEGGGKPEGGGEGKPGGGKPGEGGKPGEGEGKPGGKPEGE
GKPGGEGGGKPEGKPGGGEGGKPEGGKPGEGGKPEGGKPGEGGEGKPG
GGKPGEGGKPEGGGKPEGEGKPGGGGKPGEGGKPEGGKPEGGGEGKPG
GGKPEGEGKPGGGEGKPGGKPEGGGGKPGEGGKPEGGKPGGEGGGKPE
GEGKPGGKPGEGGGGKPGGKPEGEGKPGEGGEGKPGGKPEGGGEGKPG
GKPEGGGEGKPGGGKPGEGGKPEGGGKPGEGGKPGEGGKPEGEGKPGG
GEGKPGGKPGEGGKPEGGGEGKPGGKPGGEGGGKPEGGKPGEGGKPEG
U576 788 GKPGSGGGKPGEGGKPGSGEGKPGGKPGSGGSGKPGGKPGEG 18.1
SGGKPGGGGKPGGKPGGEGSGKPGGKPEGGGKPEGGSGGKPG
GKPEGGSGGKPGGKPGSGEGGKPGGGKPGGEGKPGSGKPGGEGSGKPG
GKPEGGSGGKPGGKPEGGSGGKPGGSGKPGGKPGEGGKPEGGSGGKPG
GSGKPGGKPEGGGSGKPGGKPGEGGKPGSGEGGKPGGGKPGGEGKPGS
GKPGGEGSGKPGGKPGSGGEGKPGGKPEGGSGGKPGGGKPGGEGKPGS
GGKPGEGGKPGSGGGKPGGKPGGEGEGKPGGKPGEGGKPGGEGSGKPG
GGGKPGGKPGGEGGKPEGSGKPGGGSGKPGGKPEGGGGKPEGSGKPGG
GGKPEGSGKPGGGKPEGGSGGKPGGSGKPGGKPGEGGGKPEGSGKPGG
GSGKPGGKPEGGGKPEGGSGGKPGGKPEGGSGGKPGGKPGGEGSGKPG
GKPGSGEGGKPGGKPGEGSGGKPGGKPEGGSGGKPGGSGKPGGKPEGG
GSGKPGGKPGEGGKPGGEGSGKPGGSGKPG
W576 789 GGSGKPGKPGGSGSGKPGSGKPGGGSGKPGSGKPGGGSGKPGSGKPGG 23.4
SGKPGGGGKPGSGSGKPGGGKPGGSGGKPGGGSGKPGKPGSG
GSGKPGSGKPGGGSGGKPGKPGSGGSGGKPGKPGSGGGSGKPGKPGSG
GSGGKPGKPGSGGSGGKPGKPGSGGSGKPGSGKPGGGSGKPGSGKPGSG
GSGKPGKPGSGGSGKPGSGKPGSGSGKPGSGKPGGGSGKPGSGKPGSGG
SGKPGKPGSGGGKPGSGSGKPGGGKPGSGSGKPGGGKPGGSGGKPGGS
GGKPGKPGSGGGSGKPGKPGSGGGSGKPGKPGGSGSGKPGSGKPGGGS
KPGSGGSGKPGKPGSGGSGGKPGKPGSGGGKPGSGSGKPGGG
KPGSGSGKPGGGKPGSGSGKPGGGKPGSGSGKPGGSGKPGSGKPGGGSG
GKPGKPGSGGSGKPGSGKPGSGGSGKPGKPGGSGSGKPGSGKPGGGSGK
PGSGKPGGGSGKPGSGKPGGGSGKPGSGKPGGGGKPGSGSGKPGGSGG
KPGKPGSGGSGGKPGKPGSGGSGKPGSGKPGGGSGGKPGKPGSGG
Y576 790 GEGSGEGSEGEGSEGSGEGEGSEGSGEGEGGSEGSEGEGSEGSGEGEGGE 15.7
GSGEGEGSGEGSEGEGGGEGSEGEGSGEGGEGEGSEGGSEGEGGSEGGE
GEGSEGSGEGEGSEGGSEGEGSEGGSEGEGSEGSGEGEGSEGSGEGEGSE
GSGEGEGSEGSGEGEGSEGGSEGEGGSEGSEGEGSGEGSEGEGGSEGSEG
EGGGEGSEGEGSGEGSEGEGGSEGSEGEGGSEGSEGEGGEGSGEGEGSE
GSGEGEGSGEGSEGEGSEGSGEGEGSEGSGEGEGGSEGSEGEGSGEGSEG
EGSEGSGEGEGSEGSGEGEGGSEGSEGEGGSEGSEGEGGSEGSEGEGGEG
SGEGEGSEGSGEGEGSGEGSEGEGSEGSGEGEGSEGSGEGEGGSEGSEGE
EGEGGEGSGEGEGSGEGSEGEGGGEGSEGEGSEGSGEGEGSEGS
GEGEGSEGGSEGEGGSEGSEGEGSEGGSEGEGSEGGSEGEGSEGSGEGEG
SEGSGEGEGSGEGSEGEGGSEGGEGEGSEGGSEGEGSEGGSEGEGGEGSG
EGEGGGEGSEGEGSEGSGEGEGSGEGSE
AE288 288 TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES 6.0
ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPG
TSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPAT
SGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSET
PGTSESATPESGPGTSTEPSEGSAP
AG288_1 288 PGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTP 6.9
GSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSG
ATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSS
PGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGAS
PGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSAST
GTGPGTPGSGTASSSPGSSTPSGATGS
AD576 791 GSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGSSESGSSEGGPGSSE 13.6
SGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESGSEGSSGP
GESSGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSS
GESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGSGG
EPSESGSSGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSGGEPSE
SGSSGSGGEPSESGSSGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSES
GESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGSSE
SGSSEGGPGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSE
SGSSGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSES
GSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGG
EPSESGSSGESPGGSSGSESGSEGSSGPGESSGSSESGSSEGGPGSEGSSGP
GESS
AE576 792 AGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTS 6.1
TEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATS
GSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
GSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTST
EPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEG
ESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPG
SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP
SEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTST
EEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS
EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGS
PTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSA
AF540 793 GSTSSTAESPGPGSTSSTAESPGPGSTSESPSGTAPGSTSSTAESPGPGSTSS 8.8
PGTSTPESGSASPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGT
APGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTS
PSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESG
SASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGPG
TSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGTSTPE
SGSASPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSSTAESPG
PGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTST
PESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESS
TAPGSTSSTAESPGPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASPGS
TSESPSGTAP
AF504 794 GASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSST 7.0
PSGATGSPGSNPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
TGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSP
GSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSNP
SASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSS
TGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSP
GASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASP
GTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGA
TGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSP
GSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSP
AE864 795 GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTST 6.1
EPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSG
PAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
SPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEP
SEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGS
APGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGS
SETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPS
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTE
EGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP
ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT
STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES
ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPG
TSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPAT
SGSETPGSPAGSPTSTEEGTSTEPSEGSAP
AF864 796 GSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSTP 7.5
ESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESST
APGSTSESPSGTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTS
PSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESG
SASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPG
TSTPESGSASPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSG
ESSTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGPGSTSSTAESPG
PGSTSSTAESPGPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTST
PESGPXXXGASASGAPSTXXXXSESPSGTAPGSTSESPSGTAPGSTSESPS
GTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASP
GTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTP
ESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGT
APGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGST
SSTAESPGPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAE
SPGPGTSPSGESSTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPG
TSPSGESSTAPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSSPSA
GSSTPSGATGSPGSSTPSGATGSP
AG864 864 GASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSPGSST 7.2
PSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGA
PGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSP
GSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSP
GPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSS
TGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSP
GASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASP
GTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGA
TGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSP
GSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPG
SGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSS
TGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSP
GTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSP
SASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSAST
GTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSP
GSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGSST
PSGATGSPGSSTPSGATGSPGASPGTSSTGSP
AG868 797 GGSPGASPGTSSTGSPGSSPSASTGTGPGSSPSASTGTGPGTPGSGTASSSP 7.5
GSSTPSGATGSPGSNPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSST
PSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTA
SSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP
GSNPSASTGTGPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASP
GTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSS
TGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSP
GASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSST
PSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGA
TGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSP
GTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGASP
GTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGA
TGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGP
GSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSP
SASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGA
TGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSP
GSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
AM875 798 GTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSTSSTAESPGPGTSTP 4.5
ESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSA
SPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTS
ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPA
GSPTSTEEGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTSTEPSE
GSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEE
GTSTEPSEGSAPGASASGAPSTGGTSESATPESGPGSPAGSPTSTEEGSPAG
EGSTSSTAESPGPGSTSESPSGTAPGTSPSGESSTAPGTPGSGTASS
SPGSSTPSGATGSPGSSPSASTGTGPGSEPATSGSETPGTSESATPESGPGS
EPATSGSETPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSEPATS
GSETPGSEPATSGSETPGTSTEPSEGSAPGSTSSTAESPGPGTSTPESGSASP
GSTSESPSGTAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSSTP
SGATGSPGSSPSASTGTGPGASPGTSSTGSPGSEPATSGSETPGTSESATPE
AGSPTSTEEGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPG
TSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
AE912 913 MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSP 4.5
AGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETP
PTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPA
GSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEG
SAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPG
TSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPAT
SGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSA
PSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTST
EPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSG
SETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEG
SPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA
TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESG
PGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSE
SATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPE
SGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPG
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESA
TPESGPGTSTEPSEGSAP
AM923 924 MAEPAGSPTSTEEGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGTS 4.5
TEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSTSSTAESPGPGTSTPESG
SASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPG
SEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESA
TPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSA
PSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTST
EPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTS
TEEGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTSTEPSEGSAPG
TSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEP
SEGSAPGASASGAPSTGGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTE
EGSTSSTAESPGPGSTSESPSGTAPGTSPSGESSTAPGTPGSGTASSSPGSST
PSGATGSPGSSPSASTGTGPGSEPATSGSETPGTSESATPESGPGSEPATSG
SETPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSEPATSGSETPG
SEPATSGSETPGTSTEPSEGSAPGSTSSTAESPGPGTSTPESGSASPGSTSES
PSGTAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSSTPSGATG
SPGSSPSASTGTGPGASPGTSSTGSPGSEPATSGSETPGTSESATPESGPGSP
AGSPTSTEEGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTSESAT
PESGPGTSTEPSEGSAPGTSTEPSEGSAP
AM1296 799 GTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSTSSTAESPGPGTSTP 4.5
ESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSA
SPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTS
ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
STEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPA
GSPTSTEEGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTSTEPSE
STEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEE
GTSTEPSEGSAPGPEPTGPAPSGGSEPATSGSETPGTSESATPESGPGSPAG
SPTSTEEGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGSTSSTAESPGPGSTSESPSGTAPGTS
PSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGTSTEPSE
GSAPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSPS
GESSTAPGTSPSGESSTAPGTSPSGESSTAPGTSTEPSEGSAPGSPAGSPTST
EEGTSTEPSEGSAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGS
STPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASASGAPSTGGTSPSGE
SSTAPGSTSSTAESPGPGTSPSGESSTAPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGSSPSASTGTGPGSSTPSGATGSPGASPGTSSTGSPGTST
SPGTSPSGESSTAPGTSPSGESSTAPGTSESATPESGPGSEPATSGS
ETPGTSTEPSEGSAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGS
PAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESA
GSEPATSGSETPGSSTPSGATGSPGASPGTSSTGSPGSSTPSGATG
SPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGSSTPSGATGSPGAS
PGTSSTGSPGTPGSGTASSSPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSE
GSAP
Example 31: Calculation of TEPITOPE scores
TEPITOPE scores of 9mer peptide sequences can be calculated by adding pocket potentials as
described by Sturniolo [Sturniolo, T., et al. (1999) Nat Biotechnol, 17: 555]. In the present Example,
separate TEPITOPE scores were calculated for individual HLA alleles. Table 27 shows as an example the
pocket potentials for HLA*0101B, which occurs in high frequency in the Caucasian population. To
calculate the TEPITOPE score of a peptide with sequence P1-P2-P3-P4-P5-P6-P7-P8-P9, the
corresponding individual pocket potentials in Table 27 were added. The HLA*0101B score of a 9mer
peptide with the sequence FDKLPRTSG is the sum of 0, -1.3, 0, 0.9, 0, -1.8, 0.09, 0, 0.
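The sum for FDKLPRTSG can be worked through with a minimal Python sketch. The nine values are the pocket potentials quoted in the text for positions P1 through P9; the function name `tepitope_score` is a hypothetical helper introduced for illustration.

```python
import math

# Pocket potentials quoted in the text for FDKLPRTSG, positions P1..P9.
FDKLPRTSG_POTENTIALS = [0, -1.3, 0, 0.9, 0, -1.8, 0.09, 0, 0]


def tepitope_score(pocket_potentials: list[float]) -> float:
    """Sum the nine per-position pocket potentials of a 9mer."""
    if len(pocket_potentials) != 9:
        raise ValueError("a 9mer needs exactly nine pocket potentials")
    return math.fsum(pocket_potentials)


score = tepitope_score(FDKLPRTSG_POTENTIALS)
print(round(score, 2))  # -2.11
```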
To evaluate the TEPITOPE scores for long peptides one can repeat the process for all 9mer
subsequences of the sequences. This process can be repeated for the proteins encoded by other HLA
alleles. Tables 28-31 give pocket potentials for the protein products of HLA alleles that occur with high
frequency in the Caucasian population.
TEPITOPE scores calculated by this method range from approximately -10 to +10. However,
9mer peptides that lack a hydrophobic amino acid (FKLMVWY) in P1 position have calculated
TEPITOPE scores in the range of -1009 to -989. This value is biologically meaningless and reflects the
fact that a hydrophobic amino acid serves as an anchor residue for HLA binding and peptides lacking a
hydrophobic residue in P1 are considered non binders to HLA. Because most XTEN sequences lack
hydrophobic residues, all combinations of 9mer subsequences will have TEPITOPE scores in the
range of -1009 to -989. This method confirms that XTEN polypeptides may have few or no predicted
T-cell epitopes.
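The scan over all 9mer subsequences can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the Example: the P1 anchor set, the -999 penalty for a non-anchor residue at P1 (inferred from the stated -1009 to -989 range), the toy pocket-potential values, and the function names are all hypothetical, and a real calculation would use a complete allele-specific pocket-potential table.

```python
# Assumed anchor residues at P1 and assumed non-binder penalty (see lead-in).
P1_ANCHORS = frozenset("FKLMVWY")
NON_BINDER_PENALTY = -999.0

# Toy pocket potentials: position -> residue -> potential. Hypothetical values;
# any residue or position not listed contributes 0.
TOY_POTENTIALS = {2: {"S": -1.3}, 4: {"E": 0.9}, 6: {"T": -1.8}}


def score_9mer(peptide: str, potentials: dict[int, dict[str, float]]) -> float:
    """Sum pocket potentials over P1..P9, penalizing a non-anchor P1 residue."""
    if len(peptide) != 9:
        raise ValueError("peptide must be exactly 9 residues long")
    total = 0.0
    for position, residue in enumerate(peptide, start=1):
        if position == 1 and residue not in P1_ANCHORS:
            total += NON_BINDER_PENALTY
        else:
            total += potentials.get(position, {}).get(residue, 0.0)
    return total


def score_all_9mers(sequence: str, potentials: dict[int, dict[str, float]]) -> list[float]:
    """Score every overlapping 9mer window of a longer sequence."""
    return [score_9mer(sequence[i:i + 9], potentials) for i in range(len(sequence) - 8)]


# Illustrative stretch built from two 12-residue AE-family motifs plus one residue.
xten_like = "GSPAGSPTSTEE" + "GTSESATPESGP" + "G"
scores = score_all_9mers(xten_like, TOY_POTENTIALS)
print(len(scores), min(scores), max(scores))
```

Because the illustrative stretch contains none of the assumed anchor residues, every window carries the penalty at P1 and all scores fall between -1009 and -989.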
Table 27: Pocket potential for HLA*0101B allele.
[Table body not legible in the source text.]
Table 28: Pocket potential for HLA*0301B allele.
[Table body not legible in the source text.]
Tables 29 and 30: [Captions and table bodies not legible in the source text.]
Table 31: Pocket potential for HLA*1501B allele.
[Table body not legible in the source text.]
Table 32: Exemplary Biological Activity, Exemplary Assays and Indications
Biologically Active Protein: Glucagon-Like Peptide 2 (GLP2; Gly2 GLP-2)
Biological Activity: Stimulates proliferation and inhibits apoptosis of intestinal epithelial
cells; reduces epithelial permeability; decreases gastric acid secretion and gastrointestinal
motility; promotes wound healing.
Exemplary Activity Assays: Intestinal epithelial cell proliferation can be measured using methods
known in the art, including the cell proliferation assays described in Dig. Dis. Sci. 47(5):
1135-1140 (2002). Protection of intestinal epithelium can be evaluated using methods known in the
art, including the in vitro intestinal injury model described in J. Surg. Res. 107(1): 44-9 (2002).
GLP-2 can be assayed by radioimmunoassay described in Regu. Physiol. 278(4): R1057-R1063.
Contractility of intestinal tissue by GLP-2 can be measured as described in US Pat No. 7,498,141;
Measurement of cAMP levels in isolated rat small intestinal mucosal cells expressing GLP-2
receptors or in COS cells transfected with the GLP-2 receptor, or AP-1 luciferase reporter gene
activity in BHK fibroblast cells endogenously expressing the GLP-2 receptor as described in a US
Pat App. (number illegible in the source); EC50 determinations by FLIPR assay measuring calcium
flux by fluorescence triggered by binding of GLP-2 to an engineered cell line with a stable GLP-2R
and G alpha q/11 expression.
Indication: Gastrointestinal conditions including, but not limited to: gastrointestinal epithelial
injury; recovery from bowel resection; enteritis; colitis; gastritis; chemotherapy-induced
mucositis; short bowel syndrome; intestinal atrophy; inflammatory bowel disease; Crohn's disease;
Ulcerative colitis; acid reflux; peptic ulcers; diabetes-associated bowel growth; intestinal
ischemia syndromes; maintenance of gut integrity after major burn trauma; regulation of intestinal
permeability and nutrient absorption. Hyperglycemia; Diabetes; Diabetes Insipidus; Diabetes
mellitus; Type 1 diabetes; Type 2 diabetes; insulin resistance; insulin deficiency;
Hyperlipidemia; Hyperketonemia; Non-insulin dependent Diabetes Mellitus (NIDDM);
Insulin-dependent Diabetes Mellitus (IDDM); Conditions associated with Diabetes including, but
not limited to Obesity, Heart Disease, Hyperglycemia, Infections, Retinopathy, And/Or Ulcers;
Metabolic Disorders; Immune Disorders; Obesity; Vascular Disorders; Suppression of Body Weight;
Suppression of Appetite; Syndrome X.
Table 33: Exemplary GLP2-XTEN comprising GLP-2 and terminal XTEN
GLP2-XTEN Name*    Amino Acid Sequence
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSEPATSGSETPGTSESATPESGPGSEPATS
AE144 GSETPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGSEPATSGSETPG
TSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSEPATSGSETPGTSESATPESGPGSEPATS
variant 2- GSETPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSETPGSEPATSGSETPG
AE144 TSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAP
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGTSESATPESGPGSEPATSGSETPGTSESAT
AE288 PESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPG
SEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATP
ESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGT
STEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGTSESATPESGPGSEPATSGSETPGTSESAT
variant 2- PESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPG
AE288 SEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATP
ESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGT
STEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGTSTPESGSASPGTSPSGESSTAPGTSPSGE
AF144 SSTAPGSTSSTAESPGPGSTSESPSGTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGS
TSSTAESPGPGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGTSTPESGSASPGTSPSGESSTAPGTSPSGE
variant 2- SSTAPGSTSSTAESPGPGSTSESPSGTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGS
AF144 TSSTAESPGPGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAP
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSSESGSSEGGPGSGGEPSESGSSGSSESGS
AD576 SEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESG
SEGSSGPGESSGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSSGESPGGS
SGSESGESPGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSG
SEGSSGPGESSGESPGGSSGSESGSGGEPSESGSSGSGGEPSESGSSGSGGEPSESGSSGSSESGS
SEGGPGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESG
SSESGSSEGGPGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGSSESGS
SEGGPGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSGGEPSESGSSG
SSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGESPGGSSGSESGSEGSSGPGESSGSSESGS
SEGGPGSEGSSGPGESS
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSSESGSSEGGPGSGGEPSESGSSGSSESGS
variant 2- SEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESG
AD576 SEGSSGPGESSGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSSGESPGGS
ESPGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSG
SEGSSGPGESSGESPGGSSGSESGSGGEPSESGSSGSGGEPSESGSSGSGGEPSESGSSGSSESGS
ESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESG
SSESGSSEGGPGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGSSESGS
SEGGPGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSGGEPSESGSSG
SSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGESPGGSSGSESGSEGSSGPGESSGSSESGS
SEGGPGSEGSSGPGESS
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
AE576 EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
variant 2- EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
AE576 SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAP
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSTSSTAESPGPGSTSSTAESPGPGSTSESP
AF576 SGTAPGSTSSTAESPGPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGTSPSGESSTAPGS
TSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESS
TAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSESPSGTAPGTST
PESGSASPGSTSSTAESPGPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTA
PGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSST
AESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPG
STSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGSTSSTAESPGPGTSPSGES
STAPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGSTSSTAESPGPGTSTPESGSASPGTS
TPESGSASP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSTSSTAESPGPGSTSSTAESPGPGSTSESP
variant 2- SGTAPGSTSSTAESPGPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGTSPSGESSTAPGS
AF576 TSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESS
TAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSESPSGTAPGTST
PESGSASPGSTSSTAESPGPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTA
PGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSST
AESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPG
STSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGSTSSTAESPGPGTSPSGES
STAPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGSTSSTAESPGPGTSTPESGSASPGTS
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGMAEPAGSPTSTEEGTPGSGTASSSPGSSTP
variant 2- SGATGSPGASPGTSSTGSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTE
AE624 EGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAG
SPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAP
GTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPS
EGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPG
SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE
GSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGS
EPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATP
ESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSSESGSSEGGPGSSESGSSEGGPGESPGG
AD836 GSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSSESGSSEGGP
GSSESGSSEGGPGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSSESG
SSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSS
GESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEP
SESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESGSGGEPSESGSS
GSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGSEGSSGPGESSGESPGG
SSGSESGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSSESGSSEGGPGSSESGSSEGGP
GESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSEGSSGPGSSESGSSEG
GPGSGGEPSESGSSGSEGSSGPGESSGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSG
GEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSG
SESGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGES
PGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSESGSGGEPSESGSSGESPGGSSG
SESGSGGEPSESGSS
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSSESGSSEGGPGSSESGSSEGGPGESPGG
variant 2- SSGSESGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSSESGSSEGGP
AD836 GSSESGSSEGGPGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPGSSESG
SSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSS
GESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEP
SESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESGSGGEPSESGSS
GSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGSEGSSGPGESSGESPGG
SSGSESGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSSESGSSEGGPGSSESGSSEGGP
GESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSEGSSGPGSSESGSSEG
EPSESGSSGSEGSSGPGESSGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSG
GEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSG
SESGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSSGSSESGSSEGGPGES
PGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSESGSGGEPSESGSSGESPGGSSG
GEPSESGSS
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
AE864 EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE
GPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
variant 2- EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
AE864 SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
SESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE
SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2 HADGSFSDEMNTILDNLATRDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
variant 1- EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
AE864 GSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
GSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE
SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2 HVDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
variant 3- EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
AE864 SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE
SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGSTSESPSGTAPGTSPSGESSTAPGSTSESP
AF864 SGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGT
SPSGESSTAPGSTSESPSGTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESS
PSGESSTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTS
ESPSGTAPGTSTPESGSASPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGTSPSGESSTA
PGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGPGSTSST
AESPGPGSTSSTAESPGPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGPXXX
GASASGAPSTXXXXSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESP
SGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGT
SPSGESSTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSG
TAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTS
ESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSAS
PGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGSTSSTAESPGPGTSPSG
ESSTAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSTSESPSGTAPGTSPSGESSTAPGSTSESP
variant 2- SGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGT
AF864 SPSGESSTAPGSTSESPSGTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESS
TAPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTS
ESPSGTAPGTSTPESGSASPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGTSPSGESSTA
PGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGPGSTSST
AESPGPGSTSSTAESPGPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGPXXX
APSTXXXXSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESP
SGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGT
SPSGESSTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSG
TAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTS
ESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSAS
GESSTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGSTSSTAESPGPGTSPSG
ESSTAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSP
GLP-2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGASPGTSSTGSPGSSPSASTGTGPGSSPSAS
AG864 TGTGPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGASPGTSSTGSPGTPGSGTASSSP
GSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPS
GATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGSSPSASTGTG
PGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSS
PGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTG
PGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPG
TSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGS
PGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGS
PGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGASPGTSSTGSPGSSPSASTGTGPGSSPSAS
variant 2- TGTGPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGASPGTSSTGSPGTPGSGTASSSP
AG864 GSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPS
GATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGSSPSASTGTG
PGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGS
GTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSS
PGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPG
TSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTG
PGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPG
TSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPG
PGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGS
PGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTPGS
GTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGTSTEPSEGSAPGSEPATSGSETPGSPAGSP
variant 2- TSTEEGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGT
AM875 STPESGSASPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPE
SGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTS
ESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS
APGSEPATSGSETPGSPAGSPTSTEEGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTST
EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS
APGASASGAPSTGGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSTSSTAESPGPGSTSE
SPSGTAPGTSPSGESSTAPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSEPATSGSET
PGTSESATPESGPGSEPATSGSETPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSEPAT
SGSETPGSEPATSGSETPGTSTEPSEGSAPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSSTPSGATGSPGSSPSASTGTGPGASPGTS
STGSPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSSTPSGATGSPGSSPSASTGTGPG
STGSPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
GLP-2 HADGSFSDEMNTVLDSLATRDFINWLLQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEP
bovine- SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP
AE864 GSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGS
PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP
GSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGS
PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAP
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS
EGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPG
TSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATP
ESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGT
SESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPE
SGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTS
SGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2 pig- HADGSFSDEMNTVLDNLATRDFINWLLHTKITDSLGGASPGTSSTGSPGSSPSASTGTGPGSSP
AG864 SASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGASPGTSSTGSPGTPGSGTAS
SSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSS
GSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGSSPSASTG
TGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTP
GSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTAS
SSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGAS
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTG
TGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGAS
GSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGAT
GSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGAS
PGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSST
GSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSASTGTGPGTP
GSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
GLP-2 rat- HADGSFSDEMNTILDNLATRDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
AE576 EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
SESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAP
GLP-2 HKDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSTSESPSGTAPGTSPSGESSTAPGSTSESP
variant 5- SGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGT
AF864 SPSGESSTAPGSTSESPSGTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESS
TAPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGSTS
ESPSGTAPGTSTPESGSASPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTAPGTSPSGESSTA
PGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSSTAESPGPGSTSSTAESPGPGSTSST
AESPGPGSTSSTAESPGPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGPXXX
GASASGAPSTXXXXSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESP
SGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGT
SPSGESSTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSESPSG
TAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGSTS
ESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSAS
PGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGSTSSTAESPGPGTSPSG
ESSTAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSP
GLP-2 HRDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSESATPESGPGTSTEPS
variant 6- EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG
AE864 SEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSP
TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
SEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSP
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG
TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
STEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE
SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGS
APGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSP
variant 2- TSTEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG
AE1236 TSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSPAGSP
TSTEEGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPG
SPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATP
ESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGS
EPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSTEPSE
GSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGT
STEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPE
SGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSP
AGSPTSTEEGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGTSTEPSEG
SAPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGSE
ETPGTSESATPESGPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGTS
TEPSEGSAPGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGSPAGSPTS
TEEGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGSE
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSPAGSPTSTEEGTSTEPSEGSAPGTSESAT
variant 2- PESGPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEG
AE1332 EGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSTEPSE
GSAPGSEPATSGSETPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGT
ESGPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEG
SAPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTS
TEPSEGSAPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGTSESATPES
GPGSEPATSGSETPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTST
EPSEGSAPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTST
EEGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTST
EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTST
EEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPA
GSPTSTEEGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGSEPATSGSETPGSEPATSGSE
TPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSPA
GSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGSEPATSGSETPGSPAGSPTSTEEGTSESATPES
ATSGSETPGSEPATSGSETPGTSESATPESGPGTSESATPESGPGTSTEPSEGSAPGTST
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGSETPGTSTEPSEGSAPGTSTEPSEGSAPGT
variant 2- SESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPE
AE612A TEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS
TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES
GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPA
GSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSE
SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSES
ATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSA
PGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAG
SPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAT
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGTSGSETPGSEPATSGSETPGSPAGSPTSTEE
variant 2- GTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPS
AE720A EGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPG
TSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATS
GSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPG
EGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATS
GSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPG
TSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATP
ESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGS
PAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPT
STEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGS
SETPGSPAGSPTSTEEGTSTE
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGSTGSPGTPGSGTASSSPGSSTPSGATGSPG
variant 2- ASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPSG
AG612A ATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPG
ASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGTS
STGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPG
TPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTS
STGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPG
ASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGT
ASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPG
TPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTS
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGTSSTGSPGSSPSASTGTGPGSSPSASTGTGP
variant 2- GTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPS
AG792A GATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSP
GASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGSSTPS
GATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSP
GASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGPGTPGSGTASSSPGASPGT
SSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
GTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGASPGT
SSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSP
GASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSG
TASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSP
GTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGPGASPGTSSTGSPGASPGT
SSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPG
* Sequence name reflects N- to C-terminus configuration of the GLP-2 and XTEN (by family name and length)
Table 34: Exemplary GLP2-XTEN comprising GLP-2, cleavage sequences and XTEN sequences
GLP2-XTEN Name*    Amino Acid Sequence
GLP2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGLTPRSLLVGGGGSSESGSSEGGPGSSESGS
Thrombin- SEGGPGESPGGSSGSESGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPG
AD836 SSESGSSEGGPGSSESGSSEGGPGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGSSESGS
SEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPG
SGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGSSESGS
SEGGPGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESG
SGGEPSESGSSGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGSEGSSG
PGESSGESPGGSSGSESGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSSESGSSEGGPG
SSESGSSEGGPGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSEGSSG
PGSSESGSSEGGPGSGGEPSESGSSGSEGSSGPGESSGSEGSSGPGESSGSEGSSGPGESSGSGGE
SGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGES
SGESPGGSSGSESGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSSGSSES
GSSEGGPGESPGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSESGSGGEPSESGS
SGESPGGSSGSESGSGGEPSESGSS
GLP2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGGKLTRVVGGGGSPAGSPTSTEEGTSESA
FXIa- TPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP
AE864 GSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPS
EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPG
TSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATP
ESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGT
STEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTS
TEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
ESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTS
TEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
SGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS
APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGGGLGPVSGVPGGSTSESPSGTAPGTSPSGE
Elastase- SSTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGS
AF864 GTAPGTSPSGESSTAPGSTSESPSGTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAES
PGPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTS
ESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTA
PGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSSTAESPGPGSTSST
AESPGPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPG
TSTPESGPXXXGASASGAPSTXXXXSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPS
GTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGST
SSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESST
APGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSS
TAESPGPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAP
GTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGSTSSTA
ESPGPGTSPSGESSTAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSP
GLP2- HADGSFSDEMNTILDNLAARDFINWLIQTKITDGAPLGLRLRGGGGASPGTSSTGSPGSSPSAS
MMP TGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGASPGTSSTGSP
AG864 GTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSG
TASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGP
GSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGT
SSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGP
TASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPS
GATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSP
GSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGT
SSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSP
GSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSA
GASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGP
GASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSA
STGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGLTPRSLLVGGGGSSESGSSEGGPGSSESGS
variant 2- SEGGPGESPGGSSGSESGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSSESGSSEGGPG
Thrombin- SEGGPGSSESGSSEGGPGESPGGSSGSESGESPGGSSGSESGESPGGSSGSESGSSESGS
AD836 SEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPG
SGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGSSESGS
SEGGPGSGGEPSESGSSGSEGSSGPGESSGSSESGSSEGGPGSGGEPSESGSSGESPGGSSGSESG
SGGEPSESGSSGSGGEPSESGSSGSSESGSSEGGPGSGGEPSESGSSGSGGEPSESGSSGSEGSSG
PGESSGESPGGSSGSESGSEGSSGPGESSGSEGSSGPGESSGSGGEPSESGSSGSSESGSSEGGPG
SSESGSSEGGPGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGESSGESPGGSSGSESGSEGSSG
PGSSESGSSEGGPGSGGEPSESGSSGSEGSSGPGESSGSEGSSGPGESSGSEGSSGPGESSGSGGE
PSESGSSGSGGEPSESGSSGESPGGSSGSESGESPGGSSGSESGSGGEPSESGSSGSEGSSGPGES
SGESPGGSSGSESGSSESGSSEGGPGSSESGSSEGGPGSSESGSSEGGPGSGGEPSESGSSGSSES
GSSEGGPGESPGGSSGSESGSGGEPSESGSSGSSESGSSEGGPGESPGGSSGSESGSGGEPSESGS
SGESPGGSSGSESGSGGEPSESGSS
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGGKLTRVVGGGGSPAGSPTSTEEGTSESA
variant 2- TPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP
FXIa- GSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPS
AE864 EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPG
TSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATP
ESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGT
STEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTS
TEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
SGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTS
TEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
SGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS
APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGGGLGPVSGVPGGSTSESPSGTAPGTSPSGE
variant 2- STSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGS
Elastase- TSESPSGTAPGTSPSGESSTAPGSTSESPSGTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAES
AF864 PGPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGTSTPESGSASPGTSTPESGSASPGSTS
ESPSGTAPGSTSESPSGTAPGTSTPESGSASPGSTSSTAESPGPGTSTPESGSASPGSTSESPSGTA
PGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSSTAESPGPGSTSST
AESPGPGSTSSTAESPGPGSTSSTAESPGPGTSPSGESSTAPGSTSESPSGTAPGSTSESPSGTAPG
TSTPESGPXXXGASASGAPSTXXXXSESPSGTAPGSTSESPSGTAPGSTSESPSGTAPGSTSESPS
GTAPGSTSESPSGTAPGSTSESPSGTAPGTSTPESGSASPGTSPSGESSTAPGTSPSGESSTAPGST
SSTAESPGPGTSPSGESSTAPGTSTPESGSASPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESST
APGSTSESPSGTAPGTSTPESGSASPGTSTPESGSASPGSTSESPSGTAPGTSTPESGSASPGSTSS
TAESPGPGSTSESPSGTAPGSTSESPSGTAPGTSPSGESSTAPGSTSSTAESPGPGTSPSGESSTAP
SGSASPGTSPSGESSTAPGTSPSGESSTAPGTSPSGESSTAPGSTSSTAESPGPGSTSSTA
ESPGPGTSPSGESSTAPGSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSP
GLP-2 HGDGSFSDEMNTILDNLAARDFINWLIQTKITDGAPLGLRLRGGGGASPGTSSTGSPGSSPSAS
variant 2- TGTGPGSSPSASTGTGPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGPGASPGTSSTGSP
MMP GTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGTPGSG
AG864 TASSSPGSSTPSGATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSNPSASTGTGP
GSSPSASTGTGPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSPGASPGTSSTGSPGASPGT
SSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSPSASTGTGP
GTPGSGTASSSPGASPGTSSTGSPGASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSTPS
GATGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSTPSGATGSP
GSSPSASTGTGPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGASPGTSSTGSPGASPGT
SSTGSPGASPGTSSTGSPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGTPGSGTASSSP
GSSTPSGATGSPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGSSPSASTGTGPGSSPSA
STGTGPGASPGTSSTGSPGTPGSGTASSSPGSSTPSGATGSPGSSPSASTGTGPGSSPSASTGTGP
GASPGTSSTGSPGASPGTSSTGSPGSSTPSGATGSPGSSPSASTGTGPGASPGTSSTGSPGSSPSA
STGTGPGTPGSGTASSSPGSSTPSGATGSPGSSTPSGATGSPGASPGTSSTGSP
AE912- MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSPAGSPTSTEEGTSES
Thrombin- ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG
GLP2 PGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEP
SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESA
TPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGS
PTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETP
GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGS
PTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPAT
SGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP
GTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPS
EGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGLTPRSLLVGGG
HADGSFSDEMNTILDNLAARDFINWLIQTKITD
AE912- MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSPAGSPTSTEEGTSES
FXIa-GLP- ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG
2 variant 2 TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEP
SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESA
TPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGS
PTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETP
GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGS
PTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPAT
SGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP
GTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPS
EGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGGGKLTRVVGGG
HGDGSFSDEMNTILDNLAARDFINWLIQTKITD
AE912- MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSPAGSPTSTEEGTSES
Elastase- ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG
GLP-2 PGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEP
variant 2 SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAP
GTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESA
TPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGS
PTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETP
GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGS
PTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPAT
GTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP
GTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPS
EGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGGGLGPVSGVPG
HGDGSFSDEMNTILDNLAARDFINWLIQTKITD
GLP-2 SDEMNTILDNLAARDFINWLIQTKITDGAPLGLRLRGGGGSPAGSPTSTEEGTSESAT
variant 2- PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG
MMP-17- SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSE
AE864 GSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGT
SESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPE
SGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS
TEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTS
TEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS
SGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTS
TEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSE
PATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS
ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS
ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS
APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG
* Sequence name reflects N- to C-terminus configuration of the GLP-2, XTEN (by family name and length) and cleavage sequence denoted by protease name active on the sequence.
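As an illustration only, the simplest compositional criteria recited for the XTEN in claim 1 (a minimum length of 36 residues, more than 80% of residues drawn from G, A, S, T, E and P, and no three contiguous identical residues other than serine) can be screened with a short script. The sketch below is not part of the specification: the function names are hypothetical, the example string is one 36-residue AE-family stretch copied from the tables above, and the secondary-structure and epitope criteria (GOR, Chou-Fasman, TEPITOPE) and the subsequence score are not attempted.

```python
from collections import Counter

# Residues recited in claim 1(b): glycine, alanine, serine, threonine,
# glutamate and proline.
XTEN_RESIDUES = set("GASTEP")


def gastep_fraction(seq: str) -> float:
    """Return the fraction of residues in seq that are G, A, S, T, E or P."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return sum(counts[r] for r in XTEN_RESIDUES) / len(seq)


def has_forbidden_triplet(seq: str) -> bool:
    """Return True if seq has three contiguous identical residues that are not serine."""
    return any(
        seq[i] == seq[i + 1] == seq[i + 2] and seq[i] != "S"
        for i in range(len(seq) - 2)
    )


def meets_basic_criteria(seq: str, min_length: int = 36, min_fraction: float = 0.80) -> bool:
    """Screen seq against the length, composition and triplet criteria only."""
    return (
        len(seq) >= min_length
        and gastep_fraction(seq) > min_fraction
        and not has_forbidden_triplet(seq)
    )


# Three consecutive 12-residue AE-family motifs taken from the tables above.
example = "GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP"
print(len(example), gastep_fraction(example), meets_basic_criteria(example))
```

A screen of this kind only narrows candidates; a sequence that passes would still need to be evaluated against the remaining criteria of the claim.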
Claims (84)
1. A composition for use in achieving an intestinotrophic effect in a subject comprising a recombinant fusion protein comprising (i) a glucagon-like protein-2 (GLP-2) sequence selected from the group consisting of the sequences of SEQ ID NOS: 1 and 3-23, and (ii) an extended recombinant polypeptide (XTEN), wherein the XTEN is a sequence exhibiting at least 90% sequence identity to a sequence selected from the group consisting of the sequences in Table 4, and wherein the XTEN is further characterized in that: (a) the XTEN comprises at least 36 amino acid residues; (b) the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) residues constitutes more than 80% of the total amino acid residues of the XTEN; (c) the XTEN is substantially non-repetitive such that (i) the XTEN contains no three contiguous amino acids that are identical unless the amino acids are serine; (ii) at least 80% of the XTEN sequence consists of non-overlapping sequence motifs, each of the sequence motifs comprising 9 to 14 amino acid residues consisting of four to six amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), wherein any two contiguous amino acid residues do not occur more than twice in each of the non-overlapping sequence motifs; or (iii) the XTEN sequence has a subsequence score of less than 10; (d) the XTEN has greater than 90% random coil formation as determined by GOR algorithm; (e) the XTEN has less than 2% alpha helices and 2% beta-sheets as determined by Chou-Fasman algorithm; and (f) the XTEN lacks a predicted T-cell epitope when analyzed by TEPITOPE algorithm, wherein the TEPITOPE threshold score for said prediction by said algorithm has a threshold of -9, wherein said fusion protein exhibits an apparent molecular weight factor of at least 4 and is capable of achieving an intestinotrophic effect in a subject using a dosage of 2.5 nmol/kg to 6250 nmol/kg, or 25 nmol/kg to 3750 nmol/kg, or 75 nmol/kg/dose to 1250 nmol/kg/dose, or 125 nmol/kg/dose to 750 nmol/kg/dose.
2. The composition of claim 1, wherein the intestinotrophic effect is at least 30%, or at least 40%, or at least 50%, or at least 60%, or at least 70%, or at least 80%, or at least 90%, or at least 100% or at least 120% or at least 150% or at least 200% of the intestinotrophic effect compared to the corresponding GLP-2 not linked to XTEN upon administration of said corresponding GLP-2 to a subject using comparable dose.
3. The composition of claim 2, wherein the subject is selected from the group consisting of mouse, rat, monkey, pig, bovine, sheep, and human.
4. The composition of claim 2, wherein the fusion protein is formulated for delivery by subcutaneous, intramuscular, or intravenous doses.
5. The composition of claim 2, wherein the intestinotrophic effect is determined after administration of 1 dose, or 3 doses, or 6 doses, or 10 doses, or 12 or more doses of the composition.
6. The composition of claim 5, wherein the intestinotrophic effect is selected from the group consisting of intestinal growth, increased hyperplasia of the villus epithelium, increased crypt cell proliferation, increased height of the crypt and villus axis, increased healing after intestinal anastomosis, increased small bowel weight, increased small bowel length, decreased small bowel epithelium apoptosis, and enhancement of intestinal function.
7. The composition of claim 6, wherein the intestinotrophic effect is an increase in small intestine weight of at least 10%, or at least 20%, or at least 30%.
8. The composition of claim 6, wherein the intestinotrophic effect is an increase in small intestine length of at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 20%, or at least 30%.
9. The composition of any one of claims 1-8, wherein the GLP-2 is selected from the group consisting of bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2.
10. The composition of any one of claims 1-8, wherein the GLP-2 has an amino acid substitution in place of Ala2, and wherein the substitution is glycine.
11. The composition of any one of claims 1-8, wherein the GLP-2 has the sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
12. The composition of any one of claims 1-8, wherein the XTEN is linked to the C-terminus of the GLP-2.
13. The composition of claim 12, further comprising a spacer sequence of 1 to 50 amino acid residues linking the GLP-2 and XTEN.
14. The composition of claim 13, wherein the spacer sequence comprises a glycine residue.
15. The composition of any one of claims 1-8, wherein the XTEN has at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity when compared to a sequence of comparable length selected from any one of Table 4, Table 8, Table 9, Table 10, Table 11, and Table 12, when optimally aligned.
16. The composition of claim 15, wherein the XTEN has at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity when compared to an AE864 sequence from Table 4, when optimally aligned.
17. The composition of any one of claims 1-8, wherein the fusion protein sequence has a sequence with at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity to SEQ ID NO: 741, 743, 745, 747, 749, 751-752, 754, 756-758, 760, 762-774, and 798.
18. The composition of any one of claims 1-8, wherein the fusion protein exhibits a terminal half-life that is at least 30 hours when administered to the subject.
19. The composition of any one of claims 1-8, wherein the fusion protein binds to a GLP-2 receptor with an EC50 of less than 30 nM, or 100 nM, or 200 nM, or 300 nM, or 370 nM, or 400 nM, or 500 nM, or 600 nM, or 700 nM, or 800 nM, or 1000 nM, or 1200 nM, or 1400 nM when assayed using an in vitro GLP2R cell assay wherein the GLP2R cell is a human recombinant GLP-2 glucagon family receptor calcium-optimized cell.
20. The composition of any one of claims 1-8, wherein the fusion protein retains at least 1%, or 2%, or 3%, or 4%, or 5%, or 10%, or 20%, or at least 30% of the potency of the corresponding GLP-2 not linked to XTEN when assayed using an in vitro GLP2R cell assay wherein the GLP2R cell is a human recombinant GLP-2 glucagon family receptor calcium-optimized cell.
21. The composition of any one of claims 1-8 or 17, characterized in that (a) an equivalent amount, in nmoles/kg, of the fusion protein compared to the corresponding GLP-2 that lacks the XTEN has a terminal half-life that is at least 3-fold, or at least 4-fold, or at least 5-fold, or at least 10-fold, or at least 15-fold, or at least 20-fold longer compared to the corresponding GLP-2 that lacks the XTEN; or (b) an equivalent amount, in nmoles/kg, of the fusion protein compared to the corresponding GLP-2 that lacks the XTEN achieves a greater intestinotrophic effect in a subject compared to the corresponding GLP-2 that lacks the XTEN.
22. The composition of claim 21, wherein the greater intestinotrophic effect is selected from the group consisting of body weight gain, small intestine length, reduction in TNFα content of the small intestine tissue, reduced mucosal atrophy, reduced incidence of perforated ulcers, and height of villi.
23. The composition of claim 22, wherein the greater intestinotrophic effect is an increase in small intestine weight of at least 10%, or at least 20%, or at least 30%, or at least 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
24. The composition of claim 22, wherein the greater intestinotrophic effect is an increase in small intestine length of at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 20%, or at least 30%, or at least 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
25. The composition of claim 22, wherein the greater intestinotrophic is an increase in body weight is at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 20%, or at least 30%, or at least 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
26. The composition of claim 22, wherein the greater intestinotrophic effect is a reduction in TNFα content is at least 0.5 ng/g, or at least 0.6 ng/g, or at least 0.7 ng/g, or at least 0.8 ng/g, or at least 0.9 ng/g, or at least 1.0 ng/g, or at least 1.1 ng/g, or at least 1.2 ng/g, or at least 1.3 ng/g, or at least 1.4 ng/g of small intestine tissue or greater compared to that of the corresponding GLP-2 not linked to XTEN.
27. The composition of claim 22, wherein the greater intestinotrophic effect is villi height is at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 11%, or at least 12% greater compared to that of the corresponding GLP-2 not linked to XTEN.
28. A method of producing a fusion protein comprising GLP-2 fused to one or more extended recombinant polypeptides (XTEN), comprising: (a) providing a prokaryotic host cell comprising a recombinant nucleic acid encoding the fusion protein of any one of claims 1-8 or 17; (b) culturing the host cell under conditions permitting the expression of the fusion protein; and (c) recovering the fusion protein.
29. The method of claim 28, wherein the fusion protein is recovered from the host cell cytoplasm in substantially soluble form.
30. The method of claim 28, wherein the recombinant nucleic acid molecule has a sequence with at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity to a sequence selected from the group consisting of the DNA sequences set forth in Table 13, when optimally aligned, or the complement thereof.
31. An isolated nucleic acid comprising: (a) a nucleic acid sequence that has at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity to a DNA sequence selected from Table 13, or the complement thereof; or (b) a nucleotide sequence encoding the fusion protein of any one of claims 1 to 8 or 17, or the complement thereof.
32. An expression vector or isolated host cell comprising the nucleic acid of claim 31.
33. A prokaryotic host cell comprising the expression vector of claim 32.
34. A pharmaceutical composition comprising the fusion protein of any one of claims 1-8 or 17, and a pharmaceutically acceptable carrier.
35. The composition of any one of claims 1-8 or 17 configured according to formula (GLP-2)-(S)x-(XTEN) (V) wherein S is a spacer sequence having between 1 to about 50 amino acid residues that can optionally include a cleavage sequence from Table 6 or amino acids compatible with restrictions sites; x is either 0 or 1.
36. The composition of claim 35, wherein the GLP-2 is selected from the group consisting of bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2.
37. The composition of claim 35, wherein the GLP-2 has an amino acid substitution in place of Ala2, and wherein the substitution is glycine.
38. The composition of claim 35, wherein the GLP-2 has the sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
39. The composition of claim 35, comprising a spacer sequence wherein the spacer sequence comprises a glycine residue.
40. The composition of claim 35, wherein the XTEN has at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity when compared to a AE864 sequence from Table 4, when optimally aligned.
41. The composition of claim 17 for use in the manufacture of a medicament for the treatment of a gastrointestinal condition.
42. The composition of claim 41 wherein the gastrointestinal condition is selected from the group consisting of gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome, short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue, hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced enteritis, irritable bowel syndrome, small intestine damage, small intestinal damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency, acid-induced intestinal injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness, febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia, gastrointestinal barrier disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal feeding intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis, radiation-induced injury to the intestines, mucositis, pouchitis, and gastrointestinal ischemia.
43. A kit, comprising packaging material and at least a first container comprising the composition of claim 17, an amount of a pharmaceutically acceptable carrier, and a sheet of instructions for the reconstitution and/or administration of the composition to a subject.
44. Use of a recombinant fusion protein in the manufacture of a medicament for achieving an intestinotrophic effect in a subject, wherein the recombinant fusion protein comprises (i) a glucagon-like protein-2 (GLP-2) sequence selected from the group consisting of the sequences of SEQ ID NOS: 1 and 3-23, and (ii) an extended recombinant polypeptide (XTEN), wherein the XTEN is a sequence exhibiting at least 90% sequence identity to a sequence selected from the group consisting of the sequences in Table 4, and wherein the XTEN is further characterized in that: (a) the XTEN comprises at least 36 amino acid residues; (b) the sum of glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P) residues constitutes more than 80% of the total amino acid residues of the XTEN; (c) the XTEN is substantially non-repetitive such that (i) the XTEN contains no three contiguous amino acids that are identical unless the amino acids are serine; (ii) at least 80% of the XTEN sequence consists of non-overlapping sequence motifs, each of the sequence motifs comprising 9 to 14 amino acid residues consisting of four to six amino acids selected from glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) and proline (P), wherein any two contiguous amino acid residues do not occur more than twice in each of the non-overlapping sequence motifs; or (iii) the XTEN sequence has a subsequence score of less than 10; (d) the XTEN has greater than 90% random coil formation as determined by GOR algorithm; (e) the XTEN has less than 2% alpha helices and 2% beta-sheets as determined by Chou-Fasman algorithm; and (f) the XTEN lacks a predicted T-cell epitope when analyzed by TEPITOPE algorithm, wherein the TEPITOPE threshold score for said prediction by said algorithm has a threshold of –9, wherein said fusion protein exhibits an apparent molecular weight factor of at least 4 and is prepared for administration to a subject at a dosage of 2.5 nmol/kg to 6250 nmol/kg, or 25 nmol/kg to 3750 nmol/kg, or 75 nmol/kg/dose to 1250 nmol/kg/dose, or 125 nmol/kg/dose to 750 nmol/kg/dose.
45. The use of claim 1, wherein the intestinotrophic effect is at least 30%, or at least 40%, or at least 50%, or at least 60%, or at least 70%, or at least 80%, or at least 90%, or at least 100% or at least 120% or at least 150% or at least 200% of the intestinotrophic effect compared to a corresponding GLP-2 not linked to XTEN upon administration of said corresponding GLP-2 to a subject using comparable dose.
46. The use of claim 45, wherein the subject is selected from the group consisting of mouse, rat, monkey, pig, bovine, sheep, and human.
47. The use of claim 45, wherein the fusion protein is formulated for delivery by subcutaneous, intramuscular, or intravenous doses.
48. The use of claim 45, wherein the fusion protein is formulated for administration of 1 dose, or 3 doses, or 6 doses, or 10 doses, or 12 or more doses.
49. The use of claim 48, wherein the intestinotrophic effect is selected from the group consisting of intestinal growth, increased hyperplasia of the villus epithelium, increased crypt cell proliferation, increased height of the crypt and villus axis, increased healing after intestinal anastomosis, increased small bowel weight, increased small bowel length, decreased small bowel epithelium apoptosis, and enhancement of intestinal function.
50. The use of claim 49, wherein the intestinotrophic effect is an increase in small intestine weight of at least 10%, or at least 20%, or at least 30%.
51. The use of claim 49, wherein the intestinotrophic effect is an increase in small intestine length of at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 20%, or at least 30%.
52. The use of any one of claims 44-51, wherein the GLP-2 is selected from the group consisting of bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2.
53. The use of any one of claims 44-51, wherein the GLP-2 has an amino acid substitution in place of Ala2, and wherein the substitution is glycine.
54. The use of any one of claims 44-51, wherein the GLP-2 has the sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
55. The use of any one of claims 44-51, wherein the XTEN is linked to the C-terminus of the GLP-2.
56. The use of claim 55, wherein the fusion protein further comprises a spacer sequence of 1 to 50 amino acid residues linking the GLP-2 and XTEN.
57. The use of claim 56, wherein the spacer sequence comprises a glycine residue.
58. The use of any one of claims 44-51, wherein the XTEN has at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity when compared to a sequence of comparable length selected from any one of Table 4, Table 8, Table 9, Table 10, Table 11, and Table 12, when optimally aligned.
59. The use of claim 58, wherein the XTEN has at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity when compared to an AE864 sequence from Table 4, when optimally aligned.
60. The use of any one of claims 44-51, wherein the fusion protein sequence has a sequence with at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity to SEQ ID NO: 741, 743, 745, 747, 749, 751-752, 754, 756-758, 760, 762-774, and 798.
61. The use of any one of claims 44-51, wherein the fusion protein exhibits a terminal half-life that is at least 30 hours when administered to a non-human subject.
62. The use of any one of claims 44-51, wherein the fusion protein binds to a GLP-2 receptor with an EC50 of less than 30 nM, or 100 nM, or 200 nM, or 300 nM, or 370 nM, or 400 nM, or 500 nM, or 600 nM, or 700 nM, or 800 nM, or 1000 nM, or 1200 nM, or 1400 nM when assayed using an in vitro GLP2R cell assay wherein the GLP2R cell is a human recombinant GLP-2 glucagon family receptor calcium-optimized cell.
63. The use of any one of claims 44-51, wherein the fusion protein retains at least 1%, or 2%, or 3%, or 4%, or 5%, or 10%, or 20%, or at least 30% of the potency of a corresponding GLP-2 not linked to XTEN when assayed using an in vitro GLP2R cell assay wherein the GLP2R cell is a human recombinant GLP-2 glucagon family receptor calcium-optimized cell.
64. The use of any one of claims 44-51 or 60, characterized in that (a) an equivalent amount, in nmoles/kg, of the fusion protein compared to a corresponding GLP-2 that lacks the XTEN has a terminal half-life that is at least 3-fold, or at least 4-fold, or at least 5-fold, or at least 10-fold, or at least 15-fold, or at least 20-fold longer compared to the corresponding GLP-2 that lacks the XTEN; or (b) an equivalent amount, in nmoles/kg, of the fusion protein compared to the corresponding GLP-2 that lacks the XTEN achieves a greater intestinotrophic effect in a subject compared to the corresponding GLP-2 that lacks the XTEN.
65. The use of claim 64, wherein the greater intestinotrophic effect is selected from the group consisting of body weight gain, increased small intestine length, reduction in TNFα content of the small intestine tissue, reduced mucosal atrophy, reduced incidence of perforated ulcers, and increased height of villi.
66. The use of claim 65, wherein the greater intestinotrophic effect is an increase in small intestine weight of at least 10%, or at least 20%, or at least 30%, or at least 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
67. The use of claim 65, wherein the greater intestinotrophic effect is an increase in small intestine length of at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 20%, or at least 30%, or at least 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
68. The use of claim 65, wherein the greater intestinotrophic is an increase in body weight is at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 20%, or at least 30%, or at least 40% greater compared to that of the corresponding GLP-2 not linked to XTEN.
69. The use of claim 65, wherein the greater intestinotrophic effect is a reduction in TNFα content is at least 0.5 ng/g, or at least 0.6 ng/g, or at least 0.7 ng/g, or at least 0.8 ng/g, or at least 0.9 ng/g, or at least 1.0 ng/g, or at least 1.1 ng/g, or at least 1.2 ng/g, or at least 1.3 ng/g, or at least 1.4 ng/g of small intestine tissue or greater compared to that of the corresponding GLP-2 not linked to XTEN.
70. The use of claim 65, wherein the greater intestinotrophic effect is increased villi height of at least 5%, or at least 6%, or at least 7%, or at least 8%, or at least 9%, or at least 10%, or at least 11%, or at least 12% compared to that of the corresponding GLP-2 not linked to XTEN.
71. The use of any one of claims 44-51 or 60, wherein producing the fusion protein comprising GLP-2 fused to one or more extended recombinant polypeptides (XTEN), comprises: (a) providing a prokaryotic host cell comprising a recombinant nucleic acid encoding the fusion protein; (b) culturing the host cell under conditions permitting the expression of the fusion protein; and (c) recovering the fusion protein.
72. The use of claim 71, wherein the fusion protein is recovered from the host cell cytoplasm in substantially soluble form.
73. The use of claim 71, wherein the recombinant nucleic acid molecule has a sequence with at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity to a sequence selected from the group consisting of the DNA sequences set forth in Table 13, when optimally aligned, or the complement thereof.
74. The use of any one of claims 44-51 or 60, wherein the medicament comprising the fusion protein further comprises a pharmaceutically acceptable carrier.
75. The use of any one of claims 44-51 or 60, wherein the fusion protein is configured according to formula V: (GLP-2)-(S)x-(XTEN) (V) wherein S is a spacer sequence having between 1 to about 50 amino acid residues that can optionally include a cleavage sequence from Table 6 or amino acids compatible with restrictions sites; x is either 0 or 1.
76. The use of claim 75, wherein the GLP-2 is selected from the group consisting of bovine GLP-2, pig GLP-2, sheep GLP-2, chicken GLP-2, and canine GLP-2.
77. The use of claim 75, wherein the GLP-2 has an amino acid substitution in place of Ala2, and wherein the substitution is glycine.
78. The use of claim 75, wherein the GLP-2 has the sequence HGDGSFSDEMNTILDNLAARDFINWLIQTKITD.
79. The use of claim 75, wherein the fusion protein comprises a spacer sequence and wherein the spacer sequence comprises a glycine residue.
80. The use of claim 75, wherein the XTEN has at least 90%, or at least 91%, or at least 92%, or at least 93%, or at least 94%, or at least 95%, or at least 96%, or at least 97%, or at least 98%, or at least 99%, or 100% sequence identity when compared to a AE864 sequence from Table 4, when optimally aligned.
81. The use of claim 60, wherein the intestinotrophic effect comprises treatment of a gastrointestinal condition.
82. The use of claim 81 wherein the gastrointestinal condition is selected from the group consisting of gastritis, digestion disorders, malabsorption syndrome, short-gut syndrome, short bowel syndrome, cul-de-sac syndrome, inflammatory bowel disease, celiac disease, tropical sprue, hypogammaglobulinemic sprue, Crohn's disease, ulcerative colitis, enteritis, chemotherapy-induced enteritis, irritable bowel syndrome, small intestine damage, small intestinal damage due to cancer-chemotherapy, gastrointestinal injury, diarrheal diseases, intestinal insufficiency, acid-induced intestinal injury, arginine deficiency, idiopathic hypospermia, obesity, catabolic illness, febrile neutropenia, diabetes, obesity, steatorrhea, autoimmune diseases, food allergies, hypoglycemia, intestinal barrier disorders, sepsis, bacterial peritonitis, burn-induced intestinal damage, decreased gastrointestinal motility, intestinal failure, chemotherapy-associated bacteremia, bowel trauma, bowel ischemia, mesenteric ischemia, malnutrition, necrotizing enterocolitis, necrotizing pancreatitis, neonatal feeding intolerance, NSAID-induced gastrointestinal damage, nutritional insufficiency, total parenteral nutrition damage to gastrointestinal tract, neonatal nutritional insufficiency, radiation-induced enteritis, radiation-induced injury to the intestines, mucositis, pouchitis, and gastrointestinal ischemia.
83. The use of claim 60, wherein the medicament is provided in a kit, comprising packaging material and at least a first container comprising the fusion protein, an amount of a pharmaceutically acceptable carrier, and a sheet of instructions for the reconstitution and/or administration of the composition to a subject.
84. The use of claim 44, wherein the fusion protein is prepared for administration to a subject at a dosage of 2.5 nmol/kg to 125 nmol/kg/dose.
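Three of the XTEN criteria recited in claims 1 and 44 are simple enough to check directly on a candidate sequence: the minimum length of 36 residues (a), the G/A/S/T/E/P content above 80% (b), and the absence of three contiguous identical residues other than serine (c)(i). The following is a minimal sketch of such a check, not a procedure taken from the specification. The function names and the 36-residue test sequence are hypothetical and are not drawn from Table 4; criteria (c)(ii), (c)(iii), (d), (e) and (f) depend on motif analysis, subsequence scoring, and the GOR, Chou-Fasman and TEPITOPE algorithms and are not covered here.

```python
"""Illustrative check of criteria (a), (b) and (c)(i) of claim 1 on a candidate sequence."""

ALLOWED_RESIDUES = frozenset("GASTEP")  # glycine, alanine, serine, threonine, glutamate, proline
MIN_LENGTH = 36                         # criterion (a): at least 36 residues
MIN_GASTEP_FRACTION = 0.80              # criterion (b): more than 80% G/A/S/T/E/P


def gastep_fraction(seq: str) -> float:
    """Return the fraction of residues in seq that are G, A, S, T, E or P."""
    if not seq:
        return 0.0
    return sum(res in ALLOWED_RESIDUES for res in seq) / len(seq)


def has_forbidden_triplet(seq: str) -> bool:
    """Return True if three contiguous identical residues occur, unless they are serine."""
    return any(
        seq[i] == seq[i + 1] == seq[i + 2] and seq[i] != "S"
        for i in range(len(seq) - 2)
    )


def meets_basic_criteria(seq: str) -> bool:
    """Apply criteria (a), (b) and (c)(i) only; the remaining criteria are out of scope."""
    seq = seq.upper()
    return (
        len(seq) >= MIN_LENGTH
        and gastep_fraction(seq) > MIN_GASTEP_FRACTION
        and not has_forbidden_triplet(seq)
    )


# Hypothetical 36-residue test sequence (an assumption for illustration, not from Table 4).
HYPOTHETICAL_XTEN = "GSPAGSPTSTEE" "GTSESATPESGP" "GTSTEPSEGSAP"

# GLP-2 sequence as recited in claims 11, 38, 54 and 78.
GLP2_CLAIMED = "HGDGSFSDEMNTILDNLAARDFINWLIQTKITD"

if __name__ == "__main__":
    print("passes (a), (b), (c)(i):", meets_basic_criteria(HYPOTHETICAL_XTEN))
    # Position 2 (index 1) of the claimed GLP-2 sequence is glycine,
    # consistent with the Ala2 substitution recited in claims 10, 37, 53 and 77.
    print("residue 2 is glycine:", GLP2_CLAIMED[1] == "G")
```

A sequence that passes this check satisfies only the three criteria named above; it says nothing about sequence identity to Table 4 or about the predicted secondary structure and epitope criteria.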
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161573748P | 2011-09-12 | 2011-09-12 | |
US61/573,748 | 2011-09-12 | ||
PCT/US2012/054941 WO2013040093A2 (en) | 2011-09-12 | 2012-09-12 | Glucagon-like peptide-2 compositions and methods of making and using same |
Publications (2)
Publication Number | Publication Date |
---|---|
NZ622174A (en) | 2015-12-24 |
NZ622174B2 (en) | 2016-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20240067695A1 (en) | Glucagon-like peptide-2 compositions and methods of making and using same | |
US10000543B2 (en) | Glucose-regulating polypeptides and methods of making and using same | |
DK2440241T3 (en) | GROWTH HORMON POLYPEPTIDES AND PROCEDURES FOR PREPARING AND USING THEREOF | |
US8557961B2 (en) | Alpha 1-antitrypsin compositions and methods of making and using same | |
US8703717B2 (en) | Growth hormone polypeptides and methods of making and using same | |
TWI489992B (en) | Amide based glucagon superfamily peptide prodrugs | |
US9849188B2 (en) | Growth hormone polypeptides and methods of making and using same | |
EP3530671A2 (en) | Gip peptide analogues | |
NZ622174B2 (en) | Glucagon-like peptide-2 compositions and methods of making and using same | |
WO2009053725A2 (en) | Peptides and uses thereof | |
KR20120036947A (en) | Growth hormone polypeptides and methods of making and using same |