CA2256486A1 - Novel human chromosome 16 genes, compositions, methods of making and using same - Google Patents
- Publication number
- CA2256486A1
- Authority
- CA
- Canada
- Prior art keywords
- leu
- ala
- gly
- ser
- val
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 189
- 238000000034 method Methods 0.000 title claims abstract description 68
- 239000000203 mixture Substances 0.000 title claims abstract description 23
- 210000003917 human chromosome Anatomy 0.000 title description 6
- 241000282414 Homo sapiens Species 0.000 claims abstract description 163
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 127
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 121
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 121
- 210000004027 cell Anatomy 0.000 claims abstract description 109
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 97
- 108010063605 Netrins Proteins 0.000 claims abstract description 92
- 101000959079 Homo sapiens FAD-linked sulfhydryl oxidase ALR Proteins 0.000 claims abstract description 80
- 102000010803 Netrins Human genes 0.000 claims abstract description 75
- 102000054113 human GFER Human genes 0.000 claims abstract description 73
- JVJUWEFOGFCHKR-UHFFFAOYSA-N 2-(diethylamino)ethyl 1-(3,4-dimethylphenyl)cyclopentane-1-carboxylate;hydrochloride Chemical compound Cl.C=1C=C(C)C(C)=CC=1C1(C(=O)OCCN(CC)CC)CCCC1 JVJUWEFOGFCHKR-UHFFFAOYSA-N 0.000 claims abstract description 68
- 210000003705 ribosome Anatomy 0.000 claims abstract description 44
- 239000013598 vector Substances 0.000 claims abstract description 40
- 150000001875 compounds Chemical class 0.000 claims abstract description 35
- 108010006533 ATP-Binding Cassette Transporters Proteins 0.000 claims abstract description 29
- 102000005416 ATP-Binding Cassette Transporters Human genes 0.000 claims abstract description 28
- 230000009261 transgenic effect Effects 0.000 claims abstract description 15
- 239000000074 antisense oligonucleotide Substances 0.000 claims abstract description 14
- 238000012230 antisense oligonucleotides Methods 0.000 claims abstract description 14
- 230000014509 gene expression Effects 0.000 claims description 49
- 102100022104 60S ribosomal protein L3-like Human genes 0.000 claims description 42
- 101001110361 Homo sapiens 60S ribosomal protein L3-like Proteins 0.000 claims description 42
- 108091034117 Oligonucleotide Proteins 0.000 claims description 38
- 230000027455 binding Effects 0.000 claims description 38
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 31
- 108020004999 messenger RNA Proteins 0.000 claims description 31
- 239000012634 fragment Substances 0.000 claims description 23
- 230000000295 complement effect Effects 0.000 claims description 17
- 238000013519 translation Methods 0.000 claims description 16
- 210000000170 cell membrane Anatomy 0.000 claims description 14
- 239000003446 ligand Substances 0.000 claims description 8
- 239000012051 hydrophobic carrier Substances 0.000 claims description 6
- 238000012875 competitive assay Methods 0.000 claims description 5
- 238000004519 manufacturing process Methods 0.000 claims description 5
- 238000012258 culturing Methods 0.000 claims 4
- 108090000765 processed proteins & peptides Proteins 0.000 abstract description 124
- 102000004196 processed proteins & peptides Human genes 0.000 abstract description 104
- 229920001184 polypeptide Polymers 0.000 abstract description 90
- 210000000349 chromosome Anatomy 0.000 abstract description 21
- 108020004711 Nucleic Acid Probes Proteins 0.000 abstract description 5
- 239000002853 nucleic acid probe Substances 0.000 abstract description 5
- 108020000948 Antisense Oligonucleotides Proteins 0.000 abstract description 3
- 108020004635 Complementary DNA Proteins 0.000 description 116
- 238000010804 cDNA synthesis Methods 0.000 description 105
- 239000002299 complementary DNA Substances 0.000 description 105
- 235000018102 proteins Nutrition 0.000 description 87
- 235000001014 amino acid Nutrition 0.000 description 76
- 150000001413 amino acids Chemical class 0.000 description 67
- 108020004414 DNA Proteins 0.000 description 57
- 101000801640 Homo sapiens Phospholipid-transporting ATPase ABCA3 Proteins 0.000 description 52
- 238000003757 reverse transcription PCR Methods 0.000 description 46
- 239000000047 product Substances 0.000 description 42
- 108700024394 Exon Proteins 0.000 description 40
- 108010050848 glycylleucine Proteins 0.000 description 40
- 239000000523 sample Substances 0.000 description 40
- 101000990566 Homo sapiens HEAT repeat-containing protein 6 Proteins 0.000 description 33
- 101000801684 Homo sapiens Phospholipid-transporting ATPase ABCA1 Proteins 0.000 description 33
- 102100033616 Phospholipid-transporting ATPase ABCA1 Human genes 0.000 description 33
- 239000013615 primer Substances 0.000 description 30
- 102000040430 polynucleotide Human genes 0.000 description 29
- 108091033319 polynucleotide Proteins 0.000 description 29
- 239000002157 polynucleotide Substances 0.000 description 29
- 101000801645 Homo sapiens ATP-binding cassette sub-family A member 2 Proteins 0.000 description 28
- 241001529936 Murinae Species 0.000 description 27
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 26
- 239000002773 nucleotide Substances 0.000 description 26
- 125000003729 nucleotide group Chemical group 0.000 description 26
- 102100033618 ATP-binding cassette sub-family A member 2 Human genes 0.000 description 25
- 238000003752 polymerase chain reaction Methods 0.000 description 25
- 238000004458 analytical method Methods 0.000 description 23
- 108010054155 lysyllysine Proteins 0.000 description 22
- 210000001519 tissue Anatomy 0.000 description 22
- 108091026890 Coding region Proteins 0.000 description 21
- 102100033623 Phospholipid-transporting ATPase ABCA3 Human genes 0.000 description 21
- 108010025306 histidylleucine Proteins 0.000 description 19
- 238000009396 hybridization Methods 0.000 description 19
- 230000035897 transcription Effects 0.000 description 19
- 238000013518 transcription Methods 0.000 description 19
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 18
- 241000287828 Gallus gallus Species 0.000 description 17
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 17
- 230000006870 function Effects 0.000 description 17
- 108010015792 glycyllysine Proteins 0.000 description 17
- 108010081726 netrin-2 Proteins 0.000 description 17
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 15
- 108010062796 arginyllysine Proteins 0.000 description 15
- 238000005516 engineering process Methods 0.000 description 15
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 14
- 108010047495 alanylglycine Proteins 0.000 description 14
- 108010034529 leucyl-lysine Proteins 0.000 description 14
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 13
- 241000880493 Leptailurus serval Species 0.000 description 13
- 108091028043 Nucleic acid sequence Proteins 0.000 description 13
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 13
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 13
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 13
- 108010061238 threonyl-glycine Proteins 0.000 description 13
- 108010087924 alanylproline Proteins 0.000 description 12
- 230000000694 effects Effects 0.000 description 12
- 108010003700 lysyl aspartic acid Proteins 0.000 description 12
- 108010017391 lysylvaline Proteins 0.000 description 12
- 208000030761 polycystic kidney disease Diseases 0.000 description 12
- 230000001105 regulatory effect Effects 0.000 description 12
- 238000012216 screening Methods 0.000 description 12
- 101100371856 Caenorhabditis elegans unc-6 gene Proteins 0.000 description 11
- 108020004705 Codon Proteins 0.000 description 11
- 108010079364 N-glycylalanine Proteins 0.000 description 11
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 11
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 11
- 108010093581 aspartyl-proline Proteins 0.000 description 11
- 238000006243 chemical reaction Methods 0.000 description 11
- 238000010367 cloning Methods 0.000 description 11
- 108010057821 leucylproline Proteins 0.000 description 11
- 108010064235 lysylglycine Proteins 0.000 description 11
- 108010026333 seryl-proline Proteins 0.000 description 11
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 10
- 108091060211 Expressed sequence tag Proteins 0.000 description 10
- 238000000636 Northern blotting Methods 0.000 description 10
- 239000005557 antagonist Substances 0.000 description 10
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 10
- 108010051242 phenylalanylserine Proteins 0.000 description 10
- HPNDBHLITCHRSO-WHFBIAKZSA-N Asp-Ala-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)NCC(O)=O HPNDBHLITCHRSO-WHFBIAKZSA-N 0.000 description 9
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 9
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 9
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 9
- TYVAWPFQYFPSBR-BFHQHQDPSA-N Thr-Ala-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)NCC(O)=O TYVAWPFQYFPSBR-BFHQHQDPSA-N 0.000 description 9
- 108010044940 alanylglutamine Proteins 0.000 description 9
- 108010068380 arginylarginine Proteins 0.000 description 9
- 108010077245 asparaginyl-proline Proteins 0.000 description 9
- 108010038633 aspartylglutamate Proteins 0.000 description 9
- 108010047857 aspartylglycine Proteins 0.000 description 9
- 108010068265 aspartyltyrosine Proteins 0.000 description 9
- 210000004899 c-terminal region Anatomy 0.000 description 9
- 108010060199 cysteinylproline Proteins 0.000 description 9
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 9
- 108010081551 glycylphenylalanine Proteins 0.000 description 9
- 210000004072 lung Anatomy 0.000 description 9
- 108010077112 prolyl-proline Proteins 0.000 description 9
- 108010029020 prolylglycine Proteins 0.000 description 9
- 210000000278 spinal cord Anatomy 0.000 description 9
- 108010073969 valyllysine Proteins 0.000 description 9
- 241000283690 Bos taurus Species 0.000 description 8
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 8
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 8
- 108010074223 Netrin-1 Proteins 0.000 description 8
- 102000009065 Netrin-1 Human genes 0.000 description 8
- 108700026244 Open Reading Frames Proteins 0.000 description 8
- BRKHVZNDAOMAHX-BIIVOSGPSA-N Ser-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N BRKHVZNDAOMAHX-BIIVOSGPSA-N 0.000 description 8
- 239000002253 acid Substances 0.000 description 8
- 150000007513 acids Chemical class 0.000 description 8
- 239000000556 agonist Substances 0.000 description 8
- 210000004556 brain Anatomy 0.000 description 8
- 108010016616 cysteinylglycine Proteins 0.000 description 8
- 108010049041 glutamylalanine Proteins 0.000 description 8
- 230000002209 hydrophobic effect Effects 0.000 description 8
- 238000000338 in vitro Methods 0.000 description 8
- 108010056582 methionylglutamic acid Proteins 0.000 description 8
- 108010012581 phenylalanylglutamate Proteins 0.000 description 8
- 239000000758 substrate Substances 0.000 description 8
- 238000011144 upstream manufacturing Methods 0.000 description 8
- VBEQCZHXXJYVRD-GACYYNSASA-N uroanthelone Chemical group C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)C(C)C)[C@@H](C)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CCSC)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)CNC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CS)NC(=O)CNC(=O)[C@H]1N(CCC1)C(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O)C(C)C)[C@@H](C)CC)C1=CC=C(O)C=C1 VBEQCZHXXJYVRD-GACYYNSASA-N 0.000 description 8
- 108020005345 3' Untranslated Regions Proteins 0.000 description 7
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 7
- 108010079245 Cystic Fibrosis Transmembrane Conductance Regulator Proteins 0.000 description 7
- 102100023419 Cystic fibrosis transmembrane conductance regulator Human genes 0.000 description 7
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 7
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 7
- 108091092195 Intron Proteins 0.000 description 7
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 7
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 7
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 7
- WFCKERTZVCQXKH-KBPBESRZSA-N Leu-Tyr-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O WFCKERTZVCQXKH-KBPBESRZSA-N 0.000 description 7
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 7
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 7
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 7
- 108010070944 alanylhistidine Proteins 0.000 description 7
- 230000003321 amplification Effects 0.000 description 7
- 238000013459 approach Methods 0.000 description 7
- 108010084389 glycyltryptophan Proteins 0.000 description 7
- 108010037850 glycylvaline Proteins 0.000 description 7
- 238000002955 isolation Methods 0.000 description 7
- 210000004962 mammalian cell Anatomy 0.000 description 7
- 239000012528 membrane Substances 0.000 description 7
- 229930182817 methionine Natural products 0.000 description 7
- 101150006794 msrAB gene Proteins 0.000 description 7
- 101150068440 msrB gene Proteins 0.000 description 7
- 238000003199 nucleic acid amplification method Methods 0.000 description 7
- 101150068681 pilB gene Proteins 0.000 description 7
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 6
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 6
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 6
- 108010078791 Carrier Proteins Proteins 0.000 description 6
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 6
- 101001111307 Gallus gallus Netrin-1 Proteins 0.000 description 6
- ZIMTWPHIKZEHSE-UWVGGRQHSA-N His-Arg-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O ZIMTWPHIKZEHSE-UWVGGRQHSA-N 0.000 description 6
- 108010065920 Insulin Lispro Proteins 0.000 description 6
- 101150084684 L3 gene Proteins 0.000 description 6
- VBZOAGIPCULURB-QWRGUYRKSA-N Leu-Gly-His Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N VBZOAGIPCULURB-QWRGUYRKSA-N 0.000 description 6
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 6
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 6
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 6
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 6
- YIBOAHAOAWACDK-QEJZJMRPSA-N Lys-Ala-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 YIBOAHAOAWACDK-QEJZJMRPSA-N 0.000 description 6
- GPAHWYRSHCKICP-GUBZILKMSA-N Met-Glu-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GPAHWYRSHCKICP-GUBZILKMSA-N 0.000 description 6
- 241001465754 Metazoa Species 0.000 description 6
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 6
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 6
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 6
- DGDCSVGVWWAJRS-AVGNSLFASA-N Pro-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@@H]2CCCN2 DGDCSVGVWWAJRS-AVGNSLFASA-N 0.000 description 6
- 108010076504 Protein Sorting Signals Proteins 0.000 description 6
- 101100323064 Rattus norvegicus Gfer gene Proteins 0.000 description 6
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 6
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 6
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 6
- HJSLDXZAZGFPDK-ULQDDVLXSA-N Val-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N HJSLDXZAZGFPDK-ULQDDVLXSA-N 0.000 description 6
- WUFHZIRMAZZWRS-OSUNSFLBSA-N Val-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C(C)C)N WUFHZIRMAZZWRS-OSUNSFLBSA-N 0.000 description 6
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 6
- 108010011559 alanylphenylalanine Proteins 0.000 description 6
- 108010070783 alanyltyrosine Proteins 0.000 description 6
- 108010013835 arginine glutamate Proteins 0.000 description 6
- 108010054813 diprotin B Proteins 0.000 description 6
- 108010087823 glycyltyrosine Proteins 0.000 description 6
- 238000002744 homologous recombination Methods 0.000 description 6
- 230000006801 homologous recombination Effects 0.000 description 6
- 108010090894 prolylleucine Proteins 0.000 description 6
- 238000011160 research Methods 0.000 description 6
- 238000012163 sequencing technique Methods 0.000 description 6
- 108010071207 serylmethionine Proteins 0.000 description 6
- 241000894007 species Species 0.000 description 6
- 239000000126 substance Substances 0.000 description 6
- 230000008685 targeting Effects 0.000 description 6
- 230000002103 transcriptional effect Effects 0.000 description 6
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 5
- BVSGPHDECMJBDE-HGNGGELXSA-N Ala-Glu-His Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N BVSGPHDECMJBDE-HGNGGELXSA-N 0.000 description 5
- YCRAFFCYWOUEOF-DLOVCJGASA-N Ala-Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 YCRAFFCYWOUEOF-DLOVCJGASA-N 0.000 description 5
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 5
- ZCUFMRIQCPNOHZ-NRPADANISA-N Ala-Val-Gln Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZCUFMRIQCPNOHZ-NRPADANISA-N 0.000 description 5
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 5
- YFWTXMRJJDNTLM-LSJOCFKGSA-N Arg-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YFWTXMRJJDNTLM-LSJOCFKGSA-N 0.000 description 5
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 5
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 5
- GWNMUVANAWDZTI-YUMQZZPRSA-N Asn-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N GWNMUVANAWDZTI-YUMQZZPRSA-N 0.000 description 5
- DBWYWXNMZZYIRY-LPEHRKFASA-N Asp-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O DBWYWXNMZZYIRY-LPEHRKFASA-N 0.000 description 5
- MJKBOVWWADWLHV-ZLUOBGJFSA-N Asp-Cys-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)C(=O)O MJKBOVWWADWLHV-ZLUOBGJFSA-N 0.000 description 5
- KWIUHFFTVRNATP-UHFFFAOYSA-N Betaine Natural products C[N+](C)(C)CC([O-])=O KWIUHFFTVRNATP-UHFFFAOYSA-N 0.000 description 5
- KGIHMGPYGXBYJJ-SRVKXCTJSA-N Cys-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CS KGIHMGPYGXBYJJ-SRVKXCTJSA-N 0.000 description 5
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 5
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 5
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 5
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 5
- XLFHCWHXKSFVIB-BQBZGAKWSA-N Gly-Gln-Gln Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLFHCWHXKSFVIB-BQBZGAKWSA-N 0.000 description 5
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 5
- OJNZVYSGVYLQIN-BQBZGAKWSA-N Gly-Met-Asp Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O OJNZVYSGVYLQIN-BQBZGAKWSA-N 0.000 description 5
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 5
- DKJWUIYLMLUBDX-XPUUQOCRSA-N Gly-Val-Cys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O DKJWUIYLMLUBDX-XPUUQOCRSA-N 0.000 description 5
- 241000606790 Haemophilus Species 0.000 description 5
- 101001008894 Homo sapiens Histone-lysine N-methyltransferase 2D Proteins 0.000 description 5
- 101000795659 Homo sapiens Tuberin Proteins 0.000 description 5
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 5
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 5
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 5
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 5
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 5
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 5
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 5
- HONVOXINDBETTI-KKUMJFAQSA-N Lys-Tyr-Cys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CS)C(O)=O)CC1=CC=C(O)C=C1 HONVOXINDBETTI-KKUMJFAQSA-N 0.000 description 5
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 5
- KWIUHFFTVRNATP-UHFFFAOYSA-O N,N,N-trimethylglycinium Chemical compound C[N+](C)(C)CC(O)=O KWIUHFFTVRNATP-UHFFFAOYSA-O 0.000 description 5
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 5
- SMFGCTXUBWEPKM-KBPBESRZSA-N Phe-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 SMFGCTXUBWEPKM-KBPBESRZSA-N 0.000 description 5
- CMOIIANLNNYUTP-SRVKXCTJSA-N Pro-Gln-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CMOIIANLNNYUTP-SRVKXCTJSA-N 0.000 description 5
- XDKKMRPRRCOELJ-GUBZILKMSA-N Pro-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 XDKKMRPRRCOELJ-GUBZILKMSA-N 0.000 description 5
- QVOGDCQNGLBNCR-FXQIFTODSA-N Ser-Arg-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O QVOGDCQNGLBNCR-FXQIFTODSA-N 0.000 description 5
- CAOYHZOWXFFAIR-CIUDSAMLSA-N Ser-His-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O CAOYHZOWXFFAIR-CIUDSAMLSA-N 0.000 description 5
- ODSAPYVQSLDRSR-LKXGYXEUSA-N Thr-Cys-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O ODSAPYVQSLDRSR-LKXGYXEUSA-N 0.000 description 5
- WOCYUGQDXPTQPY-FXQIFTODSA-N Val-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N WOCYUGQDXPTQPY-FXQIFTODSA-N 0.000 description 5
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 5
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 5
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 5
- 108010005233 alanylglutamic acid Proteins 0.000 description 5
- alkyl phosphoramidate Chemical compound 0.000 description 5
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 5
- 230000001580 bacterial effect Effects 0.000 description 5
- 229960003237 betaine Drugs 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 239000013604 expression vector Substances 0.000 description 5
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 5
- 239000003550 marker Substances 0.000 description 5
- 230000036961 partial effect Effects 0.000 description 5
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 5
- 108010018625 phenylalanylarginine Proteins 0.000 description 5
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 5
- 239000013612 plasmid Substances 0.000 description 5
- 230000008488 polyadenylation Effects 0.000 description 5
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 5
- 108010070643 prolylglutamic acid Proteins 0.000 description 5
- 108010048818 seryl-histidine Proteins 0.000 description 5
- 238000006467 substitution reaction Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 230000005026 transcription initiation Effects 0.000 description 5
- 230000032258 transport Effects 0.000 description 5
- 108010029384 tryptophyl-histidine Proteins 0.000 description 5
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 4
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 4
- 108020003589 5' Untranslated Regions Proteins 0.000 description 4
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 4
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 4
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 4
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 4
- JJHBEVZAZXZREW-LFSVMHDDSA-N Ala-Thr-Phe Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O JJHBEVZAZXZREW-LFSVMHDDSA-N 0.000 description 4
- KJGNDQCYBNBXDA-GUBZILKMSA-N Arg-Arg-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N KJGNDQCYBNBXDA-GUBZILKMSA-N 0.000 description 4
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 4
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 4
- PVSNBTCXCQIXSE-JYJNAYRXSA-N Arg-Arg-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PVSNBTCXCQIXSE-JYJNAYRXSA-N 0.000 description 4
- SVHRPCMZTWZROG-DCAQKATOSA-N Arg-Cys-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCN=C(N)N)N SVHRPCMZTWZROG-DCAQKATOSA-N 0.000 description 4
- OBFTYSPXDRROQO-SRVKXCTJSA-N Arg-Gln-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCN=C(N)N OBFTYSPXDRROQO-SRVKXCTJSA-N 0.000 description 4
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 4
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 4
- QEHMMRSQJMOYNO-DCAQKATOSA-N Arg-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N QEHMMRSQJMOYNO-DCAQKATOSA-N 0.000 description 4
- PZVMBNFTBWQWQL-DCAQKATOSA-N Arg-His-Cys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N PZVMBNFTBWQWQL-DCAQKATOSA-N 0.000 description 4
- MSILNNHVVMMTHZ-UWVGGRQHSA-N Arg-His-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CN=CN1 MSILNNHVVMMTHZ-UWVGGRQHSA-N 0.000 description 4
- UAOSDDXCTBIPCA-QXEWZRGKSA-N Arg-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UAOSDDXCTBIPCA-QXEWZRGKSA-N 0.000 description 4
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 4
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 4
- RIQBRKVTFBWEDY-RHYQMDGZSA-N Arg-Lys-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RIQBRKVTFBWEDY-RHYQMDGZSA-N 0.000 description 4
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 4
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 4
- ZJIFRAPZHAGLGR-MELADBBJSA-N Asn-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC(=O)N)N)C(=O)O ZJIFRAPZHAGLGR-MELADBBJSA-N 0.000 description 4
- OERMIMJQPQUIPK-FXQIFTODSA-N Asp-Arg-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O OERMIMJQPQUIPK-FXQIFTODSA-N 0.000 description 4
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 4
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 4
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 4
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 4
- FWYBFUDWUUFLDN-FXQIFTODSA-N Cys-Asp-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N)CN=C(N)N FWYBFUDWUUFLDN-FXQIFTODSA-N 0.000 description 4
- XMVZMBGFIOQONW-GARJFASQSA-N Cys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N)C(=O)O XMVZMBGFIOQONW-GARJFASQSA-N 0.000 description 4
- 201000003883 Cystic fibrosis Diseases 0.000 description 4
- 102400001368 Epidermal growth factor Human genes 0.000 description 4
- 101800003838 Epidermal growth factor Proteins 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- RBWKVOSARCFSQQ-FXQIFTODSA-N Gln-Gln-Ser Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O RBWKVOSARCFSQQ-FXQIFTODSA-N 0.000 description 4
- STHSGOZLFLFGSS-SUSMZKCASA-N Gln-Thr-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STHSGOZLFLFGSS-SUSMZKCASA-N 0.000 description 4
- CMBXOSFZCFGDLE-IHRRRGAJSA-N Gln-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O CMBXOSFZCFGDLE-IHRRRGAJSA-N 0.000 description 4
- AVZHGSCDKIQZPQ-CIUDSAMLSA-N Glu-Arg-Ala Chemical compound C[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AVZHGSCDKIQZPQ-CIUDSAMLSA-N 0.000 description 4
- CYHBMLHCQXXCCT-AVGNSLFASA-N Glu-Asp-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CYHBMLHCQXXCCT-AVGNSLFASA-N 0.000 description 4
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 4
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 4
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 4
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 4
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 4
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 4
- UWQDKRIZSROAKS-FJXKBIBVSA-N Gly-Met-Thr Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UWQDKRIZSROAKS-FJXKBIBVSA-N 0.000 description 4
- JPVGHHQGKPQYIL-KBPBESRZSA-N Gly-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 JPVGHHQGKPQYIL-KBPBESRZSA-N 0.000 description 4
- POJJAZJHBGXEGM-YUMQZZPRSA-N Gly-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)CN POJJAZJHBGXEGM-YUMQZZPRSA-N 0.000 description 4
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 4
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 4
- UROVZOUMHNXPLZ-AVGNSLFASA-N His-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 UROVZOUMHNXPLZ-AVGNSLFASA-N 0.000 description 4
- JUCZDDVZBMPKRT-IXOXFDKPSA-N His-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N)O JUCZDDVZBMPKRT-IXOXFDKPSA-N 0.000 description 4
- YKUAGFAXQRYUQW-KKUMJFAQSA-N His-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O YKUAGFAXQRYUQW-KKUMJFAQSA-N 0.000 description 4
- 101100029888 Homo sapiens PKD1 gene Proteins 0.000 description 4
- 101000641419 Homo sapiens V-type proton ATPase 16 kDa proteolipid subunit c Proteins 0.000 description 4
- 101000775709 Homo sapiens V-type proton ATPase subunit C 1 Proteins 0.000 description 4
- MVLDERGQICFFLL-ZQINRCPSSA-N Ile-Gln-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 MVLDERGQICFFLL-ZQINRCPSSA-N 0.000 description 4
- NZOCIWKZUVUNDW-ZKWXMUAHSA-N Ile-Gly-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O NZOCIWKZUVUNDW-ZKWXMUAHSA-N 0.000 description 4
- XLXPYSDGMXTTNQ-UHFFFAOYSA-N Ile-Phe-Leu Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(CC(C)C)C(O)=O)CC1=CC=CC=C1 XLXPYSDGMXTTNQ-UHFFFAOYSA-N 0.000 description 4
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 4
- BAJIJEGGUYXZGC-CIUDSAMLSA-N Leu-Asn-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N BAJIJEGGUYXZGC-CIUDSAMLSA-N 0.000 description 4
- POJPZSMTTMLSTG-SRVKXCTJSA-N Leu-Asn-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N POJPZSMTTMLSTG-SRVKXCTJSA-N 0.000 description 4
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 4
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 4
- QPXBPQUGXHURGP-UWVGGRQHSA-N Leu-Gly-Met Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CCSC)C(=O)O)N QPXBPQUGXHURGP-UWVGGRQHSA-N 0.000 description 4
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 4
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 4
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 4
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 4
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 4
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 4
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 4
- KZZCOWMDDXDKSS-CIUDSAMLSA-N Leu-Ser-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KZZCOWMDDXDKSS-CIUDSAMLSA-N 0.000 description 4
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 4
- SBANPBVRHYIMRR-GARJFASQSA-N Leu-Ser-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N SBANPBVRHYIMRR-GARJFASQSA-N 0.000 description 4
- VDIARPPNADFEAV-WEDXCCLWSA-N Leu-Thr-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O VDIARPPNADFEAV-WEDXCCLWSA-N 0.000 description 4
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 4
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 4
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 4
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 4
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 4
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 4
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 4
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 4
- PYFNONMJYNJENN-AVGNSLFASA-N Lys-Lys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PYFNONMJYNJENN-AVGNSLFASA-N 0.000 description 4
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 4
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 4
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 4
- LHXFNWBNRBWMNV-DCAQKATOSA-N Met-Ser-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LHXFNWBNRBWMNV-DCAQKATOSA-N 0.000 description 4
- QQPMHUCGDRJFQK-RHYQMDGZSA-N Met-Thr-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QQPMHUCGDRJFQK-RHYQMDGZSA-N 0.000 description 4
- 241000588652 Neisseria gonorrhoeae Species 0.000 description 4
- 101150056230 PKD1 gene Proteins 0.000 description 4
- VFDRDMOMHBJGKD-UFYCRDLUSA-N Phe-Tyr-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N VFDRDMOMHBJGKD-UFYCRDLUSA-N 0.000 description 4
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 4
- IHCXPSYCHXFXKT-DCAQKATOSA-N Pro-Arg-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O IHCXPSYCHXFXKT-DCAQKATOSA-N 0.000 description 4
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 4
- FXGIMYRVJJEIIM-UWVGGRQHSA-N Pro-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FXGIMYRVJJEIIM-UWVGGRQHSA-N 0.000 description 4
- RCYUBVHMVUHEBM-RCWTZXSCSA-N Pro-Pro-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RCYUBVHMVUHEBM-RCWTZXSCSA-N 0.000 description 4
- SEZGGSHLMROBFX-CIUDSAMLSA-N Pro-Ser-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O SEZGGSHLMROBFX-CIUDSAMLSA-N 0.000 description 4
- KWMZPPWYBVZIER-XGEHTFHBSA-N Pro-Ser-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWMZPPWYBVZIER-XGEHTFHBSA-N 0.000 description 4
- KHRLUIPIMIQFGT-AVGNSLFASA-N Pro-Val-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHRLUIPIMIQFGT-AVGNSLFASA-N 0.000 description 4
- 238000012300 Sequence Analysis Methods 0.000 description 4
- HZWAHWQZPSXNCB-BPUTZDHNSA-N Ser-Arg-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O HZWAHWQZPSXNCB-BPUTZDHNSA-N 0.000 description 4
- FMDHKPRACUXATF-ACZMJKKPSA-N Ser-Gln-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O FMDHKPRACUXATF-ACZMJKKPSA-N 0.000 description 4
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 4
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 4
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 4
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 4
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 4
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 4
- STIAINRLUUKYKM-WFBYXXMGSA-N Ser-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@@H](N)CO)=CNC2=C1 STIAINRLUUKYKM-WFBYXXMGSA-N 0.000 description 4
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 4
- 108091081024 Start codon Proteins 0.000 description 4
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 4
- LGNBRHZANHMZHK-NUMRIWBASA-N Thr-Glu-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O LGNBRHZANHMZHK-NUMRIWBASA-N 0.000 description 4
- UYTYTDMCDBPDSC-URLPEUOOSA-N Thr-Ile-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N UYTYTDMCDBPDSC-URLPEUOOSA-N 0.000 description 4
- RRRRCRYTLZVCEN-HJGDQZAQSA-N Thr-Leu-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O RRRRCRYTLZVCEN-HJGDQZAQSA-N 0.000 description 4
- JMBRNXUOLJFURW-BEAPCOKYSA-N Thr-Phe-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N)O JMBRNXUOLJFURW-BEAPCOKYSA-N 0.000 description 4
- TZNNEYFZZAHLBL-BPUTZDHNSA-N Trp-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O TZNNEYFZZAHLBL-BPUTZDHNSA-N 0.000 description 4
- IMYTYAWRKBYTSX-YTQUADARSA-N Trp-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC3=CNC4=CC=CC=C43)N)C(=O)O IMYTYAWRKBYTSX-YTQUADARSA-N 0.000 description 4
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 4
- 108091023045 Untranslated Region Proteins 0.000 description 4
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 4
- XQVRMLRMTAGSFJ-QXEWZRGKSA-N Val-Asp-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XQVRMLRMTAGSFJ-QXEWZRGKSA-N 0.000 description 4
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 4
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 4
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 4
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 4
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 4
- JSOXWWFKRJKTMT-WOPDTQHZSA-N Val-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N JSOXWWFKRJKTMT-WOPDTQHZSA-N 0.000 description 4
- 108010041407 alanylaspartic acid Proteins 0.000 description 4
- 230000000692 anti-sense effect Effects 0.000 description 4
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 4
- 210000001106 artificial yeast chromosome Anatomy 0.000 description 4
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 4
- 238000003556 assay Methods 0.000 description 4
- 230000004009 axon guidance Effects 0.000 description 4
- 230000033228 biological regulation Effects 0.000 description 4
- 108091092330 cytoplasmic RNA Proteins 0.000 description 4
- 238000001514 detection method Methods 0.000 description 4
- 230000029087 digestion Effects 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 4
- 229940116977 epidermal growth factor Drugs 0.000 description 4
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 4
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 4
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 4
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 4
- 108010028188 glycyl-histidyl-serine Proteins 0.000 description 4
- 108010089804 glycyl-threonine Proteins 0.000 description 4
- 108010040030 histidinoalanine Proteins 0.000 description 4
- 108010085325 histidylproline Proteins 0.000 description 4
- 108010018006 histidylserine Proteins 0.000 description 4
- 210000003734 kidney Anatomy 0.000 description 4
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 4
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 4
- 210000004185 liver Anatomy 0.000 description 4
- 238000013507 mapping Methods 0.000 description 4
- 239000011159 matrix material Substances 0.000 description 4
- 210000000496 pancreas Anatomy 0.000 description 4
- 210000002826 placenta Anatomy 0.000 description 4
- 108020003175 receptors Proteins 0.000 description 4
- 102000005962 receptors Human genes 0.000 description 4
- 230000006798 recombination Effects 0.000 description 4
- 238000005215 recombination Methods 0.000 description 4
- 230000008439 repair process Effects 0.000 description 4
- 230000002441 reversible effect Effects 0.000 description 4
- HHGYNJRJIINWAK-FXQIFTODSA-N Ala-Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N HHGYNJRJIINWAK-FXQIFTODSA-N 0.000 description 3
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 3
- WDIYWDJLXOCGRW-ACZMJKKPSA-N Ala-Asp-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WDIYWDJLXOCGRW-ACZMJKKPSA-N 0.000 description 3
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 3
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 3
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 3
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 3
- QJABSQFUHKHTNP-SYWGBEHUSA-N Ala-Ile-Trp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QJABSQFUHKHTNP-SYWGBEHUSA-N 0.000 description 3
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 3
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 3
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 3
- CYBJZLQSUJEMAS-LFSVMHDDSA-N Ala-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C)N)O CYBJZLQSUJEMAS-LFSVMHDDSA-N 0.000 description 3
- IHMCQESUJVZTKW-UBHSHLNASA-N Ala-Phe-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 IHMCQESUJVZTKW-UBHSHLNASA-N 0.000 description 3
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 3
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 3
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 3
- RWCLSUOSKWTXLA-FXQIFTODSA-N Arg-Asp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RWCLSUOSKWTXLA-FXQIFTODSA-N 0.000 description 3
- DQNLFLGFZAUIOW-FXQIFTODSA-N Arg-Cys-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O DQNLFLGFZAUIOW-FXQIFTODSA-N 0.000 description 3
- IGULQRCJLQQPSM-DCAQKATOSA-N Arg-Cys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O IGULQRCJLQQPSM-DCAQKATOSA-N 0.000 description 3
- ZDBWKBCKYJGKGP-DCAQKATOSA-N Arg-Leu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O ZDBWKBCKYJGKGP-DCAQKATOSA-N 0.000 description 3
- JEOCWTUOMKEEMF-RHYQMDGZSA-N Arg-Leu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEOCWTUOMKEEMF-RHYQMDGZSA-N 0.000 description 3
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 3
- BNYNOWJESJJIOI-XUXIUFHCSA-N Arg-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N BNYNOWJESJJIOI-XUXIUFHCSA-N 0.000 description 3
- UGZUVYDKAYNCII-ULQDDVLXSA-N Arg-Phe-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UGZUVYDKAYNCII-ULQDDVLXSA-N 0.000 description 3
- FMYQECOAIFGQGU-CYDGBPFRSA-N Arg-Val-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMYQECOAIFGQGU-CYDGBPFRSA-N 0.000 description 3
- YNDLOUMBVDVALC-ZLUOBGJFSA-N Asn-Ala-Ala Chemical compound C[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC(=O)N)N YNDLOUMBVDVALC-ZLUOBGJFSA-N 0.000 description 3
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 3
- YJRORCOAFUZVKA-FXQIFTODSA-N Asn-Arg-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N YJRORCOAFUZVKA-FXQIFTODSA-N 0.000 description 3
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 3
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 3
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 3
- ALHMNHZJBYBYHS-DCAQKATOSA-N Asn-Lys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ALHMNHZJBYBYHS-DCAQKATOSA-N 0.000 description 3
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 3
- NECWUSYTYSIFNC-DLOVCJGASA-N Asp-Ala-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 NECWUSYTYSIFNC-DLOVCJGASA-N 0.000 description 3
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 3
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 3
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 3
- DPNWSMBUYCLEDG-CIUDSAMLSA-N Asp-Lys-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O DPNWSMBUYCLEDG-CIUDSAMLSA-N 0.000 description 3
- DWOSGXZMLQNDBN-FXQIFTODSA-N Asp-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CS)C(=O)O DWOSGXZMLQNDBN-FXQIFTODSA-N 0.000 description 3
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 3
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 3
- MGSVBZIBCCKGCY-ZLUOBGJFSA-N Asp-Ser-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MGSVBZIBCCKGCY-ZLUOBGJFSA-N 0.000 description 3
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 3
- KNOGLZBISUBTFW-QRTARXTBSA-N Asp-Trp-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(O)=O KNOGLZBISUBTFW-QRTARXTBSA-N 0.000 description 3
- VNLYIYOYUNGURO-ZLUOBGJFSA-N Cys-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N VNLYIYOYUNGURO-ZLUOBGJFSA-N 0.000 description 3
- OLIYIKRCOZBFCW-ZLUOBGJFSA-N Cys-Asp-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CS)N)C(=O)O OLIYIKRCOZBFCW-ZLUOBGJFSA-N 0.000 description 3
- YUZPQIQWXLRFBW-ACZMJKKPSA-N Cys-Glu-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O YUZPQIQWXLRFBW-ACZMJKKPSA-N 0.000 description 3
- KXUKWRVYDYIPSQ-CIUDSAMLSA-N Cys-Leu-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUKWRVYDYIPSQ-CIUDSAMLSA-N 0.000 description 3
- HEPLXMBVMCXTBP-QWRGUYRKSA-N Cys-Phe-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O HEPLXMBVMCXTBP-QWRGUYRKSA-N 0.000 description 3
- RAGIABZNLPZBGS-FXQIFTODSA-N Cys-Pro-Cys Chemical compound N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(O)=O RAGIABZNLPZBGS-FXQIFTODSA-N 0.000 description 3
- YQEHNIKPAOPBNH-DCAQKATOSA-N Cys-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N YQEHNIKPAOPBNH-DCAQKATOSA-N 0.000 description 3
- 241000701022 Cytomegalovirus Species 0.000 description 3
- 102000053602 DNA Human genes 0.000 description 3
- 239000003298 DNA probe Substances 0.000 description 3
- 101100450592 Dictyostelium discoideum hexa1 gene Proteins 0.000 description 3
- 102100039111 FAD-linked sulfhydryl oxidase ALR Human genes 0.000 description 3
- CLPQUWHBWXFJOX-BQBZGAKWSA-N Gln-Gly-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O CLPQUWHBWXFJOX-BQBZGAKWSA-N 0.000 description 3
- IOFDDSNZJDIGPB-GVXVVHGQSA-N Gln-Leu-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IOFDDSNZJDIGPB-GVXVVHGQSA-N 0.000 description 3
- SWDSRANUCKNBLA-AVGNSLFASA-N Gln-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SWDSRANUCKNBLA-AVGNSLFASA-N 0.000 description 3
- NYCVMJGIJYQWDO-CIUDSAMLSA-N Gln-Ser-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NYCVMJGIJYQWDO-CIUDSAMLSA-N 0.000 description 3
- JRCUFCXYZLPSDZ-ACZMJKKPSA-N Glu-Asp-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O JRCUFCXYZLPSDZ-ACZMJKKPSA-N 0.000 description 3
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 3
- BCYGDJXHAGZNPQ-DCAQKATOSA-N Glu-Lys-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O BCYGDJXHAGZNPQ-DCAQKATOSA-N 0.000 description 3
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 3
- QJVZSVUYZFYLFQ-CIUDSAMLSA-N Glu-Pro-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O QJVZSVUYZFYLFQ-CIUDSAMLSA-N 0.000 description 3
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 3
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 3
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 3
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 3
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 3
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 3
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 3
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 3
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 3
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 3
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 3
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 3
- UVTSZKIATYSKIR-RYUDHWBXSA-N Gly-Tyr-Glu Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O UVTSZKIATYSKIR-RYUDHWBXSA-N 0.000 description 3
- 241000238631 Hexapoda Species 0.000 description 3
- NOQPTNXSGNPJNS-YUMQZZPRSA-N His-Asn-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O NOQPTNXSGNPJNS-YUMQZZPRSA-N 0.000 description 3
- OQDLKDUVMTUPPG-AVGNSLFASA-N His-Leu-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OQDLKDUVMTUPPG-AVGNSLFASA-N 0.000 description 3
- PBVQWNDMFFCPIZ-ULQDDVLXSA-N His-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 PBVQWNDMFFCPIZ-ULQDDVLXSA-N 0.000 description 3
- FLXCRBXJRJSDHX-AVGNSLFASA-N His-Pro-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O FLXCRBXJRJSDHX-AVGNSLFASA-N 0.000 description 3
- CCUSLCQWVMWTIS-IXOXFDKPSA-N His-Thr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O CCUSLCQWVMWTIS-IXOXFDKPSA-N 0.000 description 3
- 101000673985 Homo sapiens 60S ribosomal protein L3 Proteins 0.000 description 3
- 208000023105 Huntington disease Diseases 0.000 description 3
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 3
- GYAFMRQGWHXMII-IUKAMOBKSA-N Ile-Asp-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N GYAFMRQGWHXMII-IUKAMOBKSA-N 0.000 description 3
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 3
- SNHYFFQZRFIRHO-CYDGBPFRSA-N Ile-Met-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)O)N SNHYFFQZRFIRHO-CYDGBPFRSA-N 0.000 description 3
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 3
- ZUWSVOYKBCHLRR-MGHWNKPDSA-N Ile-Tyr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUWSVOYKBCHLRR-MGHWNKPDSA-N 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 3
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 3
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 3
- XIRYQRLFHWWWTC-QEJZJMRPSA-N Leu-Ala-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XIRYQRLFHWWWTC-QEJZJMRPSA-N 0.000 description 3
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 3
- CLVUXCBGKUECIT-HJGDQZAQSA-N Leu-Asp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CLVUXCBGKUECIT-HJGDQZAQSA-N 0.000 description 3
- PNUCWVAGVNLUMW-CIUDSAMLSA-N Leu-Cys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O PNUCWVAGVNLUMW-CIUDSAMLSA-N 0.000 description 3
- FOEHRHOBWFQSNW-KATARQTJSA-N Leu-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(C)C)N)O FOEHRHOBWFQSNW-KATARQTJSA-N 0.000 description 3
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 3
- QDSKNVXKLPQNOJ-GVXVVHGQSA-N Leu-Gln-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O QDSKNVXKLPQNOJ-GVXVVHGQSA-N 0.000 description 3
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 3
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 3
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 3
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 3
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 3
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 3
- FOBUGKUBUJOWAD-IHPCNDPISA-N Leu-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 FOBUGKUBUJOWAD-IHPCNDPISA-N 0.000 description 3
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 3
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 3
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 3
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 3
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 3
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 3
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 3
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 3
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 3
- LINKCQUOMUDLKN-KATARQTJSA-N Leu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N)O LINKCQUOMUDLKN-KATARQTJSA-N 0.000 description 3
- AIQWYVFNBNNOLU-RHYQMDGZSA-N Leu-Thr-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O AIQWYVFNBNNOLU-RHYQMDGZSA-N 0.000 description 3
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 3
- VHFFQUSNFFIZBT-CIUDSAMLSA-N Lys-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N VHFFQUSNFFIZBT-CIUDSAMLSA-N 0.000 description 3
- HIIZIQUUHIXUJY-GUBZILKMSA-N Lys-Asp-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HIIZIQUUHIXUJY-GUBZILKMSA-N 0.000 description 3
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 3
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 3
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 3
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 3
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 3
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 3
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 3
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 3
- TVOOGUNBIWAURO-KATARQTJSA-N Lys-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N)O TVOOGUNBIWAURO-KATARQTJSA-N 0.000 description 3
- RMOKGALPSPOYKE-KATARQTJSA-N Lys-Thr-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMOKGALPSPOYKE-KATARQTJSA-N 0.000 description 3
- MVBZBRKNZVJEKK-DTWKUNHWSA-N Met-Gly-Pro Chemical compound CSCC[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N MVBZBRKNZVJEKK-DTWKUNHWSA-N 0.000 description 3
- VSJAPSMRFYUOKS-IUCAKERBSA-N Met-Pro-Gly Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O VSJAPSMRFYUOKS-IUCAKERBSA-N 0.000 description 3
- 241000699666 Mus <mouse, genus> Species 0.000 description 3
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 3
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 3
- 108010047562 NGR peptide Proteins 0.000 description 3
- JNRFYJZCMHHGMH-UBHSHLNASA-N Phe-Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JNRFYJZCMHHGMH-UBHSHLNASA-N 0.000 description 3
- XEXSSIBQYNKFBX-KBPBESRZSA-N Phe-Gly-His Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CC=CC=C1 XEXSSIBQYNKFBX-KBPBESRZSA-N 0.000 description 3
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 3
- AXIOGMQCDYVTNY-ACRUOGEOSA-N Phe-Phe-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 AXIOGMQCDYVTNY-ACRUOGEOSA-N 0.000 description 3
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 3
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 3
- CGBYDGAJHSOGFQ-LPEHRKFASA-N Pro-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 CGBYDGAJHSOGFQ-LPEHRKFASA-N 0.000 description 3
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 description 3
- BNBBNGZZKQUWCD-IUCAKERBSA-N Pro-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 BNBBNGZZKQUWCD-IUCAKERBSA-N 0.000 description 3
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 3
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 3
- RUDOLGWDSKQQFF-DCAQKATOSA-N Pro-Leu-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O RUDOLGWDSKQQFF-DCAQKATOSA-N 0.000 description 3
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 3
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 3
- LNICFEXCAHIJOR-DCAQKATOSA-N Pro-Ser-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LNICFEXCAHIJOR-DCAQKATOSA-N 0.000 description 3
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 3
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 3
- FUOGXAQMNJMBFG-WPRPVWTQSA-N Pro-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 FUOGXAQMNJMBFG-WPRPVWTQSA-N 0.000 description 3
- 108091035242 Sequence-tagged site Proteins 0.000 description 3
- CNIIKZQXBBQHCX-FXQIFTODSA-N Ser-Asp-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O CNIIKZQXBBQHCX-FXQIFTODSA-N 0.000 description 3
- GHPQVUYZQQGEDA-BIIVOSGPSA-N Ser-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N)C(=O)O GHPQVUYZQQGEDA-BIIVOSGPSA-N 0.000 description 3
- KMWFXJCGRXBQAC-CIUDSAMLSA-N Ser-Cys-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N KMWFXJCGRXBQAC-CIUDSAMLSA-N 0.000 description 3
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 3
- QBUWQRKEHJXTOP-DCAQKATOSA-N Ser-His-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QBUWQRKEHJXTOP-DCAQKATOSA-N 0.000 description 3
- LOKXAXAESFYFAX-CIUDSAMLSA-N Ser-His-Cys Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CS)C(O)=O)CC1=CN=CN1 LOKXAXAESFYFAX-CIUDSAMLSA-N 0.000 description 3
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 3
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 3
- GZSZPKSBVAOGIE-CIUDSAMLSA-N Ser-Lys-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O GZSZPKSBVAOGIE-CIUDSAMLSA-N 0.000 description 3
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 3
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 3
- PPCZVWHJWJFTFN-ZLUOBGJFSA-N Ser-Ser-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPCZVWHJWJFTFN-ZLUOBGJFSA-N 0.000 description 3
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 3
- QNBVFKZSSRYNFX-CUJWVEQBSA-N Ser-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N)O QNBVFKZSSRYNFX-CUJWVEQBSA-N 0.000 description 3
- BDMWLJLPPUCLNV-XGEHTFHBSA-N Ser-Thr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BDMWLJLPPUCLNV-XGEHTFHBSA-N 0.000 description 3
- HNDMFDBQXYZSRM-IHRRRGAJSA-N Ser-Val-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HNDMFDBQXYZSRM-IHRRRGAJSA-N 0.000 description 3
- DGDCHPCRMWEOJR-FQPOAREZSA-N Thr-Ala-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DGDCHPCRMWEOJR-FQPOAREZSA-N 0.000 description 3
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 3
- LYGKYFKSZTUXGZ-ZDLURKLDSA-N Thr-Cys-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)NCC(O)=O LYGKYFKSZTUXGZ-ZDLURKLDSA-N 0.000 description 3
- VYEHBMMAJFVTOI-JHEQGTHGSA-N Thr-Gly-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O VYEHBMMAJFVTOI-JHEQGTHGSA-N 0.000 description 3
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 3
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 3
- LCCSEJSPBWKBNT-OSUNSFLBSA-N Thr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N LCCSEJSPBWKBNT-OSUNSFLBSA-N 0.000 description 3
- IHAPJUHCZXBPHR-WZLNRYEVSA-N Thr-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N IHAPJUHCZXBPHR-WZLNRYEVSA-N 0.000 description 3
- MUAFDCVOHYAFNG-RCWTZXSCSA-N Thr-Pro-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MUAFDCVOHYAFNG-RCWTZXSCSA-N 0.000 description 3
- VTMGKRABARCZAX-OSUNSFLBSA-N Thr-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O VTMGKRABARCZAX-OSUNSFLBSA-N 0.000 description 3
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 3
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 3
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 3
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 3
- 108091036066 Three prime untranslated region Proteins 0.000 description 3
- 108091023040 Transcription factor Proteins 0.000 description 3
- QAXCHNZDPLSFPC-PJODQICGSA-N Trp-Ala-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 QAXCHNZDPLSFPC-PJODQICGSA-N 0.000 description 3
- MJBBMTOGSOSAKJ-HJXMPXNTSA-N Trp-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MJBBMTOGSOSAKJ-HJXMPXNTSA-N 0.000 description 3
- HABYQJRYDKEVOI-IHPCNDPISA-N Trp-His-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)N[C@@H](CCCCN)C(=O)O)N HABYQJRYDKEVOI-IHPCNDPISA-N 0.000 description 3
- 208000026911 Tuberous sclerosis complex Diseases 0.000 description 3
- ADBDQGBDNUTRDB-ULQDDVLXSA-N Tyr-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O ADBDQGBDNUTRDB-ULQDDVLXSA-N 0.000 description 3
- GHUNBABNQPIETG-MELADBBJSA-N Tyr-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O GHUNBABNQPIETG-MELADBBJSA-N 0.000 description 3
- TWAVEIJGFCBWCG-JYJNAYRXSA-N Tyr-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N TWAVEIJGFCBWCG-JYJNAYRXSA-N 0.000 description 3
- JKUZFODWJGEQAP-KBPBESRZSA-N Tyr-Gly-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O JKUZFODWJGEQAP-KBPBESRZSA-N 0.000 description 3
- 108010064997 VPY tripeptide Proteins 0.000 description 3
- YFOCMOVJBQDBCE-NRPADANISA-N Val-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N YFOCMOVJBQDBCE-NRPADANISA-N 0.000 description 3
- UZDHNIJRRTUKKC-DLOVCJGASA-N Val-Gln-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N UZDHNIJRRTUKKC-DLOVCJGASA-N 0.000 description 3
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 3
- HGJRMXOWUWVUOA-GVXVVHGQSA-N Val-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N HGJRMXOWUWVUOA-GVXVVHGQSA-N 0.000 description 3
- RWOGENDAOGMHLX-DCAQKATOSA-N Val-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N RWOGENDAOGMHLX-DCAQKATOSA-N 0.000 description 3
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 3
- MJOUSKQHAIARKI-JYJNAYRXSA-N Val-Phe-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 MJOUSKQHAIARKI-JYJNAYRXSA-N 0.000 description 3
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 3
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 3
- LLJLBRRXKZTTRD-GUBZILKMSA-N Val-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N LLJLBRRXKZTTRD-GUBZILKMSA-N 0.000 description 3
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 3
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 3
- 210000004102 animal cell Anatomy 0.000 description 3
- 108010008355 arginyl-glutamine Proteins 0.000 description 3
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 3
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 239000003153 chemical reaction reagent Substances 0.000 description 3
- 239000002975 chemoattractant Substances 0.000 description 3
- 239000002838 chemorepellent Substances 0.000 description 3
- 238000003776 cleavage reaction Methods 0.000 description 3
- 230000002860 competitive effect Effects 0.000 description 3
- 230000001086 cytosolic effect Effects 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 108010009297 diglycyl-histidine Proteins 0.000 description 3
- 239000003623 enhancer Substances 0.000 description 3
- 210000003754 fetus Anatomy 0.000 description 3
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 3
- 108010078144 glutaminyl-glycine Proteins 0.000 description 3
- 108010073628 glutamyl-valyl-phenylalanine Proteins 0.000 description 3
- 108010075431 glycyl-alanyl-phenylalanine Proteins 0.000 description 3
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 3
- 108010010147 glycylglutamine Proteins 0.000 description 3
- 108010020688 glycylhistidine Proteins 0.000 description 3
- 230000012010 growth Effects 0.000 description 3
- 210000002216 heart Anatomy 0.000 description 3
- 238000003384 imaging method Methods 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 230000001939 inductive effect Effects 0.000 description 3
- 150000002500 ions Chemical class 0.000 description 3
- 108010076756 leucyl-alanyl-phenylalanine Proteins 0.000 description 3
- 108010000761 leucylarginine Proteins 0.000 description 3
- 238000000520 microinjection Methods 0.000 description 3
- 101150008111 nagA gene Proteins 0.000 description 3
- 230000001537 neural effect Effects 0.000 description 3
- 210000002569 neuron Anatomy 0.000 description 3
- 108010024654 phenylalanyl-prolyl-alanine Proteins 0.000 description 3
- 239000002953 phosphate buffered saline Substances 0.000 description 3
- 230000026731 phosphorylation Effects 0.000 description 3
- 238000006366 phosphorylation reaction Methods 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 108010031719 prolyl-serine Proteins 0.000 description 3
- 230000008929 regeneration Effects 0.000 description 3
- 238000011069 regeneration method Methods 0.000 description 3
- 230000003252 repetitive effect Effects 0.000 description 3
- 230000010076 replication Effects 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 108091008146 restriction endonucleases Proteins 0.000 description 3
- 238000010839 reverse transcription Methods 0.000 description 3
- 230000007017 scission Effects 0.000 description 3
- 239000000243 solution Substances 0.000 description 3
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 238000011282 treatment Methods 0.000 description 3
- 108700004896 tripeptide FEG Proteins 0.000 description 3
- 208000009999 tuberous sclerosis Diseases 0.000 description 3
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 2
- AXFMEGAFCUULFV-BLFANLJRSA-N (2s)-2-[[(2s)-1-[(2s,3r)-2-amino-3-methylpentanoyl]pyrrolidine-2-carbonyl]amino]pentanedioic acid Chemical compound CC[C@@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AXFMEGAFCUULFV-BLFANLJRSA-N 0.000 description 2
- OFHXPCLWHLXQHT-JKQORVJESA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2,6-diaminohexanoyl]amino]-3-methylbutanoyl]amino]-4-methylpentanoyl]amino]butanedioic acid Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN OFHXPCLWHLXQHT-JKQORVJESA-N 0.000 description 2
- DQVAZKGVGKHQDS-UHFFFAOYSA-N 2-[[1-[2-[(2-amino-4-methylpentanoyl)amino]-4-methylpentanoyl]pyrrolidine-2-carbonyl]amino]-4-methylpentanoic acid Chemical compound CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(=O)NC(CC(C)C)C(O)=O DQVAZKGVGKHQDS-UHFFFAOYSA-N 0.000 description 2
- JUEUYDRZJNQZGR-UHFFFAOYSA-N 2-[[2-[[2-[(2-amino-4-methylpentanoyl)amino]-4-methylpentanoyl]amino]acetyl]amino]-3-phenylpropanoic acid Chemical compound CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JUEUYDRZJNQZGR-UHFFFAOYSA-N 0.000 description 2
- IMIZPWSVYADSCN-UHFFFAOYSA-N 4-methyl-2-[[4-methyl-2-[[4-methyl-2-(pyrrolidine-2-carbonylamino)pentanoyl]amino]pentanoyl]amino]pentanoic acid Chemical compound CC(C)CC(C(O)=O)NC(=O)C(CC(C)C)NC(=O)C(CC(C)C)NC(=O)C1CCCN1 IMIZPWSVYADSCN-UHFFFAOYSA-N 0.000 description 2
- 108010036211 5-HT-moduline Proteins 0.000 description 2
- 102100033647 Activity-regulated cytoskeleton-associated protein Human genes 0.000 description 2
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 2
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 2
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 2
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 2
- YSMPVONNIWLJML-FXQIFTODSA-N Ala-Asp-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O YSMPVONNIWLJML-FXQIFTODSA-N 0.000 description 2
- WJRXVTCKASUIFF-FXQIFTODSA-N Ala-Cys-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WJRXVTCKASUIFF-FXQIFTODSA-N 0.000 description 2
- DAEFQZCYZKRTLR-ZLUOBGJFSA-N Ala-Cys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O DAEFQZCYZKRTLR-ZLUOBGJFSA-N 0.000 description 2
- LGFCAXJBAZESCF-ACZMJKKPSA-N Ala-Gln-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O LGFCAXJBAZESCF-ACZMJKKPSA-N 0.000 description 2
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 2
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 2
- VBRDBGCROKWTPV-XHNCKOQMSA-N Ala-Glu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N VBRDBGCROKWTPV-XHNCKOQMSA-N 0.000 description 2
- XYTNPQNAZREREP-XQXXSGGOSA-N Ala-Glu-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XYTNPQNAZREREP-XQXXSGGOSA-N 0.000 description 2
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 2
- LBFXVAXPDOBRKU-LKTVYLICSA-N Ala-His-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LBFXVAXPDOBRKU-LKTVYLICSA-N 0.000 description 2
- GSHKMNKPMLXSQW-KBIXCLLPSA-N Ala-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C)N GSHKMNKPMLXSQW-KBIXCLLPSA-N 0.000 description 2
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 2
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 2
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 2
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 2
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 2
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 2
- XUCHENWTTBFODJ-FXQIFTODSA-N Ala-Met-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O XUCHENWTTBFODJ-FXQIFTODSA-N 0.000 description 2
- VHEVVUZDDUCAKU-FXQIFTODSA-N Ala-Met-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O VHEVVUZDDUCAKU-FXQIFTODSA-N 0.000 description 2
- GKAZXNDATBWNBI-DCAQKATOSA-N Ala-Met-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)O)N GKAZXNDATBWNBI-DCAQKATOSA-N 0.000 description 2
- BFMIRJBURUXDRG-DLOVCJGASA-N Ala-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 BFMIRJBURUXDRG-DLOVCJGASA-N 0.000 description 2
- CNQAFFMNJIQYGX-DRZSPHRISA-N Ala-Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 CNQAFFMNJIQYGX-DRZSPHRISA-N 0.000 description 2
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 2
- MAZZQZWCCYJQGZ-GUBZILKMSA-N Ala-Pro-Arg Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MAZZQZWCCYJQGZ-GUBZILKMSA-N 0.000 description 2
- DYJJJCHDHLEFDW-FXQIFTODSA-N Ala-Pro-Cys Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N DYJJJCHDHLEFDW-FXQIFTODSA-N 0.000 description 2
- VJVQKGYHIZPSNS-FXQIFTODSA-N Ala-Ser-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N VJVQKGYHIZPSNS-FXQIFTODSA-N 0.000 description 2
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 2
- DYXOFPBJBAHWFY-JBDRJPRFSA-N Ala-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N DYXOFPBJBAHWFY-JBDRJPRFSA-N 0.000 description 2
- OMCKWYSDUQBYCN-FXQIFTODSA-N Ala-Ser-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O OMCKWYSDUQBYCN-FXQIFTODSA-N 0.000 description 2
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 2
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 2
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 2
- VQBULXOHAZSTQY-GKCIPKSASA-N Ala-Trp-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VQBULXOHAZSTQY-GKCIPKSASA-N 0.000 description 2
- SFPRJVVDZNLUTG-OWLDWWDNSA-N Ala-Trp-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SFPRJVVDZNLUTG-OWLDWWDNSA-N 0.000 description 2
- MUGAESARFRGOTQ-IGNZVWTISA-N Ala-Tyr-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N MUGAESARFRGOTQ-IGNZVWTISA-N 0.000 description 2
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 2
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 2
- 102100031317 Alpha-N-acetylgalactosaminidase Human genes 0.000 description 2
- 208000031277 Amaurotic familial idiocy Diseases 0.000 description 2
- 108020005544 Antisense RNA Proteins 0.000 description 2
- SGYSTDWPNPKJPP-GUBZILKMSA-N Arg-Ala-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SGYSTDWPNPKJPP-GUBZILKMSA-N 0.000 description 2
- DFCIPNHFKOQAME-FXQIFTODSA-N Arg-Ala-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFCIPNHFKOQAME-FXQIFTODSA-N 0.000 description 2
- KWKQGHSSNHPGOW-BQBZGAKWSA-N Arg-Ala-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)NCC(O)=O KWKQGHSSNHPGOW-BQBZGAKWSA-N 0.000 description 2
- SBVJJNJLFWSJOV-UBHSHLNASA-N Arg-Ala-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SBVJJNJLFWSJOV-UBHSHLNASA-N 0.000 description 2
- OTOXOKCIIQLMFH-KZVJFYERSA-N Arg-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N OTOXOKCIIQLMFH-KZVJFYERSA-N 0.000 description 2
- XPSGESXVBSQZPL-SRVKXCTJSA-N Arg-Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XPSGESXVBSQZPL-SRVKXCTJSA-N 0.000 description 2
- VWVPYNGMOCSSGK-GUBZILKMSA-N Arg-Arg-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O VWVPYNGMOCSSGK-GUBZILKMSA-N 0.000 description 2
- XEPSCVXTCUUHDT-AVGNSLFASA-N Arg-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCCN=C(N)N XEPSCVXTCUUHDT-AVGNSLFASA-N 0.000 description 2
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 2
- YUIGJDNAGKJLDO-JYJNAYRXSA-N Arg-Arg-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YUIGJDNAGKJLDO-JYJNAYRXSA-N 0.000 description 2
- RVDVDRUZWZIBJQ-CIUDSAMLSA-N Arg-Asn-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O RVDVDRUZWZIBJQ-CIUDSAMLSA-N 0.000 description 2
- OTUQSEPIIVBYEM-IHRRRGAJSA-N Arg-Asn-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OTUQSEPIIVBYEM-IHRRRGAJSA-N 0.000 description 2
- NTAZNGWBXRVEDJ-FXQIFTODSA-N Arg-Asp-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NTAZNGWBXRVEDJ-FXQIFTODSA-N 0.000 description 2
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 2
- JVMKBJNSRZWDBO-FXQIFTODSA-N Arg-Cys-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O JVMKBJNSRZWDBO-FXQIFTODSA-N 0.000 description 2
- HPKSHFSEXICTLI-CIUDSAMLSA-N Arg-Glu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HPKSHFSEXICTLI-CIUDSAMLSA-N 0.000 description 2
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 2
- NKBQZKVMKJJDLX-SRVKXCTJSA-N Arg-Glu-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NKBQZKVMKJJDLX-SRVKXCTJSA-N 0.000 description 2
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 2
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 2
- HAVKMRGWNXMCDR-STQMWFEESA-N Arg-Gly-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HAVKMRGWNXMCDR-STQMWFEESA-N 0.000 description 2
- ZJEDSBGPBXVBMP-PYJNHQTQSA-N Arg-His-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZJEDSBGPBXVBMP-PYJNHQTQSA-N 0.000 description 2
- UBCPNBUIQNMDNH-NAKRPEOUSA-N Arg-Ile-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O UBCPNBUIQNMDNH-NAKRPEOUSA-N 0.000 description 2
- GNYUVVJYGJFKHN-RVMXOQNASA-N Arg-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N GNYUVVJYGJFKHN-RVMXOQNASA-N 0.000 description 2
- GMFAGHNRXPSSJS-SRVKXCTJSA-N Arg-Leu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GMFAGHNRXPSSJS-SRVKXCTJSA-N 0.000 description 2
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 2
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 2
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 2
- ZRNWJUAQKFUUKV-SRVKXCTJSA-N Arg-Met-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(O)=O ZRNWJUAQKFUUKV-SRVKXCTJSA-N 0.000 description 2
- BSGSDLYGGHGMND-IHRRRGAJSA-N Arg-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N BSGSDLYGGHGMND-IHRRRGAJSA-N 0.000 description 2
- DPLFNLDACGGBAK-KKUMJFAQSA-N Arg-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N DPLFNLDACGGBAK-KKUMJFAQSA-N 0.000 description 2
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 2
- SLQQPJBDBVPVQV-JYJNAYRXSA-N Arg-Phe-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O SLQQPJBDBVPVQV-JYJNAYRXSA-N 0.000 description 2
- BSYKSCBTTQKOJG-GUBZILKMSA-N Arg-Pro-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BSYKSCBTTQKOJG-GUBZILKMSA-N 0.000 description 2
- XSPKAHFVDKRGRL-DCAQKATOSA-N Arg-Pro-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XSPKAHFVDKRGRL-DCAQKATOSA-N 0.000 description 2
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 2
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 2
- ICRHGPYYXMWHIE-LPEHRKFASA-N Arg-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ICRHGPYYXMWHIE-LPEHRKFASA-N 0.000 description 2
- OQPAZKMGCWPERI-GUBZILKMSA-N Arg-Ser-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OQPAZKMGCWPERI-GUBZILKMSA-N 0.000 description 2
- AIFHRTPABBBHKU-RCWTZXSCSA-N Arg-Thr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AIFHRTPABBBHKU-RCWTZXSCSA-N 0.000 description 2
- AUZAXCPWMDBWEE-HJGDQZAQSA-N Arg-Thr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O AUZAXCPWMDBWEE-HJGDQZAQSA-N 0.000 description 2
- OGZBJJLRKQZRHL-KJEVXHAQSA-N Arg-Thr-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OGZBJJLRKQZRHL-KJEVXHAQSA-N 0.000 description 2
- XRNXPIGJPQHCPC-RCWTZXSCSA-N Arg-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)O)C(O)=O XRNXPIGJPQHCPC-RCWTZXSCSA-N 0.000 description 2
- BFDDUDQCPJWQRQ-IHRRRGAJSA-N Arg-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O BFDDUDQCPJWQRQ-IHRRRGAJSA-N 0.000 description 2
- ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N Asn-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O ZZXMOQIUIJJOKZ-ZLUOBGJFSA-N 0.000 description 2
- KXEGPPNPXOKKHK-ZLUOBGJFSA-N Asn-Asp-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KXEGPPNPXOKKHK-ZLUOBGJFSA-N 0.000 description 2
- XSGBIBGAMKTHMY-WHFBIAKZSA-N Asn-Asp-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O XSGBIBGAMKTHMY-WHFBIAKZSA-N 0.000 description 2
- JZRLLSOWDYUKOK-SRVKXCTJSA-N Asn-Asp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N JZRLLSOWDYUKOK-SRVKXCTJSA-N 0.000 description 2
- HLTLEIXYIJDFOY-ZLUOBGJFSA-N Asn-Cys-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O HLTLEIXYIJDFOY-ZLUOBGJFSA-N 0.000 description 2
- FAEFJTCTNZTPHX-ACZMJKKPSA-N Asn-Gln-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FAEFJTCTNZTPHX-ACZMJKKPSA-N 0.000 description 2
- SRUUBQBAVNQZGJ-LAEOZQHASA-N Asn-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N SRUUBQBAVNQZGJ-LAEOZQHASA-N 0.000 description 2
- MECFLTFREHAZLH-ACZMJKKPSA-N Asn-Glu-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N MECFLTFREHAZLH-ACZMJKKPSA-N 0.000 description 2
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 2
- OFQPMRDJVWLMNJ-CIUDSAMLSA-N Asn-His-Cys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N OFQPMRDJVWLMNJ-CIUDSAMLSA-N 0.000 description 2
- AITGTTNYKAWKDR-CIUDSAMLSA-N Asn-His-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O AITGTTNYKAWKDR-CIUDSAMLSA-N 0.000 description 2
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 2
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 2
- QGABLMITFKUQDF-DCAQKATOSA-N Asn-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N QGABLMITFKUQDF-DCAQKATOSA-N 0.000 description 2
- CDGHMJJJHYKMPA-DLOVCJGASA-N Asn-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)N)N CDGHMJJJHYKMPA-DLOVCJGASA-N 0.000 description 2
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 2
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 2
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 2
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 2
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 2
- GVPSCJQLUGIKAM-GUBZILKMSA-N Asp-Arg-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GVPSCJQLUGIKAM-GUBZILKMSA-N 0.000 description 2
- HMQDRBKQMLRCCG-GMOBBJLQSA-N Asp-Arg-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HMQDRBKQMLRCCG-GMOBBJLQSA-N 0.000 description 2
- MUWDILPCTSMUHI-ZLUOBGJFSA-N Asp-Asn-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N)C(=O)O MUWDILPCTSMUHI-ZLUOBGJFSA-N 0.000 description 2
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 2
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 2
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 2
- FMWHSNJMHUNLAG-FXQIFTODSA-N Asp-Cys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FMWHSNJMHUNLAG-FXQIFTODSA-N 0.000 description 2
- QQXOYLWJQUPXJU-WHFBIAKZSA-N Asp-Cys-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O QQXOYLWJQUPXJU-WHFBIAKZSA-N 0.000 description 2
- IAMNNSSEBXDJMN-CIUDSAMLSA-N Asp-Cys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N IAMNNSSEBXDJMN-CIUDSAMLSA-N 0.000 description 2
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 2
- LTXGDRFJRZSZAV-CIUDSAMLSA-N Asp-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N LTXGDRFJRZSZAV-CIUDSAMLSA-N 0.000 description 2
- ZEDBMCPXPIYJLW-XHNCKOQMSA-N Asp-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O ZEDBMCPXPIYJLW-XHNCKOQMSA-N 0.000 description 2
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 2
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 2
- PGUYEUCYVNZGGV-QWRGUYRKSA-N Asp-Gly-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PGUYEUCYVNZGGV-QWRGUYRKSA-N 0.000 description 2
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 2
- KLYPOCBLKMPBIQ-GHCJXIJMSA-N Asp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N KLYPOCBLKMPBIQ-GHCJXIJMSA-N 0.000 description 2
- XLILXFRAKOYEJX-GUBZILKMSA-N Asp-Leu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLILXFRAKOYEJX-GUBZILKMSA-N 0.000 description 2
- JTRDJYIZIKCIRC-AJNGGQMLSA-N Asp-Leu-Leu-Gln Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JTRDJYIZIKCIRC-AJNGGQMLSA-N 0.000 description 2
- DJCAHYVLMSRBFR-QXEWZRGKSA-N Asp-Met-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(O)=O DJCAHYVLMSRBFR-QXEWZRGKSA-N 0.000 description 2
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 2
- GPPIDDWYKJPRES-YDHLFZDLSA-N Asp-Phe-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O GPPIDDWYKJPRES-YDHLFZDLSA-N 0.000 description 2
- AHWRSSLYSGLBGD-CIUDSAMLSA-N Asp-Pro-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AHWRSSLYSGLBGD-CIUDSAMLSA-N 0.000 description 2
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 2
- NBKLEMWHDLAUEM-CIUDSAMLSA-N Asp-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N NBKLEMWHDLAUEM-CIUDSAMLSA-N 0.000 description 2
- PDIYGFYAMZZFCW-JIOCBJNQSA-N Asp-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N)O PDIYGFYAMZZFCW-JIOCBJNQSA-N 0.000 description 2
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 2
- KCOPOPKJRHVGPE-AQZXSJQPSA-N Asp-Thr-Trp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O KCOPOPKJRHVGPE-AQZXSJQPSA-N 0.000 description 2
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 2
- GFYOIYJJMSHLSN-QXEWZRGKSA-N Asp-Val-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GFYOIYJJMSHLSN-QXEWZRGKSA-N 0.000 description 2
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 2
- RKXVTTIQNKPCHU-KKHAAJSZSA-N Asp-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O RKXVTTIQNKPCHU-KKHAAJSZSA-N 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 101800003171 Casoparan Proteins 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 102100038385 Coiled-coil domain-containing protein R3HCC1L Human genes 0.000 description 2
- PKNIZMPLMSKROD-BIIVOSGPSA-N Cys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N PKNIZMPLMSKROD-BIIVOSGPSA-N 0.000 description 2
- QDFBJJABJKOLTD-FXQIFTODSA-N Cys-Asn-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QDFBJJABJKOLTD-FXQIFTODSA-N 0.000 description 2
- UISYPAHPLXGLNH-ACZMJKKPSA-N Cys-Asn-Gln Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O UISYPAHPLXGLNH-ACZMJKKPSA-N 0.000 description 2
- OIMUAKUQOUEPCZ-WHFBIAKZSA-N Cys-Asn-Gly Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIMUAKUQOUEPCZ-WHFBIAKZSA-N 0.000 description 2
- ZIKWRNJXFIQECJ-CIUDSAMLSA-N Cys-Cys-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O ZIKWRNJXFIQECJ-CIUDSAMLSA-N 0.000 description 2
- URDUGPGPLNXXES-WHFBIAKZSA-N Cys-Gly-Cys Chemical compound SC[C@H](N)C(=O)NCC(=O)N[C@@H](CS)C(O)=O URDUGPGPLNXXES-WHFBIAKZSA-N 0.000 description 2
- WTNLLMQAFPOCTJ-GARJFASQSA-N Cys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CS)N)C(=O)O WTNLLMQAFPOCTJ-GARJFASQSA-N 0.000 description 2
- WVLZTXGTNGHPBO-SRVKXCTJSA-N Cys-Leu-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O WVLZTXGTNGHPBO-SRVKXCTJSA-N 0.000 description 2
- MKMKILWCRQLDFJ-DCAQKATOSA-N Cys-Lys-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MKMKILWCRQLDFJ-DCAQKATOSA-N 0.000 description 2
- VXLXATVURDNDCG-CIUDSAMLSA-N Cys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N VXLXATVURDNDCG-CIUDSAMLSA-N 0.000 description 2
- VDUPGIDTWNQAJD-CIUDSAMLSA-N Cys-Lys-Cys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CS)C(O)=O VDUPGIDTWNQAJD-CIUDSAMLSA-N 0.000 description 2
- VOBMMKMWSIVIOA-SRVKXCTJSA-N Cys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N VOBMMKMWSIVIOA-SRVKXCTJSA-N 0.000 description 2
- TXCCRYAZQBUCOV-CIUDSAMLSA-N Cys-Pro-Gln Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O TXCCRYAZQBUCOV-CIUDSAMLSA-N 0.000 description 2
- HMWBPUDETPKSSS-DCAQKATOSA-N Cys-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CS)N)C(=O)N[C@@H](CCCCN)C(=O)O HMWBPUDETPKSSS-DCAQKATOSA-N 0.000 description 2
- BCWIFCLVCRAIQK-ZLUOBGJFSA-N Cys-Ser-Cys Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CS)N)O BCWIFCLVCRAIQK-ZLUOBGJFSA-N 0.000 description 2
- SRZZZTMJARUVPI-JBDRJPRFSA-N Cys-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N SRZZZTMJARUVPI-JBDRJPRFSA-N 0.000 description 2
- ABLQPNMKLMFDQU-BIIVOSGPSA-N Cys-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CS)N)C(=O)O ABLQPNMKLMFDQU-BIIVOSGPSA-N 0.000 description 2
- QQAYIVHVRFJICE-AEJSXWLSSA-N Cys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N QQAYIVHVRFJICE-AEJSXWLSSA-N 0.000 description 2
- 108020003215 DNA Probes Proteins 0.000 description 2
- 239000003155 DNA primer Substances 0.000 description 2
- 238000001712 DNA sequencing Methods 0.000 description 2
- 239000006144 Dulbecco's modified Eagle's medium Substances 0.000 description 2
- 208000034454 F12-related hereditary angioedema with normal C1Inh Diseases 0.000 description 2
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 2
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 2
- RZSLYUUFFVHFRQ-FXQIFTODSA-N Gln-Ala-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O RZSLYUUFFVHFRQ-FXQIFTODSA-N 0.000 description 2
- LKUWAWGNJYJODH-KBIXCLLPSA-N Gln-Ala-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LKUWAWGNJYJODH-KBIXCLLPSA-N 0.000 description 2
- LJEPDHWNQXPXMM-NHCYSSNCSA-N Gln-Arg-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O LJEPDHWNQXPXMM-NHCYSSNCSA-N 0.000 description 2
- AAOBFSKXAVIORT-GUBZILKMSA-N Gln-Asn-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O AAOBFSKXAVIORT-GUBZILKMSA-N 0.000 description 2
- KYFSMWLWHYZRNW-ACZMJKKPSA-N Gln-Asp-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N KYFSMWLWHYZRNW-ACZMJKKPSA-N 0.000 description 2
- IXFVOPOHSRKJNG-LAEOZQHASA-N Gln-Asp-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IXFVOPOHSRKJNG-LAEOZQHASA-N 0.000 description 2
- GNDJOCGXGLNCKY-ACZMJKKPSA-N Gln-Cys-Cys Chemical compound N[C@@H](CCC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O GNDJOCGXGLNCKY-ACZMJKKPSA-N 0.000 description 2
- QFTRCUPCARNIPZ-XHNCKOQMSA-N Gln-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)N)N)C(=O)O QFTRCUPCARNIPZ-XHNCKOQMSA-N 0.000 description 2
- BLOXULLYFRGYKZ-GUBZILKMSA-N Gln-Glu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BLOXULLYFRGYKZ-GUBZILKMSA-N 0.000 description 2
- MAGNEQBFSBREJL-DCAQKATOSA-N Gln-Glu-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N MAGNEQBFSBREJL-DCAQKATOSA-N 0.000 description 2
- LVSYIKGMLRHKME-IUCAKERBSA-N Gln-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N LVSYIKGMLRHKME-IUCAKERBSA-N 0.000 description 2
- JXFLPKSDLDEOQK-JHEQGTHGSA-N Gln-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O JXFLPKSDLDEOQK-JHEQGTHGSA-N 0.000 description 2
- TWTWUBHEWQPMQW-ZPFDUUQYSA-N Gln-Ile-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWTWUBHEWQPMQW-ZPFDUUQYSA-N 0.000 description 2
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 2
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 2
- QKCZZAZNMMVICF-DCAQKATOSA-N Gln-Leu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O QKCZZAZNMMVICF-DCAQKATOSA-N 0.000 description 2
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 2
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 2
- YPMDZWPZFOZYFG-GUBZILKMSA-N Gln-Leu-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YPMDZWPZFOZYFG-GUBZILKMSA-N 0.000 description 2
- ATTWDCRXQNKRII-GUBZILKMSA-N Gln-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ATTWDCRXQNKRII-GUBZILKMSA-N 0.000 description 2
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 2
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 2
- JRHPEMVLTRADLJ-AVGNSLFASA-N Gln-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JRHPEMVLTRADLJ-AVGNSLFASA-N 0.000 description 2
- ZXGLLNZQSBLQLT-SRVKXCTJSA-N Gln-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZXGLLNZQSBLQLT-SRVKXCTJSA-N 0.000 description 2
- QBEWLBKBGXVVPD-RYUDHWBXSA-N Gln-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N QBEWLBKBGXVVPD-RYUDHWBXSA-N 0.000 description 2
- QFXNFFZTMFHPST-DZKIICNBSA-N Gln-Phe-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCC(=O)N)N QFXNFFZTMFHPST-DZKIICNBSA-N 0.000 description 2
- XUMFMAVDHQDATI-DCAQKATOSA-N Gln-Pro-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XUMFMAVDHQDATI-DCAQKATOSA-N 0.000 description 2
- RWQCWSGOOOEGPB-FXQIFTODSA-N Gln-Ser-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O RWQCWSGOOOEGPB-FXQIFTODSA-N 0.000 description 2
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 2
- ININBLZFFVOQIO-JHEQGTHGSA-N Gln-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O ININBLZFFVOQIO-JHEQGTHGSA-N 0.000 description 2
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 2
- WZZSKAJIHTUUSG-ACZMJKKPSA-N Glu-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O WZZSKAJIHTUUSG-ACZMJKKPSA-N 0.000 description 2
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 2
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 2
- WOMUDRVDJMHTCV-DCAQKATOSA-N Glu-Arg-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOMUDRVDJMHTCV-DCAQKATOSA-N 0.000 description 2
- IYAUFWMUCGBFMQ-CIUDSAMLSA-N Glu-Arg-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)CN=C(N)N IYAUFWMUCGBFMQ-CIUDSAMLSA-N 0.000 description 2
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 2
- SBYVDRJAXWSXQL-AVGNSLFASA-N Glu-Asn-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SBYVDRJAXWSXQL-AVGNSLFASA-N 0.000 description 2
- ZJICFHQSPWFBKP-AVGNSLFASA-N Glu-Asn-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZJICFHQSPWFBKP-AVGNSLFASA-N 0.000 description 2
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 2
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 2
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 2
- FLQAKQOBSPFGKG-CIUDSAMLSA-N Glu-Cys-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FLQAKQOBSPFGKG-CIUDSAMLSA-N 0.000 description 2
- XHWLNISLUFEWNS-CIUDSAMLSA-N Glu-Gln-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O XHWLNISLUFEWNS-CIUDSAMLSA-N 0.000 description 2
- HTTSBEBKVNEDFE-AUTRQRHGSA-N Glu-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N HTTSBEBKVNEDFE-AUTRQRHGSA-N 0.000 description 2
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 2
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 2
- HILMIYALTUQTRC-XVKPBYJWSA-N Glu-Gly-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HILMIYALTUQTRC-XVKPBYJWSA-N 0.000 description 2
- XOIATPHFYVWFEU-DCAQKATOSA-N Glu-His-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOIATPHFYVWFEU-DCAQKATOSA-N 0.000 description 2
- WVYJNPCWJYBHJG-YVNDNENWSA-N Glu-Ile-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O WVYJNPCWJYBHJG-YVNDNENWSA-N 0.000 description 2
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 2
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 2
- FBEJIDRSQCGFJI-GUBZILKMSA-N Glu-Leu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FBEJIDRSQCGFJI-GUBZILKMSA-N 0.000 description 2
- IOUQWHIEQYQVFD-JYJNAYRXSA-N Glu-Leu-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IOUQWHIEQYQVFD-JYJNAYRXSA-N 0.000 description 2
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 2
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 2
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 2
- SWDNPSMMEWRNOH-HJGDQZAQSA-N Glu-Pro-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWDNPSMMEWRNOH-HJGDQZAQSA-N 0.000 description 2
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 2
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 2
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 2
- QEJKKJNDDDPSMU-KKUMJFAQSA-N Glu-Tyr-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCSC)C(O)=O QEJKKJNDDDPSMU-KKUMJFAQSA-N 0.000 description 2
- UZWUBBRJWFTHTD-LAEOZQHASA-N Glu-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O UZWUBBRJWFTHTD-LAEOZQHASA-N 0.000 description 2
- FGGKGJHCVMYGCD-UKJIMTQDSA-N Glu-Val-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FGGKGJHCVMYGCD-UKJIMTQDSA-N 0.000 description 2
- GQGAFTPXAPKSCF-WHFBIAKZSA-N Gly-Ala-Cys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(=O)O GQGAFTPXAPKSCF-WHFBIAKZSA-N 0.000 description 2
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 2
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 2
- MZZSCEANQDPJER-ONGXEEELSA-N Gly-Ala-Phe Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MZZSCEANQDPJER-ONGXEEELSA-N 0.000 description 2
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- LERGJIVJIIODPZ-ZANVPECISA-N Gly-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)C)C(O)=O)=CNC2=C1 LERGJIVJIIODPZ-ZANVPECISA-N 0.000 description 2
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 2
- KFMBRBPXHVMDFN-UWVGGRQHSA-N Gly-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCNC(N)=N KFMBRBPXHVMDFN-UWVGGRQHSA-N 0.000 description 2
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 2
- QSTLUOIOYLYLLF-WDSKDSINSA-N Gly-Asp-Glu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QSTLUOIOYLYLLF-WDSKDSINSA-N 0.000 description 2
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 2
- GVVKYKCOFMMTKZ-WHFBIAKZSA-N Gly-Cys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CS)NC(=O)CN GVVKYKCOFMMTKZ-WHFBIAKZSA-N 0.000 description 2
- YDWZGVCXMVLDQH-WHFBIAKZSA-N Gly-Cys-Asn Chemical compound NCC(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC(N)=O YDWZGVCXMVLDQH-WHFBIAKZSA-N 0.000 description 2
- PEZZSFLFXXFUQD-XPUUQOCRSA-N Gly-Cys-Val Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O PEZZSFLFXXFUQD-XPUUQOCRSA-N 0.000 description 2
- VUUOMYFPWDYETE-WDSKDSINSA-N Gly-Gln-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN VUUOMYFPWDYETE-WDSKDSINSA-N 0.000 description 2
- KTSZUNRRYXPZTK-BQBZGAKWSA-N Gly-Gln-Glu Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KTSZUNRRYXPZTK-BQBZGAKWSA-N 0.000 description 2
- CUYLIWAAAYJKJH-RYUDHWBXSA-N Gly-Glu-Tyr Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CUYLIWAAAYJKJH-RYUDHWBXSA-N 0.000 description 2
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 2
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 2
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 2
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 2
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 2
- HPAIKDPJURGQLN-KBPBESRZSA-N Gly-His-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 HPAIKDPJURGQLN-KBPBESRZSA-N 0.000 description 2
- LPCKHUXOGVNZRS-YUMQZZPRSA-N Gly-His-Ser Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O LPCKHUXOGVNZRS-YUMQZZPRSA-N 0.000 description 2
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 2
- YIFUFYZELCMPJP-YUMQZZPRSA-N Gly-Leu-Cys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O YIFUFYZELCMPJP-YUMQZZPRSA-N 0.000 description 2
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 2
- TVUWMSBGMVAHSJ-KBPBESRZSA-N Gly-Leu-Phe Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TVUWMSBGMVAHSJ-KBPBESRZSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- MHZXESQPPXOING-KBPBESRZSA-N Gly-Lys-Phe Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MHZXESQPPXOING-KBPBESRZSA-N 0.000 description 2
- HHRODZSXDXMUHS-LURJTMIESA-N Gly-Met-Gly Chemical compound CSCC[C@H](NC(=O)C[NH3+])C(=O)NCC([O-])=O HHRODZSXDXMUHS-LURJTMIESA-N 0.000 description 2
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 2
- MTBIKIMYHUWBRX-QWRGUYRKSA-N Gly-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN MTBIKIMYHUWBRX-QWRGUYRKSA-N 0.000 description 2
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 2
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 2
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 2
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 2
- CAVKXZMMDNOZJU-UHFFFAOYSA-N Gly-Pro-Ala-Gly-Pro Natural products C1CCC(C(O)=O)N1C(=O)CNC(=O)C(C)NC(=O)C1CCCN1C(=O)CN CAVKXZMMDNOZJU-UHFFFAOYSA-N 0.000 description 2
- JJGBXTYGTKWGAT-YUMQZZPRSA-N Gly-Pro-Glu Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O JJGBXTYGTKWGAT-YUMQZZPRSA-N 0.000 description 2
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 2
- ZZJVYSAQQMDIRD-UWVGGRQHSA-N Gly-Pro-His Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O ZZJVYSAQQMDIRD-UWVGGRQHSA-N 0.000 description 2
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 2
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 2
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 2
- ZKJZBRHRWKLVSJ-ZDLURKLDSA-N Gly-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O ZKJZBRHRWKLVSJ-ZDLURKLDSA-N 0.000 description 2
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 2
- CUVBTVWFVIIDOC-YEPSODPASA-N Gly-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)CN CUVBTVWFVIIDOC-YEPSODPASA-N 0.000 description 2
- KBBFOULZCHWGJX-KBPBESRZSA-N Gly-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN)O KBBFOULZCHWGJX-KBPBESRZSA-N 0.000 description 2
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 2
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 2
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 2
- MJNWEIMBXKKCSF-XVYDVKMFSA-N His-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N MJNWEIMBXKKCSF-XVYDVKMFSA-N 0.000 description 2
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 2
- DZMVESFTHXSSPZ-XVYDVKMFSA-N His-Ala-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DZMVESFTHXSSPZ-XVYDVKMFSA-N 0.000 description 2
- HXKZJLWGSWQKEA-LSJOCFKGSA-N His-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CN=CN1 HXKZJLWGSWQKEA-LSJOCFKGSA-N 0.000 description 2
- JBJNKUOMNZGQIM-PYJNHQTQSA-N His-Arg-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JBJNKUOMNZGQIM-PYJNHQTQSA-N 0.000 description 2
- PROLDOGUBQJNPG-RWMBFGLXSA-N His-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O PROLDOGUBQJNPG-RWMBFGLXSA-N 0.000 description 2
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 2
- RXVOMIADLXPJGW-GUBZILKMSA-N His-Asp-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O RXVOMIADLXPJGW-GUBZILKMSA-N 0.000 description 2
- STWGDDDFLUFCCA-GVXVVHGQSA-N His-Glu-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O STWGDDDFLUFCCA-GVXVVHGQSA-N 0.000 description 2
- PYNUBZSXKQKAHL-UWVGGRQHSA-N His-Gly-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O PYNUBZSXKQKAHL-UWVGGRQHSA-N 0.000 description 2
- BDFCIKANUNMFGB-PMVVWTBXSA-N His-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 BDFCIKANUNMFGB-PMVVWTBXSA-N 0.000 description 2
- CSTNMMIHMYJGFR-IHRRRGAJSA-N His-His-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CN=CN1 CSTNMMIHMYJGFR-IHRRRGAJSA-N 0.000 description 2
- STOOMQFEJUVAKR-KKUMJFAQSA-N His-His-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CNC=N1 STOOMQFEJUVAKR-KKUMJFAQSA-N 0.000 description 2
- BILZDIPAKWZFSG-PYJNHQTQSA-N His-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N BILZDIPAKWZFSG-PYJNHQTQSA-N 0.000 description 2
- IWXMHXYOACDSIA-PYJNHQTQSA-N His-Ile-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O IWXMHXYOACDSIA-PYJNHQTQSA-N 0.000 description 2
- AIPUZFXMXAHZKY-QWRGUYRKSA-N His-Leu-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AIPUZFXMXAHZKY-QWRGUYRKSA-N 0.000 description 2
- RNMNYMDTESKEAJ-KKUMJFAQSA-N His-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 RNMNYMDTESKEAJ-KKUMJFAQSA-N 0.000 description 2
- GJMHMDKCJPQJOI-IHRRRGAJSA-N His-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CN=CN1 GJMHMDKCJPQJOI-IHRRRGAJSA-N 0.000 description 2
- XKIYNCLILDLGRS-QWRGUYRKSA-N His-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CN=CN1 XKIYNCLILDLGRS-QWRGUYRKSA-N 0.000 description 2
- YVCGJPIKRMGNPA-LSJOCFKGSA-N His-Met-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O YVCGJPIKRMGNPA-LSJOCFKGSA-N 0.000 description 2
- SLFSYFJKSIVSON-SRVKXCTJSA-N His-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N SLFSYFJKSIVSON-SRVKXCTJSA-N 0.000 description 2
- AJTBOTWDSRSUDV-ULQDDVLXSA-N His-Phe-Met Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O AJTBOTWDSRSUDV-ULQDDVLXSA-N 0.000 description 2
- BFOGZWSSGMLYKV-DCAQKATOSA-N His-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CN=CN1)N BFOGZWSSGMLYKV-DCAQKATOSA-N 0.000 description 2
- YBDOQKVAGTWZMI-XIRDDKMYSA-N His-Trp-Ser Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N YBDOQKVAGTWZMI-XIRDDKMYSA-N 0.000 description 2
- PZUZIHRPOVVHOT-KBPBESRZSA-N His-Tyr-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)NCC(O)=O)C1=CN=CN1 PZUZIHRPOVVHOT-KBPBESRZSA-N 0.000 description 2
- QTMKFZAYZKBFRC-BZSNNMDCSA-N His-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N)O QTMKFZAYZKBFRC-BZSNNMDCSA-N 0.000 description 2
- KFQDSSNYWKZFOO-LSJOCFKGSA-N His-Val-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KFQDSSNYWKZFOO-LSJOCFKGSA-N 0.000 description 2
- GGXUJBKENKVYNV-ULQDDVLXSA-N His-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N GGXUJBKENKVYNV-ULQDDVLXSA-N 0.000 description 2
- 108010072039 Histidine kinase Proteins 0.000 description 2
- 101000743767 Homo sapiens Coiled-coil domain-containing protein R3HCC1L Proteins 0.000 description 2
- YPWHUFAAMNHMGS-QSFUFRPTSA-N Ile-Ala-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N YPWHUFAAMNHMGS-QSFUFRPTSA-N 0.000 description 2
- QICVAHODWHIWIS-HTFCKZLJSA-N Ile-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N QICVAHODWHIWIS-HTFCKZLJSA-N 0.000 description 2
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 2
- FVEWRQXNISSYFO-ZPFDUUQYSA-N Ile-Arg-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FVEWRQXNISSYFO-ZPFDUUQYSA-N 0.000 description 2
- CWJQMCPYXNVMBS-STECZYCISA-N Ile-Arg-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N CWJQMCPYXNVMBS-STECZYCISA-N 0.000 description 2
- UKTUOMWSJPXODT-GUDRVLHUSA-N Ile-Asn-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N UKTUOMWSJPXODT-GUDRVLHUSA-N 0.000 description 2
- LLZLRXBTOOFODM-QSFUFRPTSA-N Ile-Asp-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N LLZLRXBTOOFODM-QSFUFRPTSA-N 0.000 description 2
- YBJWJQQBWRARLT-KBIXCLLPSA-N Ile-Gln-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O YBJWJQQBWRARLT-KBIXCLLPSA-N 0.000 description 2
- IGJWJGIHUFQANP-LAEOZQHASA-N Ile-Gly-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N IGJWJGIHUFQANP-LAEOZQHASA-N 0.000 description 2
- MQFGXJNSUJTXDT-QSFUFRPTSA-N Ile-Gly-Ile Chemical compound N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)O MQFGXJNSUJTXDT-QSFUFRPTSA-N 0.000 description 2
- VOBYAKCXGQQFLR-LSJOCFKGSA-N Ile-Gly-Val Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O VOBYAKCXGQQFLR-LSJOCFKGSA-N 0.000 description 2
- WIZPFZKOFZXDQG-HTFCKZLJSA-N Ile-Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O WIZPFZKOFZXDQG-HTFCKZLJSA-N 0.000 description 2
- YNMQUIVKEFRCPH-QSFUFRPTSA-N Ile-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)O)N YNMQUIVKEFRCPH-QSFUFRPTSA-N 0.000 description 2
- TWPSALMCEHCIOY-YTFOTSKYSA-N Ile-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(=O)O)N TWPSALMCEHCIOY-YTFOTSKYSA-N 0.000 description 2
- AXNGDPAKKCEKGY-QPHKQPEJSA-N Ile-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N AXNGDPAKKCEKGY-QPHKQPEJSA-N 0.000 description 2
- KLBVGHCGHUNHEA-BJDJZHNGSA-N Ile-Leu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)O)N KLBVGHCGHUNHEA-BJDJZHNGSA-N 0.000 description 2
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 2
- NUKXXNFEUZGPRO-BJDJZHNGSA-N Ile-Leu-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N NUKXXNFEUZGPRO-BJDJZHNGSA-N 0.000 description 2
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 2
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 2
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 2
- RMNMUUCYTMLWNA-ZPFDUUQYSA-N Ile-Lys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RMNMUUCYTMLWNA-ZPFDUUQYSA-N 0.000 description 2
- FFAUOCITXBMRBT-YTFOTSKYSA-N Ile-Lys-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FFAUOCITXBMRBT-YTFOTSKYSA-N 0.000 description 2
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 2
- UYNXBNHVWFNVIN-HJWJTTGWSA-N Ile-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 UYNXBNHVWFNVIN-HJWJTTGWSA-N 0.000 description 2
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 2
- CAHCWMVNBZJVAW-NAKRPEOUSA-N Ile-Pro-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)O)N CAHCWMVNBZJVAW-NAKRPEOUSA-N 0.000 description 2
- XMYURPUVJSKTMC-KBIXCLLPSA-N Ile-Ser-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XMYURPUVJSKTMC-KBIXCLLPSA-N 0.000 description 2
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 2
- JNLSTRPWUXOORL-MMWGEVLESA-N Ile-Ser-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N JNLSTRPWUXOORL-MMWGEVLESA-N 0.000 description 2
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 2
- RTSQPLLOYSGMKM-DSYPUSFNSA-N Ile-Trp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N RTSQPLLOYSGMKM-DSYPUSFNSA-N 0.000 description 2
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 2
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 2
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 2
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 2
- ZDSNOSQHMJBRQN-SRVKXCTJSA-N Leu-Asp-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZDSNOSQHMJBRQN-SRVKXCTJSA-N 0.000 description 2
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 2
- IASQBRJGRVXNJI-YUMQZZPRSA-N Leu-Cys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)NCC(O)=O IASQBRJGRVXNJI-YUMQZZPRSA-N 0.000 description 2
- YORLGJINWYYIMX-KKUMJFAQSA-N Leu-Cys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YORLGJINWYYIMX-KKUMJFAQSA-N 0.000 description 2
- VPKIQULSKFVCSM-SRVKXCTJSA-N Leu-Gln-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPKIQULSKFVCSM-SRVKXCTJSA-N 0.000 description 2
- GLBNEGIOFRVRHO-JYJNAYRXSA-N Leu-Gln-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLBNEGIOFRVRHO-JYJNAYRXSA-N 0.000 description 2
- CQGSYZCULZMEDE-UHFFFAOYSA-N Leu-Gln-Pro Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)N1CCCC1C(O)=O CQGSYZCULZMEDE-UHFFFAOYSA-N 0.000 description 2
- KVMULWOHPPMHHE-DCAQKATOSA-N Leu-Glu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KVMULWOHPPMHHE-DCAQKATOSA-N 0.000 description 2
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 2
- OGUUKPXUTHOIAV-SDDRHHMPSA-N Leu-Glu-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGUUKPXUTHOIAV-SDDRHHMPSA-N 0.000 description 2
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 2
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 2
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 2
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 2
- DDEMUMVXNFPDKC-SRVKXCTJSA-N Leu-His-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CS)C(=O)O)N DDEMUMVXNFPDKC-SRVKXCTJSA-N 0.000 description 2
- CSFVADKICPDRRF-KKUMJFAQSA-N Leu-His-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CN=CN1 CSFVADKICPDRRF-KKUMJFAQSA-N 0.000 description 2
- OHZIZVWQXJPBJS-IXOXFDKPSA-N Leu-His-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OHZIZVWQXJPBJS-IXOXFDKPSA-N 0.000 description 2
- ORWTWZXGDBYVCP-BJDJZHNGSA-N Leu-Ile-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(C)C ORWTWZXGDBYVCP-BJDJZHNGSA-N 0.000 description 2
- AUBMZAMQCOYSIC-MNXVOIDGSA-N Leu-Ile-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O AUBMZAMQCOYSIC-MNXVOIDGSA-N 0.000 description 2
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 2
- UCNNZELZXFXXJQ-BZSNNMDCSA-N Leu-Leu-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCNNZELZXFXXJQ-BZSNNMDCSA-N 0.000 description 2
- BJWKOATWNQJPSK-SRVKXCTJSA-N Leu-Met-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N BJWKOATWNQJPSK-SRVKXCTJSA-N 0.000 description 2
- NJMXCOOEFLMZSR-AVGNSLFASA-N Leu-Met-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O NJMXCOOEFLMZSR-AVGNSLFASA-N 0.000 description 2
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 2
- YESNGRDJQWDYLH-KKUMJFAQSA-N Leu-Phe-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N YESNGRDJQWDYLH-KKUMJFAQSA-N 0.000 description 2
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 2
- MJWVXZABPOKJJF-ACRUOGEOSA-N Leu-Phe-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MJWVXZABPOKJJF-ACRUOGEOSA-N 0.000 description 2
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 2
- QMKFDEUJGYNFMC-AVGNSLFASA-N Leu-Pro-Arg Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QMKFDEUJGYNFMC-AVGNSLFASA-N 0.000 description 2
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 2
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 2
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 2
- IDGZVZJLYFTXSL-DCAQKATOSA-N Leu-Ser-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IDGZVZJLYFTXSL-DCAQKATOSA-N 0.000 description 2
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 2
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 2
- MVHXGBZUJLWZOH-BJDJZHNGSA-N Leu-Ser-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVHXGBZUJLWZOH-BJDJZHNGSA-N 0.000 description 2
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 2
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 2
- LSLUTXRANSUGFY-XIRDDKMYSA-N Leu-Trp-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O LSLUTXRANSUGFY-XIRDDKMYSA-N 0.000 description 2
- HQBOMRTVKVKFMN-WDSOQIARSA-N Leu-Trp-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(O)=O HQBOMRTVKVKFMN-WDSOQIARSA-N 0.000 description 2
- RDFIVFHPOSOXMW-ACRUOGEOSA-N Leu-Tyr-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RDFIVFHPOSOXMW-ACRUOGEOSA-N 0.000 description 2
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 2
- QQXJROOJCMIHIV-AVGNSLFASA-N Leu-Val-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O QQXJROOJCMIHIV-AVGNSLFASA-N 0.000 description 2
- XOEDPXDZJHBQIX-ULQDDVLXSA-N Leu-Val-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOEDPXDZJHBQIX-ULQDDVLXSA-N 0.000 description 2
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 2
- BTSXLXFPMZXVPR-DLOVCJGASA-N Lys-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N BTSXLXFPMZXVPR-DLOVCJGASA-N 0.000 description 2
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 2
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 2
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 2
- CLBGMWIYPYAZPR-AVGNSLFASA-N Lys-Arg-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O CLBGMWIYPYAZPR-AVGNSLFASA-N 0.000 description 2
- WALVCOOOKULCQM-ULQDDVLXSA-N Lys-Arg-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WALVCOOOKULCQM-ULQDDVLXSA-N 0.000 description 2
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 2
- QIJVAFLRMVBHMU-KKUMJFAQSA-N Lys-Asp-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QIJVAFLRMVBHMU-KKUMJFAQSA-N 0.000 description 2
- NTBFKPBULZGXQL-KKUMJFAQSA-N Lys-Asp-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 2
- SVJRVFPSHPGWFF-DCAQKATOSA-N Lys-Cys-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SVJRVFPSHPGWFF-DCAQKATOSA-N 0.000 description 2
- MWVUEPNEPWMFBD-SRVKXCTJSA-N Lys-Cys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCCCN MWVUEPNEPWMFBD-SRVKXCTJSA-N 0.000 description 2
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 2
- NDORZBUHCOJQDO-GVXVVHGQSA-N Lys-Gln-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O NDORZBUHCOJQDO-GVXVVHGQSA-N 0.000 description 2
- PBIPLDMFHAICIP-DCAQKATOSA-N Lys-Glu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PBIPLDMFHAICIP-DCAQKATOSA-N 0.000 description 2
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 2
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 2
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 2
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 2
- DKTNGXVSCZULPO-YUMQZZPRSA-N Lys-Gly-Cys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CS)C(O)=O DKTNGXVSCZULPO-YUMQZZPRSA-N 0.000 description 2
- ZASPELYMPSACER-HOCLYGCPSA-N Lys-Gly-Trp Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ZASPELYMPSACER-HOCLYGCPSA-N 0.000 description 2
- KEPWSUPUFAPBRF-DKIMLUQUSA-N Lys-Ile-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KEPWSUPUFAPBRF-DKIMLUQUSA-N 0.000 description 2
- IZJGPPIGYTVXLB-FQUUOJAGSA-N Lys-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IZJGPPIGYTVXLB-FQUUOJAGSA-N 0.000 description 2
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 2
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 2
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 2
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 2
- VSTNAUBHKQPVJX-IHRRRGAJSA-N Lys-Met-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O VSTNAUBHKQPVJX-IHRRRGAJSA-N 0.000 description 2
- MSSJJDVQTFTLIF-KBPBESRZSA-N Lys-Phe-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O MSSJJDVQTFTLIF-KBPBESRZSA-N 0.000 description 2
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 2
- YSPZCHGIWAQVKQ-AVGNSLFASA-N Lys-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN YSPZCHGIWAQVKQ-AVGNSLFASA-N 0.000 description 2
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 2
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 2
- WZVSHTFTCYOFPL-GARJFASQSA-N Lys-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCCN)N)C(=O)O WZVSHTFTCYOFPL-GARJFASQSA-N 0.000 description 2
- BVRNWWHJYNPJDG-XIRDDKMYSA-N Lys-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N BVRNWWHJYNPJDG-XIRDDKMYSA-N 0.000 description 2
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 2
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 2
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 2
- MDXAULHWGWETHF-SRVKXCTJSA-N Met-Arg-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CCCNC(N)=N MDXAULHWGWETHF-SRVKXCTJSA-N 0.000 description 2
- UAPZLLPGGOOCRO-IHRRRGAJSA-N Met-Asn-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N UAPZLLPGGOOCRO-IHRRRGAJSA-N 0.000 description 2
- XMMWDTUFTZMQFD-GMOBBJLQSA-N Met-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCSC XMMWDTUFTZMQFD-GMOBBJLQSA-N 0.000 description 2
- HLYIDXAXQIJYIG-CIUDSAMLSA-N Met-Gln-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HLYIDXAXQIJYIG-CIUDSAMLSA-N 0.000 description 2
- IUYCGMNKIZDRQI-BQBZGAKWSA-N Met-Gly-Ala Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O IUYCGMNKIZDRQI-BQBZGAKWSA-N 0.000 description 2
- PZUUMQPMHBJJKE-AVGNSLFASA-N Met-Leu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCNC(N)=N PZUUMQPMHBJJKE-AVGNSLFASA-N 0.000 description 2
- JYPITOUIQVSCKM-IHRRRGAJSA-N Met-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCSC)N JYPITOUIQVSCKM-IHRRRGAJSA-N 0.000 description 2
- KMSMNUFBNCHMII-IHRRRGAJSA-N Met-Leu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN KMSMNUFBNCHMII-IHRRRGAJSA-N 0.000 description 2
- XDGFFEZAZHRZFR-RHYQMDGZSA-N Met-Leu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDGFFEZAZHRZFR-RHYQMDGZSA-N 0.000 description 2
- USBFEVBHEQBWDD-AVGNSLFASA-N Met-Leu-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O USBFEVBHEQBWDD-AVGNSLFASA-N 0.000 description 2
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 2
- NHXXGBXJTLRGJI-GUBZILKMSA-N Met-Pro-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O NHXXGBXJTLRGJI-GUBZILKMSA-N 0.000 description 2
- CIIJWIAORKTXAH-FJXKBIBVSA-N Met-Thr-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O CIIJWIAORKTXAH-FJXKBIBVSA-N 0.000 description 2
- VVWQHJUYBPJCNS-UMPQAUOISA-N Met-Trp-Thr Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)=CNC2=C1 VVWQHJUYBPJCNS-UMPQAUOISA-N 0.000 description 2
- FZDOBWIKRQORAC-ULQDDVLXSA-N Met-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N FZDOBWIKRQORAC-ULQDDVLXSA-N 0.000 description 2
- 102000006404 Mitochondrial Proteins Human genes 0.000 description 2
- 108010058682 Mitochondrial Proteins Proteins 0.000 description 2
- 241000713869 Moloney murine leukemia virus Species 0.000 description 2
- 241000699670 Mus sp. Species 0.000 description 2
- 108010069483 N-acetylglucosamine-6-phosphate deacetylase Proteins 0.000 description 2
- 241001181114 Neta Species 0.000 description 2
- 101710089556 Netrin unc-6 Proteins 0.000 description 2
- 208000002537 Neuronal Ceroid-Lipofuscinoses Diseases 0.000 description 2
- 239000004677 Nylon Substances 0.000 description 2
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 2
- DFEVBOYEUQJGER-JURCDPSOSA-N Phe-Ala-Ile Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O DFEVBOYEUQJGER-JURCDPSOSA-N 0.000 description 2
- AYPMIIKUMNADSU-IHRRRGAJSA-N Phe-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AYPMIIKUMNADSU-IHRRRGAJSA-N 0.000 description 2
- VHWOBXIWBDWZHK-IHRRRGAJSA-N Phe-Arg-Asp Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 VHWOBXIWBDWZHK-IHRRRGAJSA-N 0.000 description 2
- XWBJLKDCHJVKAK-KKUMJFAQSA-N Phe-Arg-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N XWBJLKDCHJVKAK-KKUMJFAQSA-N 0.000 description 2
- JEGFCFLCRSJCMA-IHRRRGAJSA-N Phe-Arg-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N JEGFCFLCRSJCMA-IHRRRGAJSA-N 0.000 description 2
- LJUUGSWZPQOJKD-JYJNAYRXSA-N Phe-Arg-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O LJUUGSWZPQOJKD-JYJNAYRXSA-N 0.000 description 2
- KIAWKQJTSGRCSA-AVGNSLFASA-N Phe-Asn-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KIAWKQJTSGRCSA-AVGNSLFASA-N 0.000 description 2
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 2
- CSYVXYQDIVCQNU-QWRGUYRKSA-N Phe-Asp-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O CSYVXYQDIVCQNU-QWRGUYRKSA-N 0.000 description 2
- HPECNYCQLSVCHH-BZSNNMDCSA-N Phe-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N HPECNYCQLSVCHH-BZSNNMDCSA-N 0.000 description 2
- IILUKIJNFMUBNF-IHRRRGAJSA-N Phe-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O IILUKIJNFMUBNF-IHRRRGAJSA-N 0.000 description 2
- YYKZDTVQHTUKDW-RYUDHWBXSA-N Phe-Gly-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N YYKZDTVQHTUKDW-RYUDHWBXSA-N 0.000 description 2
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 2
- DZVXMMSUWWUIQE-ACRUOGEOSA-N Phe-His-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N DZVXMMSUWWUIQE-ACRUOGEOSA-N 0.000 description 2
- FXPZZKBHNOMLGA-HJWJTTGWSA-N Phe-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FXPZZKBHNOMLGA-HJWJTTGWSA-N 0.000 description 2
- DVOCGBNHAUHKHJ-DKIMLUQUSA-N Phe-Ile-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O DVOCGBNHAUHKHJ-DKIMLUQUSA-N 0.000 description 2
- MSHZERMPZKCODG-ACRUOGEOSA-N Phe-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 MSHZERMPZKCODG-ACRUOGEOSA-N 0.000 description 2
- INHMISZWLJZQGH-ULQDDVLXSA-N Phe-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 INHMISZWLJZQGH-ULQDDVLXSA-N 0.000 description 2
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 2
- CJAHQEZWDZNSJO-KKUMJFAQSA-N Phe-Lys-Cys Chemical compound NCCCC[C@@H](C(=O)N[C@@H](CS)C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CJAHQEZWDZNSJO-KKUMJFAQSA-N 0.000 description 2
- WURZLPSMYZLEGH-UNQGMJICSA-N Phe-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N)O WURZLPSMYZLEGH-UNQGMJICSA-N 0.000 description 2
- FENSZYFJQOFSQR-FIRPJDEBSA-N Phe-Phe-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FENSZYFJQOFSQR-FIRPJDEBSA-N 0.000 description 2
- GPLWGAYGROGDEN-BZSNNMDCSA-N Phe-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O GPLWGAYGROGDEN-BZSNNMDCSA-N 0.000 description 2
- RVEVENLSADZUMS-IHRRRGAJSA-N Phe-Pro-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RVEVENLSADZUMS-IHRRRGAJSA-N 0.000 description 2
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 2
- JXQVYPWVGUOIDV-MXAVVETBSA-N Phe-Ser-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JXQVYPWVGUOIDV-MXAVVETBSA-N 0.000 description 2
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 2
- MVIJMIZJPHQGEN-IHRRRGAJSA-N Phe-Ser-Val Chemical compound CC(C)[C@@H](C([O-])=O)NC(=O)[C@H](CO)NC(=O)[C@@H]([NH3+])CC1=CC=CC=C1 MVIJMIZJPHQGEN-IHRRRGAJSA-N 0.000 description 2
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 2
- QTDBZORPVYTRJU-KKXDTOCCSA-N Phe-Tyr-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O QTDBZORPVYTRJU-KKXDTOCCSA-N 0.000 description 2
- FXEKNHAJIMHRFJ-ULQDDVLXSA-N Phe-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N FXEKNHAJIMHRFJ-ULQDDVLXSA-N 0.000 description 2
- VIIRRNQMMIHYHQ-XHSDSOJGSA-N Phe-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N VIIRRNQMMIHYHQ-XHSDSOJGSA-N 0.000 description 2
- IEIFEYBAYFSRBQ-IHRRRGAJSA-N Phe-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N IEIFEYBAYFSRBQ-IHRRRGAJSA-N 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- DBALDZKOTNSBFM-FXQIFTODSA-N Pro-Ala-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DBALDZKOTNSBFM-FXQIFTODSA-N 0.000 description 2
- NHDVNAKDACFHPX-GUBZILKMSA-N Pro-Arg-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O NHDVNAKDACFHPX-GUBZILKMSA-N 0.000 description 2
- VCYJKOLZYPYGJV-AVGNSLFASA-N Pro-Arg-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VCYJKOLZYPYGJV-AVGNSLFASA-N 0.000 description 2
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 2
- SGCZFWSQERRKBD-BQBZGAKWSA-N Pro-Asp-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 SGCZFWSQERRKBD-BQBZGAKWSA-N 0.000 description 2
- GQLOZEMWEBDEAY-NAKRPEOUSA-N Pro-Cys-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GQLOZEMWEBDEAY-NAKRPEOUSA-N 0.000 description 2
- OLTFZQIYCNOBLI-DCAQKATOSA-N Pro-Cys-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O OLTFZQIYCNOBLI-DCAQKATOSA-N 0.000 description 2
- LHALYDBUDCWMDY-CIUDSAMLSA-N Pro-Glu-Ala Chemical compound C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O LHALYDBUDCWMDY-CIUDSAMLSA-N 0.000 description 2
- FRKBNXCFJBPJOL-GUBZILKMSA-N Pro-Glu-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FRKBNXCFJBPJOL-GUBZILKMSA-N 0.000 description 2
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 2
- JMVQDLDPDBXAAX-YUMQZZPRSA-N Pro-Gly-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 JMVQDLDPDBXAAX-YUMQZZPRSA-N 0.000 description 2
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 2
- XQSREVQDGCPFRJ-STQMWFEESA-N Pro-Gly-Phe Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XQSREVQDGCPFRJ-STQMWFEESA-N 0.000 description 2
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 2
- QEWBZBLXDKIQPS-STQMWFEESA-N Pro-Gly-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QEWBZBLXDKIQPS-STQMWFEESA-N 0.000 description 2
- DTQIXTOJHKVEOH-DCAQKATOSA-N Pro-His-Cys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CS)C(=O)O DTQIXTOJHKVEOH-DCAQKATOSA-N 0.000 description 2
- FKVNLUZHSFCNGY-RVMXOQNASA-N Pro-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 FKVNLUZHSFCNGY-RVMXOQNASA-N 0.000 description 2
- NFLNBHLMLYALOO-DCAQKATOSA-N Pro-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1 NFLNBHLMLYALOO-DCAQKATOSA-N 0.000 description 2
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 2
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 2
- MHBSUKYVBZVQRW-HJWJTTGWSA-N Pro-Phe-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MHBSUKYVBZVQRW-HJWJTTGWSA-N 0.000 description 2
- HOTVCUAVDQHUDB-UFYCRDLUSA-N Pro-Phe-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 HOTVCUAVDQHUDB-UFYCRDLUSA-N 0.000 description 2
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 2
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 2
- QAAYIXYLEMRULP-SRVKXCTJSA-N Pro-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 QAAYIXYLEMRULP-SRVKXCTJSA-N 0.000 description 2
- FDMKYQQYJKYCLV-GUBZILKMSA-N Pro-Pro-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 FDMKYQQYJKYCLV-GUBZILKMSA-N 0.000 description 2
- GOMUXSCOIWIJFP-GUBZILKMSA-N Pro-Ser-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GOMUXSCOIWIJFP-GUBZILKMSA-N 0.000 description 2
- GNFHQWNCSSPOBT-ULQDDVLXSA-N Pro-Trp-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CCC(=O)N)C(=O)O GNFHQWNCSSPOBT-ULQDDVLXSA-N 0.000 description 2
- DIDLUFMLRUJLFB-FKBYEOEOSA-N Pro-Trp-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC4=CC=C(C=C4)O)C(=O)O DIDLUFMLRUJLFB-FKBYEOEOSA-N 0.000 description 2
- DYJTXTCEXMCPBF-UFYCRDLUSA-N Pro-Tyr-Phe Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC3=CC=CC=C3)C(=O)O DYJTXTCEXMCPBF-UFYCRDLUSA-N 0.000 description 2
- IALSFJSONJZBKB-HRCADAONSA-N Pro-Tyr-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N3CCC[C@@H]3C(=O)O IALSFJSONJZBKB-HRCADAONSA-N 0.000 description 2
- 108010079005 RDV peptide Proteins 0.000 description 2
- 241000700159 Rattus Species 0.000 description 2
- 101100411568 Rattus norvegicus Rab26 gene Proteins 0.000 description 2
- 108090000894 Ribosomal Protein L3 Proteins 0.000 description 2
- HRNQLKCLPVKZNE-CIUDSAMLSA-N Ser-Ala-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O HRNQLKCLPVKZNE-CIUDSAMLSA-N 0.000 description 2
- QWZIOCFPXMAXET-CIUDSAMLSA-N Ser-Arg-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O QWZIOCFPXMAXET-CIUDSAMLSA-N 0.000 description 2
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 2
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 2
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 2
- DSSOYPJWSWFOLK-CIUDSAMLSA-N Ser-Cys-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O DSSOYPJWSWFOLK-CIUDSAMLSA-N 0.000 description 2
- ULVMNZOKDBHKKI-ACZMJKKPSA-N Ser-Gln-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ULVMNZOKDBHKKI-ACZMJKKPSA-N 0.000 description 2
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 2
- VMVNCJDKFOQOHM-GUBZILKMSA-N Ser-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N VMVNCJDKFOQOHM-GUBZILKMSA-N 0.000 description 2
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 2
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 2
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 2
- OQPNSDWGAMFJNU-QWRGUYRKSA-N Ser-Gly-Tyr Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OQPNSDWGAMFJNU-QWRGUYRKSA-N 0.000 description 2
- FYUIFUJFNCLUIX-XVYDVKMFSA-N Ser-His-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O FYUIFUJFNCLUIX-XVYDVKMFSA-N 0.000 description 2
- CJINPXGSKSZQNE-KBIXCLLPSA-N Ser-Ile-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O CJINPXGSKSZQNE-KBIXCLLPSA-N 0.000 description 2
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 2
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 2
- FUMGHWDRRFCKEP-CIUDSAMLSA-N Ser-Leu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O FUMGHWDRRFCKEP-CIUDSAMLSA-N 0.000 description 2
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 2
- KJKQUQXDEKMPDK-FXQIFTODSA-N Ser-Met-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O KJKQUQXDEKMPDK-FXQIFTODSA-N 0.000 description 2
- NQZFFLBPNDLTPO-DLOVCJGASA-N Ser-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CO)N NQZFFLBPNDLTPO-DLOVCJGASA-N 0.000 description 2
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 2
- RRVFEDGUXSYWOW-BZSNNMDCSA-N Ser-Phe-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RRVFEDGUXSYWOW-BZSNNMDCSA-N 0.000 description 2
- RWDVVSKYZBNDCO-MELADBBJSA-N Ser-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CO)N)C(=O)O RWDVVSKYZBNDCO-MELADBBJSA-N 0.000 description 2
- MHVXPTAMDHLTHB-IHPCNDPISA-N Ser-Phe-Trp Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 MHVXPTAMDHLTHB-IHPCNDPISA-N 0.000 description 2
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 2
- QMCDMHWAKMUGJE-IHRRRGAJSA-N Ser-Phe-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O QMCDMHWAKMUGJE-IHRRRGAJSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 2
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 2
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 2
- KKKVOZNCLALMPV-XKBZYTNZSA-N Ser-Thr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KKKVOZNCLALMPV-XKBZYTNZSA-N 0.000 description 2
- DYEGLQRVMBWQLD-IXOXFDKPSA-N Ser-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CO)N)O DYEGLQRVMBWQLD-IXOXFDKPSA-N 0.000 description 2
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 2
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 2
- FVFUOQIYDPAIJR-XIRDDKMYSA-N Ser-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CO)N FVFUOQIYDPAIJR-XIRDDKMYSA-N 0.000 description 2
- FGBLCMLXHRPVOF-IHRRRGAJSA-N Ser-Tyr-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FGBLCMLXHRPVOF-IHRRRGAJSA-N 0.000 description 2
- PZHJLTWGMYERRJ-SRVKXCTJSA-N Ser-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N)O PZHJLTWGMYERRJ-SRVKXCTJSA-N 0.000 description 2
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 2
- ODRUTDLAONAVDV-IHRRRGAJSA-N Ser-Val-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ODRUTDLAONAVDV-IHRRRGAJSA-N 0.000 description 2
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 2
- 108020005038 Terminator Codon Proteins 0.000 description 2
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 2
- ZUXQFMVPAYGPFJ-JXUBOQSCSA-N Thr-Ala-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN ZUXQFMVPAYGPFJ-JXUBOQSCSA-N 0.000 description 2
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 2
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 2
- LHUBVKCLOVALIA-HJGDQZAQSA-N Thr-Arg-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LHUBVKCLOVALIA-HJGDQZAQSA-N 0.000 description 2
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 2
- XVNZSJIKGJLQLH-RCWTZXSCSA-N Thr-Arg-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCSC)C(=O)O)N)O XVNZSJIKGJLQLH-RCWTZXSCSA-N 0.000 description 2
- UTSWGQNAQRIHAI-UNQGMJICSA-N Thr-Arg-Phe Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 UTSWGQNAQRIHAI-UNQGMJICSA-N 0.000 description 2
- JEDIEMIJYSRUBB-FOHZUACHSA-N Thr-Asp-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 2
- DCLBXIWHLVEPMQ-JRQIVUDYSA-N Thr-Asp-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 DCLBXIWHLVEPMQ-JRQIVUDYSA-N 0.000 description 2
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 2
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 2
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 2
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 2
- KCRQEJSKXAIULJ-FJXKBIBVSA-N Thr-Gly-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O KCRQEJSKXAIULJ-FJXKBIBVSA-N 0.000 description 2
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 2
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 2
- AYCQVUUPIJHJTA-IXOXFDKPSA-N Thr-His-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O AYCQVUUPIJHJTA-IXOXFDKPSA-N 0.000 description 2
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 2
- XYFISNXATOERFZ-OSUNSFLBSA-N Thr-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N XYFISNXATOERFZ-OSUNSFLBSA-N 0.000 description 2
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 2
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 2
- CJXURNZYNHCYFD-WDCWCFNPSA-N Thr-Lys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CJXURNZYNHCYFD-WDCWCFNPSA-N 0.000 description 2
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 2
- HSQXHRIRJSFDOH-URLPEUOOSA-N Thr-Phe-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HSQXHRIRJSFDOH-URLPEUOOSA-N 0.000 description 2
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 2
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 2
- VUXIQSUQQYNLJP-XAVMHZPKSA-N Thr-Ser-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N)O VUXIQSUQQYNLJP-XAVMHZPKSA-N 0.000 description 2
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 2
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 2
- QJIODPFLAASXJC-JHYOHUSXSA-N Thr-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O QJIODPFLAASXJC-JHYOHUSXSA-N 0.000 description 2
- VGNKUXWYFFDWDH-BEMMVCDISA-N Thr-Trp-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N3CCC[C@@H]3C(=O)O)N)O VGNKUXWYFFDWDH-BEMMVCDISA-N 0.000 description 2
- JNKAYADBODLPMQ-HSHDSVGOSA-N Thr-Trp-Val Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)=CNC2=C1 JNKAYADBODLPMQ-HSHDSVGOSA-N 0.000 description 2
- RPECVQBNONKZAT-WZLNRYEVSA-N Thr-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H]([C@@H](C)O)N RPECVQBNONKZAT-WZLNRYEVSA-N 0.000 description 2
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 2
- QNXZCKMXHPULME-ZNSHCXBVSA-N Thr-Val-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O QNXZCKMXHPULME-ZNSHCXBVSA-N 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- 108700019146 Transgenes Proteins 0.000 description 2
- BIJDDZBDSJLWJY-PJODQICGSA-N Trp-Ala-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O BIJDDZBDSJLWJY-PJODQICGSA-N 0.000 description 2
- TWJDQTTXXZDJKV-BPUTZDHNSA-N Trp-Arg-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O TWJDQTTXXZDJKV-BPUTZDHNSA-N 0.000 description 2
- XZLHHHYSWIYXHD-XIRDDKMYSA-N Trp-Gln-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XZLHHHYSWIYXHD-XIRDDKMYSA-N 0.000 description 2
- NOBINHCGDUHOBV-NAZCDGGXSA-N Trp-His-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NOBINHCGDUHOBV-NAZCDGGXSA-N 0.000 description 2
- XGFOXYJQBRTJPO-PJODQICGSA-N Trp-Pro-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XGFOXYJQBRTJPO-PJODQICGSA-N 0.000 description 2
- GFUOTIPYXKAPAH-BVSLBCMMSA-N Trp-Pro-Phe Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GFUOTIPYXKAPAH-BVSLBCMMSA-N 0.000 description 2
- JGLXHHQUSIULAK-OYDLWJJNSA-N Trp-Pro-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H]3CCCN3C(=O)[C@H](CC=3C4=CC=CC=C4NC=3)N)C(O)=O)=CNC2=C1 JGLXHHQUSIULAK-OYDLWJJNSA-N 0.000 description 2
- UMIACFRBELJMGT-GQGQLFGLSA-N Trp-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UMIACFRBELJMGT-GQGQLFGLSA-N 0.000 description 2
- ITUAVBRBGKVBLH-BVSLBCMMSA-N Trp-Tyr-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N ITUAVBRBGKVBLH-BVSLBCMMSA-N 0.000 description 2
- 102000004142 Trypsin Human genes 0.000 description 2
- 108090000631 Trypsin Proteins 0.000 description 2
- 102100031638 Tuberin Human genes 0.000 description 2
- NIHNMOSRSAYZIT-BPNCWPANSA-N Tyr-Ala-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NIHNMOSRSAYZIT-BPNCWPANSA-N 0.000 description 2
- XLMDWQNAOKLKCP-XDTLVQLUSA-N Tyr-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N XLMDWQNAOKLKCP-XDTLVQLUSA-N 0.000 description 2
- FBVGQXJIXFZKSQ-GMVOTWDCSA-N Tyr-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N FBVGQXJIXFZKSQ-GMVOTWDCSA-N 0.000 description 2
- PEVVXUGSAKEPEN-AVGNSLFASA-N Tyr-Asn-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PEVVXUGSAKEPEN-AVGNSLFASA-N 0.000 description 2
- MTEQZJFSEMXXRK-CFMVVWHZSA-N Tyr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N MTEQZJFSEMXXRK-CFMVVWHZSA-N 0.000 description 2
- IUQDEKCCHWRHRW-IHPCNDPISA-N Tyr-Asn-Trp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IUQDEKCCHWRHRW-IHPCNDPISA-N 0.000 description 2
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 2
- RIJPHPUJRLEOAK-JYJNAYRXSA-N Tyr-Gln-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O RIJPHPUJRLEOAK-JYJNAYRXSA-N 0.000 description 2
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 2
- CNLKDWSAORJEMW-KWQFWETISA-N Tyr-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O CNLKDWSAORJEMW-KWQFWETISA-N 0.000 description 2
- PMDWYLVWHRTJIW-STQMWFEESA-N Tyr-Gly-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PMDWYLVWHRTJIW-STQMWFEESA-N 0.000 description 2
- GIOBXJSONRQHKQ-RYUDHWBXSA-N Tyr-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GIOBXJSONRQHKQ-RYUDHWBXSA-N 0.000 description 2
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 2
- KEANSLVUGJADPN-LKTVYLICSA-N Tyr-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CC=C(C=C2)O)N KEANSLVUGJADPN-LKTVYLICSA-N 0.000 description 2
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 2
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 2
- JHORGUYURUBVOM-KKUMJFAQSA-N Tyr-His-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O JHORGUYURUBVOM-KKUMJFAQSA-N 0.000 description 2
- OHOVFPKXPZODHS-SJWGOKEGSA-N Tyr-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OHOVFPKXPZODHS-SJWGOKEGSA-N 0.000 description 2
- QARCDOCCDOLJSF-HJPIBITLSA-N Tyr-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QARCDOCCDOLJSF-HJPIBITLSA-N 0.000 description 2
- QHLIUFUEUDFAOT-MGHWNKPDSA-N Tyr-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QHLIUFUEUDFAOT-MGHWNKPDSA-N 0.000 description 2
- KHCSOLAHNLOXJR-BZSNNMDCSA-N Tyr-Leu-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHCSOLAHNLOXJR-BZSNNMDCSA-N 0.000 description 2
- CDKZJGMPZHPAJC-ULQDDVLXSA-N Tyr-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDKZJGMPZHPAJC-ULQDDVLXSA-N 0.000 description 2
- HSBZWINKRYZCSQ-KKUMJFAQSA-N Tyr-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O HSBZWINKRYZCSQ-KKUMJFAQSA-N 0.000 description 2
- FMXFHNSFABRVFZ-BZSNNMDCSA-N Tyr-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FMXFHNSFABRVFZ-BZSNNMDCSA-N 0.000 description 2
- LRHBBGDMBLFYGL-FHWLQOOXSA-N Tyr-Phe-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LRHBBGDMBLFYGL-FHWLQOOXSA-N 0.000 description 2
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 2
- CDBXVDXSLPLFMD-BPNCWPANSA-N Tyr-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDBXVDXSLPLFMD-BPNCWPANSA-N 0.000 description 2
- RWOKVQUCENPXGE-IHRRRGAJSA-N Tyr-Ser-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RWOKVQUCENPXGE-IHRRRGAJSA-N 0.000 description 2
- UMSZZGTXGKHTFJ-SRVKXCTJSA-N Tyr-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 UMSZZGTXGKHTFJ-SRVKXCTJSA-N 0.000 description 2
- MDXLPNRXCFOBTL-BZSNNMDCSA-N Tyr-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MDXLPNRXCFOBTL-BZSNNMDCSA-N 0.000 description 2
- ITDWWLTTWRRLCC-KJEVXHAQSA-N Tyr-Thr-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ITDWWLTTWRRLCC-KJEVXHAQSA-N 0.000 description 2
- DTWMJYGOUWNWEC-IHPCNDPISA-N Tyr-Trp-Cys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CS)C(O)=O)C1=CC=C(O)C=C1 DTWMJYGOUWNWEC-IHPCNDPISA-N 0.000 description 2
- AGDDLOQMXUQPDY-BZSNNMDCSA-N Tyr-Tyr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O AGDDLOQMXUQPDY-BZSNNMDCSA-N 0.000 description 2
- RGJZPXFZIUUQDN-BPNCWPANSA-N Tyr-Val-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O RGJZPXFZIUUQDN-BPNCWPANSA-N 0.000 description 2
- KLOZTPOXVVRVAQ-DZKIICNBSA-N Tyr-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KLOZTPOXVVRVAQ-DZKIICNBSA-N 0.000 description 2
- RVGVIWNHABGIFH-IHRRRGAJSA-N Tyr-Val-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O RVGVIWNHABGIFH-IHRRRGAJSA-N 0.000 description 2
- 102100034171 V-type proton ATPase 16 kDa proteolipid subunit c Human genes 0.000 description 2
- DDRBQONWVBDQOY-GUBZILKMSA-N Val-Ala-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O DDRBQONWVBDQOY-GUBZILKMSA-N 0.000 description 2
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 2
- VJOWWOGRNXRQMF-UVBJJODRSA-N Val-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)C(C)C)C(O)=O)=CNC2=C1 VJOWWOGRNXRQMF-UVBJJODRSA-N 0.000 description 2
- JIODCDXKCJRMEH-NHCYSSNCSA-N Val-Arg-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N JIODCDXKCJRMEH-NHCYSSNCSA-N 0.000 description 2
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 2
- QPZMOUMNTGTEFR-ZKWXMUAHSA-N Val-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N QPZMOUMNTGTEFR-ZKWXMUAHSA-N 0.000 description 2
- OGNMURQZFMHFFD-NHCYSSNCSA-N Val-Asn-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N OGNMURQZFMHFFD-NHCYSSNCSA-N 0.000 description 2
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 2
- CGGVNFJRZJUVAE-BYULHYEWSA-N Val-Asp-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CGGVNFJRZJUVAE-BYULHYEWSA-N 0.000 description 2
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 2
- BWVHQINTNLVWGZ-ZKWXMUAHSA-N Val-Cys-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BWVHQINTNLVWGZ-ZKWXMUAHSA-N 0.000 description 2
- CFSSLXZJEMERJY-NRPADANISA-N Val-Gln-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CFSSLXZJEMERJY-NRPADANISA-N 0.000 description 2
- AGKDVLSDNSTLFA-UMNHJUIQSA-N Val-Gln-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N AGKDVLSDNSTLFA-UMNHJUIQSA-N 0.000 description 2
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 2
- CELJCNRXKZPTCX-XPUUQOCRSA-N Val-Gly-Ala Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O CELJCNRXKZPTCX-XPUUQOCRSA-N 0.000 description 2
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 2
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 2
- DHINLYMWMXQGMQ-IHRRRGAJSA-N Val-His-His Chemical compound C([C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 DHINLYMWMXQGMQ-IHRRRGAJSA-N 0.000 description 2
- ZIGZPYJXIWLQFC-QTKMDUPCSA-N Val-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C(C)C)N)O ZIGZPYJXIWLQFC-QTKMDUPCSA-N 0.000 description 2
- BZMIYHIJVVJPCK-QSFUFRPTSA-N Val-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N BZMIYHIJVVJPCK-QSFUFRPTSA-N 0.000 description 2
- APEBUJBRGCMMHP-HJWJTTGWSA-N Val-Ile-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 APEBUJBRGCMMHP-HJWJTTGWSA-N 0.000 description 2
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 2
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 2
- DAVNYIUELQBTAP-XUXIUFHCSA-N Val-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N DAVNYIUELQBTAP-XUXIUFHCSA-N 0.000 description 2
- GVJUTBOZZBTBIG-AVGNSLFASA-N Val-Lys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N GVJUTBOZZBTBIG-AVGNSLFASA-N 0.000 description 2
- WBAJDGWKRIHOAC-GVXVVHGQSA-N Val-Lys-Gln Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O WBAJDGWKRIHOAC-GVXVVHGQSA-N 0.000 description 2
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 2
- RQOMPQGUGBILAG-AVGNSLFASA-N Val-Met-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O RQOMPQGUGBILAG-AVGNSLFASA-N 0.000 description 2
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 2
- YLRAFVVWZRSZQC-DZKIICNBSA-N Val-Phe-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YLRAFVVWZRSZQC-DZKIICNBSA-N 0.000 description 2
- MHHAWNPHDLCPLF-ULQDDVLXSA-N Val-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 MHHAWNPHDLCPLF-ULQDDVLXSA-N 0.000 description 2
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 2
- LGXUZJIQCGXKGZ-QXEWZRGKSA-N Val-Pro-Asn Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)N)C(=O)O)N LGXUZJIQCGXKGZ-QXEWZRGKSA-N 0.000 description 2
- QSPOLEBZTMESFY-SRVKXCTJSA-N Val-Pro-Val Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O QSPOLEBZTMESFY-SRVKXCTJSA-N 0.000 description 2
- PGQUDQYHWICSAB-NAKRPEOUSA-N Val-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N PGQUDQYHWICSAB-NAKRPEOUSA-N 0.000 description 2
- VHIZXDZMTDVFGX-DCAQKATOSA-N Val-Ser-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N VHIZXDZMTDVFGX-DCAQKATOSA-N 0.000 description 2
- PQSNETRGCRUOGP-KKHAAJSZSA-N Val-Thr-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O PQSNETRGCRUOGP-KKHAAJSZSA-N 0.000 description 2
- TVGWMCTYUFBXAP-QTKMDUPCSA-N Val-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C(C)C)N)O TVGWMCTYUFBXAP-QTKMDUPCSA-N 0.000 description 2
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 2
- PFMSJVIPEZMKSC-DZKIICNBSA-N Val-Tyr-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PFMSJVIPEZMKSC-DZKIICNBSA-N 0.000 description 2
- VVIZITNVZUAEMI-DLOVCJGASA-N Val-Val-Gln Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(N)=O VVIZITNVZUAEMI-DLOVCJGASA-N 0.000 description 2
- XNLUVJPMPAZHCY-JYJNAYRXSA-N Val-Val-Phe Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 XNLUVJPMPAZHCY-JYJNAYRXSA-N 0.000 description 2
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 2
- 108010028939 alanyl-alanyl-lysyl-alanine Proteins 0.000 description 2
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 2
- 108010078114 alanyl-tryptophyl-alanine Proteins 0.000 description 2
- 230000004075 alteration Effects 0.000 description 2
- 108010080488 arginyl-arginyl-leucine Proteins 0.000 description 2
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 2
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 2
- 108010094001 arginyl-tryptophyl-arginine Proteins 0.000 description 2
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 210000003050 axon Anatomy 0.000 description 2
- 230000028600 axonogenesis Effects 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 230000002759 chromosomal effect Effects 0.000 description 2
- 239000003184 complementary RNA Substances 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 108010004073 cysteinylcysteine Proteins 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- 210000000805 cytoplasm Anatomy 0.000 description 2
- 239000000839 emulsion Substances 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 2
- 238000001415 gene therapy Methods 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 150000004676 glycans Chemical class 0.000 description 2
- JYPCXBJRLBHWME-UHFFFAOYSA-N glycyl-L-prolyl-L-arginine Natural products NCC(=O)N1CCCC1C(=O)NC(CCCN=C(N)N)C(O)=O JYPCXBJRLBHWME-UHFFFAOYSA-N 0.000 description 2
- 108010090037 glycyl-alanyl-isoleucine Proteins 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 2
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 2
- 108010025801 glycyl-prolyl-arginine Proteins 0.000 description 2
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 2
- 208000016861 hereditary angioedema type 3 Diseases 0.000 description 2
- 102000049334 human KMT2D Human genes 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 239000001257 hydrogen Substances 0.000 description 2
- 238000003018 immunoassay Methods 0.000 description 2
- 238000007901 in situ hybridization Methods 0.000 description 2
- 238000011503 in vivo imaging Methods 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- PHTQWCKDNZKARW-UHFFFAOYSA-N isoamylol Chemical compound CC(C)CCO PHTQWCKDNZKARW-UHFFFAOYSA-N 0.000 description 2
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- 108010078274 isoleucylvaline Proteins 0.000 description 2
- 208000017476 juvenile neuronal ceroid lipofuscinosis Diseases 0.000 description 2
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 2
- 108010073093 leucyl-glycyl-glycyl-glycine Proteins 0.000 description 2
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 2
- 108010012058 leucyltyrosine Proteins 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 208000019423 liver disease Diseases 0.000 description 2
- 230000004807 localization Effects 0.000 description 2
- 230000004199 lung function Effects 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010056787 lysyl-arginyl-glutamyl-glutamic acid Proteins 0.000 description 2
- 108010010679 lysyl-valyl-leucyl-aspartic acid Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 229910021645 metal ion Inorganic materials 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 238000007857 nested PCR Methods 0.000 description 2
- 201000007607 neuronal ceroid lipofuscinosis 3 Diseases 0.000 description 2
- 230000036963 noncompetitive effect Effects 0.000 description 2
- 229920001778 nylon Polymers 0.000 description 2
- 230000005298 paramagnetic effect Effects 0.000 description 2
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 2
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 2
- 108010065135 phenylalanyl-phenylalanyl-phenylalanine Proteins 0.000 description 2
- 108010083476 phenylalanyltryptophan Proteins 0.000 description 2
- 229920001282 polysaccharide Polymers 0.000 description 2
- 239000005017 polysaccharide Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 210000002027 skeletal muscle Anatomy 0.000 description 2
- 210000000952 spleen Anatomy 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 2
- 230000005030 transcription termination Effects 0.000 description 2
- 108091006107 transcriptional repressors Proteins 0.000 description 2
- 239000012588 trypsin Substances 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 108010038745 tryptophylglycine Proteins 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- 108010020532 tyrosyl-proline Proteins 0.000 description 2
- 108010078580 tyrosylleucine Proteins 0.000 description 2
- 101150027698 unc-6 gene Proteins 0.000 description 2
- 241000701447 unidentified baculovirus Species 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 108700026220 vif Genes Proteins 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- 229910052725 zinc Inorganic materials 0.000 description 2
- 239000011701 zinc Substances 0.000 description 2
- DIGQNXIGRZPYDK-WKSCXVIASA-N (2R)-6-amino-2-[[2-[[(2S)-2-[[2-[[(2R)-2-[[(2S)-2-[[(2R,3S)-2-[[2-[[(2S)-2-[[2-[[(2S)-2-[[(2S)-2-[[(2R)-2-[[(2S,3S)-2-[[(2R)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[2-[[(2S)-2-[[(2R)-2-[[2-[[2-[[2-[(2-amino-1-hydroxyethylidene)amino]-3-carboxy-1-hydroxypropylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxybutylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1,5-dihydroxy-5-iminopentylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxybutylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxyethylidene]amino]hexanoic acid Chemical compound C[C@@H]([C@@H](C(=N[C@@H](CS)C(=N[C@@H](C)C(=N[C@@H](CO)C(=NCC(=N[C@@H](CCC(=N)O)C(=NC(CS)C(=N[C@H]([C@H](C)O)C(=N[C@H](CS)C(=N[C@H](CO)C(=NCC(=N[C@H](CS)C(=NCC(=N[C@H](CCCCN)C(=O)O)O)O)O)O)O)O)O)O)O)O)O)O)O)N=C([C@H](CS)N=C([C@H](CO)N=C([C@H](CO)N=C([C@H](C)N=C(CN=C([C@H](CO)N=C([C@H](CS)N=C(CN=C(C(CS)N=C(C(CC(=O)O)N=C(CN)O)O)O)O)O)O)O)O)O)O)O)O DIGQNXIGRZPYDK-WKSCXVIASA-N 0.000 description 1
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- SPBDXSGPUHCETR-JFUDTMANSA-N 8883yp2r6d Chemical compound O1[C@@H](C)[C@H](O)[C@@H](OC)C[C@@H]1O[C@@H]1[C@@H](OC)C[C@H](O[C@@H]2C(=C/C[C@@H]3C[C@@H](C[C@@]4(O[C@@H]([C@@H](C)CC4)C(C)C)O3)OC(=O)[C@@H]3C=C(C)[C@@H](O)[C@H]4OC\C([C@@]34O)=C/C=C/[C@@H]2C)/C)O[C@H]1C.C1C[C@H](C)[C@@H]([C@@H](C)CC)O[C@@]21O[C@H](C\C=C(C)\[C@@H](O[C@@H]1O[C@@H](C)[C@H](O[C@@H]3O[C@@H](C)[C@H](O)[C@@H](OC)C3)[C@@H](OC)C1)[C@@H](C)\C=C\C=C/1[C@]3([C@H](C(=O)O4)C=C(C)[C@@H](O)[C@H]3OC\1)O)C[C@H]4C2 SPBDXSGPUHCETR-JFUDTMANSA-N 0.000 description 1
- 108010044087 AS-I toxin Proteins 0.000 description 1
- 102100024643 ATP-binding cassette sub-family D member 1 Human genes 0.000 description 1
- 102100020973 ATP-binding cassette sub-family D member 3 Human genes 0.000 description 1
- 101710152924 ATP-binding cassette sub-family D member 3 Proteins 0.000 description 1
- 101710159080 Aconitate hydratase A Proteins 0.000 description 1
- 101710159078 Aconitate hydratase B Proteins 0.000 description 1
- 201000011452 Adrenoleukodystrophy Diseases 0.000 description 1
- AAQGRPOPTAUUBM-ZLUOBGJFSA-N Ala-Ala-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O AAQGRPOPTAUUBM-ZLUOBGJFSA-N 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- KQFRUSHJPKXBMB-BHDSKKPTSA-N Ala-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)C)C(O)=O)=CNC2=C1 KQFRUSHJPKXBMB-BHDSKKPTSA-N 0.000 description 1
- WRDANSJTFOHBPI-FXQIFTODSA-N Ala-Arg-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N WRDANSJTFOHBPI-FXQIFTODSA-N 0.000 description 1
- IMMKUCQIKKXKNP-DCAQKATOSA-N Ala-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCN=C(N)N IMMKUCQIKKXKNP-DCAQKATOSA-N 0.000 description 1
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 1
- JYEBJTDTPNKQJG-FXQIFTODSA-N Ala-Asn-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N JYEBJTDTPNKQJG-FXQIFTODSA-N 0.000 description 1
- XQJAFSDFQZPYCU-UWJYBYFXSA-N Ala-Asn-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N XQJAFSDFQZPYCU-UWJYBYFXSA-N 0.000 description 1
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 1
- PBAMJJXWDQXOJA-FXQIFTODSA-N Ala-Asp-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PBAMJJXWDQXOJA-FXQIFTODSA-N 0.000 description 1
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 1
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 1
- FRFDXQWNDZMREB-ACZMJKKPSA-N Ala-Cys-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(O)=O FRFDXQWNDZMREB-ACZMJKKPSA-N 0.000 description 1
- XAGIMRPOEJSYER-CIUDSAMLSA-N Ala-Cys-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N XAGIMRPOEJSYER-CIUDSAMLSA-N 0.000 description 1
- RXTBLQVXNIECFP-FXQIFTODSA-N Ala-Gln-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RXTBLQVXNIECFP-FXQIFTODSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- BLGHHPHXVJWCNK-GUBZILKMSA-N Ala-Gln-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BLGHHPHXVJWCNK-GUBZILKMSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- UHMQKOBNPRAZGB-CIUDSAMLSA-N Ala-Glu-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N UHMQKOBNPRAZGB-CIUDSAMLSA-N 0.000 description 1
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 1
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- SMCGQGDVTPFXKB-XPUUQOCRSA-N Ala-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N SMCGQGDVTPFXKB-XPUUQOCRSA-N 0.000 description 1
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 1
- NJWJSLCQEDMGNC-MBLNEYKQSA-N Ala-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](C)N)O NJWJSLCQEDMGNC-MBLNEYKQSA-N 0.000 description 1
- FOHXUHGZZKETFI-JBDRJPRFSA-N Ala-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C)N FOHXUHGZZKETFI-JBDRJPRFSA-N 0.000 description 1
- CFPQUJZTLUQUTJ-HTFCKZLJSA-N Ala-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@H](C)N CFPQUJZTLUQUTJ-HTFCKZLJSA-N 0.000 description 1
- QCTFKEJEIMPOLW-JURCDPSOSA-N Ala-Ile-Phe Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCTFKEJEIMPOLW-JURCDPSOSA-N 0.000 description 1
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 1
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 1
- DPNZTBKGAUAZQU-DLOVCJGASA-N Ala-Leu-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DPNZTBKGAUAZQU-DLOVCJGASA-N 0.000 description 1
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 1
- IAUSCRHURCZUJP-CIUDSAMLSA-N Ala-Lys-Cys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CS)C(O)=O IAUSCRHURCZUJP-CIUDSAMLSA-N 0.000 description 1
- MFMDKJIPHSWSBM-GUBZILKMSA-N Ala-Lys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFMDKJIPHSWSBM-GUBZILKMSA-N 0.000 description 1
- XHNLCGXYBXNRIS-BJDJZHNGSA-N Ala-Lys-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XHNLCGXYBXNRIS-BJDJZHNGSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- XSTZMVAYYCJTNR-DCAQKATOSA-N Ala-Met-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XSTZMVAYYCJTNR-DCAQKATOSA-N 0.000 description 1
- DEWWPUNXRNGMQN-LPEHRKFASA-N Ala-Met-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N1CCC[C@@H]1C(=O)O)N DEWWPUNXRNGMQN-LPEHRKFASA-N 0.000 description 1
- AWNAEZICPNGAJK-FXQIFTODSA-N Ala-Met-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O AWNAEZICPNGAJK-FXQIFTODSA-N 0.000 description 1
- GFEDXKNBZMPEDM-KZVJFYERSA-N Ala-Met-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GFEDXKNBZMPEDM-KZVJFYERSA-N 0.000 description 1
- KYDYGANDJHFBCW-DRZSPHRISA-N Ala-Phe-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N KYDYGANDJHFBCW-DRZSPHRISA-N 0.000 description 1
- OSRZOHXQCUFIQG-FPMFFAJLSA-N Ala-Phe-Pro Chemical compound C([C@H](NC(=O)[C@@H]([NH3+])C)C(=O)N1[C@H](CCC1)C([O-])=O)C1=CC=CC=C1 OSRZOHXQCUFIQG-FPMFFAJLSA-N 0.000 description 1
- XWFWAXPOLRTDFZ-FXQIFTODSA-N Ala-Pro-Ser Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O XWFWAXPOLRTDFZ-FXQIFTODSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 1
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 1
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 1
- JPOQZCHGOTWRTM-FQPOAREZSA-N Ala-Tyr-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPOQZCHGOTWRTM-FQPOAREZSA-N 0.000 description 1
- YEBZNKPPOHFZJM-BPNCWPANSA-N Ala-Tyr-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O YEBZNKPPOHFZJM-BPNCWPANSA-N 0.000 description 1
- PEFFAAKJGBZBKL-NAKRPEOUSA-N Arg-Ala-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PEFFAAKJGBZBKL-NAKRPEOUSA-N 0.000 description 1
- UISQLSIBJKEJSS-GUBZILKMSA-N Arg-Arg-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(O)=O UISQLSIBJKEJSS-GUBZILKMSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- MFAMTAVAFBPXDC-LPEHRKFASA-N Arg-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O MFAMTAVAFBPXDC-LPEHRKFASA-N 0.000 description 1
- YSUVMPICYVWRBX-VEVYYDQMSA-N Arg-Asp-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YSUVMPICYVWRBX-VEVYYDQMSA-N 0.000 description 1
- XTGGTAWGUFXJSV-NAKRPEOUSA-N Arg-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCN=C(N)N)N XTGGTAWGUFXJSV-NAKRPEOUSA-N 0.000 description 1
- VSPLYCLMFAUZRF-GUBZILKMSA-N Arg-Cys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCN=C(N)N)N VSPLYCLMFAUZRF-GUBZILKMSA-N 0.000 description 1
- DGFGDPVSDQPANQ-XGEHTFHBSA-N Arg-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCN=C(N)N)N)O DGFGDPVSDQPANQ-XGEHTFHBSA-N 0.000 description 1
- GIVWETPOBCRTND-DCAQKATOSA-N Arg-Gln-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GIVWETPOBCRTND-DCAQKATOSA-N 0.000 description 1
- SNBHMYQRNCJSOJ-CIUDSAMLSA-N Arg-Gln-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SNBHMYQRNCJSOJ-CIUDSAMLSA-N 0.000 description 1
- VDBKFYYIBLXEIF-GUBZILKMSA-N Arg-Gln-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VDBKFYYIBLXEIF-GUBZILKMSA-N 0.000 description 1
- ZEAYJGRKRUBDOB-GARJFASQSA-N Arg-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZEAYJGRKRUBDOB-GARJFASQSA-N 0.000 description 1
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 1
- JQFJNGVSGOUQDH-XIRDDKMYSA-N Arg-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCN=C(N)N)N)C(O)=O)=CNC2=C1 JQFJNGVSGOUQDH-XIRDDKMYSA-N 0.000 description 1
- AQPVUEJJARLJHB-BQBZGAKWSA-N Arg-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N AQPVUEJJARLJHB-BQBZGAKWSA-N 0.000 description 1
- QKSAZKCRVQYYGS-UWVGGRQHSA-N Arg-Gly-His Chemical compound N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QKSAZKCRVQYYGS-UWVGGRQHSA-N 0.000 description 1
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 1
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 1
- KRQSPVKUISQQFS-FJXKBIBVSA-N Arg-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N KRQSPVKUISQQFS-FJXKBIBVSA-N 0.000 description 1
- OOIMKQRCPJBGPD-XUXIUFHCSA-N Arg-Ile-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O OOIMKQRCPJBGPD-XUXIUFHCSA-N 0.000 description 1
- OFIYLHVAAJYRBC-HJWJTTGWSA-N Arg-Ile-Phe Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O OFIYLHVAAJYRBC-HJWJTTGWSA-N 0.000 description 1
- LLUGJARLJCGLAR-CYDGBPFRSA-N Arg-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LLUGJARLJCGLAR-CYDGBPFRSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 1
- RIIVUOJDDQXHRV-SRVKXCTJSA-N Arg-Lys-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O RIIVUOJDDQXHRV-SRVKXCTJSA-N 0.000 description 1
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 1
- NGTYEHIRESTSRX-UWVGGRQHSA-N Arg-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N NGTYEHIRESTSRX-UWVGGRQHSA-N 0.000 description 1
- MTYLORHAQXVQOW-AVGNSLFASA-N Arg-Lys-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O MTYLORHAQXVQOW-AVGNSLFASA-N 0.000 description 1
- GSUFZRURORXYTM-STQMWFEESA-N Arg-Phe-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 GSUFZRURORXYTM-STQMWFEESA-N 0.000 description 1
- IGFJVXOATGZTHD-UHFFFAOYSA-N Arg-Phe-His Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccccc1)C(=O)NC(Cc2c[nH]cn2)C(=O)O IGFJVXOATGZTHD-UHFFFAOYSA-N 0.000 description 1
- RATVAFHGEFAWDH-JYJNAYRXSA-N Arg-Phe-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCCN=C(N)N)N RATVAFHGEFAWDH-JYJNAYRXSA-N 0.000 description 1
- WKPXXXUSUHAXDE-SRVKXCTJSA-N Arg-Pro-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O WKPXXXUSUHAXDE-SRVKXCTJSA-N 0.000 description 1
- HGKHPCFTRQDHCU-IUCAKERBSA-N Arg-Pro-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HGKHPCFTRQDHCU-IUCAKERBSA-N 0.000 description 1
- STHNZYKCJHWULY-AVGNSLFASA-N Arg-Pro-His Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O STHNZYKCJHWULY-AVGNSLFASA-N 0.000 description 1
- ATABBWFGOHKROJ-GUBZILKMSA-N Arg-Pro-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O ATABBWFGOHKROJ-GUBZILKMSA-N 0.000 description 1
- AUIJUTGLPVHIRT-FXQIFTODSA-N Arg-Ser-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N AUIJUTGLPVHIRT-FXQIFTODSA-N 0.000 description 1
- ISJWBVIYRBAXEB-CIUDSAMLSA-N Arg-Ser-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O ISJWBVIYRBAXEB-CIUDSAMLSA-N 0.000 description 1
- URAUIUGLHBRPMF-NAKRPEOUSA-N Arg-Ser-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O URAUIUGLHBRPMF-NAKRPEOUSA-N 0.000 description 1
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 1
- ASQKVGRCKOFKIU-KZVJFYERSA-N Arg-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ASQKVGRCKOFKIU-KZVJFYERSA-N 0.000 description 1
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 1
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 1
- VLIJAPRTSXSGFY-STQMWFEESA-N Arg-Tyr-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 VLIJAPRTSXSGFY-STQMWFEESA-N 0.000 description 1
- KEZVOBAKAXHMOF-GUBZILKMSA-N Arg-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N KEZVOBAKAXHMOF-GUBZILKMSA-N 0.000 description 1
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- RZVVKNIACROXRM-ZLUOBGJFSA-N Asn-Ala-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N RZVVKNIACROXRM-ZLUOBGJFSA-N 0.000 description 1
- HUZGPXBILPMCHM-IHRRRGAJSA-N Asn-Arg-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HUZGPXBILPMCHM-IHRRRGAJSA-N 0.000 description 1
- APHUDFFMXFYRKP-CIUDSAMLSA-N Asn-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N APHUDFFMXFYRKP-CIUDSAMLSA-N 0.000 description 1
- QHBMKQWOIYJYMI-BYULHYEWSA-N Asn-Asn-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O QHBMKQWOIYJYMI-BYULHYEWSA-N 0.000 description 1
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 1
- BGINHSZTXRJIPP-FXQIFTODSA-N Asn-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N BGINHSZTXRJIPP-FXQIFTODSA-N 0.000 description 1
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 1
- HCAUEJAQCXVQQM-ACZMJKKPSA-N Asn-Glu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HCAUEJAQCXVQQM-ACZMJKKPSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 1
- COUZKSSMBFADSB-AVGNSLFASA-N Asn-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N COUZKSSMBFADSB-AVGNSLFASA-N 0.000 description 1
- SGAUXNZEFIEAAI-GARJFASQSA-N Asn-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)C(=O)O SGAUXNZEFIEAAI-GARJFASQSA-N 0.000 description 1
- YYSYDIYQTUPNQQ-SXTJYALSSA-N Asn-Ile-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YYSYDIYQTUPNQQ-SXTJYALSSA-N 0.000 description 1
- SEKBHZJLARBNPB-GHCJXIJMSA-N Asn-Ile-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O SEKBHZJLARBNPB-GHCJXIJMSA-N 0.000 description 1
- HFPXZWPUVFVNLL-GUBZILKMSA-N Asn-Leu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFPXZWPUVFVNLL-GUBZILKMSA-N 0.000 description 1
- JEEFEQCRXKPQHC-KKUMJFAQSA-N Asn-Leu-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JEEFEQCRXKPQHC-KKUMJFAQSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 1
- RCFGLXMZDYNRSC-CIUDSAMLSA-N Asn-Lys-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O RCFGLXMZDYNRSC-CIUDSAMLSA-N 0.000 description 1
- QDXQWFBLUVTOFL-FXQIFTODSA-N Asn-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(=O)N)N QDXQWFBLUVTOFL-FXQIFTODSA-N 0.000 description 1
- MYVBTYXSWILFCG-BQBZGAKWSA-N Asn-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N MYVBTYXSWILFCG-BQBZGAKWSA-N 0.000 description 1
- AEZCCDMZZJOGII-DCAQKATOSA-N Asn-Met-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O AEZCCDMZZJOGII-DCAQKATOSA-N 0.000 description 1
- RLHANKIRBONJBK-IHRRRGAJSA-N Asn-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)N)N RLHANKIRBONJBK-IHRRRGAJSA-N 0.000 description 1
- KEUNWIXNKVWCFL-FXQIFTODSA-N Asn-Met-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O KEUNWIXNKVWCFL-FXQIFTODSA-N 0.000 description 1
- VITDJIPIJZAVGC-VEVYYDQMSA-N Asn-Met-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VITDJIPIJZAVGC-VEVYYDQMSA-N 0.000 description 1
- MVXJBVVLACEGCG-PCBIJLKTSA-N Asn-Phe-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVXJBVVLACEGCG-PCBIJLKTSA-N 0.000 description 1
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 1
- PLTGTJAZQRGMPP-FXQIFTODSA-N Asn-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O PLTGTJAZQRGMPP-FXQIFTODSA-N 0.000 description 1
- GFGUPLIETCNQGF-DCAQKATOSA-N Asn-Pro-His Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O GFGUPLIETCNQGF-DCAQKATOSA-N 0.000 description 1
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 1
- NPZJLGMWMDNQDD-GHCJXIJMSA-N Asn-Ser-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NPZJLGMWMDNQDD-GHCJXIJMSA-N 0.000 description 1
- HPNDKUOLNRVRAY-BIIVOSGPSA-N Asn-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N)C(=O)O HPNDKUOLNRVRAY-BIIVOSGPSA-N 0.000 description 1
- SNYCNNPOFYBCEK-ZLUOBGJFSA-N Asn-Ser-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O SNYCNNPOFYBCEK-ZLUOBGJFSA-N 0.000 description 1
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 1
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 1
- ZUFPUBYQYWCMDB-NUMRIWBASA-N Asn-Thr-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZUFPUBYQYWCMDB-NUMRIWBASA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- JPSODRNUDXONAS-XIRDDKMYSA-N Asn-Trp-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)NC(=O)[C@H](CC(=O)N)N JPSODRNUDXONAS-XIRDDKMYSA-N 0.000 description 1
- ULZOQOKFYMXHPZ-AQZXSJQPSA-N Asn-Trp-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ULZOQOKFYMXHPZ-AQZXSJQPSA-N 0.000 description 1
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 1
- DPSUVAPLRQDWAO-YDHLFZDLSA-N Asn-Tyr-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(=O)N)N DPSUVAPLRQDWAO-YDHLFZDLSA-N 0.000 description 1
- LTDGPJKGJDIBQD-LAEOZQHASA-N Asn-Val-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LTDGPJKGJDIBQD-LAEOZQHASA-N 0.000 description 1
- CBHVAFXKOYAHOY-NHCYSSNCSA-N Asn-Val-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O CBHVAFXKOYAHOY-NHCYSSNCSA-N 0.000 description 1
- PQKSVQSMTHPRIB-ZKWXMUAHSA-N Asn-Val-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O PQKSVQSMTHPRIB-ZKWXMUAHSA-N 0.000 description 1
- ZLGKHJHFYSRUBH-FXQIFTODSA-N Asp-Arg-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLGKHJHFYSRUBH-FXQIFTODSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 1
- JGDBHIVECJGXJA-FXQIFTODSA-N Asp-Asp-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JGDBHIVECJGXJA-FXQIFTODSA-N 0.000 description 1
- IJHUZMGJRGNXIW-CIUDSAMLSA-N Asp-Glu-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IJHUZMGJRGNXIW-CIUDSAMLSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- RATOMFTUDRYMKX-ACZMJKKPSA-N Asp-Glu-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N RATOMFTUDRYMKX-ACZMJKKPSA-N 0.000 description 1
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 1
- WSXDIZFNQYTUJB-SRVKXCTJSA-N Asp-His-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O WSXDIZFNQYTUJB-SRVKXCTJSA-N 0.000 description 1
- GBSUGIXJAAKZOW-GMOBBJLQSA-N Asp-Ile-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GBSUGIXJAAKZOW-GMOBBJLQSA-N 0.000 description 1
- SEMWSADZTMJELF-BYULHYEWSA-N Asp-Ile-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O SEMWSADZTMJELF-BYULHYEWSA-N 0.000 description 1
- YFSLJHLQOALGSY-ZPFDUUQYSA-N Asp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N YFSLJHLQOALGSY-ZPFDUUQYSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- SCQIQCWLOMOEFP-DCAQKATOSA-N Asp-Leu-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O SCQIQCWLOMOEFP-DCAQKATOSA-N 0.000 description 1
- HKEZZWQWXWGASX-KKUMJFAQSA-N Asp-Leu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 HKEZZWQWXWGASX-KKUMJFAQSA-N 0.000 description 1
- IVPNEDNYYYFAGI-GARJFASQSA-N Asp-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N IVPNEDNYYYFAGI-GARJFASQSA-N 0.000 description 1
- MYOHQBFRJQFIDZ-KKUMJFAQSA-N Asp-Leu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYOHQBFRJQFIDZ-KKUMJFAQSA-N 0.000 description 1
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 1
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 1
- QNIACYURSSCLRP-GUBZILKMSA-N Asp-Lys-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O QNIACYURSSCLRP-GUBZILKMSA-N 0.000 description 1
- VSMYBNPOHYAXSD-GUBZILKMSA-N Asp-Lys-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O VSMYBNPOHYAXSD-GUBZILKMSA-N 0.000 description 1
- HJCGDIGVVWETRO-ZPFDUUQYSA-N Asp-Lys-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O)C(O)=O HJCGDIGVVWETRO-ZPFDUUQYSA-N 0.000 description 1
- FQHBAQLBIXLWAG-DCAQKATOSA-N Asp-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N FQHBAQLBIXLWAG-DCAQKATOSA-N 0.000 description 1
- IMGLJMRIAFKUPZ-FXQIFTODSA-N Asp-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N IMGLJMRIAFKUPZ-FXQIFTODSA-N 0.000 description 1
- JXGJJQJHXHXJQF-CIUDSAMLSA-N Asp-Met-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O JXGJJQJHXHXJQF-CIUDSAMLSA-N 0.000 description 1
- WOPJVEMFXYHZEE-SRVKXCTJSA-N Asp-Phe-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WOPJVEMFXYHZEE-SRVKXCTJSA-N 0.000 description 1
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 1
- ZKAOJVJQGVUIIU-GUBZILKMSA-N Asp-Pro-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZKAOJVJQGVUIIU-GUBZILKMSA-N 0.000 description 1
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 1
- KESWRFKUZRUTAH-FXQIFTODSA-N Asp-Pro-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O KESWRFKUZRUTAH-FXQIFTODSA-N 0.000 description 1
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 1
- GGRSYTUJHAZTFN-IHRRRGAJSA-N Asp-Pro-Tyr Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O GGRSYTUJHAZTFN-IHRRRGAJSA-N 0.000 description 1
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 1
- BRRPVTUFESPTCP-ACZMJKKPSA-N Asp-Ser-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O BRRPVTUFESPTCP-ACZMJKKPSA-N 0.000 description 1
- XYPJXLLXNSAWHZ-SRVKXCTJSA-N Asp-Ser-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XYPJXLLXNSAWHZ-SRVKXCTJSA-N 0.000 description 1
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 1
- UTLCRGFJFSZWAW-OLHMAJIHSA-N Asp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UTLCRGFJFSZWAW-OLHMAJIHSA-N 0.000 description 1
- NAAAPCLFJPURAM-HJGDQZAQSA-N Asp-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O NAAAPCLFJPURAM-HJGDQZAQSA-N 0.000 description 1
- LEYKQPDPZJIRTA-AQZXSJQPSA-N Asp-Trp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LEYKQPDPZJIRTA-AQZXSJQPSA-N 0.000 description 1
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 1
- XWKBWZXGNXTDKY-ZKWXMUAHSA-N Asp-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O XWKBWZXGNXTDKY-ZKWXMUAHSA-N 0.000 description 1
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 1
- GIKOVDMXBAFXDF-NHCYSSNCSA-N Asp-Val-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GIKOVDMXBAFXDF-NHCYSSNCSA-N 0.000 description 1
- 208000010061 Autosomal Dominant Polycystic Kidney Diseases 0.000 description 1
- 108700020463 BRCA1 Proteins 0.000 description 1
- 102000036365 BRCA1 Human genes 0.000 description 1
- 101150072950 BRCA1 gene Proteins 0.000 description 1
- 235000014469 Bacillus subtilis Nutrition 0.000 description 1
- 235000012905 Brassica oleracea var viridis Nutrition 0.000 description 1
- 244000064816 Brassica oleracea var. acephala Species 0.000 description 1
- 241000244203 Caenorhabditis elegans Species 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 206010053684 Cerebrohepatorenal syndrome Diseases 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- 108020004394 Complementary RNA Proteins 0.000 description 1
- 208000027205 Congenital disease Diseases 0.000 description 1
- 108091029523 CpG island Proteins 0.000 description 1
- 102100038254 Cyclin-F Human genes 0.000 description 1
- XABFFGOGKOORCG-CIUDSAMLSA-N Cys-Asp-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XABFFGOGKOORCG-CIUDSAMLSA-N 0.000 description 1
- UFOBYROTHHYVGW-CIUDSAMLSA-N Cys-Cys-His Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O UFOBYROTHHYVGW-CIUDSAMLSA-N 0.000 description 1
- LWTTURISBKEVAC-CIUDSAMLSA-N Cys-Cys-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CS)N LWTTURISBKEVAC-CIUDSAMLSA-N 0.000 description 1
- HNNGTYHNYDOSKV-FXQIFTODSA-N Cys-Cys-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CS)N HNNGTYHNYDOSKV-FXQIFTODSA-N 0.000 description 1
- BPHKULHWEIUDOB-FXQIFTODSA-N Cys-Gln-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O BPHKULHWEIUDOB-FXQIFTODSA-N 0.000 description 1
- PFAQXUDMZVMADG-AVGNSLFASA-N Cys-Gln-Tyr Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PFAQXUDMZVMADG-AVGNSLFASA-N 0.000 description 1
- SFRQEQGPRTVDPO-NRPADANISA-N Cys-Gln-Val Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O SFRQEQGPRTVDPO-NRPADANISA-N 0.000 description 1
- XTHUKRLJRUVVBF-WHFBIAKZSA-N Cys-Gly-Ser Chemical compound SC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O XTHUKRLJRUVVBF-WHFBIAKZSA-N 0.000 description 1
- LYSHSHHDBVKJRN-JBDRJPRFSA-N Cys-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CS)N LYSHSHHDBVKJRN-JBDRJPRFSA-N 0.000 description 1
- KKUVRYLJEXJSGX-MXAVVETBSA-N Cys-Ile-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CS)N KKUVRYLJEXJSGX-MXAVVETBSA-N 0.000 description 1
- ODDOYXKAHLKKQY-MMWGEVLESA-N Cys-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N ODDOYXKAHLKKQY-MMWGEVLESA-N 0.000 description 1
- IZUNQDRIAOLWCN-YUMQZZPRSA-N Cys-Leu-Gly Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N IZUNQDRIAOLWCN-YUMQZZPRSA-N 0.000 description 1
- POSRGGKLRWCUBE-CIUDSAMLSA-N Cys-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N POSRGGKLRWCUBE-CIUDSAMLSA-N 0.000 description 1
- RESAHOSBQHMOKH-KKUMJFAQSA-N Cys-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CS)N RESAHOSBQHMOKH-KKUMJFAQSA-N 0.000 description 1
- KSMSFCBQBQPFAD-GUBZILKMSA-N Cys-Pro-Pro Chemical compound SC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 KSMSFCBQBQPFAD-GUBZILKMSA-N 0.000 description 1
- SWJYSDXMTPMBHO-FXQIFTODSA-N Cys-Pro-Ser Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SWJYSDXMTPMBHO-FXQIFTODSA-N 0.000 description 1
- CMYVIUWVYHOLRD-ZLUOBGJFSA-N Cys-Ser-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CMYVIUWVYHOLRD-ZLUOBGJFSA-N 0.000 description 1
- YNJBLTDKTMKEET-ZLUOBGJFSA-N Cys-Ser-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O YNJBLTDKTMKEET-ZLUOBGJFSA-N 0.000 description 1
- YWEHYKGJWHPGPY-XGEHTFHBSA-N Cys-Thr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CS)N)O YWEHYKGJWHPGPY-XGEHTFHBSA-N 0.000 description 1
- ZLFRUAFDAIFNHN-LKXGYXEUSA-N Cys-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N)O ZLFRUAFDAIFNHN-LKXGYXEUSA-N 0.000 description 1
- IZJLAQMWJHCHTN-BPUTZDHNSA-N Cys-Trp-Arg Chemical compound N[C@@H](CS)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCCNC(=N)N)C(=O)O IZJLAQMWJHCHTN-BPUTZDHNSA-N 0.000 description 1
- XKDHARKYRGHLKO-QEJZJMRPSA-N Cys-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CS)N XKDHARKYRGHLKO-QEJZJMRPSA-N 0.000 description 1
- LHRCZIRWNFRIRG-SRVKXCTJSA-N Cys-Tyr-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N)O LHRCZIRWNFRIRG-SRVKXCTJSA-N 0.000 description 1
- FCXJJTRGVAZDER-FXQIFTODSA-N Cys-Val-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O FCXJJTRGVAZDER-FXQIFTODSA-N 0.000 description 1
- MHYHLWUGWUBUHF-GUBZILKMSA-N Cys-Val-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CS)N MHYHLWUGWUBUHF-GUBZILKMSA-N 0.000 description 1
- AZDQAZRURQMSQD-XPUUQOCRSA-N Cys-Val-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AZDQAZRURQMSQD-XPUUQOCRSA-N 0.000 description 1
- NGOIQDYZMIKCOK-NAKRPEOUSA-N Cys-Val-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NGOIQDYZMIKCOK-NAKRPEOUSA-N 0.000 description 1
- KZZYVYWSXMFYEC-DCAQKATOSA-N Cys-Val-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KZZYVYWSXMFYEC-DCAQKATOSA-N 0.000 description 1
- LPBUBIHAVKXUOT-FXQIFTODSA-N Cys-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N LPBUBIHAVKXUOT-FXQIFTODSA-N 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 101710096438 DNA-binding protein Proteins 0.000 description 1
- 108700004627 Drosophila NetA Proteins 0.000 description 1
- 108700003046 Drosophila NetB Proteins 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 238000002965 ELISA Methods 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 102000004190 Enzymes Human genes 0.000 description 1
- 108090000790 Enzymes Proteins 0.000 description 1
- 101100237347 Escherichia coli (strain K12) metN gene Proteins 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 102000010834 Extracellular Matrix Proteins Human genes 0.000 description 1
- 108010037362 Extracellular Matrix Proteins Proteins 0.000 description 1
- 108010000916 Fimbriae Proteins Proteins 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 1
- IGNGBUVODQLMRJ-CIUDSAMLSA-N Gln-Ala-Met Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IGNGBUVODQLMRJ-CIUDSAMLSA-N 0.000 description 1
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 1
- OVQXQLWWJSNYFV-XEGUGMAKSA-N Gln-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCC(N)=O)C)C(O)=O)=CNC2=C1 OVQXQLWWJSNYFV-XEGUGMAKSA-N 0.000 description 1
- LZRMPXRYLLTAJX-GUBZILKMSA-N Gln-Arg-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZRMPXRYLLTAJX-GUBZILKMSA-N 0.000 description 1
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 1
- JFOKLAPFYCTNHW-SRVKXCTJSA-N Gln-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)N)N JFOKLAPFYCTNHW-SRVKXCTJSA-N 0.000 description 1
- RRYLMJWPWBJFPZ-ACZMJKKPSA-N Gln-Asn-Asp Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RRYLMJWPWBJFPZ-ACZMJKKPSA-N 0.000 description 1
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 1
- LMPBBFWHCRURJD-LAEOZQHASA-N Gln-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N LMPBBFWHCRURJD-LAEOZQHASA-N 0.000 description 1
- IKDOHQHEFPPGJG-FXQIFTODSA-N Gln-Asp-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IKDOHQHEFPPGJG-FXQIFTODSA-N 0.000 description 1
- JKPGHIQCHIIRMS-AVGNSLFASA-N Gln-Asp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N JKPGHIQCHIIRMS-AVGNSLFASA-N 0.000 description 1
- DHNWZLGBTPUTQQ-QEJZJMRPSA-N Gln-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N DHNWZLGBTPUTQQ-QEJZJMRPSA-N 0.000 description 1
- PKVWNYGXMNWJSI-CIUDSAMLSA-N Gln-Gln-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O PKVWNYGXMNWJSI-CIUDSAMLSA-N 0.000 description 1
- LVNILKSSFHCSJZ-IHRRRGAJSA-N Gln-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N LVNILKSSFHCSJZ-IHRRRGAJSA-N 0.000 description 1
- MCAVASRGVBVPMX-FXQIFTODSA-N Gln-Glu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MCAVASRGVBVPMX-FXQIFTODSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 1
- IWUFOVSLWADEJC-AVGNSLFASA-N Gln-His-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IWUFOVSLWADEJC-AVGNSLFASA-N 0.000 description 1
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 1
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 1
- FFVXLVGUJBCKRX-UKJIMTQDSA-N Gln-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N FFVXLVGUJBCKRX-UKJIMTQDSA-N 0.000 description 1
- SHAUZYVSXAMYAZ-JYJNAYRXSA-N Gln-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SHAUZYVSXAMYAZ-JYJNAYRXSA-N 0.000 description 1
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 1
- KLKYKPXITJBSNI-CIUDSAMLSA-N Gln-Met-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O KLKYKPXITJBSNI-CIUDSAMLSA-N 0.000 description 1
- LUGUNEGJNDEBLU-DCAQKATOSA-N Gln-Met-Arg Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LUGUNEGJNDEBLU-DCAQKATOSA-N 0.000 description 1
- KFHASAPTUOASQN-JYJNAYRXSA-N Gln-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)N)N KFHASAPTUOASQN-JYJNAYRXSA-N 0.000 description 1
- WHVLABLIJYGVEK-QEWYBTABSA-N Gln-Phe-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WHVLABLIJYGVEK-QEWYBTABSA-N 0.000 description 1
- FNAJNWPDTIXYJN-CIUDSAMLSA-N Gln-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O FNAJNWPDTIXYJN-CIUDSAMLSA-N 0.000 description 1
- PBYFVIQRFLNQCO-GUBZILKMSA-N Gln-Pro-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O PBYFVIQRFLNQCO-GUBZILKMSA-N 0.000 description 1
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 1
- XIYWAJQIWLXXAF-XKBZYTNZSA-N Gln-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O XIYWAJQIWLXXAF-XKBZYTNZSA-N 0.000 description 1
- NHMRJKKAVMENKJ-WDCWCFNPSA-N Gln-Thr-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NHMRJKKAVMENKJ-WDCWCFNPSA-N 0.000 description 1
- XFHMVFKCQSHLKW-HJGDQZAQSA-N Gln-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O XFHMVFKCQSHLKW-HJGDQZAQSA-N 0.000 description 1
- OEIDWQHTRYEYGG-QEJZJMRPSA-N Gln-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N OEIDWQHTRYEYGG-QEJZJMRPSA-N 0.000 description 1
- UQKVUFGUSVYJMQ-IRIUXVKKSA-N Gln-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)N)N)O UQKVUFGUSVYJMQ-IRIUXVKKSA-N 0.000 description 1
- ICRKQMRFXYDYMK-LAEOZQHASA-N Gln-Val-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ICRKQMRFXYDYMK-LAEOZQHASA-N 0.000 description 1
- SDSMVVSHLAAOJL-UKJIMTQDSA-N Gln-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCC(=O)N)N SDSMVVSHLAAOJL-UKJIMTQDSA-N 0.000 description 1
- VEYGCDYMOXHJLS-GVXVVHGQSA-N Gln-Val-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VEYGCDYMOXHJLS-GVXVVHGQSA-N 0.000 description 1
- ZMXZGYLINVNTKH-DZKIICNBSA-N Gln-Val-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZMXZGYLINVNTKH-DZKIICNBSA-N 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- BPDVTFBJZNBHEU-HGNGGELXSA-N Glu-Ala-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 BPDVTFBJZNBHEU-HGNGGELXSA-N 0.000 description 1
- RCCDHXSRMWCOOY-GUBZILKMSA-N Glu-Arg-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O RCCDHXSRMWCOOY-GUBZILKMSA-N 0.000 description 1
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 1
- DYFJZDDQPNIPAB-NHCYSSNCSA-N Glu-Arg-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O DYFJZDDQPNIPAB-NHCYSSNCSA-N 0.000 description 1
- MLCPTRRNICEKIS-FXQIFTODSA-N Glu-Asn-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLCPTRRNICEKIS-FXQIFTODSA-N 0.000 description 1
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 1
- VAIWPXWHWAPYDF-FXQIFTODSA-N Glu-Asp-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O VAIWPXWHWAPYDF-FXQIFTODSA-N 0.000 description 1
- XXCDTYBVGMPIOA-FXQIFTODSA-N Glu-Asp-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XXCDTYBVGMPIOA-FXQIFTODSA-N 0.000 description 1
- WATXSTJXNBOHKD-LAEOZQHASA-N Glu-Asp-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O WATXSTJXNBOHKD-LAEOZQHASA-N 0.000 description 1
- ISXJHXGYMJKXOI-GUBZILKMSA-N Glu-Cys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCC(O)=O ISXJHXGYMJKXOI-GUBZILKMSA-N 0.000 description 1
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 1
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 1
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 1
- NUSWUSKZRCGFEX-FXQIFTODSA-N Glu-Glu-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(O)=O NUSWUSKZRCGFEX-FXQIFTODSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- APHGWLWMOXGZRL-DCAQKATOSA-N Glu-Glu-His Chemical compound N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O APHGWLWMOXGZRL-DCAQKATOSA-N 0.000 description 1
- AUTNXSQEVVHSJK-YVNDNENWSA-N Glu-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O AUTNXSQEVVHSJK-YVNDNENWSA-N 0.000 description 1
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 1
- KUTPGXNAAOQSPD-LPEHRKFASA-N Glu-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KUTPGXNAAOQSPD-LPEHRKFASA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 1
- AIGROOHQXCACHL-WDSKDSINSA-N Glu-Gly-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O AIGROOHQXCACHL-WDSKDSINSA-N 0.000 description 1
- PXXGVUVQWQGGIG-YUMQZZPRSA-N Glu-Gly-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N PXXGVUVQWQGGIG-YUMQZZPRSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- XMPAXPSENRSOSV-RYUDHWBXSA-N Glu-Gly-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XMPAXPSENRSOSV-RYUDHWBXSA-N 0.000 description 1
- DRLVXRQFROIYTD-GUBZILKMSA-N Glu-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N DRLVXRQFROIYTD-GUBZILKMSA-N 0.000 description 1
- ZJFNRQHUIHKZJF-GUBZILKMSA-N Glu-His-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O ZJFNRQHUIHKZJF-GUBZILKMSA-N 0.000 description 1
- XIKYNVKEUINBGL-IUCAKERBSA-N Glu-His-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)NCC(O)=O XIKYNVKEUINBGL-IUCAKERBSA-N 0.000 description 1
- NJPQBTJSYCKCNS-HVTMNAMFSA-N Glu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N NJPQBTJSYCKCNS-HVTMNAMFSA-N 0.000 description 1
- ZPASCJBSSCRWMC-GVXVVHGQSA-N Glu-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N ZPASCJBSSCRWMC-GVXVVHGQSA-N 0.000 description 1
- QIQABBIDHGQXGA-ZPFDUUQYSA-N Glu-Ile-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QIQABBIDHGQXGA-ZPFDUUQYSA-N 0.000 description 1
- INGJLBQKTRJLFO-UKJIMTQDSA-N Glu-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O INGJLBQKTRJLFO-UKJIMTQDSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 1
- YGLCLCMAYUYZSG-AVGNSLFASA-N Glu-Lys-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 YGLCLCMAYUYZSG-AVGNSLFASA-N 0.000 description 1
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 1
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 1
- ZWMYUDZLXAQHCK-CIUDSAMLSA-N Glu-Met-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O ZWMYUDZLXAQHCK-CIUDSAMLSA-N 0.000 description 1
- JHSRJMUJOGLIHK-GUBZILKMSA-N Glu-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N JHSRJMUJOGLIHK-GUBZILKMSA-N 0.000 description 1
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 1
- KXTAGESXNQEZKB-DZKIICNBSA-N Glu-Phe-Val Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 KXTAGESXNQEZKB-DZKIICNBSA-N 0.000 description 1
- AAJHGGDRKHYSDH-GUBZILKMSA-N Glu-Pro-Gln Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O AAJHGGDRKHYSDH-GUBZILKMSA-N 0.000 description 1
- BIYNPVYAZOUVFQ-CIUDSAMLSA-N Glu-Pro-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O BIYNPVYAZOUVFQ-CIUDSAMLSA-N 0.000 description 1
- NNQDRRUXFJYCCJ-NHCYSSNCSA-N Glu-Pro-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O NNQDRRUXFJYCCJ-NHCYSSNCSA-N 0.000 description 1
- WIKMTDVSCUJIPJ-CIUDSAMLSA-N Glu-Ser-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WIKMTDVSCUJIPJ-CIUDSAMLSA-N 0.000 description 1
- WXONSNSSBYQGNN-AVGNSLFASA-N Glu-Ser-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WXONSNSSBYQGNN-AVGNSLFASA-N 0.000 description 1
- BDISFWMLMNBTGP-NUMRIWBASA-N Glu-Thr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O BDISFWMLMNBTGP-NUMRIWBASA-N 0.000 description 1
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 1
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 1
- CAQXJMUDOLSBPF-SUSMZKCASA-N Glu-Thr-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAQXJMUDOLSBPF-SUSMZKCASA-N 0.000 description 1
- LSYFGBRDBIQYAQ-FHWLQOOXSA-N Glu-Tyr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LSYFGBRDBIQYAQ-FHWLQOOXSA-N 0.000 description 1
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 1
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 1
- QXUPRMQJDWJDFR-NRPADANISA-N Glu-Val-Ser Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXUPRMQJDWJDFR-NRPADANISA-N 0.000 description 1
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 1
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 1
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 1
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 1
- XUDLUKYPXQDCRX-BQBZGAKWSA-N Gly-Arg-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O XUDLUKYPXQDCRX-BQBZGAKWSA-N 0.000 description 1
- OGCIHJPYKVSMTE-YUMQZZPRSA-N Gly-Arg-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OGCIHJPYKVSMTE-YUMQZZPRSA-N 0.000 description 1
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 1
- VXKCPBPQEKKERH-IUCAKERBSA-N Gly-Arg-Pro Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N1CCC[C@H]1C(O)=O VXKCPBPQEKKERH-IUCAKERBSA-N 0.000 description 1
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 1
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 1
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 1
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 1
- XBWMTPAIUQIWKA-BYULHYEWSA-N Gly-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CN XBWMTPAIUQIWKA-BYULHYEWSA-N 0.000 description 1
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 1
- UEGIPZAXNBYCCP-NKWVEPMBSA-N Gly-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)CN)C(=O)O UEGIPZAXNBYCCP-NKWVEPMBSA-N 0.000 description 1
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 1
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 1
- BPQYBFAXRGMGGY-LAEOZQHASA-N Gly-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN BPQYBFAXRGMGGY-LAEOZQHASA-N 0.000 description 1
- JUGQPPOVWXSPKJ-RYUDHWBXSA-N Gly-Gln-Phe Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JUGQPPOVWXSPKJ-RYUDHWBXSA-N 0.000 description 1
- GNPVTZJUUBPZKW-WDSKDSINSA-N Gly-Gln-Ser Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GNPVTZJUUBPZKW-WDSKDSINSA-N 0.000 description 1
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 1
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 1
- GDOZQTNZPCUARW-YFKPBYRVSA-N Gly-Gly-Glu Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O GDOZQTNZPCUARW-YFKPBYRVSA-N 0.000 description 1
- QPCVIQJVRGXUSA-LURJTMIESA-N Gly-Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QPCVIQJVRGXUSA-LURJTMIESA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 1
- AYBKPDHHVADEDA-YUMQZZPRSA-N Gly-His-Asn Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O AYBKPDHHVADEDA-YUMQZZPRSA-N 0.000 description 1
- VAXIVIPMCTYSHI-YUMQZZPRSA-N Gly-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN VAXIVIPMCTYSHI-YUMQZZPRSA-N 0.000 description 1
- FSPVILZGHUJOHS-QWRGUYRKSA-N Gly-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 FSPVILZGHUJOHS-QWRGUYRKSA-N 0.000 description 1
- YFGONBOFGGWKKY-VHSXEESVSA-N Gly-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)CN)C(=O)O YFGONBOFGGWKKY-VHSXEESVSA-N 0.000 description 1
- DGKBSGNCMCLDSL-BYULHYEWSA-N Gly-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN DGKBSGNCMCLDSL-BYULHYEWSA-N 0.000 description 1
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- BHPQOIPBLYJNAW-NGZCFLSTSA-N Gly-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN BHPQOIPBLYJNAW-NGZCFLSTSA-N 0.000 description 1
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 1
- COVXELOAORHTND-LSJOCFKGSA-N Gly-Ile-Val Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O COVXELOAORHTND-LSJOCFKGSA-N 0.000 description 1
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 1
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 1
- YSDLIYZLOTZZNP-UWVGGRQHSA-N Gly-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN YSDLIYZLOTZZNP-UWVGGRQHSA-N 0.000 description 1
- VLIJYPMATZSOLL-YUMQZZPRSA-N Gly-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN VLIJYPMATZSOLL-YUMQZZPRSA-N 0.000 description 1
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 1
- CVFOYJJOZYYEPE-KBPBESRZSA-N Gly-Lys-Tyr Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CVFOYJJOZYYEPE-KBPBESRZSA-N 0.000 description 1
- QLQDIJBYJZKQPR-BQBZGAKWSA-N Gly-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN QLQDIJBYJZKQPR-BQBZGAKWSA-N 0.000 description 1
- ZWRDOVYMQAAISL-UWVGGRQHSA-N Gly-Met-Lys Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCCN ZWRDOVYMQAAISL-UWVGGRQHSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- IXHQLZIWBCQBLQ-STQMWFEESA-N Gly-Pro-Phe Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IXHQLZIWBCQBLQ-STQMWFEESA-N 0.000 description 1
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 1
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 1
- RHRLHXQWHCNJKR-PMVVWTBXSA-N Gly-Thr-His Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 RHRLHXQWHCNJKR-PMVVWTBXSA-N 0.000 description 1
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 1
- GBYYQVBXFVDJPJ-WLTAIBSBSA-N Gly-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)CN)O GBYYQVBXFVDJPJ-WLTAIBSBSA-N 0.000 description 1
- DUAWRXXTOQOECJ-JSGCOSHPSA-N Gly-Tyr-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O DUAWRXXTOQOECJ-JSGCOSHPSA-N 0.000 description 1
- YDIDLLVFCYSXNY-RCOVLWMOSA-N Gly-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN YDIDLLVFCYSXNY-RCOVLWMOSA-N 0.000 description 1
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 1
- YGHSQRJSHKYUJY-SCZZXKLOSA-N Gly-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN YGHSQRJSHKYUJY-SCZZXKLOSA-N 0.000 description 1
- 238000012752 Hepatectomy Methods 0.000 description 1
- AWHJQEYGWRKPHE-LSJOCFKGSA-N His-Ala-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AWHJQEYGWRKPHE-LSJOCFKGSA-N 0.000 description 1
- KZTLOHBDLMIFSH-XVYDVKMFSA-N His-Ala-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O KZTLOHBDLMIFSH-XVYDVKMFSA-N 0.000 description 1
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 1
- TVQGUFGDVODUIF-LSJOCFKGSA-N His-Arg-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CN=CN1)N TVQGUFGDVODUIF-LSJOCFKGSA-N 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- MWWOPNQSBXEUHO-ULQDDVLXSA-N His-Arg-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 MWWOPNQSBXEUHO-ULQDDVLXSA-N 0.000 description 1
- HQKADFMLECZIQJ-HVTMNAMFSA-N His-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N HQKADFMLECZIQJ-HVTMNAMFSA-N 0.000 description 1
- FIMNVXRZGUAGBI-AVGNSLFASA-N His-Glu-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FIMNVXRZGUAGBI-AVGNSLFASA-N 0.000 description 1
- KNNSUUOHFVVJOP-GUBZILKMSA-N His-Glu-Ser Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N KNNSUUOHFVVJOP-GUBZILKMSA-N 0.000 description 1
- NQKRILCJYCASDV-QWRGUYRKSA-N His-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 NQKRILCJYCASDV-QWRGUYRKSA-N 0.000 description 1
- ORERHHPZDDEMSC-VGDYDELISA-N His-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ORERHHPZDDEMSC-VGDYDELISA-N 0.000 description 1
- WZBLRQQCDYYRTD-SIXJUCDHSA-N His-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N WZBLRQQCDYYRTD-SIXJUCDHSA-N 0.000 description 1
- JENKOCSDMSVWPY-SRVKXCTJSA-N His-Leu-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O JENKOCSDMSVWPY-SRVKXCTJSA-N 0.000 description 1
- LJUIEESLIAZSFR-SRVKXCTJSA-N His-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N LJUIEESLIAZSFR-SRVKXCTJSA-N 0.000 description 1
- SKOKHBGDXGTDDP-MELADBBJSA-N His-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N SKOKHBGDXGTDDP-MELADBBJSA-N 0.000 description 1
- LVXFNTIIGOQBMD-SRVKXCTJSA-N His-Leu-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O LVXFNTIIGOQBMD-SRVKXCTJSA-N 0.000 description 1
- XDIVYNSPYBLSME-DCAQKATOSA-N His-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N XDIVYNSPYBLSME-DCAQKATOSA-N 0.000 description 1
- YAEKRYQASVCDLK-JYJNAYRXSA-N His-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N YAEKRYQASVCDLK-JYJNAYRXSA-N 0.000 description 1
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 1
- PZAJPILZRFPYJJ-SRVKXCTJSA-N His-Ser-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O PZAJPILZRFPYJJ-SRVKXCTJSA-N 0.000 description 1
- GIRSNERMXCMDBO-GARJFASQSA-N His-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O GIRSNERMXCMDBO-GARJFASQSA-N 0.000 description 1
- VIJMRAIWYWRXSR-CIUDSAMLSA-N His-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 VIJMRAIWYWRXSR-CIUDSAMLSA-N 0.000 description 1
- BRQKGRLDDDQWQJ-MBLNEYKQSA-N His-Thr-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O BRQKGRLDDDQWQJ-MBLNEYKQSA-N 0.000 description 1
- XHQYFGPIRUHQIB-PBCZWWQYSA-N His-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CN=CN1 XHQYFGPIRUHQIB-PBCZWWQYSA-N 0.000 description 1
- ALPXXNRQBMRCPZ-MEYUZBJRSA-N His-Thr-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ALPXXNRQBMRCPZ-MEYUZBJRSA-N 0.000 description 1
- 108091010875 Histidine kinase domains Proteins 0.000 description 1
- 102000035473 Histidine kinase domains Human genes 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000884183 Homo sapiens Cyclin-F Proteins 0.000 description 1
- 101000767151 Homo sapiens General vesicular transport factor p115 Proteins 0.000 description 1
- 101000946124 Homo sapiens Lipocalin-1 Proteins 0.000 description 1
- 101100515742 Homo sapiens NAGA gene Proteins 0.000 description 1
- 101100304651 Homo sapiens RPL3L gene Proteins 0.000 description 1
- 101001093919 Homo sapiens SEC14-like protein 2 Proteins 0.000 description 1
- 101001059454 Homo sapiens Serine/threonine-protein kinase MARK2 Proteins 0.000 description 1
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 1
- DPTBVFUDCPINIP-JURCDPSOSA-N Ile-Ala-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DPTBVFUDCPINIP-JURCDPSOSA-N 0.000 description 1
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 1
- QTUSJASXLGLJSR-OSUNSFLBSA-N Ile-Arg-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N QTUSJASXLGLJSR-OSUNSFLBSA-N 0.000 description 1
- AZEYWPUCOYXFOE-CYDGBPFRSA-N Ile-Arg-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C(C)C)C(=O)O)N AZEYWPUCOYXFOE-CYDGBPFRSA-N 0.000 description 1
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 1
- HZMLFETXHFHGBB-UGYAYLCHSA-N Ile-Asn-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZMLFETXHFHGBB-UGYAYLCHSA-N 0.000 description 1
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 1
- CCHSQWLCOOZREA-GMOBBJLQSA-N Ile-Asp-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N CCHSQWLCOOZREA-GMOBBJLQSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- WTOAPTKSZJJWKK-HTFCKZLJSA-N Ile-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WTOAPTKSZJJWKK-HTFCKZLJSA-N 0.000 description 1
- ZGGWRNBSBOHIGH-HVTMNAMFSA-N Ile-Gln-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZGGWRNBSBOHIGH-HVTMNAMFSA-N 0.000 description 1
- DMZOUKXXHJQPTL-GRLWGSQLSA-N Ile-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N DMZOUKXXHJQPTL-GRLWGSQLSA-N 0.000 description 1
- DVRDRICMWUSCBN-UKJIMTQDSA-N Ile-Gln-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DVRDRICMWUSCBN-UKJIMTQDSA-N 0.000 description 1
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 1
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 1
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 1
- YKLOMBNBQUTJDT-HVTMNAMFSA-N Ile-His-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YKLOMBNBQUTJDT-HVTMNAMFSA-N 0.000 description 1
- CMNMPCTVCWWYHY-MXAVVETBSA-N Ile-His-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(C)C)C(=O)O)N CMNMPCTVCWWYHY-MXAVVETBSA-N 0.000 description 1
- BBQABUDWDUKJMB-LZXPERKUSA-N Ile-Ile-Ile Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C([O-])=O BBQABUDWDUKJMB-LZXPERKUSA-N 0.000 description 1
- PFPUFNLHBXKPHY-HTFCKZLJSA-N Ile-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)O)N PFPUFNLHBXKPHY-HTFCKZLJSA-N 0.000 description 1
- QZZIBQZLWBOOJH-PEDHHIEDSA-N Ile-Ile-Val Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(=O)O QZZIBQZLWBOOJH-PEDHHIEDSA-N 0.000 description 1
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- XLXPYSDGMXTTNQ-DKIMLUQUSA-N Ile-Phe-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(C)C)C(O)=O XLXPYSDGMXTTNQ-DKIMLUQUSA-N 0.000 description 1
- LRAUKBMYHHNADU-DKIMLUQUSA-N Ile-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 LRAUKBMYHHNADU-DKIMLUQUSA-N 0.000 description 1
- FHPZJWJWTWZKNA-LLLHUVSDSA-N Ile-Phe-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N FHPZJWJWTWZKNA-LLLHUVSDSA-N 0.000 description 1
- AKQFLPNANHNTLP-VKOGCVSHSA-N Ile-Pro-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N AKQFLPNANHNTLP-VKOGCVSHSA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- JZNVOBUNTWNZPW-GHCJXIJMSA-N Ile-Ser-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N JZNVOBUNTWNZPW-GHCJXIJMSA-N 0.000 description 1
- RKQAYOWLSFLJEE-SVSWQMSJSA-N Ile-Thr-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)O)N RKQAYOWLSFLJEE-SVSWQMSJSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- GMUYXHHJAGQHGB-TUBUOCAGSA-N Ile-Thr-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N GMUYXHHJAGQHGB-TUBUOCAGSA-N 0.000 description 1
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 1
- MGUTVMBNOMJLKC-VKOGCVSHSA-N Ile-Trp-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](C(C)C)C(=O)O)N MGUTVMBNOMJLKC-VKOGCVSHSA-N 0.000 description 1
- HQLSBZFLOUHQJK-STECZYCISA-N Ile-Tyr-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HQLSBZFLOUHQJK-STECZYCISA-N 0.000 description 1
- JERJIYYCOGBAIJ-OBAATPRFSA-N Ile-Tyr-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N JERJIYYCOGBAIJ-OBAATPRFSA-N 0.000 description 1
- KXUKTDGKLAOCQK-LSJOCFKGSA-N Ile-Val-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O KXUKTDGKLAOCQK-LSJOCFKGSA-N 0.000 description 1
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 1
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 1
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 1
- APQYGMBHIVXFML-OSUNSFLBSA-N Ile-Val-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N APQYGMBHIVXFML-OSUNSFLBSA-N 0.000 description 1
- YHFPHRUWZMEOIX-CYDGBPFRSA-N Ile-Val-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(=O)O)N YHFPHRUWZMEOIX-CYDGBPFRSA-N 0.000 description 1
- 108700002232 Immediate-Early Genes Proteins 0.000 description 1
- 108020005350 Initiator Codon Proteins 0.000 description 1
- 108020005351 Isochores Proteins 0.000 description 1
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- KVRKAGGMEWNURO-CIUDSAMLSA-N Leu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N KVRKAGGMEWNURO-CIUDSAMLSA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- NTRAGDHVSGKUSF-AVGNSLFASA-N Leu-Arg-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NTRAGDHVSGKUSF-AVGNSLFASA-N 0.000 description 1
- CNNQBZRGQATKNY-DCAQKATOSA-N Leu-Arg-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N CNNQBZRGQATKNY-DCAQKATOSA-N 0.000 description 1
- UILIPCLTHRPCRB-XUXIUFHCSA-N Leu-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(C)C)N UILIPCLTHRPCRB-XUXIUFHCSA-N 0.000 description 1
- QUAAUWNLWMLERT-IHRRRGAJSA-N Leu-Arg-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(O)=O QUAAUWNLWMLERT-IHRRRGAJSA-N 0.000 description 1
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 1
- VCSBGUACOYUIGD-CIUDSAMLSA-N Leu-Asn-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VCSBGUACOYUIGD-CIUDSAMLSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 1
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 1
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 1
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- PPBKJAQJAUHZKX-SRVKXCTJSA-N Leu-Cys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CC(C)C PPBKJAQJAUHZKX-SRVKXCTJSA-N 0.000 description 1
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 1
- GPICTNQYKHHHTH-GUBZILKMSA-N Leu-Gln-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GPICTNQYKHHHTH-GUBZILKMSA-N 0.000 description 1
- RVVBWTWPNFDYBE-SRVKXCTJSA-N Leu-Glu-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVVBWTWPNFDYBE-SRVKXCTJSA-N 0.000 description 1
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 1
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 1
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- AOFYPTOHESIBFZ-KKUMJFAQSA-N Leu-His-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O AOFYPTOHESIBFZ-KKUMJFAQSA-N 0.000 description 1
- XQXGNBFMAXWIGI-MXAVVETBSA-N Leu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 XQXGNBFMAXWIGI-MXAVVETBSA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 1
- QLDHBYRUNQZIJQ-DKIMLUQUSA-N Leu-Ile-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QLDHBYRUNQZIJQ-DKIMLUQUSA-N 0.000 description 1
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 1
- JKSIBWITFMQTOA-XUXIUFHCSA-N Leu-Ile-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O JKSIBWITFMQTOA-XUXIUFHCSA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 1
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 1
- DCGXHWINSHEPIR-SRVKXCTJSA-N Leu-Lys-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CS)C(=O)O)N DCGXHWINSHEPIR-SRVKXCTJSA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 1
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 1
- CPONGMJGVIAWEH-DCAQKATOSA-N Leu-Met-Ala Chemical compound CSCC[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O CPONGMJGVIAWEH-DCAQKATOSA-N 0.000 description 1
- WXZOHBVPVKABQN-DCAQKATOSA-N Leu-Met-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WXZOHBVPVKABQN-DCAQKATOSA-N 0.000 description 1
- JVTYXRRFZCEPPK-RHYQMDGZSA-N Leu-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(C)C)N)O JVTYXRRFZCEPPK-RHYQMDGZSA-N 0.000 description 1
- ZDBMWELMUCLUPL-QEJZJMRPSA-N Leu-Phe-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ZDBMWELMUCLUPL-QEJZJMRPSA-N 0.000 description 1
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 1
- SYRTUBLKWNDSDK-DKIMLUQUSA-N Leu-Phe-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYRTUBLKWNDSDK-DKIMLUQUSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- MUCIDQMDOYQYBR-IHRRRGAJSA-N Leu-Pro-His Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N MUCIDQMDOYQYBR-IHRRRGAJSA-N 0.000 description 1
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 1
- UCXQIIIFOOGYEM-ULQDDVLXSA-N Leu-Pro-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCXQIIIFOOGYEM-ULQDDVLXSA-N 0.000 description 1
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 1
- HWMQRQIFVGEAPH-XIRDDKMYSA-N Leu-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 HWMQRQIFVGEAPH-XIRDDKMYSA-N 0.000 description 1
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 1
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 1
- YWFZWQKWNDOWPA-XIRDDKMYSA-N Leu-Trp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O YWFZWQKWNDOWPA-XIRDDKMYSA-N 0.000 description 1
- LFXSPAIBSZSTEM-PMVMPFDFSA-N Leu-Trp-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CC=CC=C3)C(=O)O)N LFXSPAIBSZSTEM-PMVMPFDFSA-N 0.000 description 1
- BGGTYDNTOYRTTR-MEYUZBJRSA-N Leu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(C)C)N)O BGGTYDNTOYRTTR-MEYUZBJRSA-N 0.000 description 1
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 1
- LMDVGHQPPPLYAR-IHRRRGAJSA-N Leu-Val-His Chemical compound N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O LMDVGHQPPPLYAR-IHRRRGAJSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 102100034724 Lipocalin-1 Human genes 0.000 description 1
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 1
- RVOMPSJXSRPFJT-DCAQKATOSA-N Lys-Ala-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVOMPSJXSRPFJT-DCAQKATOSA-N 0.000 description 1
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 1
- UWKNTTJNVSYXPC-CIUDSAMLSA-N Lys-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN UWKNTTJNVSYXPC-CIUDSAMLSA-N 0.000 description 1
- BRSGXFITDXFMFF-IHRRRGAJSA-N Lys-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N BRSGXFITDXFMFF-IHRRRGAJSA-N 0.000 description 1
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 1
- FUKDBQGFSJUXGX-RWMBFGLXSA-N Lys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)C(=O)O FUKDBQGFSJUXGX-RWMBFGLXSA-N 0.000 description 1
- DNEJSAIMVANNPA-DCAQKATOSA-N Lys-Asn-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DNEJSAIMVANNPA-DCAQKATOSA-N 0.000 description 1
- YKIRNDPUWONXQN-GUBZILKMSA-N Lys-Asn-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKIRNDPUWONXQN-GUBZILKMSA-N 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 1
- OVIVOCSURJYCTM-GUBZILKMSA-N Lys-Asp-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O OVIVOCSURJYCTM-GUBZILKMSA-N 0.000 description 1
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 1
- GKFNXYMAMKJSKD-NHCYSSNCSA-N Lys-Asp-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GKFNXYMAMKJSKD-NHCYSSNCSA-N 0.000 description 1
- OPTCSTACHGNULU-DCAQKATOSA-N Lys-Cys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CCCCN OPTCSTACHGNULU-DCAQKATOSA-N 0.000 description 1
- YFGWNAROEYWGNL-GUBZILKMSA-N Lys-Gln-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YFGWNAROEYWGNL-GUBZILKMSA-N 0.000 description 1
- VSRXPEHZMHSFKU-IUCAKERBSA-N Lys-Gln-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VSRXPEHZMHSFKU-IUCAKERBSA-N 0.000 description 1
- CRNNMTHBMRFQNG-GUBZILKMSA-N Lys-Glu-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N CRNNMTHBMRFQNG-GUBZILKMSA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 1
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 1
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 1
- UETQMSASAVBGJY-QWRGUYRKSA-N Lys-Gly-His Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 UETQMSASAVBGJY-QWRGUYRKSA-N 0.000 description 1
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 1
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 1
- KNKJPYAZQUFLQK-IHRRRGAJSA-N Lys-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCCCN)N KNKJPYAZQUFLQK-IHRRRGAJSA-N 0.000 description 1
- IUWMQCZOTYRXPL-ZPFDUUQYSA-N Lys-Ile-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O IUWMQCZOTYRXPL-ZPFDUUQYSA-N 0.000 description 1
- MYZMQWHPDAYKIE-SRVKXCTJSA-N Lys-Leu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O MYZMQWHPDAYKIE-SRVKXCTJSA-N 0.000 description 1
- PINHPJWGVBKQII-SRVKXCTJSA-N Lys-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N PINHPJWGVBKQII-SRVKXCTJSA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 1
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 1
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 1
- BXPHMHQHYHILBB-BZSNNMDCSA-N Lys-Lys-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BXPHMHQHYHILBB-BZSNNMDCSA-N 0.000 description 1
- URBJRJKWSUFCKS-AVGNSLFASA-N Lys-Met-Arg Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCCCN)N URBJRJKWSUFCKS-AVGNSLFASA-N 0.000 description 1
- KFSALEZVQJYHCE-AVGNSLFASA-N Lys-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCCCN)N KFSALEZVQJYHCE-AVGNSLFASA-N 0.000 description 1
- JPYPRVHMKRFTAT-KKUMJFAQSA-N Lys-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N JPYPRVHMKRFTAT-KKUMJFAQSA-N 0.000 description 1
- PIXVFCBYEGPZPA-JYJNAYRXSA-N Lys-Phe-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N PIXVFCBYEGPZPA-JYJNAYRXSA-N 0.000 description 1
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 1
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 1
- UQJOKDAYFULYIX-AVGNSLFASA-N Lys-Pro-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 UQJOKDAYFULYIX-AVGNSLFASA-N 0.000 description 1
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 1
- SQXZLVXQXWILKW-KKUMJFAQSA-N Lys-Ser-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQXZLVXQXWILKW-KKUMJFAQSA-N 0.000 description 1
- DIBZLYZXTSVGLN-CIUDSAMLSA-N Lys-Ser-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O DIBZLYZXTSVGLN-CIUDSAMLSA-N 0.000 description 1
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 1
- YRNRVKTYDSLKMD-KKUMJFAQSA-N Lys-Ser-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YRNRVKTYDSLKMD-KKUMJFAQSA-N 0.000 description 1
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 1
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 1
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 1
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 1
- OEYKVQKYCHATHO-SZMVWBNQSA-N Lys-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N OEYKVQKYCHATHO-SZMVWBNQSA-N 0.000 description 1
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 1
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 1
- VWPJQIHBBOJWDN-DCAQKATOSA-N Lys-Val-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O VWPJQIHBBOJWDN-DCAQKATOSA-N 0.000 description 1
- GILLQRYAWOMHED-DCAQKATOSA-N Lys-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN GILLQRYAWOMHED-DCAQKATOSA-N 0.000 description 1
- XBAJINCXDBTJRH-WDSOQIARSA-N Lys-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCCN)N XBAJINCXDBTJRH-WDSOQIARSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 108091027974 Mature messenger RNA Proteins 0.000 description 1
- 108010049137 Member 1 Subfamily D ATP Binding Cassette Transporter Proteins 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- 108010052285 Membrane Proteins Proteins 0.000 description 1
- 208000008948 Menkes Kinky Hair Syndrome Diseases 0.000 description 1
- 208000012583 Menkes disease Diseases 0.000 description 1
- KUQWVNFMZLHAPA-CIUDSAMLSA-N Met-Ala-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O KUQWVNFMZLHAPA-CIUDSAMLSA-N 0.000 description 1
- QGQGAIBGTUJRBR-NAKRPEOUSA-N Met-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCSC QGQGAIBGTUJRBR-NAKRPEOUSA-N 0.000 description 1
- QEVRUYFHWJJUHZ-DCAQKATOSA-N Met-Ala-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(C)C QEVRUYFHWJJUHZ-DCAQKATOSA-N 0.000 description 1
- DTICLBJHRYSJLH-GUBZILKMSA-N Met-Ala-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O DTICLBJHRYSJLH-GUBZILKMSA-N 0.000 description 1
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 1
- FJVJLMZUIGMFFU-BQBZGAKWSA-N Met-Asp-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FJVJLMZUIGMFFU-BQBZGAKWSA-N 0.000 description 1
- DNDVVILEHVMWIS-LPEHRKFASA-N Met-Asp-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DNDVVILEHVMWIS-LPEHRKFASA-N 0.000 description 1
- UYAKZHGIPRCGPF-CIUDSAMLSA-N Met-Glu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCSC)N UYAKZHGIPRCGPF-CIUDSAMLSA-N 0.000 description 1
- JPCHYAUKOUGOIB-HJGDQZAQSA-N Met-Glu-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPCHYAUKOUGOIB-HJGDQZAQSA-N 0.000 description 1
- SLQDSYZHHOKQSR-QXEWZRGKSA-N Met-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCSC SLQDSYZHHOKQSR-QXEWZRGKSA-N 0.000 description 1
- SXWQMBGNFXAGAT-FJXKBIBVSA-N Met-Gly-Thr Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SXWQMBGNFXAGAT-FJXKBIBVSA-N 0.000 description 1
- RXWPLVRJQNWXRQ-IHRRRGAJSA-N Met-His-His Chemical compound C([C@H](NC(=O)[C@@H](N)CCSC)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CNC=N1 RXWPLVRJQNWXRQ-IHRRRGAJSA-N 0.000 description 1
- QGRJTULYDZUBAY-ZPFDUUQYSA-N Met-Ile-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGRJTULYDZUBAY-ZPFDUUQYSA-N 0.000 description 1
- HWROAFGWPQUPTE-OSUNSFLBSA-N Met-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCSC)N HWROAFGWPQUPTE-OSUNSFLBSA-N 0.000 description 1
- QZPXMHVKPHJNTR-DCAQKATOSA-N Met-Leu-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O QZPXMHVKPHJNTR-DCAQKATOSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 1
- OIFHHODAXVWKJN-ULQDDVLXSA-N Met-Phe-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 OIFHHODAXVWKJN-ULQDDVLXSA-N 0.000 description 1
- PHKBGZKVOJCIMZ-SRVKXCTJSA-N Met-Pro-Arg Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PHKBGZKVOJCIMZ-SRVKXCTJSA-N 0.000 description 1
- SMVTWPOATVIXTN-NAKRPEOUSA-N Met-Ser-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SMVTWPOATVIXTN-NAKRPEOUSA-N 0.000 description 1
- ZBLSZPYQQRIHQU-RCWTZXSCSA-N Met-Thr-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O ZBLSZPYQQRIHQU-RCWTZXSCSA-N 0.000 description 1
- HNQXYIVNRUXQLU-BPUTZDHNSA-N Met-Trp-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(O)=O)C(O)=O HNQXYIVNRUXQLU-BPUTZDHNSA-N 0.000 description 1
- VYXIKLFLGRTANT-HRCADAONSA-N Met-Tyr-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N VYXIKLFLGRTANT-HRCADAONSA-N 0.000 description 1
- VWFHWJGVLVZVIS-QXEWZRGKSA-N Met-Val-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O VWFHWJGVLVZVIS-QXEWZRGKSA-N 0.000 description 1
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 1
- 102000003792 Metallothionein Human genes 0.000 description 1
- 108090000157 Metallothionein Proteins 0.000 description 1
- 108091092878 Microsatellite Proteins 0.000 description 1
- 241000713333 Mouse mammary tumor virus Species 0.000 description 1
- 101100161482 Mus musculus Abca1 gene Proteins 0.000 description 1
- 101100444360 Mycoplasma mycoides subsp. mycoides SC (strain PG1) ecfA1 gene Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 229930193140 Neomycin Natural products 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 101100202428 Neopyropia yezoensis atps gene Proteins 0.000 description 1
- 102000007517 Neurofibromin 2 Human genes 0.000 description 1
- 108010085839 Neurofibromin 2 Proteins 0.000 description 1
- 241000221960 Neurospora Species 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 108010038807 Oligopeptides Proteins 0.000 description 1
- 102000015636 Oligopeptides Human genes 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 101001041669 Oryctolagus cuniculus Corticostatin 1 Proteins 0.000 description 1
- 101100464186 Oryzias latipes pkd1l1 gene Proteins 0.000 description 1
- 241000590428 Panacea Species 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 108090000279 Peptidyltransferases Proteins 0.000 description 1
- CYZBFPYMSJGBRL-DRZSPHRISA-N Phe-Ala-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CYZBFPYMSJGBRL-DRZSPHRISA-N 0.000 description 1
- AGYXCMYVTBYGCT-ULQDDVLXSA-N Phe-Arg-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O AGYXCMYVTBYGCT-ULQDDVLXSA-N 0.000 description 1
- ZWJKVFAYPLPCQB-UNQGMJICSA-N Phe-Arg-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O ZWJKVFAYPLPCQB-UNQGMJICSA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 1
- HTTYNOXBBOWZTB-SRVKXCTJSA-N Phe-Asn-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HTTYNOXBBOWZTB-SRVKXCTJSA-N 0.000 description 1
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 1
- FRPVPGRXUKFEQE-YDHLFZDLSA-N Phe-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FRPVPGRXUKFEQE-YDHLFZDLSA-N 0.000 description 1
- UMKYAYXCMYYNHI-AVGNSLFASA-N Phe-Gln-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N UMKYAYXCMYYNHI-AVGNSLFASA-N 0.000 description 1
- MGBRZXXGQBAULP-DRZSPHRISA-N Phe-Glu-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGBRZXXGQBAULP-DRZSPHRISA-N 0.000 description 1
- HOYQLNNGMHXZDW-KKUMJFAQSA-N Phe-Glu-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HOYQLNNGMHXZDW-KKUMJFAQSA-N 0.000 description 1
- ZZVUXQCQPXSUFH-JBACZVJFSA-N Phe-Glu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 ZZVUXQCQPXSUFH-JBACZVJFSA-N 0.000 description 1
- LWPMGKSZPKFKJD-DZKIICNBSA-N Phe-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O LWPMGKSZPKFKJD-DZKIICNBSA-N 0.000 description 1
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 1
- JEBWZLWTRPZQRX-QWRGUYRKSA-N Phe-Gly-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O JEBWZLWTRPZQRX-QWRGUYRKSA-N 0.000 description 1
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- QPVFUAUFEBPIPT-CDMKHQONSA-N Phe-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QPVFUAUFEBPIPT-CDMKHQONSA-N 0.000 description 1
- WFHRXJOZEXUKLV-IRXDYDNUSA-N Phe-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 WFHRXJOZEXUKLV-IRXDYDNUSA-N 0.000 description 1
- HQCSLJFGZYOXHW-KKUMJFAQSA-N Phe-His-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CS)C(=O)O)N HQCSLJFGZYOXHW-KKUMJFAQSA-N 0.000 description 1
- FINLZXKJWTYYLC-ACRUOGEOSA-N Phe-His-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FINLZXKJWTYYLC-ACRUOGEOSA-N 0.000 description 1
- RGZYXNFHYRFNNS-MXAVVETBSA-N Phe-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N RGZYXNFHYRFNNS-MXAVVETBSA-N 0.000 description 1
- BWTKUQPNOMMKMA-FIRPJDEBSA-N Phe-Ile-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BWTKUQPNOMMKMA-FIRPJDEBSA-N 0.000 description 1
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 1
- YTILBRIUASDGBL-BZSNNMDCSA-N Phe-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 YTILBRIUASDGBL-BZSNNMDCSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- CMHTUJQZQXFNTQ-OEAJRASXSA-N Phe-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O CMHTUJQZQXFNTQ-OEAJRASXSA-N 0.000 description 1
- HQPWNHXERZCIHP-PMVMPFDFSA-N Phe-Leu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 HQPWNHXERZCIHP-PMVMPFDFSA-N 0.000 description 1
- ZUQACJLOHYRVPJ-DKIMLUQUSA-N Phe-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZUQACJLOHYRVPJ-DKIMLUQUSA-N 0.000 description 1
- PHJUFDQVVKVOPU-ULQDDVLXSA-N Phe-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=CC=C1)N PHJUFDQVVKVOPU-ULQDDVLXSA-N 0.000 description 1
- OKQQWSNUSQURLI-JYJNAYRXSA-N Phe-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N OKQQWSNUSQURLI-JYJNAYRXSA-N 0.000 description 1
- JKJSIYKSGIDHPM-WBAXXEDZSA-N Phe-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O JKJSIYKSGIDHPM-WBAXXEDZSA-N 0.000 description 1
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 1
- GRVMHFCZUIYNKQ-UFYCRDLUSA-N Phe-Phe-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O GRVMHFCZUIYNKQ-UFYCRDLUSA-N 0.000 description 1
- ZJPGOXWRFNKIQL-JYJNAYRXSA-N Phe-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=CC=C1 ZJPGOXWRFNKIQL-JYJNAYRXSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- BSTPNLNKHKBONJ-HTUGSXCWSA-N Phe-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O BSTPNLNKHKBONJ-HTUGSXCWSA-N 0.000 description 1
- CXMSESHALPOLRE-MEYUZBJRSA-N Phe-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O CXMSESHALPOLRE-MEYUZBJRSA-N 0.000 description 1
- FGWUALWGCZJQDJ-URLPEUOOSA-N Phe-Thr-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FGWUALWGCZJQDJ-URLPEUOOSA-N 0.000 description 1
- PTDAGKJHZBGDKD-OEAJRASXSA-N Phe-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O PTDAGKJHZBGDKD-OEAJRASXSA-N 0.000 description 1
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 1
- BPIFSOUEUYDJRM-DCPHZVHLSA-N Phe-Trp-Ala Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](C)C(O)=O)C1=CC=CC=C1 BPIFSOUEUYDJRM-DCPHZVHLSA-N 0.000 description 1
- AGTHXWTYCLLYMC-FHWLQOOXSA-N Phe-Tyr-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 AGTHXWTYCLLYMC-FHWLQOOXSA-N 0.000 description 1
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 1
- APMXLWHMIVWLLR-BZSNNMDCSA-N Phe-Tyr-Ser Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(O)=O)C1=CC=CC=C1 APMXLWHMIVWLLR-BZSNNMDCSA-N 0.000 description 1
- CDHURCQGUDNBMA-UBHSHLNASA-N Phe-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 CDHURCQGUDNBMA-UBHSHLNASA-N 0.000 description 1
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 1
- VDTYRPWRWRCROL-UFYCRDLUSA-N Phe-Val-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 VDTYRPWRWRCROL-UFYCRDLUSA-N 0.000 description 1
- XBCOOBCTVMMQSC-BVSLBCMMSA-N Phe-Val-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 XBCOOBCTVMMQSC-BVSLBCMMSA-N 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 1
- LNLNHXIQPGKRJQ-SRVKXCTJSA-N Pro-Arg-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H]1CCCN1 LNLNHXIQPGKRJQ-SRVKXCTJSA-N 0.000 description 1
- SSSFPISOZOLQNP-GUBZILKMSA-N Pro-Arg-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSSFPISOZOLQNP-GUBZILKMSA-N 0.000 description 1
- ICTZKEXYDDZZFP-SRVKXCTJSA-N Pro-Arg-Pro Chemical compound N([C@@H](CCCN=C(N)N)C(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 ICTZKEXYDDZZFP-SRVKXCTJSA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- VJLJGKQAOQJXJG-CIUDSAMLSA-N Pro-Asp-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJLJGKQAOQJXJG-CIUDSAMLSA-N 0.000 description 1
- KPDRZQUWJKTMBP-DCAQKATOSA-N Pro-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 KPDRZQUWJKTMBP-DCAQKATOSA-N 0.000 description 1
- HXOLCSYHGRNXJJ-IHRRRGAJSA-N Pro-Asp-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HXOLCSYHGRNXJJ-IHRRRGAJSA-N 0.000 description 1
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 1
- ZPPVJIJMIKTERM-YUMQZZPRSA-N Pro-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ZPPVJIJMIKTERM-YUMQZZPRSA-N 0.000 description 1
- DRIJZWBRGMJCDD-DCAQKATOSA-N Pro-Gln-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O DRIJZWBRGMJCDD-DCAQKATOSA-N 0.000 description 1
- LXVLKXPFIDDHJG-CIUDSAMLSA-N Pro-Glu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O LXVLKXPFIDDHJG-CIUDSAMLSA-N 0.000 description 1
- VPEVBAUSTBWQHN-NHCYSSNCSA-N Pro-Glu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O VPEVBAUSTBWQHN-NHCYSSNCSA-N 0.000 description 1
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 1
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 1
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 1
- XQHGISDMVBTGAL-ULQDDVLXSA-N Pro-His-Phe Chemical compound C([C@@H](C(=O)[O-])NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H]1[NH2+]CCC1)C1=CC=CC=C1 XQHGISDMVBTGAL-ULQDDVLXSA-N 0.000 description 1
- IBGCFJDLCYTKPW-NAKRPEOUSA-N Pro-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 IBGCFJDLCYTKPW-NAKRPEOUSA-N 0.000 description 1
- YXHYJEPDKSYPSQ-AVGNSLFASA-N Pro-Leu-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 YXHYJEPDKSYPSQ-AVGNSLFASA-N 0.000 description 1
- HATVCTYBNCNMAA-AVGNSLFASA-N Pro-Leu-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O HATVCTYBNCNMAA-AVGNSLFASA-N 0.000 description 1
- DRKAXLDECUGLFE-ULQDDVLXSA-N Pro-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O DRKAXLDECUGLFE-ULQDDVLXSA-N 0.000 description 1
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 1
- ULWBBFKQBDNGOY-RWMBFGLXSA-N Pro-Lys-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N2CCC[C@@H]2C(=O)O ULWBBFKQBDNGOY-RWMBFGLXSA-N 0.000 description 1
- BLJMJZOMZRCESA-GUBZILKMSA-N Pro-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BLJMJZOMZRCESA-GUBZILKMSA-N 0.000 description 1
- RPLMFKUKFZOTER-AVGNSLFASA-N Pro-Met-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@@H]1CCCN1 RPLMFKUKFZOTER-AVGNSLFASA-N 0.000 description 1
- AUYKOPJPKUCYHE-SRVKXCTJSA-N Pro-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@@H]1CCCN1 AUYKOPJPKUCYHE-SRVKXCTJSA-N 0.000 description 1
- MLKVIVZCFYRTIR-KKUMJFAQSA-N Pro-Phe-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLKVIVZCFYRTIR-KKUMJFAQSA-N 0.000 description 1
- GNADVDLLGVSXLS-ULQDDVLXSA-N Pro-Phe-His Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O GNADVDLLGVSXLS-ULQDDVLXSA-N 0.000 description 1
- KDBHVPXBQADZKY-GUBZILKMSA-N Pro-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KDBHVPXBQADZKY-GUBZILKMSA-N 0.000 description 1
- FYKUEXMZYFIZKA-DCAQKATOSA-N Pro-Pro-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FYKUEXMZYFIZKA-DCAQKATOSA-N 0.000 description 1
- NAIPAPCKKRCMBL-JYJNAYRXSA-N Pro-Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CC=CC=C1 NAIPAPCKKRCMBL-JYJNAYRXSA-N 0.000 description 1
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 1
- KBUAPZAZPWNYSW-SRVKXCTJSA-N Pro-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KBUAPZAZPWNYSW-SRVKXCTJSA-N 0.000 description 1
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 1
- GZNYIXWOIUFLGO-ZJDVBMNYSA-N Pro-Thr-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZNYIXWOIUFLGO-ZJDVBMNYSA-N 0.000 description 1
- VVAWNPIOYXAMAL-KJEVXHAQSA-N Pro-Thr-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VVAWNPIOYXAMAL-KJEVXHAQSA-N 0.000 description 1
- OFSZYRZOUMNCCU-BZSNNMDCSA-N Pro-Trp-Met Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCSC)C(O)=O)C(=O)[C@@H]1CCCN1 OFSZYRZOUMNCCU-BZSNNMDCSA-N 0.000 description 1
- SNSYSBUTTJBPDG-OKZBNKHCSA-N Pro-Trp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N4CCC[C@@H]4C(=O)O SNSYSBUTTJBPDG-OKZBNKHCSA-N 0.000 description 1
- DLZBBDSPTJBOOD-BPNCWPANSA-N Pro-Tyr-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O DLZBBDSPTJBOOD-BPNCWPANSA-N 0.000 description 1
- BVRBCQBUNGAWFP-KKUMJFAQSA-N Pro-Tyr-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O BVRBCQBUNGAWFP-KKUMJFAQSA-N 0.000 description 1
- OQSGBXGNAFQGGS-CYDGBPFRSA-N Pro-Val-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OQSGBXGNAFQGGS-CYDGBPFRSA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 102000001253 Protein Kinase Human genes 0.000 description 1
- 101150037994 RAB26 gene Proteins 0.000 description 1
- 108010054530 RGDN peptide Proteins 0.000 description 1
- 102000009572 RNA Polymerase II Human genes 0.000 description 1
- 108010009460 RNA Polymerase II Proteins 0.000 description 1
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 1
- 101710105008 RNA-binding protein Proteins 0.000 description 1
- 101150010608 RNPS1 gene Proteins 0.000 description 1
- 101000779603 Rattus norvegicus FAD-linked sulfhydryl oxidase ALR Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- 108091027981 Response element Proteins 0.000 description 1
- 206010038997 Retroviral infections Diseases 0.000 description 1
- 102000004285 Ribosomal Protein L3 Human genes 0.000 description 1
- 102000002278 Ribosomal Proteins Human genes 0.000 description 1
- 108010000605 Ribosomal Proteins Proteins 0.000 description 1
- 101100459241 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) MXR2 gene Proteins 0.000 description 1
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 1
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 1
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 1
- OBXVZEAMXFSGPU-FXQIFTODSA-N Ser-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)CN=C(N)N OBXVZEAMXFSGPU-FXQIFTODSA-N 0.000 description 1
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 1
- OHKLFYXEOGGGCK-ZLUOBGJFSA-N Ser-Asp-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OHKLFYXEOGGGCK-ZLUOBGJFSA-N 0.000 description 1
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 1
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 1
- BLPYXIXXCFVIIF-FXQIFTODSA-N Ser-Cys-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N)CN=C(N)N BLPYXIXXCFVIIF-FXQIFTODSA-N 0.000 description 1
- MOVJSUIKUNCVMG-ZLUOBGJFSA-N Ser-Cys-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N)O MOVJSUIKUNCVMG-ZLUOBGJFSA-N 0.000 description 1
- MAWSJXHRLWVJEZ-ACZMJKKPSA-N Ser-Gln-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N MAWSJXHRLWVJEZ-ACZMJKKPSA-N 0.000 description 1
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 1
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 1
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 1
- GRSLLFZTTLBOQX-CIUDSAMLSA-N Ser-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N GRSLLFZTTLBOQX-CIUDSAMLSA-N 0.000 description 1
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 1
- WBINSDOPZHQPPM-AVGNSLFASA-N Ser-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)O WBINSDOPZHQPPM-AVGNSLFASA-N 0.000 description 1
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 1
- IXCHOHLPHNGFTJ-YUMQZZPRSA-N Ser-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N IXCHOHLPHNGFTJ-YUMQZZPRSA-N 0.000 description 1
- QGAHMVHBORDHDC-YUMQZZPRSA-N Ser-His-Gly Chemical compound OC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CN=CN1 QGAHMVHBORDHDC-YUMQZZPRSA-N 0.000 description 1
- DOSZISJPMCYEHT-NAKRPEOUSA-N Ser-Ile-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O DOSZISJPMCYEHT-NAKRPEOUSA-N 0.000 description 1
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 1
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 1
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 1
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 1
- KCGIREHVWRXNDH-GARJFASQSA-N Ser-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N KCGIREHVWRXNDH-GARJFASQSA-N 0.000 description 1
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 1
- OCWWJBZQXGYQCA-DCAQKATOSA-N Ser-Lys-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O OCWWJBZQXGYQCA-DCAQKATOSA-N 0.000 description 1
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 1
- UGGWCAFQPKANMW-FXQIFTODSA-N Ser-Met-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UGGWCAFQPKANMW-FXQIFTODSA-N 0.000 description 1
- AXVNLRQLPLSIPQ-FXQIFTODSA-N Ser-Met-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N AXVNLRQLPLSIPQ-FXQIFTODSA-N 0.000 description 1
- IFLVBVIYADZIQO-DCAQKATOSA-N Ser-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N IFLVBVIYADZIQO-DCAQKATOSA-N 0.000 description 1
- GDUZTEQRAOXYJS-SRVKXCTJSA-N Ser-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GDUZTEQRAOXYJS-SRVKXCTJSA-N 0.000 description 1
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 1
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 1
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 1
- MQUZANJDFOQOBX-SRVKXCTJSA-N Ser-Phe-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O MQUZANJDFOQOBX-SRVKXCTJSA-N 0.000 description 1
- PJIQEIFXZPCWOJ-FXQIFTODSA-N Ser-Pro-Asp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O PJIQEIFXZPCWOJ-FXQIFTODSA-N 0.000 description 1
- WNDUPCKKKGSKIQ-CIUDSAMLSA-N Ser-Pro-Gln Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O WNDUPCKKKGSKIQ-CIUDSAMLSA-N 0.000 description 1
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 1
- AZWNCEBQZXELEZ-FXQIFTODSA-N Ser-Pro-Ser Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O AZWNCEBQZXELEZ-FXQIFTODSA-N 0.000 description 1
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 1
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 1
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 1
- AABIBDJHSKIMJK-FXQIFTODSA-N Ser-Ser-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O AABIBDJHSKIMJK-FXQIFTODSA-N 0.000 description 1
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 1
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 1
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 1
- SOACHCFYJMCMHC-BWBBJGPYSA-N Ser-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N)O SOACHCFYJMCMHC-BWBBJGPYSA-N 0.000 description 1
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- UBTNVMGPMYDYIU-HJPIBITLSA-N Ser-Tyr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UBTNVMGPMYDYIU-HJPIBITLSA-N 0.000 description 1
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 1
- JZRYFUGREMECBH-XPUUQOCRSA-N Ser-Val-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O JZRYFUGREMECBH-XPUUQOCRSA-N 0.000 description 1
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 102100028904 Serine/threonine-protein kinase MARK2 Human genes 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- FKNQFGJONOIPTF-UHFFFAOYSA-N Sodium cation Chemical compound [Na+] FKNQFGJONOIPTF-UHFFFAOYSA-N 0.000 description 1
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 1
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 1
- 108010091105 Subfamily B ATP Binding Cassette Transporter Proteins 0.000 description 1
- 102000018075 Subfamily B ATP Binding Cassette Transporter Human genes 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 1
- DDPVJPIGACCMEH-XQXXSGGOSA-N Thr-Ala-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DDPVJPIGACCMEH-XQXXSGGOSA-N 0.000 description 1
- STGXWWBXWXZOER-MBLNEYKQSA-N Thr-Ala-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 STGXWWBXWXZOER-MBLNEYKQSA-N 0.000 description 1
- PXQUBKWZENPDGE-CIQUZCHMSA-N Thr-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)O)N PXQUBKWZENPDGE-CIQUZCHMSA-N 0.000 description 1
- KEGBFULVYKYJRD-LFSVMHDDSA-N Thr-Ala-Phe Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KEGBFULVYKYJRD-LFSVMHDDSA-N 0.000 description 1
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 1
- PKXHGEXFMIZSER-QTKMDUPCSA-N Thr-Arg-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O PKXHGEXFMIZSER-QTKMDUPCSA-N 0.000 description 1
- PAOYNIKMYOGBMR-PBCZWWQYSA-N Thr-Asn-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O PAOYNIKMYOGBMR-PBCZWWQYSA-N 0.000 description 1
- LXWZOMSOUAMOIA-JIOCBJNQSA-N Thr-Asn-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O LXWZOMSOUAMOIA-JIOCBJNQSA-N 0.000 description 1
- VXMHQKHDKCATDV-VEVYYDQMSA-N Thr-Asp-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VXMHQKHDKCATDV-VEVYYDQMSA-N 0.000 description 1
- GKMYGVQDGVYCPC-IUKAMOBKSA-N Thr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)O)N GKMYGVQDGVYCPC-IUKAMOBKSA-N 0.000 description 1
- GNHRVXYZKWSJTF-HJGDQZAQSA-N Thr-Asp-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N)O GNHRVXYZKWSJTF-HJGDQZAQSA-N 0.000 description 1
- OHAJHDJOCKKJLV-LKXGYXEUSA-N Thr-Asp-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OHAJHDJOCKKJLV-LKXGYXEUSA-N 0.000 description 1
- XDARBNMYXKUFOJ-GSSVUCPTSA-N Thr-Asp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XDARBNMYXKUFOJ-GSSVUCPTSA-N 0.000 description 1
- RJBFAHKSFNNHAI-XKBZYTNZSA-N Thr-Gln-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N)O RJBFAHKSFNNHAI-XKBZYTNZSA-N 0.000 description 1
- MQUZMZBFKCHVOB-HJGDQZAQSA-N Thr-Gln-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O MQUZMZBFKCHVOB-HJGDQZAQSA-N 0.000 description 1
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 1
- XSTGOZBBXFKGHA-YJRXYDGGSA-N Thr-His-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O XSTGOZBBXFKGHA-YJRXYDGGSA-N 0.000 description 1
- FQPDRTDDEZXCEC-SVSWQMSJSA-N Thr-Ile-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O FQPDRTDDEZXCEC-SVSWQMSJSA-N 0.000 description 1
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 1
- ISLDRLHVPXABBC-IEGACIPQSA-N Thr-Leu-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ISLDRLHVPXABBC-IEGACIPQSA-N 0.000 description 1
- WFAUDCSNCWJJAA-KXNHARMFSA-N Thr-Lys-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(O)=O WFAUDCSNCWJJAA-KXNHARMFSA-N 0.000 description 1
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 1
- DXPURPNJDFCKKO-RHYQMDGZSA-N Thr-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DXPURPNJDFCKKO-RHYQMDGZSA-N 0.000 description 1
- BIBYEFRASCNLAA-CDMKHQONSA-N Thr-Phe-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 BIBYEFRASCNLAA-CDMKHQONSA-N 0.000 description 1
- VEIKMWOMUYMMMK-FCLVOEFKSA-N Thr-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 VEIKMWOMUYMMMK-FCLVOEFKSA-N 0.000 description 1
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 1
- MROIJTGJGIDEEJ-RCWTZXSCSA-N Thr-Pro-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 MROIJTGJGIDEEJ-RCWTZXSCSA-N 0.000 description 1
- YGZWVPBHYABGLT-KJEVXHAQSA-N Thr-Pro-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YGZWVPBHYABGLT-KJEVXHAQSA-N 0.000 description 1
- DOBIBIXIHJKVJF-XKBZYTNZSA-N Thr-Ser-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O DOBIBIXIHJKVJF-XKBZYTNZSA-N 0.000 description 1
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 1
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 1
- PJCYRZVSACOYSN-ZJDVBMNYSA-N Thr-Thr-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O PJCYRZVSACOYSN-ZJDVBMNYSA-N 0.000 description 1
- KVEWWQRTAVMOFT-KJEVXHAQSA-N Thr-Tyr-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O KVEWWQRTAVMOFT-KJEVXHAQSA-N 0.000 description 1
- SPIFGZFZMVLPHN-UNQGMJICSA-N Thr-Val-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SPIFGZFZMVLPHN-UNQGMJICSA-N 0.000 description 1
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 1
- 241000006364 Torula Species 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- HYVLNORXQGKONN-NUTKFTJISA-N Trp-Ala-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 HYVLNORXQGKONN-NUTKFTJISA-N 0.000 description 1
- FOAJSVIXYCLTSC-PJODQICGSA-N Trp-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N FOAJSVIXYCLTSC-PJODQICGSA-N 0.000 description 1
- HYNAKPYFEYJMAS-XIRDDKMYSA-N Trp-Arg-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HYNAKPYFEYJMAS-XIRDDKMYSA-N 0.000 description 1
- GUWJWCHZNGDKBG-UBHSHLNASA-N Trp-Asn-Cys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N GUWJWCHZNGDKBG-UBHSHLNASA-N 0.000 description 1
- OBWQLWYNNZPWGX-QEJZJMRPSA-N Trp-Gln-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O OBWQLWYNNZPWGX-QEJZJMRPSA-N 0.000 description 1
- YXONONCLMLHWJX-SZMVWBNQSA-N Trp-Glu-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 YXONONCLMLHWJX-SZMVWBNQSA-N 0.000 description 1
- FXHOCONKLLUOCF-WDSOQIARSA-N Trp-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N FXHOCONKLLUOCF-WDSOQIARSA-N 0.000 description 1
- YTVJTXJTNRWJCR-JBACZVJFSA-N Trp-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N YTVJTXJTNRWJCR-JBACZVJFSA-N 0.000 description 1
- UJGDFQRPYGJBEH-AAEUAGOBSA-N Trp-Ser-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N UJGDFQRPYGJBEH-AAEUAGOBSA-N 0.000 description 1
- QHWMVGCEQAPQDK-UMPQAUOISA-N Trp-Thr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O QHWMVGCEQAPQDK-UMPQAUOISA-N 0.000 description 1
- VMXLNDRJXVAJFT-JYBASQMISA-N Trp-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O VMXLNDRJXVAJFT-JYBASQMISA-N 0.000 description 1
- FBHHJGOJWXHGDO-TUSQITKMSA-N Trp-Trp-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC=3C4=CC=CC=C4NC=3)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 FBHHJGOJWXHGDO-TUSQITKMSA-N 0.000 description 1
- XXJDYWYVZBHELV-TUSQITKMSA-N Trp-Trp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)N[C@@H](CCCCN)C(=O)O)N XXJDYWYVZBHELV-TUSQITKMSA-N 0.000 description 1
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 1
- DXYWRYQRKPIGGU-BPNCWPANSA-N Tyr-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DXYWRYQRKPIGGU-BPNCWPANSA-N 0.000 description 1
- MICSYKFECRFCTJ-IHRRRGAJSA-N Tyr-Arg-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O MICSYKFECRFCTJ-IHRRRGAJSA-N 0.000 description 1
- HKIUVWMZYFBIHG-KKUMJFAQSA-N Tyr-Arg-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O HKIUVWMZYFBIHG-KKUMJFAQSA-N 0.000 description 1
- KDGFPPHLXCEQRN-STECZYCISA-N Tyr-Arg-Ile Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDGFPPHLXCEQRN-STECZYCISA-N 0.000 description 1
- XHALUUQSNXSPLP-UFYCRDLUSA-N Tyr-Arg-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XHALUUQSNXSPLP-UFYCRDLUSA-N 0.000 description 1
- CRWOSTCODDFEKZ-HRCADAONSA-N Tyr-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O CRWOSTCODDFEKZ-HRCADAONSA-N 0.000 description 1
- PZXUIGWOEWWFQM-SRVKXCTJSA-N Tyr-Asn-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O PZXUIGWOEWWFQM-SRVKXCTJSA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 1
- SMLCYZYQFRTLCO-UWJYBYFXSA-N Tyr-Cys-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O SMLCYZYQFRTLCO-UWJYBYFXSA-N 0.000 description 1
- FFCRCJZJARTYCG-KKUMJFAQSA-N Tyr-Cys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N)O FFCRCJZJARTYCG-KKUMJFAQSA-N 0.000 description 1
- QOEZFICGUZTRFX-IHRRRGAJSA-N Tyr-Cys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O QOEZFICGUZTRFX-IHRRRGAJSA-N 0.000 description 1
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 1
- CDHQEOXPWBDFPL-QWRGUYRKSA-N Tyr-Gly-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDHQEOXPWBDFPL-QWRGUYRKSA-N 0.000 description 1
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 1
- FNWGDMZVYBVAGJ-XEGUGMAKSA-N Tyr-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CC=C(C=C1)O)N FNWGDMZVYBVAGJ-XEGUGMAKSA-N 0.000 description 1
- OLWFDNLLBWQWCP-STQMWFEESA-N Tyr-Gly-Met Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O OLWFDNLLBWQWCP-STQMWFEESA-N 0.000 description 1
- YIKDYZDNRCNFQB-KKUMJFAQSA-N Tyr-His-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O YIKDYZDNRCNFQB-KKUMJFAQSA-N 0.000 description 1
- NXRGXTBPMOGFID-CFMVVWHZSA-N Tyr-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O NXRGXTBPMOGFID-CFMVVWHZSA-N 0.000 description 1
- WSFXJLFSJSXGMQ-MGHWNKPDSA-N Tyr-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N WSFXJLFSJSXGMQ-MGHWNKPDSA-N 0.000 description 1
- DMWNPLOERDAHSY-MEYUZBJRSA-N Tyr-Leu-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DMWNPLOERDAHSY-MEYUZBJRSA-N 0.000 description 1
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 1
- GYKDRHDMGQUZPU-MGHWNKPDSA-N Tyr-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N GYKDRHDMGQUZPU-MGHWNKPDSA-N 0.000 description 1
- PGEFRHBWGOJPJT-KKUMJFAQSA-N Tyr-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O PGEFRHBWGOJPJT-KKUMJFAQSA-N 0.000 description 1
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 1
- QHONGSVIVOFKAC-ULQDDVLXSA-N Tyr-Pro-His Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QHONGSVIVOFKAC-ULQDDVLXSA-N 0.000 description 1
- ZPFLBLFITJCBTP-QWRGUYRKSA-N Tyr-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O ZPFLBLFITJCBTP-QWRGUYRKSA-N 0.000 description 1
- PLVVHGFEMSDRET-IHPCNDPISA-N Tyr-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC3=CC=C(C=C3)O)N PLVVHGFEMSDRET-IHPCNDPISA-N 0.000 description 1
- RIVVDNTUSRVTQT-IRIUXVKKSA-N Tyr-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O RIVVDNTUSRVTQT-IRIUXVKKSA-N 0.000 description 1
- BXJQKVDPRMLGKN-PMVMPFDFSA-N Tyr-Trp-Leu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(O)=O)C1=CC=C(O)C=C1 BXJQKVDPRMLGKN-PMVMPFDFSA-N 0.000 description 1
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 1
- BQASAMYRHNCKQE-IHRRRGAJSA-N Tyr-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N BQASAMYRHNCKQE-IHRRRGAJSA-N 0.000 description 1
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 1
- NWEGIYMHTZXVBP-JSGCOSHPSA-N Tyr-Val-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O NWEGIYMHTZXVBP-JSGCOSHPSA-N 0.000 description 1
- DJIJBQYBDKGDIS-JYJNAYRXSA-N Tyr-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(C)C)C(O)=O DJIJBQYBDKGDIS-JYJNAYRXSA-N 0.000 description 1
- 229910052770 Uranium Inorganic materials 0.000 description 1
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 1
- LTFLDDDGWOVIHY-NAKRPEOUSA-N Val-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N LTFLDDDGWOVIHY-NAKRPEOUSA-N 0.000 description 1
- AZSHAZJLOZQYAY-FXQIFTODSA-N Val-Ala-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O AZSHAZJLOZQYAY-FXQIFTODSA-N 0.000 description 1
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 1
- IVXJODPZRWHCCR-JYJNAYRXSA-N Val-Arg-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IVXJODPZRWHCCR-JYJNAYRXSA-N 0.000 description 1
- CVUDMNSZAIZFAE-TUAOUCFPSA-N Val-Arg-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N CVUDMNSZAIZFAE-TUAOUCFPSA-N 0.000 description 1
- CVUDMNSZAIZFAE-UHFFFAOYSA-N Val-Arg-Pro Natural products NC(N)=NCCCC(NC(=O)C(N)C(C)C)C(=O)N1CCCC1C(O)=O CVUDMNSZAIZFAE-UHFFFAOYSA-N 0.000 description 1
- DNOOLPROHJWCSQ-RCWTZXSCSA-N Val-Arg-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DNOOLPROHJWCSQ-RCWTZXSCSA-N 0.000 description 1
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 1
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 1
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 1
- DDNIHOWRDOXXPF-NGZCFLSTSA-N Val-Asp-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N DDNIHOWRDOXXPF-NGZCFLSTSA-N 0.000 description 1
- ICFRWCLVYFKHJV-FXQIFTODSA-N Val-Cys-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(=O)O)N ICFRWCLVYFKHJV-FXQIFTODSA-N 0.000 description 1
- NYTKXWLZSNRILS-IFFSRLJSSA-N Val-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N)O NYTKXWLZSNRILS-IFFSRLJSSA-N 0.000 description 1
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 1
- VLDMQVZZWDOKQF-AUTRQRHGSA-N Val-Glu-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VLDMQVZZWDOKQF-AUTRQRHGSA-N 0.000 description 1
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 1
- VVZDBPBZHLQPPB-XVKPBYJWSA-N Val-Glu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VVZDBPBZHLQPPB-XVKPBYJWSA-N 0.000 description 1
- OQWNEUXPKHIEJO-NRPADANISA-N Val-Glu-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N OQWNEUXPKHIEJO-NRPADANISA-N 0.000 description 1
- NXRAUQGGHPCJIB-RCOVLWMOSA-N Val-Gly-Asn Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O NXRAUQGGHPCJIB-RCOVLWMOSA-N 0.000 description 1
- URIRWLJVWHYLET-ONGXEEELSA-N Val-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C URIRWLJVWHYLET-ONGXEEELSA-N 0.000 description 1
- MDYSKHBSPXUOPV-JSGCOSHPSA-N Val-Gly-Phe Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N MDYSKHBSPXUOPV-JSGCOSHPSA-N 0.000 description 1
- LAYSXAOGWHKNED-XPUUQOCRSA-N Val-Gly-Ser Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LAYSXAOGWHKNED-XPUUQOCRSA-N 0.000 description 1
- XXROXFHCMVXETG-UWVGGRQHSA-N Val-Gly-Val Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXROXFHCMVXETG-UWVGGRQHSA-N 0.000 description 1
- FEFZWCSXEMVSPO-LSJOCFKGSA-N Val-His-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](C)C(O)=O FEFZWCSXEMVSPO-LSJOCFKGSA-N 0.000 description 1
- SDSCOOZQQGUQFC-GVXVVHGQSA-N Val-His-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N SDSCOOZQQGUQFC-GVXVVHGQSA-N 0.000 description 1
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 1
- DJQIUOKSNRBTSV-CYDGBPFRSA-N Val-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N DJQIUOKSNRBTSV-CYDGBPFRSA-N 0.000 description 1
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 1
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 1
- XXWBHOWRARMUOC-NHCYSSNCSA-N Val-Lys-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N XXWBHOWRARMUOC-NHCYSSNCSA-N 0.000 description 1
- QRVPEKJBBRYISE-XUXIUFHCSA-N Val-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N QRVPEKJBBRYISE-XUXIUFHCSA-N 0.000 description 1
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 1
- CXWJFWAZIVWBOS-XQQFMLRXSA-N Val-Lys-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CXWJFWAZIVWBOS-XQQFMLRXSA-N 0.000 description 1
- XPKCFQZDQGVJCX-RHYQMDGZSA-N Val-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N)O XPKCFQZDQGVJCX-RHYQMDGZSA-N 0.000 description 1
- PHZGFLFMGLXCFG-FHWLQOOXSA-N Val-Lys-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N PHZGFLFMGLXCFG-FHWLQOOXSA-N 0.000 description 1
- BCBFMJYTNKDALA-UFYCRDLUSA-N Val-Phe-Phe Chemical compound N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O BCBFMJYTNKDALA-UFYCRDLUSA-N 0.000 description 1
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 1
- RYQUMYBMOJYYDK-NHCYSSNCSA-N Val-Pro-Glu Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RYQUMYBMOJYYDK-NHCYSSNCSA-N 0.000 description 1
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 1
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 1
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 1
- LTTQCQRTSHJPPL-ZKWXMUAHSA-N Val-Ser-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N LTTQCQRTSHJPPL-ZKWXMUAHSA-N 0.000 description 1
- JQTYTBPCSOAZHI-FXQIFTODSA-N Val-Ser-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N JQTYTBPCSOAZHI-FXQIFTODSA-N 0.000 description 1
- RYHUIHUOYRNNIE-NRPADANISA-N Val-Ser-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RYHUIHUOYRNNIE-NRPADANISA-N 0.000 description 1
- VIKZGAUAKQZDOF-NRPADANISA-N Val-Ser-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O VIKZGAUAKQZDOF-NRPADANISA-N 0.000 description 1
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- MNSSBIHFEUUXNW-RCWTZXSCSA-N Val-Thr-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N MNSSBIHFEUUXNW-RCWTZXSCSA-N 0.000 description 1
- DLRZGNXCXUGIDG-KKHAAJSZSA-N Val-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O DLRZGNXCXUGIDG-KKHAAJSZSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- PDDJTOSAVNRJRH-UNQGMJICSA-N Val-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](C(C)C)N)O PDDJTOSAVNRJRH-UNQGMJICSA-N 0.000 description 1
- SUGRIIAOLCDLBD-ZOBUZTSGSA-N Val-Trp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SUGRIIAOLCDLBD-ZOBUZTSGSA-N 0.000 description 1
- RLVTVHSDKHBFQP-ULQDDVLXSA-N Val-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 RLVTVHSDKHBFQP-ULQDDVLXSA-N 0.000 description 1
- ZHWZDZFWBXWPDW-GUBZILKMSA-N Val-Val-Cys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(O)=O ZHWZDZFWBXWPDW-GUBZILKMSA-N 0.000 description 1
- WBPFYNYTYASCQP-CYDGBPFRSA-N Val-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N WBPFYNYTYASCQP-CYDGBPFRSA-N 0.000 description 1
- JVGDAEKKZKKZFO-RCWTZXSCSA-N Val-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N)O JVGDAEKKZKKZFO-RCWTZXSCSA-N 0.000 description 1
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 201000004525 Zellweger Syndrome Diseases 0.000 description 1
- 208000036813 Zellweger spectrum disease Diseases 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 102000019997 adhesion receptor Human genes 0.000 description 1
- 108010013985 adhesion receptor Proteins 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 238000001261 affinity purification Methods 0.000 description 1
- 238000007818 agglutination assay Methods 0.000 description 1
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 210000001132 alveolar macrophage Anatomy 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 238000003277 amino acid sequence analysis Methods 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 230000003141 anti-fusion Effects 0.000 description 1
- 230000002788 anti-peptide Effects 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 230000001640 apoptogenic effect Effects 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 1
- 108010086780 arginyl-glycyl-aspartyl-alanine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 210000004507 artificial chromosome Anatomy 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- 230000003190 augmentative effect Effects 0.000 description 1
- 238000011888 autopsy Methods 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- 208000022185 autosomal dominant polycystic kidney disease Diseases 0.000 description 1
- 230000003376 axonal effect Effects 0.000 description 1
- 230000003542 behavioural effect Effects 0.000 description 1
- 108010066270 beta-lactorphin Proteins 0.000 description 1
- 230000001588 bifunctional effect Effects 0.000 description 1
- 230000002146 bilateral effect Effects 0.000 description 1
- 238000001574 biopsy Methods 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 101150013659 ccnf gene Proteins 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000012292 cell migration Effects 0.000 description 1
- 230000004700 cellular uptake Effects 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 125000003636 chemical group Chemical group 0.000 description 1
- 230000003609 chemorepellent Effects 0.000 description 1
- 230000000663 chemotropic effect Effects 0.000 description 1
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 238000010835 comparative analysis Methods 0.000 description 1
- 230000001268 conjugating effect Effects 0.000 description 1
- 239000000356 contaminant Substances 0.000 description 1
- 230000001186 cumulative effect Effects 0.000 description 1
- ATDGTVJJHBUTRL-UHFFFAOYSA-N cyanogen bromide Chemical compound BrC#N ATDGTVJJHBUTRL-UHFFFAOYSA-N 0.000 description 1
- 230000001351 cycling effect Effects 0.000 description 1
- 108010069495 cysteinyltyrosine Proteins 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- 230000003412 degenerative effect Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 108010033011 des-Arg- enterostatin Proteins 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000002405 diagnostic procedure Methods 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 239000000975 dye Substances 0.000 description 1
- 239000012636 effector Substances 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 230000010502 episomal replication Effects 0.000 description 1
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 1
- 210000002744 extracellular matrix Anatomy 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000001605 fetal effect Effects 0.000 description 1
- 238000000684 flow cytometry Methods 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 238000002523 gelfiltration Methods 0.000 description 1
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 108010084264 glycyl-glycyl-cysteine Proteins 0.000 description 1
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 1
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 1
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010079413 glycyl-prolyl-glutamic acid Proteins 0.000 description 1
- 210000000020 growth cone Anatomy 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 210000005003 heart tissue Anatomy 0.000 description 1
- 229910001385 heavy metal Inorganic materials 0.000 description 1
- 235000014304 histidine Nutrition 0.000 description 1
- 150000002411 histidines Chemical group 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 108010028295 histidylhistidine Proteins 0.000 description 1
- 108010092114 histidylphenylalanine Proteins 0.000 description 1
- 230000001744 histochemical effect Effects 0.000 description 1
- 210000004408 hybridoma Anatomy 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 238000002169 hydrotherapy Methods 0.000 description 1
- 239000012216 imaging agent Substances 0.000 description 1
- 230000001900 immune effect Effects 0.000 description 1
- 230000028993 immune response Effects 0.000 description 1
- 230000002163 immunogen Effects 0.000 description 1
- 238000011532 immunohistochemical staining Methods 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000002347 injection Methods 0.000 description 1
- 239000007924 injection Substances 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000004255 ion exchange chromatography Methods 0.000 description 1
- 108010031424 isoleucyl-prolyl-proline Proteins 0.000 description 1
- 229960000318 kanamycin Drugs 0.000 description 1
- 229930027917 kanamycin Natural products 0.000 description 1
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 1
- 229930182823 kanamycin A Natural products 0.000 description 1
- 108010053037 kyotorphin Proteins 0.000 description 1
- 210000001069 large ribosome subunit Anatomy 0.000 description 1
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 1
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 1
- 210000000265 leukocyte Anatomy 0.000 description 1
- 210000005228 liver tissue Anatomy 0.000 description 1
- 208000004731 long QT syndrome Diseases 0.000 description 1
- 210000005265 lung cell Anatomy 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 210000002540 macrophage Anatomy 0.000 description 1
- 229910001629 magnesium chloride Inorganic materials 0.000 description 1
- 238000002595 magnetic resonance imaging Methods 0.000 description 1
- 210000001161 mammalian embryo Anatomy 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 108010063431 methionyl-aspartyl-glycine Proteins 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 1
- YACKEPLHDIMKIO-UHFFFAOYSA-N methylphosphonic acid Chemical class CP(O)(O)=O YACKEPLHDIMKIO-UHFFFAOYSA-N 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000009456 molecular mechanism Effects 0.000 description 1
- 239000002808 molecular sieve Substances 0.000 description 1
- 210000002161 motor neuron Anatomy 0.000 description 1
- 230000036457 multidrug resistance Effects 0.000 description 1
- 238000002887 multiple sequence alignment Methods 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 210000000066 myeloid cell Anatomy 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 239000006199 nebulizer Substances 0.000 description 1
- 229960004927 neomycin Drugs 0.000 description 1
- 208000002761 neurofibromatosis 2 Diseases 0.000 description 1
- 208000022032 neurofibromatosis type 2 Diseases 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 210000004940 nucleus Anatomy 0.000 description 1
- 230000030648 nucleus localization Effects 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 210000000287 oocyte Anatomy 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 108010084525 phenylalanyl-phenylalanyl-glycine Proteins 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010073101 phenylalanylleucine Proteins 0.000 description 1
- PTMHPRAIXMAOOB-UHFFFAOYSA-N phosphoramidic acid Chemical compound NP(O)(O)=O PTMHPRAIXMAOOB-UHFFFAOYSA-N 0.000 description 1
- 238000005375 photometry Methods 0.000 description 1
- 238000002600 positron emission tomography Methods 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 230000037452 priming Effects 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 108010020755 prolyl-glycyl-glycine Proteins 0.000 description 1
- 108010025826 prolyl-leucyl-arginine Proteins 0.000 description 1
- 108010093296 prolyl-prolyl-alanine Proteins 0.000 description 1
- 108010079317 prolyl-tyrosine Proteins 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 108060006633 protein kinase Proteins 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 230000006337 proteolytic cleavage Effects 0.000 description 1
- 238000000163 radioactive labelling Methods 0.000 description 1
- 230000009257 reactivity Effects 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000002829 reductive effect Effects 0.000 description 1
- 230000000284 resting effect Effects 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 238000007423 screening assay Methods 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 102000035025 signaling receptors Human genes 0.000 description 1
- 108091005475 signaling receptors Proteins 0.000 description 1
- 238000002603 single-photon emission computed tomography Methods 0.000 description 1
- 102000030938 small GTPase Human genes 0.000 description 1
- 108060007624 small GTPase Proteins 0.000 description 1
- URGAHOPLAPQHLN-UHFFFAOYSA-N sodium aluminosilicate Chemical compound [Na+].[Al+3].[O-][Si]([O-])=O.[O-][Si]([O-])=O URGAHOPLAPQHLN-UHFFFAOYSA-N 0.000 description 1
- 229910001415 sodium ion Inorganic materials 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 210000004988 splenocyte Anatomy 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 230000004960 subcellular localization Effects 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 210000001550 testis Anatomy 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 108010071097 threonyl-lysyl-proline Proteins 0.000 description 1
- 108010001055 thymocartin Proteins 0.000 description 1
- 238000012876 topography Methods 0.000 description 1
- 238000001890 transfection Methods 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 238000011820 transgenic animal model Methods 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- 108010012567 tyrosyl-glycyl-glycyl-phenylalanyl Proteins 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 210000004291 uterus Anatomy 0.000 description 1
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 1
- 210000001177 vas deferen Anatomy 0.000 description 1
- 239000003981 vehicle Substances 0.000 description 1
- 230000028973 vesicle-mediated transport Effects 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 239000000080 wetting agent Substances 0.000 description 1
- 210000002268 wool Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/475—Growth factors; Growth regulators
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/05—Animals comprising random inserted nucleic acids (transgenic)
Landscapes
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Gastroenterology & Hepatology (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Zoology (AREA)
- Genetics & Genomics (AREA)
- Medicinal Chemistry (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Toxicology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
- Saccharide Compounds (AREA)
Abstract
In accordance with the present invention, there are provided isolated nucleic acids encoding a human netrin, a human ATP binding cassette transporter, a human ribosomal L3 subtype, and a human augmenter of liver regeneration as well as isolated protein products encoded thereby. The present invention provides nucleic acid probes that hybridize to invention nucleic acids as well as isolated nucleic acids comprising unique gene sequences located on chromosome 16. Further provided are vectors containing invention nucleic acids, host cells transformed therewith, as well as transgenic non-human mammals that express invention polypeptides. The present invention includes antisense oligonucleotides, antibodies and compositions containing same. Additionally, the invention provides methods for identifying compounds that bind to invention polypeptides.
Description
NOVEL HUMAN CHROMOSOME 16 GENES, COMPOSITIONS, METHODS OF MAKING AND USING SAME
BACKGROUND OF THE INVENTION
The assembly of contiguous cloned genomic reagents is a necessary step in the process of disease-gene identification using a positional cloning approach. The rapid development of high density genetic maps based on polymorphic simple sequence repeats has facilitated contig assembly using sequence tagged site (STS) content mapping.
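As an illustration only, STS content mapping can be sketched as follows: clones that share at least one STS marker are inferred to overlap, and sets of clones linked in this way form a contig. The clone names, STS names and the function assemble_contigs are hypothetical and are not taken from the cited references; the sketch ignores marker order and false-positive hits.

```python
from collections import defaultdict


def assemble_contigs(clone_sts):
    """Group clones into contigs; clones sharing an STS are treated as overlapping."""
    # Union-find structure over clone names.
    parent = {clone: clone for clone in clone_sts}

    def find(clone):
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]  # path compression
            clone = parent[clone]
        return clone

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Invert the map: which clones contain each STS?
    sts_to_clones = defaultdict(list)
    for clone, markers in clone_sts.items():
        for sts in markers:
            sts_to_clones[sts].append(clone)

    # Every clone carrying the same STS belongs to the same contig.
    for clones in sts_to_clones.values():
        for other in clones[1:]:
            union(clones[0], other)

    groups = defaultdict(set)
    for clone in clone_sts:
        groups[find(clone)].add(clone)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical STS content for four clones.
example_clone_sts = {
    "cloneA": {"sts1", "sts2"},
    "cloneB": {"sts2", "sts3"},
    "cloneC": {"sts3"},
    "cloneD": {"sts9"},
}
example_contigs = assemble_contigs(example_clone_sts)
```

In this toy data set, cloneA, cloneB and cloneC are chained through sts2 and sts3 into one contig, while cloneD remains a singleton.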
Most contig construction efforts have relied on yeast artificial chromosomes (YACs), since their large insert size uses the current STS map density more advantageously than bacterial-hosted systems. This approach has been validated for multiple human chromosomes with YAC coverage ranging from 65-95% for many chromosomes and contigs of 11 to 36 Mb being described (Chumakov et al., Nature 377 (Supp.):175-297, 1995; Doggett et al., Nature 377 (Supp.):335-365, 1995b; Gemmill et al., Nature 377 (Supp.):299-319, 1995; Krauter et al., Nature 377 (Supp.):321-333, 1995; Shimizu et al., Cytogenet. Cell Genet. 70:147-182, 1995; van-Heyningen et al., Cytogenet.
Cell Genet. 69:127-158, 1995).
Despite numerous successes, the YAC cloning system is not a panacea for cloning the entire genome of complex organisms due to intrinsic limitations that result in substantial proportions of chimeric clones (Green et al., Genomics 11:658-669, 1991; Bellanne-Chantelot et al., Cell 70:1059-1068, 1992; Nagaraja et al., Nuc. Acids Res.
22:3406-3411, 1994), as well as clones that are rearranged, deleted or unstable (Neil et al., Nuc. Acids Res. 18:1421-1428, 1990; Wada et al., Am. J. Hum. Genet. 46:95-106, 1990; Zuo et al., Hum. Mol. Genet. 1:149-159, 1992;
Szepetowski et al., Cytogenet. Cell Genet. 69:101-107, 1995). At least some of these cloned artifacts are a product of the recombinational machinery of yeast acting on the various types of repetitive elements in mammalian DNA
(Neil et al., supra. 1990; Green et al., supra. 1991;
Schlessinger et al., Genomics 11:783-793, 1991; Ling et al., Nuc. Acids Res. 21:6045-6046, 1993; Kouprina et al., Genomics 21:7-17, 1994; Larionov et al., Nuc. Acids Res.
22:4154-4162, 1994).
Accordingly, alternative cloning systems must be used in concert with YAC-based approaches to complement localized YAC cloning deficiencies, to enhance the resolution of the physical map, and to provide a sequence-ready resource for genome-wide DNA sequencing.
Several exon trapping methodologies and vectors have been described for the rapid and efficient isolation of coding regions from genomic DNA (Auch et al., Nuc. Acids Res.
18:6743-6744, 1990; Duyk et al., Proc. Natl. Acad. Sci., USA 87:8995-8999, 1990; Buckler et al., Proc. Natl. Acad.
Sci., USA 88:4005-4009, 1991; Church et al., Nature Genet.
6:98-105, 1994). The major advantage of exon trapping is that the expression of cloned genomic DNAs (cosmid, P1 or YAC) is driven by a heterologous promoter in tissue culture cells. This allows for coding sequences to be identified without prior knowledge of their tissue distribution or developmental stage of expression. A second advantage of exon trapping is that it allows for the identification of coding sequences from only the cloned template of interest, which eliminates the risk of characterizing highly conserved transcripts from duplicated loci. This is not the case for either cDNA selection or direct library screening.
Exon trapping has been used successfully to identify transcribed sequences in the Huntington's disease locus (Ambrose et al., Hum. Mol. Genet. 1:697-703, 1992;
Taylor et al., Nature Genet. 2:223-227, 1992; Duyao et al., Hum. Mol. Genet. 2:673-676, 1993) and BRCA1 locus (Brody et al., Genomics 25:238-247, 1995; Brown et al., Proc. Natl.
Acad. Sci., USA 92:4362-4366, 1995). In addition, a number of disease-causing genes have been identified using exon trapping, including the genes for Huntington's disease (The Huntington's Disease Collaborative Research Group, Cell 72:971-983, 1993), neurofibromatosis type 2 (Trofatter et al., Cell 72:791-800, 1993), Menkes disease (Vulpe et al., Nature Genet. 3:7-13, 1993), Batten Disease (The International Batten Disease Consortium, Cell 82:949-957, 1995), and the gene responsible for the majority of Long-QT
syndrome cases (Wang et al., Nature Genet. 12:17-23, 1996).
A 700 kb CpG-rich region in band 16p13.3 has been shown to contain the disease gene for 90% of the cases of autosomal dominant polycystic kidney disease (PKD1) (Germino et al., Genomics 13:144-151, 1992; Somlo et al., Genomics 13:152-158, 1992; The European Polycystic Kidney Disease Consortium, Cell 77:881-894, 1994) as well as the tuberin gene (TSC2), responsible for one form of tuberous sclerosis (The European Chromosome 16 Tuberous Sclerosis Consortium, Cell 75:1305-1315, 1993). An estimated 20 genes are present in this region of chromosome 16 (Germino et al., Kidney Int. Supp. 39:520-525, 1993). Characterization of the region surrounding the PKD1 gene in 16p13.3, however, has been complicated by duplication of a portion of the genomic interval more proximally at 16p13.1 (The European Polycystic Kidney Disease Consortium, supra. 1994).
This chromosomal segment serves as a challenging test for large-insert cloning systems in E. coli and yeast since it resides in a GC-rich isochore (Saccone et al., Proc. Natl. Acad. Sci., USA 89:4913-4917, 1992) with an abundance of CpG islands (Harris et al., Genomics 7:195-206, 1990; Germino et al., supra. 1992), genes (Germino et al., supra. 1993) and Alu repetitive sequences (Korenberg et al., Cell 53:391-400, 1988). Chromosome 16 also contains more low-copy repeats than other chromosomes with almost 25% of its cosmid contigs hybridizing to more than one chromosomal location when analyzed by fluorescence in situ hybridization (FISH) (Okumura et al., Cytogenet. Cell Genet. 67:61-67, 1994). These types of repeats and sequence duplications interfere with "chromosome walking"
techniques that are widely used for identification of genomic DNA and pose a challenge to hybridization-based methods of contig construction. This is because these techniques rely on hybridization to identify clones containing overlapping fragments of genomic DNA; thus, there is a high likelihood of "walking" into clones derived from homologues instead of clones derived from the authentic gene. In a similar manner, the sequence duplications and chromosome 16-specific repeats also interfere with the unambiguous determination of a complete cDNA sequence that encodes the corresponding protein.
Furthermore, low copy repeats may lead to instability of this interval in bacteria, yeast and higher eukaryotes.
Thus, there is a need in the art for methods and compositions which enable accurate identification of genomic and cDNA sequences corresponding to authentic genes present on highly repetitive portions of chromosome 16, as well as genes similarly situated on other chromosomes. The present invention satisfies this need and provides related advantages as well.
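The difficulty that duplicated sequence poses for chromosome walking can be illustrated with a minimal sketch. It assumes, purely for illustration, that hybridization is modeled as an exact substring match and that each clone is labeled with its locus of origin; the clone names, toy sequences and the function walk_step are hypothetical. A walking step is flagged as ambiguous when the probe detects clones from more than one locus.

```python
def walk_step(probe, library):
    """Return (hits, ambiguous) for one walking step.

    library maps clone name -> (locus, sequence). A clone is a hit when its
    sequence contains the probe (exact match stands in for hybridization).
    The step is ambiguous when hits derive from more than one locus.
    """
    hits = sorted(
        (name, locus)
        for name, (locus, sequence) in library.items()
        if probe in sequence
    )
    loci = {locus for _, locus in hits}
    return hits, len(loci) > 1


# Hypothetical library: clone2 carries a duplicated copy of part of clone1.
example_library = {
    "clone1": ("16p13.3", "ACGTTGCAGGCTA"),
    "clone2": ("16p13.1", "TTGCAGGCTAAC"),
    "clone3": ("16p13.3", "GGGATCCATAT"),
}

# Probe from the duplicated segment: detects clones from both loci.
repeat_hits, repeat_ambiguous = walk_step("TTGCAGG", example_library)

# Probe from unique sequence: detects only the authentic locus.
unique_hits, unique_ambiguous = walk_step("GATCCAT", example_library)
```

With the probe from the duplicated segment, the walk cannot distinguish the authentic 16p13.3 clone from its 16p13.1 homologue; the unique probe identifies a single locus.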
SUMMARY OF THE INVENTION
In accordance with the present invention, there are provided isolated nucleic acids encoding a human netrin, a human ATP binding cassette transporter, a human ribosomal L3 subtype, and a human augmenter of liver regeneration.
The present invention further provides isolated protein products encoded by a human netrin gene, a human ATP binding cassette transporter gene, a human ribosomal L3 gene, and a human augmenter of liver regeneration gene.
Additionally, the present invention provides nucleic acid probes that hybridize to invention nucleic acids as well as isolated nucleic acids comprising unique gene sequences located on chromosome 16.
Further provided are vectors containing invention nucleic acids as well as host cells transformed with invention vectors.
Transgenic non-human mammals that express invention polypeptides are provided by the present invention.
The present invention includes antisense oligonucleotides, antibodies and compositions containing same.
Additionally, the invention provides methods for identifying compounds that bind to invention polypeptides.
Such compounds are useful for modulating the activity of invention polypeptides.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows a schematic diagram of the P1 contig and trapped exons.
Figures 2A and 2B show an alignment of selected exon traps with sequences in the databases.
Figures 3A through 3C show 6803 bp of hNET genomic sequence from P1 clone 53.8B (SEQ ID NO:19).
Figures 4A and 4B show 1743 bp of hNET cDNA and deduced amino acid sequence coding for a human homologue of chicken netrin genes (SEQ ID NOs:20 and 21).
Figures 4C and 4D show the nucleotide sequence of the 1.9 kb hNET cDNA including both 5' and 3' UTRs (SEQ ID NO:78).
Figure 5 shows an amino acid comparison between chicken netrin-1 (SEQ ID NO:22), chicken netrin-2 (SEQ ID NO:23) and hNET (SEQ ID NO:21). Shaded boxes denote regions of sequence identity. The laminin domains V and VI and the C-terminal domain (C) are indicated by arrows with domain V divided into three sub-components (V-1 to V-3). The asterisks identify a motif for adhesion/signaling receptors.
Figure 6 shows a graphical representation of the homology between domains of chicken netrin-1, chicken netrin-2 and hNET.
Figure 7 shows exon traps, RT-PCR products and cDNA from the ABCgt.1 clone. Exon traps are shown above.
ABCgt.1 DNA is shown below the exon traps with the positions of the Genetrapper selection (S) and repair (R) oligonucleotides indicated. The positions of the RT-PCR clones are shown below the cDNA.
Figures 8A-8G show 5.8 kb of cDNA and deduced amino acid sequence encoding ABCgt.1 clone (SEQ ID NOs:24 and 25).
Figures 9A-9D show an amino acid alignment of murine ABC1 (SEQ ID NO:26) and ABC2 (SEQ ID NO:27) with clone ABCgt.1 (SEQ ID NO:25). Hyphens denote gaps; asterisks denote identical residues, while periods denote conservative substitutions. The location of the ATP binding cassettes is shown by the boxed regions. Numbers at the right show the relative position of the proteins.
Figure 10 shows the region of the transcriptional map of the PKD1 locus from which P1 clones 49.10D, 109.8C
and 47.2H were isolated. The open boxes represent trapped exons with their relative position indicated below the RPL3L (SEM L3) gene. c, r and h identify the location of the capture, repair and hybridization oligonucleotides, respectively.
Figures 11A-11B show the nucleotide and deduced amino acid sequence of the SEM L3 cDNA, now designated RPL3L (SEQ ID NOs:28 and 29). The 5' upstream in-frame stop codon is underlined and the arrows indicate the site of the polyA tract of the two shorter cDNA clones that were also isolated.
Figure 12 shows a comparison of the deduced amino acid sequences from human (SEQ ID NO:30), bovine (SEQ ID NO:31), murine (SEQ ID NO:32) and the RPL3L (SEM L3) (SEQ ID NO:29) genes. Dashes indicate sequence identity to the human L3 gene. The nuclear targeting sequence at the N-terminal end is shaded and the bipartite motif is boxed.
Figure 13 shows the nucleotide and deduced amino acid sequence of the hALR cDNA (SEQ ID NOs:33 and 34).
Figure 14 shows a comparison of the deduced amino acid sequences of rat ALR and human ALR (SEQ ID NOs:35 and 34, respectively).
Figures 15A-15J show the nucleotide and deduced amino acid sequence of full-length hABC3 cDNA (SEQ ID
NOs:74 and 75).
Figure 16 shows a physical map of the region containing the hABC3 gene.
Figure 17A shows the deduced amino acid sequence for hABC3 (SEQ ID NO:75) aligned to the murine ABC1 (SEQ ID NO:26) and ABC2 (SEQ ID NO:27) sequences (Luciani et al., Genomics 21:150-159, 1994) and sequence predicted to be encoded by C. elegans cosmid C.48B4.4 (SEQ ID NO:77) (Wilson et al., Nature 368:32-38, 1994). Sequence identity is shown by letters, with mismatches denoted as periods. Gaps inserted during the alignment are also shown (_). For ABC1, ABC2 and C.48B4.4, only those sequences included in, and C-terminal to, the first ATP-binding domain are shown. Boxes denote the ATP binding cassettes (I and III) and the HH1 domain (II).
Figure 17B shows a schematic diagram of the ABC3 protein showing the transmembrane (TM) domains, ATP binding cassette (ABC) domains, Linker and HH1 domains.
Figure 18 shows a map of the genomic interval surrounding the human netrin gene.
Figure 19A shows a GRAIL2 analysis of coding sequences in the 6.8 kb genomic sequence from 53.8B P1.
Figure 19B shows the results of a Pustell DNA/protein matrix comparing genomic sequence to chicken netrin-2.
Figure 20A shows alignment of the human netrin with chicken netrin-1, chicken netrin-2 and UNC-6 (SEQ ID NO:79).
Figure 20B shows a schematic of the genomic sequence with boxes representing exons and lines denoting the introns. Untranslated region is shown in black, with the location of the start codon indicated by the arrow.
The domain structure of the human netrin protein is shown below the gene structure. The position of introns in the Drosophila netrin genes is shown by arrows, with the non-conserved intron being denoted by the open arrow.
DETAILED DESCRIPTION OF THE INVENTION
All patent applications, patents, and literature references cited in this specification are hereby incorporated by reference in their entirety. In case of conflict or inconsistency, the present description, including definitions, will control.
Definitions:
1. "complementary DNA (cDNA)" is defined herein as a single-stranded or double-stranded intronless DNA molecule that is derived from the authentic gene and whose sequence, or complement thereof, encodes a protein.
2. As referred to herein, a "contig" is a continuous stretch of DNA or DNA sequence, which may be represented by multiple, overlapping, clones or sequences.
3. As referred to herein, a "cosmid" is a DNA
plasmid that can replicate in bacterial cells and that accommodates large DNA inserts from about 30 to about 51 kb in length.
4. The term "P1 clones" refers to genomic DNAs cloned into vectors based on the P1 phage replication mechanisms. These vectors generally accommodate inserts of about 70 to about 105 kb (Pierce et al., Proc. Natl. Acad.
Sci., USA, 89:2056-2060, 1992).
In accordance with the present invention, there are provided isolated nucleic acids encoding a human netrin, a human ATP binding cassette transporter, a human ribosomal L3 subtype, and a human augmenter of liver regeneration.
The present invention further provides isolated protein products encoded by a human netrin gene, a human ATP binding cassette transporter gene, a human ribosomal L3 gene, and a human augmenter of liver regeneration gene.
Additionally, the present invention provides nucleic acid probes that hybridize to invention nucleic acids as well as isolated nucleic acids comprising unique gene sequences located on chromosome 16.
Further provided are vectors containing invention nucleic acids as well as host cells transformed with invention vectors.
Transgenic non-human mammals that express invention polypeptides are provided by the present invention.
The present invention includes antisense oligonucleotides, antibodies and compositions containing same.
Additionally, the invention provides methods for identifying compounds that bind to invention polypeptides.
Such compounds are useful for modulating the activity of invention polypeptides.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows a schematic diagram of the P1 contig and trapped exons.
Figures 2A and 2B show an alignment of selected exon traps with sequences in the databases.
Figures 3A through 3C show 6803 bp of hNET genomic sequence from P1 clone 53.8B (SEQ ID NO:19).
Figures 4A and 4B show 1743 bp of hNET cDNA and deduced amino acid sequence coding for a human homologue of chicken netrin genes (SEQ ID NOs:20 and 21).
Figures 4C and 4D show the nucleotide sequence of the 1.9 kb hNET cDNA including both 5' and 3' UTRs (SEQ ID NO:78).
Figure 5 shows an amino acid comparison between chicken netrin-1 (SEQ ID NO:22), chicken netrin-2 (SEQ ID NO:23) and hNET (SEQ ID NO:21). Shaded boxes denote regions of sequence identity. The laminin domains V and VI and the C-terminal domain (C) are indicated by arrows with domain V divided into three sub-components (V-1 to V-3). The asterisks identify a motif for adhesion/signaling receptors.
Figure 6 shows a graphical representation of the homology between domains of chicken netrin-1, chicken netrin-2 and hNET.
Figure 7 shows exon traps, RT-PCR products and cDNA from the ABCgt.1 clone. Exon traps are shown above.
ABCgt.1 DNA is shown below the exon traps with the position of the Genetrapper selection (S) and repair (R) oligonucleotides indicated. The positions of the RT-PCR clones are shown below the cDNA.
Figures 8A-8G show 5.8 kb of cDNA and deduced amino acid sequence encoding the ABCgt.1 clone (SEQ ID NOs:24 and 25).
Figures 9A-9D show an amino acid alignment of murine ABC1 (SEQ ID NO:26) and ABC2 (SEQ ID NO:27) with clone ABCgt.1 (SEQ ID NO:25). Hyphens denote gaps;
asterisks denote identical residues, while periods denote conservative substitutions. The location of the ATP
binding cassettes is shown by the boxed regions. Numbers at the right show the relative position of the proteins.
Figure 10 shows the region of the transcriptional map of the PKD1 locus from which P1 clones 49.10D, 109.8C
and 47.2H were isolated. The open boxes represent trapped exons with their relative position indicated below the RPL3L (SEM L3) gene. c, r and h identify the location of the capture, repair and hybridization oligonucleotides, respectively.
Figures 11A-11B show the nucleotide and deduced amino acid sequence of the SEM L3 cDNA, now designated RPL3L (SEQ ID NOs:28 and 29). The 5' upstream in-frame stop codon is underlined and the arrows indicate the site of the polyA tract of the two shorter cDNA clones that were also isolated.
Figure 12 shows a comparison of the deduced amino acid sequences from human (SEQ ID NO:30), bovine (SEQ ID NO:31), murine (SEQ ID NO:32) and the RPL3L (SEM L3) (SEQ ID NO:29) genes. Dashes indicate sequence identity to the human L3 gene. The nuclear targeting sequence at the N-terminal end is shaded and the bipartite motif is boxed.
Figure 13 shows the nucleotide and deduced amino acid sequence of the hALR cDNA (SEQ ID NOs:33 and 34).
Figure 14 shows a comparison of the deduced amino acid sequences from rat ALR and human ALR (SEQ ID NOs:35 and 34, respectively).
Figures 15A-15J show the nucleotide and deduced amino acid sequence of full-length hABC3 cDNA (SEQ ID
NOs:74 and 75).
Figure 16 shows a physical map of the region containing the hABC3 gene.
Figure 17A shows the deduced amino acid sequence for hABC3 (SEQ ID NO:75) aligned to the murine ABC1 (SEQ ID NO:26) and ABC2 (SEQ ID NO:27) sequences (Luciani et al., Genomics 21:150-159, 1994) and sequence predicted to be encoded by C. elegans cosmid C.48B4.4 (SEQ ID NO:77) (Wilson et al., Nature 368:32-38, 1994). Sequence identity is shown by letters, with mismatches denoted as periods.
Gaps inserted during the alignment are also shown (_). For ABC1, ABC2 and C.48B4.4, only those sequences included in, and C-terminal to, the first ATP-binding domain are shown.
Boxes denote the ATP binding cassettes (I and III) and the HH1 domain (II).
Figure 17B shows a schematic diagram of the ABC3 protein showing the transmembrane (TM) domains, ATP binding cassette (ABC) domains, Linker and HH1 domains.
Figure 18 shows a map of the genomic interval surrounding the human netrin gene.
Figure 19A shows a GRAIL2 analysis of coding sequences in the 6.8 kb genomic sequence from 53.8B P1.
Figure 19B shows the results of a Pustell DNA/protein matrix comparing genomic sequence to chicken netrin-2.
Figure 20A shows alignment of the human netrin with chicken netrin-1, chicken netrin-2 and UNC-6 (SEQ ID NO:79).
Figure 20B shows a schematic of the genomic sequence with boxes representing exons and lines denoting the introns. Untranslated region is shown in black, with the location of the start codon indicated by the arrow.
The domain structure of the human netrin protein is shown below the gene structure. The position of introns in the Drosophila netrin genes is shown by arrows, with the non-conserved intron being denoted by the open arrow.
DETAILED DESCRIPTION OF THE INVENTION
All patent applications, patents, and literature references cited in this specification are hereby incorporated by reference in their entirety. In case of conflict or inconsistency, the present description, including definitions, will control.
Definitions:
1. "complementary DNA (cDNA)" is defined herein as a single-stranded or double-stranded intronless DNA molecule that is derived from the authentic gene and whose sequence, or complement thereof, encodes a protein.
2. As referred to herein, a "contig" is a continuous stretch of DNA or DNA sequence, which may be represented by multiple, overlapping, clones or sequences.
3. As referred to herein, a "cosmid" is a DNA
plasmid that can replicate in bacterial cells and that accommodates large DNA inserts from about 30 to about 51 kb in length.
4. The term "P1 clones" refers to genomic DNAs cloned into vectors based on the P1 phage replication mechanisms. These vectors generally accommodate inserts of about 70 to about 105 kb (Pierce et al., Proc. Natl. Acad.
Sci., USA, 89:2056-2060, 1992).
5. As used herein, the term "exon trapping"
refers to a method for isolating genomic DNA sequences that are flanked by donor and acceptor splice sites for RNA
processing.
6. "Amplification" of DNA as used herein denotes a reaction that serves to increase the concentration of a particular DNA sequence within a mixture of DNA sequences. Amplification may be carried out using polymerase chain reaction (PCR) (Saiki et al., Science, 239:487, 1988), ligase chain reaction (LCR), nucleic acid sequence-based amplification (NASBA), or any method known in the art.
7. "RT-PCR" as used herein refers to coupled reverse transcription and polymerase chain reaction. This method of amplification uses an initial step in which a specific oligonucleotide, oligo dT, or a mixture of random primers is used to prime reverse transcription of RNA into single-stranded cDNA; this cDNA is then amplified using standard amplification techniques, e.g., PCR.
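As an illustration of definition 2, the following minimal Python sketch merges overlapping clone intervals into contigs. The function name and the kilobase coordinates are hypothetical and are not taken from this specification; clones whose ends merely touch are treated here as joined, which is a simplifying assumption.

```python
def merge_into_contigs(clones):
    """Merge overlapping (start, end) clone intervals into contigs.

    `clones` is an iterable of (start, end) map coordinates in kb,
    with end > start. Returns a sorted list of merged intervals.
    """
    contigs = []
    for start, end in sorted(clones):
        if contigs and start <= contigs[-1][1]:
            # Clone overlaps (or abuts) the current contig: extend it.
            contigs[-1] = (contigs[-1][0], max(contigs[-1][1], end))
        else:
            # No overlap with the previous contig: start a new one.
            contigs.append((start, end))
    return contigs


# Hypothetical coordinates: two overlapping clones and one isolated clone.
print(merge_into_contigs([(0, 80), (60, 150), (300, 380)]))
```

A single merged interval in the output corresponds to a continuous stretch of DNA represented by multiple overlapping clones.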
A P1 contig containing approximately 700 kb of DNA surrounding the PKD1 and TSC2 genes was assembled from a set of 12 unique chromosome 16-derived P1 clones obtained by screening a 3-genome-equivalent P1 library (Shepherd et al., Proc. Natl. Acad. Sci., USA 91:2629-2633, 1994) with 15 distinct probes. Exon trapping was used to identify transcribed sequences from this region in 16p13.3.
Ninety-six novel exon traps have been obtained containing sequences from a minimum of eighteen genes in this interval. The eighteen identified genes include five previously reported genes from the interval and a previously characterized gene whose location was unknown (Table I). Additional exon traps have been mapped to genes based on their presence in cDNAs, RT-PCR products, or their hybridization to distinct mRNA species on Northern blots.
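For context on the 3 genome equivalent library mentioned above, the following sketch estimates the chance that a given locus is represented at least once. It assumes clones are drawn at random and uniformly from the genome (a Poisson approximation); this assumption and the function name are illustrative and are not stated in this specification.

```python
import math


def locus_representation_probability(genome_equivalents):
    """Probability that a locus appears in at least one clone.

    Assumes random, uniform sampling of the genome, so the number of
    clones covering a locus is approximately Poisson distributed with
    mean equal to the library depth in genome equivalents.
    """
    if genome_equivalents < 0:
        raise ValueError("genome_equivalents must be non-negative")
    return 1.0 - math.exp(-genome_equivalents)


# Library depth of 3 genome equivalents, as described in the text.
print(round(locus_representation_probability(3.0), 3))
```

Under these assumptions a library of this depth would be expected to miss roughly one locus in twenty, which is one reason multiple probes are typically used when assembling a contig.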
[Table I: genes identified in the interval by exon trapping. The table body is not legible in this copy.]
Exon trapping was performed using an improved trapping vector (Burn et al., Gene 161:183-187, 1995), with the resulting exon traps being characterized by DNA
sequence analysis. In order to determine the relative efficiency of the exon trapping procedure, exon traps were compared to the cDNA sequences for those genes known to be in the interval around the PKD1 gene (Figure 1). Single exon traps were obtained from the human homologue of the ERV1 (Lisowsky et al., Genomics 29:690-697, 1995) and the ATP6C proton pump genes (Gillespie et al., Proc. Natl.
Acad. Sci., USA 88:4289-4293, 1991). The horizontal line at the top of Figure 1 shows the position of relevant DNA
markers with the scale (in kilobases). The position of NotI sites is shown below the horizontal line. The position and orientation of the known genes are indicated by arrows with the number of exon traps obtained from each gene shown in parentheses. The positions of the transcription units described in this report (A through M) are shown below the known genes. The Genbank Accession numbers of corresponding exon traps are shown below each transcriptional unit. P1 clones are indicated by the overlapping lines with the name of the clone shown above the line. The positions of trapped exons which did not map to characterized transcripts are shown below the P1 contig.
Vertical lines denote the interval within the P1 clone(s) detected by the exon traps in hybridization studies.
In contrast, eight individual exon traps were isolated from the TSC2 gene and ten from the CCNF gene (The European Chromosome 16 Tuberous Sclerosis Consortium, supra. 1993; Kraus et al., Genomics 24:27-33, 1994).
Trapped sequences from three of the exons present in the PKD1 gene were obtained (The American PKD1 Consortium, Hum. Mol. Genet. 4:575-582, 1995; The International Polycystic Kidney Disease Consortium, Cell 81:289-298, 1995; Hughes et al., Nature Genet. 10:151-160, 1995). Sixteen additional exon traps from the 109.8C and 47.2H P1 clones were also obtained.
Sequences present in two exon traps (Genbank Accession Nos. L75926 and L75927), localizing to the region of overlap between the 96.4B and 64.12C P1 clones, were shown to contain sequences from the previously described human homologue to the murine RNPS1 gene (Genbank Accession No. L37368), encoding an S phase-prevalent DNA/RNA-binding protein (Schmidt et al., Biochim. Biophys. Acta 1216:317-320, 1993). A comparison of these exon traps to the dbEST
database indicated that they were also contained in cDNA
52161 from the I.M.A.G.E. Consortium (Lennon et al., Genomics 33:151-152, 1996). Based on these data, the hRNPS1 gene can be mapped to 16p13.3 near DNA marker D16S291 (transcript G in Figure 1).
Two exon traps from the 1.8F P1 clone were found to have a high level of homology to the previously described murine ~AP3 encoding a zinc finger-containing transcription factor (Fognani et al., EMBO J. 12:4985-4992, 1993). The m~AP3 protein is believed to function as a negative regulator for genes encoding proteins responsible for the inhibition of cell cycling (Fognani et al., supra.). The two exon traps were linked by PCR, with the resulting 1.2 kb PCR product being 85% identical at the nucleotide level to the murine ~AP3 cDNA. Hybridization of the ~AP3-like exon traps to the dot blotted P1 contig indicated that the gene lies in the non-overlapping region of the 1.8F P1, between the DNA markers KLH7 and GGG12 (transcript H in Figure 1).
Significant homology was also seen between two exon traps obtained from the 97.10G P1 and the rat Rab26 gene encoding a ras-related GTP-binding protein involved in the regulation of vesicular transport (Nuoffer et al., Ann. Rev. Biochem. 63:949-990, 1994; Wagner et al., Biochem. Biophys. Res. Comm. 207:950-956, 1995). The Rab26-like exon traps were linked by RT-PCR (transcript J in Figure 1) with the encoded sequences being 94% (83/88) identical at the protein level to Rab26. See, for example, Figure 2, which shows alignments of selected exon traps with sequences in the databases. Sequences encoded by exon trap L48741 (SEQ ID NO:1) are shown aligned to N-acetylglucosamine-6-phosphate deacetylase from C. elegans (SEQ ID NO:2), E. coli (SEQ ID NO:3) and Haemophilus (SEQ ID NO:4). The EGF repeats from netrin-1 (SEQ ID NO:7), netrin-2 (SEQ ID NO:6) and UNC-6 (SEQ ID NO:8) are shown aligned to one of the translated netrin-like exon traps (Genbank Accession No. L75917) (SEQ ID NO:5). An alignment of sequences from the second netrin-like exon trap (Genbank Accession No. L75916) (SEQ ID NO:9) with netrin-1 (SEQ ID NO:11) and netrin-2 (SEQ ID NO:10) is shown. The translated Rab26-like RT-PCR product (Genbank Accession Nos. L48770-L48771) (SEQ ID NO:12) is shown aligned to rat Rab26 (SEQ ID NO:13). Sequences encoded by exon trap L48792 (SEQ ID NO:14) are shown aligned to sequences from the pilB transcriptional repressor from Neisseria gonorrhoeae (SEQ ID NO:15), sequences predicted by computer analysis to be encoded by cosmid F44E2.6 from C. elegans (SEQ ID NO:17), the YCL33C gene product from yeast (Genbank Accession No. P25566) (SEQ ID NO:16), and a transcriptional repressor from Haemophilus (SEQ ID NO:18).
Periods denote positions where gaps were inserted in the protein sequence in order to maintain alignment.
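The percent identity figures quoted above, such as 94% (83/88) for the Rab26-like product, can be reproduced from a pairwise alignment with a calculation like the following sketch. Skipping columns that contain a gap character is one common convention and is an assumption here; the function name and the short test sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap_chars="-."):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must be aligned strings of equal length. Columns in
    which either sequence has a gap character are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a in gap_chars or b in gap_chars:
            continue  # gap column: not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 4 identical residues in 5 ungapped columns.
print(percent_identity("MKT-LV", "MRTALV"))
# Arithmetic check of the ratio quoted in the text.
print(round(100 * 83 / 88))
```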
In order to correlate exon traps with individual transcripts, cDNA library screening and PCR based approaches were used to clone transcribed sequences containing selected exon traps. RT-PCR was used to link individual exon traps together in cases where the two exon traps had homology to similar sequences in the databases.
In cases where only single exon traps were available, 3' RACE or cDNA library screening was used to obtain additional sequences. Sequences from the exon traps and cloned products were used to map the position, and when possible the orientation, of the corresponding transcription units.
Six unique exon traps, containing sequences from at least eight exons, were shown to be from a transcriptional unit in the centromeric-most P1 clone, 94.10H (transcript A in Figure 1). A 2 kb cDNA linking the six exon traps was isolated and shown to hybridize to an 8 kb transcript. Additional hybridization studies indicated that the gene was oriented centromeric to telomeric, with at least 6 kb of the transcript originating from sequences centromeric of the P1 contig. Extensive homology was observed between the translated cDNA and a variety of protein kinases; however, the presence of the conserved HRDLKPEN motif (SEQ ID NO:71) encoded in exon trap L48734, as well as the partial cDNA, suggests that it encodes a serine/threonine kinase (van der Geer et al., Ann. Rev. Cell Bio. 10:251-337, 1994).
cDNAs were isolated using sequences derived from a separate 94.10H exon trap (Genbank Accession No. L48738) and the position and orientation of the corresponding transcription unit were determined. Two cDNA species were obtained using exon trap L48738 as a probe, with the only homology between the two species arising from the 109 bases contained in the exon trap. Using oligonucleotide probes, the transcription unit was mapped to a position near the 26-6DIS DNA marker, in a telomeric to centromeric orientation; however, only one of the cDNA species mapped to the P1 contig (transcript B in Figure 1). Based on these data, it is likely that the second cDNA species originated from a region outside of the P1 contig, possibly from the duplicated 26-6PROX marker located further centromeric in 16p13.3 (Gillespie et al., Nuc. Acids Res.
18:7071-7075, 1990).
The 110.1F P1 clone contains at least two genes in addition to the ATP6C gene. Using BLASTX to search the protein databases, significant homology was observed between sequences encoded by exon trap L48741 and the N-acetylglucosamine-6-phosphate deacetylase (nagA) proteins from C. elegans (Wilson et al., supra. 1994), E. coli (Plumbridge, Mol. Microbiol. 3:505-515, 1989) and Haemophilus (Fleischmann et al., Science 269:496-512, 1995). An alignment of the nagA proteins to the translated exon trap revealed the presence of multiple conserved regions (Figure 2), suggesting that the exon trap contains sequences from the human nagA gene. Additional sequences from the nagA-like transcript have been cloned using 3' RACE and the transcription unit mapped to a region between NotI sites 2 and 3 in Figure 1. The gene is oriented telomeric to centromeric with NotI site 2 being present in the 3' UTR of the RACE clone (transcript C in Figure 1).
Two additional exon traps (Genbank Accession Nos.
L75916 and L75917), mapping to the region of overlap between the 110.1F and 53.8B P1 clones (transcript D in Figure 1), were shown to have homology with the chicken netrins (Kennedy et al., Cell 78:425-435, 1994; Serafini et al., Cell 78:409-424, 1994) and the C. elegans UNC-6 protein (Ishii et al., Neuron 9:873-881, 1992) (Figures 2 and 20A).
Sequences encoded by exon trap L75917 were shown to have significant homology with the C-terminal-most epidermal growth factor (EGF) repeat found in the netrin and UNC-6 proteins (Figures 2 and 20A). Exon trap L75917 encodes sequences which are 98% identical to sequences from the third EGF repeat of chicken netrin-2 and 90% identical to sequences from the same region of netrin-1. The netrin-like trap, L75916, encodes sequences from the more divergent C-terminal domain of the netrins which are 43% identical to sequences contained in the C-terminal domain of netrin-1 and netrin-2 (Figures 2 and 20A). This region is the least conserved between UNC-6 and the netrins, with sequences being 63% conserved between netrin-1 and netrin-2 and 29% conserved between netrin-2 and UNC-6 (Serafini et al., supra.).
The netrins define a family of chemotropic factors which have been shown to play a central role in axon guidance. Axonal growth cones are guided to their target by both local cues, present in the extracellular matrix or on the surface of cells, and long-range cues in the form of diffusible chemoattractants and chemorepellents (Goodman and Shatz, Cell 72:77-98, 1993; Keynes and Cook, Curr. Opin. Neurobiol. 5:75-82, 1995).
Chicken netrin-1 and netrin-2 have been shown to function as chemoattractants for developing spinal commissural axons (Serafini et al., Cell 78:409-424, 1994; Kennedy et al., Cell 78:425-435, 1994) with netrin-1 also acting as a chemorepellent for trochlear motor axons (Colamarino and Tessier-Lavigne, Cell 81:621-629, 1995).
Comparative analysis revealed the presence of extensive homology between the chicken netrins and the C. elegans UNC-6 protein, which is required for circumferential cell migration and axon guidance (Hedgecock et al., Neuron 4:61-85, 1990; Ishii et al., Neuron 9:873-881, 1992).
More recently, two Drosophila netrins, NETA and NETB, have been described and shown to be required for commissural axon guidance as well as for guidance of motor neurons to their target muscles (Harris et al., Neuron 17:217-228, 1996; Mitchell et al., Neuron 17:203-215, 1996). These studies indicate that the netrin family of chemoattractant and chemorepellent proteins is conserved between invertebrates and vertebrates.
The genomic interval containing the netrin-like exon traps was sequenced in order to obtain additional sequence information from the gene and to rule out the possibility that the exon traps were derived from a pseudogene. In preliminary studies using the 53.8B genomic P1 clone, the netrin-like exon traps were mapped to a 6 kb XhoI fragment. See, for example, Figure 18 wherein relevant DNA markers are shown on top of the horizontal line, with NotI sites (N) being shown below the line. The location and orientation of the ATP6C, CCNF, and nagA
transcriptional units have been previously described (Gillespie et al., Proc. Natl. Acad. Sci., USA 88:4289-4293, 1991; Kraus et al., Genomics 24:27-33, 1994; Burn et al., Genome Research 6:525-537, 1996) and are shown below the genomic interval. The two P1 clones containing the netrin gene are shown below the schematic diagram of the interval. The location of the 6.8 kb of genomic sequence is enlarged below the P1 clones. The position of the two exon traps in the 6.8 kb of genomic sequence is also indicated.
The 6 kb fragment, and the adjacent 3.5 kb XhoI
fragment, were subcloned and used to screen a random shotgun library from the 53.8B P1 clone. Subclones which were positive by hybridization were sequenced with forward and reverse vector primers. A total of 88 subclones were sequenced in this manner.
Additional sequence was obtained using internal primers as well as end sequence from the parental XhoI
fragments. A total of 6.8 kb of genomic sequence with an overall redundancy of 7-fold was sequenced. The GC-content for the sequenced region was found to be 68.9%, which is slightly higher than the 62.8% observed for the 53 kb of genomic sequence from the PKD1 gene, located 350 kb further telomeric (The American PKD1 Consortium, 1995, supra; Burn et al., 1996, supra).
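A GC-content value such as the one reported above is the fraction of G and C bases among the called bases of a sequence. The following is a minimal sketch of that calculation; ignoring ambiguous bases such as N is an assumption, and the function name and test sequence are hypothetical.

```python
def gc_content(seq):
    """Return GC-content as a percentage of unambiguous bases.

    Characters other than A, C, G and T (for example N) are ignored,
    so they affect neither the numerator nor the denominator.
    """
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in called)
    return 100.0 * gc / len(called)


# Hypothetical 6-base example: four of six bases are G or C.
print(round(gc_content("GGCCAT"), 1))
```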
Computer analyses were performed to identify putative exons. GRAIL2 analysis predicted six exons within the 6.8 kb of genomic sequence with database analysis indicating that all but one exon (exon 1) encoded sequences with homology to the chicken netrins. Figure 19A shows a GRAIL2 analysis of coding sequences in the 6.8 kb of genomic sequence from the 53.8B P1, with the gray scale denoting GC-content (white to light gray is GC rich and gray to black is AT rich), and vertical boxes indicating relative quality of the predicted exons. A graphical depiction of the predicted exons is shown above the vertical boxes with light colored boxes denoting exons with a score of "excellent" (>80% probability) and dark colored boxes denoting exons with a score of "good" (>60% probability). The positions of exon traps L75917 and L75916 (left to right, respectively) are shown above the GRAIL2 predicted exons. The structure of the gene based on comparison of the RT-PCR products and genomic sequence is shown at the top; the position of the exons in the genomic sequence is shown by the numbers above the exons. The 5' and 3' untranslated regions are also shown.
Additionally, the 6.8 kb of genomic sequence was compared to the protein sequences of the chicken netrins using a Pustell DNA/protein matrix. The genomic sequence (translated in all six frames) was compared to chicken netrin-2 in Figure 19B, using a PAM250 matrix with the minimum homology set at 50% and the window set at 20.
Regions of homology are shown by heavy diagonal lines.
Five exons were predicted by this analysis, with only the first GRAIL2 predicted exon not appearing to be bona fide.
Sequences from the two exon traps were also predicted by GRAIL2; however, there were noteworthy differences (cf Figure 19A). In predicting sequences present in exon trap L75917, GRAIL2 included an additional 55 bp at the 5' end of the exon. The first of the two exons present in exon trap L75916 was not predicted by GRAIL2, while GRAIL2 added additional bases to the 5' and 3' ends of the second exon present in this exon trap.
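The DNA/protein matrix comparison described above can be approximated by translating the genomic sequence in all six reading frames and scanning each translation against the protein with a fixed window. The sketch below is a simplification: it scores windows by exact identity rather than with a PAM250 matrix, and all function names are hypothetical. It is not the Pustell implementation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return translations for frames +1..+3 and -1..-3."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def window_hits(translation, protein, window=20, min_fraction=0.5):
    """Return (i, j) window start pairs whose identity meets the cutoff.

    Identity scoring is a stand-in for a PAM250-based score.
    """
    hits = []
    for i in range(len(translation) - window + 1):
        for j in range(len(protein) - window + 1):
            pairs = zip(translation[i:i + window], protein[j:j + window])
            matches = sum(a == b for a, b in pairs)
            if matches / window >= min_fraction:
                hits.append((i, j))
    return hits


print(six_frame_translations("ATGGCC"))
```

Runs of consecutive hits along a diagonal (i and j increasing together) correspond to the diagonal lines that mark regions of homology in a matrix plot.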
A search of the Expressed Sequence Tags (EST) database did not reveal the presence of any ESTs from the human netrin gene. Nor was the human netrin message detected by Northern and/or RNA dot blot analysis using mRNA from over fifty different adult and fetal tissues, suggesting that hNET has an extremely restricted pattern of expression and when expressed is present in low abundance.
Two murine ESTs, however, were identified from a brain library and a whole fetus library (Genbank Accession Nos. W59766 and AA048205, respectively) which have significant homology to hNET. The murine ESTs contain overlapping sequence with a total of 477 bp of contiguous sequence being represented. This 477 bp contiguous sequence aligns to the 5' end of the human netrin cDNA and includes 47 bp of 5' UTR and sequences encoding the N-terminal 143 amino acids. A comparison of the deduced human and murine protein sequence indicated that the two proteins were 89.5% (128/143) identical.
Characterization of the Human Netrin Transcript
In order to confirm the structure of the netrin gene, RT-PCR was performed using primers designed from the predicted exons. Since the predicted human netrin appeared to be slightly more homologous to netrin-2 than netrin-1 (57% versus 54%, respectively) and netrin-2 is expressed in the spinal cord of chicken, adult human spinal cord polyA+ RNA was utilized as a template. RT-PCR products were obtained with only a portion of the primer pairs; however, even this required the use of nested primers and two rounds of PCR, with low yields making it necessary to use hybridization and radiolabeled probes to visualize the products. The low yield, and lack of RT-PCR products in some cases, was attributed to the high GC-content of the products (70-80%). The addition of betaine to a final concentration of 2.5 M in the PCR reactions was found to dramatically improve yield and purity of the RT-PCR products (International Publication No. WO 96/12041; Reeves et al. (1994) Am. J. Hum. Genet. 55:A238; Baskaran et al. (1996) Genome Research 6:633-638).
Assembly of the RT-PCR products revealed a 1743 bp open reading frame (ORF) with an in-frame stop codon upstream of the proposed start methionine. In verifying the start and stop codons, a 209 bp 5' UTR and a 22 bp 3' UTR were cloned. Additional sequences from the respective UTRs were not cloned, however, since the goal of the RT-PCR experiments was only to confirm the predicted protein sequence and not to assemble a full-length cDNA. The position of the intron-exon boundaries was determined based on the comparison of the genomic sequence and the RT-PCR clones (Figure 19A).
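The relationship between ORF length and protein length, and the check for an in-frame stop codon upstream of the start methionine, can be sketched as follows. The sketch assumes that the 1743 bp figure includes the terminal stop codon, which would give 581 codons and a 580-residue protein; the function names and the short test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def upstream_inframe_stop(seq, start):
    """Index of the nearest in-frame stop codon 5' of the start codon.

    `start` is the 0-based index of the A of the ATG. Returns None if
    no in-frame stop codon is found in the upstream sequence.
    """
    seq = seq.upper()
    for pos in range(start - 3, -1, -3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


# Assumes the 1743 bp ORF count includes the stop codon.
print(protein_length_from_orf(1743))
# Hypothetical sequence with a TAA two codons upstream of the ATG.
print(upstream_inframe_stop("TAAGCCATGGCC", 6))
```

An upstream in-frame stop supports the assignment of the start methionine, because translation could not have begun further 5' in the same frame without encountering it.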
A 1.9 kb cDNA, hNET, was cloned by performing nested PCR using spinal cord cDNA as template and standard PCR conditions with the addition of betaine. The human netrin protein is predicted to be 580 amino acids in size, with the common domain structure of the netrin family being conserved. In Figure 20A positions where the chicken netrins and UNC-6 sequences match the human sequence are denoted by periods while gaps introduced during the alignment are shown by hyphens. Arrows above the sequence alignment show the boundaries of the laminin VI and V
domains, and C-terminal region (C) as described (Serafini et al., Cell 78: 409-424, 1994). The signal sequence (S) is also shown. V-1, V-2, and V-3 designate each of the EGF
domains that constitute domain V.
The hNET coding sequence and its predicted protein product are shown in Figures 4A and 4B. Figures 4C and 4D show full length hNET cDNA including both 5' and 3' UTR
sequence.
Several lines of evidence rule against the possibility that the human netrin gene described herein represents a pseudogene. First, none of the exons in the coding region contain stop codons. Second, the overall gene structure described is highly conserved when compared to other members of the netrin/UNC-6 family. Third, despite the lack of signal in the Northern and RNA blot analysis, a mature transcript was isolated by RT-PCR.
Finally, sequences in the murine EST database have been identified which are highly conserved. Taken together, these data indicate that a novel human netrin gene with a restricted pattern of expression has been identified.
Human netrins may have a significant role in neural regeneration. Though netrins do not by themselves promote axon growth, they do play a role in the orientation of axon growth. The combination of growth promoting activities with axon guidance cues would be a prerequisite for directed neural regeneration.
The ability to clone a gene with such a restricted pattern of expression points out one of the strengths of the exon trapping procedure, since it is unlikely that the netrin gene would have been identified using cDNA selection or direct library screening. These results highlight the need for using a variety of approaches to identify and clone sequences from a large genomic contig.
Exon trapping results further show that there is a novel ATP Binding Cassette (ABC) transporter in the PKD1 locus, located between the LCN1 and D16S291 markers in a centromeric to telomeric orientation. Database searches with the exon trap sequences show homology to the murine ABC1 and ABC2 genes (Luciani et al., supra. 1994). The human homologs of murine ABC1 and ABC2 have been cloned and mapped to human chromosome 9 (Luciani et al., supra. 1994).
Sequences derived from the trapped exons along with those from cDNA selection and SAmple SEquencing (SASE) were used to recover overlapping partial cDNA clones.
Seven exon traps with homology to ABC transporters were isolated from P1 clones 30.1F, 64.12C and 96.4B. Additional sequences encoded by the ABC3 gene were obtained by RT-PCR (placenta and brain RNA as template) and library PCR (using a commercially available lung cDNA library as template) using custom primers designed from the exon traps (Tables II and III). Three exon traps (L48758, L48759 and L48760) were obtained from the region of overlap between the 30.1F, 64.12C and 96.4B P1 clones (transcript F in Figure 1), while a fourth exon (L48753) maps exclusively to the 79.2A P1 clone (transcript E in Figure 1).
[Tables II and III (primers designed from the exon traps) are not legible in this copy.]
Exon traps from the hABC3 transporter encoded by transcript F encode sequences with homology to the R-domain of the murine ABC1 and ABC2 genes. The R-domain is believed to play a regulatory role based on the comparison to a conserved region in CFTR. To date, only ABC1, ABC2 and CFTR have been shown to contain an R-domain (Luciani et al., supra. 1994).
Additionally, a 1.1 kb RT-PCR product that links the three exon traps from transcript F has been obtained; this RT-PCR product detects a 7 kb message on Northern blots. Based on a search of the dbEST database, a cDNA from this region was also obtained, with sequences from exon traps L75924 and L75925 being contained in cDNA 49233 from the I.M.A.G.E. Consortium (Lennon et al., supra.). The presence of both cloned reagents in the same transcription unit has been confirmed using RT-PCR.
The ATP binding cassette (ABC) transporters, or traffic ATPases, comprise a family of more than 100 proteins responsible for the transport of a wide variety of substrates across cell membranes in both prokaryotic and eukaryotic cells (Higgins, C. F., Annu. Rev. Cell. Biol.
A P1 contig containing approximately 700 kb of DNA surrounding the PKD1 and TSC2 genes was assembled from a set of 12 unique chromosome 16-derived P1 clones obtained by screening a 3 genome equivalent P1 library (Shepherd et al., Proc. Natl. Acad. Sci., USA 91:2629-2633, 1994) with 15 distinct probes. Exon trapping was used to identify transcribed sequences from this region in 16p13.3.
96 novel exon traps have been obtained containing sequences from a minimum of eighteen genes in this interval. The eighteen identified genes include five previously reported genes from the interval and a previously characterized gene whose location was unknown (Table I). Additional exon traps have been mapped to genes based on their presence in cDNAs, RT-PCR products, or their hybridization to distinct mRNA species on Northern blots.
[Table I, listing the genes identified in the interval, is not legible in this copy.]
Exon trapping was performed using an improved trapping vector (Burn et al., Gene 161:183-187, 1995), with the resulting exon traps being characterized by DNA
sequence analysis. In order to determine the relative efficiency of the exon trapping procedure, exon traps were compared to the cDNA sequences for those genes known to be in the interval around the PKD1 gene (Figure 1). Single exon traps were obtained from the human homologue of the ERV1 gene (Lisowsky et al., Genomics 29:690-697, 1995) and from the ATP6C proton pump gene (Gillespie et al., Proc. Natl. Acad. Sci., USA 88:4289-4293, 1991). The horizontal line at the top of Figure 1 shows the position of relevant DNA markers with the scale (in kilobases). The positions of NotI sites are shown below the horizontal line. The position and orientation of the known genes are indicated by arrows, with the number of exon traps obtained from each gene shown in parentheses. The positions of the transcription units described in this report (A through M) are shown below the known genes. The Genbank Accession numbers of corresponding exon traps are shown below each transcriptional unit. P1 clones are indicated by the overlapping lines, with the name of the clone shown above the line. The positions of trapped exons which did not map to characterized transcripts are shown below the P1 contig. Vertical lines denote the interval within the P1 clone(s) detected by the exon traps in hybridization studies.
In contrast, eight individual exon traps were isolated from the TSC2 gene and ten from the CCNF gene (The European Chromosome 16 Tuberous Sclerosis Consortium, supra. 1993; Kraus et al., Genomics 24:27-33, 1994). Trapped sequences from three of the exons present in the PKD1 gene were obtained (The American PKD1 Consortium, Hum. Mol. Genet. 4:575-582, 1995; The International Polycystic Kidney Disease Consortium, Cell 81:289-298, 1995; Hughes et al., Nature Genet. 10:151-160, 1995). Sixteen additional exon traps from the 109.8C and 47.2H P1 clones were also obtained.
Sequences present in two exon traps (Genbank Accession Nos. L75926 and L75927), localizing to the region of overlap between the 96.4B and 64.12C P1 clones, were shown to contain sequences from the previously described human homologue of the murine RNPS1 gene (Genbank Accession No. L37368), encoding an S phase-prevalent DNA/RNA-binding protein (Schmidt et al., Biochim. Biophys. Acta 1216:317-320, 1993). A comparison of these exon traps to the dbEST database indicated that they were also contained in cDNA 52161 from the I.M.A.G.E. Consortium (Lennon et al., Genomics 33:151-152, 1996). Based on these data, the hRNPS1 gene can be mapped to 16p13.3 near DNA marker D16S291 (transcript G in Figure 1).
Two exon traps from the 1.8F P1 clone were found to have a high level of homology to the previously described murine ΦAP3 gene, which encodes a zinc finger-containing transcription factor (Fognani et al., EMBO J. 12:4985-4992, 1993). The mΦAP3 protein is believed to function as a negative regulator for genes encoding proteins responsible for the inhibition of cell cycling (Fognani et al., supra.). The two exon traps were linked by PCR, with the resulting 1.2 kb PCR product being 85% identical at the nucleotide level to the murine ΦAP3 cDNA. Hybridization of the ΦAP3-like exon traps to the dot blotted P1 contig indicated that the gene lies in the non-overlapping region of the 1.8F P1, between the DNA markers KLH7 and GGG12 (transcript H in Figure 1).
Significant homology was also seen between two exon traps obtained from the 97.106 P1 and the rat Rab26 gene encoding a ras-related GTP-binding protein involved in the regulation of vesicular transport (Nuoffer et al., Ann. Rev. Biochem. 63:949-990, 1994; Wagner et al., Biochem. Biophys. Res. Comm. 207:950-956, 1995). The Rab26-like exon traps were linked by RT-PCR (transcript J in Figure 1), with the encoded sequences being 94% (83/88) identical at the protein level to Rab26. See, for example, Figure 2, showing alignments of the following selected exon traps with sequences in the databases. Sequences encoded by exon trap L48741 (SEQ ID NO:1) are aligned with N-acetylglucosamine-6-phosphate deacetylase from C. elegans (SEQ ID NO:2), E. coli (SEQ ID NO:3) and Haemophilus (SEQ ID NO:4). The EGF repeats from netrin-1 (SEQ ID NO:7), netrin-2 (SEQ ID NO:6) and UNC-6 (SEQ ID NO:8) are shown aligned to one of the translated netrin-like exon traps (Genbank Accession No. L75917) (SEQ ID NO:5). An alignment of sequences from the second netrin-like exon trap (Genbank Accession No. L75916) (SEQ ID NO:9) and netrin-1 (SEQ ID NO:11) and netrin-2 (SEQ ID NO:10) is shown. The translated Rab26-like RT-PCR product (Genbank Accession Nos. L48770-L48771) (SEQ ID NO:12) is aligned with rat Rab26 (SEQ ID NO:13). Sequences encoded by exon trap L48792 (SEQ ID NO:14) are shown aligned to sequences from the pilB transcriptional repressor from Neisseria gonorrhoeae (SEQ ID NO:15), sequences predicted by computer analysis to be encoded by cosmid F44E2.6 from C. elegans (SEQ ID NO:17), the YCL33C gene product from yeast (Genbank Accession No. P25566) (SEQ ID NO:16), and a transcriptional repressor from Haemophilus (SEQ ID NO:18). Periods denote positions where gaps were inserted in the protein sequence in order to maintain alignment.
In order to correlate exon traps with individual transcripts, cDNA library screening and PCR based approaches were used to clone transcribed sequences containing selected exon traps. RT-PCR was used to link individual exon traps together in cases where the two exon traps had homology to similar sequences in the databases.
In cases where only single exon traps were available, 3' RACE or cDNA library screening was used to obtain additional sequences. Sequences from the exon traps and cloned products were used to map the position, and when possible the orientation, of the corresponding transcription units.
Six unique exon traps, containing sequences from at least eight exons, were shown to be from a transcriptional unit in the centromeric-most P1 clone, 94.10H (transcript A in Figure 1). A 2 kb cDNA linking the six exon traps was isolated and shown to hybridize to an 8 kb transcript. Additional hybridization studies indicated that the gene was oriented centromeric to telomeric, with at least 6 kb of the transcript originating from sequences centromeric of the P1 contig. Extensive homology was observed between the translated cDNA and a variety of protein kinases; however, the presence of the conserved HRDLKPEN motif (SEQ ID NO:71) encoded in exon trap L48734, as well as in the partial cDNA, suggests that it encodes a serine/threonine kinase (van der Geer et al., Ann. Rev. Cell Biol. 10:251-337, 1994).
cDNAs were isolated using sequences derived from a separate 94.10H exon trap (Genbank Accession No. L48738) and the position and orientation of the corresponding transcription unit were determined. Two cDNA species were obtained using exon trap L48738 as a probe, with the only homology between the two species arising from the 109 bases contained in the exon trap. Using oligonucleotide probes, the transcription unit was mapped to a position near the 26-6DIS DNA marker, in a telomeric to centromeric orientation; however, only one of the cDNA species mapped to the P1 contig (transcript B in Figure 1). Based on these data, it is likely that the second cDNA species originated from a region outside of the P1 contig, possibly from the duplicated 26-6PROX marker located further centromeric in 16p13.3 (Gillespie et al., Nuc. Acids Res. 18:7071-7075, 1990).
The 110.1F P1 clone contains at least two genes in addition to the ATP6C gene. Using BLASTX to search the protein databases, significant homology was observed between sequences encoded by exon trap L48741 and the N-acetylglucosamine-6-phosphate deacetylase (nagA) proteins from C. elegans (Wilson et al., supra. 1994), E. coli (Plumbridge, Mol. Microbiol. 3:505-515, 1989) and Haemophilus (Fleischmann et al., Science 269:496-512, 1995). An alignment of the nagA proteins to the translated exon trap revealed the presence of multiple conserved regions (Figure 2), suggesting that the exon trap contains sequences from the human nagA gene. Additional sequences from the nagA-like transcript have been cloned using 3' RACE and the transcription unit mapped to a region between NotI sites 2 and 3 in Figure 1. The gene is oriented telomeric to centromeric, with NotI site 2 being present in the 3' UTR of the RACE clone (transcript C in Figure 1).
Two additional exon traps (Genbank Accession Nos. L75916 and L75917), mapping to the region of overlap between the 110.1F and 53.8B P1 clones (transcript D in Figure 1), were shown to have homology with the chicken netrins (Kennedy et al., Cell 78:425-435, 1994; Serafini et al., Cell 78:409-424, 1994) and the C. elegans UNC-6 protein (Ishii et al., Neuron 9:873-881, 1992) (Figures 2 and 20A). Sequences encoded by exon trap L75917 were shown to have significant homology with the C-terminal-most epidermal growth factor (EGF) repeat found in the netrin and UNC-6 proteins (Figures 2 and 20A). Exon trap L75917 encodes sequences which are 98% identical to sequences from the third EGF repeat of chicken netrin-2 and 90% identical to sequences from the same region of netrin-1. The netrin-like trap L75916 encodes sequences from the more divergent C-terminal domain of the netrins which are 43% identical to sequences contained in the C-terminal domain of netrin-1 and netrin-2 (Figures 2 and 20A). This region is the least conserved between UNC-6 and the netrins, with sequences being 63% conserved between netrin-1 and netrin-2 and 29% conserved between netrin-2 and UNC-6 (Serafini et al., supra.).
The netrins define a family of chemotropic factors which have been shown to play a central role in axon guidance. Axonal growth cones are guided to their target by both local cues, present in the extracellular matrix or on the surface of cells, and long-range cues in the form of diffusible chemoattractants and chemorepellents (Goodman and Shatz, Cell 72:77-98, 1993; Keynes and Cook, Curr. Opin. Neurobiol. 5:75-82, 1995).
Chicken netrin-1 and netrin-2 have been shown to function as chemoattractants for developing spinal commissural axons (Serafini et al., Cell 78:409-424, 1994; Kennedy et al., Cell 78:425-435, 1994), with netrin-1 also acting as a chemorepellent for trochlear motor axons (Colamarino and Tessier-Lavigne, Cell 81:621-629, 1995). Comparative analysis revealed the presence of extensive homology between the chicken netrins and the C. elegans UNC-6 protein, which is required for circumferential cell migration and axon guidance (Hedgecock et al., Neuron 4:61-85, 1990; Ishii et al., Neuron 9:873-881, 1992). More recently, two Drosophila netrins, NETA and NETB, have been described and shown to be required for commissural axon guidance as well as for guidance of motor neurons to their target muscles (Harris et al., Neuron 17:217-228, 1996; Mitchell et al., Neuron 17:203-215, 1996). These studies indicate that the netrin family of chemoattractant and chemorepellent proteins is conserved between invertebrates and vertebrates.
The genomic interval containing the netrin-like exon traps was sequenced in order to obtain additional sequence information from the gene and to rule out the possibility that the exon traps were derived from a pseudogene. In preliminary studies using the 53.8B genomic P1 clone, the netrin-like exon traps were mapped to a 6 kb XhoI fragment. See, for example, Figure 18, wherein relevant DNA markers are shown on top of the horizontal line, with NotI sites (N) being shown below the line. The location and orientation of the ATP6C, CCNF, and nagA transcriptional units have been previously described (Gillespie et al., Proc. Natl. Acad. Sci., USA 88:4289-4293, 1991; Kraus et al., Genomics 24:27-33, 1994; Burn et al., Genome Research 6:525-537, 1996) and are shown below the genomic interval. The two P1 clones containing the netrin gene are shown below the schematic diagram of the interval. The location of the 6.8 kb of genomic sequence is enlarged below the P1 clones. The position of the two exon traps in the 6.8 kb of genomic sequence is also indicated.
The 6 kb fragment, and the adjacent 3.5 kb XhoI
fragment, were subcloned and used to screen a random shotgun library from the 53.8B P1 clone. Subclones which were positive by hybridization were sequenced with forward and reverse vector primers. A total of 88 subclones were sequenced in this manner.
Additional sequence was obtained using internal primers as well as end sequence from the parental XhoI fragments. A total of 6.8 kb of genomic sequence was obtained, with an overall redundancy of 7-fold. The GC-content for the sequenced region was found to be 68.9%, which is slightly higher than the 62.8% observed for the 53 kb of genomic sequence from the PKD1 gene, located 350 kb further telomeric (The American PKD1 Consortium, 1995, supra; Burn et al., 1996, supra).
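The GC-content values quoted above are simple base-composition percentages. As a minimal sketch only (the function name gc_content and the toy sequence are illustrative assumptions and are not part of the original analysis), the calculation could be written in Python as:

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only; it is NOT taken from the 6.8 kb genomic sequence.
example = "GCGCGGCCATGCGCTA"
print(f"{gc_content(example):.1f}% GC")
```

Ambiguous symbols such as N are ignored here, which is one possible convention; other tools may count them in the denominator.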
Computer analyses were performed to identify putative exons. GRAIL2 analysis predicted six exons within the 6.8 kb of genomic sequence, with database analysis indicating that all but one exon (exon 1) encoded sequences with homology to the chicken netrins. Figure 19A shows a GRAIL2 analysis of coding sequences in the 6.8 kb of genomic sequence from the 53.8B P1, with the gray scale denoting GC-content (white to light gray is GC rich and gray to black is AT rich) and vertical boxes indicating relative quality of the predicted exons. A graphical depiction of the predicted exons is shown above the vertical boxes, with light colored boxes denoting exons with a score of "excellent" (>80% probability) and dark colored boxes denoting exons with a score of "good" (>60% probability). The positions of exon traps L75917 and L75916 (left to right, respectively) are shown above the GRAIL2 predicted exons. The structure of the gene based on comparison of the RT-PCR products and genomic sequence is shown at the top; the position of the exons in the genomic sequence is shown by the numbers above the exons. The 5' and 3' untranslated regions are also shown.
Additionally, the 6.8 kb of genomic sequence was compared to the protein sequences of the chicken netrins using a Pustell DNA/protein matrix. The genomic sequence (translated in all six frames) was compared to chicken netrin-2 in Figure 19B, using a PAM250 matrix with the minimum homology set at 50% and the window set at 20. Regions of homology are shown by heavy diagonal lines. Five exons were predicted by this analysis, with only the first GRAIL2 predicted exon not appearing to be bona fide. Sequences from the two exon traps were also predicted by GRAIL2; however, there were noteworthy differences (cf. Figure 19A). In predicting sequences present in exon trap L75917, GRAIL2 included an additional 55 bp at the 5' end of the exon. The first of the two exons present in exon trap L75916 was not predicted by GRAIL2, while GRAIL2 added additional bases to the 5' and 3' ends of the second exon present in this exon trap.
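The DNA/protein comparison described above relies on translating both strands of the genomic sequence in every reading frame. As a rough sketch only (plain Python with the standard genetic code; the function names and the toy inputs are illustrative assumptions, and the code does not reproduce the Pustell matrix or GRAIL2 software), the six-frame translation step might look like this:

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varying fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame(seq: str) -> dict:
    """Translate a DNA sequence in three forward (+1..+3) and three reverse (-1..-3) frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Toy input for illustration only; it is not part of the hNET genomic sequence.
for name, protein in six_frame("ATGGCCTAAGGC").items():
    print(name, protein)
```

Each translated frame could then be compared against a protein such as chicken netrin-2 with whatever scoring scheme is preferred; that comparison step is not shown here.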
A search of the Expressed Sequence Tags (EST) database did not reveal the presence of any ESTs from the human netrin gene. Nor was the human netrin message detected by Northern and/or RNA dot blot analysis using mRNA from over fifty different adult and fetal tissues, suggesting that hNET has an extremely restricted pattern of expression and, when expressed, is present in low abundance. Two murine ESTs, however, were identified from a brain library and a whole fetus library (Genbank Accession Nos. W59766 and AA048205, respectively) which have significant homology to hNET. The murine ESTs contain overlapping sequence, with a total of 477 bp of contiguous sequence being represented. This 477 bp contiguous sequence aligns to the 5' end of the human netrin cDNA and includes 47 bp of 5' UTR and sequences encoding the N-terminal 143 amino acids. A comparison of the deduced human and murine protein sequences indicated that the two proteins were 89.5% (128/143) identical.
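Identity figures such as 128/143 are ratios of matching positions to compared positions. The sketch below is illustrative only: the function name and the toy alignment are assumptions, and skipping gapped columns is just one of several conventions for the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Ratios quoted in the text, reduced to percentages.
print(round(100 * 128 / 143, 1))  # human versus murine N-terminal sequence
print(round(100 * 83 / 88, 1))    # Rab26-like product versus rat Rab26

# Toy alignment for illustration only (hypothetical residues).
print(percent_identity("MKV-LA", "MKVALG"))
```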
Characterization of the Human Netrin Transcript
In order to confirm the structure of the netrin gene, RT-PCR was performed using primers designed from the predicted exons. Since the predicted human netrin appeared to be slightly more homologous to netrin-2 than to netrin-1 (57% versus 54%, respectively) and netrin-2 is expressed in the spinal cord of chicken, adult human spinal cord polyA+ RNA was utilized as a template. RT-PCR products were obtained with only a portion of the primer pairs; however, even this required the use of nested primers and two rounds of PCR, with low yields making it necessary to use hybridization and radiolabeled probes to visualize the products. The low yield, and lack of RT-PCR products in some cases, was attributed to the high GC-content of the products (70-80%). The addition of betaine to a final concentration of 2.5 M in the PCR reactions was found to dramatically improve yield and purity of the RT-PCR products (International Publication No. WO 96/12041; Reeves et al. (1994) Am. J. Hum. Genet. 55:A238; Baskaran et al. (1996) Genome Research 6:633-638).
Assembly of the RT-PCR products revealed a 1743 bp open reading frame (ORF) with an in-frame stop codon upstream of the proposed start methionine. In verifying the start and stop codons, a 209 bp 5' UTR and a 22 bp 3' UTR were cloned. Additional sequences from the respective UTRs were not cloned, however, since the goal of the RT-PCR experiments was only to confirm the predicted protein sequence and not to assemble a full-length cDNA. The position of the intron-exon boundaries was determined based on the comparison of the genomic sequence and the RT-PCR clones (Figure 19A).
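The lengths reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only; it assumes (this is not stated in the text) that the 1743 bp ORF length includes the terminal stop codon, and the function name is hypothetical.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Assumption: when the stop codon is counted in the ORF, it encodes no residue.
    return codons - 1 if includes_stop else codons


orf_bp, utr5_bp, utr3_bp = 1743, 209, 22  # values reported in the text
print(protein_length(orf_bp))        # residues encoded, under the stated assumption
print(utr5_bp + orf_bp + utr3_bp)    # total cloned transcript length in bp
```

Under that assumption the ORF corresponds to 580 residues, which agrees with the 580-amino-acid protein predicted for hNET elsewhere in this description, and the three segments sum to 1974 bp.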
Exon traps from the hABC3 transporter encoded by transcript F encode sequences with homology to the R-domain of the murine ABC1 and ABC2 genes. The R-domain is believed to play a regulatory role based on the comparison to a conserved region in CFTR. To date, only ABC1, ABC2 and CFTR have been shown to contain an R-domain (Luciani et al., supra. 1994).
Additionally, a 1.1 kb RT-PCR product that links the three exon traps from transcript F has been obtained; this RT-PCR product detects a 7 kb message on Northern blots. Based on a search of the dbEST database, a cDNA from this region was obtained, with sequences from exon traps L75924 and L75925 being contained in cDNA 49233 from the I.M.A.G.E. Consortium (Lennon et al., supra.). The presence of both cloned reagents in the same transcription unit has been confirmed using RT-PCR.
The ATP binding cassette (ABC) transporters, or traffic ATPases, comprise a family of more than 100 proteins responsible for the transport of a wide variety of substrates across cell membranes in both prokaryotic and eukaryotic cells (Higgins, C. F., Annu. Rev. Cell Biol. 8:67-113, 1992; Higgins, C. F., Cell 82:693-696, 1995).
Proteins belonging to the ABC transporter superfamily are linked by strong structural similarities. Typically, ABC transporters have four conserved domains: two hydrophobic domains which may impart substrate specificity (Payne et al., Mol. Gen. Genet. 200:493-496, 1985; Foote et al., Nature 345:255-258, 1990; Anderson et al., Science 253:202-205, 1991; Shustik et al., Br. J. Haematol. 79:50-56, 1991; Covitz et al., EMBO J. 13:1752-1759, 1994), and two highly conserved domains associated with ATP binding and hydrolysis (Higgins, supra. 1992). ABC transporters govern unidirectional transport of molecules into or out of cells and across subcellular membranes (Higgins, supra. 1992).
Their substrates range from heavy metals (Ouellette et al., Res. Microbiol. 142:737-746 1991) to peptides and full size proteins (Gartner et al., Nature Genet. 1:16-23 1992).
In eukaryotic cells, ABC transporters exist either as single large symmetrical proteins containing all four domains or as dimers resulting from the association of two smaller polypeptides each containing a hydrophobic and ATP-binding domain. Examples of this multimeric structural form are human TAP proteins (Kelly et al., Nature 355:641-644 1992) and the functional PMP70 protein (Kamijo et al., J. Biol. Chem. 265:4534-40 1990). This multimeric structure is also found in numerous prokaryotic ABC
transporters. The hydrophobic regions are comprised of up to six transmembrane spanning segments. Each ATP binding domain operates independently and may or may not be functionally equivalent (Kerem et al., Science 245:1073-80 1989; Mimmack et al., Proc. Natl. Acad. Sci., USA 86:8257-61 1989; Cutting et al., Nature 346:366-369 1990; Kerppola et al., J. Biol. Chem. 266:9857-65 1991).
Several of the ABC transporters thus far identified in humans have been shown to be clinically important. For example, overexpression of P-glycoproteins is responsible for multi-drug resistance in tumors (Gottesman et al., Ann. Rev. Biochem. 62:385-427 1993).
Classical cystic fibrosis (CF) as well as a large proportion of cases of congenital bilateral absence of the vas deferens (CBAVD) are caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR), an ABC transporter (Kerem et al., supra.; Cutting et al., supra.). Defects in ABC transporters have also been implicated in Zellweger syndrome (Gartner et al., supra.) and adrenoleukodystrophy (Mosser et al., Nature 361:726-730 1993).
Two members of a novel ABC transporter subgroup (murine ABC1 and ABC2) have been shown to contain domains similar to the regulatory R-domain of CFTR (Luciani et al., supra. 1994). Functionally, the mouse ABC1 protein has been shown to play a role in macrophage engulfment of apoptotic cells (Luciani et al., EMBO J. 16:226-235, 1996), while the function of ABC2 remains unknown. All three proteins contain a large charged region containing several potential phosphorylation sites (Kerem et al., supra.; Luciani et al., supra. 1994). The charged amino acid residues within this region are sequentially arranged in blocks of alternating positive and negative charge.
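As an illustration only, the arrangement of charged residues into alternating blocks can be inspected with a short script. The sketch below is not part of the disclosed methods: the function name `charge_blocks` and the example peptide are hypothetical, and histidine is treated as uncharged by assumption.

```python
from itertools import groupby

POSITIVE = set("KR")  # lysine, arginine (histidine ignored by assumption)
NEGATIVE = set("DE")  # aspartate, glutamate


def charge_blocks(protein: str) -> list[tuple[str, int]]:
    """Collapse the charged residues of a sequence into runs of like charge.

    Uncharged residues are skipped, so each returned tuple is
    (sign, number of consecutive charged residues carrying that sign).
    """
    signs = [
        "+" if aa in POSITIVE else "-"
        for aa in protein.upper()
        if aa in POSITIVE or aa in NEGATIVE
    ]
    return [(sign, len(list(run))) for sign, run in groupby(signs)]


# Hypothetical peptide, not taken from any figure of this document.
print(charge_blocks("KKRAADDEGKR"))
```

A region organized as described above would yield a list in which "+" and "-" entries alternate and most run lengths exceed one.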
A common feature of these particular ABC transporters, including hABC3, is the presence of a large linker domain between the two ATP binding cassettes. The presence of numerous polar residues and potential phosphorylation sites in the linker domain suggests that this region may play a regulatory role, perhaps similar to that of the R-domain of CFTR (Kerem et al., supra.). In addition, the four proteins also contain a hydrophobic region, the HH1 domain (Luciani et al., supra. 1994), within the conserved linker domain. Although there is little homology at the sequence level between the HH1 domains of hABC3 and the murine ABCs, they appear to be structurally conserved, with each domain predicted to have β-sheet conformation. The similarity between these proteins suggests that they all belong to the same ABC subfamily, originally defined by ABC1 and ABC2 (Luciani et al., supra. 1994). The genes encoding the human homologues of ABC1 and ABC2 have been mapped to human chromosome 9 at q22-q31 and q34, respectively (Luciani et al., supra. 1994).
Despite being members of the same subfamily, it is likely that ABC1, ABC2 and hABC3 have different functional roles. The differences present in the transmembrane and linker domains of ABC1, ABC2 and hABC3 may confer each with a unique substrate specificity. For example, alterations and mutations in the transmembrane domains of both prokaryotic and eukaryotic ABC transporters have been shown to alter substrate specificity (Payne et al., supra.; Foote et al., supra.; Covitz et al., supra.) while changes to the R-domain of CFTR have been shown to alter its ion selectivity (Anderson et al., supra.; Rich et al., Science 253:205-207 1991). The differences in the expression patterns of ABC1, ABC2 and hABC3 also suggest that the proteins may be functionally distinct. Murine ABC1 and ABC2 have been shown to be expressed at varying levels in a wide variety of adult and embryonic tissues, with the highest levels of ABC1 expression being seen in pregnant uterus and regions rich in monocytic cells while highest levels of ABC2 expression were seen in brain (Luciani et al., supra. 1994; Luciani et al., supra. 1996).
In contrast, hABC3 is preferentially expressed in lung, with significantly lower levels of expression being seen in brain, heart, and pancreas.
Apart from the structural differences between ABC1, ABC2 and hABC3, it is always possible that the three proteins play similar functional roles in different cell populations. To date, no function has been proposed for murine ABC2. However, recent data indicate that ABC1 is required for the engulfment of cells undergoing apoptosis, though the molecular mechanism underlying ABC1 function is unknown (Luciani et al., supra. 1996). If hABC3 functions in a manner similar to ABC1, it could be expressed by pulmonary macrophages involved in host defense.
ABC transporters have been described for substrates ranging from small ions to large polysaccharides and proteins. Based on the high level of expression in lung, the substrate for hABC3 may play an integral role in lung function, including ion or polysaccharide transport. Further clues may be provided by a closer examination of hABC3 expression in the lung. These studies would include the identification of the lung cells responsible for hABC3 expression as well as determining the subcellular localization of hABC3. The identification and cloning of the hABC3 cDNA may have implications for cystic fibrosis, since it contains a potential R-domain and is expressed at highest levels in the lung. If hABC3 does play an integral role in lung function, then modulation or alteration of hABC3 substrate specificity could have significant therapeutic implications for CF.
Several cDNAs were cloned using the GeneTrapper direct selection system and oligos designed from the 5' most trapped exon encoding sequences with homology to ABC1 (trapped exon L48747). The longest clone isolated with the GeneTrapper system from a normal human lung cDNA library using custom oligonucleotides designed from the 5' most exon trap was 5719 bp in length (ABCgt.1). An additional cDNA clone (ABC.S) was isolated using a radiolabeled 1.1 kb RT-PCR product (ABC3-12) as a probe (Figure 15). The 5' end of the ABC3 cDNA was further characterized using 5' RACE, with several RACE products containing multiple in-frame stop codons upstream of the start methionine.
Accordingly, the present invention provides a novel human ABC gene which has homology to the murine ABC1 and ABC2 genes, as well as sequences predicted to be encoded by cosmid C48B4.4 from C. elegans (Wilson et al., supra.). A 6.4 kb cDNA has been assembled for the hABC3 transporter. The assembled cDNA contains a 5116 nucleotide long open reading frame encoding 1705 amino acids, with the predicted protein having a molecular weight of 191 kDa. The proposed start methionine is 50 bp upstream of the 5' end of clone ABCgt.1.
Five trapped exons from P1 clones 109.8C and 47.2H were shown to contain sequences with homology to the human ribosomal protein L3 cDNA, with hybridization studies indicating that the L3-like gene is oriented centromeric to telomeric (transcript L in Figure 1). The ribosomal L3 gene product is one of five essential proteins for peptidyltransferase activity in the large ribosomal subunit (Schulze and Nierhaus, EMBO J. 1:609-613, 1982). Not surprisingly, the L3 amino acid sequence is highly conserved across species. Mammalian L3 genes showing ~98% protein sequence identity have been characterized from man (Genbank Accession No. X73460), mouse (Peckham et al., Genes Dev. 3:2062-2071, 1989), rat (Kuwano and Wool, Biochem. Biophys. Res. Comm. 187:58-64, 1992) and cow (Simonic et al., Biochim. Biophys. Acta 1219:706-710, 1994). The cumulative percent identity between the trapped exons and the reported human ribosomal protein L3 cDNA was 74% (537/724) at the nucleotide level.
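The percent identity figures quoted in this description are ratios of matching positions to aligned positions. As a minimal sketch (the function names are hypothetical, and the alignment itself is assumed to have been produced elsewhere), the arithmetic can be reproduced as follows:

```python
def count_matches(aligned_a: str, aligned_b: str) -> int:
    """Count identical, non-gap positions in two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")


def percent_identity(matches: int, aligned_length: int) -> float:
    """Return identity as a percentage of the aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * matches / aligned_length


# Ratio reported in the text for the trapped exons versus the L3 cDNA.
print(round(percent_identity(537, 724)))
```

Whether gap positions count toward the aligned length differs between alignment tools; this sketch simply uses the denominator reported in the text.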
A full-length cDNA encoding a novel ribosomal L3 protein subtype, SEM L3, was isolated and sequenced (Figure 11). This gene is now designated RPL3L and has been assigned GenBank Accession No. U65581. The deduced protein sequence is 407 amino acids long and shows 77% identity to other known mammalian L3 proteins, which are themselves highly conserved. Hybridization analysis of human genomic DNA suggests this novel gene is single copy and has a tissue specific pattern of expression.
The expression pattern of the previously identified human L3 gene and the novel human RPL3L was determined using multiple tissue Northern blots. The human L3 gene showed a ubiquitous pattern of expression in all tissues with the highest expression in the pancreas. In contrast, the novel gene described herein is strongly expressed in skeletal muscle and heart tissue, with low levels of expression in the pancreas. This novel gene, RPL3L (Ribosomal Protein L3-Like), is located in a gene-rich region near the PKD1 and TSC2 genes on chromosome 16p13.3.
The RPL3L protein is more closely related to the above mentioned cytoplasmic ribosomal proteins than to previously described nucleus-encoded mitochondrial proteins (Graack et al., Eur. J. Biochem. 206:373-380, 1992). The presence of a highly conserved nuclear localization sequence in the RPL3L further supports the hypothesis that it represents a novel cytoplasmic L3 ribosomal protein subtype and not a nucleus-encoded mitochondrial protein.
In addition, an exon trap (Genbank Accession No. L48792) from a gene which is located telomeric of the L3-like gene was obtained (transcript M in Figure 1). Sequences encoded by transcript M were shown to have homology to pilB from Neisseria gonorrhoeae (Taha et al., EMBO J. 7:4367-4378, 1988) as well as to a computer predicted 17.2 kDa protein encoded by cosmid F44E2.6 from C. elegans (Wilson et al., supra.).
Using sequences from exon trap L48792, a 600 bp partial cDNA was isolated and it was determined that the corresponding gene is oriented centromeric to telomeric. A 1.3 kb message was detected by the cDNA on Northern blots.
Sequences conserved between the partial cDNA and the hypothetical 17.2 kDa protein were also conserved in the pilB protein from Neisseria gonorrhoeae (Taha et al., supra. 1988), a hypothetical 19.3 kDa protein from yeast (Genbank Accession No. P25566), and a fimbrial transcription regulation repressor from Haemophilus (Fleischmann et al., Science 269:496-512 1995) (Figure 2).
The pilB protein has homology to histidine kinase sensors and has been shown to play a role in the repression of pilin production in Neisseria gonorrhoeae (Taha et al., supra. 1988; Taha et al., Mol. Microbiol. 5:137-148, 1991).
However, residues conserved between pilB, transcript M and the C. elegans, yeast, and Haemophilus sequences do not include the conserved histidine kinase domains from pilB
(Taha et al., supra. 1991). These findings suggest that the conserved region in transcript M has a function which is independent of the proposed histidine kinase sensor activity of pilB.
An additional exon trap from the region of overlap between the 109.8C and 47.2H P1 clones was shown to contain human LLRep3 sequences (Slynn et al., Nuc. Acids Res. 18:681, 1990). Hybridization studies indicated that the LLRep3 sequences (transcript K in Figure 1) were located between the sazD and L3-like genes. The region of highest gene density appears to be at the telomeric end of this cloned interval, particularly the region between TSC2 and D16S84, with a minimum of five genes mapping to this region (transcription units K, L and M, sazD and hERV1).
Also mapped to this region was an exon trap which is 86% identical (170/197) at the nucleotide level to the previously described rat augmenter of liver regeneration (ALR) (Hagiya et al., Proc. Natl. Acad. Sci., USA 91:8142-8146, 1994). ALR is a growth factor which augments the growth of damaged liver tissue while having no effect on the resting liver. Studies have demonstrated that rat ALR is capable of augmenting hepatocytic regeneration following hepatectomy.
This ALR-like exon trap was also shown to contain sequences from the recently described hERV1 gene, which encodes a functional homologue to yeast ERV1 (Lisowsky et al., supra.).
A 468 bp cDNA, hALR, has been obtained from the human ALR gene (Figure 13). The ALR sequences encode a 119 amino acid protein which is 84.8% identical and 94.1% similar to the rat ALR protein (Figure 14).
The cloning of human ALR has significant implications in the treatment of degenerative liver diseases. For example, biologically active rat ALR has been produced from COS-7 cells expressing rat ALR cDNA
(Hagiya et al., supra.). Accordingly, recombinant hALR
could be used in the treatment of damaged liver. In addition, a construct expressing hALR could be used in gene therapy to treat chronic liver diseases.
Forty-three of the trapped exons did not have significant homology to sequences in the protein or DNA databases, nor were ESTs (expressed sequence tags) containing sequences from the exon traps observed in dbEST. The absence of ESTs containing sequences from these novel exon traps is not surprising since one of the criteria for selecting exon traps for further analysis was the presence of an EST in the database. These trapped exons are likely to represent bona fide products, since in many cases they were trapped multiple times from different P1 clones and in combination with flanking exons.
The present invention encompasses novel human genes and isolated nucleic acids comprising unique exon sequences from chromosome 16. The sequences described herein provide a valuable resource for transcriptional mapping and create a set of sequence-ready templates for a gene-rich interval responsible for at least two inheritable diseases.
Accordingly, the present invention provides isolated nucleic acids encoding human netrin (hNET), human ATP Binding Cassette transporter (hABC3), human ribosomal L3 (RPL3L) and human augmenter of liver regeneration (hALR) polypeptides. The present invention further provides isolated nucleic acids comprising unique exon sequences from chromosome 16. The term "nucleic acids" (also referred to as polynucleotides) encompasses RNA as well as single and double-stranded DNA, cDNA and oligonucleotides.
As used herein, the phrase "isolated" means a polynucleotide that is in a form that does not occur in nature.
One means of isolating polynucleotides encoding invention polypeptides is to probe a human tissue-specific library with a natural or artificially designed DNA probe using methods well known in the art. DNA probes derived from the human netrin gene, hNET, the human ABC transporter gene, hABC3, the human ribosomal protein L3 gene, RPL3L, or the human augmenter of liver regeneration gene, hALR, are particularly useful for this purpose. DNA and cDNA
molecules that encode invention polypeptides can be used to obtain complementary genomic DNA, cDNA or RNA from human, mammalian, or other animal sources, or to isolate related cDNA or genomic clones by the screening of cDNA or genomic libraries, by methods described in more detail below.
The present invention encompasses isolated nucleic acid sequences, including sense and antisense oligonucleotide sequences, derived from the sequences shown in Figures 3, 4, 8, 11 and 15. hNET-, hABC3-, RPL3L- (SEM L3-), and hALR-derived sequences may also be associated with heterologous sequences, including promoters, enhancers, response elements, signal sequences, polyadenylation sequences, and the like. Furthermore, the nucleic acids can be modified to alter stability, solubility, binding affinity, and specificity. For example, invention-derived sequences can further include nuclease-resistant phosphorothioate, phosphoramidate, and methylphosphonate derivatives, as well as "protein nucleic acid" (PNA) formed by conjugating bases to an amino acid backbone as described in Nielsen et al., Science, 254:1497, 1991. The nucleic acid may be derivatized by linkage of the α-anomer nucleotide, or by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage.
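For illustration, the relationship between a sense oligonucleotide and its antisense counterpart is that of reverse complementarity. A minimal sketch follows, assuming unmodified DNA written 5' to 3' with the letters A, C, G and T; the function name `antisense` and the example sequence are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical 6-mer, not taken from any figure of this document.
print(antisense("ATGGCC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when deriving antisense oligonucleotides from a sense strand.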
Furthermore, the nucleic acid sequences of the present invention may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.
In general, nucleic acid manipulations according to the present invention use methods that are well known in the art, as disclosed in, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual 2d Ed. (Cold Spring Harbor, NY, 1989), or Ausubel et al., Current Protocols in Molecular Biology (Greene Assoc., Wiley Interscience, NY, NY, 1992).
Examples of nucleic acids are RNA, cDNA, or genomic DNA encoding a human netrin, a human ABC transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. Such nucleic acids may have coding sequences substantially the same as the coding sequence shown in Figures 3, 4, 8, 11 and 15, respectively.
The present invention further provides isolated oligonucleotides corresponding to sequences within the hNET, hABC3, RPL3L (formerly SEM L3), hALR genes, or within the respective cDNAs, which, alone or together, can be used to discriminate between the authentic expressed gene and homologues or other repeated sequences. These oligonucleotides may be from about 12 to about 60 nucleotides in length, preferably about 18 nucleotides, may be single- or double-stranded, and may be labeled or modified as described below.
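As a rough sketch of how candidate discriminating oligonucleotides might be shortlisted computationally, the code below lists every window of a given length in a target sequence that has no exact match in a homologous sequence. This is an assumption-laden simplification: exact-match absence is only a crude proxy for hybridization specificity, the opposite strand is not considered, and the function name and toy sequences are hypothetical.

```python
def discriminating_oligos(target: str, homologue: str, length: int = 18):
    """List (position, oligo) windows of `target` absent from `homologue`.

    Only exact matches on the given strand are considered.
    """
    homologue_windows = {
        homologue[i:i + length] for i in range(len(homologue) - length + 1)
    }
    return [
        (i, target[i:i + length])
        for i in range(len(target) - length + 1)
        if target[i:i + length] not in homologue_windows
    ]


# Toy sequences and a short window so the result is easy to inspect.
for position, oligo in discriminating_oligos("AAAACCCCGGGG", "AAAACCCCTTTT", 4):
    print(position, oligo)
```

In practice any shortlisted oligonucleotide would still need to be checked empirically under the intended hybridization conditions.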
This invention also encompasses nucleic acids which differ from the nucleic acids shown in Figures 3, 4, 8, 11 and 15, but which have the same phenotype, i.e., encode substantially the same amino acid sequence set forth in Figures 3, 4, 8, 11 and 15, respectively.
Phenotypically similar nucleic acids are also referred to as "functionally equivalent nucleic acids". As used herein, the phrase "functionally equivalent nucleic acids" encompasses nucleic acids characterized by slight and non-consequential sequence variations that will function in substantially the same manner to produce the same protein product(s) as the nucleic acids disclosed herein. In particular, functionally equivalent nucleic acids encode proteins that are the same as those disclosed herein or that have conservative amino acid variations. For example, conservative variations include substitution of a non-polar residue with another non-polar residue, or substitution of a charged residue with a similarly charged residue. These variations include those recognized by skilled artisans as those that do not substantially alter the tertiary structure of the protein.
Further provided are nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, and human augmenter of liver regeneration polypeptides that, by virtue of the degeneracy of the genetic code, do not necessarily hybridize to the invention nucleic acids under specified hybridization conditions. Preferred nucleic acids encoding the invention polypeptide are comprised of nucleotides that encode substantially the same amino acid sequence set forth in Figures 4, 8, 11 and 15.
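The notion of a conservative amino acid variation described above (non-polar for non-polar, like charge for like charge) can be sketched as a simple class lookup. The grouping used here is one common textbook classification and is an assumption of this example, not a definition taken from this document; the names are hypothetical.

```python
# One common textbook grouping of the 20 standard residues (an assumption).
_GROUPS = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
RESIDUE_CLASS = {aa: name for name, members in _GROUPS.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues fall in the same physicochemical class."""
    return (
        original != replacement
        and RESIDUE_CLASS[original] == RESIDUE_CLASS[replacement]
    )


print(is_conservative("L", "I"))  # non-polar for non-polar
print(is_conservative("D", "K"))  # opposite charges
```

A class lookup of this kind says nothing about structural context; whether a given substitution actually preserves tertiary structure remains an empirical question.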
Alternatively, preferred nucleic acids encoding the invention polypeptide(s) hybridize under high stringency conditions to substantially the entire sequence, or substantial portions (i.e., typically at least 12 to 60 nucleotides) of the nucleic acid sequence set forth in Figures 3, 4, 8, 11 and 15, respectively.
Stringency of hybridization, as used herein, refers to conditions under which polynucleotide hybrids are stable. As known to those of skill in the art, the stability of hybrids is a function of sodium ion concentration and temperature. (See, for example, Sambrook et al., supra.).
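The dependence of hybrid stability on sodium ion concentration can be illustrated with one commonly quoted empirical approximation for the melting temperature of a DNA duplex, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 600/N. The constants vary between references and the formula ignores formamide and mismatches, so the sketch below is an assumption-based illustration rather than the definition of stringency used herein; the function name is hypothetical.

```python
import math


def approx_tm(sequence: str, sodium_molar: float) -> float:
    """Estimate duplex melting temperature in degrees Celsius.

    Uses the empirical form 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/N,
    where N is the hybrid length in nucleotides. Constants are assumptions.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0 or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and positive [Na+]")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 600.0 / n


probe = "GC" * 10  # hypothetical 20-mer
print(approx_tm(probe, 1.0), approx_tm(probe, 0.1))
```

Under this approximation a tenfold reduction in sodium ion concentration lowers the estimated melting temperature by 16.6 degrees, which is why lower salt and higher wash temperature both correspond to higher stringency.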
The present invention provides isolated polynucleotides operatively linked to a promoter of RNA
transcription, as well as other regulatory sequences. As used herein, the phrase "operatively linked" refers to the functional relationship of the polynucleotide with regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences. For example, operative linkage of a polynucleotide to a promoter refers to the physical and functional relationship between the polynucleotide and the promoter such that transcription of DNA is initiated from the promoter by an RNA polymerase that specifically recognizes and binds to the promoter, and wherein the promoter directs the transcription of RNA from the polynucleotide.
Promoter regions include specific sequences that are sufficient for RNA polymerase recognition, binding and transcription initiation. Additionally, promoter regions include sequences that modulate the recognition, binding and transcription initiation activity of RNA polymerase.
Such sequences may be cis acting or may be responsive to trans acting factors. Depending upon the nature of the regulation, promoters may be constitutive or regulated.
Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, and the like.
Vectors that contain both a promoter and a cloning site into which a polynucleotide can be operatively linked are well known in the art. Such vectors are capable of transcribing RNA in vitro or in vivo, and are commercially available from sources such as Stratagene (La Jolla, CA) and Promega Biotech (Madison, WI). In order to optimize expression and/or in vitro transcription, it may be necessary to remove, add or alter 5' and/or 3' untranslated portions of the clones to eliminate extra, potential inappropriate alternative translation initiation codons or other sequences that may interfere with or reduce expression, either at the level of transcription or translation. Alternatively, consensus ribosome binding sites can be inserted immediately 5' of the start codon to enhance expression. Similarly, alternative codons, encoding the same amino acid, can be substituted for coding sequences of the human netrin, human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide in order to enhance transcription (e.g., the codon preference of the host cell can be adopted, the presence of G-C rich domains can be reduced, and the like).
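The substitution of alternative codons encoding the same amino acid can be sketched as follows. The standard genetic code is built from its conventional TCAG ordering; the `preferred` mapping is purely illustrative and does not represent measured codon usage for any host, and all function names are hypothetical. Input is assumed to be uppercase DNA in frame.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, '*' marks a stop.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, ignoring trailing partial codons."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Replace codons with synonymous preferred codons, amino acid by amino acid."""
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    usable = len(dna) - len(dna) % 3
    recoded = [
        preferred.get(CODON_TABLE[dna[i:i + 3]], dna[i:i + 3])
        for i in range(0, usable, 3)
    ]
    return "".join(recoded) + dna[usable:]


# Illustrative preferences only; not real host codon usage data.
preferred = {"L": "CTG", "A": "GCC"}
original = "ATGTTAGCATAA"
optimized = recode(original, preferred)
print(optimized, translate(original) == translate(optimized))
```

The check that the translated protein is unchanged is the essential invariant: the nucleotide sequence differs while the encoded amino acid sequence stays the same.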
Examples of vectors are viruses, such as baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression.
Polynucleotides are inserted into vector genomes using methods well known in the art. For example, insert and vector DNA can be contacted, under suitable conditions, with a restriction enzyme to create complementary ends on each molecule that can pair with each other and be joined together with a ligase. Alternatively, synthetic nucleic acid linkers can be ligated to the termini of the restricted polynucleotide. These synthetic linkers contain nucleic acid sequences that correspond to a particular restriction site in the vector DNA. Additionally, an oligonucleotide containing a termination codon and an appropriate restriction site can be ligated for insertion into a vector containing, for example, some or all of the following: a selectable marker gene, such as the neomycin gene for selection of stable or transient transfectants in mammalian cells; enhancer/promoter sequences from the immediate early gene of human CMV for high levels of transcription; transcription termination and RNA processing signals from SV40 for mRNA stability; SV40 polyoma origins of replication and ColE1 for proper episomal replication; versatile multiple cloning sites; and T7 and SP6 RNA promoters for in vitro transcription of sense and antisense RNA. Other means are well known and available in the art.
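The restriction step described above can be illustrated with an in silico digest. The sketch assumes a linear molecule, follows only the top strand, and uses EcoRI (recognition site GAATTC, cut after the first G) as the example enzyme; the function name and the test sequence are hypothetical.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear top-strand sequence at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        fragments.append(sequence[start:position + cut_offset])
        start = position + cut_offset
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


# Hypothetical sequence carrying two EcoRI sites.
print(digest("AAGAATTCTTGAATTCGG"))
```

Each internal fragment begins with AATTC on the top strand, reflecting the 5' AATT overhang that lets insert and vector ends produced by the same enzyme pair before ligation.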
Also provided are vectors comprising a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, and human augmenter of liver regeneration polypeptides, adapted for expression in a bacterial cell, a yeast cell, an amphibian cell, an insect cell, a mammalian cell and other animal cells. The vectors additionally comprise the regulatory elements necessary for expression of the polynucleotide in the bacterial, yeast, amphibian, mammalian or animal cells so located relative to the polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides as to permit expression thereof. As used herein, "expression" refers to the process by which polynucleotides are transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA, if an appropriate eukaryotic host is selected.
Regulatory elements required for expression include promoter sequences to bind RNA polymerase and translation initiation sequences for ribosome binding. For example, a bacterial expression vector includes a promoter such as the lac promoter and, for translation initiation, the Shine-Dalgarno sequence and the start codon AUG (Sambrook et al., supra.). Similarly, a eukaryotic expression vector includes a heterologous or homologous promoter for RNA polymerase II, a downstream polyadenylation signal, the start codon AUG, and a termination codon for detachment of the ribosome. Such vectors can be obtained commercially or assembled from the sequences described by methods well known in the art, for example, the methods described above for constructing vectors in general. Expression vectors are useful to produce cells that express the invention polypeptides.
This invention provides a transformed host cell that recombinantly expresses the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. Invention host cells have been transformed with a polynucleotide encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. An example is a mammalian cell comprising a plasmid adapted for expression in a mammalian cell. The plasmid contains a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide and the regulatory elements necessary for expression of the invention protein.
Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, plant cells, insect cells and animal cells, especially mammalian cells. Of particular interest are E. coli, B. subtilis, Saccharomyces cerevisiae, SF9 cells, C129 cells, 293 cells, Neurospora, CHO cells, COS cells, HeLa cells, and immortalized mammalian myeloid and lymphoid cell lines. Preferred replication systems include M13, ColE1, SV40, baculovirus, lambda, adenovirus, artificial chromosomes, and the like.
A large number of transcription initiation and termination regulatory regions have been isolated and shown to be effective in the transcription and translation of heterologous proteins in the various hosts. Examples of these regions, methods of isolation, manner of manipulation, and the like, are known in the art. Under appropriate expression conditions, host cells can be used as a source of recombinantly produced hNET, hABC3, RPL3L
(formerly SEM L3) and/or hALR.
Nucleic acids (polynucleotides) encoding invention polypeptides may also be incorporated into the genome of recipient cells by recombination events. For example, such a sequence can be microinjected into a cell, and thereby effect homologous recombination at the site of an endogenous gene encoding hNET, hABC3, RPL3L (formerly SEM L3), and/or hALR, an analog or pseudogene thereof, or a sequence with substantial identity to a hNET-, hABC3-, RPL3L- (SEM L3-), or hALR-encoding gene. Other recombination-based methods, such as nonhomologous recombination or deletion of an endogenous gene by homologous recombination, especially in pluripotent cells, may also be used.
The present invention provides isolated peptides, polypeptide(s) and/or protein(s) encoded by the invention nucleic acids. The present invention also encompasses isolated polypeptides having a sequence encoded by hNET, hABC3, RPL3L (SEM L3), and hALR genes, as well as peptides of six or more amino acids derived therefrom. The polypeptide(s) may be isolated from human tissues obtained by biopsy or autopsy, or may be produced in a heterologous cell by recombinant DNA methods as described herein.
As used herein, the term "isolated" means a protein molecule free of cellular components and/or contaminants normally associated with a native in vivo environment. Invention polypeptides and/or proteins include any naturally occurring allelic variant, as well as recombinant forms thereof. Invention polypeptides can be isolated using various methods well known to a person of skill in the art.
The methods available for the isolation and purification of invention proteins include precipitation, gel filtration, and chromatographic methods including molecular sieve, ion-exchange, and affinity chromatography using, e.g., hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-specific antibodies or ligands. Other well-known methods are described in Deutscher et al., Guide to Protein Purification: Methods in Enzymology Vol. 182 (Academic Press, 1990). When the invention polypeptide to be purified is produced in a recombinant system, the recombinant expression vector may comprise additional sequences that encode additional amino-terminal or carboxy-terminal amino acids; these extra amino acids act as "tags" for immunoaffinity purification using immobilized antibodies or for affinity purification using immobilized ligands.
Peptides comprising hNET-, hABC3-, RPL3L- (SEM L3-) or hALR-specific sequences may be derived from the isolated larger hNET, hABC3, RPL3L (SEM L3), or hALR polypeptides described above, using proteolytic cleavage by, e.g., proteases such as trypsin, and chemical treatments such as cyanogen bromide, that are well known in the art.
Alternatively, peptides up to 60 residues in length can be routinely synthesized in milligram quantities using commercially available peptide synthesizers.
An example of the means for preparing the invention polypeptide(s) is to express polynucleotides encoding hNET, hABC3, RPL3L (SEM L3), and/or hALR in a suitable host cell, such as a bacterial cell, a yeast cell, an amphibian cell (e.g., an oocyte), an insect cell (e.g., Drosophila) or a mammalian cell, using methods well known in the art, and to recover the expressed polypeptide, again using well-known methods. Invention polypeptides can be isolated directly from cells that have been transformed with expression vectors, described below in more detail.
The invention polypeptide, biologically active fragments, and functional equivalents thereof can also be produced by chemical synthesis. As used herein, "biologically active fragment" refers to any portion of the polypeptide represented by the amino acid sequence in Figures 4, 8, 11 and 15 that can assemble into an active protein. Synthetic polypeptides can be produced using an Applied Biosystems, Inc. Model 430A or 431A automatic peptide synthesizer (Foster City, CA) employing the chemistry provided by the manufacturer.
Modification of the invention nucleic acids, polynucleotides, polypeptides, peptides or proteins with the following phrases: "recombinantly expressed/produced", "isolated", or "substantially pure", encompasses nucleic acids, polynucleotides, polypeptides, peptides or proteins that have been produced in such form by the hand of man, and are thus separated from their native in vivo cellular environment. As a result of this human intervention, the recombinant nucleic acids, polynucleotides, polypeptides, peptides and proteins of the invention are useful in ways that the corresponding naturally occurring molecules are not, such as identification of selective drugs or compounds.
Sequences having "substantial sequence homology" are intended to refer to nucleotide sequences that share at least about 90% identity with invention nucleic acids, and amino acid sequences that typically share at least about 95% amino acid identity with invention polypeptides. It is recognized, however, that polypeptides or nucleic acids containing less than the above-described levels of homology arising as splice variants, or that are modified by conservative amino acid substitutions or by substitution of degenerate codons, are also encompassed within the scope of the present invention.
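The identity thresholds above can be illustrated with a minimal sketch. The specification does not state how identity is to be computed, so the function names, the requirement that the two sequences already be aligned to equal length, and the treatment of gaps as mismatches are all assumptions made here for illustration only.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks an alignment gap. Columns that are gaps in both
    sequences are ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def has_substantial_homology(seq_a: str, seq_b: str, is_nucleic_acid: bool) -> bool:
    """Apply the 'about 90%' (nucleotide) or 'about 95%' (amino acid) threshold."""
    threshold = 90.0 if is_nucleic_acid else 95.0
    return percent_identity(seq_a, seq_b) >= threshold
```

For example, two aligned 10-base toy sequences differing at one position share 90% identity, which meets the nucleotide threshold but not the amino acid threshold. The toy sequences are made up and are not taken from the Figures.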
The present invention provides a nucleic acid probe comprising a polynucleotide capable of specifically hybridizing with a sequence included within the nucleic acid sequence encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, for example, a coding sequence included within the nucleotide sequence shown in Figures 3, 4, 8, 11 and 15, respectively.
As used herein, a "nucleic acid probe" may be a sequence of nucleotides that includes from about 12 to about 60 contiguous bases set forth in Figures 3, 4, 8, 11 and 15, preferably about 18 nucleotides, may be single- or double-stranded, and may be labeled or modified as described herein. Preferred regions from which to construct probes include 5' and/or 3' coding sequences, sequences predicted to encode transmembrane domains, sequences predicted to encode cytoplasmic loops, signal sequences, ligand binding sites, and the like.
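As a hedged illustration of the probe lengths described above, the following sketch enumerates every contiguous window of a chosen length (18 bases by default, within the stated range of about 12 to about 60) from a sequence. The function name and the toy input are hypothetical; no sequence from the Figures is reproduced, and no selection for the preferred regions is attempted.

```python
def candidate_probes(sequence: str, length: int = 18, step: int = 1):
    """Return (start_index, subsequence) pairs for contiguous windows.

    The 12-60 base bounds follow the range given in the text; the default
    of 18 follows the stated preference.
    """
    if not 12 <= length <= 60:
        raise ValueError("probe length should be between 12 and 60 bases")
    if step < 1:
        raise ValueError("step must be a positive integer")
    seq = sequence.upper()
    # An input shorter than `length` simply yields an empty list.
    return [(i, seq[i:i + length]) for i in range(0, len(seq) - length + 1, step)]
```

A 24-base toy sequence yields seven overlapping 18-base candidates, which could then be filtered for specificity by other means.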
Full-length or fragments of cDNA clones can also be used as probes for the detection and isolation of related genes. When fragments are used as probes, preferably the cDNA sequences will be from the carboxyl end-encoding portion of the cDNA, and most preferably will include predicted transmembrane domain-encoding portions of the cDNA sequence. Transmembrane domain regions can be predicted based on hydropathy analysis of the deduced amino acid sequence using, for example, the method of Kyte and Doolittle (J. Mol. Biol. 157:105, 1982).
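The hydropathy analysis mentioned above can be sketched as a sliding-window average over the Kyte and Doolittle hydropathy index. The per-residue values are the published scale; the 19-residue window and the 1.6 cutoff are commonly quoted settings for locating membrane-spanning segments and are assumptions here, since the text does not specify them. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19):
    """Mean hydropathy of each window of `window` consecutive residues."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_tm_windows(protein: str, window: int = 19, cutoff: float = 1.6):
    """Start indices of windows whose mean hydropathy meets the cutoff."""
    return [i for i, mean in enumerate(hydropathy_profile(protein, window))
            if mean >= cutoff]
```

Windows that meet the cutoff only suggest candidate transmembrane regions; the corresponding cDNA portions would still need to be chosen and checked by the practitioner.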
As used herein, the phrase "specifically hybridizing" encompasses the ability of a polynucleotide to recognize a sequence of nucleic acids that are complementary thereto and to form double-helical segments via hydrogen bonding between complementary base pairs.
Nucleic acid probe technology is well known to those skilled in the art who will readily appreciate that such probes may vary greatly in length and may be labeled with a detectable agent, such as a radioisotope, a fluorescent dye, and the like, to facilitate detection of the probe.
Invention probes are useful to detect the presence of nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. For example, the probes can be used for in situ hybridizations in order to locate biological tissues in which the invention gene is expressed. Additionally, synthesized oligonucleotides complementary to the nucleic acids of a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides are useful as probes for detecting the invention genes, their associated mRNA, or for the isolation of related genes using homology screening of genomic or cDNA libraries, or by using amplification techniques well known to one of skill in the art.
Also provided are antisense oligonucleotides having a sequence capable of binding specifically with any portion of an mRNA that encodes human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide so as to prevent translation of the mRNA. The antisense oligonucleotide may have a sequence capable of binding specifically with any portion of the sequence of the cDNA encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide. As used herein, the phrase "binding specifically" encompasses the ability of a nucleic acid sequence to recognize a complementary nucleic acid sequence and to form double-helical segments therewith via the formation of hydrogen bonds between the complementary base pairs. An example of an antisense oligonucleotide is an antisense oligonucleotide comprising chemical analogs of nucleotides (i.e., synthetic antisense oligonucleotide, SAO).
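Because specific binding as defined above rests on base-pair complementarity, the base sequence of an antisense oligonucleotide for a given mRNA segment is its reverse complement. The sketch below shows only that relationship; the function name and the six-base example are hypothetical, and chemical analogs, delivery and target-site selection are outside its scope.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna_segment: str, as_dna: bool = True) -> str:
    """Reverse complement of an mRNA segment, written 5'->3'.

    Input may be given as RNA (U) or as cDNA sense strand (T).
    With as_dna=True the oligo is written with T instead of U.
    """
    seg = mrna_segment.upper().replace("T", "U")
    unexpected = set(seg) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    oligo = seg.translate(_RNA_COMPLEMENT)[::-1]
    return oligo.replace("U", "T") if as_dna else oligo
```

For the made-up segment 5'-AUGGCC-3', the complementary oligonucleotide is 5'-GGCCAT-3' as DNA or 5'-GGCCAU-3' as RNA.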
Compositions comprising an amount of the antisense oligonucleotide, (SAOC), effective to reduce expression of the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide by passing through a cell membrane and binding specifically with mRNA encoding the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide so as to prevent its translation and an acceptable hydrophobic carrier capable of passing through a cell membrane are also provided herein. The acceptable hydrophobic carrier capable of passing through cell membranes may also comprise a structure which binds to a receptor specific for a selected cell type and is thereby taken up by cells of the selected cell type. The structure may be part of a protein known to bind to a cell-type specific receptor.
This invention provides a means to modulate levels of expression of invention polypeptides by the use of a synthetic antisense oligonucleotide composition (SAOC) which inhibits translation of mRNA encoding these polypeptides. Synthetic oligonucleotides, or other antisense chemical structures designed to recognize and selectively bind to mRNA, are constructed to be complementary to portions of the nucleotide sequences shown in Figures 3, 4, 8, 11 and 15, of DNA, RNA or chemically modified, artificial nucleic acids. The SAOC is designed to be stable in the blood stream for administration to a subject by injection, or in laboratory cell culture conditions. The SAOC is designed to be capable of passing through the cell membrane in order to enter the cytoplasm of the cell by virtue of physical and chemical properties of the SAOC which render it capable of passing through cell membranes, for example, by designing small, hydrophobic SAOC chemical structures, or by virtue of specific transport systems in the cell which recognize and transport the SAOC into the cell.
In addition, the SAOC can be designed for administration only to certain selected cell populations by targeting the SAOC to be recognized by specific cellular uptake mechanisms which bind and take up the SAOC only within select cell populations. For example, the SAOC may be designed to bind to a receptor found only in a certain cell type, as discussed supra. The SAOC is also designed to recognize and selectively bind to the target mRNA sequence, which may correspond to a sequence contained within the sequence shown in Figures 3, 4, 8, 11 and 15.
The SAOC is designed to inactivate the target mRNA sequence by either binding to the target mRNA and inducing degradation of the mRNA by, for example, RNase I digestion, or inhibiting translation of the mRNA target by interfering with the binding of translation-regulating factors or ribosomes, or inclusion of other chemical structures, such as ribozyme sequences or reactive chemical groups which either degrade or chemically modify the target mRNA. SAOCs have been shown to be capable of such properties when directed against mRNA targets (see Cohen et al., TIPS, 10:435, 1989 and Weintraub, Sci. American, January pp. 40, 1990).
This invention further provides a composition containing an acceptable carrier and any of an isolated, purified human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, an active fragment thereof, or a purified, mature protein and active fragments thereof, alone or in combination with each other. These polypeptides or proteins can be recombinantly derived, chemically synthesized or purified from native sources. As used herein, the term "acceptable carrier" encompasses any of the standard pharmaceutical carriers, such as phosphate buffered saline solution, water and emulsions such as an oil/water or water/oil emulsion, and various types of wetting agents.
Also provided are antibodies having specific reactivity with the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptides of the subject invention. Active fragments of antibodies are encompassed within the definition of "antibody". Invention antibodies can be produced by methods known in the art using the invention proteins or portions thereof as antigens. For example, polyclonal and monoclonal antibodies can be produced by methods well known in the art, as described, for example, in Harlow and Lane, Antibodies: A Laboratory Manual (Cold Spring Harbor Laboratory 1988).
The polypeptides of the present invention can be used as the immunogen in generating such antibodies.
Alternatively, synthetic peptides can be prepared (using commercially available synthesizers) and used as immunogens. Where natural or synthetic hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-derived peptides are used to induce a hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-specific immune response, the peptides may be conveniently coupled to a suitable carrier such as KLH and administered in a suitable adjuvant such as Freund's. Preferably, selected peptides are coupled to a lysine core carrier substantially according to the methods of Tam, Proc. Natl. Acad. Sci., USA 85:5409-5413, 1988. The resulting antibodies may be modified to a monovalent form, such as, for example, Fab, Fab2, Fab', or Fv. Anti-idiotypic antibodies may also be prepared using known methods.
In one embodiment, normal or mutated hNET, hABC3, RPL3L (SEM L3), or hALR polypeptides are used to immunize mice, after which their spleens are removed and the splenocytes are used to form cell hybrids with myeloma cells and obtain clones of antibody-secreting cells according to techniques that are standard in the art. The resulting monoclonal antibodies are screened for specific binding to hNET, hABC3, RPL3L (SEM L3), and/or hALR proteins or hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-related peptides.
In another embodiment, antibodies are screened for selective binding to normal or mutated hNET, hABC3, RPL3L (SEM L3), or hALR sequences. Antibodies that distinguish between normal and mutant forms of hNET, hABC3, RPL3L (SEM L3), or hALR may be used in diagnostic tests (see below) employing ELISA, EMIT, CEDIA, SLIFA, and the like. Anti-hNET, -hABC3, -RPL3L (SEM L3), or -hALR antibodies may also be used to perform subcellular and histochemical localization studies. Finally, antibodies may be used to block the function of the hNET, hABC3, RPL3L (SEM L3), and/or hALR polypeptide, whether normal or mutant, or to perform rational drug design studies to identify and test inhibitors of the function (e.g., using an anti-idiotypic antibody approach).
Amino acid sequences can be analyzed by methods well known in the art to determine whether they encode hydrophobic or hydrophilic domains of the corresponding polypeptide. Altered antibodies such as chimeric, humanized, CDR-grafted or bifunctional antibodies can also be produced by methods well known in the art. Such antibodies can also be produced by hybridoma, chemical synthesis or recombinant methods described, for example, in Sambrook et al., supra., and Harlow and Lane, supra. Both anti-peptide and anti-fusion protein antibodies can be used (see, for example, Bahouth et al., Trends Pharmacol. Sci. 12:338, 1991; Ausubel et al., supra.).
Invention antibodies can be used to isolate invention polypeptides. Additionally, the antibodies are useful for detecting the presence of the invention polypeptides, as well as analysis of polypeptide localization, composition, and structure of functional domains. Methods for detecting the presence of a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide comprise contacting the cell with an antibody that specifically binds to the polypeptide, under conditions permitting binding of the antibody to the polypeptide, detecting the presence of the antibody bound to the cell, and thereby detecting the presence of the invention polypeptide on the cell. With respect to the detection of such polypeptides, the antibodies can be used for in vitro diagnostic or in vivo imaging methods.
Immunological procedures useful for in vitro detection of the target human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide in a sample include immunoassays that employ a detectable antibody. Such immunoassays include, for example, ELISA, Pandex microfluorimetric assay, agglutination assays, flow cytometry, serum diagnostic assays and immunohistochemical staining procedures which are well known in the art. An antibody can be made detectable by various means well known in the art. For example, a detectable marker can be directly or indirectly attached to the antibody. Useful markers include, for example, radionuclides, enzymes, fluorogens, chromogens and chemiluminescent labels.
For in vivo imaging methods, a detectable antibody can be administered to a subject and the binding of the antibody to the invention polypeptide can be detected by imaging techniques well known in the art. Suitable imaging agents are known and include, for example, gamma-emitting radionuclides such as 111In, 99mTc, 51Cr and the like, as well as paramagnetic metal ions, which are described in U.S. Patent No. 4,647,447. The radionuclides permit the imaging of tissues by gamma scintillation photometry, positron emission tomography, single photon emission computed tomography and gamma camera whole body imaging, while paramagnetic metal ions permit visualization by magnetic resonance imaging.
The invention provides a transgenic non-human mammal that is capable of expressing nucleic acids encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. Also provided is a transgenic non-human mammal capable of expressing nucleic acids encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide so mutated as to be incapable of normal activity, i.e., does not express native protein.
The present invention also provides a transgenic non-human mammal having a genome comprising antisense nucleic acids complementary to nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide so placed as to be transcribed into antisense mRNA complementary to mRNA encoding a human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, which hybridizes thereto and, thereby, reduces the translation thereof. The polynucleotide may additionally comprise an inducible promoter and/or tissue specific regulatory elements, so that expression can be induced, or restricted to specific cell types. Examples of polynucleotides are DNA or cDNA having a coding sequence substantially the same as the coding sequence shown in Figures 3, 4, 8, 11 and 15.
Examples of non-human transgenic mammals are transgenic cows, sheep, goats, pigs, rabbits, rats and mice. Examples of tissue specificity-determining elements are the metallothionein promoter and the T7 promoter.
Animal model systems which elucidate the physiological and behavioral roles of invention polypeptides are produced by creating transgenic animals in which the expression of the polypeptide is altered using a variety of techniques. Examples of such techniques include the insertion of normal or mutant versions of nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide by microinjection, retroviral infection or other means well known to those skilled in the art, into appropriate fertilized embryos to produce a transgenic animal. See, for example, Carver et al., Bio/Technology 11:1263-1270, 1993; Carver et al., Cytotechnology 9:77-84, 1992; Clark et al., Bio/Technology 7:487-492, 1989; Simons et al., Bio/Technology 6:179-183, 1988; Swanson et al., Bio/Technology 10:557-559, 1992; Velander et al., Proc. Natl. Acad. Sci., USA 89:12003-12007, 1992; Hammer et al., Nature 315:680-683, 1985; Krimpenfort et al., Bio/Technology 9:844-847, 1991; Ebert et al., Bio/Technology 9:835-838, 1991; Simons et al., Nature 328:530-532, 1987; Pittius et al., Proc. Natl. Acad. Sci., USA 85:5874-5878, 1988; Greenberg et al., Proc. Natl. Acad. Sci., USA 88:8327-8331, 1991; Whitelaw et al., Transg. Res. 1:3-13, 1991; Gordon et al., Bio/Technology 5:1183-1187, 1987; Grosveld et al., Cell 51:975-985, 1987; Brinster et al., Proc. Natl. Acad. Sci., USA 88:478-482, 1991; Brinster et al., Proc. Natl. Acad. Sci., USA 85:836-840, 1988; Brinster et al., Proc. Natl. Acad. Sci., USA 82:4438-4442, 1985; Al-Shawi et al., Mol. Cell. Biol. 10(3):1192-1198, 1990; Van Der Putten et al., Proc. Natl. Acad. Sci., USA 82:6148-6152, 1985; Thompson et al., Cell 56:313-321, 1989; Gordon et al., Science 214:1244-1246, 1981; and Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual (Cold Spring Harbor Laboratory, 1986).
Another technique, homologous recombination of mutant or normal versions of these genes with the native gene locus in transgenic animals, may be used to alter the regulation of expression or the structure of the invention polypeptides (see Capecchi et al., Science 244:1288, 1989; Zimmer et al., Nature 338:150, 1989). Homologous recombination techniques are well known in the art.
Homologous recombination replaces the native (endogenous) gene with a recombinant or mutated gene to produce an animal that cannot express native (endogenous) protein but can express, for example, a mutated protein which results in altered expression of the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide.
In contrast to homologous recombination, microinjection adds genes to the host genome, without removing host genes. Microinjection can produce a transgenic animal that is capable of expressing both endogenous and exogenous human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. Inducible promoters can be linked to the coding region of the nucleic acids to provide a means to regulate expression of the transgene.
Tissue-specific regulatory elements can be linked to the coding region to permit tissue-specific expression of the transgene. Transgenic animal model systems are useful for in vivo screening of compounds for identification of ligands, i.e., agonists and antagonists, which activate or inhibit polypeptide responses.
The nucleic acids, oligonucleotides (including antisense), vectors containing same, transformed host cells, polypeptides, as well as antibodies of the present invention, can be used to screen compounds in vitro to determine whether a compound functions as a potential agonist or antagonist to the invention protein. These in vitro screening assays provide information regarding the function and activity of the invention protein, which can lead to the identification and design of compounds that are capable of specific interaction with invention proteins.
In accordance with still another embodiment of the present invention, there is provided a method for identifying compounds which bind to human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. The invention proteins may be employed in a competitive binding assay. Such an assay can accommodate the rapid screening of a large number of compounds to determine which compounds, if any, are capable of binding to invention polypeptides. Subsequently, more detailed assays can be carried out with those compounds found to bind, to further determine whether such compounds act as modulators, agonists or antagonists of invention polypeptides.
In accordance with another embodiment of the present invention, transformed host cells that recombinantly express invention polypeptides can be contacted with a test compound, and the modulating effect(s) thereof can then be evaluated by comparing the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide-mediated response in the presence and absence of test compound, or by comparing the response of test cells or control cells (i.e., cells that do not express invention polypeptides), to the presence of the compound.
As used herein, a compound or a signal that "modulates the activity" of an invention polypeptide refers to a compound or a signal that alters the activity of the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide so that the activity of the invention polypeptide is different in the presence of the compound or signal than in the absence of the compound or signal. In particular, such compounds or signals include agonists and antagonists. An agonist encompasses a compound or a signal that activates polypeptide function.
Alternatively, an antagonist includes a compound or signal that interferes with polypeptide function. Typically, the effect of an antagonist is observed as a blocking of agonist-induced protein activation. Antagonists include competitive and non-competitive antagonists. A competitive antagonist (or competitive blocker) interacts with or near the site specific for agonist binding. A non-competitive antagonist or blocker inactivates the function of the polypeptide by interacting with a site other than the agonist interaction site.
The following examples are intended to illustrate the invention without limiting the scope thereof.
Example I: Contig Assembly
A. Cosmids
Multiple cosmids were used as reagents to initiate walks in YAC and P1 libraries. Clones 16-166N (D16S277), 16-191N (D16S279), 16-198N (D16S280) and 16-140N (D16S276) were previously isolated from a cosmid library (Lerner et al., Mamm. Genome 3:92-100, 1992). Cosmids cCMM65 (D16S84), c291 (D16S291), cAJ42 (ATP6C) and cKG8 were recovered from total human cosmid libraries (made in-house or by Stratagene, La Jolla, CA) using either a cloned insert (CMM65) or sequence-specific oligonucleotides as probe. The c326 cosmid contig and clone 413C12 originated from a flow-sorted chromosome 16 library (Stallings et al., Genomics 13(4):1031-1039, 1992). The c326 contig comprised clones 2H2, 77E8, 325A11 and 325B10.
B. YACs
Screening of gridded interspersed-repetitive sequence (IRS pools from Mark I, Mark II and Mega-YAC libraries) with cosmid-specific IRS probes was as previously described (Liu et al., Genomics 26:178-191, 1995). IRS probes were made from cosmids 16-166N, 16-191N, cAJ42, 16-198N, 325A11, cCMM65, and 16-140N. Biotinylated YAC probes were generated by nick-translating complex mixtures of IRS products from each YAC. Mixtures of sufficient complexity were achieved by performing independent DNA amplifications of total yeast DNA using various Alu primers (Lichter et al., Proc. Natl. Acad. Sci., USA 87:6634-6638, 1990) and then combining the appropriate reactions containing the most diverse products.
C. P1s
Chromosome walking experiments were done using a single set of membranes which contained the gridded P1 library pools (Shepherd et al., supra. 1994). The gridded filters were kindly provided by Dr. Mark Leppert and the Technology Access Section of the Utah Center for Human Genome Research at the University of Utah. P1 gridded membranes were screened using end probes derived from a set of chromosome 16 cosmids (see above) and P1 clones as they were identified. Both RNA transcripts and bubble-PCR products were utilized as end probes.
D. Probes
Radiolabeled transcripts were generated using restriction enzyme digested cosmids or P1s (AluI, HaeIII, RsaI, TaqI) as template for phage RNA polymerases T3, T7 and SP6. The T3 and T7 promoter elements were present on the cosmid-derived templates while T7 and SP6 promoter sequences were contained on the P1-based templates. Transcription reactions were performed as recommended by the manufacturer (Stratagene, La Jolla, CA) in the presence of [α-32P]-ATP (Amersham, Arlington Heights, IL).
Bubble-PCR products were synthesized from restriction enzyme digested P1s (AluI, HaeIII, RsaI, TaqI). Bubble adaptors with appropriate overhangs and phosphorylated 5' ends were ligated to digested P1 DNA basically as described for YACs (Riley et al., Nuc. Acids Res. 18:2887-2890, 1990). The sequence of the universal vectorette primer derived from the bubble adaptor sequence was 5'-GTTCGTACGAGAATCGCT-3' (SEQ ID NO:67), and differed from that of Riley and co-workers in having 12 fewer 5' nucleotides. The Tm of the truncated vectorette primer more closely matched that of the paired amplimer from the vector-derived promoter sequence (SP6, T7). The desired bubble-PCR product was gel purified prior to radiolabeling (Feinberg et al., Anal. Biochem. 132:6-13, 1983; Feinberg and Vogelstein, Anal. Biochem. 137:266-267, 1984).
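The Tm comparison above can be made concrete with a rough estimate. The text does not say how Tm was calculated, so the Wallace rule used below (2°C per A or T, 4°C per G or C, a rule of thumb for short oligonucleotides) is an assumption, and the function name is hypothetical. Only the vectorette primer sequence given in the text is used; the untruncated primer and the SP6 and T7 amplimers are not reproduced here.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); intended only for short oligos.
    """
    p = primer.upper()
    unexpected = set(p) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


# Truncated universal vectorette primer from the text (SEQ ID NO:67).
VECTORETTE_PRIMER = "GTTCGTACGAGAATCGCT"
estimated_tm = wallace_tm(VECTORETTE_PRIMER)  # 9 A/T and 9 G/C bases
```

Under this rule the 18-base primer, with nine A/T and nine G/C bases, comes out at about 54°C. The same function could be applied to a paired amplimer to check how closely the two estimates agree.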
The specificity of all end probes was determined prior to their use on the single set of gridded P1 filter arrays. Radiolabeled probes were pre-annealed to Cot-1 DNA as recommended (Life Technologies Inc., Gaithersburg, MD) and then hybridized to strips of nylon membrane to which were bound 10-20 ng each of the following DNAs: the cloned genomic template used to create the probe; one or more unrelated cloned genomic DNAs; cloned vector (no insert); and human genomic DNA.
Hybridizations were performed in CAK solution (5x SSPE, 1% SDS, 5x Denhardt's Solution, 100 mg/mL torula RNA) at 65°C overnight. Individual end probes were present at a concentration of 5 x 10^5 cpm/mL. Hybridized membranes were washed to a final stringency of 0.1x SSC/0.1% SDS at 65°C.
The hybridization results were visualized by autoradiography. Probes which hybridized robustly to their respective cloned template while not hybridizing to unrelated cloned DNAs, vector DNA or genomic DNA were identified and used to screen the gridded P1 filters.
Hybridization to the arrayed P1 pools was performed as described for the nylon membrane strips (above) except that multiple probes were used simultaneously. Positive clones were identified, plated at a density of 200-500 cfu per 100 mm plate (LB plus 25 mg/mL kanamycin), lifted onto 82 mm HATF membranes (Millipore, Bedford, MA), processed for hybridization (Sambrook et al., supra.) and then rescreened with the complex probe mixture.
A single positive clone from each pool was selected and replated onto a master plate. To identify the colony purified genomic P1 clone and its corresponding probe, multiple P1 DNA dot blots were prepared and each hybridized to individual radiolabeled probes. All hybridizations contained a chromosome 16p13.3 reference probe, e.g. cAJ42, as well as a uniquely labeled P1 DNA probe.
Example II: Exon Trapping
Genomic P1 clones were prepared for exon trapping experiments by digestion with PstI, double digestion with BamHI/BglII, or by partial digestion with limiting amounts of Sau3AI. Digested P1 DNAs were ligated to BamHI-cut and dephosphorylated vector, pSPL3B, while PstI-digested P1 DNA was subcloned into PstI-cut dephosphorylated vector, pSPL3B.
Ligations were performed in triplicate using 50 ng of vector DNA and 1, 3 or 6 mass equivalents of digested P1 DNA. Transformations were performed following an overnight 16°C incubation, with 1/10 and 1/2 of the transformation being plated on LB (ampicillin) plates.
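The ligation and plating quantities above reduce to simple arithmetic, sketched below. Reading "mass equivalents" as multiples of the vector mass is an interpretation, not something the text states, and the function names are hypothetical.

```python
from fractions import Fraction


def insert_masses_ng(vector_ng: float = 50.0, mass_equivalents=(1, 3, 6)) -> dict:
    """Insert DNA mass (ng) per ligation, assuming each mass equivalent
    equals the mass of vector DNA in the reaction."""
    return {eq: vector_ng * eq for eq in mass_equivalents}


def unplated_fraction(plated=(Fraction(1, 10), Fraction(1, 2))) -> Fraction:
    """Fraction of a transformation left over after the stated platings."""
    return 1 - sum(plated)
```

Under that reading the three ligations would contain 50, 150 and 300 ng of digested P1 DNA, and plating 1/10 and 1/2 of each transformation would leave 2/5 unplated.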
After overnight growth at 37°C, colonies were scraped off those plates having the highest transformation efficiency (based on a comparison to "no insert" ligation controls) and miniprepped using the alkaline lysis method. To examine the proportion of the pSPL3B containing insert, a small portion of the miniprep was digested with HindIII, which cuts pSPL3B on each side of the multiple cloning site.
Example III: RNA Preparation
Approximately 10 µg of the remaining miniprep DNA was ethanol precipitated, resuspended in 100 µl of sterile PBS and electroporated into approximately 2 x 10^6 COS-7 cells (in 0.7 ml of ice cold PBS) using a BioRad GenePulser electroporator (1.2 kV, 25 µF and 200 Ω). The electroporated cells were incubated for 10 min. on ice prior to their addition to a 100 mm tissue culture dish containing 10 ml of prewarmed complete DMEM.
Cytoplasmic RNA was isolated 48 hours post-transfection. The transfected COS-7 cells were removed from tissue culture dishes using 0.25% trypsin/1 mM EDTA (Life Technologies Inc., Gaithersburg, MD). Trypsinized cells were washed in DMEM/10% FCS and resuspended in 400 µl of ice cold TKM (10 mM Tris-HCl pH 7.5, 10 mM KCl, 1 mM MgCl2) supplemented with 1 µl of RNAsin (Promega, Madison, WI). After adding 20 µl of 10% Triton X-100, the cells were incubated for 5 min. on ice.
The nuclei were removed by centrifugation at 1200 rpm for 5 min. at 4°C. Thirty microliters of 5% SDS was added to the supernatant, with the cytoplasmic RNA being further purified by three rounds of extraction using phenol/chloroform/isoamyl alcohol (24:24:1). The cytoplasmic RNA was ethanol precipitated and resuspended in [?]0 µl of H2O.
Reverse transcription and PCR were performed on the cytoplasmic RNA prepared above as described (Church et al., supra. 1994) using commercially available exon trapping oligonucleotides (Life Technologies Inc., Gaithersburg, MD). The resulting CUA-tailed products were shotgun subcloned into pAMP10 as recommended by the manufacturer (Life Technologies Inc.). Random clones from each ligation were analyzed by colony PCR using secondary PCR primers (Life Technologies Inc.).
Miniprep DNA containing the pAMP10/exon traps was prepared from overnight cultures by alkaline lysis using the EasyPrep manifold or a QIAwell 8 system according to the manufacturers' instructions (Pharmacia, Piscataway, NJ and Qiagen Inc., Chatsworth, CA, respectively). DNA products containing trapped exons, based on comparison to the 177 bp "vector only" DNA product, were selected for sequencing.
Example IV: Sequencing
DNA sequencing was performed using Pharmacia ALF and Applied Biosystems 377 PRISM automated DNA sequencers (Piscataway, NJ, and Foster City, CA). DNA sequences were aligned using Sequencher DNA analysis software (GeneCodes, Ann Arbor, MI). DNA and protein database searches were performed using the BLASTN (Altschul et al., J. Mol. Biol. 215:403-410, 1990) and BLASTX (Altschul et al., supra. 1990; Gish et al., Nat. Genet. 3:266-272, 1993) programs. SASE sequences were analyzed using BLAST (Altschul et al., supra. 1990; Gish et al., supra. 1993) and FASTA (Lipman et al., Science 227:1435-1441, 1985) searches. Protein sequences were analyzed using MacVector (Oxford Molecular Group, Campbell, CA), BCM Launcher (Smith et al., Genome Research 6:454-462, 1996), ClustalW (Thompson et al., Nucleic Acids Res. 22:4673-4680, 1994), and PSORT (Nakai et al., Genomics 14:897-911, 1992).
Example V: RT-PCR, RACE, SASE and cDNA Isolation
Based upon the sequence determined (above), two oligonucleotide primers (Table II) were designed for each exon trap using Oligo 4.0 (National Biosciences Inc., Plymouth, MN).
To determine which tissue-specific library to screen for transcript or cDNA, RT-PCR reactions and/or PCR
reactions were performed using different tissue-derived RNAs and/or cDNA libraries, respectively, as template with the oligonucleotide primers designed for each exon trap (above).
The oligonucleotides designed from the exons (Table II) were then used in one or more of the following positive selection formats to screen the corresponding tissue-specific cDNA library.
For RT-PCR experiments, the first oligonucleotide was used as a sense primer and the second oligonucleotide was used as an antisense primer. RT-PCR was performed as described using polyA+ RNA from adult brain and placenta (Kawasaki, In PCR Protocols: A Guide to Methods and Applications, Eds. Innis et al., Academic Press, San Diego, CA, pp. 21-27, 1990). All PCR products were cloned using the pGEM-T vector as described by the manufacturer (Promega, Madison, WI).
To clone sequences 3' to selected exon traps, rapid amplification of cDNA ends (RACE) was performed as described (Frohman, PCR Met. Appl. 4:S40-S58, 1994). In 3' RACE experiments, the first oligonucleotide was used as the external primer and the second oligonucleotide was used as the internal primer.
For the Genetrapper cDNA Positive Selection System, the first oligonucleotide primer was biotinylated and used for direct selection, while the second oligonucleotide was used in the repair.
In addition to exon trapping, the cloned contig was also screened using cDNA selection essentially as described (Parimoo et al., Anal. Biochem. 228:1-17, 1995), using the genomic P1 clones from this interval (Dackowski et al., Genome Res. 6:515-524, 1996). Other coding sequence was obtained by SAmple SEquencing (SASE).
SASE was performed as a functional genomics method for gene identification. Briefly, DNA from individual P1 clones was partially digested with Sau3AI and 3 kb fragments were subcloned into the pBluescriptKS+ plasmid (Stratagene, La Jolla, CA). Subclones were sequenced from both ends to generate sequences semi-randomly from the P1 clone.
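As a rough illustration of where Sau3AI can cut, the sketch below locates GATC sites in a sequence and lists the fragments of a complete digest. It is not part of the disclosed method: the function names and the toy sequence are invented for illustration, and a partial digest such as the one described above would cut only a subset of these sites.

```python
# Sau3AI recognizes the 4-bp palindrome GATC and cuts immediately 5' of the G.
SAU3AI_SITE = "GATC"


def sau3ai_cut_positions(seq: str) -> list[int]:
    """Return 0-based indices at which Sau3AI would cut the top strand."""
    seq = seq.upper()
    positions = []
    start = seq.find(SAU3AI_SITE)
    while start != -1:
        positions.append(start)  # the cut falls just before the G of GATC
        start = seq.find(SAU3AI_SITE, start + 1)
    return positions


def complete_digest(seq: str) -> list[str]:
    """Fragments produced by cutting a linear molecule at every site."""
    seq = seq.upper()
    bounds = [0] + sau3ai_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy sequence, not taken from any P1 clone.
toy = "AAGATCTTTTGGATCCAAAGATC"
cuts = sau3ai_cut_positions(toy)
fragments = complete_digest(toy)
print(cuts)
print(fragments)
```

In a partial digest, limiting enzyme leaves most sites uncut, which is why fragments of about 3 kb can be recovered even though GATC occurs far more often than that on average.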
Example VI: Nucleotide Sequence Analysis
hNET: A random shotgun library was prepared from the 53.8B P1 clone (Figure 18) by subcloning randomly sheared P1 DNA into the pAMP10 vector (Life Technologies Inc., Gaithersburg, MD) essentially as described (Andersson et al. (1994) Anal. Biochem. 218:300-308). P1 DNA was randomly sheared using a nebulizer (Hudson RCI, Temecula, CA). The library was initially screened with a 6 kb XhoI fragment, which had been shown to contain the netrin encoding exon traps (Figure 18). The library was subsequently screened with an adjacent 3.5 kb XhoI fragment in order to obtain additional clones for sequencing.
Positive clones were sequenced using forward and reverse vector primers as previously described (The American PKD1 Consortium (1995) Hum. Mol. Genet. 4:575-582).
The genomic sequence was edited and assembled using Sequencher (GeneCodes, Ann Arbor, MI). The coding region was predicted using the World Wide Web version of the GRAIL2 program (Uberbacher and Mural (1991) Proc. Natl. Acad. Sci., USA 88:11261-11265; Xu et al. (1994) Genet. Eng. N.Y. 16:241-253) and a MacVector (Oxford Molecular Group, Campbell, CA) Pustell DNA/protein matrix analysis comparing the genomic sequence (translated in all reading frames) to the chicken netrins. Database searches were performed using BLASTN (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLASTX (Altschul et al., 1990, supra; Gish and States (1993) Nat. Genet. 3:266-272).
RT-PCR: Both adult (brain, heart, kidney, leukocytes, liver, lung, a lymphoblastoid cell line, placenta, spleen, and testis) and fetal (kidney and brain) cDNA libraries were prescreened for the presence of netrin cDNAs by PCR as described (Van Raay et al., 1996, supra).
Nested RT-PCR was utilized to clone transcribed sequences from the netrin gene. Briefly, spinal cord polyA+ RNA
(Clontech, Palo Alto, CA) was reverse transcribed using random primers as described (Kawasaki, 1990 In "PCR
Protocols: A Guide to Methods and Applications" (M. A.
Innis, D.H. Gelfand, J.J. Sninsky, and T.J. White. Eds.), pp. 21-27, Academic Press, Inc., San Diego).
Primers for PCR (Table IV) were designed based on the exons predicted from the analysis of the genomic sequence and used to amplify spinal cord RNA since spinal cord has been previously shown to express low levels of chicken netrin (Serafini et al., supra.). Nested PCR was required to detect RT-PCR products from human spinal cord RNA. Spinal cord RNA was reverse transcribed with random primers and primary PCR was performed in the presence of 2.5 M betaine (Sigma Chemical Co., St. Louis, MO) using the primers designed from the gene model (Table IV). The primary PCR reactions were then diluted 1:20 and secondary PCR was performed on 1 µL of the diluted primary reactions using nested primers (also designed from the gene model), again in the presence of betaine. The inclusion of betaine at a final concentration of 2.5 M in the PCR reactions dramatically increased the purity and yield of the human netrin RT-PCR products (see, for example, International Publication No. WO 96/12041; Reeves et al. (1994) Am. J. Hum. Genet. 55:A238; Baskaran et al. (1996) Genome Research 6:633-638).
RT-PCR products were subcloned using pGEM-T
(Promega, Madison, WI) as recommended by the manufacturer.
The resulting RT-PCR clones were sequenced with vector primers and internal primers using the ABI dye terminator chemistry (Perkin Elmer, Foster City, CA) and an ABI 377 automated sequencer (Perkin Elmer, Foster City, CA).
Multiple sequence alignments were performed using ClustalW
(Thompson et al., (1994) Nucleic Acids Res. 22:4673-4680).
Sequence analysis of the RT-PCR products indicated that hNET contains at least six exons. The RT-PCR data indicate that the fourth predicted exon is actually split by an intron in the human netrin gene and is present as two exons. Three of the RT-PCR exons were shown to be identical to the original exon traps. Aside from the extra exon, the gene model is nearly identical to the RT-PCR products. The cDNA coding sequence, predicted protein product and full length sequence are shown in Figures 4A through 4C, respectively.
Northern blot analysis: Genomic and RT-PCR probes were radiolabeled (Feinberg and Vogelstein, Anal. Biochem.
132:6-13, 1983) and used to probe Northern blots containing RNAs from a variety of adult tissues (Clontech, Palo Alto, CA), including a panel of RNAs from different neural tissues including spinal cord. In addition, a human RNA
Master Blot (Clontech, Palo Alto, CA) containing RNAs from 50 different adult and fetal tissues was screened as recommended by the manufacturer.
hABC3: A human lung cDNA library (LTI, Gaithersburg, MD) was screened with the GeneTrapper system (LTI, Gaithersburg, MD) using capture and repair oligonucleotides (5'-CATTGCCCGTGCTGTCGTG-3' (SEQ ID NO:52) and 5'-CATCGCCGCCTCCTTCATG-3' (SEQ ID NO:53), respectively) designed from trapped exon L48757, the 5' most trapped exon with homology to murine ABC1. Direct cDNA library screening was also performed using an RT-PCR clone as probe. 5' RACE (Frohman, M.A. in Methods Enzymol. (J. N. Abelson and M.I. Simon Eds.) pp. 340-356, Academic Press, San Diego, CA 1993) was used to isolate additional 5' sequences from the ABC3 transcript.
Northern blot analysis: A 679 bp fragment from the 3' untranslated region (UTR) of the ABC3 cDNA was radiolabeled by random priming (Feinberg et al., supra.
1983) and used to probe a multiple tissue northern blot (Clontech, Palo Alto, CA) under conditions recommended by the manufacturer.
Identification of coding sequence for the novel ABC transporter: The gene for a novel ATP binding cassette (ABC) transporter, designated ABC3, has been mapped to the PKD1 locus on chromosome 16 (Burn et al., Genome Res. 6:525-537, 1996). Eight exons from the hABC3 gene were obtained from the 30.1F, 64.12C and 96.4B P1 clones using exon trapping. See Figure 16, which shows the genomic interval surrounding the hABC3 gene at the top, with NotI sites, DNA markers, and distances (in kb) also being shown. Genomic P1 clones from the interval which contain sequence from the hABC3 gene are shown below the genomic map. The relative position of the hABC3 cDNA is provided below the P1 clones, with the selected cDNA, trapped exons, RT-PCR clones, and cDNAs being indicated. Trapped exons and RT-PCR clones used in the isolation of additional hABC3 sequences have been labeled. The discontinuity in the line for clone ABCgt.1 represents the absence of an alternatively spliced exon.
Seven of these trapped exons encoded sequences having homology to murine ABC1 and ABC2 based on BLASTX
analysis (Altschul et al., supra. 1990; Gish et al., supra.
1993), with sequences from the trapped exons L48758, L48759, and L48760 having highest homology. Sequences encoded by the trapped exon L48760 also had homology to a Caenorhabditis elegans ABC transporter predicted from genomic sequence (Wilson et al., supra.).
cDNA selection yielded a single 261 bp cDNA clone which mapped near the 5' end of the ABC3 gene. Like L48760, this clone encoded sequences having homology to the hypothetical C. elegans ABC transporter. Initial analysis of the SASE results from the 30.1F P1 clone indicated that 4 of the 164 reactions encoded sequences with homology to ABC1 or ABC2. Subsequent comparison of the SASE data to the final hABC3 cDNA indicated that an additional seven sequencing reactions contained coding sequences from the ABC3 gene. A total of 1.6 kb of ABC3 coding sequence aligned with the SASE data. In that only 3.5 kb of coding sequence from the 5' end of the hABC3 gene map to the 30.1F P1 clone, this represents a level of 45% coverage for the SASE analysis.
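The coverage figure quoted above follows from simple ratios. The short sketch below reproduces the arithmetic from the numbers given in the text (1.6 kb aligned out of 3.5 kb of coding sequence, and 4 plus 7 informative reactions out of 164); the function name is illustrative only.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Coding sequence recovered by SASE relative to the coding sequence
# that maps to the 30.1F P1 clone (values taken from the text above).
coverage = percent(1.6, 3.5)

# Fraction of SASE reactions that contained ABC3 coding sequence:
# 4 found initially plus 7 found after comparison with the final cDNA.
informative = percent(4 + 7, 164)

print(f"coding-sequence coverage: {coverage:.1f}%")
print(f"informative reactions: {informative:.1f}%")
```

The first ratio evaluates to about 45.7%, consistent with the roughly 45% coverage stated in the text.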
Assembly and analysis of a cDNA for the novel ABC
transporter: Two complementary approaches were employed to assemble the full-length hABC3 cDNA. First, RT-PCR was utilized to link the trapped exons, selected cDNA, and SASE
data. Secondly, cDNA library screening was performed using direct selection as well as radiolabeled probes.
Using primers designed from the trapped exons L48757, L48758, L48760 and L75924, three RT-PCR products, containing 3.3 kb of coding sequence were cloned (Table I
and Figure 16). An additional RT-PCR primer was designed from a region of identity between the selected cDNA and the SASE data (Table I). A 900 bp RT-PCR clone was obtained using the latter primer in conjunction with a trapped exon derived primer. In total, 4.2 kb of coding sequence was obtained using RT-PCR.
Several cDNAs were cloned using the GeneTrapper direct selection system and oligos designed from the 5' most trapped exon encoding sequences with homology to ABC1 (trapped exon L48747). The longest clone isolated with the GeneTrapper system was 5719 bp in length (ABCgt.1) (Figure 8). This cDNA contains a 792 bp 3' untranslated region with a consensus polyadenylation/cleavage site 20 bp upstream of the polyA tail. An additional cDNA clone (ABC.5) was isolated using a radiolabeled 1.1 kb RT-PCR product (ABC3-12) as a probe (Figure 16). The 5' end of the ABC3 cDNA was further characterized using 5' RACE, with several RACE products containing multiple in-frame stop codons upstream of the start methionine.
Sequence analysis indicated that clone ABCgt.1 lacks 147 bp of sequence found in the RT-PCR clones and the cDNA clone ABC.5. The additional 147 bp segment is likely to be the result of alternative splicing, in that it does not interrupt the open reading frame. The presence of both transcript populations has been confirmed by PCR using primers flanking the alternatively spliced exon.
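One reason a 147 bp segment can be included or skipped without disturbing the downstream reading frame is that its length is a multiple of three. The sketch below checks that condition; it is only the length test, and it does not check for in-frame stop codons within the segment, which would require the sequence itself. The function names are illustrative.

```python
def preserves_reading_frame(segment_length_bp: int) -> bool:
    """True if inserting or removing the segment keeps downstream codons in frame."""
    return segment_length_bp % 3 == 0


def codons_in_segment(segment_length_bp: int) -> int:
    """Number of whole codons contributed by an in-frame segment."""
    if not preserves_reading_frame(segment_length_bp):
        raise ValueError("segment length is not a multiple of 3")
    return segment_length_bp // 3


in_frame = preserves_reading_frame(147)
extra_codons = codons_in_segment(147)
print(in_frame, extra_codons)
```

For the 147 bp segment this gives 49 codons, so transcripts with and without the segment would differ by 49 amino acids while sharing the same frame downstream.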
A 6.4 kb cDNA has been assembled for the hABC3 transporter. The assembled cDNA contains a 5116 nucleotide long open reading frame encoding 1705 amino acids, with the predicted protein having a molecular weight of 191 kDa.
The proposed start methionine is 50 bp upstream of the 5' end of clone ABCgt.1. Although the sequence surrounding the start methionine matches the Kozak sequence in only 6 of 10 positions (Kozak, J. Cell Biol. 115:887-903, 1991), the two positions which have been shown to be critical for function (an A at -3 and a G at +4) are conserved in hABC3.
The hABC3 cDNA contains a 792 bp 3' UTR with a consensus polyadenylation/cleavage site 20 bp upstream of the polyA tract.
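The kind of comparison described for the start codon context can be sketched as follows. This is a minimal illustration that assumes the 10-position consensus GCCRCCATGG (positions -6 to +4, with R meaning A or G); the example context is an invented sequence chosen to give 6 matches and is not the hABC3 sequence, which is not reproduced in this passage.

```python
# Assumed consensus for positions -6..-1 followed by +1..+4 (ATG is +1..+3).
KOZAK_CONSENSUS = "GCCRCCATGG"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}


def kozak_report(context: str) -> dict:
    """Compare a 10-nt start codon context (-6..+4) with the consensus."""
    context = context.upper().replace("U", "T")
    if len(context) != len(KOZAK_CONSENSUS):
        raise ValueError("expected 10 nucleotides spanning -6 to +4")
    matches = sum(
        base in IUPAC[cons] for base, cons in zip(context, KOZAK_CONSENSUS)
    )
    return {
        "matches": matches,
        "a_at_minus3": context[3] == "A",  # index 3 is position -3
        "g_at_plus4": context[9] == "G",   # index 9 is position +4
    }


# Hypothetical context, for illustration only.
report = kozak_report("TTAACTATGG")
print(report)
```

A context can therefore score poorly on the overall match count while still carrying both of the positions described above as critical.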
A 6.8 kb transcript is detected by a 3' UTR cDNA
probe on northern blots with highest levels of expression being observed in lung with lesser amounts in brain, heart, and pancreas. Significantly lower levels of expression were observed in placenta and skeletal muscle after longer exposure times. The ABC3 transcript was not detected in either liver or kidney.
RPL3L (SEM L3): The longest cDNA is 1548 nucleotides in length (Figure 11). All three cDNAs have an open reading frame (ORF) of 1224 nucleotides, with the longest cDNA containing a 48 nucleotide 5' untranslated region. An in-frame stop codon at position 7 is followed by the Kozak initiation sequence CCACCATGT (SEQ ID NO:68) (Kozak, supra.). The 3' UTRs of the three cDNAs vary in length and lack a consensus polyadenylation/cleavage site.
The longest cDNA was compared to the human, bovine and murine ribosomal L3 genes. At the nucleotide level there is only 74% identity between the RPL3L (SEM L3) cDNA and the consensus from these other ribosomal L3 cDNAs.
This is in sharp contrast to the 98% identity shared between human, bovine, and murine L3 nucleotide sequences.
There is no similarity between the 3' UTR of the cDNAs isolated here and the other L3 genes.
hALR: Sequences were cloned from the human ALR
gene by 3' RACE using primers (e.g., external 5'-TGGCCCAGTTCATACATTTA-3' (SEQ ID NO:69) and internal 5'-TTACCCCTGTGAGGAGTGTG-3' (SEQ ID NO:70)) designed from the exon trap. A total of 468 bp have been obtained from the human ALR gene (Figure 13).
Example VII: Amino Acid Sequence Analysis
hNET: hNET cDNA has at least 210 bp of 5' untranslated sequence, a 5' start methionine codon, a 3' stop codon (TGA) and is predicted to be 580 amino acids in length (Figure 4), with the common domain structure of the netrin family being conserved (Figure 20A). Overall, the human netrin was found to have higher homology to chicken netrin-2 than netrin-1, i.e., 56.3% versus 53.9%. As is the case with the other members of the netrin family, the region of greatest conservation includes the three EGF repeats, while the C-terminal domains are less well conserved (Figure 20A). The EGF repeats are 78.7% and 82.2% identical between the human netrin and chicken netrin-1 and netrin-2, respectively, and 66.3% identical when compared to UNC-6. The C-terminal domains of the human netrin and chicken netrin-1 and -2 are 41.9% and 42.5% identical, respectively, with the same domain of UNC-6 being only 29.4% identical to human netrin. Overall, the human netrin more closely resembles the chicken netrins and UNC-6 than Drosophila NETA and NETB, since NETA contains an expansion in the C-domain while NETB contains additional sequences in the VI and V-1 domains (Harris et al., 1996, supra; Mitchell et al., 1996, supra).
The Structure of the Netrin Genes is Conserved Between Drosophila and Human
The positions of the introns in the human gene were compared to the encoded protein to determine if the overall gene structure of the netrin/UNC-6 family is conserved (Figure 20B). This analysis revealed striking similarities between the Drosophila netrin genes and the human netrin gene. In the human gene, exon 1 contains the signal peptide, domain VI and the first EGF domain (domain V-1), while exons two and three each contain an EGF repeat, domains V-2 and V-3, respectively. Exons 4, 5, and 6 contain portions of the C-domain. With the exception of an additional intron in the C-domain, this motif/exon arrangement is conserved in the Drosophila netrin genes.
The coding regions of the two Drosophila netrin genes have been shown to be highly conserved, with each being disrupted by six introns that occur in homologous sites (Harris et al., 1996, supra). The position of five of the six Drosophila introns was found to be conserved in the human gene (Figure 20B). The UNC-6 gene contains 12 introns in the coding region (Ishii et al., 1992, supra), the position of five of which correlate with the positions of the introns in the human gene. Interestingly, the sixth Drosophila intron, which does not have a counterpart in the human gene, is also the only intron from Drosophila that is not conserved in the UNC-6 gene.
hABC3: Database searches revealed homology between ABC3 and murine ABC1 and ABC2 (Luciani et al., supra. 1994). In addition to the murine ABC1 and ABC2 proteins, ABC3 also shows homology to the putative C. elegans protein encoded by the cosmid sequence of C48B4.4 (Wilson et al., supra.).
Overall, ABC3, ABC1, ABC2 and sequences encoded by C.
elegans cosmid C48B4.4 have highest homology in the regions surrounding the ATP binding cassettes (Figure 17).
However, when one compares the sequence between the first ATP binding cassette and the second transmembrane domain, referred to as the linker domain (Luciani et al., supra.
1994), ABC3 shares much lower homology to these same three proteins (amino acids 765-1044 in ABC3 in Figure 17). The linker domain of ABC3 is approximately 200 residues shorter than the linker domain present in ABC1 and ABC2. Consequently, an optimum protein alignment positions a gap in the ABC3 sequence immediately C-terminal of a conserved HH1 hydrophobic domain (Luciani et al., supra.
1994), located at position 917 through 959 in ABC3 (Figure 17). Additional comparisons indicate that the ABC3 linker domain is nearly identical in size to the linker domain encoded by C. elegans cosmid C48B4.4. As is the case with ABC1 and ABC2, the linker domain of ABC3 contains numerous polar residues and several potential phosphorylation sites.
Further analysis of the deduced ABC3 protein sequence revealed additional similarities to the ABC1/ABC2 subfamily. Based on PSORT analysis (Nakai et al., supra.), the ABC3 protein does not appear to contain an N-terminal signal sequence and is likely to be a Type III membrane protein (Singer, Annu. Rev. Cell Biol. 6:247-296, 1990), with sequences N-terminal of the first transmembrane domain being located in the cytoplasm (Figure 17). Similar topography has been described for ABC1 (Luciani et al., supra. 1994) and all other ABC transporters described to date (Higgins, supra. 1992). As mentioned above, murine ABC1 and ABC2 have been shown to contain a novel hydrophobic region, HH1, within the conserved linker domain. Although the HH1 domain is not well conserved at the amino acid level in ABC3, an HH1 domain does appear to be present within the linker region based on hydrophilicity analysis. A similar HH1 domain is also found in sequences encoded by cosmid C48B4.4 from C. elegans. In all these cases, the HH1 domain is predicted to have a β-sheet conformation.
RPL3L (SEM L3): The RPL3L (SEM L3) cDNA open reading frame predicts a 407 amino acid polypeptide of 46.3 kD (Figure 11). In vitro transcription-translation of RPL3L (SEM L3) cDNA resulted in a protein product with an apparent molecular weight of 46 kD, which is in close agreement with the predicted weight of 46.3 kD.
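The relationship between the ORF length and the predicted polypeptide can be checked with a few lines of arithmetic. The sketch below assumes that the 1224 nucleotide ORF reported for RPL3L includes the stop codon, and it uses a rough average residue mass of 110 Da, which is a textbook approximation and not the sequence-based calculation that yields 46.3 kD. All names are illustrative.

```python
# Rough average mass of an amino acid residue; an approximation only.
AVERAGE_RESIDUE_MASS_DA = 110.0


def residues_from_orf(orf_length_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_nt // 3
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues: int) -> float:
    """Crude molecular weight estimate from residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


n_residues = residues_from_orf(1224)
estimate_kda = approx_mass_kda(n_residues)
print(n_residues, round(estimate_kda, 1))
```

Under these assumptions the ORF encodes 407 residues, matching the text, and the crude estimate of about 44.8 kDa lies within a few percent of the reported 46.3 kD; the exact value depends on the actual amino acid composition.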
Two nuclear targeting sequences, which are 100% conserved between man, mouse and cow, diverged slightly in the RPL3L (SEM L3) amino acid sequence. The first targeting site is the 21 amino acid N-terminal oligopeptide. The serine and arginine present at positions 13 and 19, respectively, in human, bovine and murine L3 are replaced with histidines in RPL3L (SEM L3) (Figure 12).
The second potential nuclear targeting site is the bipartite motif. Here the human, bovine and murine proteins have a KKR-(aa)12-KRR at position 341-358 while the SEM L3 gene has KKR-(aa)10-HHSRQ at position 341-358.
The second half of this bipartite motif, while remaining basic, does not match those found in other nuclear targeting motifs (Simonic et al., supra. 1994). Overall, there is 77.2% amino acid identity between the RPL3L (SEM L3) and the consensus from the other mammalian L3 ribosomal genes, with 56% of the nucleotide differences between RPL3L (SEM L3) and the human L3 being silent.
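The two forms of the bipartite motif described above can be expressed as simple patterns. The sketch below assumes the L3 form is KKR, twelve residues of any kind, then KRR, and that the RPL3L form is KKR, ten residues, then HHSRQ; reading the spacer as ten residues is an inference from both motifs occupying positions 341-358 (18 residues). The test sequences are invented placeholders, not the actual protein sequences.

```python
import re

# Assumed patterns; "." stands for any single residue.
L3_BIPARTITE = re.compile(r"KKR.{12}KRR")
RPL3L_VARIANT = re.compile(r"KKR.{10}HHSRQ")


def find_motif(pattern: re.Pattern, protein: str):
    """Return (1-based start, 1-based end) of the first match, or None."""
    match = pattern.search(protein.upper())
    if match is None:
        return None
    return match.start() + 1, match.end()


# Hypothetical sequences with arbitrary filler residues.
l3_like = "MA" + "KKR" + "A" * 12 + "KRR" + "GS"
rpl3l_like = "MA" + "KKR" + "A" * 10 + "HHSRQ" + "GS"

l3_span = find_motif(L3_BIPARTITE, l3_like)
variant_span = find_motif(RPL3L_VARIANT, rpl3l_like)
print(l3_span, variant_span)
```

Both patterns cover 18 residues, which is consistent with the two motifs occupying the same positions in their respective proteins.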
hALR: hALR cDNA sequences encode a 119 amino acid protein which is 84.8% identical and 94.1% similar to the rat ALR protein (see Figures 13 and 14).
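Percent identity and percent similarity figures such as those quoted in this example are computed from a pairwise alignment. The sketch below shows one simple way to do so; the similarity groups are an assumption made for illustration (alignment programs normally use a scoring matrix), positions with a gap in either sequence are skipped, and the aligned strings are invented rather than taken from the ALR proteins.

```python
# Assumed groups of chemically similar residues, for illustration only.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (% identity, % similarity) over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments.
pct_identity, pct_similarity = identity_and_similarity("MKLV-AD", "MRLIGAE")
print(pct_identity, pct_similarity)
```

Because similarity counts conservative substitutions in addition to exact matches, it is always at least as large as identity, as with the 84.8% and 94.1% values reported above.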
Although the invention has been described with reference to the disclosed embodiments, it should be understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the claims which follow the Sequence Listing.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: GENZYME CORPORATION
(ii) TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, COMPOSITIONS, METHODS OF MAKING AND
USING SAME
(iii) NUMBER OF SEQUENCES: 83
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SCOTT & AYLEN
(B) STREET: 60 QUEEN STREET
(C) CITY: OTTAWA
(D) PROVINCE: ONTARIO
(E) COUNTRY: CANADA
(F) POSTAL CODE: K1P 5Y7
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: Windows
(D) SOFTWARE: FastSEQ for Windows Version 2.0b
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: 2,256,486
(B) FILING DATE: 16-JAN-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/665,259
(B) FILING DATE: 17-JUN-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/720,614
(B) FILING DATE: 01-OCT-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/762,500
(B) FILING DATE: 09-DEC-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: CHRISTINE J. COLLARD
(B) REGISTRATION NUMBER: 10030
(C) REFERENCE/DOCKET NUMBER: PAT 43578W-1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (613) 237-5160
(B) TELEFAX: (613) 787-3558
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Leu His Leu Glu Gly Pro Phe Ile Ser Arg Glu Lys Arg Gly Thr His Pro Glu Ala His Leu Arg Ser Phe Glu Ala Asp Ala Phe Gln Asp Leu Leu Ala Thr Tyr Gly Pro Leu Asp Asn Val Arg Ile Val Thr Leu Asp Pro Glu Leu Gly Arg Ser His Glu Val Phe Arg Thr Leu Thr Xaa Arg Ser Ile Cys Val Ser Leu Gly His Ser Val Ala Asp Leu Arg Ala Ala Glu Asp Ala Val Trp Ser Gly Ala Thr Phe Ile Thr His Leu Phe Asn Ala Met Leu Pro Phe His His Arg Asp Pro Gly Ile Val Gly Leu Leu Thr Ser Asp Arg Pro Ala Gly Arg Cys Ile Phe Tyr Gly Met Ile Ala Asp Gly Thr His Thr Asn Pro Ala Ala Leu Arg Ile Ala His Arg Ala His Pro Gln Gly Leu Val Leu Val Thr Asp Ala Ile Pro Ala Leu Gly Leu Gly Asn Gly Arg His Thr Leu Gly Gln Gln Glu Val Glu Val Asp Gly Leu Thr
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
His Leu Glu Gly Pro Phe Ile Ser Lys Arg Gly His Pro Glu Ser Tyr Gly Asn Ile Val Thr Pro Glu Leu Glu Val Ser Gly His Ser Ala Leu Glu Ala Val Ser Gly Ala Ile Thr His Leu Phe Asn Ala Met His His Arg Asp Pro Gly Gly Leu Leu Thr Ser Leu Tyr Gly Ile Asp Gly His Thr Ala Leu Arg Ile Ala Gly Leu Val Leu Val Thr Asp Ala Ile Ala Leu Gly Gly His Leu Gly Gln Val Gly Leu
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Leu His Leu Glu Gly Pro Lys Gly Thr His Arg Ala Ala Asp Leu Asp Val Thr Leu Pro Glu Glu Val Leu Ile Val Ser Gly His Ser Ala Leu Ala Gly Thr Phe Thr His Leu Asn Ala Met Pro Gly Leu Leu Ile Gly Ile Ala Asp Gly His Ala Arg Ala Arg Leu Leu Val Thr Asp Ala Gly
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 55 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Leu His Glu Pro Ser Glu Lys Gly His Arg Asp Leu Gly Asp Thr Glu Ile Val Ser Gly His Ser Ala Ala Ala Gly Ala Thr Phe Thr His Leu Asn Ala Met Pro Gly Gly Ile Asp Gly His Asn Arg Ile Leu Val Thr Asp Ile Ala Gly Leu Gly Thr
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Val
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Cys Asp Cys His Pro Val Gly Ala Ala Gly Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Thr Cys Asn Arg Cys Ala Lys Gly Gln Gln Ser Arg Ser Pro Ala Pro Cys
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Cys Cys His Pro Val Gly Gly Cys Asn Gln Gly Gln Cys Cys Lys Gly Val Thr Gly Thr Cys Asn Arg Cys Ala Lys Gly Gln Gln Ser Arg Ser Val Pro Cys
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
His Ser Pro Ser Leu Ser Ala Glu Thr Pro Ile Pro Gly Pro Thr Glu Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Ile Ser Pro Asp Cys Asp Ser Cys Lys Pro Ala Gly Tyr Ile Lys Lys Cys Lys Lys Asp Tyr
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Pro Pro Thr Ser Ser Pro Asp Cys Asp Ser Cys Lys Gly Ile Lys Lys Cys Lys Lys Asp Tyr
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 88 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Leu Val Gly Asp Ser Gly Val Gly Lys Thr Cys Leu Leu Val Arg Phe Lys Asp Gly Ala Phe Leu Ala Gly Thr Phe Ile Ser Thr Val Gly Ile Asp Phe Arg Asn Lys Val Leu Asp Val Asp Gly Val Lys Ala Lys Leu Gln Met Trp Asp Thr Ala Gly Gln Glu Arg Phe Arg Ser Val Thr His Ala Tyr Tyr Arg Asp Ala His Ala Leu Leu Leu Leu Tyr Asp Val Thr Asn Lys Ala Ser Phe Asp Asn
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Leu Val Gly Asp Ser Gly Val Gly Lys Thr Cys Leu Leu Val Arg Phe Lys Asp Gly Ala Phe Leu Ala Gly Thr Phe Ile Ser Thr Val Gly Ile Asp Phe Arg Asn Lys Val Leu Asp Val Asp Gly Lys Lys Leu Gln Trp Asp Thr Ala Gly Gln Glu Arg Phe Arg Ser Val Thr His Ala Tyr Tyr Arg Asp Ala His Ala Leu Leu Leu Leu Tyr Asp Thr Asn Lys Ser Phe Asp Asn
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Phe Gln Asn His Phe Glu Pro Gly Val Tyr Val Cys Ala Lys Cys Gly Tyr Glu Leu Phe Ser Ser Arg Ser Lys Tyr Ala His Ser Ser Pro Trp Pro Ala Phe Thr Glu Thr Ile His Ala Asp Ser Val Ala Lys Arg Pro Glu His Asn Arg Ser Glu Ala Leu Lys Val Ser Cys Gly Lys Cys Gly Asn Gly Leu Gly His Glu Phe Leu Asn Asp Gly Pro Lys Pro Gly Gln Ser Arg Phe (2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Phe Pro Gly Tyr Val Gly Leu Phe Ser Ser Lys Tyr Trp Pro Phe Thr Ile Ala Ser Val Val Leu Gly His Phe Asp Gly Pro (2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Glu Gly Val Tyr Cys Ala Cys Asp Leu Ser Ser Lys Trp Pro Ala Phe Glu Ala Cys Cys Leu Gly His Phe Gly Lys (2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Phe His Phe Glu Gly Tyr Val Cys Cys Gly Glu Leu Phe Ser Lys Trp Pro Ala Phe Glu Val Cys Cys Leu Gly His Phe Asn Asp Gly Pro Lys (2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Phe Gly Tyr Val Gly Phe Ser Ser Lys Trp Pro Phe Thr Ile Asp Val Gly Asn Leu Gly His Phe Asp Gly Pro Lys Gly Arg (2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6803 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1743 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...1740 (D) OTHER INFORMATION:
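The coding-sequence locations declared in this listing can be checked against the declared protein lengths. The following is a minimal sketch, not part of the original listing; the function names `cds_codon_count` and `count_residues` are hypothetical helpers introduced here for illustration. It assumes that LOCATION values are 1-based and inclusive, and that residues are written as three-letter codes separated by spaces.

```python
import re


def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in a 1-based, inclusive CDS range."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}...{end} is not a whole number of codons")
    return length // 3


def count_residues(three_letter_seq: str) -> int:
    """Count three-letter residue tokens such as 'Met' or 'Xaa'.

    Tokens damaged by character recognition (for example 'Va1') and
    non-residue tokens (for example 'ID' or '10') are not counted.
    """
    return len(re.findall(r"\b[A-Z][a-z]{2}\b", three_letter_seq))


# LOCATION 1...1740 (SEQ ID NO:20) and LOCATION 2...5053 (SEQ ID NO:24)
print(cds_codon_count(1, 1740))
print(cds_codon_count(2, 5053))
```

Under these assumptions the two locations give 580 and 1684 codons, which agree with the lengths declared for SEQ ID NO:21 (580 amino acids) and SEQ ID NO:25 (1684 amino acids). `count_residues` can be applied in the same way to compare any peptide in the listing with its declared LENGTH field.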
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Met Pro Gly Trp Pro Trp Gly Leu Leu Leu Thr Ala Gly Thr Leu Phe Ala Ala Leu Ser Pro Gly Pro Pro Ala Pro Ala Asp Pro Cys His Asp Glu Gly Gly Ala Pro Arg Gly Cys Val Pro Gly Leu Val Asn Ala Ala Leu Gly Arg Glu Val Leu Ala Ser Ser Thr Cys Gly Arg Pro Ala Thr Arg Ala Cys Asp Ala Ser Asp Pro Arg Arg Ala His Ser Pro Ala Leu Leu Thr Ser Pro Gly Gly Thr Ala Ser Pro Leu Cys Trp Arg Ser Glu Ser Leu Pro Arg Ala Pro Leu Asn Val Thr Leu Thr Val Pro Leu Gly Lys Ala Phe Glu Leu Val Phe Val Ser Leu Arg Phe Cys Ser Ala Pro Pro Ala Ser Val Ala Leu Leu Lys Ser Gln Asp His Gly Arg Ser Trp Ala Pro Leu Gly Phe Phe Ser Ser His Cys Asp Leu Asp Tyr Gly Arg Leu Pro Ala Pro Ala Asn Gly Pro Ala Gly Pro Gly Pro Glu Ala Leu Cys Phe Pro Ala Pro Leu Ala Gln Pro Asp Gly Ser Gly Leu Leu Ala Phe Ser Met Gln Asp Ser Ser Pro Pro Gly Leu Asp Leu Asp Ser Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Val Arg Val Val Leu Thr Arg Pro Ser Thr Ala Gly Asp Pro Arg Asp Met Glu Ala Val Val Pro Tyr Ser Tyr Ala Ala Thr Asp Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Leu Leu Asp Thr Gln Gly His Leu Ile Cys Asp Cys Arg His Gly Thr Glu Gly Pro Asp Cys Gly Arg Cys Lys Pro Phe Tyr Cys Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu Ser His Ala Cys Leu Ala Cys Ser Cys Asn Gly His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Arg Leu Ser Gly Arg Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Arg Glu Gly Phe Tyr Arg Asp Pro Gly Arg Ala Leu Ser Asp Arg Arg Ala Cys Arg Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Val Lys Thr Pro Ile Pro Gly Pro Thr Glu Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr Ala Val Gln Val Ala Val Gly Ala Arg Gly Glu Ala Arg Gly Ala Trp Thr Arg Phe Pro Val Ala Val Leu Ala Val Phe Arg Ser Gly Glu Glu Arg Ala Arg Arg 
Gly Ser Ser Ala Leu Trp Val Pro Ala Gly Asp Ala Ala Cys Gly Cys Pro Arg Leu Leu Pro Gly Arg Arg Tyr Leu Leu Leu Gly Gly Gly Pro Gly Ala Ala Ala Gly Gly Ala Gly Gly Arg Gly Pro Gly Leu Ile Ala Ala Arg Gly Ser Leu Val Leu Pro Trp Arg Asp Ala Trp Thr Arg Arg Leu Arg Arg Leu Gln Arg Arg Glu Arg Arg Gly Arg Cys Ser Ala Ala (2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Met Pro Gly Trp Pro Trp Gly Leu Leu Leu Thr Ala Gly Thr Leu Phe Ala Ala Leu Ser Pro Gly Pro Pro Ala Pro Ala Asp Pro Cys His Asp Glu Gly Gly Ala Pro Arg Gly Cys Val Pro Gly Leu Val Asn Ala Ala Leu Gly Arg Glu Val Leu Ala Ser Ser Thr Cys Gly Arg Pro Ala Thr Arg Ala Cys Asp Ala Ser Asp Pro Arg Arg Ala His Ser Pro Ala Leu Leu Thr Ser Pro Gly Gly Thr Ala Ser Pro Leu Cys Trp Arg Ser Glu Ser Leu Pro Arg Ala Pro Leu Asn Val Thr Leu Thr Val Pro Leu Gly Lys Ala Phe Glu Leu Val Phe Val Ser Leu Arg Phe Cys Ser Ala Pro Pro Ala Ser Val Ala Leu Leu Lys Ser Gln Asp His Gly Arg Ser Trp Ala Pro Leu Gly Phe Phe Ser Ser His Cys Asp Leu Asp Tyr Gly Arg Leu Pro Ala Pro Ala Asn Gly Pro Ala Gly Pro Gly Pro Glu Ala Leu Cys Phe Pro Ala Pro Leu Ala Gln Pro Asp Gly Ser Gly Leu Leu Ala Phe Ser Met Gln Asp Ser Ser Pro Pro Gly Leu Asp Leu Asp Ser Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Val Arg Val Val Leu Thr Arg Pro Ser Thr Ala Gly Asp Pro Arg Asp Met Glu Ala Val Val Pro Tyr Ser Tyr Ala Ala Thr Asp Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Leu Leu Asp Thr Gln Gly His Leu Ile Cys Asp Cys Arg His Gly Thr Glu Gly Pro Asp Cys Gly Arg Cys Lys Pro Phe Tyr Cys Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu Ser His Ala Cys Leu Ala Cys Ser Cys Asn Gly His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Arg Leu Ser Gly Arg Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Arg Glu Gly Phe Tyr Arg Asp Pro Gly Arg Ala Leu Ser Asp Arg Arg Ala Cys Arg Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Val Lys Thr Pro Ile Pro Gly Pro Thr Glu Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr Ala Val Gln Val Ala Val Gly Ala Arg Gly Glu Ala Arg Gly Ala Trp Thr Arg Phe Pro Val Ala Val Leu Ala Val Phe Arg Ser Gly Glu Glu Arg Ala Arg Arg 
Gly Ser Ser Ala Leu Trp Val Pro Ala Gly Asp Ala Ala Cys Gly Cys Pro Arg Leu Leu Pro Gly Arg Arg Tyr Leu Leu Leu Gly Gly Gly Pro Gly Ala Ala Ala Gly Gly Ala Gly Gly Arg Gly Pro Gly Leu Ile Ala Ala Arg Gly Ser Leu Val Leu Pro Trp Arg Asp Ala Trp Thr Arg Arg Leu Arg Arg Leu Gln Arg Arg Glu Arg Arg Gly Arg Cys Ser Ala Ala (2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 606 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Met Pro Arg Arg Gly Ala Glu Gly Pro Leu Ala Leu Leu Leu Ala Ala Ala Trp Leu Ala Gln Pro Leu Arg Gly Gly Tyr Pro Gly Leu Asn Met Phe Ala Val Gln Thr Ala Gln Pro Asp Pro Cys Tyr Asp Glu His Gly Leu Pro Arg Arg Cys Ile Pro Asp Phe Val Asn Ser Ala Phe Gly Lys Glu Val Lys Val Ser Ser Thr Cys Gly Lys Pro Pro Ser Arg Tyr Cys Val Val Thr Glu Lys Gly Glu Glu Gln Val Arg Ser Cys His Leu Cys Asn Ala Ser Asp Pro Lys Arg Ala His Pro Pro Ser Phe Leu Thr Asp Leu Asn Asn Pro His Asn Leu Thr Cys Trp Gln Ser Asp Ser Tyr Val Gln Tyr Pro His Asn Val Thr Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Val Thr Tyr Val Ser Leu Gln Phe Cys Ser Pro Arg Pro Glu Ser Met Ala Ile Tyr Lys Ser Met Asp Tyr Gly Lys Thr Trp Val Pro Phe Gln Phe Tyr Ser Thr Gln Cys Arg Lys Met Tyr Asn Lys Pro Ser Arg Ala Ala Ile Thr Lys Gln Asn Glu Gln Glu Ala Ile Cys Thr Asp Ser His Thr Asp Val Arg Pro Leu Ser Gly Gly Leu Ile Ala Phe Ser Thr Leu Asp Gly Arg Pro Thr Ala His Asp Phe Asp Asn Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Lys Val Thr Phe Ser Arg Leu His Thr Phe Gly Asp Glu Asn Glu Asp Asp Ser Glu Leu Ala Arg Asp Ser Tyr Phe Tyr Ala Val Ser Asp Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Val Arg Asp Arg Asp Asp Asn Leu Val Cys Asp Cys Lys His Asn Thr Ala Gly Pro Glu Cys Asp Arg Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu Ala Asn Glu Cys Val Ala Cys Asn Cys Asn Leu His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Lys Leu Ser Gly Arg Lys Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Lys Glu Gly Phe Tyr Arg Asp Leu Ser Lys Pro Ile Ser His Arg Lys Ala Cys Lys Glu Cys Asp Cys His Pro Val Gly Ala Ala Gly Gln Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Ile Thr Cys Asn Arg Cys Ala Lys Gly Tyr Gln Gln Ser Arg Ser Pro Ile Ala Pro Cys Ile Lys Ile Pro Ala Ala Pro Pro Pro Thr Ala Ala Ser Ser Thr Glu Glu Pro Ala Asp Cys Asp Ser Tyr Cys Lys Ala Ser Lys Gly Lys Leu Lys Ile Asn Met Lys Lys Tyr Cys Lys Lys Asp Tyr Ala Val Gln 
Ile His Ile Leu Lys Ala Glu Lys Asn Ala Asp Trp Trp Lys Phe Thr Val Asn Ile Ile Ser Val Tyr Lys Gln Gly Ser Asn Arg Leu Arg Arg Gly Asp Gln Thr Leu Trp Val His Ala Lys Asp Ile Ala Cys Lys Cys Pro Lys Val Lys Pro Met Lys Lys Tyr Leu Leu Leu Gly Ser Thr Glu Asp Ser Pro Asp Gln Ser Gly Ile Ile Ala Asp Lys Ser Ser Leu Val Ile Gln Trp Arg Asp Thr Trp Ala Arg Arg Leu Arg Lys Phe Gln Gln Arg Glu Lys Lys Gly Lys Cys Arg Lys Ala (2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
Leu Arg Leu Leu Leu Thr Thr Ser Val Leu Arg Leu Ala Arg Ala Ala Asn Pro Glu Val Ala Gln Gln Thr Pro Pro Asp Pro Cys Tyr Asp Glu Ser Gly Ala Pro Arg Arg Cys Ile Pro Glu Phe Val Asn Ala Ala Phe Gly Lys Glu Val Gln Ala Ser Ser Thr Cys Gly Lys Pro Pro Thr Arg His Cys Asp Ala Ser Asp Pro Arg Arg Ala His Pro Pro Ala Tyr Leu Thr Asp Leu Asn Thr Ala Ala Asn Met Thr Cys Trp Arg Ser Glu Thr Leu His His Leu Pro His Asn Val Thr Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Val Val Tyr Val Ser Leu Gln Phe Cys Ser Pro Arg Pro Glu Ser Thr Ala Ile Phe Lys Ser Met Asp Tyr Gly Lys Thr Trp Val Pro Tyr Gln Tyr Tyr Ser Ser Gln Cys Arg Lys Ile Tyr Gly Lys Pro Ser Lys Ala Thr Val Thr Lys Gln Asn Glu Gln Glu Ala Leu Cys Thr Asp Gly Leu Thr Asp Leu Tyr Pro Leu Thr Gly Gly Leu Ile Ala Phe Ser Thr Leu Asp Gly Arg Pro Ser Ala Gln Asp Phe Asp Ser Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Arg Val Val Phe Ser Arg Pro His Leu Phe Arg Glu Leu Gly Gly Arg Glu Ala Gly Glu Glu Asp Gly Gly Ala Gly Ala Thr Pro Tyr Tyr Tyr Ser Val Gly Glu Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Val Lys Asp Lys Glu Gln Lys Leu Val Cys Asp Cys Lys His Asn Thr Glu Gly Pro Glu Cys Asp Arg Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gln Arg Ala Ser Ala Arg Glu Ala Asn Glu Cys Leu Ala Cys Asn Cys Asn Leu His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Lys Leu Ser Gly Arg Lys Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Lys Glu Gly Phe Tyr Arg Asp Leu Ser Lys Ser Ile Thr Asp Arg Lys Ala Cys Lys Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Lys Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Ile Lys Ile Pro Ala Ile Asn Pro Thr Ser Leu Val Thr Ser Thr Glu Ala Pro Ala Asp Cys Asp Ser Tyr Cys Lys Pro Ala Lys Gly Asn Tyr Lys Ile Asn Met Lys Lys Tyr Cys Lys Lys Asp Tyr Val Val Gln Val Asn Ile Leu Glu Met Glu Thr Val Ala Asn Trp Ala Lys Phe Thr Ile Asn Ile Leu Ser Val Tyr Lys Cys 
Arg Asp Glu Arg Val Lys Arg Gly Asp Asn Phe Leu Trp Ile His Leu Lys Asp Leu Ser Cys Lys Cys Pro Lys Ile Gln Ile Ser Lys Lys Tyr Leu Val Met Gly Ile Ser Glu Asn Ser Thr Asp Arg Pro Gly Leu Met Ala Asp Lys Asn Ser Leu Val Ile Gln Trp Arg Asp Ala Trp Thr Arg Arg Leu Arg Lys Leu Gln Arg Arg Glu Lys Lys Gly Lys Cys Val Lys Pro (2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5894 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 2...5053 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser
Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala 
Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gly Gln Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro 
Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg (2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1684 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu 
Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gly Gln Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn 
Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys 
Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg (2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1375 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Cys Met Glu Glu Glu Pro Thr His Leu Arg Leu Gly Val Ser Ile Gln Asn Leu Val Lys Val Tyr Arg Asp Gly Met Lys Val Ala Val Asp Gly Leu Ala Leu Asn Phe Tyr Glu Gly Gln Ile Thr Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Met Ser Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Thr Ala Tyr Ile Leu Gly Lys Asp Ile Arg Ser Glu Met Ser Ser Ile Arg Gln Asn Leu Gly Val Cys Pro Gln His Asn Val Leu Phe Asp Met Leu Thr Val Glu Glu His Ile Trp Phe Tyr Ala Arg Leu Lys Gly Leu Ser Glu Lys His Val Lys Ala Glu Met Glu Gln Met Ala Leu Asp Val Gly Leu Pro Pro Ser Lys Leu Lys Ser Lys Thr Ser Gln Leu Ser Gly Gly Met Gln Arg Lys Leu Ser Val Ala Leu Ala Phe Val Gly Gly Ser Lys Val Val Ile Leu Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr Ser Arg Arg Gly Ile Trp Glu Leu Leu Leu Lys Tyr Arg Gln Gly Arg Thr Ile Ile Leu Ser Thr His His Met Asp Glu Ala Asp Ile Leu Gly Asp Arg Ile Ala Ile Ile Ser His Gly Lys Leu Cys Cys Val Gly Ser Ser Leu Phe Leu Lys Asn Gln Leu Gly Thr Gly Tyr Tyr Leu Thr Leu Val Lys Lys Asp Val Glu Ser Ser Leu Ser Ser Cys Arg Asn Ser Ser Ser Thr Val Ser Cys Leu Lys Lys Glu Asp Ser Val Ser Gln Ser Ser Ser Asp Ala Gly Leu Gly Ser Asp His Glu Ser Asp Thr Leu Thr Ile Asp Val Ser Ala Ile Ser Asn Leu Ile Arg Lys His Val Ser Glu Ala Arg Leu Val Glu Asp Ile Gly His Glu Leu Thr Tyr Val Leu Pro Tyr Glu Ala Ala Lys Glu Gly Ala Phe Val Glu Leu Phe His Glu Ile Asp Asp Arg Leu Ser Asp Leu Gly Ile Ser Ser Tyr Gly Ile Ser Glu Thr Thr Leu Glu Glu Ile Phe Leu Lys Val Ala Glu Glu Ser Gly Val Asp Ala Glu Thr Ser Asp Gly Thr Leu Pro Ala Arg Arg Asn Arg Arg Ala Phe Gly Asp Lys Gln Ser Cys Leu His Pro Phe Thr Glu Asp Asp Ala Val Asp Pro Asn Asp Ser Asp Ile Asp Pro Glu Ser Arg Glu Thr Asp Leu Leu Ser Gly Met Asp Gly Lys Gly Ser Tyr Gln Leu Lys Gly Trp Lys Leu Thr Gln Gln Gln Phe Val Ala Leu Leu Trp Lys Arg Leu Leu Ile Ala Arg Arg Ser Arg Lys Gly Phe Phe Ala Gln Ile Val Leu Pro Ala Val Phe Val Cys Ile Ala Leu Val Phe Ser Leu Ile Val Pro Pro Phe Gly Lys Tyr Pro Ser Leu Glu Leu Gln Pro Trp Met Tyr Asn Glu 
Gln Tyr Thr Phe Val Ser Asn Asp Ala Pro Glu Asp Met Gly Thr Gln Glu Leu Leu Asn Ala Leu Thr Lys Asp Pro Gly Phe Gly Thr Arg Cys Met Glu Gly Asn Pro Ile Pro Asp Thr Pro Cys Leu Ala Gly Glu Glu Asp Trp Thr Ile Ser Pro Val Pro Gln Ser Ile Val Asp Leu Phe Gln Asn Gly Asn Trp Thr Met Lys Asn Pro Ser Pro Ala Cys Gln Cys Ser Ser Asp Lys Ile Lys Lys Met Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro Pro Pro Gln Arg Lys Gln Lys Thr Ala Asp Ile Leu Gln Asn Leu Thr Gly Arg Asn Ile Ser Asp Tyr Leu Val Lys Thr Tyr Val Gln Ile Ile Ala Lys Ser Leu Lys Asn Lys Ile Trp Val Asn Glu Phe Arg Tyr Gly Gly Phe Ser Leu Gly Val Ser Asn Ser Gln Ala Leu Pro Pro Ser His Glu Val Asn Asp Ala Ile Lys Gln Met Lys Lys Leu Leu Lys Leu Thr Lys Asp Thr Ser Ala Asp Arg Phe Leu Ser Ser Leu Gly Arg Phe Met Ala Gly Leu Asp Thr Lys Asn Asn Val Lys Val Trp Phe Asn Asn Lys Gly Trp His Ala Ile Ser Ser Phe Leu Asn Val Ile Asn Asn Ala Ile Leu Arg Ala Asn Leu Gln Lys Gly Glu Asn Pro Ser Gln Tyr Gly Ile Thr Ala Phe Asn His Pro Leu Asn Leu Thr Lys Gln Gln Leu Ser Glu Val Ala Leu Met Thr Thr Ser Val Asp Val Leu Val Ser Ile Cys Val Ile Phe Ala Met Ser Phe Val Pro Ala Ser Phe Val Val Phe Leu Ile Gln Glu Arg Val Ser Lys Ala Lys His Leu Gln Phe Ile Ser Gly Val Lys Pro Val Ile Tyr Trp Leu Ser Asn Phe Val Trp Asp Met Cys Asn Tyr Val Val Pro Ala Thr Leu Val Ile Ile Ile Phe Ile Cys Phe Gln Gln Lys Ser Tyr Val Ser Ser Thr Asn Leu Pro Val Leu Ala Leu Leu Leu Leu Leu Tyr Gly Trp Ser Ile Thr Pro Leu Met Tyr Pro Ala Ser Phe Val Phe Lys Ile Pro Ser Thr Ala Tyr Val Val Leu Thr Ser Val Asn Leu Phe Ile Gly Ile Asn Gly Ser Val Ala Thr Phe Val Leu Glu Leu Phe Thr Asn Asn Lys Leu Asn Asp Ile Asn Asp Ile Leu Lys Ser Val Phe Leu Ile Phe Pro His Phe Cys Leu Gly Arg Gly Leu Ile Asp Met Val Lys Asn Gln Ala Met Ala Asp Ala Leu Glu Arg Phe Gly Glu Asn Arg Phe Val Ser Pro Leu Ser Trp Asp Leu Val Gly Arg Asn Leu Phe Ala Met Ala Val Glu Gly Val Val Phe Phe Leu Ile Thr Val Leu Ile Gln Tyr Arg Phe Phe Ile Arg Pro Arg Pro Val Lys Ala Lys Leu Pro 
Pro Leu Asn Asp Glu Asp Glu Asp Val Arg Arg Glu Arg Gln Arg Ile Leu Asp Gly Gly Gly Gln Asn Asp Ile Leu Glu Ile Lys Glu Leu Thr Lys Ile Tyr Arg Arg Lys Arg Lys Pro Ala Val Asp Arg Ile Cys Ile Gly Ile Pro Pro Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Ser Thr Thr Phe Lys Met Leu Thr Gly Asp Thr Pro Val Thr Arg Gly Asp Ala Phe Leu Asn Lys Asn Ser Ile Leu Ser Asn Ile His Glu Val His Gln Asn Met Gly Tyr Cys Pro Gln Phe Asp Ala Ile Thr Glu Leu Leu Thr Gly Arg Glu His Val Glu Phe Phe Ala Leu Leu Arg Gly Val Pro Glu Lys Glu Val Gly Lys Phe Gly Glu Trp Ala Ile Arg Lys Leu Gly Leu Val Lys Tyr Gly Glu Lys Tyr Ala Ser Asn Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Ala Met Ala Leu Ile Gly Gly Pro Pro Val Val Phe Leu Asp Glu Pro Thr Thr Gly Met Asp Pro Lys Ala Arg Arg Phe Leu Trp Asn Cys Ala Leu Ser Ile Val Lys Glu Gly Arg Ser Val Val Leu Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Met Ala Ile Met Val Asn Gly Arg Phe Arg Cys Leu Gly Ser Val Gln His Leu Lys Asn Arg Phe Gly Asp Gly Tyr Thr Ile Val Val Arg Ile Ala Gly Ser Asn Pro Asp Leu Lys Pro Val Gln Glu Phe Phe Gly Leu Ala Phe Pro Gly Ser Val Leu Lys Glu Lys His Arg Asn Met Leu Gln Tyr Gln Leu Pro Ser Ser Leu Ser Ser Leu Ala Arg Ile Phe Ser Ile Leu Ser Gln Ser Lys Lys Arg Leu His Ile Glu Asp Tyr Ser Val Ser Gln Thr Thr Leu Asp Gln Val Phe Val Asn Phe Ala Lys Asp Gln Ser Asp Asp Asp His Leu Lys Asp Leu Ser Leu His Lys Asn Gln Thr Val Val Asp Val Ala Val Leu Thr Ser Phe Leu Gln Asp Glu Lys Val Lys Glu Ser Tyr Val (2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1457 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
Met Glu Glu Glu Pro Thr His Leu Pro Leu Val Val Cys Val Asp Lys Leu Thr Lys Val Tyr Lys Asn Asp Lys Lys Leu Ala Leu Asn Lys Leu Ser Leu Asn Leu Tyr Glu Asn Gln Val Val Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Met Ser Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Ser Ala Thr Ile Tyr Gly His Asp Ile Arg Thr Glu Met Asp Glu Ile Arg Lys Asn Leu Gly Met Cys Pro Gln His Asn Val Leu Phe Asp Arg Leu Thr Val Glu Glu His Leu Trp Phe Tyr Ser Arg Leu Lys Ser Met Ala Gln Glu Glu Ile Arg Lys Glu Thr Asp Lys Met Ile Glu Asp Leu Glu Leu Ser Asn Lys Arg His Ser Leu Val Gln Thr Leu Ser Gly Gly Met Lys Arg Lys Leu Ser Val Ala Ile Ala Phe Val Gly Gly Ser Arg Ala Ile Ile Leu Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr Ala Arg Arg Ala Ile Trp Asp Leu Ile Leu Lys Tyr Lys Pro Gly Arg Thr Ile Leu Leu Ser Thr His His Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Ile Ser His Gly Lys Leu Lys Cys Cys Gly Ser Pro Leu Phe Leu Lys Gly Ala Tyr Xaa Asp Gly Tyr Arg Leu Thr Leu Val Lys Gln Pro Ala Glu Pro Gly Thr Ser Gln Glu Pro Gly Leu Ala Ser Ser Pro Ser Gly Cys Pro Arg Leu Ser Ser Cys Ser Glu Pro Gln Val Ser Gln Phe Ile Arg Lys His Val Ala Ser Ser Leu Leu Val Ser Asp Thr Ser Thr Glu Leu Ser Tyr Ile Leu Pro Ser Glu Ala Val Lys Lys Gly Ala Phe Glu Arg Leu Phe Gln Gln Leu Glu His Ser Leu Asp Ala Leu His Leu Ser Ser Phe Gly Leu Met Asp Thr Thr Leu Glu Glu Val Phe Leu Lys Val Ser Glu Glu Asp Gln Ser Leu Glu Asn Ser Glu Ala Asp Val Lys Glu Ser Arg Lys Asp Val Leu Pro Gly Ala Glu Gly Leu Thr Ala Val Gly Gly Gln Ala Gly Asn Leu Ala Arg Cys Ser Glu Leu Ala Gln Ser Gln Ala Ser Leu Gln Ser Ala Ser Ser Val Gly Ser Ala Arg Gly Glu Glu Gly Thr Gly Tyr Ser Asp Gly Tyr Gly Asp Tyr Arg Pro Leu Phe Asp Asn Leu Gln Asp Pro Asp Asn Val Ser Leu Gln Glu Ala Glu Met Glu Ala Leu Ala Gln Val Gly Gln Gly Ser Arg Lys Leu Glu Gly Trp Trp Leu Lys Met Arg Gln Phe His Gly Leu Leu Val Lys Arg Phe His Cys Ala Arg Arg Asn Ser Lys Ala Leu Cys Ser Gln Ile Leu Leu Pro Ala Phe Phe Val Cys Val Ala Met Thr Val Ala Leu Ser Val 
Pro Glu Ile Gly Asp Leu Pro Pro Leu Val Leu Ser Pro Ser Gln Tyr His Asn Tyr Thr Gln Pro Arg Gly Asn Phe Ile Pro Tyr Ala Asn Glu Glu Arg Gln Glu Tyr Arg Leu Arg Leu Ser Pro Asp Ala Ser Pro Gln Gln Leu Val Ser Thr Phe Arg Leu Pro Ser Gly Val Gly Ala Thr Cys Val Leu Lys Ser Pro Ala Asn Gly Ser Leu Gly Pro Met Leu Asn Leu Ser Ser Gly Glu Ser Arg Leu Leu Ala Ala Arg Phe Phe Asp Ser Met Cys Leu Glu Ser Phe Thr Gln Gly Leu Pro Leu Ser Asn Phe Val Pro Pro Pro Pro Ser Pro Ala Pro Ser Asp Ser Pro Val Xaa Pro Asp Glu Asp Ser Leu Gln Ala Trp Asn Met Ser Leu Pro Pro Thr Ala Gly Pro Glu Thr Trp Thr Ser Ala Pro Ser Leu Pro Arg Leu Val His Glu Pro Val Arg Cys Thr Cys Ser Ala Gln Gly Thr Gly Phe Ser Cys Pro Ser Ser Val Gly Gly His Pro Pro Gln Met Arg Val Val Thr Gly Asp Ile Leu Thr Asp Ile Thr Gly His Asn Val Ser Glu Tyr Leu Leu Phe Thr Ser Asp Arg Phe Arg Leu His Arg Tyr Gly Ala Ile Thr Phe Gly Asn Val Gln Lys Ser Ile Pro Ala Ser Phe Gly Ala Arg Val Pro Pro Met Val Arg Lys Ile Ala Val Arg Arg Val Ala Gln Val Leu Tyr Asn Asn Lys Gly Tyr His Ser Met Pro Thr Tyr Leu Asn Ser Leu Asn Asn Ala Ile Leu Arg Ala Asn Leu Pro Lys Ser Lys Gly Asn Pro Ala Ala Tyr Xaa Ile Thr Val Thr Asn His Pro Met Asn Lys Thr Ser Ala Ser Leu Ser Leu Asp Tyr Leu Leu Gln Gly Thr Asp Val Val Ile Ala Ile Phe Ile Ile Val Ala Met Ser Phe Val Pro Ala Ser Phe Val Val Phe Leu Val Ala Glu Lys Ser Thr Lys Ala Lys His Leu Gln Phe Val Ser Gly Cys Asn Pro Val Ile Tyr Trp Leu Ala Asn Tyr Val Trp Asp Met Leu Asn Tyr Leu Val Pro Ala Thr Cys Cys Val Ile Ile Leu Phe Val Phe Asp Leu Pro Ala Tyr Thr Ser Pro Thr Asn Phe Pro Ala Val Leu Ser Leu Phe Leu Leu Tyr Gly Trp Ser Ile Thr Pro Ile Met Tyr Pro Ala Ser Phe Trp Phe Glu Val Pro Ser Ser Ala Tyr Val Phe Leu Ile Val Ile Asn Leu Phe Ile Gly Ile Thr Ala Thr Val Ala Thr Phe Leu Leu Gln Leu Phe Glu His Asp Lys Asp Leu Lys Val Val Asn Ser Tyr Leu Lys Ser Cys Phe Leu Ile Phe Pro Asn Tyr Asn Leu Gly His Gly Leu Met Glu Met Ala Tyr Asn Glu Tyr Ile Asn Glu Tyr Tyr Ala Lys Ile Gly Gln Phe Asp 
Lys Met Lys Ser Pro Phe Glu Trp Asp Ile Val Thr Arg Gly Leu Val Ala Met Thr Val Glu Gly Phe Val Gly Phe Phe Leu Thr Ile Met Cys Gln Tyr Asn Phe Leu Arg Gln Pro Gln Arg Leu Pro Val Ser Thr Lys Pro Val Glu Asp Asp Val Asp Val Ala Ser Glu Arg Gln Arg Val Leu Arg Gly Asp Ala Asp Asn Asp Met Val Lys Ile Glu Asn Leu Thr Lys Val Tyr Lys Ser Arg Lys Ile Gly Arg Ile Leu Ala Val Asp Arg Leu Cys Leu Gly Val Cys Val Pro Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Thr Ser Thr Phe Lys Met Leu Thr Gly Asp Glu Ser Thr Thr Gly Gly Glu Ala Phe Val Asn Gly His Ser Val Leu Lys Asp Leu Leu Gln Val Gln Gln Ser Leu Gly Tyr Cys Pro Gln Phe Asp Val Pro Val Asp Glu Leu Thr Ala Arg Glu His Leu Gln Leu Tyr Thr Arg Leu Arg Cys Ile Pro Trp Lys Asp Glu Ala Gln Val Val Lys Trp Ala Leu Glu Lys Leu Glu Leu Thr Lys Tyr Ala Asp Lys Pro Ala Gly Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Ala Ile Ala Leu Ile Gly Tyr Pro Ala Phe Ile Phe Leu Asp Glu Pro Thr Thr Gly Met Asp Pro Lys Ala Arg Arg Phe Leu Trp Asn Leu Ile Leu Asp Leu Ile Lys Thr Gly Arg Ser Val Val Leu Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Asn Gly Arg Leu His Cys Leu Gly Ser Ile Gln His Leu Lys Asn Arg Phe Gly Asp Gly Tyr Met Ile Thr Val Arg Thr Lys Ser Ser Gln Asn Val Lys Asp Val Val Arg Phe Phe Asn Arg Asn Phe Pro Glu Ala His Ala Gln Gly Lys Thr Pro Tyr Lys Val Gln Tyr Gln Leu Lys Ser Glu His Ile Ser Leu Ala Gln Val Phe Ser Lys Met Glu Gln Val Val Gly Val Leu Gly Ile Glu Asp Tyr Ser Val Ser Gln Thr Thr Leu Asp Asn Val Phe Val Asn Phe Ala Lys Lys Gln Ser Asp Asn Val Glu Gln Gln Glu Ala Glu Pro Ser Ser Leu Pro Ser Pro Leu Gly Leu Leu Ser Leu Leu Arg Pro Arg Pro Ala Pro Thr Glu Leu Arg Ala Leu Val Ala Asp Glu Pro Glu Asp Leu Asp Thr Glu Asp Glu Gly Leu Ile Ser Phe Glu Glu Glu Arg Ala Gln Leu Ser Phe Asn Thr Asp Thr Leu Cys (2) INFORMATION FOR SEQ ID N0:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1548 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 49...1269 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly His Leu Gly Phe Leu Pro His Lys Arg Ser His Arg His Arg Gly Lys Val Lys Thr Trp Pro Arg Asp Asp Pro Ser Gln Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Thr Leu Arg Glu Val His Arg Pro Gly Leu Lys Ile Ser Lys Arg Glu Glu Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Leu Val Val Val Gly Val Val Gly Tyr Val Ala Thr Pro Arg Gly Leu Arg Ser Phe Lys Thr Ile Phe Ala Glu His Leu Ser Asp Glu Cys Arg Arg Arg Phe Tyr Lys Asp Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Ala Cys Lys Arg Trp Arg Asp Thr Asp Gly Lys Lys Gln Leu Gln Lys Asp Phe Ala Ala Met Lys Lys Tyr Cys Lys Val Ile Arg Val Ile Val His Thr Gln Met Lys Leu Leu Pro Phe Arg Gln Lys Lys Ala His Ile Met Glu Ile Gln Leu Asn Gly Gly Thr Val Ala Glu Lys Val Ala Trp Ala Gln Ala Arg Leu Glu Lys Gln Val Pro Val His Ser Val Phe Ser Gln Ser Glu Val Ile Asp Val Ile Ala Val Thr Lys Gly Arg Gly Val Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Lys Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Gly Cys Ser Ile Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Leu Asn Lys Lys Ile Phe Arg Ile Gly Arg Gly Pro His Met Glu Asp Gly Lys Leu Val Lys Asn Asn Ala Ser Thr Ser Tyr Asp Val Thr Ala Lys Ser Ile Thr Pro Leu Gly Gly Phe Pro His Tyr Gly Glu Val Asn Asn Asp Phe Val Met Leu Lys Gly Cys Ile Ala Gly Thr Lys Lys Arg Val Ile Thr Leu Arg Lys Ser Leu Leu Val His His Ser Arg Gln Ala Val Glu Asn Ile Glu Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Ala Gln Glu Lys Arg Ala Phe Met Gly Pro Gln Lys Lys His Leu Glu Lys Glu Thr Pro Glu Thr Ser Gly Asp Leu (2) INFORMATION FOR SEQ ID N0:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 407 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly His Leu Gly Phe Leu Pro His Lys Arg Ser His Arg His Arg Gly Lys Val Lys Thr Trp Pro Arg Asp Asp Pro Ser Gln Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Thr Leu Arg Glu Val His Arg Pro Gly Leu Lys Ile Ser Lys Arg Glu Glu Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Leu Val Val Val Gly Val Val Gly Tyr Val Ala Thr Pro Arg Gly Leu Arg Ser Phe Lys Thr Ile Phe Ala Glu His Leu Ser Asp Glu Cys Arg Arg Arg Phe Tyr Lys Asp Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Ala Cys Lys Arg Trp Arg Asp Thr Asp Gly Lys Lys Gln Leu Gln Lys Asp Phe Ala Ala Met Lys Lys Tyr Cys Lys Val Ile Arg Val Ile Val His Thr Gln Met Lys Leu Leu Pro Phe Arg Gln Lys Lys Ala His Ile Met Glu Ile Gln Leu Asn Gly Gly Thr Val Ala Glu Lys Val Ala Trp Ala Gln Ala Arg Leu Glu Lys Gln Val Pro Val His Ser Val Phe Ser Gln Ser Glu Val Ile Asp Val Ile Ala Val Thr Lys Gly Arg Gly Val Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Lys Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Gly Cys Ser Ile Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Leu Asn Lys Lys Ile Phe Arg Ile Gly Arg Gly Pro His Met Glu Asp Gly Lys Leu Val Lys Asn Asn Ala Ser Thr Ser Tyr Asp Val Thr Ala Lys Ser Ile Thr Pro Leu Gly Gly Phe Pro His Tyr Gly Glu Val Asn Asn Asp Phe Val Met Leu Lys Gly Cys Ile Ala Gly Thr Lys Lys Arg Val Ile Thr Leu Arg Lys Ser Leu Leu Val His His Ser Arg Gln Ala Val Glu Asn Ile Glu Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Ala Gln Glu Lys Arg Ala Phe Met Gly Pro Gln Lys Lys His Leu Glu Lys Glu Thr Pro Glu Thr Ser Gly Asp Leu (2) INFORMATION FOR SEQ ID N0:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe Pro Lys Asp Asp Pro Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Met Val Val Val Gly Ile Val Gly Tyr Val Glu Thr Pro Arg Gly Leu Arg Thr Phe Lys Thr Val Phe Ala Glu His Ile Ser Asp Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Glu Asp Gly Lys Lys Gln Leu Glu Lys Asp Phe Ser Ser Met Lys Lys Tyr Cys Gln Val Ile Arg Val Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys Lys Ala His Leu Met Glu Ile Gln Val Asn Gly Gly Thr Val Ala Glu Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Asn Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Ala Phe Ser Val Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe Val His Tyr Gly Glu Val Thr Asn Asp Phe Val Met Leu Lys Gly Cys Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Met Glu Glu Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu Glu Gly Ala (2) INFORMATION FOR SEQ ID N0:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe Pro Lys Asp Asp Ser Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Met Val Ile Val Gly Ile Val Gly Tyr Val Glu Thr Pro Arg Gly Leu Arg Thr Phe Lys Thr Ile Phe Ala Glu His Ile Ser Asp Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Ala Asp Gly Lys Lys Gln Leu Glu Arg Asp Phe Ser Ser Met Lys Lys Tyr Cys Gln Val Ile Arg Val Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys Lys Ala His Leu Met Glu Val Gln Val Asn Gly Gly Thr Val Ala Glu Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Asn Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Ala Phe Ser Val Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe Val His Tyr Gly Glu Val Thr Asn Asp Phe Val Met Leu Lys Gly Cys Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Val Glu Glu Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu Glu Gly Ala (2) INFORMATION FOR SEQ ID N0:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe Pro Lys Asp Asp Ala Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Met Val Val Val Gly Ile Val Gly Tyr Val Glu Thr Pro Arg Gly Leu Arg Thr Phe Lys Thr Val Phe Ala Glu His Ile Ser Asp Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Asp Thr Gly Lys Lys Gln Leu Glu Lys Asp Phe Asn Ser Met Lys Lys Tyr Cys Gln Val Ile Arg Ile Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys Lys Ala His Leu Met Glu Ile Gln Val Asn Gly Gly Thr Val Ala Glu Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Ser Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Ala Phe Thr Val Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe Val His Tyr Gly Glu Val Thr Asn Asp Phe Ile Met Leu Lys Gly Cys Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Met Glu Glu Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu Glu Gly Ala (2) INFORMATION FOR SEQ ID N0:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...357 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
Arg Asp Thr Lys Phe Arg Glu Asp Cys Pro Pro Asp Arg Glu Glu Leu Gly Arg His Ser Trp Ala Val Leu His Thr Leu Ala Ala Tyr Tyr Pro Asp Leu Pro Thr Pro Glu Gln Gln Gln Asp Met Ala Gln Phe Ile His Leu Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cys Ala Glu Asp Leu Arg Lys Arg Leu Cys Arg Asn His Pro Asp Thr Arg Thr Arg Ala Cys Phe Thr Gln Trp Leu Cys His Leu His Asn Glu Val Asn Arg Lys Leu Gly Lys Pro Asp Phe Asp Cys Ser Lys Val Asp Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Ser Cys Asp
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 119 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Arg Asp Thr Lys Phe Arg Glu Asp Cys Pro Pro Asp Arg Glu Glu Leu Gly Arg His Ser Trp Ala Val Leu His Thr Leu Ala Ala Tyr Tyr Pro Asp Leu Pro Thr Pro Glu Gln Gln Gln Asp Met Ala Gln Phe Ile His Leu Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cys Ala Glu Asp Leu Arg Lys Arg Leu Cys Arg Asn His Pro Asp Thr Arg Thr Arg Ala Cys Phe Thr Gln Trp Leu Cys His Leu His Asn Glu Val Asn Arg Lys Leu Gly Lys Pro Asp Phe Asp Cys Ser Lys Val Asp Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Ser Cys Asp
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 125 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
Met Arg Thr Gln Gln Lys Arg Asp Ile Lys Phe Arg Glu Asp Cys Pro Gln Asp Arg Glu Glu Leu Gly Arg Asn Thr Trp Ala Phe Leu His Thr Leu Ala Ala Tyr Tyr Pro Asp Met Pro Thr Pro Glu Gln Gln Gln Asp Met Ala Gln Phe Ile His Ile Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cys Ala Glu Asp Ile Arg Lys Arg Ile Asp Arg Ser Gln Pro Asp Thr Ser Thr Arg Val Ser Phe Ser Gln Trp Leu Cys Arg Leu His Asn Glu Val Asn Arg Lys Leu Gly Lys Pro Asp Phe Asp Cys Ser Arg Val Asp Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Ser Cys Asp
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ix) FEATURE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
CGGCAGAGGA TGCTGTGT 18
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
GCGGAGCCAC CTTCATCA 18
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
(2) INFORMATION FOR SEQ ID NO:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
(2) INFORMATION FOR SEQ ID NO:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
GCGGAGCCAC CTTCATCA 18
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
(2) INFORMATION FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
CCACCATGT
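The LENGTH and LOCATION fields in this listing can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the filed listing: the helper names `cds_codon_count` and `base_count` are hypothetical, and it assumes that each coding-sequence location is 1-based and inclusive and that the stated range excludes the stop codon. The figures are the ones given in the entries for SEQ ID NO:28/29, 33/34, 74/75, 40, 41 and 68.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in an inclusive, 1-based CDS range."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS span {span} is not a multiple of 3")
    return span // 3


def base_count(sequence_text: str) -> int:
    """Count bases in a printed sequence, ignoring the spaces between blocks."""
    return len("".join(sequence_text.split()))


# (nucleic acid SEQ ID, CDS start, CDS end, declared protein length)
CDS_CHECKS = [
    (28, 49, 1269, 407),    # protein given as SEQ ID NO:29
    (33, 1, 357, 119),      # protein given as SEQ ID NO:34
    (74, 573, 5684, 1704),  # protein given as SEQ ID NO:75
]

# (SEQ ID, printed sequence, declared length in base pairs)
OLIGO_CHECKS = [
    (40, "CGGCAGAGGA TGCTGTGT", 18),
    (41, "GCGGAGCCAC CTTCATCA", 18),
    (68, "CCACCATGT", 9),
]

for seq_id, start, end, declared in CDS_CHECKS:
    status = "consistent" if cds_codon_count(start, end) == declared else "MISMATCH"
    print(f"SEQ ID NO:{seq_id} CDS {start}..{end}: {status}")

for seq_id, text, declared in OLIGO_CHECKS:
    status = "consistent" if base_count(text) == declared else "MISMATCH"
    print(f"SEQ ID NO:{seq_id} length {declared}: {status}")
```

A mismatch from such a check would point to either a transcription error in the printed sequence or a different convention for the location range, so it is best read as a prompt for manual inspection rather than proof of an error.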
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
His Arg Asp Leu Lys Pro Glu Asn
(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6525 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 573..5684 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
Met Ala Val Leu Arg Gln Leu Ala Leu Leu Leu Trp Lys Asn Tyr Thr Leu Gln Lys Arg Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val 
Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gly Gln Glu Pro Arg Glu Val 
Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg 
Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg CAGAAAATAA ATGCTCAGGG GACACAAAA.A F~P.AAAAAAAA AAAAAAAAAA 6514 (2) INFORMATION FOR SEQ ID N0:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1704 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
Met Ala Val Leu Arg Gln Leu Ala Leu Leu Leu Trp Lys Asn Tyr Thr Leu Gln Lys Arg Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu 65 70 75 g0 Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro 
Arg Ala Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro 865 870 875 g80 Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu 
Gly Gln Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala 
Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg (2) INFORMATION FOR SEQ ID N0:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
AGCTGGCGCT CCTCCTCT 18
(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 349 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
Gly Gln Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Ser Ile Gly Arg Pro Thr Gly Ile Gly Tyr Asp Arg Gly Cys Pro Gln Leu Asp Leu Thr Val Glu His Leu Leu Lys Gly Lys Leu Leu Lys Asn Leu Ser Gly Gly Met Arg Lys Leu Gly Leu Asp Glu Pro Thr Ala Gly Met Asp Arg Leu Arg Lys Arg Thr Ile Leu Thr Thr His Met Asp Glu Ala Leu Gly Asp Ile Met His Gly Leu Gly Leu Lys Gln Lys Gly Gly Tyr Thr Val Glu Gln Pro Ala Arg Phe Leu Leu Ser Phe Gly Ser Thr Glu Val Phe Ile Gly Asp His Arg Gly Ala Gln Phe Lys Lys Tyr Ser Arg Trp Gln Val Leu Pro Leu Asp Leu Thr Glu Val Phe Pro Leu Pro Gly Ala Leu Phe Asn Tyr His Thr Ser Val Ser Gln Ala Leu Ala Ser Thr Phe Glu Arg Gln Ala His Gln Phe Gly Phe Leu Asp Ile Ser Leu Leu Phe Asp His Ala Leu Leu Tyr Ser Pro Tyr Phe Phe Ala Leu Ile Ala Leu Val Glu Leu Leu Phe Leu Pro Gly Ala Asn Trp Gly Phe Leu Arg Met Leu Pro Val Glu Arg Arg Asn Leu Ile Lys Leu Lys Ala Val Leu Leu Ala Val Glu Cys Phe Gly Leu Leu Gly Asn Gly Ala Gly Lys Thr Thr Thr Phe Leu Thr Gly Ser Ser Gly Ala Gly Gly Asp Val Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Thr Gly Arg Glu Leu Ala Gly Ala Glu Leu His Ala Lys Leu Val Arg Tyr Ser Gly Gly Lys Arg Lys Ser Gly Ala Leu Leu Pro Gln Ile Leu Asp Glu Pro Gly Asp Pro Ala Arg Arg Trp Glu Ser Ala Thr Ser His Ser Met Glu Cys Glu Ala Leu Cys Arg Ala Gly Gly Ser Gln Leu Lys Ser Gly Tyr Val Pro Ser Val Leu Leu Pro Trp Phe Gly Val Asp Gln Ser Leu Glu Phe Leu Ala Leu (2) INFORMATION FOR SEQ ID N0:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1974 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 612 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
Met Ile Thr Ser Val Leu Arg Tyr Val Leu Ala Leu Tyr Phe Cys Met Gly Ile Ala His Gly Ala Tyr Phe Ser Gln Phe Ser Met Arg Ala Pro Asp His Asp Pro Cys His Asp His Thr Gly Arg Pro Val Arg Cys Val Pro Glu Phe Ile Asn Ala Ala Phe Gly Lys Pro Val Ile Ala Ser Asp Thr Cys Gly Thr Asn Arg Pro Asp Lys Tyr Cys Thr Val Lys Glu Gly Pro Asp Gly Ile Ile Arg Glu Gln Cys Asp Thr Cys Asp Ala Arg Asn His Phe Gln Ser His Pro Ala Ser Leu Leu Thr Asp Leu Asn Ser Ile Gly Asn Met Thr Cys Trp Val Ser Thr Pro Ser Leu Ser Pro Gln Asn Val Ser Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Leu Thr Tyr Val Ser Met His Phe Cys Ser Arg Leu Pro Asp Ser Met Ala Leu Tyr Lys Ser Ala Asp Phe Gly Lys Thr Trp Thr Pro Phe Gln Phe Tyr Ser Ser Glu Cys Arg Arg Ile Phe Gly Arg Asp Pro Asp Val Ser Ile Thr Lys Ser Asn Glu Gln Glu Ala Val Cys Thr Ala Ser His Ile Met Gly Pro Gly Gly Asn Arg Val Ala Phe Pro Phe Leu Glu Asn Arg Pro Ser Ala Gln Asn Phe Glu Asn Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Lys Val Val Phe Ser Arg Leu Ser Pro Asp Gln Ala Glu Leu Tyr Gly Leu Ser Asn Asp Val Asn Ser Tyr Gly Asn Glu Thr Asp Asp Glu Val Lys Gln Arg Tyr Phe Tyr Ser Met Gly Glu Leu Ala Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Ile Phe Asp Lys Met Gly Arg Tyr Thr Cys Asp Cys Lys His Asn Thr Ala Gly Thr Glu Cys Glu Met Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gly Arg Ala Thr Ala Asn Ser Ala Asn Ser Cys Val Ala Cys Asn Cys Asn Gln His Ala Lys Arg Cys Arg Phe Asp Ala Glu Leu Phe Arg Leu Ser Gly Asn Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg Asn Cys His Leu Cys Lys Pro Gly Phe Val Arg Asp Thr Ser Leu Pro Met Thr His Arg Arg Ala Cys Lys Ser Cys Gly Cys His Pro Val Gly Ser Leu Gly Lys Ser Cys Asn Gln Ser Ser Gly Gln Cys Val Cys Lys Pro Gly Val Thr Gly Thr Thr Cys Asn Arg Cys Ala Lys Gly Tyr Gln Gln Ser Arg Ser Thr Val Thr Pro Cys Ile Lys Ile Pro Thr Lys Ala Asp Phe Ile Gly Ser Ser His Ser Glu Glu Gln Asp Gln Cys Ser Lys Cys Arg Ile Val Pro Lys Arg Leu Asn Gln Lys Lys Phe Cys Lys Arg Asp His Ala Val 
Gln Met Val Val Val Ser Arg Glu Met Val Asp Gly Trp Ala Lys Tyr Lys Ile Val Val Glu Ser Val Phe Lys Arg Thr Glu Asn Met Gln Arg Arg Gly Glu Thr Ser Leu Trp Ile Ser Pro Gln Gly Val Ile Cys Lys Cys Pro Lys Leu Arg Val Gly Arg Arg Tyr Leu Leu Leu Gly Lys Asn Asp Ser Asp His Glu Arg Asp Gly Leu Met Val Asn Pro Gln Thr Val Leu Val Glu Trp Glu Asp Asp Ile Met Asp Lys Val Leu Arg Phe Ser Lys Lys Asp Lys Leu Gly Gln Cys Pro Glu Ile Thr Ser His Arg Tyr (2) INFORMATION FOR SEQ ID N0:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - sense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - antisense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
(2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - sense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
(2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - antisense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
Proteins belonging to the ABC transporter superfamily are linked by strong structural similarities. Typically, ABC transporters have four conserved domains: two hydrophobic domains, which may impart substrate specificity (Payne et al., Mol. Gen. Genet. 200:493-496, 1985; Foote et al., Nature 345:255-258, 1990; Anderson et al., Science 253:202-205, 1991; Shustik et al., Br. J. Haematol. 79:50-56, 1991; Covitz et al., EMBO J. 13:1752-1759, 1994), and two highly conserved domains associated with ATP binding and hydrolysis (Higgins, supra. 1992). ABC transporters govern unidirectional transport of molecules into or out of cells and across subcellular membranes (Higgins, supra. 1992). Their substrates range from heavy metals (Ouellette et al., Res. Microbiol. 142:737-746, 1991) to peptides and full-size proteins (Gartner et al., Nature Genet. 1:16-23, 1992).
In eukaryotic cells, ABC transporters exist either as single large symmetrical proteins containing all four domains or as dimers resulting from the association of two smaller polypeptides, each containing a hydrophobic domain and an ATP-binding domain. Examples of this multimeric structural form are the human TAP proteins (Kelly et al., Nature 355:641-644, 1992) and the functional PMP70 protein (Kamijo et al., J. Biol. Chem. 265:4534-40, 1990). This multimeric structure is also found in numerous prokaryotic ABC transporters. The hydrophobic regions are composed of up to six transmembrane segments. The ATP-binding domains operate independently and may or may not be functionally equivalent (Kerem et al., Science 245:1073-80, 1989; Mimmack et al., Proc. Natl. Acad. Sci., USA 86:8257-61, 1989; Cutting et al., Nature 346:366-369, 1990; Kerppola et al., J. Biol. Chem. 266:9857-65, 1991).
Several of the ABC transporters thus far identified in humans have been shown to be clinically important. For example, overexpression of P-glycoproteins is responsible for multi-drug resistance in tumors (Gottesman et al., Ann. Rev. Biochem. 62:385-427, 1993). Classical cystic fibrosis (CF), as well as a large proportion of cases of congenital bilateral absence of the vas deferens (CBAVD), is caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR), an ABC transporter (Kerem et al., supra.; Cutting et al., supra.). Defects in ABC transporters have also been implicated in Zellweger syndrome (Gartner et al., supra.) and adrenoleukodystrophy (Mosser et al., Nature 361:726-730, 1993).
Two members of a novel ABC transporter subgroup (murine ABC1 and ABC2) have been shown to contain domains similar to the regulatory R-domain of CFTR (Luciani et al., supra. 1994). Functionally, the mouse ABC1 protein has been shown to play a role in macrophage engulfment of apoptotic cells (Luciani et al., EMBO J. 16:226-235, 1996), while the function of ABC2 remains unknown. All three proteins contain a large charged region containing several potential phosphorylation sites (Kerem et al., supra.; Luciani et al., supra. 1994). The charged amino acid residues within this region are sequentially arranged in blocks of alternating positive and negative charge.
A common feature of these particular ABC transporters, including hABC3, is the presence of a large linker domain between the two ATP-binding cassettes. The presence of numerous polar residues and potential phosphorylation sites in the linker domain suggests that this region may play a regulatory role, perhaps similar to that of the R-domain of CFTR (Kerem et al., supra.). In addition, the four proteins also contain a hydrophobic region, the HH1 domain (Luciani et al., supra. 1994), within the conserved linker domain. Although there is little homology at the sequence level between the HH1 domains of hABC3 and the murine ABCs, they appear to be structurally conserved, with each domain predicted to have a β-sheet conformation. The similarity between these proteins suggests that they all belong to the same ABC subfamily, originally defined by ABC1 and ABC2 (Luciani et al., supra. 1994). The genes encoding the human homologues of ABC1 and ABC2 have been mapped to human chromosome 9 at q22-q31 and q34, respectively (Luciani et al., supra. 1994).
Although they are members of the same subfamily, ABC1, ABC2 and hABC3 are likely to have different functional roles. The differences present in the transmembrane and linker domains of ABC1, ABC2 and hABC3 may confer on each a unique substrate specificity. For example, alterations and mutations in the transmembrane domains of both prokaryotic and eukaryotic ABC transporters have been shown to alter substrate specificity (Payne et al., supra.; Foote et al., supra.; Covitz et al., supra.), while changes to the R-domain of CFTR have been shown to alter its ion selectivity (Anderson et al., supra.; Rich et al., Science 253:205-207, 1991). The differences in the expression patterns of ABC1, ABC2 and hABC3 also suggest that the proteins may be functionally distinct. Murine ABC1 and ABC2 have been shown to be expressed at varying levels in a wide variety of adult and embryonic tissues, with the highest levels of ABC1 expression seen in pregnant uterus and regions rich in monocytic cells, while the highest levels of ABC2 expression were seen in brain (Luciani et al., supra. 1994; Luciani et al., supra. 1996). In contrast, hABC3 is preferentially expressed in lung, with significantly lower levels of expression seen in brain, heart, and pancreas.
Notwithstanding the structural differences between ABC1, ABC2 and hABC3, it remains possible that the three proteins play similar functional roles in different cell populations. To date, no function has been proposed for murine ABC2. However, recent data indicate that ABC1 is required for the engulfment of cells undergoing apoptosis, though the molecular mechanism underlying ABC1 function is unknown (Luciani et al., supra. 1996). If hABC3 functions in a manner similar to ABC1, it could be expressed by pulmonary macrophages involved in host defense.
ABC transporters have been described for substrates ranging from small ions to large polysaccharides and proteins. Based on the high level of expression in lung, the substrate for hABC3 may play an integral role in lung function, including ion or polysaccharide transport. Further clues may be provided by a closer examination of hABC3 expression in the lung. These studies would include identifying the lung cells responsible for hABC3 expression and determining the subcellular localization of hABC3. The identification and cloning of the hABC3 cDNA may have implications for cystic fibrosis, since it contains a potential R-domain and is expressed at highest levels in the lung. If hABC3 does play an integral role in lung function, then modulation or alteration of hABC3 substrate specificity could have significant therapeutic implications for CF.
Several cDNAs were cloned using the GeneTrapper direct selection system and oligos designed from the 5'-most trapped exon encoding sequences with homology to ABC1 (trapped exon L48747). The longest clone (ABCgt.1), isolated with the GeneTrapper system from a normal human lung cDNA library using custom oligonucleotides designed from the 5'-most exon trap, was 5719 bp in length. An additional cDNA clone (ABC.S) was isolated using a radiolabeled 1.1 kb RT-PCR product (ABC3-12) as a probe (Figure 15). The 5' end of the ABC3 cDNA was further characterized using 5' RACE, with several RACE products containing multiple in-frame stop codons upstream of the start methionine.
Accordingly, the present invention provides a novel human ABC gene which has homology to the murine ABC1 and ABC2 genes, as well as sequences predicted to be encoded by cosmid C48B4.4 from C. elegans (Wilson et al., supra.). A 6.4 kb cDNA has been assembled for the hABC3 transporter. The assembled cDNA contains a 5116-nucleotide open reading frame encoding 1705 amino acids, with the predicted protein having a molecular weight of 191 kDa.
The proposed start methionine is 50 bp upstream of the 5' end of clone ABCgt.1.
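The relationship between open reading frame length, residue count and predicted mass stated above can be checked with a rough calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the average residue mass of about 110 Da is a common rule of thumb assumed here, not a value taken from this disclosure.

```python
# Rough sanity check of ORF length versus protein size.
# ASSUMPTION: ~110 Da per amino acid residue is a generic rule of thumb,
# not a figure from the specification.
AVERAGE_RESIDUE_MASS_DA = 110.0


def codons_in_orf(orf_length_nt: int) -> int:
    """Return the number of complete codons in an ORF of the given length."""
    if orf_length_nt < 0:
        raise ValueError("ORF length must be non-negative")
    return orf_length_nt // 3


def approx_mass_kda(n_residues: int,
                    avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from the residue count."""
    return n_residues * avg_residue_mass_da / 1000.0


# Values quoted in the text for hABC3.
HABC3_ORF_NT = 5116
HABC3_CODONS = codons_in_orf(HABC3_ORF_NT)
HABC3_MASS_ESTIMATE_KDA = approx_mass_kda(HABC3_CODONS)
```

Under these assumptions the estimate is of the same order as the 191 kDa prediction; a composition-based calculation from the actual sequence would be needed for a closer figure.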
Five trapped exons from P1 clones 109.8C and 47.2H were shown to contain sequences with homology to the human ribosomal protein L3 cDNA, with hybridization studies indicating that the L3-like gene is oriented centromeric to telomeric (transcript L in Figure 1). The ribosomal L3 gene product is one of five proteins essential for peptidyltransferase activity in the large ribosomal subunit (Schulze and Nierhaus, EMBO J. 1:609-613, 1982). Not surprisingly, the L3 amino acid sequence is highly conserved across species. Mammalian L3 genes showing ~98% protein sequence identity have been characterized from man (Genbank Accession No. X73460), mouse (Peckham et al., Genes Dev. 3:2062-2071, 1989), rat (Kuwano and Wool, Biochem. Biophys. Res. Comm. 187:58-64, 1992) and cow (Simonic et al., Biochim. Biophys. Acta 1219:706-710, 1994). The cumulative percent identity between the trapped exons and the reported human ribosomal protein L3 cDNA was 74% (537/724) at the nucleotide level.
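The percent identity figures in this description are ratios of matching positions to aligned positions, for example 537 matches over 724 aligned nucleotides. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and the position-by-position comparison assumes the two sequences have already been aligned to equal length by some other tool.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Percent identity as matches divided by aligned positions."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return 100.0 * matches / aligned_length


def count_matches(aligned_a: str, aligned_b: str) -> int:
    """Count identical positions in two pre-aligned, equal-length sequences.

    Gap characters ('-') never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )


# Ratio quoted in the text for the trapped exons versus the L3 cDNA.
L3_EXON_IDENTITY = percent_identity(537, 724)
```

Rounded to the nearest whole percent, the 537/724 ratio gives the 74% value reported for the trapped exons.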
A full-length cDNA encoding a novel ribosomal L3 protein subtype, SEM L3, was isolated and sequenced (Figure 11). This gene is now designated RPL3L and has been assigned GenBank Accession No. U65581. The deduced protein sequence is 407 amino acids long and shows 77% identity to other known mammalian L3 proteins, which are themselves highly conserved. Hybridization analysis of human genomic DNA suggests this novel gene is single copy and has a tissue-specific pattern of expression.
The expression pattern of the previously identified human L3 gene and the novel human RPL3L was determined using multiple tissue Northern blots. The human L3 gene showed a ubiquitous pattern of expression in all tissues, with the highest expression in the pancreas. In contrast, the novel gene described herein is strongly expressed in skeletal muscle and heart tissue, with low levels of expression in the pancreas. This novel gene, RPL3L (Ribosomal Protein L3-Like), is located in a gene-rich region near the PKD1 and TSC2 genes on chromosome 16p13.3.
The RPL3L protein is more closely related to the above-mentioned cytoplasmic ribosomal proteins than to previously described nucleus-encoded mitochondrial proteins (Graack et al., Eur. J. Biochem. 206:373-380, 1992). The presence of a highly conserved nuclear localization sequence in RPL3L further supports the hypothesis that it represents a novel cytoplasmic L3 ribosomal protein subtype and not a nucleus-encoded mitochondrial protein.
In addition, an exon trap (Genbank Accession No. L48792) from a gene which is located telomeric of the L3-like gene was obtained (transcript M in Figure 1). Sequences encoded by transcript M were shown to have homology to pilB from Neisseria gonorrhoeae (Taha et al., EMBO J. 7:4367-4378, 1988) as well as to a computer-predicted 17.2 kDa protein encoded by cosmid F44E2.6 from C. elegans (Wilson et al., supra.).
Using sequences from exon trap L48792, a 600 bp partial cDNA was isolated and it was determined that the corresponding gene is oriented centromeric to telomeric. A 1.3 kb message was detected by the cDNA on Northern blots.
Sequences conserved between the partial cDNA and the hypothetical 17.2 kDa protein were also conserved in the pilB protein from Neisseria gonorrhoeae (Taha et al., supra. 1988), a hypothetical 19.3 kDa protein from yeast (Genbank Accession No. P25566), and a fimbrial transcription regulation repressor from Haemophilus (Fleischmann et al., Science 269:496-512 1995) (Figure 2).
The pilB protein has homology to histidine kinase sensors and has been shown to play a role in the repression of pilin production in Neisseria gonorrhoeae (Taha et al., supra. 1988; Taha et al., Mol. Microbiol. 5:137-148, 1991).
However, residues conserved between pilB, transcript M and the C. elegans, yeast, and Haemophilus sequences do not include the conserved histidine kinase domains from pilB
(Taha et al., supra. 1991). These findings suggest that the conserved region in transcript M has a function which is independent of the proposed histidine kinase sensor activity of pilB.
An additional exon trap from the region of overlap between the 109.8C and 47.2H P1 clones was shown to contain human LLRep3 sequences (Slynn et al., Nuc. Acids Res. 18:681, 1990). Hybridization studies indicated that the LLRep3 sequences (transcript K in Figure 1) were located between the sazD and L3-like genes. The region of highest gene density appears to be at the telomeric end of this cloned interval, particularly the region between TSC2 and D16S84, with a minimum of five genes mapping to this region (transcription units K, L and M, sazD and hERV1).
Also mapped to this region was an exon trap which is 86% identical (170/197) at the nucleotide level to the previously described rat augmenter of liver regeneration (ALR) (Hagiya et al., Proc. Natl. Acad. Sci., USA 91:8142-8146, 1994). ALR is a growth factor which augments the growth of damaged liver tissue while having no effect on the resting liver. Studies have demonstrated that rat ALR is capable of augmenting hepatocytic regeneration following hepatectomy.
This ALR-like exon trap was also shown to contain sequences from the recently described hERV1 gene, which encodes a functional homologue to yeast ERV1 (Lisowsky et al., supra.).
A 468 bp cDNA, hALR, has been obtained from the human ALR gene (Figure 13). The ALR sequences encode a 119 amino acid protein which is 84.8% identical and 94.1% similar to the rat ALR protein (Figure 14).
The cloning of human ALR has significant implications in the treatment of degenerative liver diseases. For example, biologically active rat ALR has been produced from COS-7 cells expressing rat ALR cDNA
(Hagiya et al., supra.). Accordingly, recombinant hALR
could be used in the treatment of damaged liver. In addition, a construct expressing hALR could be used in gene therapy to treat chronic liver diseases.
Forty-three of the trapped exons did not have significant homology to sequences in the protein or DNA databases, nor were ESTs (expressed sequence tags) containing sequences from the exon traps observed in dbEST. The absence of ESTs containing sequences from these novel exon traps is not surprising, since one of the criteria for selecting exon traps for further analysis was the presence of an EST in the database. These trapped exons are likely to represent bona fide products, since in many cases they were trapped multiple times from different P1 clones and in combination with flanking exons.
The present invention encompasses novel human genes and isolated nucleic acids comprising unique exon sequences from chromosome 16. The sequences described herein provide a valuable resource for transcriptional mapping and create a set of sequence-ready templates for a gene-rich interval responsible for at least two inheritable diseases.
Accordingly, the present invention provides isolated nucleic acids encoding human netrin (hNET), human ATP Binding Cassette transporter (hABC3), human ribosomal L3 (RPL3L) and human augmenter of liver regeneration (hALR) polypeptides. The present invention further provides isolated nucleic acids comprising unique exon sequences from chromosome 16. The term "nucleic acids" (also referred to as polynucleotides) encompasses RNA as well as single and double-stranded DNA, cDNA and oligonucleotides.
As used herein, the term "isolated" refers to a polynucleotide that is in a form that does not occur in nature.
One means of isolating polynucleotides encoding invention polypeptides is to probe a human tissue-specific library with a natural or artificially designed DNA probe using methods well known in the art. DNA probes derived from the human netrin gene, hNET, the human ABC transporter gene, hABC3, the human ribosomal protein L3 gene, RPL3L, or the human augmenter of liver regeneration gene, hALR, are particularly useful for this purpose. DNA and cDNA
molecules that encode invention polypeptides can be used to obtain complementary genomic DNA, cDNA or RNA from human, mammalian, or other animal sources, or to isolate related cDNA or genomic clones by the screening of cDNA or genomic libraries, by methods described in more detail below.
The present invention encompasses isolated nucleic acid sequences, including sense and antisense oligonucleotide sequences, derived from the sequences shown in Figures 3, 4, 8, 11 and 15. hNET-, hABC3-, RPL3L- (SEM L3-), and hALR-derived sequences may also be associated with heterologous sequences, including promoters, enhancers, response elements, signal sequences, polyadenylation sequences, and the like. Furthermore, the nucleic acids can be modified to alter stability, solubility, binding affinity, and specificity. For example, invention-derived sequences can further include nuclease-resistant phosphorothioate, phosphoramidate, and methylphosphonate derivatives, as well as "protein nucleic acid" (PNA) formed by conjugating bases to an amino acid backbone as described in Nielsen et al., Science, 254:1497, 1991. The nucleic acid may be derivatized by linkage of the α-anomer nucleotide, or by formation of a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage.
Furthermore, the nucleic acid sequences of the present invention may also be modified with a label capable of providing a detectable signal, either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules, biotin, and the like.
In general, nucleic acid manipulations according to the present invention use methods that are well known in the art, as disclosed in, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual 2d Ed. (Cold Spring Harbor, NY, 1989), or Ausubel et al., Current Protocols in Molecular Biology (Greene Assoc., Wiley Interscience, NY, NY, 1992).
Examples of nucleic acids are RNA, cDNA, or genomic DNA encoding a human netrin, a human ABC transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. Such nucleic acids may have coding sequences substantially the same as the coding sequence shown in Figures 3, 4, 8, 11 and 15, respectively.
The present invention further provides isolated oligonucleotides corresponding to sequences within the hNET, hABC3, RPL3L (formerly SEM L3), hALR genes, or within the respective cDNAs, which, alone or together, can be used to discriminate between the authentic expressed gene and homologues or other repeated sequences. These oligonucleotides may be from about 12 to about 60 nucleotides in length, preferably about 18 nucleotides, may be single- or double-stranded, and may be labeled or modified as described below.
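The length range given above (about 12 to about 60 nucleotides, preferably about 18) and the relationship between sense and antisense oligonucleotides can be illustrated with a short Python sketch. The helper names are hypothetical, and the example uses the 18-base primer of SEQ ID NO:76 with the spacing removed; the length limits are passed as parameters because the text gives them only as approximate bounds.

```python
# Complement table for unmodified DNA bases only (an assumption of this sketch;
# labeled or derivatized oligonucleotides are not modeled).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the antisense (reverse-complement) strand, written 5' to 3'."""
    cleaned = seq.replace(" ", "").upper()
    if set(cleaned) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return cleaned.translate(_COMPLEMENT)[::-1]


def is_valid_oligo_length(seq: str, min_len: int = 12, max_len: int = 60) -> bool:
    """Check that an oligonucleotide falls within the stated length range."""
    return min_len <= len(seq.replace(" ", "")) <= max_len


# SEQ ID NO:76 as listed, with the block spacing removed.
PRIMER_SEQ_ID_76 = "AGCTGGCGCTCCTCCTCT"
PRIMER_ANTISENSE = reverse_complement(PRIMER_SEQ_ID_76)
```

Applying `reverse_complement` twice returns the original sequence, which is a convenient check that a sense and antisense pair have been entered consistently.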
This invention also encompasses nucleic acids which differ from the nucleic acids shown in Figures 3, 4, 8, 11 and 15, but which have the same phenotype, i.e., encode substantially the same amino acid sequence set forth in Figures 3, 4, 8, 11 and 15, respectively.
Phenotypically similar nucleic acids are also referred to as "functionally equivalent nucleic acids". As used herein, the phrase "functionally equivalent nucleic acids"
encompasses nucleic acids characterized by slight and non-consequential sequence variations that will function in substantially the same manner to produce the same protein product(s) as the nucleic acids disclosed herein. In particular, functionally equivalent nucleic acids encode proteins that are the same as those disclosed herein or that have conservative amino acid variations. For example, conservative variations include substitution of a non-polar residue with another non-polar residue, or substitution of a charged residue with a similarly charged residue. These variations include those recognized by skilled artisans as those that do not substantially alter the tertiary structure of the protein.
Further provided are nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, and human augmenter of liver regeneration polypeptides that, by virtue of the degeneracy of the genetic code, do not necessarily hybridize to the invention nucleic acids under specified hybridization conditions. Preferred nucleic acids encoding the invention polypeptide are comprised of nucleotides that encode substantially the same amino acid sequence set forth in Figures 4, 8, 11 and 15.
Alternatively, preferred nucleic acids encoding the invention polypeptide(s) hybridize under high stringency conditions to substantially the entire sequence, or substantial portions (i.e., typically at least 12 to 60 nucleotides) of the nucleic acid sequence set forth in Figures 3, 4, 8, 11 and 15, respectively.
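The degeneracy of the genetic code referred to above can be illustrated with a minimal sketch. The two short DNA sequences below are hypothetical and are not taken from the Figures; the codon table is the standard genetic code, and the function names are illustrative only. The sketch shows that two nucleic acids differing at many nucleotide positions can nevertheless encode the same amino acid sequence.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical sequences chosen only to illustrate degeneracy.
seq_a = "ATGCTGGCCAGC"
seq_b = "ATGTTAGCTTCA"

if __name__ == "__main__":
    print(translate(seq_a), translate(seq_b), count_mismatches(seq_a, seq_b))
```

Because the two sequences differ at half of their positions, they would not be expected to form a stable hybrid under stringent conditions, even though they are functionally equivalent in the sense used herein.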
Stringency of hybridization, as used herein, refers to conditions under which polynucleotide hybrids are stable. As known to those of skill in the art, the stability of hybrids is a function of sodium ion concentration and temperature. (See, for example, Sambrook et al., supra.).
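The dependence of hybrid stability on sodium ion concentration and temperature can be sketched numerically. The sketch below assumes one commonly cited empirical approximation for the melting temperature of long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 600/N; this formula is not recited in the present specification, other variants of the length term are in use, and it is not appropriate for very short oligonucleotides. The function names and the example probe are hypothetical.

```python
import math


def gc_percent(seq):
    """Return the percentage of G and C residues in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq, sodium_molar):
    """Estimate hybrid melting temperature in degrees Celsius.

    Assumed empirical approximation for long DNA:DNA hybrids:
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/N
    """
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    n = len(seq)
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent(seq) - 600.0 / n


# Hypothetical 100-nucleotide probe with 50% G+C content.
probe = "GC" * 25 + "AT" * 25

if __name__ == "__main__":
    for sodium in (1.0, 0.3, 0.015):
        print(f"[Na+] = {sodium} M: estimated Tm = {estimate_tm(probe, sodium):.1f} C")
```

Under this approximation, lowering the sodium ion concentration lowers the estimated melting temperature, which is why low-salt, high-temperature washes are regarded as more stringent.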
The present invention provides isolated polynucleotides operatively linked to a promoter of RNA
transcription, as well as other regulatory sequences. As used herein, the phrase "operatively linked" refers to the functional relationship of the polynucleotide with regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences. For example, operative linkage of a polynucleotide to a promoter refers to the physical and functional relationship between the polynucleotide and the promoter such that transcription of DNA is initiated from the promoter by an RNA polymerase that specifically recognizes and binds to the promoter, and wherein the promoter directs the transcription of RNA from the polynucleotide.
Promoter regions include specific sequences that are sufficient for RNA polymerase recognition, binding and transcription initiation. Additionally, promoter regions include sequences that modulate the recognition, binding and transcription initiation activity of RNA polymerase.
Such sequences may be cis acting or may be responsive to trans acting factors. Depending upon the nature of the regulation, promoters may be constitutive or regulated.
Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, and the like.
Vectors that contain both a promoter and a cloning site into which a polynucleotide can be operatively linked are well known in the art. Such vectors are capable of transcribing RNA in vitro or in vivo, and are commercially available from sources such as Stratagene (La Jolla, CA) and Promega Biotech (Madison, WI). In order to optimize expression and/or in vitro transcription, it may be necessary to remove, add or alter 5' and/or 3' untranslated portions of the clones to eliminate extra, potential inappropriate alternative translation initiation codons or other sequences that may interfere with or reduce expression, either at the level of transcription or translation. Alternatively, consensus ribosome binding sites can be inserted immediately 5' of the start codon to enhance expression. Similarly, alternative codons, encoding the same amino acid, can be substituted for coding sequences of the human netrin, human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide in order to enhance transcription (e.g., the codon preference of the host cell can be adopted, the presence of G-C rich domains can be reduced, and the like).
Examples of vectors are viruses, such as baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression.
Polynucleotides are inserted into vector genomes using methods well known in the art. For example, insert and vector DNA can be contacted, under suitable conditions, with a restriction enzyme to create complementary ends on each molecule that can pair with each other and be joined together with a ligase. Alternatively, synthetic nucleic acid linkers can be ligated to the termini of restricted polynucleotide. These synthetic linkers contain nucleic acid sequences that correspond to a particular restriction site in the vector DNA. Additionally, an oligonucleotide containing a termination codon and an appropriate restriction site can be ligated for insertion into a vector containing, for example, some or all of the following: a selectable marker gene, such as the neomycin gene for selection of stable or transient transfectants in mammalian cells; enhancer/promoter sequences from the immediate early gene of human CMV for high levels of transcription;
transcription termination and RNA processing signals from SV40 for mRNA stability; SV40 polyoma origins of replication and ColE1 for proper episomal replication;
versatile multiple cloning sites; and T7 and SP6 RNA
promoters for in vitro transcription of sense and antisense RNA. Other means are well known and available in the art.
Also provided are vectors comprising a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, and human augmenter of liver regeneration polypeptides, adapted for expression in a bacterial cell, a yeast cell, an amphibian cell, an insect cell, a mammalian cell and other animal cells. The vectors additionally comprise the regulatory elements necessary for expression of the polynucleotide in the bacterial, yeast, amphibian, mammalian or animal cells so located relative to the polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides as to permit expression thereof. As used herein, "expression"
refers to the process by which polynucleotides are transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA, if an appropriate eukaryotic host is selected.
Regulatory elements required for expression include promoter sequences to bind RNA polymerase and transcription initiation sequences for ribosome binding. For example, a bacterial expression vector includes a promoter such as the lac promoter and for transcription initiation the Shine-Dalgarno sequence and the start codon AUG (Sambrook et al., supra.). Similarly, a eukaryotic expression vector includes a heterologous or homologous promoter for RNA
polymerase II, a downstream polyadenylation signal, the start codon AUG, and a termination codon for detachment of the ribosome. Such vectors can be obtained commercially or assembled by the sequences described in methods well known in the art, for example, the methods described above for constructing vectors in general. Expression vectors are useful to produce cells that express the invention receptor.
This invention provides a transformed host cell that recombinantly expresses the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. Invention host cells have been transformed with a polynucleotide encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. An example is a mammalian cell comprising a plasmid adapted for expression in a mammalian cell. The plasmid contains a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide and the regulatory elements necessary for expression of the invention protein.
Appropriate host cells include bacteria, archaebacteria, fungi, especially yeast, plant cells, insect cells, and animal cells, especially mammalian cells. Of particular interest are E. coli, B. subtilis, Saccharomyces cerevisiae, SF9 cells, C129 cells, 293 cells, Neurospora, and CHO cells, COS cells, HeLa cells, and immortalized mammalian myeloid and lymphoid cell lines. Preferred replication systems include M13, ColE1, SV40, baculovirus, lambda, adenovirus, artificial chromosomes, and the like.
A large number of transcription initiation and termination regulatory regions have been isolated and shown to be effective in the transcription and translation of heterologous proteins in the various hosts. Examples of these regions, methods of isolation, manner of manipulation, and the like, are known in the art. Under appropriate expression conditions, host cells can be used as a source of recombinantly produced hNET, hABC3, RPL3L
(formerly SEM L3) and/or hALR.
Nucleic acids (polynucleotides) encoding invention polypeptides may also be incorporated into the genome of recipient cells by recombination events. For example, such a sequence can be microinjected into a cell, and thereby effect homologous recombination at the site of an endogenous gene encoding hNET, hABC3, RPL3L (formerly SEM L3), and/or hALR, an analog or pseudogene thereof, or a sequence with substantial identity to a hNET-, hABC3-, RPL3L- (SEM L3-), or hALR-encoding gene. Other recombination-based methods such as nonhomologous recombinations or deletion of endogenous gene by homologous recombination, especially in pluripotent cells, may also be used.
The present invention provides isolated peptides, polypeptide(s) and/or protein(s) encoded by the invention nucleic acids. The present invention also encompasses isolated polypeptides having a sequence encoded by hNET, hABC3, RPL3L (SEM L3), and hALR genes, as well as peptides of six or more amino acids derived therefrom. The polypeptide(s) may be isolated from human tissues obtained by biopsy or autopsy, or may be produced in a heterologous cell by recombinant DNA methods as described herein.
As used herein, the term "isolated" means a protein molecule free of cellular components and/or contaminants normally associated with a native in vivo environment. Invention polypeptides and/or proteins include any natural occurring allelic variant, as well as recombinant forms thereof. Invention polypeptides can be isolated using various methods well known to a person of skill in the art.
The methods available for the isolation and purification of invention proteins include precipitation, gel filtration, and chromatographic methods including molecular sieve, ion-exchange, and affinity chromatography using, e.g., hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-specific antibodies or ligands. Other well-known methods are described in Deutscher et al., Guide to Protein Purification: Methods in Enzymology Vol. 182 (Academic Press, 1990). When the invention polypeptide to be purified is produced in a recombinant system, the recombinant expression vector may comprise additional sequences that encode additional amino-terminal or carboxy-terminal amino acids; these extra amino acids act as "tags"
for immunoaffinity purification using immobilized antibodies or for affinity purification using immobilized ligands.
Peptides comprising hNET-, hABC3-, RPL3L- (SEM
L3-) or hALR-specific sequences may be derived from isolated larger hNET, hABC3, RPL3L (SEM L3), or hALR
polypeptides described above, using proteolytic cleavages by e.g. proteases such as trypsin and chemical treatments such as cyanogen bromide that are well-known in the art.
Alternatively, peptides up to 60 residues in length can be routinely synthesized in milligram quantities using commercially available peptide synthesizers.
An example of the means for preparing the invention polypeptide(s) is to express polynucleotides encoding hNET, hABC3, RPL3L (SEM L3), and/or hALR in a suitable host cell, such as a bacterial cell, a yeast cell, an amphibian cell (e.g., oocyte), an insect cell (e.g., Drosophila) or a mammalian cell, using methods well known in the art, and recovering the expressed polypeptide, again using well-known methods. Invention polypeptides can be isolated directly from cells that have been transformed with expression vectors, described below in more detail.
The invention polypeptide, biologically active fragments, and functional equivalents thereof can also be produced by chemical synthesis. As used herein, "biologically active fragment" refers to any portion of the polypeptide represented by the amino acid sequence in Figures 4, 8, 11 and 15 that can assemble into an active protein. Synthetic polypeptides can be produced using Applied Biosystems, Inc.
Model 430A or 431A automatic peptide synthesizer (Foster City, CA) employing the chemistry provided by the manufacturer.
Modification of the invention nucleic acids, polynucleotides, polypeptides, peptides or proteins with the following phrases: "recombinantly expressed/produced", "isolated", or "substantially pure", encompasses nucleic acids, polynucleotides, polypeptides, peptides or proteins that have been produced in such form by the hand of man, and are thus separated from their native in vivo cellular environment. As a result of this human intervention, the recombinant nucleic acids, polynucleotides, polypeptides, peptides and proteins of the invention are useful in ways that the corresponding naturally occurring molecules are not, such as identification of selective drugs or compounds.
Sequences having "substantial sequence homology"
are intended to refer to nucleotide sequences that share at least about 90% identity with invention nucleic acids; and amino acid sequences that typically share at least about 95% amino acid identity with invention polypeptides. It is recognized, however, that polypeptides or nucleic acids containing less than the above-described levels of homology arising as splice variants or that are modified by conservative amino acid substitutions, or by substitution of degenerate codons are also encompassed within the scope of the present invention.
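As a minimal sketch of how a percent identity figure of the kind recited above might be computed, the following assumes that the two sequences have already been aligned to equal length, with "-" marking gaps, and that gapped columns are excluded from the denominator. Both the alignment step and the choice of denominator are assumptions; the specification does not prescribe a particular algorithm, and the function name and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical 10-nucleotide sequences differing at one position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```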
The present invention provides a nucleic acid probe comprising a polynucleotide capable of specifically hybridizing with a sequence included within the nucleic acid sequence encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, for example, a coding sequence included within the nucleotide sequence shown in Figures 3, 4, 8, 11 and 15, respectively.
As used herein, a "nucleic acid probe" may be a sequence of nucleotides that includes from about 12 to about 60 contiguous bases set forth in Figures 3, 4, 8, 11 and 15, preferably about 18 nucleotides, may be single- or double-stranded, and may be labeled or modified as described herein. Preferred regions from which to construct probes include 5' and/or 3' coding sequences, sequences predicted to encode transmembrane domains, sequences predicted to encode cytoplasmic loops, signal sequences, ligand binding sites, and the like.
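The selection of contiguous probe sequences of a given length can be sketched as a simple sliding window. The example below is illustrative only: the input sequence is hypothetical rather than taken from the Figures, the function name is an assumption, and the default of 18 nucleotides and the 12 to 60 nucleotide bounds follow the ranges stated above.

```python
def candidate_probes(sequence, length=18):
    """Return (start, probe) pairs for every contiguous window of the given length.

    Start positions are 0-based. The length is restricted to the
    12 to 60 nucleotide range described in the text.
    """
    if not 12 <= length <= 60:
        raise ValueError("probe length must be between 12 and 60 nucleotides")
    sequence = sequence.upper()
    return [
        (start, sequence[start:start + length])
        for start in range(len(sequence) - length + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 30-nucleotide coding segment.
    segment = "ATGGCCCTGCTGAGCGTGGCCCTGGGCAGC"
    for start, probe in candidate_probes(segment)[:3]:
        print(start, probe)
```

In practice each candidate would be screened further, for example for uniqueness against homologues and repeated sequences, before being chosen as a probe.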
Full-length or fragments of cDNA clones can also be used as probes for the detection and isolation of related genes. When fragments are used as probes, preferably the cDNA sequences will be from the carboxyl end-encoding portion of the cDNA, and most preferably will include predicted transmembrane domain-encoding portions of the cDNA sequence. Transmembrane domain regions can be predicted based on hydropathy analysis of the deduced amino acid sequence using, for example, the method of Kyte and Doolittle (J. Mol. Biol. 157:105, 1982).
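A hydropathy analysis of the kind attributed above to Kyte and Doolittle can be sketched as a sliding-window average over the published hydropathy index of each residue. The window of 19 residues and the threshold of 1.6 used below are values commonly associated with that method for flagging candidate membrane-spanning segments; they are assumptions for this illustration, and the test sequence and function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Return the mean hydropathy of each contiguous window of residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(protein, window=19, threshold=1.6):
    """Return 0-based start positions of windows whose mean exceeds the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i for i, mean in enumerate(profile) if mean > threshold]


if __name__ == "__main__":
    # Hypothetical sequence: a hydrophobic stretch flanked by charged residues.
    protein = "DEKR" * 5 + "LIVALIVALIVALIVALIV" + "DEKR" * 5
    print(candidate_tm_windows(protein))
```

Windows flagged in this way are only predictions; they indicate regions of the cDNA from which transmembrane domain-encoding probe fragments might be drawn.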
As used herein, the phrase "specifically hybridizing" encompasses the ability of a polynucleotide to recognize a sequence of nucleic acids that are complementary thereto and to form double-helical segments via hydrogen bonding between complementary base pairs.
Nucleic acid probe technology is well known to those skilled in the art who will readily appreciate that such probes may vary greatly in length and may be labeled with a detectable agent, such as a radioisotope, a fluorescent dye, and the like, to facilitate detection of the probe.
Invention probes are useful to detect the presence of nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. For example, the probes can be used for in situ hybridizations in order to locate biological tissues in which the invention gene is expressed. Additionally, synthesized oligonucleotides complementary to the nucleic acids of a polynucleotide encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides are useful as probes for detecting the invention genes, their associated mRNA, or for the isolation of related genes using homology screening of genomic or cDNA libraries, or by using amplification techniques well known to one of skill in the art.
Also provided are antisense oligonucleotides having a sequence capable of binding specifically with any portion of an mRNA that encodes human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide so as to prevent translation of the mRNA. The antisense oligonucleotide may have a sequence capable of binding specifically with any portion of the sequence of the cDNA encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide. As used herein, the phrase "binding specifically" encompasses the ability of a nucleic acid sequence to recognize a complementary nucleic acid sequence and to form double-helical segments therewith via the formation of hydrogen bonds between the complementary base pairs. An example of an antisense oligonucleotide is an antisense oligonucleotide comprising chemical analogs of nucleotides (i.e., synthetic antisense oligonucleotide, SAO).
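The complementarity that underlies specific binding of an antisense oligonucleotide to its target can be illustrated by computing the reverse complement of an mRNA segment. The sketch below is an illustration only: the target segment is hypothetical, the function name is an assumption, and the output is written as an unmodified DNA oligonucleotide in the 5' to 3' direction, without regard to any chemical analogs that an actual synthetic antisense oligonucleotide might contain.

```python
# Maps each RNA base to the DNA base that pairs with it.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_dna_oligo(mrna_segment):
    """Return the DNA oligonucleotide (5'->3') complementary to an mRNA segment (5'->3')."""
    rna = mrna_segment.upper().replace("T", "U")  # accept cDNA-style input as well
    complement = rna.translate(_RNA_TO_DNA_COMPLEMENT)
    return complement[::-1]  # reverse so that the result reads 5'->3'


if __name__ == "__main__":
    # Hypothetical 18-nucleotide target spanning a start codon.
    target = "GCCACCAUGGCCCUGCUG"
    print(antisense_dna_oligo(target))
```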
Compositions comprising an amount of the antisense oligonucleotide, (SAOC), effective to reduce expression of the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide by passing through a cell membrane and binding specifically with mRNA encoding the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide so as to prevent its translation and an acceptable hydrophobic carrier capable of passing through a cell membrane are also provided herein. The acceptable hydrophobic carrier capable of passing through cell membranes may also comprise a structure which binds to a receptor specific for a selected cell type and is thereby taken up by cells of the selected cell type. The structure may be part of a protein known to bind to a cell-type specific receptor.
This invention provides a means to modulate levels of expression of invention polypeptides by the use of a synthetic antisense oligonucleotide composition (SAOC) which inhibits translation of mRNA encoding these polypeptides. Synthetic oligonucleotides, or other antisense chemical structures designed to recognize and selectively bind to mRNA, are constructed to be complementary to portions of the nucleotide sequences shown in Figures 3, 4, 8, 11 and 15, of DNA, RNA or chemically modified, artificial nucleic acids. The SAOC is designed to be stable in the blood stream for administration to a subject by injection, or in laboratory cell culture conditions. The SAOC is designed to be capable of passing through the cell membrane in order to enter the cytoplasm of the cell by virtue of physical and chemical properties of the SAOC which render it capable of passing through cell membranes, for example, by designing small, hydrophobic SAOC chemical structures, or by virtue of specific transport systems in the cell which recognize and transport the SAOC into the cell.
In addition, the SAOC can be designed for administration only to certain selected cell populations by targeting the SAOC to be recognized by specific cellular uptake mechanisms which bind and take up the SAOC only within select cell populations. For example, the SAOC may be designed to bind to a receptor found only in a certain cell type, as discussed supra. The SAOC is also designed to recognize and selectively bind to the target mRNA
sequence, which may correspond to a sequence contained within the sequence shown in Figures 3, 4, 8, 11 and 15.
The SAOC is designed to inactivate the target mRNA sequence by either binding to the target mRNA and inducing degradation of the mRNA by, for example, RNase I digestion, or inhibiting translation of the mRNA target by interfering with the binding of translation-regulating factors or ribosomes, or inclusion of other chemical structures, such as ribozyme sequences or reactive chemical groups which either degrade or chemically modify the target mRNA. SAOCs have been shown to be capable of such properties when directed against mRNA targets (see Cohen et al., TIPS, 10:435, 1989 and Weintraub, Sci. American, January pp.40, 1990).
This invention further provides a composition containing an acceptable carrier and any of an isolated, purified human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, an active fragment thereof, or a purified, mature protein and active fragments thereof, alone or in combination with each other. These polypeptides or proteins can be recombinantly derived, chemically synthesized or purified from native sources. As used herein, the term "acceptable carrier" encompasses any of the standard pharmaceutical carriers, such as phosphate buffered saline solution, water and emulsions such as an oil/water or water/oil emulsion, and various types of wetting agents.
Also provided are antibodies having specific reactivity with the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptides of the subject invention. Active fragments of antibodies are encompassed within the definition of "antibody". Invention antibodies can be produced by methods known in the art using the invention proteins or portions thereof as antigens. For example, polyclonal and monoclonal antibodies can be produced by methods well known in the art, as described, for example, in Harlow and Lane, Antibodies: A Laboratory Manual (Cold Spring Harbor Laboratory 1988).
The polypeptides of the present invention can be used as the immunogen in generating such antibodies.
Alternatively, synthetic peptides can be prepared (using commercially available synthesizers) and used as immunogens. Where natural or synthetic hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-derived peptides are used to induce a hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-specific immune response, the peptides may be conveniently coupled to a suitable carrier such as KLH and administered in a suitable adjuvant such as Freund's. Preferably, selected peptides are coupled to a lysine core carrier substantially according to the methods of Tam, Proc. Natl.
Acad. Sci., USA 85:5409-5413, 1988. The resulting antibodies may be modified to a monovalent form, such as, for example, Fab, Fab2, Fab', or Fv. Anti-idiotypic antibodies may also be prepared using known methods.
In one embodiment, normal or mutated hNET, hABC3, RPL3L (SEM L3), or hALR polypeptides are used to immunize mice, after which their spleens are removed, and splenocytes used to form cell hybrids with myeloma cells and obtain clones of antibody-secreting cells according to techniques that are standard in the art. The resulting monoclonal antibodies are screened for specific binding to hNET, hABC3, RPL3L (SEM L3), and/or hALR proteins or hNET-, hABC3-, RPL3L- (SEM L3-), and/or hALR-related peptides.
In another embodiment, antibodies are screened for selective binding to normal or mutated hNET, hABC3, RPL3L (SEM L3), or hALR sequences. Antibodies that distinguish between normal and mutant forms of hNET, hABC3, RPL3L (SEM L3), or hALR may be used in diagnostic tests (see below) employing ELISA, EMIT, CEDIA, SLIFA, and the like. Anti-hNET, hABC3, RPL3L (SEM L3), or hALR
antibodies may also be used to perform subcellular and histochemical localization studies. Finally, antibodies may be used to block the function of the hNET, hABC3, RPL3L
(SEM L3), and/or hALR polypeptide, whether normal or mutant, or to perform rational drug design studies to identify and test inhibitors of the function (e.g., using an anti-idiotypic antibody approach).
Amino acid sequences can be analyzed by methods well known in the art to determine whether they encode hydrophobic or hydrophilic domains of the corresponding polypeptide. Altered antibodies such as chimeric, humanized, CDR-grafted or bifunctional antibodies can also be produced by methods well known in the art. Such antibodies can also be produced by hybridoma, chemical synthesis or recombinant methods described, for example, in Sambrook et al., supra., and Harlow and Lane, supra. Both anti-peptide and anti-fusion protein antibodies can be used. (see, for example, Bahouth et al., Trends Pharmacol.
Sci. 12:338, 1991; Ausubel et al., supra.).
Invention antibodies can be used to isolate invention polypeptides. Additionally, the antibodies are useful for detecting the presence of the invention polypeptides, as well as analysis of polypeptide localization, composition, and structure of functional domains. Methods for detecting the presence of a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide comprise contacting the cell with an antibody that specifically binds to the polypeptide, under conditions permitting binding of the antibody to the polypeptide, detecting the presence of the antibody bound to the cell, and thereby detecting the presence of the invention polypeptide on the cell. With respect to the detection of such polypeptides, the antibodies can be used for in vitro diagnostic or in vivo imaging methods.
Immunological procedures useful for in vitro detection of the target human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide in a sample include immunoassays that employ a detectable antibody. Such immunoassays include, for example, ELISA, Pandex microfluorimetric assay, agglutination assays, flow cytometry, serum diagnostic assays and immunohistochemical staining procedures which are well known in the art. An antibody can be made detectable by various means well known in the art. For example, a detectable marker can be directly or indirectly attached to the antibody. Useful markers include, for example, radionuclides, enzymes, fluorogens, chromogens and chemiluminescent labels.
For in vivo imaging methods, a detectable antibody can be administered to a subject and the binding of the antibody to the invention polypeptide can be detected by imaging techniques well known in the art.
Suitable imaging agents are known and include, for example, gamma-emitting radionuclides such as 111In, 99mTc, 51Cr and the like, as well as paramagnetic metal ions, which are described in U.S. Patent No. 4,647,447. The radionuclides permit the imaging of tissues by gamma scintillation photometry, positron emission tomography, single photon emission computed tomography and gamma camera whole body imaging, while paramagnetic metal ions permit visualization by magnetic resonance imaging.
The invention provides a transgenic non-human mammal that is capable of expressing nucleic acids encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide. Also provided is a transgenic non-human mammal capable of expressing nucleic acids encoding a human netrin, a human ABC3 transporter, a human ribosomal L3 subtype, or a human augmenter of liver regeneration polypeptide so mutated as to be incapable of normal activity, i.e., does not express native protein.
The present invention also provides a transgenic non-human mammal having a genome comprising antisense nucleic acids complementary to nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide so placed as to be transcribed into antisense mRNA
complementary to mRNA encoding a human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide, which hybridizes thereto and, thereby, reduces the translation thereof. The polynucleotide may additionally comprise an inducible promoter and/or tissue specific regulatory elements, so that expression can be induced, or restricted to specific cell types. Examples of polynucleotides are DNA or cDNA
having a coding sequence substantially the same as the coding sequence shown in Figures 3, 4, 8, 11 and 15.
Examples of non-human transgenic mammals are transgenic cows, sheep, goats, pigs, rabbits, rats and mice. Examples of tissue specificity-determining elements are the metallothionein promoter and the T7 promoter.
Animal model systems which elucidate the physiological and behavioral roles of invention polypeptides are produced by creating transgenic animals in which the expression of the polypeptide is altered using a variety of techniques. Examples of such techniques include the insertion of normal or mutant versions of nucleic acids encoding human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide by microinjection, retroviral infection or other means well known to those skilled in the art, into appropriate fertilized embryos to produce a transgenic animal. See, for example, Carver et al., Bio/Technology 11:1263-1270, 1993; Carver et al., Cytotechnology 9:77-84, 1992; Clark et al., Bio/Technology 7:487-492, 1989; Simons et al., Bio/Technology 6:179-183, 1988; Swanson et al., Bio/Technology 10:557-559, 1992;
Velander et al., Proc. Natl. Acad. Sci., USA 89:12003-12007, 1992; Hammer et al., Nature 315:680-683, 1985;
Krimpenfort et al., Bio/Technology 9:844-847, 1991; Ebert et al., Bio/Technology 9:835-838, 1991; Simons et al., Nature 328:530-532, 1987; Pittius et al., Proc. Natl. Acad.
Sci., USA 85:5874-5878, 1988; Greenberg et al., Proc. Natl.
Acad. Sci., USA 88:8327-8331, 1991; Whitelaw et al., Transg. Res. 1:3-13, 1991; Gordon et al., Bio/Technology 5:1183-1187, 1987; Grosveld et al., Cell 51:975-985, 1987;
Brinster et al., Proc. Natl. Acad. Sci., USA 88:478-482, 1991; Brinster et al., Proc. Natl. Acad. Sci., USA 85:836-840, 1988; Brinster et al., Proc. Natl. Acad. Sci., USA
82:4438-4442, 1985; Al-Shawi et al., Mol. Cell. Biol.
10(3):1192-1198, 1990; Van Der Putten et al., Proc. Natl.
Acad. Sci., USA 82:6148-6152, 1985; Thompson et al., Cell 56:313-321, 1989; Gordon et al., Science 214:1244-1246, 1981; and Hogan et al., Manipulating the Mouse Embryo: A
Laboratory Manual (Cold Spring Harbor Laboratory, 1986).
Another technique, homologous recombination of mutant or normal versions of these genes with the native gene locus in transgenic animals, may be used to alter the regulation of expression or the structure of the invention polypeptides (see, Capecchi et al., Science 244:1288, 1989;
Zimmer et al., Nature 338:150, 1989). Homologous recombination techniques are well known in the art.
Homologous recombination replaces the native (endogenous) gene with a recombinant or mutated gene to produce an animal that cannot express native (endogenous) protein but can express, for example, a mutated protein which results in altered expression of the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide.
In contrast to homologous recombination, microinjection adds genes to the host genome, without removing host genes. Microinjection can produce a transgenic animal that is capable of expressing both endogenous and exogenous human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. Inducible promoters can be linked to the coding region of the nucleic acids to provide a means to regulate expression of the transgene.
Tissue-specific regulatory elements can be linked to the coding region to permit tissue-specific expression of the transgene. Transgenic animal model systems are useful for in vivo screening of compounds for identification of ligands, i.e., agonists and antagonists, which activate or inhibit polypeptide responses.
The nucleic acids, oligonucleotides (including antisense), vectors containing same, transformed host cells, polypeptides, as well as antibodies of the present invention, can be used to screen compounds in vitro to determine whether a compound functions as a potential agonist or antagonist to the invention protein. These in vitro screening assays provide information regarding the function and activity of the invention protein, which can lead to the identification and design of compounds that are capable of specific interaction with invention proteins.
In accordance with still another embodiment of the present invention, there is provided a method for identifying compounds which bind to human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptides. The invention proteins may be employed in a competitive binding assay. Such an assay can accommodate the rapid screening of a large number of compounds to determine which compounds, if any, are capable of binding to invention polypeptides. Subsequently, more detailed assays can be carried out with those compounds found to bind, to further determine whether such compounds act as modulators, agonists or antagonists of invention polypeptides.
In accordance with another embodiment of the present invention, transformed host cells that recombinantly express invention polypeptides can be contacted with a test compound, and the modulating effect(s) thereof can then be evaluated by comparing the human netrin, human ABC3 transporter, human ribosomal L3 subtype, or human augmenter of liver regeneration polypeptide-mediated response in the presence and absence of test compound, or by comparing the response of test cells or control cells (i.e., cells that do not express invention polypeptides), to the presence of the compound.
As used herein, a compound or a signal that "modulates the activity" of an invention polypeptide refers to a compound or a signal that alters the activity of the human netrin, the human ABC3 transporter, the human ribosomal L3 subtype, or the human augmenter of liver regeneration polypeptide so that the activity of the invention polypeptide is different in the presence of the compound or signal than in the absence of the compound or signal. In particular, such compounds or signals include agonists and antagonists. An agonist encompasses a compound or a signal that activates polypeptide function.
Alternatively, an antagonist includes a compound or signal that interferes with polypeptide function. Typically, the effect of an antagonist is observed as a blocking of agonist-induced protein activation. Antagonists include competitive and non-competitive antagonists. A competitive antagonist (or competitive blocker) interacts with or near the site specific for agonist binding. A non-competitive antagonist or blocker inactivates the function of the polypeptide by interacting with a site other than the agonist interaction site.
The following examples are intended to illustrate the invention without limiting the scope thereof.
Example I: Contig Assembly
A. Cosmids
Multiple cosmids were used as reagents to initiate walks in YAC and P1 libraries. Clones 16-166N (D16S277), 16-191N (D16S279), 16-198N (D16S280) and 16-140N (D16S276) were previously isolated from a cosmid library (Lerner et al., Mamm. Genome 3:92-100, 1992). Cosmids cCMM65 (D16S84), c291 (D16S291), cAJ42 (ATP6C) and cKG8 were recovered from total human cosmid libraries (made in-house or by Stratagene, La Jolla, CA) using either a cloned insert (CMM65) or sequence-specific oligonucleotides as probe. The c326 cosmid contig and clone 413C12 originated from a flow-sorted chromosome 16 library (Stallings et al., Genomics 13(4):1031-1039, 1992). The c326 contig was comprised of clones 2H2, 77E8, 325A11 and 325B10.
B. YACs
Screening of gridded interspersed-repetitive sequence (IRS) pools from Mark I, Mark II and Mega-YAC libraries with cosmid-specific IRS probes was performed as previously described (Liu et al., Genomics 26:178-191, 1995). IRS probes were made from cosmids 16-166N, 16-191N, cAJ42, 16-198N, 325A11, cCMM65, and 16-140N. Biotinylated YAC probes were generated by nick-translating complex mixtures of IRS products from each YAC. Mixtures of sufficient complexity were achieved by performing independent DNA amplifications of total yeast DNA using various Alu primers (Lichter et al., Proc. Natl. Acad. Sci., USA 87:6634-6638, 1990) and then combining the appropriate reactions containing the most diverse products.
C. P1s
Chromosome walking experiments were done using a single set of membranes which contained the gridded P1 library pools (Shepherd et al., supra. 1994). The gridded filters were kindly provided by Dr. Mark Leppert and the Technology Access Section of the Utah Center for Human Genome Research at the University of Utah. P1 gridded membranes were screened using end probes derived from a set of chromosome 16 cosmids (see above) and P1 clones as they were identified. Both RNA transcripts and bubble-PCR
products were utilized as end probes.
D. Probes
Radiolabeled transcripts were generated using restriction enzyme digested cosmids or P1s (AluI, HaeIII, RsaI, TaqI) as template for phage RNA polymerases T3, T7 and SP6. The T3 and T7 promoter elements were present on the cosmid-derived templates while T7 and SP6 promoter sequences were contained on the P1-based templates.
Transcription reactions were performed as recommended by the manufacturer (Stratagene, La Jolla, CA) in the presence of [α-32P]-ATP (Amersham, Arlington Heights, IL).
Bubble-PCR products were synthesized from restriction enzyme digested P1s (AluI, HaeIII, RsaI, TaqI).
Bubble adaptors with appropriate overhangs and phosphorylated 5' ends were ligated to digested P1 DNA
basically as described for YACs (Riley et al., Nuc. Acids Res. 18:2887-2890, 1990). The sequence of the universal vectorette primer derived from the bubble adaptor sequence was 5'-GTTCGTACGAGAATCGCT-3' (SEQ ID NO:67), and differed from that of Riley and co-workers in having 12 fewer 5' nucleotides. The Tm of the truncated vectorette primer more closely matched that of the paired amplimer from the vector-derived promoter sequence (SP6, T7). The desired bubble-PCR product was gel purified prior to radiolabeling (Feinberg et al., Anal. Biochem. 132:6-13, 1983; Feinberg and Vogelstein, Anal. Biochem. 137:266-267, 1984).
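The Tm comparison mentioned above can be illustrated with a rough estimate. The sketch below applies the Wallace rule (2°C per A/T, 4°C per G/C), which is a common rule of thumb for short oligonucleotides and is an assumption here; the text does not state how Tm was actually calculated. Only the primer sequence comes from the text.

```python
def wallace_tm(primer: str) -> int:
    """Rough Tm estimate (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


# Universal vectorette primer from the text (SEQ ID NO:67).
VECTORETTE_PRIMER = "GTTCGTACGAGAATCGCT"

if __name__ == "__main__":
    print(len(VECTORETTE_PRIMER), "nt, estimated Tm:",
          wallace_tm(VECTORETTE_PRIMER), "deg C")
```

The same function could be applied to the paired SP6 or T7 amplimer to compare estimates; those primer sequences are not given in this passage.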
The specificity of all end probes was determined prior to their use on the single set of gridded P1 filter arrays. Radiolabeled probes were pre-annealed to Cot-1 DNA
as recommended (Life Technologies Inc., Gaithersburg, MD) and then hybridized to strips of nylon membrane to which were bound 10-20 ng each of the following DNAs: the cloned genomic template used to create the probe; one or more unrelated cloned genomic DNAs; cloned vector (no insert);
and human genomic DNA.
Hybridizations were performed in CAK solution (5x SSPE, 1% SDS, 5x Denhardt's Solution, 100 mg/mL torula RNA) at 65°C overnight. Individual end probes were present at a concentration of 5x10^5 cpm/mL. Hybridized membranes were washed to a final stringency of 0.1x SSC/0.1% SDS at 65°C.
The hybridization results were visualized by autoradiography. Probes which hybridized robustly to their respective cloned template while not hybridizing to unrelated cloned DNAs, vector DNA or genomic DNA were identified and used to screen the gridded P1 filters.
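The selection rule described above (signal on the cloned template, no signal on unrelated cloned DNA, vector DNA or genomic DNA) can be written as a simple filter. The following is a minimal sketch; the record layout and the probe names are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class StripResult:
    """Scored autoradiography result for one end probe (hypothetical layout)."""
    probe: str
    template: bool   # signal on the cloned genomic template
    unrelated: bool  # signal on unrelated cloned genomic DNA
    vector: bool     # signal on cloned vector without insert
    genomic: bool    # signal on human genomic DNA


def is_specific(result: StripResult) -> bool:
    """A probe passes if it lights up its template and nothing else."""
    return result.template and not (
        result.unrelated or result.vector or result.genomic
    )


def probes_for_screening(results):
    """Return names of probes suitable for screening the gridded P1 filters."""
    return [r.probe for r in results if is_specific(r)]


if __name__ == "__main__":
    # Invented example data, not results from the text.
    strips = [
        StripResult("probe_1", True, False, False, False),
        StripResult("probe_2", True, False, True, False),   # vector signal
        StripResult("probe_3", False, False, False, False),  # no template signal
    ]
    print(probes_for_screening(strips))
```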
Hybridization to the arrayed P1 pools was performed as described for the nylon membrane strips (above) except that multiple probes were used simultaneously. Positive clones were identified, plated at a density of 200-500 cfu per 100 mm plate (LB plus 25 mg/mL
kanamycin), lifted onto 82 mm HATF membranes (Millipore, Bedford, MA), processed for hybridization (Sambrook et al., supra.) and then rescreened with the complex probe mixture.
A single positive clone from each pool was selected and replated onto a master plate. To identify the colony purified genomic P1 clone and its corresponding probe, multiple P1 DNA dot blots were prepared and each hybridized to individual radiolabeled probes. All hybridizations contained a chromosome 16p13.3 reference probe, e.g. cAJ42, as well as a uniquely labeled P1 DNA
probe.
Example II: Exon Trapping
Genomic P1 clones were prepared for exon trapping experiments by digestion with PstI, double digestion with BamHI/BglII, or by partial digestion with limiting amounts of Sau3AI. Digested P1 DNAs were ligated to BamHI-cut and dephosphorylated vector, pSPL3B, while PstI-digested P1 DNA was subcloned into PstI-cut dephosphorylated vector, pSPL3B.
Ligations were performed in triplicate using 50 ng of vector DNA and 1, 3 or 6 mass equivalents of digested P1 DNA. Transformations were performed following an overnight 16°C incubation, with 1/10 and 1/2 of the transformation being plated on LB (ampicillin) plates.
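As a small worked example of the ligation set-up, the sketch below converts the stated mass equivalents into insert masses. It assumes that "mass equivalent" means a multiple of the vector mass, which the text does not define explicitly.

```python
VECTOR_NG = 50                 # vector DNA per ligation, from the text
MASS_EQUIVALENTS = (1, 3, 6)   # insert:vector mass ratios, from the text


def insert_masses_ng(vector_ng, equivalents):
    """Insert DNA mass (ng) for each ratio, assuming ratio = insert/vector mass."""
    return [vector_ng * eq for eq in equivalents]


if __name__ == "__main__":
    for eq, mass in zip(MASS_EQUIVALENTS,
                        insert_masses_ng(VECTOR_NG, MASS_EQUIVALENTS)):
        print(f"{eq} mass equivalent(s): {mass} ng digested P1 DNA")
```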
After overnight growth at 37°C, colonies were scraped off those plates having the highest transformation efficiency (based on a comparison to "no insert" ligation controls) and miniprepped using the alkaline lysis method. To examine the proportion of the pSPL3B containing insert, a small portion of the miniprep was digested with HindIII, which cuts pSPL3B on each side of the multiple cloning site.
Example III: RNA Preparation
Approximately 10 µg of the remaining miniprep DNA was ethanol precipitated, resuspended in 100 µl of sterile PBS and electroporated into approximately 2 x 10^6 COS-7 cells (in 0.7 ml of ice cold PBS) using a BioRad GenePulser electroporator (1.2 kV, 25 µF and 200 Ω). The electroporated cells were incubated for 10 min. on ice prior to their addition to a 100 mm tissue culture dish containing 10 ml of prewarmed complete DMEM.
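For orientation, the capacitance and resistance settings above imply a nominal RC time constant. The sketch below computes it from tau = R x C; this ignores the resistance of the sample itself, so the value actually delivered to cells suspended in PBS may differ, and the calculation is offered only as an illustration.

```python
def rc_time_constant_ms(capacitance_uF: float, resistance_ohm: float) -> float:
    """Nominal RC time constant in milliseconds (tau = R * C)."""
    tau_seconds = resistance_ohm * capacitance_uF * 1e-6
    return tau_seconds * 1e3


if __name__ == "__main__":
    # Settings quoted in the text: 25 uF and 200 ohm.
    print(f"nominal time constant: {rc_time_constant_ms(25, 200):.1f} ms")
```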
Cytoplasmic RNA was isolated 48 hours post-transfection. The transfected COS-7 cells were removed from tissue culture dishes using 0.25% trypsin/1 mM EDTA (Life Technologies Inc., Gaithersburg, MD).
Trypsinized cells were washed in DMEM/10% FCS and resuspended in 400 µl of ice cold TKM (10 mM Tris-HCl pH 7.5, 10 mM KCl, 1 mM MgCl2) supplemented with 1 µl of RNAsin (Promega, Madison, WI). After adding 20 µl of 10% Triton X-100, the cells were incubated for 5 min. on ice.
The nuclei were removed by centrifugation at 1200 rpm for 5 min. at 4°C. Thirty microliters of 5% SDS was added to the supernatant, with the cytoplasmic RNA being further purified by three rounds of extraction using phenol/chloroform/isoamyl alcohol (24:24:1). The cytoplasmic RNA was ethanol precipitated and resuspended in [?]0 µl of H2O.
Reverse transcription and PCR were performed on the cytoplasmic RNA prepared above as described (Church et al., supra. 1994) using commercially available exon trapping oligonucleotides (Life Technologies Inc., Gaithersburg, MD). The resulting CUA-tailed products were shotgun subcloned into pAMP10 as recommended by the manufacturer (Life Technologies Inc.). Random clones from each ligation were analyzed by colony PCR using secondary PCR primers (Life Technologies Inc.).
Miniprep DNA containing the pAMP10/exon traps was prepared from overnight cultures by alkaline lysis using the EasyPrep manifold or a QIAwell 8 system according to the manufacturers' instructions (Pharmacia, Piscataway, NJ and Qiagen Inc., Chatsworth, CA, respectively). DNA products containing trapped exons, based on comparison to the 177 bp "vector only" DNA product, were selected for sequencing.
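The size comparison against the 177 bp "vector only" product can be sketched as a filter over colony PCR product sizes. The clone names, the sizes and the gel-resolution tolerance below are hypothetical; only the 177 bp reference comes from the text.

```python
VECTOR_ONLY_BP = 177  # "vector only" product size, from the text


def clones_with_trapped_exons(product_sizes, tolerance_bp=10):
    """Return clone names whose PCR product is clearly larger than vector only.

    product_sizes maps clone name -> estimated product size in bp.
    tolerance_bp is an assumed allowance for gel sizing error.
    """
    return sorted(
        clone for clone, size in product_sizes.items()
        if size > VECTOR_ONLY_BP + tolerance_bp
    )


if __name__ == "__main__":
    # Invented example sizes, not data from the text.
    sizes = {"clone_a": 177, "clone_b": 310, "clone_c": 182, "clone_d": 265}
    print(clones_with_trapped_exons(sizes))
```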
Example IV: Sequencing
DNA sequencing was performed using Pharmacia ALF and Applied Biosystems 377 PRISM automated DNA sequencers (Piscataway, NJ, and Foster City, CA). DNA sequences were aligned using Sequencher DNA analysis software (Genecodes, Ann Arbor, MI). DNA and protein database searches were performed using the BLASTN (Altschul et al., J. Mol. Biol.
215:403-410, 1990) and BLASTX (Altschul et al., supra.
1990; Gish et al., Nat. Genet. 3:266-272, 1993) programs.
SASE sequences were analyzed by processing BLAST (Altschul et al., supra. 1990; Gish et al., supra. 1993) and FASTA
(Lipman et al., Science 227:1435-1441, 1985) searches.
Protein sequences were analyzed using MacVector (Oxford Molecular Group, Campbell, CA), BCM Launcher (Smith et al., Genome Research 6:454-462, 1996), ClustalW (Thompson et al., Nucleic Acids Res. 22:4673-4680, 1994), and PSORT
(Nakai et al., Genomics 14:897-911, 1992).
Example V: RT-PCR, RACE, SASE and cDNA Isolation
Based upon the sequence determined (above) two oligonucleotide primers (Table II) were designed for each exon trap using Oligo 4.0 (National Biosciences Inc., Plymouth, MN).
To determine which tissue-specific library to screen for transcript or cDNA, RT-PCR reactions and/or PCR
reactions were performed using different tissue-derived RNAs and/or cDNA libraries, respectively, as template with the oligonucleotide primers designed for each exon trap (above).
The oligonucleotides designed from the exons (Table II), were then used in one or more of the following positive selection formats to screen the corresponding tissue-specific cDNA library.
For RT-PCR experiments, the first oligonucleotide was used as a sense primer and the second oligonucleotide was used as an antisense primer. RT-PCR was performed as described using polyA+ RNA from adult brain and placenta (Kawasaki, In PCR Protocols: A Guide to Methods and Applications, Eds. Innis et al., Academic Press, San Diego, CA, pp. 21-27, 1990). All PCR products were cloned using the pGEM-T vector as described by the manufacturer (Promega, Madison, WI).
To clone sequences 3' to selected exon traps, rapid amplification of cDNA ends (RACE) was performed as described (Frohman, PCR Met. Appl. 4:S40-S58, 1994). In 3' RACE experiments, the first oligonucleotide was used as the external primer and the second oligonucleotide was used as the internal primer.
For the Genetrapper cDNA Positive Selection System, the first oligonucleotide primer was biotinylated and used for direct selection, while the second oligonucleotide was used in the repair.
In addition to exon trapping, the cloned contig was also screened using cDNA selection essentially as described (Parimoo et al., Anal. Biochem. 228:1-17, 1995), using the genomic P1 clones from this interval (Dackowski et al., Genome Res. 6:515-524, 1996). Other coding sequence was obtained by SAmple SEquencing (SASE).
SASE was performed as a functional genomics method for gene identification. Briefly, DNA from individual P1s was partially digested with Sau3A and 3 kb fragments were subcloned into the pBluescriptKS+ plasmid (Stratagene, La Jolla, CA). Subclones were sequenced from both ends to generate sequences semi-randomly from the P1 clone.
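To illustrate the digestion step, the sketch below locates Sau3AI recognition sites (GATC, cut immediately 5' of the G) in a sequence and lists the fragments a complete digest would give. A partial digest, as used for SASE, yields unions of adjacent complete-digest fragments, which is why roughly 3 kb inserts can be size-selected. The input sequence is an invented toy example, not P1 sequence.

```python
SAU3AI_SITE = "GATC"  # Sau3AI cuts 5' of the G: ^GATC


def sau3ai_cut_positions(seq: str):
    """Return 0-based indices at which Sau3AI would cut (start of each GATC)."""
    seq = seq.upper()
    positions = []
    start = seq.find(SAU3AI_SITE)
    while start != -1:
        positions.append(start)
        start = seq.find(SAU3AI_SITE, start + 1)
    return positions


def complete_digest(seq: str):
    """Return the fragments of a linear sequence after a complete digest."""
    seq = seq.upper()
    bounds = [0] + [p for p in sau3ai_cut_positions(seq) if p > 0] + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    toy = "AAGATCTTTGATCCC"  # invented example sequence
    print(sau3ai_cut_positions(toy))
    print(complete_digest(toy))
```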
Example VI: Nucleotide Sequence Analysis
hNET: A random shotgun library was prepared from the 53.8B P1 clone (Figure 18) by subcloning randomly sheared P1 DNA into the pAMP10 vector (Life Technologies Inc., Gaithersburg, MD) essentially as described (Andersson et al., (1994) Anal. Biochem. 218:300-308). P1 DNA was randomly sheared using a nebulizer (Hudson RCI, Temecula, CA). The library was initially screened with a 6 kb XhoI fragment, which had been shown to contain the netrin encoding exon traps (Figure 18). The library was subsequently screened with an adjacent 3.5 kb XhoI fragment in order to obtain additional clones for sequencing.
Positive clones were sequenced using forward and reverse vector primers as previously described (The American PKD1 Consortium (1995) Hum. Mol. Genet. 4:575-582).
The genomic sequence was edited and assembled using Sequencher (GeneCodes, Ann Arbor, MI). The coding region was predicted using the World Wide Web version of the GRAIL2 program (Uberbacher and Mural (1991) Proc. Natl.
Acad. Sci., USA 88:11261-11265; Xu et al. (1994) Genet.
Eng. N.Y. 16:241-253) and a MacVector (Oxford Molecular Group, Campbell, CA) Pustell DNA/protein matrix analysis comparing the genomic sequence (translated in all reading frames) to the chicken netrins. Database searches were performed using BLASTN (Altschul et al. (1990) J. Mol.
Biol. 215:403-410) and BLASTX (Altschul et al., 1990, supra; Gish and States (1993) Nat. Genet. 3:266-272).
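Translating a genomic sequence in all reading frames, as done before the protein matrix comparison above, can be sketched as follows. This is a generic six-frame translation using the standard genetic code; it is not the MacVector or BLASTX implementation, and the example sequence is invented.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> dict:
    """Return translations for frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in sorted(six_frames("ATGGCCTGA").items()):
        print(name, protein)
```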
RT-PCR: Both adult (brain, heart, kidney, leukocytes, liver, lung, a lymphoblastoid cell line, placenta, spleen, and testis) and fetal (kidney and brain) cDNA libraries were prescreened for the presence of netrin cDNAs by PCR as described (Van Raay et al., 1996, supra).
Nested RT-PCR was utilized to clone transcribed sequences from the netrin gene. Briefly, spinal cord polyA+ RNA
(Clontech, Palo Alto, CA) was reverse transcribed using random primers as described (Kawasaki, 1990 In "PCR
Protocols: A Guide to Methods and Applications" (M. A.
Innis, D.H. Gelfand, J.J. Sninsky, and T.J. White. Eds.), pp. 21-27, Academic Press, Inc., San Diego).
Primers for PCR (Table IV) were designed based on the exons predicted from the analysis of the genomic sequence and used to amplify spinal cord RNA since spinal cord has been previously shown to express low levels of chicken netrin (Serafini et al. supra.). Nested PCR was required to detect RT-PCR products from human spinal cord RNA. Spinal cord RNA was reverse transcribed with random primers and primary PCR was performed in the presence of 2.5 M betaine (Sigma Chemical Co., St. Louis, MO) using the primers designed from the gene model (Table IV). The primary PCR reactions were then diluted 1:20 and secondary PCR was performed on 1 µL of the diluted primary reactions using nested primers (also designed from the gene model), again in the presence of betaine. The inclusion of betaine at a final concentration of 2.5 M in the PCR reactions dramatically increased the purity and yield of the human netrin RT-PCR products (see, for example, International Publication No. WO 96/12041; Reeves et al. (1994) Am. J.
Hum. Genet. 55:A238; Baskaran et al. (1996) Genome Research 6:633-638).
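The reagent arithmetic behind the nested PCR set-up can be sketched as below. The 2.5 M final betaine concentration and the 1:20 dilution come from the text; the 5 M stock concentration, the 50 µL reaction volume, and the reading of "1:20" as one part in twenty total are assumptions made only for illustration.

```python
def stock_volume_ul(final_molar: float, stock_molar: float,
                    reaction_ul: float) -> float:
    """Volume of stock needed so the reaction reaches final_molar (C1V1 = C2V2)."""
    if final_molar > stock_molar:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_molar * reaction_ul / stock_molar


def dilution_volumes_ul(sample_ul: float, fold: int):
    """Return (sample, diluent) volumes for a 1:fold dilution (1 part in fold)."""
    return sample_ul, sample_ul * (fold - 1)


if __name__ == "__main__":
    # Assumed: 5 M betaine stock and a 50 uL reaction.
    print("betaine stock per reaction (uL):", stock_volume_ul(2.5, 5.0, 50.0))
    print("primary PCR dilution (sample, diluent uL):", dilution_volumes_ul(1.0, 20))
```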
RT-PCR products were subcloned using pGEM-T
(Promega, Madison, WI) as recommended by the manufacturer.
The resulting RT-PCR clones were sequenced with vector primers and internal primers using the ABI dye terminator chemistry (Perkin Elmer, Foster City, CA) and an ABI 377 automated sequencer (Perkin Elmer, Foster City, CA).
Multiple sequence alignments were performed using ClustalW
(Thompson et al., (1994) Nucleic Acids Res. 22:4673-4680).
Sequence analysis of the RT-PCR products indicated that hNET contains at least six exons. The RT-PCR data indicate that the fourth predicted exon is actually split by an intron in the human netrin gene and is present as two exons. Three of the RT-PCR exons were shown to be identical to the original exon traps. Aside from the extra exon, the gene model is nearly identical to the RT-PCR products. The cDNA coding sequence, predicted protein product and full length sequence are shown in Figures 4A
through 4C, respectively.
Northern blot analysis: Genomic and RT-PCR probes were radiolabeled (Feinberg and Vogelstein, Anal. Biochem.
132:6-13, 1983) and used to probe Northern blots containing RNAs from a variety of adult tissues (Clontech, Palo Alto, CA), including a panel of RNAs from different neural tissues including spinal cord. In addition, a human RNA
Master Blot (Clontech, Palo Alto, CA) containing RNAs from 50 different adult and fetal tissues was screened as recommended by the manufacturer.
hABC3: A human lung cDNA library (LTI, Gaithersburg, MD) was screened with the GeneTrapper system (LTI, Gaithersburg, MD) using capture and repair oligonucleotides (5'-CATTGCCCGTGCTGTCGTG-3' (SEQ ID NO:52)
and 5'-CATCGCCGCCTCCTTCATG-3' (SEQ ID NO:53), respectively) designed from trapped exon L48757, the 5' most trapped exon with homology to murine ABC1. Direct cDNA library screening was also performed using an RT-PCR clone as probe. 5' RACE (Frohman, M.A. in Methods Enzymol. (J. N.
Abelson and M.I. Simon Eds.) pp. 340-356, Academic Press, San Diego, CA 1993) was used to isolate additional 5' sequences from the ABC3 transcript.
Northern blot analysis: A 679 bp fragment from the 3' untranslated region (UTR) of the ABC3 cDNA was radiolabeled by random priming (Feinberg et al., supra.
1983) and used to probe a multiple tissue northern blot (Clontech, Palo Alto, CA) under conditions recommended by the manufacturer.
Identification of coding sequence for the novel ABC
transporter: The gene for a novel ATP binding cassette (ABC) transporter, designated ABC3, has been mapped to the PKD1 locus on chromosome 16 (Burn et al., Genome Res.
6:525-537, 1996). Eight exons from the hABC3 gene were obtained from the 30.1F, 64.12C and 96.4B P1 clones using exon trapping. See Figure 16, showing the genomic interval surrounding the hABC3 gene at the top, with NotI sites, DNA
markers, and distance in kilobases (in kb) also being shown. Genomic P1 clones from the interval which contain sequence from the hABC3 gene are shown below the genomic map. The relative position of the hABC3 cDNA is provided below the P1 clones, with the selected cDNA, trapped exons, RT-PCR clones, and cDNAs being indicated. Trapped exons and RT-PCR clones used in the isolation of additional hABC3 sequences have been labeled. The discontinuity in the line for clone ABCgt.1 represents the absence of an alternatively spliced exon.
Seven of these trapped exons encoded sequences having homology to murine ABC1 and ABC2 based on BLASTX
analysis (Altschul et al., supra. 1990; Gish et al., supra.
1993), with sequences from the trapped exons L48758, L48759, and L48760 having highest homology. Sequences encoded by the trapped exon L48760 also had homology to a Caenorhabditis elegans ABC transporter predicted from genomic sequence (Wilson et al., supra.).
cDNA selection yielded a single 261 bp cDNA clone which mapped near the 5' end of the ABC3 gene. Like L48760, this clone encoded sequences having homology to the hypothetical C. elegans ABC transporter. Initial analysis of the SASE results from the 30.1F P1 clone indicated that 4 of the 164 reactions encoded sequences with homology to ABC1 or ABC2. Subsequent comparison of the SASE data to the final hABC3 cDNA indicated that an additional seven sequencing reactions contained coding sequences from the ABC3 gene. A total of 1.6 kb of ABC3 coding sequence aligned with the SASE data. In that only 3.5 kb of coding sequence from the 5' end of the hABC3 gene maps to the 30.1F
P1 clone, this represents a level of 45% coverage for the SASE analysis.
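The coverage figure above follows directly from the quoted numbers. A short sketch of the arithmetic, using only values stated in the text:

```python
SASE_ALIGNED_KB = 1.6        # ABC3 coding sequence aligned with SASE data
CODING_ON_P1_KB = 3.5        # hABC3 coding sequence mapping to P1 clone 30.1F
REACTIONS_TOTAL = 164        # SASE sequencing reactions from 30.1F
REACTIONS_WITH_CODING = 4 + 7  # initial hits plus those found by later comparison


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"coding sequence coverage: {percent(SASE_ALIGNED_KB, CODING_ON_P1_KB):.1f}%")
    print(f"reactions with coding sequence: "
          f"{percent(REACTIONS_WITH_CODING, REACTIONS_TOTAL):.1f}%")
```

The first value, about 45.7%, corresponds to the 45% coverage reported; the second shows that roughly 1 in 15 reactions sampled coding sequence.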
Assembly and analysis of a cDNA for the novel ABC
transporter: Two complementary approaches were employed to assemble the full-length hABC3 cDNA. First, RT-PCR was utilized to link the trapped exons, selected cDNA, and SASE
data. Secondly, cDNA library screening was performed using direct selection as well as radiolabeled probes.
Using primers designed from the trapped exons L48757, L48758, L48760 and L75924, three RT-PCR products containing 3.3 kb of coding sequence were cloned (Table I
and Figure 16). An additional RT-PCR primer was designed from a region of identity between the selected cDNA and the SASE data (Table I). A 900 bp RT-PCR clone was obtained using the latter primer in conjunction with a trapped exon derived primer. In total, 4.2 kb of coding sequence was obtained using RT-PCR.
Several cDNAs were cloned using the GeneTrapper direct selection system and oligos designed from the 5' most trapped exon encoding sequences with homology to ABC1 (trapped exon L48747). The longest clone isolated with the GeneTrapper system was 5719 bp in length (ABCgt.1) (Figure 8). This cDNA contains a 792 bp 3' untranslated region with a consensus polyadenylation/cleavage site 20 bp upstream of the polyA tail. An additional cDNA clone (ABC.5) was isolated using a radiolabeled 1.1 kb RT-PCR
product (ABC3-12) as a probe (Figure 16). The 5' end of the ABC3 cDNA was further characterized using 5' RACE, with several RACE products containing multiple in-frame stop codons upstream of the start methionine.
Sequence analysis indicated that clone ABCgt.1 lacks 147 bp of sequence found in the RT-PCR clones and the cDNA clone ABC.5. The additional 147 bp segment is likely to be the result of alternative splicing, in that it does not interrupt the open reading frame. The presence of both transcript populations has been confirmed by PCR using primers flanking the alternatively spliced exon.
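The reading-frame argument above rests on the segment length being a multiple of three. A minimal sketch of that check follows; note that divisibility by three is necessary but not sufficient, since the segment must also lack in-frame stop codons, which cannot be verified from the length alone.

```python
def preserves_reading_frame(segment_bp: int) -> bool:
    """True if inserting or skipping a segment of this length keeps the frame."""
    return segment_bp % 3 == 0


def codons_in_segment(segment_bp: int) -> int:
    """Number of whole codons encoded by a frame-preserving segment."""
    if not preserves_reading_frame(segment_bp):
        raise ValueError("segment length is not a multiple of three")
    return segment_bp // 3


if __name__ == "__main__":
    # 147 bp is the alternatively spliced segment length given in the text.
    print(preserves_reading_frame(147), codons_in_segment(147))
```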
A 6.4 kb cDNA has been assembled for the hABC3 transporter. The assembled cDNA contains a 5116 nucleotide long open reading frame encoding 1705 amino acids, with the predicted protein having a molecular weight of 191 kDa.
The proposed start methionine is 50 bp upstream of the 5' end of clone ABCgt.1. Although the sequence surrounding the start methionine matches the Kozak sequence in only 6 of 10 positions (Kozak, J. Cell Biol. 115:887-903, 1991), the two positions which have been shown to be critical for function (an A at -3 and a G at +4) are conserved in hABC3.
The hABC3 cDNA contains a 792 bp 3' UTR with a consensus polyadenylation/cleavage site 20 bp upstream of the polyA
tract.
A 6.8 kb transcript is detected by a 3' UTR cDNA
probe on northern blots with highest levels of expression being observed in lung with lesser amounts in brain, heart, and pancreas. Significantly lower levels of expression were observed in placenta and skeletal muscle after longer exposure times. The ABC3 transcript was not detected in either liver or kidney.
RPL3L (SEM L3): The longest cDNA is 1548 nucleotides in length (Figure 11). All three cDNAs have an open reading frame (ORF) of 1224 nucleotides, with the longest cDNA containing a 48 nucleotide 5' untranslated region. An in-frame stop codon at position 7 is followed by the Kozak initiation sequence CCACCATGT (SEQ ID NO:68) (Kozak, supra.). The 3' UTRs of the three cDNAs vary in length and lack a consensus polyadenylation/cleavage site.
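The two Kozak positions described earlier as critical for function (-3 and +4, counting the A of the ATG as +1) can be read off the initiation sequence quoted above. The sketch below is illustrative only; the helper name is hypothetical, and the sequence is the one given as SEQ ID NO:68.

```python
def kozak_critical_positions(context: str, atg_index: int) -> dict:
    """Return the bases at -3 and +4 relative to the ATG (A of ATG is +1)."""
    seq = context.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("need at least 3 bases upstream and 1 downstream")
    return {"-3": seq[atg_index - 3], "+4": seq[atg_index + 3]}


RPL3L_INITIATION = "CCACCATGT"  # SEQ ID NO:68, from the text

if __name__ == "__main__":
    atg = RPL3L_INITIATION.find("ATG")
    print(kozak_critical_positions(RPL3L_INITIATION, atg))
```

For this sequence the base at -3 is A and the base at +4 is T, which is consistent with a serine codon following the start methionine.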
The longest cDNA was compared to the human, bovine and murine ribosomal L3 genes. At the nucleotide level there is only 74% identity between the RPL3L (SEM L3) cDNA and the consensus from these other ribosomal L3 cDNAs.
This is in sharp contrast to the 98% identity shared between human, bovine, and murine L3 nucleotide sequences.
There is no similarity between the 3' UTR of the cDNAs isolated here and the other L3 genes.
hALR: Sequences were cloned from the human ALR
gene by 3' RACE using primers (e.g., external 5'-TGGCCCAGTTCATACATTTA-3' (SEQ ID NO:69) and internal 5'-TTACCCCTGTGAGGAGTGTG-3' (SEQ ID NO:70)) designed from the exon trap. A total of 468 bp have been obtained from the human ALR gene (Figure 13).
Example VII: Amino Acid Sequence Analysis
hNET: hNET cDNA has at least 210 bp of 5' untranslated sequence, a 5' start methionine codon, a 3' stop codon (TGA) and is predicted to be 580 amino acids in length (Figure 4), with the common domain structure of the netrin family being conserved (Figure 20A). Overall, the human netrin was found to have higher homology to chicken netrin-2 than netrin-1, i.e., 56.3% versus 53.9%. As is the case with the other members of the netrin family, the region of greatest conservation includes the three EGF repeats, while the C-terminal domains are less well conserved (Figure 20A). The EGF repeats are 78.7% and 82.2% identical between the human netrin and chicken netrin-1 and netrin-2, respectively, and 66.3% identical when compared to UNC-6. The C-terminal domains of the human netrin and chicken netrin-1 and -2 are 41.9% and 42.5% identical, respectively, with the same domain of UNC-6 being only 29.4% identical to human netrin. Overall, the human netrin more closely resembles the chicken netrins and UNC-6 than Drosophila NETA and NETB, since NETA contains an expansion in the C-domain while NETB contains additional sequences in the VI and V-1 domains (Harris et al., 1996, supra; Mitchell et al., 1996, supra).
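Percent identity values such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over aligned columns in which neither sequence has a gap; the text does not say which convention was used, and the aligned strings are invented toy data rather than netrin sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        identical += res_a == res_b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Invented toy alignment: 4 identical residues over 5 ungapped columns.
    print(percent_identity("CNG-KC", "CDGRKC"))
```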
The Structure of the Netrin Genes is Conserved Between Drosophila and Human
The positions of the introns in the human gene were compared to the encoded protein to determine if the overall gene structure of the netrin/UNC-6 family is conserved (Figure 20B). This analysis revealed striking similarities between the Drosophila netrin genes and the human netrin gene. In the human gene, exon 1 contains the signal peptide, domain VI and the first EGF domain (domain V-1), while exons two and three each contain an EGF repeat, domains V-2 and V-3, respectively. Exons 4, 5, and 6 contain portions of the C-domain. With the exception of an additional intron in the C-domain, this motif/exon arrangement is conserved in the Drosophila netrin genes.
The coding regions of the two Drosophila netrin genes have been shown to be highly conserved with each being disrupted by six introns that occur in homologous sites (Harris et al., 1996, supra). The position of five of the six Drosophila introns was found to be conserved in the human gene (Figure 20B). The UNC-6 gene contains 12 introns in the coding region (Ishii et al., 1992, supra), the position of five of which correlate with the positions of the introns in the human gene. Interestingly, the sixth Drosophila intron, which does not have a counterpart in the human gene, is also the only intron from Drosophila that is not conserved in the UNC-6 gene.
hABC3: Database searches revealed homology between ABC3 and murine ABC1 and ABC2 (Luciani et al., supra. 1994). In addition to the murine ABC1 and ABC2 proteins, ABC3 also shows homology to the putative C. elegans protein encoded by the cosmid sequence of C48B4.4 (Wilson et al., supra.).
Overall, ABC3, ABC1, ABC2 and sequences encoded by C.
elegans cosmid C48B4.4 have highest homology in the regions surrounding the ATP binding cassettes (Figure 17).
However, when one compares the sequence between the first ATP binding cassette and the second transmembrane domain, referred to as the linker domain (Luciani et al., supra. 1994), ABC3 shares much lower homology to these same 3 proteins listed above (amino acids 765-1044 in ABC3 in Figure 17). The linker domain of ABC3 is approximately 200 residues shorter than the linker domain present in ABC1 and ABC2. Consequently, an optimum protein alignment positions a gap in the ABC3 sequence immediately C-terminal of a conserved HH1 hydrophobic domain (Luciani et al., supra.
1994), located at position 917 through 959 in ABC3 (Figure 17). Additional comparisons indicate that the ABC3 linker domain is nearly identical in size to the linker domain encoded by C. elegans cosmid C48B4.4. As is the case with ABC1 and ABC2, the linker domain of ABC3 contains numerous polar residues and several potential phosphorylation sites.
Further analysis of the deduced ABC3 protein sequence revealed additional similarities to the ABC1/ABC2 subfamily. Based on PSORT analysis (Nakai et al., supra.), the ABC3 protein does not appear to contain an N-terminal signal sequence and is likely to be a Type III membrane protein (Singer, Annu. Rev. Cell Biol. 6:247-296 1990), with sequences N-terminal of the first transmembrane domain being located in the cytoplasm (Figure 17). Similar topography has been described for ABC1 (Luciani et al., supra. 1994) and all other ABC transporters described to date (Higgins, supra. 1992). As mentioned above, murine ABC1 and ABC2 have been shown to contain a novel hydrophobic region, HH1, within the conserved linker domain. Although the HH1 domain is not well conserved at the amino acid level in ABC3, an HH1 domain does appear to be present within the linker region based on hydrophilicity analysis. A similar HH1 domain is also found in sequences encoded by cosmid C48B4.4 from C. elegans. In all these cases, the HH1 domain is predicted to have a β-sheet conformation.
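By way of illustration only, the kind of hydrophilicity analysis used to locate a hydrophobic region such as HH1 can be sketched as a sliding-window average over a per-residue scale. The specification does not name the scale or window used; the sketch below assumes the Kyte-Doolittle hydropathy scale and a 9-residue window, and the function name and toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values; positive values are hydrophobic.
# ASSUMPTION: the specification does not state which scale was used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq (one-letter codes)."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Toy sequence (hypothetical, not taken from ABC3): a hydrophobic core
# flanked by charged residues.
toy = "KKKKDDDD" + "LIVLIVLIV" + "DDDDKKKK"
profile = hydropathy_profile(toy, window=9)
peak = profile.index(max(profile))  # 0-based start of the most hydrophobic window
```

A hydrophobic domain shows up as a run of windows with a high mean score; the cutoff applied to call such a domain is a further choice that the specification does not state.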
RPL3L (SEM L3): The RPL3L (SEM L3) cDNA open reading frame predicts a 407 amino acid polypeptide of 46.3 kD (Figure 11). In vitro transcription-translation of RPL3L (SEM L3) cDNA resulted in a protein product with an apparent molecular weight of 46 kD, which is in close agreement with the predicted weight of 46.3 kD.
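As an illustrative sketch only, a predicted molecular weight of this kind can be computed from a deduced amino acid sequence by summing average residue masses and adding one water molecule for the free termini. The mass table below uses approximate standard average residue masses; the function name is hypothetical, and the specification does not state which program produced the 46.3 kD figure. For orientation, 46.3 kD over 407 residues corresponds to about 113.8 Da per residue.

```python
# Approximate average residue masses in daltons (amino acid minus water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free N- and C-termini


def molecular_weight_da(seq):
    """Return the average molecular weight (Da) of an unmodified polypeptide."""
    seq = seq.upper()
    unknown = set(seq) - set(AVG_RESIDUE_MASS)
    if unknown:
        raise ValueError("unsupported residue codes: " + ", ".join(sorted(unknown)))
    if not seq:
        return 0.0
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


# Hypothetical short peptide, used only to show the call.
weight = molecular_weight_da("MKKRA")
```

The calculation ignores post-translational modifications, which is one reason an apparent weight measured on a gel can differ slightly from the predicted value.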
Two nuclear targeting sequences, which are 100% conserved between man, mouse and cow, have diverged slightly in the RPL3L (SEM L3) amino acid sequence. The first targeting site is the 21 amino acid N-terminal oligopeptide. The serine and arginine present at positions 13 and 19, respectively, in human, bovine and murine L3 are replaced with histidines in RPL3L (SEM L3) (Figure 12).
The second potential nuclear targeting site is the bipartite motif. Here the human, bovine and murine proteins have a KKR-(aa)12-KRR at positions 341-358, while the SEM L3 gene has KKR-(aa)10-HHSRQ at positions 341-358.
The second half of this bipartite motif, while remaining basic, does not match those found in other nuclear targeting motifs (Simonic et al., supra. 1994). Overall, there is 77.2% amino acid identity between the RPL3L (SEM L3) and the consensus from the other mammalian L3 ribosomal genes, with 56% of the nucleotide differences between RPL3L (SEM L3) and the human L3 being silent.
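The bipartite motif comparison above can be illustrated with a short pattern search. The sketch below looks for a KKR cluster, a fixed-length spacer and a KRR cluster, which together span 18 residues (3 + 12 + 3), the same length as the 341-358 interval. The function name, parameters and toy sequences are hypothetical; the actual L3 sequences are shown in the figures rather than reproduced here.

```python
import re


def find_bipartite(seq, first="KKR", spacer=12, second="KRR"):
    """Return 1-based (start, end) positions of first-(aa)spacer-second matches."""
    # A lookahead lets overlapping candidate sites be reported as well.
    pattern = re.compile(
        "(?=(" + re.escape(first) + ".{" + str(spacer) + "}" + re.escape(second) + "))"
    )
    hits = []
    for match in pattern.finditer(seq):
        start = match.start(1) + 1                    # convert to 1-based
        end = match.start(1) + len(match.group(1))    # inclusive end
        hits.append((start, end))
    return hits


# Toy sequences (hypothetical): one carries the full motif, the other has a
# divergent second half and therefore is not reported.
conserved = "MAAA" + "KKR" + "A" * 12 + "KRR" + "GGG"
diverged = "MAAA" + "KKR" + "A" * 10 + "HHSRQ" + "GGG"
print(find_bipartite(conserved))  # [(5, 22)]
print(find_bipartite(diverged))   # []
```

An exact-match search of this kind only flags whether the canonical second half is present; it does not assess whether a divergent but still basic second half remains functional.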
hALR: hALR cDNA sequences encode a 119 amino acid protein which is 84.8% identical and 94.1% similar to the rat ALR protein (see Figures 13 and 14).
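For illustration, percent identity and percent similarity figures such as those above can be derived from a pairwise alignment as sketched below. Two choices here are assumptions, since the specification does not state them: gapped columns are excluded from the denominator, and "similar" means that two different residues fall in the same conservative-substitution group. The grouping and the toy alignment are hypothetical.

```python
# ASSUMPTION: a simple conservative-substitution grouping; the specification
# does not say which similarity criterion or scoring matrix was used.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(SIMILAR_GROUPS) for aa in group}


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both inputs must have equal length, with '-' marking gaps. Columns that
    contain a gap are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Toy alignment (hypothetical, not the hALR/rat ALR alignment).
ident, simil = identity_and_similarity("MKLVDEA-", "MRLIEQAG")
```

Because the similarity figure depends on the grouping or matrix chosen, values computed this way are only comparable when the same convention is used throughout.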
Although the invention has been described with reference to the disclosed embodiments, it should be understood that various modifications can be made without departing from the spirit of the invention. Accordingly, the invention is limited only by the claims which follow the Sequence Listing.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: GENZYME CORPORATION
(ii) TITLE OF INVENTION: NOVEL HUMAN CHROMOSOME 16 GENES, COMPOSITIONS, METHODS OF MAKING AND USING SAME
(iii) NUMBER OF SEQUENCES: 83
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: SCOTT & AYLEN
(B) STREET: 60 QUEEN STREET
(C) CITY: OTTAWA
(D) PROVINCE: ONTARIO
(E) COUNTRY: CANADA
(F) POSTAL CODE: K1P 5Y7
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: Windows
(D) SOFTWARE: FastSEQ for Windows Version 2.0b
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: 2,256,486
(B) FILING DATE: 16-JAN-1997
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/665,259
(B) FILING DATE: 17-JUN-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/720,614
(B) FILING DATE: 01-OCT-1996
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/762,500
(B) FILING DATE: 09-DEC-1996
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: CHRISTINE J. COLLARD
(B) REGISTRATION NUMBER: 10030
(C) REFERENCE/DOCKET NUMBER: PAT 43578W-1
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (613) 237-5160
(B) TELEFAX: (613) 787-3558
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 179 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Leu His Leu Glu Gly Pro Phe Ile Ser Arg Glu Lys Arg Gly Thr His Pro Glu Ala His Leu Arg Ser Phe Glu Ala Asp Ala Phe Gln Asp Leu Leu Ala Thr Tyr Gly Pro Leu Asp Asn Val Arg Ile Val Thr Leu Asp Pro Glu Leu Gly Arg Ser His Glu Val Phe Arg Thr Leu Thr Xaa Arg Ser Ile Cys Val Ser Leu Gly His Ser Val Ala Asp Leu Arg Ala Ala Glu Asp Ala Val Trp Ser Gly Ala Thr Phe Ile Thr His Leu Phe Asn Ala Met Leu Pro Phe His His Arg Asp Pro Gly Ile Val Gly Leu Leu Thr Ser Asp Arg Pro Ala Gly Arg Cys Ile Phe Tyr Gly Met Ile Ala Asp Gly Thr His Thr Asn Pro Ala Ala Leu Arg Ile Ala His Arg Ala His Pro Gln Gly Leu Val Leu Val Thr Asp Ala Ile Pro Ala Leu Gly Leu Gly Asn Gly Arg His Thr Leu Gly Gln Gln Glu Val Glu Val Asp Gly Leu Thr
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
His Leu Glu Gly Pro Phe Ile Ser Lys Arg Gly His Pro Glu Ser Tyr Gly Asn Ile Val Thr Pro Glu Leu Glu Val Ser Gly His Ser Ala Leu Glu Ala Val Ser Gly Ala Ile Thr His Leu Phe Asn Ala Met His His Arg Asp Pro Gly Gly Leu Leu Thr Ser Leu Tyr Gly Ile Asp Gly His Thr Ala Leu Arg Ile Ala Gly Leu Val Leu Val Thr Asp Ala Ile Ala Leu Gly Gly His Leu Gly Gln Val Gly Leu
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Leu His Leu Glu Gly Pro Lys Gly Thr His Arg Ala Ala Asp Leu Asp Val Thr Leu Pro Glu Glu Val Leu Ile Val Ser Gly His Ser Ala Leu Ala Gly Thr Phe Thr His Leu Asn Ala Met Pro Gly Leu Leu Ile Gly Ile Ala Asp Gly His Ala Arg Ala Arg Leu Leu Val Thr Asp Ala Gly
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 55 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Leu His Glu Pro Ser Glu Lys Gly His Arg Asp Leu Gly Asp Thr Glu Ile Val Ser Gly His Ser Ala Ala Ala Gly Ala Thr Phe Thr His Leu Asn Ala Met Pro Gly Gly Ile Asp Gly His Asn Arg Ile Leu Val Thr Asp Ile Ala Gly Leu Gly Thr
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Val
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Cys Asp Cys His Pro Val Gly Ala Ala Gly Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Thr Cys Asn Arg Cys Ala Lys Gly Gln Gln Ser Arg Ser Pro Ala Pro Cys
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
Cys Cys His Pro Val Gly Gly Cys Asn Gln Gly Gln Cys Cys Lys Gly Val Thr Gly Thr Cys Asn Arg Cys Ala Lys Gly Gln Gln Ser Arg Ser Val Pro Cys
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
His Ser Pro Ser Leu Ser Ala Glu Thr Pro Ile Pro Gly Pro Thr Glu Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
Ile Ser Pro Asp Cys Asp Ser Cys Lys Pro Ala Gly Tyr Ile Lys Lys Cys Lys Lys Asp Tyr
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Pro Pro Thr Ser Ser Pro Asp Cys Asp Ser Cys Lys Gly Ile Lys Lys Cys Lys Lys Asp Tyr
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 88 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
Met Leu Val Gly Asp Ser Gly Val Gly Lys Thr Cys Leu Leu Val Arg Phe Lys Asp Gly Ala Phe Leu Ala Gly Thr Phe Ile Ser Thr Val Gly Ile Asp Phe Arg Asn Lys Val Leu Asp Val Asp Gly Val Lys Ala Lys Leu Gln Met Trp Asp Thr Ala Gly Gln Glu Arg Phe Arg Ser Val Thr His Ala Tyr Tyr Arg Asp Ala His Ala Leu Leu Leu Leu Tyr Asp Val Thr Asn Lys Ala Ser Phe Asp Asn
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Leu Val Gly Asp Ser Gly Val Gly Lys Thr Cys Leu Leu Val Arg Phe Lys Asp Gly Ala Phe Leu Ala Gly Thr Phe Ile Ser Thr Val Gly Ile Asp Phe Arg Asn Lys Val Leu Asp Val Asp Gly Lys Lys Leu Gln Trp Asp Thr Ala Gly Gln Glu Arg Phe Arg Ser Val Thr His Ala Tyr Tyr Arg Asp Ala His Ala Leu Leu Leu Leu Tyr Asp Thr Asn Lys Ser Phe Asp Asn
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 83 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
Phe Gln Asn His Phe Glu Pro Gly Val Tyr Val Cys Ala Lys Cys Gly Tyr Glu Leu Phe Ser Ser Arg Ser Lys Tyr Ala His Ser Ser Pro Trp Pro Ala Phe Thr Glu Thr Ile His Ala Asp Ser Val Ala Lys Arg Pro Glu His Asn Arg Ser Glu Ala Leu Lys Val Ser Cys Gly Lys Cys Gly Asn Gly Leu Gly His Glu Phe Leu Asn Asp Gly Pro Lys Pro Gly Gln Ser Arg Phe
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Phe Pro Gly Tyr Val Gly Leu Phe Ser Ser Lys Tyr Trp Pro Phe Thr Ile Ala Ser Val Val Leu Gly His Phe Asp Gly Pro
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
Glu Gly Val Tyr Cys Ala Cys Asp Leu Ser Ser Lys Trp Pro Ala Phe Glu Ala Cys Cys Leu Gly His Phe Gly Lys
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Phe His Phe Glu Gly Tyr Val Cys Cys Gly Glu Leu Phe Ser Lys Trp Pro Ala Phe Glu Val Cys Cys Leu Gly His Phe Asn Asp Gly Pro Lys
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Phe Gly Tyr Val Gly Phe Ser Ser Lys Trp Pro Phe Thr Ile Asp Val Gly Asn Leu Gly His Phe Asp Gly Pro Lys Gly Arg
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6803 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: Genomic DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1743 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...1740
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Met Pro Gly Trp Pro Trp Gly Leu Leu Leu Thr Ala Gly Thr Leu Phe Ala Ala Leu Ser Pro Gly Pro Pro Ala Pro Ala Asp Pro Cys His Asp Glu Gly Gly Ala Pro Arg Gly Cys Val Pro Gly Leu Val Asn Ala Ala Leu Gly Arg Glu Val Leu Ala Ser Ser Thr Cys Gly Arg Pro Ala Thr Arg Ala Cys Asp Ala Ser Asp Pro Arg Arg Ala His Ser Pro Ala Leu Leu Thr Ser Pro Gly Gly Thr Ala Ser Pro Leu Cys Trp Arg Ser Glu Ser Leu Pro Arg Ala Pro Leu Asn Val Thr Leu Thr Val Pro Leu Gly Lys Ala Phe Glu Leu Val Phe Val Ser Leu Arg Phe Cys Ser Ala Pro Pro Ala Ser Val Ala Leu Leu Lys Ser Gln Asp His Gly Arg Ser Trp Ala Pro Leu Gly Phe Phe Ser Ser His Cys Asp Leu Asp Tyr Gly Arg Leu Pro Ala Pro Ala Asn Gly Pro Ala Gly Pro Gly Pro Glu Ala Leu Cys Phe Pro Ala Pro Leu Ala Gln Pro Asp Gly Ser Gly Leu Leu Ala Phe Ser Met Gln Asp Ser Ser Pro Pro Gly Leu Asp Leu Asp Ser Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Val Arg Val Val Leu Thr Arg Pro Ser Thr Ala Gly Asp Pro Arg Asp Met Glu Ala Val Val Pro Tyr Ser Tyr Ala Ala Thr Asp Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Leu Leu Asp Thr Gln Gly His Leu Ile Cys Asp Cys Arg His Gly Thr Glu Gly Pro Asp Cys Gly Arg Cys Lys Pro Phe Tyr Cys Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu Ser His Ala Cys Leu Ala Cys Ser Cys Asn Gly His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Arg Leu Ser Gly Arg Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Arg Glu Gly Phe Tyr Arg Asp Pro Gly Arg Ala Leu Ser Asp Arg Arg Ala Cys Arg Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Val Lys Thr Pro Ile Pro Gly Pro Thr Glu Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr Ala Val Gln Val Ala Val Gly Ala Arg Gly Glu Ala Arg Gly Ala Trp Thr Arg Phe Pro Val Ala Val Leu Ala Val Phe Arg Ser Gly Glu Glu Arg Ala Arg Arg 
Gly Ser Ser Ala Leu Trp Val Pro Ala Gly Asp Ala Ala Cys Gly Cys Pro Arg Leu Leu Pro Gly Arg Arg Tyr Leu Leu Leu Gly Gly Gly Pro Gly Ala Ala Ala Gly Gly Ala Gly Gly Arg Gly Pro Gly Leu Ile Ala Ala Arg Gly Ser Leu Val Leu Pro Trp Arg Asp Ala Trp Thr Arg Arg Leu Arg Arg Leu Gln Arg Arg Glu Arg Arg Gly Arg Cys Ser Ala Ala
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 580 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Met Pro Gly Trp Pro Trp Gly Leu Leu Leu Thr Ala Gly Thr Leu Phe Ala Ala Leu Ser Pro Gly Pro Pro Ala Pro Ala Asp Pro Cys His Asp Glu Gly Gly Ala Pro Arg Gly Cys Val Pro Gly Leu Val Asn Ala Ala Leu Gly Arg Glu Val Leu Ala Ser Ser Thr Cys Gly Arg Pro Ala Thr Arg Ala Cys Asp Ala Ser Asp Pro Arg Arg Ala His Ser Pro Ala Leu Leu Thr Ser Pro Gly Gly Thr Ala Ser Pro Leu Cys Trp Arg Ser Glu Ser Leu Pro Arg Ala Pro Leu Asn Val Thr Leu Thr Val Pro Leu Gly Lys Ala Phe Glu Leu Val Phe Val Ser Leu Arg Phe Cys Ser Ala Pro Pro Ala Ser Val Ala Leu Leu Lys Ser Gln Asp His Gly Arg Ser Trp Ala Pro Leu Gly Phe Phe Ser Ser His Cys Asp Leu Asp Tyr Gly Arg Leu Pro Ala Pro Ala Asn Gly Pro Ala Gly Pro Gly Pro Glu Ala Leu Cys Phe Pro Ala Pro Leu Ala Gln Pro Asp Gly Ser Gly Leu Leu Ala Phe Ser Met Gln Asp Ser Ser Pro Pro Gly Leu Asp Leu Asp Ser Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Val Arg Val Val Leu Thr Arg Pro Ser Thr Ala Gly Asp Pro Arg Asp Met Glu Ala Val Val Pro Tyr Ser Tyr Ala Ala Thr Asp Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Leu Leu Asp Thr Gln Gly His Leu Ile Cys Asp Cys Arg His Gly Thr Glu Gly Pro Asp Cys Gly Arg Cys Lys Pro Phe Tyr Cys Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu Ser His Ala Cys Leu Ala Cys Ser Cys Asn Gly His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Arg Leu Ser Gly Arg Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Arg Glu Gly Phe Tyr Arg Asp Pro Gly Arg Ala Leu Ser Asp Arg Arg Ala Cys Arg Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Pro Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Val Lys Thr Pro Ile Pro Gly Pro Thr Glu Asp Ser Ser Pro Val Gln Pro Gln Asp Cys Asp Ser His Cys Lys Pro Ala Arg Gly Ser Tyr Arg Ile Ser Leu Lys Lys Phe Cys Lys Lys Asp Tyr Ala Val Gln Val Ala Val Gly Ala Arg Gly Glu Ala Arg Gly Ala Trp Thr Arg Phe Pro Val Ala Val Leu Ala Val Phe Arg Ser Gly Glu Glu Arg Ala Arg Arg 
Gly Ser Ser Ala Leu Trp Val Pro Ala Gly Asp Ala Ala Cys Gly Cys Pro Arg Leu Leu Pro Gly Arg Arg Tyr Leu Leu Leu Gly Gly Gly Pro Gly Ala Ala Ala Gly Gly Ala Gly Gly Arg Gly Pro Gly Leu Ile Ala Ala Arg Gly Ser Leu Val Leu Pro Trp Arg Asp Ala Trp Thr Arg Arg Leu Arg Arg Leu Gln Arg Arg Glu Arg Arg Gly Arg Cys Ser Ala Ala
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 606 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Met Pro Arg Arg Gly Ala Glu Gly Pro Leu Ala Leu Leu Leu Ala Ala Ala Trp Leu Ala Gln Pro Leu Arg Gly Gly Tyr Pro Gly Leu Asn Met Phe Ala Val Gln Thr Ala Gln Pro Asp Pro Cys Tyr Asp Glu His Gly Leu Pro Arg Arg Cys Ile Pro Asp Phe Val Asn Ser Ala Phe Gly Lys Glu Val Lys Val Ser Ser Thr Cys Gly Lys Pro Pro Ser Arg Tyr Cys Val Val Thr Glu Lys Gly Glu Glu Gln Val Arg Ser Cys His Leu Cys Asn Ala Ser Asp Pro Lys Arg Ala His Pro Pro Ser Phe Leu Thr Asp Leu Asn Asn Pro His Asn Leu Thr Cys Trp Gln Ser Asp Ser Tyr Val Gln Tyr Pro His Asn Val Thr Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Val Thr Tyr Val Ser Leu Gln Phe Cys Ser Pro Arg Pro Glu Ser Met Ala Ile Tyr Lys Ser Met Asp Tyr Gly Lys Thr Trp Val Pro Phe Gln Phe Tyr Ser Thr Gln Cys Arg Lys Met Tyr Asn Lys Pro Ser Arg Ala Ala Ile Thr Lys Gln Asn Glu Gln Glu Ala Ile Cys Thr Asp Ser His Thr Asp Val Arg Pro Leu Ser Gly Gly Leu Ile Ala Phe Ser Thr Leu Asp Gly Arg Pro Thr Ala His Asp Phe Asp Asn Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Lys Val Thr Phe Ser Arg Leu His Thr Phe Gly Asp Glu Asn Glu Asp Asp Ser Glu Leu Ala Arg Asp Ser Tyr Phe Tyr Ala Val Ser Asp Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Val Arg Asp Arg Asp Asp Asn Leu Val Cys Asp Cys Lys His Asn Thr Ala Gly Pro Glu Cys Asp Arg Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gln Arg Ala Thr Ala Arg Glu Ala Asn Glu Cys Val Ala Cys Asn Cys Asn Leu His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Lys Leu Ser Gly Arg Lys Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Lys Glu Gly Phe Tyr Arg Asp Leu Ser Lys Pro Ile Ser His Arg Lys Ala Cys Lys Glu Cys Asp Cys His Pro Val Gly Ala Ala Gly Gln Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Ile Thr Cys Asn Arg Cys Ala Lys Gly Tyr Gln Gln Ser Arg Ser Pro Ile Ala Pro Cys Ile Lys Ile Pro Ala Ala Pro Pro Pro Thr Ala Ala Ser Ser Thr Glu Glu Pro Ala Asp Cys Asp Ser Tyr Cys Lys Ala Ser Lys Gly Lys Leu Lys Ile Asn Met Lys Lys Tyr Cys Lys Lys Asp Tyr Ala Val Gln 
Ile His Ile Leu Lys Ala Glu Lys Asn Ala Asp Trp Trp Lys Phe Thr Val Asn Ile Ile Ser Val Tyr Lys Gln Gly Ser Asn Arg Leu Arg Arg Gly Asp Gln Thr Leu Trp Val His Ala Lys Asp Ile Ala Cys Lys Cys Pro Lys Val Lys Pro Met Lys Lys Tyr Leu Leu Leu Gly Ser Thr Glu Asp Ser Pro Asp Gln Ser Gly Ile Ile Ala Asp Lys Ser Ser Leu Val Ile Gln Trp Arg Asp Thr Trp Ala Arg Arg Leu Arg Lys Phe Gln Gln Arg Glu Lys Lys Gly Lys Cys Arg Lys Ala
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 581 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
Leu Arg Leu Leu Leu Thr Thr Ser Val Leu Arg Leu Ala Arg Ala Ala Asn Pro Glu Val Ala Gln Gln Thr Pro Pro Asp Pro Cys Tyr Asp Glu Ser Gly Ala Pro Arg Arg Cys Ile Pro Glu Phe Val Asn Ala Ala Phe Gly Lys Glu Val Gln Ala Ser Ser Thr Cys Gly Lys Pro Pro Thr Arg His Cys Asp Ala Ser Asp Pro Arg Arg Ala His Pro Pro Ala Tyr Leu Thr Asp Leu Asn Thr Ala Ala Asn Met Thr Cys Trp Arg Ser Glu Thr Leu His His Leu Pro His Asn Val Thr Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Val Val Tyr Val Ser Leu Gln Phe Cys Ser Pro Arg Pro Glu Ser Thr Ala Ile Phe Lys Ser Met Asp Tyr Gly Lys Thr Trp Val Pro Tyr Gln Tyr Tyr Ser Ser Gln Cys Arg Lys Ile Tyr Gly Lys Pro Ser Lys Ala Thr Val Thr Lys Gln Asn Glu Gln Glu Ala Leu Cys Thr Asp Gly Leu Thr Asp Leu Tyr Pro Leu Thr Gly Gly Leu Ile Ala Phe Ser Thr Leu Asp Gly Arg Pro Ser Ala Gln Asp Phe Asp Ser Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Arg Val Val Phe Ser Arg Pro His Leu Phe Arg Glu Leu Gly Gly Arg Glu Ala Gly Glu Glu Asp Gly Gly Ala Gly Ala Thr Pro Tyr Tyr Tyr Ser Val Gly Glu Leu Gln Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Val Lys Asp Lys Glu Gln Lys Leu Val Cys Asp Cys Lys His Asn Thr Glu Gly Pro Glu Cys Asp Arg Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gln Arg Ala Ser Ala Arg Glu Ala Asn Glu Cys Leu Ala Cys Asn Cys Asn Leu His Ala Arg Arg Cys Arg Phe Asn Met Glu Leu Tyr Lys Leu Ser Gly Arg Lys Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg His Cys His Tyr Cys Lys Glu Gly Phe Tyr Arg Asp Leu Ser Lys Ser Ile Thr Asp Arg Lys Ala Cys Lys Ala Cys Asp Cys His Pro Val Gly Ala Ala Gly Lys Thr Cys Asn Gln Thr Thr Gly Gln Cys Pro Cys Lys Asp Gly Val Thr Gly Leu Thr Cys Asn Arg Cys Ala Lys Gly Phe Gln Gln Ser Arg Ser Pro Val Ala Pro Cys Ile Lys Ile Pro Ala Ile Asn Pro Thr Ser Leu Val Thr Ser Thr Glu Ala Pro Ala Asp Cys Asp Ser Tyr Cys Lys Pro Ala Lys Gly Asn Tyr Lys Ile Asn Met Lys Lys Tyr Cys Lys Lys Asp Tyr Val Val Gln Val Asn Ile Leu Glu Met Glu Thr Val Ala Asn Trp Ala Lys Phe Thr Ile Asn Ile Leu Ser Val Tyr Lys Cys 
Arg Asp Glu Arg Val Lys Arg Gly Asp Asn Phe Leu Trp Ile His Leu Lys Asp Leu Ser Cys Lys Cys Pro Lys Ile Gln Ile Ser Lys Lys Tyr Leu Val Met Gly Ile Ser Glu Asn Ser Thr Asp Arg Pro Gly Leu Met Ala Asp Lys Asn Ser Leu Val Ile Gln Trp Arg Asp Ala Trp Thr Arg Arg Leu Arg Lys Leu Gln Arg Arg Glu Lys Lys Gly Lys Cys Val Lys Pro
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5894 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 2...5053
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser
Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala 
Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gly Gln Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro 
Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1684 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu 
Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gly Gln Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn 
Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys 
Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg (2) INFORMATION FOR SEQ ID N0:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1375 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
Cys Met Glu Glu Glu Pro Thr His Leu Arg Leu Gly Val Ser Ile Gln Asn Leu Val Lys Val Tyr Arg Asp Gly Met Lys Val Ala Val Asp Gly Leu Ala Leu Asn Phe Tyr Glu Gly Gln Ile Thr Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Met Ser Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Thr Ala Tyr Ile Leu Gly Lys Asp Ile Arg Ser Glu Met Ser Ser Ile Arg Gln Asn Leu Gly Val Cys Pro Gln His Asn Val Leu Phe Asp Met Leu Thr Val Glu Glu His Ile Trp Phe Tyr Ala Arg Leu Lys Gly Leu Ser Glu Lys His Val Lys Ala Glu Met Glu Gln Met Ala Leu Asp Val Gly Leu Pro Pro Ser Lys Leu Lys Ser Lys Thr Ser Gln Leu Ser Gly Gly Met Gln Arg Lys Leu Ser Val Ala Leu Ala Phe Val Gly Gly Ser Lys Val Val Ile Leu Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr Ser Arg Arg Gly Ile Trp Glu Leu Leu Leu Lys Tyr Arg Gln Gly Arg Thr Ile Ile Leu Ser Thr His His Met Asp Glu Ala Asp Ile Leu Gly Asp Arg Ile Ala Ile Ile Ser His Gly Lys Leu Cys Cys Val Gly Ser Ser Leu Phe Leu Lys Asn Gln Leu Gly Thr Gly Tyr Tyr Leu Thr Leu Val Lys Lys Asp Val Glu Ser Ser Leu Ser Ser Cys Arg Asn Ser Ser Ser Thr Val Ser Cys Leu Lys Lys Glu Asp Ser Val Ser Gln Ser Ser Ser Asp Ala Gly Leu Gly Ser Asp His Glu Ser Asp Thr Leu Thr Ile Asp Val Ser Ala Ile Ser Asn Leu Ile Arg Lys His Val Ser Glu Ala Arg Leu Val Glu Asp Ile Gly His Glu Leu Thr Tyr Val Leu Pro Tyr Glu Ala Ala Lys Glu Gly Ala Phe Val Glu Leu Phe His Glu Ile Asp Asp Arg Leu Ser Asp Leu Gly Ile Ser Ser Tyr Gly Ile Ser Glu Thr Thr Leu Glu Glu Ile Phe Leu Lys Val Ala Glu Glu Ser Gly Val Asp Ala Glu Thr Ser Asp Gly Thr Leu Pro Ala Arg Arg Asn Arg Arg Ala Phe Gly Asp Lys Gln Ser Cys Leu His Pro Phe Thr Glu Asp Asp Ala Val Asp Pro Asn Asp Ser Asp Ile Asp Pro Glu Ser Arg Glu Thr Asp Leu Leu Ser Gly Met Asp Gly Lys Gly Ser Tyr Gln Leu Lys Gly Trp Lys Leu Thr Gln Gln Gln Phe Val Ala Leu Leu Trp Lys Arg Leu Leu Ile Ala Arg Arg Ser Arg Lys Gly Phe Phe Ala Gln Ile Val Leu Pro Ala Val Phe Val Cys Ile Ala Leu Val Phe Ser Leu Ile Val Pro Pro Phe Gly Lys Tyr Pro Ser Leu Glu Leu Gln Pro Trp Met Tyr Asn Glu 
Gln Tyr Thr Phe Val Ser Asn Asp Ala Pro Glu Asp Met Gly Thr Gln Glu Leu Leu Asn Ala Leu Thr Lys Asp Pro Gly Phe Gly Thr Arg Cys Met Glu Gly Asn Pro Ile Pro Asp Thr Pro Cys Leu Ala Gly Glu Glu Asp Trp Thr Ile Ser Pro Val Pro Gln Ser Ile Val Asp Leu Phe Gln Asn Gly Asn Trp Thr Met Lys Asn Pro Ser Pro Ala Cys Gln Cys Ser Ser Asp Lys Ile Lys Lys Met Leu Pro Val Cys Pro Pro Gly Ala Gly Gly Leu Pro Pro Pro Gln Arg Lys Gln Lys Thr Ala Asp Ile Leu Gln Asn Leu Thr Gly Arg Asn Ile Ser Asp Tyr Leu Val Lys Thr Tyr Val Gln Ile Ile Ala Lys Ser Leu Lys Asn Lys Ile Trp Val Asn Glu Phe Arg Tyr Gly Gly Phe Ser Leu Gly Val Ser Asn Ser Gln Ala Leu Pro Pro Ser His Glu Val Asn Asp Ala Ile Lys Gln Met Lys Lys Leu Leu Lys Leu Thr Lys Asp Thr Ser Ala Asp Arg Phe Leu Ser Ser Leu Gly Arg Phe Met Ala Gly Leu Asp Thr Lys Asn Asn Val Lys Val Trp Phe Asn Asn Lys Gly Trp His Ala Ile Ser Ser Phe Leu Asn Val Ile Asn Asn Ala Ile Leu Arg Ala Asn Leu Gln Lys Gly Glu Asn Pro Ser Gln Tyr Gly Ile Thr Ala Phe Asn His Pro Leu Asn Leu Thr Lys Gln Gln Leu Ser Glu Val Ala Leu Met Thr Thr Ser Val Asp Val Leu Val Ser Ile Cys Val Ile Phe Ala Met Ser Phe Val Pro Ala Ser Phe Val Val Phe Leu Ile Gln Glu Arg Val Ser Lys Ala Lys His Leu Gln Phe Ile Ser Gly Val Lys Pro Val Ile Tyr Trp Leu Ser Asn Phe Val Trp Asp Met Cys Asn Tyr Val Val Pro Ala Thr Leu Val Ile Ile Ile Phe Ile Cys Phe Gln Gln Lys Ser Tyr Val Ser Ser Thr Asn Leu Pro Val Leu Ala Leu Leu Leu Leu Leu Tyr Gly Trp Ser Ile Thr Pro Leu Met Tyr Pro Ala Ser Phe Val Phe Lys Ile Pro Ser Thr Ala Tyr Val Val Leu Thr Ser Val Asn Leu Phe Ile Gly Ile Asn Gly Ser Val Ala Thr Phe Val Leu Glu Leu Phe Thr Asn Asn Lys Leu Asn Asp Ile Asn Asp Ile Leu Lys Ser Val Phe Leu Ile Phe Pro His Phe Cys Leu Gly Arg Gly Leu Ile Asp Met Val Lys Asn Gln Ala Met Ala Asp Ala Leu Glu Arg Phe Gly Glu Asn Arg Phe Val Ser Pro Leu Ser Trp Asp Leu Val Gly Arg Asn Leu Phe Ala Met Ala Val Glu Gly Val Val Phe Phe Leu Ile Thr Val Leu Ile Gln Tyr Arg Phe Phe Ile Arg Pro Arg Pro Val Lys Ala Lys Leu Pro 
Pro Leu Asn Asp Glu Asp Glu Asp Val Arg Arg Glu Arg Gln Arg Ile Leu Asp Gly Gly Gly Gln Asn Asp Ile Leu Glu Ile Lys Glu Leu Thr Lys Ile Tyr Arg Arg Lys Arg Lys Pro Ala Val Asp Arg Ile Cys Ile Gly Ile Pro Pro Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Ser Thr Thr Phe Lys Met Leu Thr Gly Asp Thr Pro Val Thr Arg Gly Asp Ala Phe Leu Asn Lys Asn Ser Ile Leu Ser Asn Ile His Glu Val His Gln Asn Met Gly Tyr Cys Pro Gln Phe Asp Ala Ile Thr Glu Leu Leu Thr Gly Arg Glu His Val Glu Phe Phe Ala Leu Leu Arg Gly Val Pro Glu Lys Glu Val Gly Lys Phe Gly Glu Trp Ala Ile Arg Lys Leu Gly Leu Val Lys Tyr Gly Glu Lys Tyr Ala Ser Asn Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Ala Met Ala Leu Ile Gly Gly Pro Pro Val Val Phe Leu Asp Glu Pro Thr Thr Gly Met Asp Pro Lys Ala Arg Arg Phe Leu Trp Asn Cys Ala Leu Ser Ile Val Lys Glu Gly Arg Ser Val Val Leu Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Met Ala Ile Met Val Asn Gly Arg Phe Arg Cys Leu Gly Ser Val Gln His Leu Lys Asn Arg Phe Gly Asp Gly Tyr Thr Ile Val Val Arg Ile Ala Gly Ser Asn Pro Asp Leu Lys Pro Val Gln Glu Phe Phe Gly Leu Ala Phe Pro Gly Ser Val Leu Lys Glu Lys His Arg Asn Met Leu Gln Tyr Gln Leu Pro Ser Ser Leu Ser Ser Leu Ala Arg Ile Phe Ser Ile Leu Ser Gln Ser Lys Lys Arg Leu His Ile Glu Asp Tyr Ser Val Ser Gln Thr Thr Leu Asp Gln Val Phe Val Asn Phe Ala Lys Asp Gln Ser Asp Asp Asp His Leu Lys Asp Leu Ser Leu His Lys Asn Gln Thr Val Val Asp Val Ala Val Leu Thr Ser Phe Leu Gln Asp Glu Lys Val Lys Glu Ser Tyr Val (2) INFORMATION FOR SEQ ID N0:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1457 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
Met Glu Glu Glu Pro Thr His Leu Pro Leu Val Val Cys Val Asp Lys Leu Thr Lys Val Tyr Lys Asn Asp Lys Lys Leu Ala Leu Asn Lys Leu Ser Leu Asn Leu Tyr Glu Asn Gln Val Val Ser Phe Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Met Ser Ile Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Ser Ala Thr Ile Tyr Gly His Asp Ile Arg Thr Glu Met Asp Glu Ile Arg Lys Asn Leu Gly Met Cys Pro Gln His Asn Val Leu Phe Asp Arg Leu Thr Val Glu Glu His Leu Trp Phe Tyr Ser Arg Leu Lys Ser Met Ala Gln Glu Glu Ile Arg Lys Glu Thr Asp Lys Met Ile Glu Asp Leu Glu Leu Ser Asn Lys Arg His Ser Leu Val Gln Thr Leu Ser Gly Gly Met Lys Arg Lys Leu Ser Val Ala Ile Ala Phe Val Gly Gly Ser Arg Ala Ile Ile Leu Asp Glu Pro Thr Ala Gly Val Asp Pro Tyr Ala Arg Arg Ala Ile Trp Asp Leu Ile Leu Lys Tyr Lys Pro Gly Arg Thr Ile Leu Leu Ser Thr His His Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Ile Ser His Gly Lys Leu Lys Cys Cys Gly Ser Pro Leu Phe Leu Lys Gly Ala Tyr Xaa Asp Gly Tyr Arg Leu Thr Leu Val Lys Gln Pro Ala Glu Pro Gly Thr Ser Gln Glu Pro Gly Leu Ala Ser Ser Pro Ser Gly Cys Pro Arg Leu Ser Ser Cys Ser Glu Pro Gln Val Ser Gln Phe Ile Arg Lys His Val Ala Ser Ser Leu Leu Val Ser Asp Thr Ser Thr Glu Leu Ser Tyr Ile Leu Pro Ser Glu Ala Val Lys Lys Gly Ala Phe Glu Arg Leu Phe Gln Gln Leu Glu His Ser Leu Asp Ala Leu His Leu Ser Ser Phe Gly Leu Met Asp Thr Thr Leu Glu Glu Val Phe Leu Lys Val Ser Glu Glu Asp Gln Ser Leu Glu Asn Ser Glu Ala Asp Val Lys Glu Ser Arg Lys Asp Val Leu Pro Gly Ala Glu Gly Leu Thr Ala Val Gly Gly Gln Ala Gly Asn Leu Ala Arg Cys Ser Glu Leu Ala Gln Ser Gln Ala Ser Leu Gln Ser Ala Ser Ser Val Gly Ser Ala Arg Gly Glu Glu Gly Thr Gly Tyr Ser Asp Gly Tyr Gly Asp Tyr Arg Pro Leu Phe Asp Asn Leu Gln Asp Pro Asp Asn Val Ser Leu Gln Glu Ala Glu Met Glu Ala Leu Ala Gln Val Gly Gln Gly Ser Arg Lys Leu Glu Gly Trp Trp Leu Lys Met Arg Gln Phe His Gly Leu Leu Val Lys Arg Phe His Cys Ala Arg Arg Asn Ser Lys Ala Leu Cys Ser Gln Ile Leu Leu Pro Ala Phe Phe Val Cys Val Ala Met Thr Val Ala Leu Ser Val 
Pro Glu Ile Gly Asp Leu Pro Pro Leu Val Leu Ser Pro Ser Gln Tyr His Asn Tyr Thr Gln Pro Arg Gly Asn Phe Ile Pro Tyr Ala Asn Glu Glu Arg Gln Glu Tyr Arg Leu Arg Leu Ser Pro Asp Ala Ser Pro Gln Gln Leu Val Ser Thr Phe Arg Leu Pro Ser Gly Val Gly Ala Thr Cys Val Leu Lys Ser Pro Ala Asn Gly Ser Leu Gly Pro Met Leu Asn Leu Ser Ser Gly Glu Ser Arg Leu Leu Ala Ala Arg Phe Phe Asp Ser Met Cys Leu Glu Ser Phe Thr Gln Gly Leu Pro Leu Ser Asn Phe Val Pro Pro Pro Pro Ser Pro Ala Pro Ser Asp Ser Pro Val Xaa Pro Asp Glu Asp Ser Leu Gln Ala Trp Asn Met Ser Leu Pro Pro Thr Ala Gly Pro Glu Thr Trp Thr Ser Ala Pro Ser Leu Pro Arg Leu Val His Glu Pro Val Arg Cys Thr Cys Ser Ala Gln Gly Thr Gly Phe Ser Cys Pro Ser Ser Val Gly Gly His Pro Pro Gln Met Arg Val Val Thr Gly Asp Ile Leu Thr Asp Ile Thr Gly His Asn Val Ser Glu Tyr Leu Leu Phe Thr Ser Asp Arg Phe Arg Leu His Arg Tyr Gly Ala Ile Thr Phe Gly Asn Val Gln Lys Ser Ile Pro Ala Ser Phe Gly Ala Arg Val Pro Pro Met Val Arg Lys Ile Ala Val Arg Arg Val Ala Gln Val Leu Tyr Asn Asn Lys Gly Tyr His Ser Met Pro Thr Tyr Leu Asn Ser Leu Asn Asn Ala Ile Leu Arg Ala Asn Leu Pro Lys Ser Lys Gly Asn Pro Ala Ala Tyr Xaa Ile Thr Val Thr Asn His Pro Met Asn Lys Thr Ser Ala Ser Leu Ser Leu Asp Tyr Leu Leu Gln Gly Thr Asp Val Val Ile Ala Ile Phe Ile Ile Val Ala Met Ser Phe Val Pro Ala Ser Phe Val Val Phe Leu Val Ala Glu Lys Ser Thr Lys Ala Lys His Leu Gln Phe Val Ser Gly Cys Asn Pro Val Ile Tyr Trp Leu Ala Asn Tyr Val Trp Asp Met Leu Asn Tyr Leu Val Pro Ala Thr Cys Cys Val Ile Ile Leu Phe Val Phe Asp Leu Pro Ala Tyr Thr Ser Pro Thr Asn Phe Pro Ala Val Leu Ser Leu Phe Leu Leu Tyr Gly Trp Ser Ile Thr Pro Ile Met Tyr Pro Ala Ser Phe Trp Phe Glu Val Pro Ser Ser Ala Tyr Val Phe Leu Ile Val Ile Asn Leu Phe Ile Gly Ile Thr Ala Thr Val Ala Thr Phe Leu Leu Gln Leu Phe Glu His Asp Lys Asp Leu Lys Val Val Asn Ser Tyr Leu Lys Ser Cys Phe Leu Ile Phe Pro Asn Tyr Asn Leu Gly His Gly Leu Met Glu Met Ala Tyr Asn Glu Tyr Ile Asn Glu Tyr Tyr Ala Lys Ile Gly Gln Phe Asp 
Lys Met Lys Ser Pro Phe Glu Trp Asp Ile Val Thr Arg Gly Leu Val Ala Met Thr Val Glu Gly Phe Val Gly Phe Phe Leu Thr Ile Met Cys Gln Tyr Asn Phe Leu Arg Gln Pro Gln Arg Leu Pro Val Ser Thr Lys Pro Val Glu Asp Asp Val Asp Val Ala Ser Glu Arg Gln Arg Val Leu Arg Gly Asp Ala Asp Asn Asp Met Val Lys Ile Glu Asn Leu Thr Lys Val Tyr Lys Ser Arg Lys Ile Gly Arg Ile Leu Ala Val Asp Arg Leu Cys Leu Gly Val Cys Val Pro Gly Glu Cys Phe Gly Leu Leu Gly Val Asn Gly Ala Gly Lys Thr Ser Thr Phe Lys Met Leu Thr Gly Asp Glu Ser Thr Thr Gly Gly Glu Ala Phe Val Asn Gly His Ser Val Leu Lys Asp Leu Leu Gln Val Gln Gln Ser Leu Gly Tyr Cys Pro Gln Phe Asp Val Pro Val Asp Glu Leu Thr Ala Arg Glu His Leu Gln Leu Tyr Thr Arg Leu Arg Cys Ile Pro Trp Lys Asp Glu Ala Gln Val Val Lys Trp Ala Leu Glu Lys Leu Glu Leu Thr Lys Tyr Ala Asp Lys Pro Ala Gly Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Ala Ile Ala Leu Ile Gly Tyr Pro Ala Phe Ile Phe Leu Asp Glu Pro Thr Thr Gly Met Asp Pro Lys Ala Arg Arg Phe Leu Trp Asn Leu Ile Leu Asp Leu Ile Lys Thr Gly Arg Ser Val Val Leu Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Asn Gly Arg Leu His Cys Leu Gly Ser Ile Gln His Leu Lys Asn Arg Phe Gly Asp Gly Tyr Met Ile Thr Val Arg Thr Lys Ser Ser Gln Asn Val Lys Asp Val Val Arg Phe Phe Asn Arg Asn Phe Pro Glu Ala His Ala Gln Gly Lys Thr Pro Tyr Lys Val Gln Tyr Gln Leu Lys Ser Glu His Ile Ser Leu Ala Gln Val Phe Ser Lys Met Glu Gln Val Val Gly Val Leu Gly Ile Glu Asp Tyr Ser Val Ser Gln Thr Thr Leu Asp Asn Val Phe Val Asn Phe Ala Lys Lys Gln Ser Asp Asn Val Glu Gln Gln Glu Ala Glu Pro Ser Ser Leu Pro Ser Pro Leu Gly Leu Leu Ser Leu Leu Arg Pro Arg Pro Ala Pro Thr Glu Leu Arg Ala Leu Val Ala Asp Glu Pro Glu Asp Leu Asp Thr Glu Asp Glu Gly Leu Ile Ser Phe Glu Glu Glu Arg Ala Gln Leu Ser Phe Asn Thr Asp Thr Leu Cys (2) INFORMATION FOR SEQ ID N0:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1548 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 49...1269 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly His Leu Gly Phe Leu Pro His Lys Arg Ser His Arg His Arg Gly Lys Val Lys Thr Trp Pro Arg Asp Asp Pro Ser Gln Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Thr Leu Arg Glu Val His Arg Pro Gly Leu Lys Ile Ser Lys Arg Glu Glu Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Leu Val Val Val Gly Val Val Gly Tyr Val Ala Thr Pro Arg Gly Leu Arg Ser Phe Lys Thr Ile Phe Ala Glu His Leu Ser Asp Glu Cys Arg Arg Arg Phe Tyr Lys Asp Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Ala Cys Lys Arg Trp Arg Asp Thr Asp Gly Lys Lys Gln Leu Gln Lys Asp Phe Ala Ala Met Lys Lys Tyr Cys Lys Val Ile Arg Val Ile Val His Thr Gln Met Lys Leu Leu Pro Phe Arg Gln Lys Lys Ala His Ile Met Glu Ile Gln Leu Asn Gly Gly Thr Val Ala Glu Lys Val Ala Trp Ala Gln Ala Arg Leu Glu Lys Gln Val Pro Val His Ser Val Phe Ser Gln Ser Glu Val Ile Asp Val Ile Ala Val Thr Lys Gly Arg Gly Val Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Lys Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Gly Cys Ser Ile Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Leu Asn Lys Lys Ile Phe Arg Ile Gly Arg Gly Pro His Met Glu Asp Gly Lys Leu Val Lys Asn Asn Ala Ser Thr Ser Tyr Asp Val Thr Ala Lys Ser Ile Thr Pro Leu Gly Gly Phe Pro His Tyr Gly Glu Val Asn Asn Asp Phe Val Met Leu Lys Gly Cys Ile Ala Gly Thr Lys Lys Arg Val Ile Thr Leu Arg Lys Ser Leu Leu Val His His Ser Arg Gln Ala Val Glu Asn Ile Glu Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Ala Gln Glu Lys Arg Ala Phe Met Gly Pro Gln Lys Lys His Leu Glu Lys Glu Thr Pro Glu Thr Ser Gly Asp Leu (2) INFORMATION FOR SEQ ID N0:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 407 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly His Leu Gly Phe Leu Pro His Lys Arg Ser His Arg His Arg Gly Lys Val Lys Thr Trp Pro Arg Asp Asp Pro Ser Gln Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Thr Leu Arg Glu Val His Arg Pro Gly Leu Lys Ile Ser Lys Arg Glu Glu Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Leu Val Val Val Gly Val Val Gly Tyr Val Ala Thr Pro Arg Gly Leu Arg Ser Phe Lys Thr Ile Phe Ala Glu His Leu Ser Asp Glu Cys Arg Arg Arg Phe Tyr Lys Asp Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Ala Cys Lys Arg Trp Arg Asp Thr Asp Gly Lys Lys Gln Leu Gln Lys Asp Phe Ala Ala Met Lys Lys Tyr Cys Lys Val Ile Arg Val Ile Val His Thr Gln Met Lys Leu Leu Pro Phe Arg Gln Lys Lys Ala His Ile Met Glu Ile Gln Leu Asn Gly Gly Thr Val Ala Glu Lys Val Ala Trp Ala Gln Ala Arg Leu Glu Lys Gln Val Pro Val His Ser Val Phe Ser Gln Ser Glu Val Ile Asp Val Ile Ala Val Thr Lys Gly Arg Gly Val Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Lys Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Gly Cys Ser Ile Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Leu Asn Lys Lys Ile Phe Arg Ile Gly Arg Gly Pro His Met Glu Asp Gly Lys Leu Val Lys Asn Asn Ala Ser Thr Ser Tyr Asp Val Thr Ala Lys Ser Ile Thr Pro Leu Gly Gly Phe Pro His Tyr Gly Glu Val Asn Asn Asp Phe Val Met Leu Lys Gly Cys Ile Ala Gly Thr Lys Lys Arg Val Ile Thr Leu Arg Lys Ser Leu Leu Val His His Ser Arg Gln Ala Val Glu Asn Ile Glu Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Ala Gln Glu Lys Arg Ala Phe Met Gly Pro Gln Lys Lys His Leu Glu Lys Glu Thr Pro Glu Thr Ser Gly Asp Leu (2) INFORMATION FOR SEQ ID N0:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe Pro Lys Asp Asp Pro Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Met Val Val Val Gly Ile Val Gly Tyr Val Glu Thr Pro Arg Gly Leu Arg Thr Phe Lys Thr Val Phe Ala Glu His Ile Ser Asp Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Glu Asp Gly Lys Lys Gln Leu Glu Lys Asp Phe Ser Ser Met Lys Lys Tyr Cys Gln Val Ile Arg Val Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys Lys Ala His Leu Met Glu Ile Gln Val Asn Gly Gly Thr Val Ala Glu Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Asn Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Ala Phe Ser Val Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe Val His Tyr Gly Glu Val Thr Asn Asp Phe Val Met Leu Lys Gly Cys Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Met Glu Glu Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu Glu Gly Ala (2) INFORMATION FOR SEQ ID N0:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe Pro Lys Asp Asp Ser Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Met Val Ile Val Gly Ile Val Gly Tyr Val Glu Thr Pro Arg Gly Leu Arg Thr Phe Lys Thr Ile Phe Ala Glu His Ile Ser Asp Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Ala Asp Gly Lys Lys Gln Leu Glu Arg Asp Phe Ser Ser Met Lys Lys Tyr Cys Gln Val Ile Arg Val Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys Lys Ala His Leu Met Glu Val Gln Val Asn Gly Gly Thr Val Ala Glu Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Asn Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Ala Phe Ser Val Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe Val His Tyr Gly Glu Val Thr Asn Asp Phe Val Met Leu Lys Gly Cys Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Val Glu Glu Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu Glu Gly Ala (2) INFORMATION FOR SEQ ID N0:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 403 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Met Ser His Arg Lys Phe Ser Ala Pro Arg His Gly Ser Leu Gly Phe Leu Pro Arg Lys Arg Ser Ser Arg His Arg Gly Lys Val Lys Ser Phe Pro Lys Asp Asp Ala Ser Lys Pro Val His Leu Thr Ala Phe Leu Gly Tyr Lys Ala Gly Met Thr His Ile Val Arg Glu Val Asp Arg Pro Gly Ser Lys Val Asn Lys Lys Glu Val Val Glu Ala Val Thr Ile Val Glu Thr Pro Pro Met Val Val Val Gly Ile Val Gly Tyr Val Glu Thr Pro Arg Gly Leu Arg Thr Phe Lys Thr Val Phe Ala Glu His Ile Ser Asp Glu Cys Lys Arg Arg Phe Tyr Lys Asn Trp His Lys Ser Lys Lys Lys Ala Phe Thr Lys Tyr Cys Lys Lys Trp Gln Asp Asp Thr Gly Lys Lys Gln Leu Glu Lys Asp Phe Asn Ser Met Lys Lys Tyr Cys Gln Val Ile Arg Ile Ile Ala His Thr Gln Met Arg Leu Leu Pro Leu Arg Gln Lys Lys Ala His Leu Met Glu Ile Gln Val Asn Gly Gly Thr Val Ala Glu Lys Leu Asp Trp Ala Arg Glu Arg Leu Glu Gln Gln Val Pro Val Ser Gln Val Phe Gly Gln Asp Glu Met Ile Asp Val Ile Gly Val Thr Lys Gly Lys Gly Tyr Lys Gly Val Thr Ser Arg Trp His Thr Lys Lys Leu Pro Arg Lys Thr His Arg Gly Leu Arg Lys Val Ala Cys Ile Gly Ala Trp His Pro Ala Arg Val Ala Phe Thr Val Ala Arg Ala Gly Gln Lys Gly Tyr His His Arg Thr Glu Ile Asn Lys Lys Ile Tyr Lys Ile Gly Gln Gly Tyr Leu Ile Lys Asp Gly Lys Leu Ile Lys Asn Asn Ala Ser Thr Asp Tyr Asp Leu Ser Asp Lys Ser Ile Asn Pro Leu Gly Gly Phe Val His Tyr Gly Glu Val Thr Asn Asp Phe Ile Met Leu Lys Gly Cys Val Val Gly Thr Lys Lys Arg Val Leu Thr Leu Arg Lys Ser Leu Leu Val Gln Thr Lys Arg Arg Ala Leu Glu Lys Ile Asp Leu Lys Phe Ile Asp Thr Thr Ser Lys Phe Gly His Gly Arg Phe Gln Thr Met Glu Glu Lys Lys Ala Phe Met Gly Pro Leu Lys Lys Asp Arg Ile Ala Lys Glu Glu Gly Ala (2) INFORMATION FOR SEQ ID N0:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 468 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence (B) LOCATION: 1...357 (D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
Arg Asp Thr Lys Phe Arg Glu Asp Cys Pro Pro Asp Arg Glu Glu Leu Gly Arg His Ser Trp Ala Val Leu His Thr Leu Ala Ala Tyr Tyr Pro Asp Leu Pro Thr Pro Glu Gln Gln Gln Asp Met Ala Gln Phe Ile His Leu Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cys Ala Glu Asp Leu Arg Lys Arg Leu Cys Arg Asn His Pro Asp Thr Arg Thr Arg Ala Cys Phe Thr Gln Trp Leu Cys His Leu His Asn Glu Val Asn Arg Lys Leu Gly Lys Pro Asp Phe Asp Cys Ser Lys Val Asp Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Ser Cys Asp
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 119 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (v) FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Arg Asp Thr Lys Phe Arg Glu Asp Cys Pro Pro Asp Arg Glu Glu Leu Gly Arg His Ser Trp Ala Val Leu His Thr Leu Ala Ala Tyr Tyr Pro Asp Leu Pro Thr Pro Glu Gln Gln Gln Asp Met Ala Gln Phe Ile His Leu Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cys Ala Glu Asp Leu Arg Lys Arg Leu Cys Arg Asn His Pro Asp Thr Arg Thr Arg Ala Cys Phe Thr Gln Trp Leu Cys His Leu His Asn Glu Val Asn Arg Lys Leu Gly Lys Pro Asp Phe Asp Cys Ser Lys Val Asp Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Ser Cys Asp
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 125 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
Met Arg Thr Gln Gln Lys Arg Asp Ile Lys Phe Arg Glu Asp Cys Pro Gln Asp Arg Glu Glu Leu Gly Arg Asn Thr Trp Ala Phe Leu His Thr Leu Ala Ala Tyr Tyr Pro Asp Met Pro Thr Pro Glu Gln Gln Gln Asp Met Ala Gln Phe Ile His Ile Phe Ser Lys Phe Tyr Pro Cys Glu Glu Cys Ala Glu Asp Ile Arg Lys Arg Ile Asp Arg Ser Gln Pro Asp Thr Ser Thr Arg Val Ser Phe Ser Gln Trp Leu Cys Arg Leu His Asn Glu Val Asn Arg Lys Leu Gly Lys Pro Asp Phe Asp Cys Ser Arg Val Asp Glu Arg Trp Arg Asp Gly Trp Lys Asp Gly Ser Cys Asp
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ix) FEATURE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
CGGCAGAGGA TGCTGTGT 18
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
GCGGAGCCAC CTTCATCA 18
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
(2) INFORMATION FOR SEQ ID NO:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:48:
(2) INFORMATION FOR SEQ ID N0:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:49:
(2) INFORMATION FOR SEQ ID N0:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:50:
(2) INFORMATION FOR SEQ ID N0:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:51:
(2) INFORMATION FOR SEQ ID N0:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:52:
(2) INFORMATION FOR SEQ ID N0:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:53:
(2) INFORMATION FOR SEQ ID N0:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID N0:54:
GCGGAGCCAC CTTCATCA 18
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
(2) INFORMATION FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
CCACCATGT
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
His Arg Asp Leu Lys Pro Glu Asn
(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6525 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 573..5684 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
Met Ala Val Leu Arg Gln Leu Ala Leu Leu Leu Trp Lys Asn Tyr Thr Leu Gln Lys Arg Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro Arg Ala Val 
Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu Gly Gln Glu Pro Arg Glu Val 
Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala Cys Val Glu Asn Thr Leu Arg 
Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg
CAGAAAATAA ATGCTCAGGG GACACAAAA.A F~P.AAAAAAAA AAAAAAAAAA 6514
(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1704 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
Met Ala Val Leu Arg Gln Leu Ala Leu Leu Leu Trp Lys Asn Tyr Thr Leu Gln Lys Arg Lys Val Leu Val Thr Val Leu Glu Leu Phe Leu Pro Leu Leu Phe Ser Gly Ile Leu Ile Trp Leu Arg Leu Lys Ile Gln Ser Glu Asn Val Pro Asn Ala Thr Ile Tyr Pro Gly Gln Ser Ile Gln Glu Leu Pro Leu Phe Phe Thr Phe Pro Pro Pro Gly Asp Thr Trp Glu Leu Ala Tyr Ile Pro Ser His Ser Asp Ala Ala Lys Ala Val Thr Glu Thr Val Arg Arg Ala Leu Val Ile Asn Met Arg Val Arg Gly Phe Pro Ser Glu Lys Asp Phe Glu Asp Tyr Ile Arg Tyr Asp Asn Cys Ser Ser Ser Val Leu Ala Ala Val Val Phe Glu His Pro Phe Asn His Ser Lys Glu Pro Leu Pro Leu Ala Val Lys Tyr His Leu Arg Phe Ser Tyr Thr Arg Arg Asn Tyr Met Trp Thr Gln Thr Gly Ser Phe Phe Leu Lys Glu Thr Glu Gly Trp His Thr Thr Ser Leu Phe Pro Leu Phe Pro Asn Pro Gly Pro Arg Glu Leu Thr Ser Pro Asp Gly Gly Glu Pro Gly Tyr Ile Arg Glu Gly Phe Leu Ala Val Gln His Ala Val Asp Arg Ala Ile Met Glu Tyr His Ala Asp Ala Ala Thr Arg Gln Leu Phe Gln Arg Leu Thr Val Thr Ile Lys Arg Phe Pro Tyr Pro Pro Phe Ile Ala Asp Pro Phe Leu Val Ala Ile Gln Tyr Gln Leu Pro Leu Leu Leu Leu Leu Ser Phe Thr Tyr Thr Ala Leu Thr Ile Ala Arg Ala Val Val Gln Glu Lys Glu Arg Arg Leu Lys Glu Tyr Met Arg Met Met Gly Leu Ser Ser Trp Leu His Trp Ser Ala Trp Phe Leu Leu Phe Phe Leu Phe Leu Leu Ile Ala Ala Ser Phe Met Thr Leu Leu Phe Cys Val Lys Val Lys Pro Asn Val Ala Val Leu Ser Arg Ser Asp Pro Ser Leu Val Leu Ala Phe Leu Leu Cys Phe Ala Ile Ser Thr Ile Ser Phe Ser Phe Met Val Ser Thr Phe Phe Ser Lys Ala Asn Met Ala Ala Ala Phe Gly Gly Phe Leu Tyr Phe Phe Thr Tyr Ile Pro Tyr Phe Phe Val Ala Pro Arg Tyr Asn Trp Met Thr Leu Ser Gln Lys Leu Cys Ser Cys Leu Leu Ser Asn Val Ala Met Ala Met Gly Ala Gln Leu Ile Gly Lys Phe Glu Ala Lys Gly Met Gly Ile Gln Trp Arg Asp Leu Leu Ser Pro Val Asn Val Asp Asp Asp Phe Cys Phe Gly Gln Val Leu Gly Met Leu Leu Leu Asp Ser Val Leu Tyr Gly Leu Val Thr Trp Tyr Met Glu Ala Val Phe Pro Gly Gln Phe Gly Val Pro Gln Pro Trp Tyr Phe Phe Ile Met Pro Ser Tyr Trp Cys Gly Lys Pro
Arg Ala Val Ala Gly Lys Glu Glu Glu Asp Ser Asp Pro Glu Lys Ala Leu Arg Asn Glu Tyr Phe Glu Ala Glu Pro Glu Asp Leu Val Ala Gly Ile Lys Ile Lys His Leu Ser Lys Val Phe Arg Val Gly Asn Lys Asp Arg Ala Ala Val Arg Asp Leu Asn Leu Asn Leu Tyr Glu Gly Gln Ile Thr Val Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Thr Leu Ser Met Leu Thr Gly Leu Phe Pro Pro Thr Ser Gly Arg Ala Tyr Ile Ser Gly Tyr Glu Ile Ser Gln Asp Met Val Gln Ile Arg Lys Ser Leu Gly Leu Cys Pro Gln His Asp Ile Leu Phe Asp Asn Leu Thr Val Ala Glu His Leu Tyr Phe Tyr Ala Gln Leu Lys Gly Leu Ser Arg Gln Lys Cys Pro Glu Glu Val Lys Gln Met Leu His Ile Ile Gly Leu Glu Asp Lys Trp Asn Ser Arg Ser Arg Phe Leu Ser Gly Gly Met Arg Arg Lys Leu Ser Ile Gly Ile Ala Leu Ile Ala Gly Ser Lys Val Leu Ile Leu Asp Glu Pro Thr Ser Gly Met Asp Ala Ile Ser Arg Arg Ala Ile Trp Asp Leu Leu Gln Arg Gln Lys Ser Asp Arg Thr Ile Val Leu Thr Thr His Phe Met Asp Glu Ala Asp Leu Leu Gly Asp Arg Ile Ala Ile Met Ala Lys Gly Glu Leu Gln Cys Cys Gly Ser Ser Leu Phe Leu Lys Gln Lys Tyr Gly Ala Gly Tyr His Met Thr Leu Val Lys Glu Pro His Cys Asn Pro Glu Asp Ile Ser Gln Leu Val His His His Val Pro Asn Ala Thr Leu Glu Ser Ser Ala Gly Ala Glu Leu Ser Phe Ile Leu Pro Arg Glu Ser Thr His Arg Phe Glu Gly Leu Phe Ala Lys Leu Glu Lys Lys Gln Lys Glu Leu Gly Ile Ala Ser Phe Gly Ala Ser Ile Thr Thr Met Glu Glu Val Phe Leu Arg Val Gly Lys Leu Val Asp Ser Ser Met Asp Ile Gln Ala Ile Gln Leu Pro Ala Leu Gln Tyr Gln His Glu Arg Arg Ala Ser Asp Trp Ala Val Asp Ser Asn Leu Cys Gly Ala Met Asp Pro Ser Asp Gly Ile Gly Ala Leu Ile Glu Glu Glu Arg Thr Ala Val Lys Leu Asn Thr Gly Leu Ala Leu His Cys Gln Gln Phe Trp Ala Met Phe Leu Lys Lys Ala Ala Tyr Ser Trp Arg Glu Trp Lys Met Val Ala Ala Gln Val Leu Val Pro Leu Thr Cys Val Thr Leu Ala Leu Leu Ala Ile Asn Tyr Ser Ser Glu Leu Phe Asp Asp Pro Met Leu Arg Leu Thr Leu Gly Glu Tyr Gly Arg Thr Val Val Pro Phe Ser Val Pro Gly Thr Ser Gln Leu Gly Gln Gln Leu Ser Glu His Leu Lys Asp Ala Leu Gln Ala Glu
Gly Gln Glu Pro Arg Glu Val Leu Gly Asp Leu Glu Glu Phe Leu Ile Phe Arg Ala Ser Val Glu Gly Gly Gly Phe Asn Glu Arg Cys Leu Val Ala Ala Ser Phe Arg Asp Val Gly Glu Arg Thr Val Val Asn Ala Leu Phe Asn Asn Gln Ala Tyr His Ser Pro Ala Thr Ala Leu Ala Val Val Asp Asn Leu Leu Phe Lys Leu Leu Cys Gly Pro His Ala Ser Ile Val Val Ser Asn Phe Pro Gln Pro Arg Ser Ala Leu Gln Ala Ala Lys Asp Gln Phe Asn Glu Gly Arg Lys Gly Phe Asp Ile Ala Leu Asn Leu Leu Phe Ala Met Ala Phe Leu Ala Ser Thr Phe Ser Ile Leu Ala Val Ser Glu Arg Ala Val Gln Ala Lys His Val Gln Phe Val Ser Gly Val His Val Ala Ser Phe Trp Leu Ser Ala Leu Leu Trp Asp Leu Ile Ser Phe Leu Ile Pro Ser Leu Leu Leu Leu Val Val Phe Lys Ala Phe Asp Val Arg Ala Phe Thr Arg Asp Gly His Met Ala Asp Thr Leu Leu Leu Leu Leu Leu Tyr Gly Trp Ala Ile Ile Pro Leu Met Tyr Leu Met Asn Phe Phe Phe Leu Gly Ala Ala Thr Ala Tyr Thr Arg Leu Thr Ile Phe Asn Ile Leu Ser Gly Ile Ala Thr Phe Leu Met Val Thr Ile Met Arg Ile Pro Ala Val Lys Leu Glu Glu Leu Ser Lys Thr Leu Asp His Val Phe Leu Val Leu Pro Asn His Cys Leu Gly Met Ala Val Ser Ser Phe Tyr Glu Asn Tyr Glu Thr Arg Arg Tyr Cys Thr Ser Ser Glu Val Ala Ala His Tyr Cys Lys Lys Tyr Asn Ile Gln Tyr Gln Glu Asn Phe Tyr Ala Trp Ser Ala Pro Gly Val Gly Arg Phe Val Ala Ser Met Ala Ala Ser Gly Cys Ala Tyr Leu Ile Leu Leu Phe Leu Ile Glu Thr Asn Leu Leu Gln Arg Leu Arg Gly Ile Leu Cys Ala Leu Arg Arg Arg Arg Thr Leu Thr Glu Leu Tyr Thr Arg Met Pro Val Leu Pro Glu Asp Gln Asp Val Ala Asp Glu Arg Thr Arg Ile Leu Ala Pro Ser Pro Asp Ser Leu Leu His Thr Pro Leu Ile Ile Lys Glu Leu Ser Lys Val Tyr Glu Gln Arg Val Pro Leu Leu Ala Val Asp Arg Leu Ser Leu Ala Val Gln Lys Gly Glu Cys Phe Gly Leu Leu Gly Phe Asn Gly Ala Gly Lys Thr Thr Thr Phe Lys Met Leu Thr Gly Glu Glu Ser Leu Thr Ser Gly Asp Ala Phe Val Gly Gly His Arg Ile Ser Ser Asp Val Gly Lys Val Arg Gln Arg Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Leu Asp His Met Thr Gly Arg Glu Met Leu Val Met Tyr Ala Arg Leu Arg Gly Ile Pro Glu Arg His Ile Gly Ala 
Cys Val Glu Asn Thr Leu Arg Gly Leu Leu Leu Glu Pro His Ala Asn Lys Leu Val Arg Thr Tyr Ser Gly Gly Asn Lys Arg Lys Leu Ser Thr Gly Ile Ala Leu Ile Gly Glu Pro Ala Val Ile Phe Leu Asp Glu Pro Ser Thr Gly Met Asp Pro Val Ala Arg Arg Leu Leu Trp Asp Thr Val Ala Arg Ala Arg Glu Ser Gly Lys Ala Ile Ile Ile Thr Ser His Ser Met Glu Glu Cys Glu Ala Leu Cys Thr Arg Leu Ala Ile Met Val Gln Gly Gln Phe Lys Cys Leu Gly Ser Pro Gln His Leu Lys Ser Lys Phe Gly Ser Gly Tyr Ser Leu Arg Ala Lys Val Gln Ser Glu Gly Gln Gln Glu Ala Leu Glu Glu Phe Lys Ala Phe Val Asp Leu Thr Phe Pro Gly Ser Val Leu Glu Asp Glu His Gln Gly Met Val His Tyr His Leu Pro Gly Arg Asp Leu Ser Trp Ala Lys Val Phe Gly Ile Leu Glu Lys Ala Lys Glu Lys Tyr Gly Val Asp Asp Tyr Ser Val Ser Gln Ile Ser Leu Glu Gln Val Phe Leu Ser Phe Ala His Leu Gln Pro Pro Thr Ala Glu Glu Gly Arg
(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
AGCTGGCGCT CCTCCTCT 18
(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 349 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
Gly Gln Leu Leu Gly His Asn Gly Ala Gly Lys Thr Thr Ser Ile Gly Arg Pro Thr Gly Ile Gly Tyr Asp Arg Gly Cys Pro Gln Leu Asp Leu Thr Val Glu His Leu Leu Lys Gly Lys Leu Leu Lys Asn Leu Ser Gly Gly Met Arg Lys Leu Gly Leu Asp Glu Pro Thr Ala Gly Met Asp Arg Leu Arg Lys Arg Thr Ile Leu Thr Thr His Met Asp Glu Ala Leu Gly Asp Ile Met His Gly Leu Gly Leu Lys Gln Lys Gly Gly Tyr Thr Val Glu Gln Pro Ala Arg Phe Leu Leu Ser Phe Gly Ser Thr Glu Val Phe Ile Gly Asp His Arg Gly Ala Gln Phe Lys Lys Tyr Ser Arg Trp Gln Val Leu Pro Leu Asp Leu Thr Glu Val Phe Pro Leu Pro Gly Ala Leu Phe Asn Tyr His Thr Ser Val Ser Gln Ala Leu Ala Ser Thr Phe Glu Arg Gln Ala His Gln Phe Gly Phe Leu Asp Ile Ser Leu Leu Phe Asp His Ala Leu Leu Tyr Ser Pro Tyr Phe Phe Ala Leu Ile Ala Leu Val Glu Leu Leu Phe Leu Pro Gly Ala Asn Trp Gly Phe Leu Arg Met Leu Pro Val Glu Arg Arg Asn Leu Ile Lys Leu Lys Ala Val Leu Leu Ala Val Glu Cys Phe Gly Leu Leu Gly Asn Gly Ala Gly Lys Thr Thr Thr Phe Leu Thr Gly Ser Ser Gly Ala Gly Gly Asp Val Ile Gly Tyr Cys Pro Gln Phe Asp Ala Leu Thr Gly Arg Glu Leu Ala Gly Ala Glu Leu His Ala Lys Leu Val Arg Tyr Ser Gly Gly Lys Arg Lys Ser Gly Ala Leu Leu Pro Gln Ile Leu Asp Glu Pro Gly Asp Pro Ala Arg Arg Trp Glu Ser Ala Thr Ser His Ser Met Glu Cys Glu Ala Leu Cys Arg Ala Gly Gly Ser Gln Leu Lys Ser Gly Tyr Val Pro Ser Val Leu Leu Pro Trp Phe Gly Val Asp Gln Ser Leu Glu Phe Leu Ala Leu
(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1974 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 612 amino acids (B) TYPE: amino acid (C) STRANDEDNESS: not relevant (D) TOPOLOGY: unknown (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
Met Ile Thr Ser Val Leu Arg Tyr Val Leu Ala Leu Tyr Phe Cys Met Gly Ile Ala His Gly Ala Tyr Phe Ser Gln Phe Ser Met Arg Ala Pro Asp His Asp Pro Cys His Asp His Thr Gly Arg Pro Val Arg Cys Val Pro Glu Phe Ile Asn Ala Ala Phe Gly Lys Pro Val Ile Ala Ser Asp Thr Cys Gly Thr Asn Arg Pro Asp Lys Tyr Cys Thr Val Lys Glu Gly Pro Asp Gly Ile Ile Arg Glu Gln Cys Asp Thr Cys Asp Ala Arg Asn His Phe Gln Ser His Pro Ala Ser Leu Leu Thr Asp Leu Asn Ser Ile Gly Asn Met Thr Cys Trp Val Ser Thr Pro Ser Leu Ser Pro Gln Asn Val Ser Leu Thr Leu Ser Leu Gly Lys Lys Phe Glu Leu Thr Tyr Val Ser Met His Phe Cys Ser Arg Leu Pro Asp Ser Met Ala Leu Tyr Lys Ser Ala Asp Phe Gly Lys Thr Trp Thr Pro Phe Gln Phe Tyr Ser Ser Glu Cys Arg Arg Ile Phe Gly Arg Asp Pro Asp Val Ser Ile Thr Lys Ser Asn Glu Gln Glu Ala Val Cys Thr Ala Ser His Ile Met Gly Pro Gly Gly Asn Arg Val Ala Phe Pro Phe Leu Glu Asn Arg Pro Ser Ala Gln Asn Phe Glu Asn Ser Pro Val Leu Gln Asp Trp Val Thr Ala Thr Asp Ile Lys Val Val Phe Ser Arg Leu Ser Pro Asp Gln Ala Glu Leu Tyr Gly Leu Ser Asn Asp Val Asn Ser Tyr Gly Asn Glu Thr Asp Asp Glu Val Lys Gln Arg Tyr Phe Tyr Ser Met Gly Glu Leu Ala Val Gly Gly Arg Cys Lys Cys Asn Gly His Ala Ser Arg Cys Ile Phe Asp Lys Met Gly Arg Tyr Thr Cys Asp Cys Lys His Asn Thr Ala Gly Thr Glu Cys Glu Met Cys Lys Pro Phe His Tyr Asp Arg Pro Trp Gly Arg Ala Thr Ala Asn Ser Ala Asn Ser Cys Val Ala Cys Asn Cys Asn Gln His Ala Lys Arg Cys Arg Phe Asp Ala Glu Leu Phe Arg Leu Ser Gly Asn Arg Ser Gly Gly Val Cys Leu Asn Cys Arg His Asn Thr Ala Gly Arg Asn Cys His Leu Cys Lys Pro Gly Phe Val Arg Asp Thr Ser Leu Pro Met Thr His Arg Arg Ala Cys Lys Ser Cys Gly Cys His Pro Val Gly Ser Leu Gly Lys Ser Cys Asn Gln Ser Ser Gly Gln Cys Val Cys Lys Pro Gly Val Thr Gly Thr Thr Cys Asn Arg Cys Ala Lys Gly Tyr Gln Gln Ser Arg Ser Thr Val Thr Pro Cys Ile Lys Ile Pro Thr Lys Ala Asp Phe Ile Gly Ser Ser His Ser Glu Glu Gln Asp Gln Cys Ser Lys Cys Arg Ile Val Pro Lys Arg Leu Asn Gln Lys Lys Phe Cys Lys Arg Asp His Ala Val 
Gln Met Val Val Val Ser Arg Glu Met Val Asp Gly Trp Ala Lys Tyr Lys Ile Val Val Glu Ser Val Phe Lys Arg Thr Glu Asn Met Gln Arg Arg Gly Glu Thr Ser Leu Trp Ile Ser Pro Gln Gly Val Ile Cys Lys Cys Pro Lys Leu Arg Val Gly Arg Arg Tyr Leu Leu Leu Gly Lys Asn Asp Ser Asp His Glu Arg Asp Gly Leu Met Val Asn Pro Gln Thr Val Leu Val Glu Trp Glu Asp Asp Ile Met Asp Lys Val Leu Arg Phe Ser Lys Lys Asp Lys Leu Gly Gln Cys Pro Glu Ile Thr Ser His Arg Tyr
(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - sense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - antisense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
(2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - sense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
(2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: other nucleic acid (A) DESCRIPTION: /desc = "Oligonucleotide primer - antisense strand"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
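The coordinates and lengths declared in the listing above can be cross-checked arithmetically. What follows is a minimal Python sketch and is not part of the original disclosure; the helper names `cds_codon_count` and `ungapped_length` are hypothetical and introduced only for illustration. It assumes that the CDS location 573..5684 given for SEQ ID NO:74 is 1-based and inclusive, and that the blank inside a printed oligonucleotide is a layout gap rather than part of the sequence. Under those assumptions the CDS spans 5112 nucleotides, i.e. 1704 codons, which agrees with the 1704 amino acids declared for SEQ ID NO:75.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a 1-based, inclusive CDS range.

    Assumption: the range is given in the 'start..end' convention used
    in the FEATURE block of the listing.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


def ungapped_length(seq: str) -> int:
    """Length of a printed sequence after removing layout blanks."""
    return len(seq.replace(" ", ""))


# Values copied from the listing: (printed sequence, declared length).
declared = {
    "SEQ ID NO:54": ("GCGGAGCCAC CTTCATCA", 18),
    "SEQ ID NO:68": ("CCACCATGT", 9),
    "SEQ ID NO:76": ("AGCTGGCGCT CCTCCTCT", 18),
}

# CDS of SEQ ID NO:74, compared against the declared length of SEQ ID NO:75.
codons = cds_codon_count(573, 5684)
lengths_match = {
    name: ungapped_length(seq) == n for name, (seq, n) in declared.items()
}

print("codons in CDS 573..5684:", codons)
for name, ok in lengths_match.items():
    print(name, "length matches declaration:", ok)
```

The same two helpers could be applied to any other entry of the listing whose sequence text is present; entries whose sequence text is missing from this copy cannot be checked this way.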
Claims (91)
1. Isolated nucleic acid encoding human netrin (hNET) or its complement.
2. Isolated nucleic acid according to claim 1, wherein said nucleic acid is mRNA.
3. Isolated nucleic acid according to claim 1, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:19.
4. Isolated nucleic acid according to claim 1, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:20.
5. Isolated nucleic acid according to claim 1, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:78.
6. Isolated nucleic acid that hybridizes under stringent conditions to the nucleic acid of claim 1.
7. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-GCCTGTCATCGCTCTAG-3' (SEQ ID NO:59).
8. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-CAGTCGCAGGCCCTGCA-3' (SEQ ID NO:60).
9. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-GAGGACGCGCCAACATC-3' (SEQ ID NO:61).
10. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-CGGCAGTAGTGGCAGTG-3' (SEQ ID NO:62).
11. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-CCTGCCTCGCTTGCTCCTGC-3' (SEQ ID NO:63).
12. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-CGGGCAGCCGCAGGCCGCAT-3' (SEQ ID NO:64).
13. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-CCTGCAACGGCCATGCCCGC-3' (SEQ ID NO:65).
14. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-GCATCCCCGGCGGGCACCCA-3' (SEQ ID NO:66).
15. Isolated nucleic acid according to claim 6, comprising the sequence: 5'-CTTGCAGGGCCTGCGAC-3' (SEQ ID NO:80).
16. Isolated nucleic acid according to claim 6, comprising the sequence 5'-GAAGGCACAGGGTGAAC-3' (SEQ ID NO:81).
17. Isolated nucleic acid according to claim 6, comprising the sequence 5'-CTGCAACCAGACCACAG-3' (SEQ ID NO:82).
18. Isolated nucleic acid according to claim 6, comprising the sequence 5'-TAGATGTGGGAGCAGCG-3' (SEQ ID NO:83).
19. An antisense oligonucleotide that specifically binds to and modulates translation of mRNA according to claim 2.
20. Isolated human netrin (hNET) and biologically active fragments thereof.
21. Isolated hNET according to claim 20 comprising the amino acid sequence set forth in SEQ ID NO:21.
22. A vector comprising the isolated nucleic acid of claim 1.
23. A host cell comprising the vector of claim 22.
24. A method for producing human netrin protein, said method comprising:
(a) culturing the host cell of claim 23 in a medium and under conditions suitable for expression of said protein, and
(b) isolating said expressed protein.
25. An antibody that specifically binds to human netrin (hNET).
26. A composition comprising an amount of the oligonucleotide according to claim 19, effective to modulate expression of hNET by passing through a cell membrane and binding specifically with mRNA encoding hNET in the cell so as to prevent its translation and an acceptable hydrophobic carrier capable of passing through a cell membrane.
27. A composition comprising an amount of the antibody according to claim 25, effective to block binding of naturally occurring ligands to hNET and an acceptable carrier.
28. A transgenic non-human mammal expressing DNA encoding human netrin (hNET).
29. A method for identifying compounds which bind to human netrin (hNET), said method comprising a competitive binding assay wherein the cells according to claim 23 are exposed to a plurality of compounds and identifying compounds which bind thereto.
30. Isolated nucleic acid encoding human ATP Binding Cassette transporter (hABC3) or its complement.
31. Isolated nucleic acid according to claim 30, wherein said nucleic acid is mRNA.
32. Isolated nucleic acid according to claim 30, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:24.
33. Isolated nucleic acid according to claim 30, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:74.
34. Isolated nucleic acid that hybridizes under stringent conditions to the nucleic acid of claim 30.
35. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-GACGCTGGTGAAGGAGC-3' (SEQ ID NO:42).
36. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-TCGCTGACCGCCAGGAT-3' (SEQ ID NO:43).
37. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-CATTGCCCGTGCTGTCGTG-3' (SEQ ID NO:52).
38. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-CATCGCCGCCTCCTTCATG-3' (SEQ ID NO:53).
39. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-GCGGAGCCACCTTCATCA-3' (SEQ ID NO:54).
40. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-GACGCTGGTGAAGGAGC-3' (SEQ ID NO:55).
41. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-ATCCTGGCGGTCAGCGA-3' (SEQ ID NO:56).
42. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-AGGGATTCGACATTGCC-3' (SEQ ID NO:57).
43. Isolated nucleic acid according to claim 34, comprising the sequence: 5'-CTTCAGAGACTCAGGGGCAT-3' (SEQ ID NO:58).
44. Isolated nucleic acid according to claim 34, comprising the sequence 5'-AGCTGGCGCTCCTCCTCT-3' (SEQ ID NO:76).
45. An antisense oligonucleotide that specifically binds to and modulates translation of mRNA according to claim 31.
46. Isolated human ATP binding cassette transporter (hABC3) and biologically active fragments thereof.
47. Isolated hABC3 according to claim 46 comprising the amino acid sequence set forth in SEQ ID NO:25.
48. Isolated hABC3 according to claim 46 comprising the amino acid sequence set forth in SEQ ID NO:75.
49. A vector comprising the isolated nucleic acid of claim 30.
50. A host cell comprising the vector of claim 49.
51. A method for producing human ATP binding cassette transporter (hABC3), said method comprising:
(a) culturing the host cell of claim 50 in a medium and under conditions suitable for expression of said protein, and
(b) isolating said expressed protein.
52. An antibody that specifically binds to human ATP binding cassette transporter (hABC3).
53. A composition comprising an amount of the oligonucleotide according to claim 45, effective to modulate expression of hABC3 by passing through a cell membrane and binding specifically with mRNA encoding hABC3 in the cell so as to prevent its translation and an acceptable hydrophobic carrier capable of passing through a cell membrane.
54. A composition comprising an amount of the antibody according to claim 52, effective to block binding of naturally occurring ligands to hABC3 and an acceptable carrier.
55. A transgenic non-human mammal expressing DNA encoding human ATP binding cassette transporter (hABC3).
56. A method for identifying compounds which bind to human ATP binding cassette transporter (hABC3), said method comprising a competitive binding assay wherein the cells according to claim 50 are exposed to a plurality of compounds and identifying compounds which bind thereto.
57. Isolated nucleic acid encoding human ribosomal L3 (RPL3L) or its complement.
58. Isolated nucleic acid according to claim 57, wherein said nucleic acid is mRNA.
59. Isolated nucleic acid according to claim 57, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:28.
60. Isolated nucleic acid that hybridizes under stringent conditions to the nucleic acid of claim 57.
61. Isolated nucleic acid according to claim 60, comprising the sequence: 5'-ACGGACACCTGGGCTTC-3' (SEQ ID NO:48).
62. Isolated nucleic acid according to claim 60, comprising the sequence: 5'-AAACGGGAGGAGGTGGA-3' (SEQ ID NO:49).
63. Isolated nucleic acid according to claim 60, comprising the sequence: 5'-AGACAGCCCAAGAGAAGAGG-3' (SEQ ID NO:73).
64. An antisense oligonucleotide that specifically binds to and modulates translation of mRNA according to claim 58.
65. Isolated human ribosomal L3 (RPL3L) and biologically active fragments thereof.
66. Isolated RPL3L according to claim 65 comprising the amino acid sequence set forth in SEQ ID NO:29.
67. A vector comprising the isolated nucleic acid of claim 57.
68. A host cell comprising the vector of claim 67.
69. A method for producing human ribosomal L3 (RPL3L), said method comprising:
(a) culturing the host cell of claim 68 in a medium and under conditions suitable for expression of said protein, and (b) isolating said expressed protein.
70. An antibody that specifically binds to human ribosomal L3 (RPL3L).
71. A composition comprising an amount of the oligonucleotide according to claim 64, effective to modulate expression of RPL3L by passing through a cell membrane and binding specifically with mRNA encoding RPL3L in the cell so as to prevent its translation and an acceptable hydrophobic carrier capable of passing through a cell membrane.
72. A composition comprising an amount of the antibody according to claim 70, effective to block binding of naturally occurring ligands to RPL3L and an acceptable carrier.
73. A transgenic non-human mammal expressing DNA encoding human ribosomal L3 (RPL3L).
74. A method for identifying compounds which bind to human ribosomal L3 (RPL3L), said method comprising a competitive binding assay wherein the cells according to claim 68 are exposed to a plurality of compounds and identifying compounds which bind thereto.
75. Isolated nucleic acid encoding human augmenter of liver regeneration (hALR) or its complement.
76. Isolated nucleic acid according to claim 75, wherein said nucleic acid is mRNA.
77. Isolated nucleic acid according to claim 75, wherein said nucleic acid is DNA comprising the sequence set forth in SEQ ID NO:33.
78. Isolated nucleic acid that hybridizes under stringent conditions to the nucleic acid of claim 75.
79. Isolated nucleic acid according to claim 78, comprising the sequence: 5'-TGGCCCAGTTCATACATTTA-3' (SEQ ID NO:69).
80. Isolated nucleic acid according to claim 78, comprising the sequence: 5'-TTACCCCTGTGAGGAGTGTG-3' (SEQ ID NO:70).
81. An antisense oligonucleotide that specifically binds to and modulates translation of mRNA according to claim 76.
82. Isolated human augmenter of liver regeneration (hALR) and biologically active fragments thereof.
83. Isolated hALR according to claim 82 comprising the amino acid sequence set forth in SEQ ID NO:34.
84. A vector comprising the isolated nucleic acid of claim 75.
85. A host cell comprising the vector of claim 84.
86. A method for producing human augmenter of liver regeneration (hALR), said method comprising:
(a) culturing the host cell of claim 85 in a medium and under conditions suitable for expression of said protein, and (b) isolating said expressed protein.
87. An antibody that specifically binds to human augmenter of liver regeneration (hALR).
88. A composition comprising an amount of the oligonucleotide according to claim 81, effective to modulate expression of hALR by passing through a cell membrane and binding specifically with mRNA encoding hALR in the cell so as to prevent its translation and an acceptable hydrophobic carrier capable of passing through a cell membrane.
89. A composition comprising an amount of the antibody according to claim 87, effective to block binding of naturally occurring ligands to hALR and an acceptable carrier.
90. A transgenic non-human mammal expressing DNA encoding human augmenter of liver regeneration (hALR).
91. A method for identifying compounds which bind to human augmenter of liver regeneration (hALR), said method comprising a competitive binding assay wherein the cells according to claim 85 are exposed to a plurality of compounds and identifying compounds which bind thereto.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/665,259 US6028173A (en) | 1995-06-30 | 1996-06-17 | Human chromosome 16 genes, compositions, methods of making and using same |
US08/665,259 | 1996-06-17 | ||
US72061496A | 1996-10-01 | 1996-10-01 | |
US08/720,614 | 1996-10-01 | ||
US08/762,500 | 1996-12-09 | ||
US08/762,500 US6030806A (en) | 1995-06-30 | 1996-12-09 | Human chromosome 16 genes, compositions, methods of making and using same |
PCT/US1997/000785 WO1997048797A1 (en) | 1996-06-17 | 1997-01-16 | Novel human chromosome 16 genes, compositions, methods of making and using same |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2256486A1 (en) | 1997-12-24 |
Family
ID=27418130
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002256486A Abandoned CA2256486A1 (en) | 1996-06-17 | 1997-01-16 | Novel human chromosome 16 genes, compositions, methods of making and using same |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0914424A1 (en) |
JP (1) | JP2002514903A (en) |
AU (1) | AU1831497A (en) |
CA (1) | CA2256486A1 (en) |
WO (1) | WO1997048797A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2587399A (en) * | 1998-02-11 | 1999-08-30 | Incyte Pharmaceuticals, Inc. | Human transport-associated molecules |
US6787305B1 (en) * | 1998-03-13 | 2004-09-07 | Invitrogen Corporation | Compositions and methods for enhanced synthesis of nucleic acid molecules |
EP1115865A2 (en) * | 1998-09-25 | 2001-07-18 | Bayer Ag | Atp binding cassette genes and proteins for diagnosis and treatment of lipid disorders and inflammatory diseases |
IL132105A0 (en) | 1998-12-24 | 2001-03-19 | Yeda Res & Dev | Caspase-8 interacting proteins |
FR2796808B1 (en) * | 1999-07-30 | 2004-03-12 | Inst Nat Sante Rech Med | NEW APPLICATIONS OF ABCA TYPE CONVEYORS |
CN1324810A (en) * | 2000-05-24 | 2001-12-05 | 上海博德基因开发有限公司 | New polypeptide ribosome L39 protein 9 and polynucleotides for encoding same |
CN1326946A (en) * | 2000-06-07 | 2001-12-19 | 上海博德基因开发有限公司 | New polypeptide-ribosomal protein S1111.22 and polynucleotide for encoding such polypeptide |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE4021458C1 (en) * | 1990-07-05 | 1991-08-29 | Max-Planck-Gesellschaft Zur Foerderung Der Wissenschaften Ev, 3400 Goettingen, De | |
WO1992013071A1 (en) * | 1991-01-28 | 1992-08-06 | Massachusetts Institute Of Technology | Method of exon amplification |
US5565331A (en) * | 1993-11-12 | 1996-10-15 | The Regents Of The University Of California | Nucleic acids encoding neural axon outgrowth modulators |
EP0835307A2 (en) * | 1995-06-30 | 1998-04-15 | Genzyme Corporation | Novel human chromosome 16 genes, compositions, methods of making and using same |
-
1997
- 1997-01-16 EP EP97903844A patent/EP0914424A1/en not_active Withdrawn
- 1997-01-16 AU AU18314/97A patent/AU1831497A/en not_active Abandoned
- 1997-01-16 CA CA002256486A patent/CA2256486A1/en not_active Abandoned
- 1997-01-16 WO PCT/US1997/000785 patent/WO1997048797A1/en not_active Application Discontinuation
- 1997-01-16 JP JP50290498A patent/JP2002514903A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
JP2002514903A (en) | 2002-05-21 |
EP0914424A1 (en) | 1999-05-12 |
AU1831497A (en) | 1998-01-07 |
WO1997048797A1 (en) | 1997-12-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2002010789A (en) | Est and human protein to be encoded | |
Jiang et al. | Ahi-1, a novel gene encoding a modular protein with WD40-repeat and SH3 domains, is targeted by the Ahi-1 and Mis-2 provirus integrations | |
JP2001269182A (en) | Sequence tag and coded human protein | |
JP2002511259A (en) | 5 'ESTs and encoded human proteins | |
EP0991758A1 (en) | Smad6 and uses thereof | |
WO1995022610A1 (en) | Novel integrin alpha subunit | |
US20090048163A1 (en) | Nucleic acid and protein sequences of asporins | |
AU747576B2 (en) | Smad7 and uses thereof | |
JP3562528B2 (en) | IKAROS: T cell pathway regulatory gene | |
AU704341B2 (en) | Novel human chromosome 16 genes, compositions, methods of making and using same | |
CA2256486A1 (en) | Novel human chromosome 16 genes, compositions, methods of making and using same | |
US6030806A (en) | Human chromosome 16 genes, compositions, methods of making and using same | |
US6087485A (en) | Asthma related genes | |
AU721946B2 (en) | Novel human chromosome 16 genes, compositions, methods of making and using same | |
WO1999037809A1 (en) | Asthma related genes | |
EP0892807A1 (en) | Gene family associated with neurosensory defects | |
US20030087815A1 (en) | Novel polypeptides and nucleic acids encoding same | |
JP4224395B2 (en) | PanCAM nucleic acids and polypeptides | |
EP1147184A1 (en) | Controlled expression of heterologous proteins in the mammary gland of a transgenic animal | |
CA2442739A1 (en) | Novel antibodies that bind to antigenic polypeptides, nucleic acids encoding the antigens, and methods of use | |
EP0963436A1 (en) | Nucleic acid encoding congenital heart disease protein and products related thereto | |
US20020146807A1 (en) | Novel polypeptides and nucleic acids encoding same | |
JP2003525576A (en) | NT2LP, a novel G protein-coupled receptor having homology to the neurotensin-2 receptor | |
JP2002516335A (en) | ALG-2LP, ALG-2-like molecules and uses thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Discontinued |